Documentation for Swappingweights Module

SwappingWeightsAttack

Bases: ModelAttack

Implements a swapping weights attack on the received model weights.

This attack stochastically swaps weights in a specified layer of the model, which can disrupt the model's performance. The attack is not deterministic, so its effect varies between runs. The code may not work as expected for some layers because the layer's weights are reshaped to two dimensions, and its computational cost scales quadratically with the number of rows in the layer (the similarity matrix holds one entry per pair of rows). It should not be applied to the last layer, because that would cause a high loss on the malicious node and make the attack detectable.
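The source code of this class reorders the rows of the selected entry, the entry after it, and the columns of the entry after that. The sketch below illustrates one reading of the last-layer warning, assuming a plain fully connected network whose consecutive entries are weight, bias, and next-layer weight. The network, its sizes, and the names `forward`, `perm`, and `out_perm` are hypothetical and use NumPy instead of the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer MLP: 4 inputs -> 5 hidden units -> 3 outputs.
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)

def forward(x, W1, b1, W2, b2):
    hidden = np.maximum(W1 @ x + b1, 0.0)  # ReLU acts elementwise
    return W2 @ hidden + b2

x = rng.normal(size=4)
reference = forward(x, W1, b1, W2, b2)

# Hidden layer: permute the rows of W1 and b1 and the matching columns of W2.
perm = np.array([1, 0, 3, 4, 2])
hidden_swap = forward(x, W1[perm], b1[perm], W2[:, perm], b2)

# Last layer: only the rows of W2 and b2 can move; no later layer compensates.
out_perm = np.array([1, 2, 0])
last_swap = forward(x, W1, b1, W2[out_perm], b2[out_perm])

same_hidden = np.allclose(hidden_swap, reference)  # outputs unchanged
same_last = np.allclose(last_swap, reference)      # outputs reordered
```

Under these assumptions the hidden-layer swap leaves the local outputs unchanged, while the last-layer swap reorders them, which would show up as a higher local loss.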

Parameters:

Name | Type | Description | Default
engine | object | The training engine object that manages the aggregator. | required
attack_params | dict | Parameters for the attack, including layer_idx (int), the index of the layer where the weights will be swapped. The constructor also reads round_start_attack and round_stop_attack and converts them to int. | required
Source code in nebula/addons/attacks/model/swappingweights.py
class SwappingWeightsAttack(ModelAttack):
    """
    Implements a swapping weights attack on the received model weights.

    This attack stochastically swaps weights in a specified layer of the model, which can
    disrupt the model's performance. The attack is not deterministic, so its effect varies
    between runs. The code may not work as expected for some layers because the layer's
    weights are reshaped to two dimensions, and its computational cost scales quadratically
    with the number of rows in the layer. It should not be applied to the last layer, because
    that would cause a high loss on the malicious node and make the attack detectable.

    Args:
        engine (object): The training engine object that manages the aggregator.
        attack_params (dict): Parameters for the attack, including:
            - layer_idx (int): The index of the layer where the weights will be swapped.
            - round_start_attack (int): Read and converted to int by the constructor.
            - round_stop_attack (int): Read and converted to int by the constructor.
    """
    def __init__(self, engine, attack_params):
        """
        Initializes the SwappingWeightsAttack with the specified engine and parameters.

        Args:
            engine (object): The training engine object.
            attack_params (dict): Dictionary of attack parameters. The keys "layer_idx",
                "round_start_attack" and "round_stop_attack" are required and converted to int.
        """
        super().__init__(engine)
        self.layer_idx = int(attack_params["layer_idx"])
        self.round_start_attack = int(attack_params["round_start_attack"])
        self.round_stop_attack = int(attack_params["round_stop_attack"])

    def model_attack(self, received_weights):
        """
        Performs the swapping weights attack by computing a cosine similarity matrix
        between the rows of the selected layer and reordering those rows.

        Rows that are each other's least similar row are swapped with one another
        (the cases where the greedy choice is optimal). The remaining rows are randomly
        permuted among themselves, redrawing the permutation while any of them keeps its
        original position. The same reordering is applied to the entry that follows the
        selected layer and, when present, to the columns of the entry after that.

        Args:
            received_weights (dict): The aggregated model weights to be modified.

        Returns:
            dict: The modified model weights after performing the swapping attack.
        """
        logging.info("[SwappingWeightsAttack] Performing swapping weights attack")
        lkeys = list(received_weights.keys())
        wm = received_weights[lkeys[self.layer_idx]]

        # Compute similarity matrix
        sm = torch.zeros((wm.shape[0], wm.shape[0]))
        for j in range(wm.shape[0]):
            sm[j] = pairwise_cosine_similarity(wm[j].reshape(1, -1), wm.reshape(wm.shape[0], -1))

        # Check rows/cols where greedy approach is optimal
        nsort = np.full(sm.shape[0], -1)
        rows = []
        for j in range(sm.shape[0]):
            k = torch.argmin(sm[j])
            if torch.argmin(sm[:, k]) == j:
                nsort[j] = k
                rows.append(j)
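        # dtype=int keeps the array usable as an index even when it is empty
        not_rows = np.array([i for i in range(sm.shape[0]) if i not in rows], dtype=int)

        # Ensure the rest of the rows are fully permuted (not optimal, but good enough).
        # A single leftover row cannot be moved, so it stays in place instead of looping forever.
        nrs = np.random.permutation(not_rows)
        while len(not_rows) > 1 and np.any(nrs == not_rows):
            nrs = np.random.permutation(not_rows)
        nsort[not_rows] = nrs
        nsort = torch.tensor(nsort)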

        # Apply permutation to weights
        received_weights[lkeys[self.layer_idx]] = received_weights[lkeys[self.layer_idx]][nsort]
        received_weights[lkeys[self.layer_idx + 1]] = received_weights[lkeys[self.layer_idx + 1]][nsort]
        if self.layer_idx + 2 < len(lkeys):
            received_weights[lkeys[self.layer_idx + 2]] = received_weights[lkeys[self.layer_idx + 2]][:, nsort]

        return received_weights

__init__(engine, attack_params)

Initializes the SwappingWeightsAttack with the specified engine and parameters.

Parameters:

Name | Type | Description | Default
engine | object | The training engine object. | required
attack_params | dict | Dictionary of attack parameters. The keys layer_idx, round_start_attack, and round_stop_attack are required and converted to int. | required
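As an illustration of the dictionary the constructor expects, here is a minimal sketch. The key names are taken from the `__init__` source; the values and the helper `parse_attack_params` are hypothetical stand-ins that mirror the `int(...)` conversions without importing the real class.

```python
# Hypothetical configuration: key names come from __init__, values are made up.
attack_params = {
    "layer_idx": "2",          # strings are accepted because __init__ calls int()
    "round_start_attack": 3,
    "round_stop_attack": 10,
}

def parse_attack_params(params):
    """Stand-in for the conversions done in SwappingWeightsAttack.__init__."""
    return {
        key: int(params[key])  # raises KeyError if a key is missing
        for key in ("layer_idx", "round_start_attack", "round_stop_attack")
    }

parsed = parse_attack_params(attack_params)
```

All three keys are looked up unconditionally, so omitting any of them raises a `KeyError` at construction time.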
Source code in nebula/addons/attacks/model/swappingweights.py
def __init__(self, engine, attack_params):
    """
    Initializes the SwappingWeightsAttack with the specified engine and parameters.

    Args:
        engine (object): The training engine object.
        attack_params (dict): Dictionary of attack parameters. The keys "layer_idx",
            "round_start_attack" and "round_stop_attack" are required and converted to int.
    """
    super().__init__(engine)
    self.layer_idx = int(attack_params["layer_idx"])
    self.round_start_attack = int(attack_params["round_start_attack"])
    self.round_stop_attack = int(attack_params["round_stop_attack"])

model_attack(received_weights)

Performs the swapping weights attack by computing a cosine similarity matrix between the rows of the selected layer and reordering those rows.

Rows that are each other's least similar row are swapped with one another (the cases where the greedy choice is optimal). The remaining rows are randomly permuted among themselves, redrawing the permutation while any of them keeps its original position. The same reordering is applied to the entry that follows the selected layer and, when present, to the columns of the entry after that.
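The row-ordering step can be sketched in NumPy as follows. This is not the module's code: `build_swap_order` is a hypothetical helper, it assumes every row is nonzero, and it leaves a single leftover row in place rather than trying to move it. Because the similarity matrix is symmetric, checking the partner's row minimum is equivalent to checking the column minimum.

```python
import numpy as np

def build_swap_order(weights, rng):
    """Return an index array that reorders the rows of `weights`."""
    flat = weights.reshape(weights.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)  # assumes nonzero rows
    sim = unit @ unit.T  # cosine similarity between every pair of rows

    n = sim.shape[0]
    order = np.full(n, -1, dtype=int)
    least = sim.argmin(axis=1)  # least similar partner of each row

    # Rows that are each other's least similar partner are swapped as a pair.
    for j in range(n):
        k = least[j]
        if k != j and least[k] == j:
            order[j] = k

    # Remaining rows: reshuffle among themselves until none keeps its index.
    # A single leftover row cannot be moved, so it stays where it is.
    rest = np.flatnonzero(order == -1)
    shuffled = rng.permutation(rest)
    while rest.size > 1 and np.any(shuffled == rest):
        shuffled = rng.permutation(rest)
    order[rest] = shuffled
    return order

# Two pairs of opposite rows: each row is swapped with its opposite.
demo = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
demo_order = build_swap_order(demo, np.random.default_rng(0))
```

The rejection loop is simple rather than efficient; it only needs to run on the rows that found no mutual partner.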

Parameters:

Name | Type | Description | Default
received_weights | dict | The aggregated model weights to be modified. | required

Returns:

Type | Description
dict | The modified model weights after performing the swapping attack. This is the input dictionary, updated in place.

Source code in nebula/addons/attacks/model/swappingweights.py
def model_attack(self, received_weights):
    """
    Performs the swapping weights attack by computing a cosine similarity matrix
    between the rows of the selected layer and reordering those rows.

    Rows that are each other's least similar row are swapped with one another
    (the cases where the greedy choice is optimal). The remaining rows are randomly
    permuted among themselves, redrawing the permutation while any of them keeps its
    original position. The same reordering is applied to the entry that follows the
    selected layer and, when present, to the columns of the entry after that.

    Args:
        received_weights (dict): The aggregated model weights to be modified.

    Returns:
        dict: The modified model weights after performing the swapping attack.
    """
    logging.info("[SwappingWeightsAttack] Performing swapping weights attack")
    lkeys = list(received_weights.keys())
    wm = received_weights[lkeys[self.layer_idx]]

    # Compute similarity matrix
    sm = torch.zeros((wm.shape[0], wm.shape[0]))
    for j in range(wm.shape[0]):
        sm[j] = pairwise_cosine_similarity(wm[j].reshape(1, -1), wm.reshape(wm.shape[0], -1))

    # Check rows/cols where greedy approach is optimal
    nsort = np.full(sm.shape[0], -1)
    rows = []
    for j in range(sm.shape[0]):
        k = torch.argmin(sm[j])
        if torch.argmin(sm[:, k]) == j:
            nsort[j] = k
            rows.append(j)
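    # dtype=int keeps the array usable as an index even when it is empty
    not_rows = np.array([i for i in range(sm.shape[0]) if i not in rows], dtype=int)

    # Ensure the rest of the rows are fully permuted (not optimal, but good enough).
    # A single leftover row cannot be moved, so it stays in place instead of looping forever.
    nrs = np.random.permutation(not_rows)
    while len(not_rows) > 1 and np.any(nrs == not_rows):
        nrs = np.random.permutation(not_rows)
    nsort[not_rows] = nrs
    nsort = torch.tensor(nsort)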

    # Apply permutation to weights
    received_weights[lkeys[self.layer_idx]] = received_weights[lkeys[self.layer_idx]][nsort]
    received_weights[lkeys[self.layer_idx + 1]] = received_weights[lkeys[self.layer_idx + 1]][nsort]
    if self.layer_idx + 2 < len(lkeys):
        received_weights[lkeys[self.layer_idx + 2]] = received_weights[lkeys[self.layer_idx + 2]][:, nsort]

    return received_weights