Documentation for Modelpoison Module

This module provides a function for adding noise to a machine learning model's parameters, simulating model poisoning attacks. The main function injects various types of noise into the model parameters, altering them to test the model's robustness against malicious manipulation.

Function:

  • modelPoison: Modifies the parameters of a model by injecting noise according to a specified ratio and type of noise (e.g., Gaussian, salt, salt-and-pepper).
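For orientation, here is a minimal standalone sketch of the transformation modelPoison performs, applying skimage.util.random_noise to a toy state dict (the two-tensor model below is illustrative, not part of the module):

from collections import OrderedDict

import torch
from skimage.util import random_noise

# Toy "model": two parameter tensors keyed by layer name.
model = OrderedDict(
    fc_weight=torch.randn(4, 4),
    fc_bias=torch.randn(4),
)

# Inject Gaussian noise with variance 0.1 into every layer,
# mirroring the "gaussian" branch of modelPoison.
poisoned = OrderedDict(
    (layer, torch.tensor(random_noise(t.numpy(), mode="gaussian", mean=0, var=0.1, clip=True)))
    for layer, t in model.items()
)
print(poisoned["fc_bias"])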

ModelPoisonAttack

Bases: ModelAttack

Implements a model poisoning attack by modifying the received model weights during the aggregation process.

This attack introduces specific modifications to the model weights to influence the global model's behavior.

Parameters:

  • engine (object, required): The training engine object that manages the aggregator.
  • attack_params (dict, required): Parameters for the attack, including:
      • poisoned_ratio (float): The ratio of model weights to be poisoned.
      • noise_type (str): The type of noise to introduce during the attack.
      • round_start_attack (int): The round at which the attack starts.
      • round_stop_attack (int): The round at which the attack stops.
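A hedged example of a complete attack_params dictionary (the values are illustrative; the four keys are the ones __init__ reads):

attack_params = {
    "poisoned_ratio": 0.2,       # fraction of noise to inject (0 <= x <= 1)
    "noise_type": "gaussian",    # "salt", "gaussian", or "s&p"
    "round_start_attack": 2,     # round at which the attack becomes active
    "round_stop_attack": 8,      # round at which the attack stops
}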
Source code in nebula/addons/attacks/model/modelpoison.py
import logging
from collections import OrderedDict

import torch
from skimage.util import random_noise

from nebula.addons.attacks.model.modelattack import ModelAttack  # import path assumed from the package layout


class ModelPoisonAttack(ModelAttack):
    """
    Implements a model poisoning attack by modifying the received model weights 
    during the aggregation process.

    This attack introduces specific modifications to the model weights to 
    influence the global model's behavior.

    Args:
        engine (object): The training engine object that manages the aggregator.
        attack_params (dict): Parameters for the attack, including:
            - poisoned_ratio (float): The ratio of model weights to be poisoned.
            - noise_type (str): The type of noise to introduce during the attack.
            - round_start_attack (int): The round at which the attack starts.
            - round_stop_attack (int): The round at which the attack stops.
    """
    def __init__(self, engine, attack_params):
        """
        Initializes the ModelPoisonAttack with the specified engine and parameters.

        Args:
            engine (object): The training engine object.
            attack_params (dict): Dictionary of attack parameters; must contain the keys
                "poisoned_ratio", "noise_type", "round_start_attack", and "round_stop_attack".
        """
        super().__init__(engine)
        self.poisoned_ratio = float(attack_params["poisoned_ratio"])
        self.noise_type = attack_params["noise_type"].lower()
        self.round_start_attack = int(attack_params["round_start_attack"])
        self.round_stop_attack = int(attack_params["round_stop_attack"])

    def modelPoison(self, model: OrderedDict, poisoned_ratio, noise_type="gaussian"):
        """
        Adds random noise to the parameters of a model for the purpose of model poisoning.

        This function modifies the model's parameters by injecting noise according to the specified
        noise type and ratio. Various types of noise can be applied, including salt noise, Gaussian
        noise, and salt-and-pepper noise.

        Args:
            model (OrderedDict): The model's parameters organized as an `OrderedDict`. Each key corresponds
                                 to a layer, and each value is a tensor representing the parameters of that layer.
            poisoned_ratio (float): The proportion of noise to apply, expressed as a fraction (0 <= poisoned_ratio <= 1).
            noise_type (str, optional): The type of noise to apply to the model parameters. Supported types are:
                                        - "salt": Applies salt noise, replacing random elements with 1.
                                        - "gaussian": Applies Gaussian-distributed additive noise.
                                        - "s&p": Applies salt-and-pepper noise, replacing random elements with either 1 or low_val.
                                        Default is "gaussian".

        Returns:
            OrderedDict: A new `OrderedDict` containing the model parameters with noise added.

        Notes:
            - If a layer's tensor is a single point (0-dimensional), it is reshaped for
              processing and restored to a scalar afterwards.
            - An unsupported `noise_type` is reported as an error and the original tensor
              is retained unchanged; no exception is raised.
        """
        poisoned_model = OrderedDict()
        if not isinstance(noise_type, str):
            # Tolerate noise_type arriving as a one-element sequence.
            noise_type = noise_type[0]

        for layer in model:
            bt = model[layer]
            t = bt.detach().clone()
            single_point = False
            if len(t.shape) == 0:
                t = t.view(-1)
                single_point = True
            if noise_type == "salt":
                # Replaces random pixels with 1.
                poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))
            elif noise_type == "gaussian":
                # Gaussian-distributed additive noise.
                poisoned = torch.tensor(random_noise(t, mode=noise_type, mean=0, var=poisoned_ratio, clip=True))
            elif noise_type == "s&p":
                # Replaces random pixels with either 1 or low_val, where low_val is 0 for unsigned images or -1 for signed images.
                poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))
            else:
                logging.error("Poison attack type '%s' not supported; layer left unchanged.", noise_type)
                poisoned = t
            if single_point:
                poisoned = poisoned[0]
            poisoned_model[layer] = poisoned

        return poisoned_model

    def model_attack(self, received_weights):
        """
        Applies the model poisoning attack by modifying the received model weights.

        Args:
            received_weights (any): The aggregated model weights to be poisoned.

        Returns:
            any: The modified model weights after applying the poisoning attack.
        """
        logging.info("[ModelPoisonAttack] Performing model poison attack")
        received_weights = self.modelPoison(received_weights, self.poisoned_ratio, self.noise_type)
        return received_weights

__init__(engine, attack_params)

Initializes the ModelPoisonAttack with the specified engine and parameters.

Parameters:

  • engine (object, required): The training engine object.
  • attack_params (dict, required): Dictionary of attack parameters; must contain the keys poisoned_ratio, noise_type, round_start_attack, and round_stop_attack.
Source code in nebula/addons/attacks/model/modelpoison.py
def __init__(self, engine, attack_params):
    """
    Initializes the ModelPoisonAttack with the specified engine and parameters.

    Args:
        engine (object): The training engine object.
        attack_params (dict): Dictionary of attack parameters; must contain the keys
            "poisoned_ratio", "noise_type", "round_start_attack", and "round_stop_attack".
    """
    super().__init__(engine)
    self.poisoned_ratio = float(attack_params["poisoned_ratio"])
    self.noise_type = attack_params["noise_type"].lower()
    self.round_start_attack = int(attack_params["round_start_attack"])
    self.round_stop_attack = int(attack_params["round_stop_attack"])

modelPoison(model, poisoned_ratio, noise_type='gaussian')

Adds random noise to the parameters of a model for the purpose of model poisoning.

This function modifies the model's parameters by injecting noise according to the specified noise type and ratio. Various types of noise can be applied, including salt noise, Gaussian noise, and salt-and-pepper noise.

Parameters:

  • model (OrderedDict, required): The model's parameters organized as an OrderedDict. Each key corresponds to a layer, and each value is a tensor representing the parameters of that layer.
  • poisoned_ratio (float, required): The proportion of noise to apply, expressed as a fraction (0 <= poisoned_ratio <= 1).
  • noise_type (str, default 'gaussian'): The type of noise to apply to the model parameters. Supported types are:
      • "salt": Replaces random elements with 1.
      • "gaussian": Applies Gaussian-distributed additive noise.
      • "s&p": Applies salt-and-pepper noise, replacing random elements with either 1 or low_val.

Returns:

  • OrderedDict: A new OrderedDict containing the model parameters with noise added.

Notes
  • If a layer's tensor is a single point (0-dimensional), it is reshaped for processing and restored to a scalar afterwards.
  • An unsupported noise_type is reported as an error and the original tensor is retained unchanged; no exception is raised.
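To make the three modes concrete, a small standalone sketch (mirroring the branches in the source below) applies each one to the same tensor:

import torch
from skimage.util import random_noise

t = torch.zeros(5, 5)

# "salt": sets roughly a poisoned_ratio fraction of elements to 1.
salt = torch.tensor(random_noise(t.numpy(), mode="salt", amount=0.2))

# "gaussian": adds zero-mean noise with variance poisoned_ratio.
gauss = torch.tensor(random_noise(t.numpy(), mode="gaussian", mean=0, var=0.2, clip=True))

# "s&p": sets elements to 1 or low_val (0 for unsigned ranges, -1 for signed).
sp = torch.tensor(random_noise(t.numpy(), mode="s&p", amount=0.2))

print(salt.count_nonzero(), sp.count_nonzero())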
Source code in nebula/addons/attacks/model/modelpoison.py
def modelPoison(self, model: OrderedDict, poisoned_ratio, noise_type="gaussian"):
    """
    Adds random noise to the parameters of a model for the purpose of model poisoning.

    This function modifies the model's parameters by injecting noise according to the specified
    noise type and ratio. Various types of noise can be applied, including salt noise, Gaussian
    noise, and salt-and-pepper noise.

    Args:
        model (OrderedDict): The model's parameters organized as an `OrderedDict`. Each key corresponds
                             to a layer, and each value is a tensor representing the parameters of that layer.
        poisoned_ratio (float): The proportion of noise to apply, expressed as a fraction (0 <= poisoned_ratio <= 1).
        noise_type (str, optional): The type of noise to apply to the model parameters. Supported types are:
                                    - "salt": Applies salt noise, replacing random elements with 1.
                                    - "gaussian": Applies Gaussian-distributed additive noise.
                                    - "s&p": Applies salt-and-pepper noise, replacing random elements with either 1 or low_val.
                                    Default is "gaussian".

    Returns:
        OrderedDict: A new `OrderedDict` containing the model parameters with noise added.

    Notes:
        - If a layer's tensor is a single point (0-dimensional), it is reshaped for
          processing and restored to a scalar afterwards.
        - An unsupported `noise_type` is reported as an error and the original tensor
          is retained unchanged; no exception is raised.
    """
    poisoned_model = OrderedDict()
    if not isinstance(noise_type, str):
        # Tolerate noise_type arriving as a one-element sequence.
        noise_type = noise_type[0]

    for layer in model:
        bt = model[layer]
        t = bt.detach().clone()
        single_point = False
        if len(t.shape) == 0:
            t = t.view(-1)
            single_point = True
        if noise_type == "salt":
            # Replaces random pixels with 1.
            poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))
        elif noise_type == "gaussian":
            # Gaussian-distributed additive noise.
            poisoned = torch.tensor(random_noise(t, mode=noise_type, mean=0, var=poisoned_ratio, clip=True))
        elif noise_type == "s&p":
            # Replaces random pixels with either 1 or low_val, where low_val is 0 for unsigned images or -1 for signed images.
            poisoned = torch.tensor(random_noise(t, mode=noise_type, amount=poisoned_ratio))
        else:
            logging.error("Poison attack type '%s' not supported; layer left unchanged.", noise_type)
            poisoned = t
        if single_point:
            poisoned = poisoned[0]
        poisoned_model[layer] = poisoned

    return poisoned_model

model_attack(received_weights)

Applies the model poisoning attack by modifying the received model weights.

Parameters:

  • received_weights (any, required): The aggregated model weights to be poisoned.

Returns:

  • any: The modified model weights after applying the poisoning attack.
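As an end-to-end sketch (hedged: the engine object and received_weights are supplied by the surrounding aggregation pipeline and are assumed here):

# Hypothetical wiring; `engine` comes from the framework.
attack = ModelPoisonAttack(
    engine,
    attack_params={
        "poisoned_ratio": 0.2,
        "noise_type": "s&p",
        "round_start_attack": 2,
        "round_stop_attack": 8,
    },
)

# received_weights is the aggregated state dict arriving at this node.
poisoned_weights = attack.model_attack(received_weights)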

Source code in nebula/addons/attacks/model/modelpoison.py
def model_attack(self, received_weights):
    """
    Applies the model poisoning attack by modifying the received model weights.

    Args:
        received_weights (any): The aggregated model weights to be poisoned.

    Returns:
        any: The modified model weights after applying the poisoning attack.
    """
    logging.info("[ModelPoisonAttack] Performing model poison attack")
    received_weights = self.modelPoison(received_weights, self.poisoned_ratio, self.noise_type)
    return received_weights