# datapoison
## add_x_to_image(img)
Adds a 10x10 pixel 'X' mark to the top-left corner of an image.
This function modifies the input image by setting specific pixels in the top-left 10x10 region to a high intensity value, forming an 'X' shape. Pixels on or below the main diagonal and above the secondary diagonal are set to 255 (white).
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `img` | array-like | A 2D array or image tensor representing pixel values. It is expected to be grayscale, where each pixel has a single intensity value. | required |
Returns:

| Type | Description |
| --- | --- |
| `torch.Tensor` | A tensor representation of the modified image with the 'X' mark. |
Source code in `nebula/addons/attacks/poisoning/datapoison.py`, lines 117-138.
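Since the collapsed source is not reproduced here, the following is a minimal usage sketch. The blank 28x28 image is an illustrative assumption (e.g., an MNIST-sized sample); only the documented signature and return type are taken from the reference above.

```python
import numpy as np

from nebula.addons.attacks.poisoning.datapoison import add_x_to_image

# Illustrative 28x28 grayscale image (all zeros = black).
img = np.zeros((28, 28), dtype=np.uint8)

# Returns a torch.Tensor with the 10x10 'X' mark in the top-left corner.
marked = add_x_to_image(img)
print(marked.shape)            # torch.Size([28, 28])
print(marked[:10, :10].max())  # marked pixels are set to 255 (white)
```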
## apply_noise(t, noise_type, poisoned_ratio)
Applies noise to a tensor based on the specified noise type and poisoning ratio.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `t` | `Tensor` | The input tensor to which noise will be applied. | required |
| `noise_type` | `str` | The type of noise to apply. Supported types: "salt" (salt-and-pepper noise with only 'salt'); "gaussian" (Gaussian noise with mean 0 and the specified variance); "s&p" (salt-and-pepper noise); "nlp_rawdata" (applies a custom NLP raw-data poisoning function). | required |
| `poisoned_ratio` | `float` | The ratio or variance of the noise to be applied, depending on the noise type. | required |
Returns:

| Type | Description |
| --- | --- |
| `torch.Tensor` | The tensor with noise applied. If the noise type is not supported, the original tensor is returned and an error message is printed. |
Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the specified `noise_type` is not supported. |
Notes

- The "nlp_rawdata" noise type requires the custom `poison_to_nlp_rawdata` function.
- Noise for the "salt", "gaussian", and "s&p" types is generated using `random_noise` from the `skimage.util` package and returned as a `torch.Tensor`.
Source code in `nebula/addons/attacks/poisoning/datapoison.py`, lines 9-44.
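As a concrete illustration of the dispatch described in the notes, here is a minimal sketch (not the module's exact source) that generates the three image-noise types with `skimage.util.random_noise` and wraps the result in a `torch.Tensor`. The "nlp_rawdata" branch, which delegates to `poison_to_nlp_rawdata`, is omitted.

```python
import torch
from skimage.util import random_noise


def apply_noise_sketch(t: torch.Tensor, noise_type: str, poisoned_ratio: float) -> torch.Tensor:
    """Sketch of the behavior documented above; the real implementation may differ."""
    # random_noise works on ndarrays; assumes pixel values already scaled to [0, 1].
    arr = t.detach().cpu().numpy()
    if noise_type == "salt":
        # Salt-and-pepper noise with only 'salt'; poisoned_ratio is the amount.
        return torch.tensor(random_noise(arr, mode="salt", amount=poisoned_ratio))
    if noise_type == "gaussian":
        # Gaussian noise with mean 0; poisoned_ratio is used as the variance.
        return torch.tensor(random_noise(arr, mode="gaussian", mean=0, var=poisoned_ratio, clip=True))
    if noise_type == "s&p":
        return torch.tensor(random_noise(arr, mode="s&p", amount=poisoned_ratio))
    # Unsupported types fall through: print an error and return the input unchanged.
    print(f"ERROR: noise type '{noise_type}' not supported")
    return t
```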
## datapoison(dataset, indices, poisoned_percent, poisoned_ratio, targeted=False, target_label=3, noise_type='salt')
Adds noise to a specified portion of a dataset for data poisoning purposes.
This function poisons randomly selected samples within a dataset. Poisoning can be targeted or non-targeted: in non-targeted poisoning, random samples are chosen and altered using the specified noise type and ratio; in targeted poisoning, only samples with a specified label are altered, by adding an 'X' pattern.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `Dataset` | The dataset to poison, expected to expose its samples and labels for modification. | required |
| `indices` | list of `int` | The list of indices in the dataset to consider for poisoning. | required |
| `poisoned_percent` | `float` | The percentage of `indices` to poison. | required |
| `poisoned_ratio` | `float` | The intensity or probability parameter for the noise, depending on the noise type. | required |
| `targeted` | `bool` | If True, applies targeted poisoning by adding an 'X' only to samples with the label `target_label`. | `False` |
| `target_label` | `int` | The label to target when `targeted` is True. | `3` |
| `noise_type` | `str` | The type of noise to apply in non-targeted poisoning. Supported types: "salt", "gaussian", and "s&p". | `'salt'` |
Returns:

| Type | Description |
| --- | --- |
| `Dataset` | A deep copy of the original dataset with poisoned data in the specified indices. |
Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the specified `noise_type` is not supported. |
Notes

- Non-targeted poisoning randomly selects samples from `indices` based on `poisoned_percent`.
- Targeted poisoning modifies only samples with `target_label` by adding an 'X' pattern, regardless of `poisoned_ratio`.
Source code in `nebula/addons/attacks/poisoning/datapoison.py`, lines 47-114.
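A hypothetical end-to-end usage sketch follows. The MNIST dataset and the interpretation of `poisoned_percent` as a fraction are assumptions made for illustration, not taken from the reference above.

```python
from torchvision import datasets, transforms

from nebula.addons.attacks.poisoning.datapoison import datapoison

trainset = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
indices = list(range(len(trainset)))

# Non-targeted: salt noise applied to a share of the listed indices.
# poisoned_percent is taken as a fraction here; adjust if your build expects 0-100.
poisoned = datapoison(trainset, indices,
                      poisoned_percent=0.1, poisoned_ratio=0.05,
                      noise_type="salt")

# Targeted: stamp the 'X' mark on samples labeled 3 (the default target_label).
backdoored = datapoison(trainset, indices,
                        poisoned_percent=0.1, poisoned_ratio=0.05,
                        targeted=True, target_label=3)
```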
## poison_to_nlp_rawdata(text_data, poisoned_ratio)
Poisons NLP data by setting word vectors to zero with a given probability.
This function randomly selects a portion of non-zero word vectors in the input text data and sets them to zero vectors based on the specified poisoning ratio. This simulates a form of data corruption by partially nullifying the information in the input data.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `text_data` | list of `torch.Tensor` | A list where each entry is a tensor representing a word vector. Non-zero vectors are assumed to represent valid words. | required |
| `poisoned_ratio` | `float` | The fraction of non-zero word vectors to set to zero, where 0 <= poisoned_ratio <= 1. | required |
Returns:

| Type | Description |
| --- | --- |
| list of `torch.Tensor` | The modified text data with some word vectors set to zero. |
Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `poisoned_ratio` is greater than 1 or less than 0. |
Notes

- `poisoned_ratio` controls the percentage of non-zero vectors to poison.
- If `num_poisoned_token` is zero or exceeds the number of non-zero vectors, the function returns the original `text_data` without modification.
Source code in `nebula/addons/attacks/poisoning/datapoison.py`, lines 141-180.
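To make the notes concrete, here is a minimal sketch of the zeroing step, assuming each entry of `text_data` is a 1-D `torch.Tensor` word vector. It follows the documented behavior but is not necessarily the module's exact source.

```python
import random

import torch


def poison_to_nlp_rawdata_sketch(text_data, poisoned_ratio):
    """Zero out a poisoned_ratio fraction of the non-zero word vectors."""
    # Indices of vectors that carry information (non-zero = valid word).
    nonzero_idx = [i for i, vec in enumerate(text_data) if torch.any(vec != 0)]
    num_poisoned_token = int(len(nonzero_idx) * poisoned_ratio)
    # As documented: if nothing (or more than everything) would be poisoned,
    # return the input unchanged.
    if num_poisoned_token == 0 or num_poisoned_token > len(nonzero_idx):
        return text_data
    # Randomly null out the selected word vectors.
    for i in random.sample(nonzero_idx, num_poisoned_token):
        text_data[i] = torch.zeros_like(text_data[i])
    return text_data
```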