ART Evasion Attacks
Note
You can find a basic example notebook here.
Base Wrapper Classes
The base classes do not implement any attack. The ART evasion attack wrappers inherit from the BaseEvasionAttack or BasePatchAttack class and have the same attributes.
class pepr.robustness.art_wrapper.BaseEvasionAttack(attack_alias, use_labels, data, labels, attack_indices_per_target, target_models, art_attacks, pars_descriptors)

Base ART attack class implementing the logic for running an evasion attack and generating a report.
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- use_labels (bool) – If true, the true labels are passed to the generate function. Set to true if targeted is true, i.e. for a targeted attack.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- attack_indices_per_target (numpy.ndarray) – Array of indices to attack per target model.
- target_models (iterable) – List of target models which should be tested.
- art_attacks (list(art.attacks.Attack)) – List of ART attack objects per target model which are wrapped in this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their description shown in the attack report. Example: {“norm”: “Adversarial perturbation norm”} for the attribute named “norm” of FastGradientMethod.
Attributes:
- attack_alias (str) – Alias for a specific instantiation of the class.
- use_labels (bool) – If true, the true labels are passed to the generate function. Set to true if targeted is true, i.e. for a targeted attack.
- data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
- labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
- target_models (iterable) – List of target models which should be tested.
- attack_indices_per_target (numpy.ndarray) – Array of indices to attack per target model.
- art_attacks (list(art.attacks.Attack)) – List of ART attack objects per target model which are wrapped in this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their description shown in the attack report. Example: {"norm": "Adversarial perturbation norm"} for the attribute named "norm" of FastGradientMethod.
- attack_results (dict) – Dictionary storing the attack model results:
- adversarial_examples (list): Array of adversarial examples per target model.
- success_rate (list): Percentage of misclassified adversarial examples per target model.
- avg_l2_distance (list): Average euclidean distance (L2 norm) between original and perturbed images per target model.
- success_rate_list (list): Percentage of misclassified adversarial examples per target model and per class.
- l2_distance (list): Euclidean distance (L2 norm) between original and perturbed images for every image per target model.
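As a hedged illustration (using only the documented attack_results keys; the variable attack is a placeholder for any wrapper instance whose attack has already been executed), the results could be inspected like this:

    # `attack` is assumed to be an already executed BaseEvasionAttack subclass
    # instance; only the documented attack_results keys are used below.
    results = attack.attack_results

    for i in range(len(results["success_rate"])):
        print(f"Target model {i}:")
        print(f"  success rate:     {results['success_rate'][i]}")
        print(f"  avg. L2 distance: {results['avg_l2_distance'][i]}")
        # Per-class success rates for this target model
        print(f"  per-class rates:  {results['success_rate_list'][i]}")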
class pepr.robustness.art_wrapper.BasePatchAttack(attack_alias, data, labels, attack_indices_per_target, target_models, art_attacks, pars_descriptors)

Base ART attack class implementing the logic for creating an adversarial patch, applying it to generate adversarial examples, and generating a report.
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- attack_indices_per_target (numpy.ndarray) – Array of indices to attack per target model.
- target_models (iterable) – List of target models which should be tested.
- art_attacks (list(art.attacks.Attack)) – List of ART attack objects per target model which are wrapped in this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their description shown in the attack report. Example: {“norm”: “Adversarial perturbation norm”} for the attribute named “norm” of FastGradientMethod.
Attributes:
- attack_alias (str) – Alias for a specific instantiation of the class.
- data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
- labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
- target_models (iterable) – List of target models which should be tested.
- attack_indices_per_target (numpy.ndarray) – Array of indices to attack per target model.
- art_attacks (list(art.attacks.Attack)) – List of ART attack objects per target model which are wrapped in this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their description shown in the attack report. Example: {"norm": "Adversarial perturbation norm"} for the attribute named "norm" of FastGradientMethod.
- attack_results (dict) – Dictionary storing the attack model results:
- adversarial_examples (list): Array of adversarial examples per target model.
- success_rate (list): Percentage of misclassified adversarial examples per target model.
- avg_l2_distance (list): Average euclidean distance (L2 norm) between original and perturbed images per target model.
- success_rate_list (list): Percentage of misclassified adversarial examples per target model and per class.
- l2_distance (list): Euclidean distance (L2 norm) between original and perturbed images for every image per target model.
ART Evasion Attack Wrappers
- AdversarialPatch – art.attacks.evasion.AdversarialPatch wrapper class.
- AutoAttack – art.attacks.evasion.AutoAttack wrapper class.
- AutoProjectedGradientDescent – art.attacks.evasion.AutoProjectedGradientDescent wrapper class.
- BoundaryAttack – art.attacks.evasion.BoundaryAttack wrapper class.
- BrendelBethgeAttack – art.attacks.evasion.BrendelBethgeAttack wrapper class.
- CarliniL2Method – art.attacks.evasion.CarliniL2Method wrapper class.
- CarliniLInfMethod – art.attacks.evasion.CarliniLInfMethod wrapper class.
- DeepFool – art.attacks.evasion.DeepFool wrapper class.
- ElasticNet – art.attacks.evasion.ElasticNet wrapper class.
- FastGradientMethod – art.attacks.evasion.FastGradientMethod wrapper class.
- FeatureAdversaries – art.attacks.evasion.FeatureAdversaries wrapper class.
- FrameSaliencyAttack – art.attacks.evasion.FrameSaliencyAttack wrapper class.
- HopSkipJump – art.attacks.evasion.HopSkipJump wrapper class.
- BasicIterativeMethod – art.attacks.evasion.BasicIterativeMethod wrapper class.
- ProjectedGradientDescent – art.attacks.evasion.ProjectedGradientDescent wrapper class.
- NewtonFool – art.attacks.evasion.NewtonFool wrapper class.
- PixelAttack – art.attacks.evasion.PixelAttack wrapper class.
- ThresholdAttack – art.attacks.evasion.ThresholdAttack wrapper class.
- SaliencyMapMethod – art.attacks.evasion.SaliencyMapMethod wrapper class.
- SimBA – art.attacks.evasion.SimBA wrapper class.
- SpatialTransformation – art.attacks.evasion.SpatialTransformation wrapper class.
- SquareAttack – art.attacks.evasion.SquareAttack wrapper class.
- TargetedUniversalPerturbation – art.attacks.evasion.TargetedUniversalPerturbation wrapper class.
- UniversalPerturbation – art.attacks.evasion.UniversalPerturbation wrapper class.
- VirtualAdversarialMethod – art.attacks.evasion.VirtualAdversarialMethod wrapper class.
- ZooAttack – art.attacks.evasion.ZooAttack wrapper class.
Note
PePR only supports ART attacks that can handle the KerasClassifier and image input. The Imperceptible ASR attack, for example, is not supported because it expects a speech recognition estimator.
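All wrapper classes below share the constructor signature (attack_alias, attack_pars, data, labels, data_conf, target_models). The following minimal sketch shows how a wrapper such as FastGradientMethod might be instantiated; the file names, the Keras model path and the parameter values are illustrative assumptions, not part of this reference:

    import numpy as np
    from tensorflow import keras

    from pepr.robustness import art_wrapper

    # Hypothetical pentesting setup: images in the models' input range plus labels.
    data = np.load("images.npy")        # assumed file, shape (N, H, W, C)
    labels = np.load("labels.npy")      # assumed file, shape (N,)
    target_models = [keras.models.load_model("target.h5")]  # assumed Keras model

    # Attack the first 100 records on every target model.
    data_conf = {
        "attack_indices_per_target": np.array(
            [np.arange(100) for _ in target_models]
        )
    }

    # attack_pars is forwarded to art.attacks.evasion.FastGradientMethod.
    attack_pars = {
        "eps": 0.1,          # maximum perturbation
        "norm": np.inf,      # L_inf perturbation norm
        "use_labels": True,  # pass the true labels to the attack
    }

    fgm = art_wrapper.FastGradientMethod(
        "FGM eps=0.1", attack_pars, data, labels, data_conf, target_models
    )

How the instantiated wrapper is executed and how the report is generated is shown in the example notebook linked above; after execution, the metrics listed under attack_results are available on the wrapper instance.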
class pepr.robustness.art_wrapper.AdversarialPatch(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.AdversarialPatch wrapper class.
Attack description: Implementation of the adversarial patch attack for square and rectangular images and videos.
Paper link: https://arxiv.org/abs/1712.09665
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- rotation_max (float): (optional) The maximum rotation applied to random patches. The value is expected to be in the range [0, 180].
- scale_min (float): (optional) The minimum scaling applied to random patches. The value should be in the range [0, 1], but less than scale_max.
- scale_max (float): (optional) The maximum scaling applied to random patches. The value should be in the range [0, 1], but larger than scale_min.
- learning_rate (float): (optional) The learning rate of the optimization.
- max_iter (int): (optional) The number of optimization steps.
- batch_size (int): (optional) The size of the training batch.
- patch_shape: (optional) The shape of the adversarial patch as a tuple of shape (width, height, nb_channels). Currently only supported for TensorFlowV2Classifier. For classifiers of other frameworks the patch_shape is set to the shape of the input samples.
- verbose (bool): (optional) Show progress bars.
- gen_mask (numpy.ndarray): (optional) A boolean array of shape equal to the shape of a single sample (1, H, W) or the shape of x (N, H, W) without their channel dimensions. Any features for which the mask is True can be the center location of the patch during sampling.
- gen_reset_patch (bool): (optional) If True, reset the patch to its initial value (the mean of the minimal and maximal clip value). If False (default), restart from the patch values created by a previous call to generate, or from the mean of the minimal and maximal clip value on the first call to generate.
- apply_scale (float): Scale of the applied patch in relation to the classifier input shape.
- apply_patch_external: (optional) External patch to apply to the images.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
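A sketch of one possible configuration, reusing the hypothetical data, labels, data_conf and target_models from the earlier FastGradientMethod example; all values are illustrative only:

    # Illustrative parameter choices; all keys are documented above.
    patch_pars = {
        "rotation_max": 22.5,   # maximum rotation of random patches (degrees)
        "scale_min": 0.1,       # minimum patch scaling
        "scale_max": 1.0,       # maximum patch scaling
        "learning_rate": 5.0,   # optimization learning rate
        "max_iter": 500,        # optimization steps
        "batch_size": 16,
        "apply_scale": 0.4,     # patch size relative to the classifier input
        "verbose": True,
    }

    patch_attack = art_wrapper.AdversarialPatch(
        "Adversarial Patch", patch_pars, data, labels, data_conf, target_models
    )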
class pepr.robustness.art_wrapper.AutoAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.AutoAttack wrapper class.
Attack description: Implementation of the AutoAttack attack.
Paper link: https://arxiv.org/abs/2003.01690
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation. Possible values: “inf”, np.inf, 1 or 2.
- eps (float): (optional) Maximum perturbation that the attacker can introduce.
- eps_step (float): (optional) Attack step size (input variation) at each iteration.
- targeted (bool): (optional) If False run only untargeted attacks, if True also run targeted attacks against each possible target.
- estimator_orig: (optional) Original estimator to be attacked by adversarial examples.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- attacks (list): (optional) The list of art.attacks.EvasionAttack attacks to be used for AutoAttack. If it is None or empty, the standard attacks (PGD, APGD-ce, APGD-dlr, DeepFool, Square) will be used.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.AutoProjectedGradientDescent(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.AutoProjectedGradientDescent wrapper class.
Attack description: Implementation of the Auto Projected Gradient Descent attack.
Paper link: https://arxiv.org/abs/2003.01690
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation. Possible values: “inf”, np.inf, 1 or 2.
- eps (float): (optional) Maximum perturbation that the attacker can introduce.
- eps_step (float): (optional) Attack step size (input variation) at each iteration.
- max_iter (int): (optional) The maximum number of iterations.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- nb_random_init (int): (optional) Number of random initialisations within the epsilon ball. With nb_random_init=0, the attack starts at the original input.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- loss_type: Defines the loss to attack. Available options: None (Use loss defined by estimator), “cross_entropy”, or “difference_logits_ratio”.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.BoundaryAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.BoundaryAttack wrapper class.
Attack description: Implementation of the boundary attack from Brendel et al. (2018). This is a powerful black-box attack that only requires final class prediction.
Paper link: https://arxiv.org/abs/1712.04248
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- batch_size (int): (optional) The size of the batch used by the estimator during inference.
- targeted (bool): (optional) Should the attack target one specific class.
- delta (float): (optional) Initial step size for the orthogonal step.
- epsilon (float): (optional) Initial step size for the step towards the target.
- step_adapt (float): (optional) Factor by which the step sizes are multiplied or divided, must be in the range (0, 1).
- max_iter (int): (optional) Maximum number of iterations.
- num_trial (int): (optional) Maximum number of trials per iteration.
- sample_size (int): (optional) Number of samples per trial.
- init_size (int): (optional) Maximum number of trials for initial generation of adversarial examples.
- min_epsilon (float): (optional) Stop attack if perturbation is smaller than min_epsilon.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.BrendelBethgeAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.BrendelBethgeAttack wrapper class.
Attack description: Base class for the Brendel & Bethge adversarial attack, a powerful gradient-based adversarial attack that follows the adversarial boundary (the boundary between the space of adversarial and non-adversarial images as defined by the adversarial criterion) to find the minimum distance to the clean image.
This implementation of the Brendel & Bethge attack follows the reference implementation at https://github.com/bethgelab/foolbox/blob/master/foolbox/attacks/brendel_bethge.py.
The implementation differs from the attack used in the paper in two ways:
- The initial binary search always uses the full 10 steps (for ease of implementation).
- The adaptation of the trust region over the course of the optimisation is less greedy but more robust, reliable and simpler (decay every K steps).
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation. Possible values: “inf”, np.inf, 1 or 2.
- targeted (bool): (optional) Flag determining if attack is targeted.
- overshoot (float): (optional) If 1 the attack tries to return exactly to the adversarial boundary in each iteration. For higher values the attack tries to overshoot over the boundary to ensure that the perturbed sample in each iteration is adversarial.
- steps (int): (optional) Maximum number of iterations to run. Might converge and stop before that.
- lr (float): (optional) Trust region radius, behaves similar to a learning rate. Smaller values decrease the step size in each iteration and ensure that the attack follows the boundary more faithfully.
- lr_decay (float): (optional) The trust region lr is multiplied with lr_decay in regular intervals (see lr_num_decay).
- lr_num_decay (int): (optional) Number of learning rate decays in regular intervals of length steps / lr_num_decay.
- momentum (float): (optional) Averaging of the boundary estimation over multiple steps. A momentum of zero would always take the current estimate while values closer to one average over a larger number of iterations.
- binary_search_steps (int): (optional) Number of binary search steps used to find the adversarial boundary between the starting point and the clean image.
- batch_size (int): (optional) Batch size for evaluating the model for predictions and gradients.
- init_size (int): (optional) Maximum number of random search steps to find initial adversarial example.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.CarliniL2Method(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.CarliniL2Method wrapper class.
Attack description: The L_2 optimized attack of Carlini and Wagner (2016). This attack is among the most effective and should be used among the primary attacks to evaluate potential defences. A major difference with respect to the original implementation (https://github.com/carlini/nn_robust_attacks) is that we use line search in the optimization of the attack objective.
Paper link: https://arxiv.org/abs/1608.04644
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- confidence (float): (optional) Confidence of adversarial examples: a higher value produces examples that are farther away from the original input but classified with higher confidence as the target class.
- targeted (bool): (optional) Should the attack target one specific class.
- learning_rate (float): (optional) The initial learning rate for the attack algorithm. Smaller values produce better results but are slower to converge.
- binary_search_steps (int): (optional) Number of times to adjust constant with binary search (positive value). If binary_search_steps is large, then the algorithm is not very sensitive to the value of initial_const. Note that the values gamma=0.999999 and c_upper=10e10 are hardcoded with the same values used by the authors of the method.
- max_iter (int): (optional) The maximum number of iterations.
- initial_const (float): (optional) The initial trade-off constant c to use to tune the relative importance of distance and confidence. If binary_search_steps is large, the initial constant is not important, as discussed in Carlini and Wagner (2016).
- max_halving (int): (optional) Maximum number of halving steps in the line search optimization.
- max_doubling (int): (optional) Maximum number of doubling steps in the line search optimization.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
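Because the C&W L_2 attack is often run as a targeted attack, the sketch below pairs targeted with use_labels as described for the base class; the values are illustrative and the remaining variables follow the hypothetical setup from the FastGradientMethod example:

    # Illustrative targeted C&W L_2 configuration.
    cw_pars = {
        "confidence": 0.5,          # higher values give higher-confidence misclassifications
        "targeted": True,           # attack towards specific target classes
        "use_labels": True,         # pass the labels to the attack as target labels
        "binary_search_steps": 10,  # search for the trade-off constant c
        "max_iter": 100,
        "batch_size": 32,
        "verbose": True,
    }

    cw_attack = art_wrapper.CarliniL2Method(
        "C&W L2 targeted", cw_pars, data, labels, data_conf, target_models
    )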
class pepr.robustness.art_wrapper.CarliniLInfMethod(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.CarliniLInfMethod wrapper class.
Attack description: This is a modified version of the L_2 optimized attack of Carlini and Wagner (2016). It controls the L_Inf norm, i.e. the maximum perturbation applied to each pixel.
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- confidence (float): (optional) Confidence of adversarial examples: a higher value produces examples that are farther away from the original input but classified with higher confidence as the target class.
- targeted (bool): (optional) Should the attack target one specific class.
- learning_rate (float): (optional) The initial learning rate for the attack algorithm. Smaller values produce better results but are slower to converge.
- max_iter (int): (optional) The maximum number of iterations.
- max_halving (int): (optional) Maximum number of halving steps in the line search optimization.
- max_doubling (int): (optional) Maximum number of doubling steps in the line search optimization.
- eps (float): (optional) An upper bound for the L_Inf norm of the adversarial perturbation.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.DeepFool(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.DeepFool wrapper class.
Attack description: Implementation of the attack from Moosavi-Dezfooli et al. (2015).
Paper link: https://arxiv.org/abs/1511.04599
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- max_iter (int): (optional) The maximum number of iterations.
- epsilon (float): (optional) Overshoot parameter.
- nb_grads (int): (optional) The number of class gradients (top nb_grads w.r.t. prediction) to compute. This way only the most likely classes are considered, speeding up the computation.
- batch_size (int): (optional) Batch size.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.ElasticNet(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.ElasticNet wrapper class.
Attack description: The elastic net attack of Pin-Yu Chen et al. (2018).
Paper link: https://arxiv.org/abs/1709.04114
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- confidence (float): (optional) Confidence of adversarial examples: a higher value produces examples that are farther away from the original input but classified with higher confidence as the target class.
- targeted (bool): (optional) Should the attack target one specific class.
- learning_rate (float): (optional) The initial learning rate for the attack algorithm. Smaller values produce better results but are slower to converge.
- binary_search_steps (int): (optional) Number of times to adjust constant with binary search (positive value).
- max_iter (int): (optional) The maximum number of iterations.
- beta (float): (optional) Hyperparameter trading off L2 minimization for L1 minimization.
- initial_const (float): (optional) The initial trade-off constant c to use to tune the relative importance of distance and confidence. If binary_search_steps is large, the initial constant is not important, as discussed in Carlini and Wagner (2016).
- batch_size (int): (optional) Internal size of batches on which adversarial samples are generated.
- decision_rule (str): (optional) Decision rule. ‘EN’ means Elastic Net rule, ‘L1’ means L1 rule, ‘L2’ means L2 rule.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.FastGradientMethod(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.FastGradientMethod wrapper class.
Attack description: This attack was originally implemented by Goodfellow et al. (2015) with the infinity norm (and is known as the “Fast Gradient Sign Method”). This implementation extends the attack to other norms, and is therefore called the Fast Gradient Method.
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation. Possible values: “inf”, np.inf, 1 or 2.
- eps (float): (optional) Attack step size (input variation).
- eps_step (float): (optional) Step size of input variation for minimal perturbation computation.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- num_random_init (int): (optional) Number of random initialisations within the epsilon ball. With num_random_init=0, the attack starts at the original input.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- minimal (bool): (optional) Indicates if computing the minimal perturbation (True). If True, also define eps_step for the step size and eps for the maximum perturbation.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.FeatureAdversaries(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.FeatureAdversaries wrapper class.
Attack description: This class represents the Feature Adversaries evasion attack.
Paper link: https://arxiv.org/abs/1511.05122
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- delta: (optional) The maximum deviation between source and guide images.
- layer: (optional) Index of the representation layer.
- batch_size (int): (optional) Batch size.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.FrameSaliencyAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.FrameSaliencyAttack wrapper class.
Attack description: Implementation of the attack framework proposed by Inkawhich et al. (2018). Prioritizes the frame of a sequential input to be adversarially perturbed based on the saliency score of each frame.
Paper link: https://arxiv.org/abs/1811.11875
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- attacker (EvasionAttack): (optional) An adversarial evasion attacker which supports masking. Currently supported: ProjectedGradientDescent, BasicIterativeMethod, FastGradientMethod.
- method (str): (optional) Specifies which method to use: “iterative_saliency” (adds perturbation iteratively to frame with highest saliency score until attack is successful), “iterative_saliency_refresh” (updates perturbation after each iteration), “one_shot” (adds all perturbations at once, i.e. defaults to original attack).
- frame_index (int): (optional) Index of the axis in input (feature) array x representing the frame dimension.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.HopSkipJump(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.HopSkipJump wrapper class.
Attack description: Implementation of the HopSkipJump attack from Jianbo et al. (2019). This is a powerful black-box attack that only requires final class prediction, and is an advanced version of the boundary attack.
Paper link: https://arxiv.org/abs/1904.02144
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- batch_size (int): (optional) The size of the batch used by the estimator during inference.
- targeted (bool): (optional) Should the attack target one specific class.
- norm: (optional) Order of the norm. Possible values: "inf", np.inf or 2.
- max_iter (int): (optional) Maximum number of iterations.
- max_eval (int): (optional) Maximum number of evaluations for estimating gradient.
- init_eval (int): (optional) Initial number of evaluations for estimating gradient.
- init_size (int): (optional) Maximum number of trials for initial generation of adversarial examples.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.BasicIterativeMethod(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.BasicIterativeMethod wrapper class.
Attack description: The Basic Iterative Method is the iterative version of FGM and FGSM.
Paper link: https://arxiv.org/abs/1607.02533
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- eps: (optional) Maximum perturbation that the attacker can introduce.
- eps_step: (optional) Attack step size (input variation) at each iteration.
- max_iter (int): (optional) The maximum number of iterations.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.ProjectedGradientDescent(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.ProjectedGradientDescent wrapper class.
Attack description: The Projected Gradient Descent attack is an iterative method in which, after each iteration, the perturbation is projected on an lp-ball of specified radius (in addition to clipping the values of the adversarial sample so that it lies in the permitted data range). This is the attack proposed by Madry et al. for adversarial training.
Paper link: https://arxiv.org/abs/1706.06083
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation supporting “inf”, np.inf, 1 or 2.
- eps: (optional) Maximum perturbation that the attacker can introduce.
- eps_step: (optional) Attack step size (input variation) at each iteration.
- random_eps (bool): (optional) When True, epsilon is drawn randomly from truncated normal distribution. The literature suggests this for FGSM based training to generalize across different epsilons. eps_step is modified to preserve the ratio of eps / eps_step. The effectiveness of this method with PGD is untested (https://arxiv.org/pdf/1611.01236.pdf).
- max_iter (int): (optional) The maximum number of iterations.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- num_random_init (int): (optional) Number of random initialisations within the epsilon ball. With num_random_init=0, the attack starts at the original input.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.NewtonFool(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.NewtonFool wrapper class.
Attack description: Implementation of the attack from Uyeong Jang et al. (2017).
Paper link: http://doi.acm.org/10.1145/3134600.3134635
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- max_iter (int): (optional) The maximum number of iterations.
- eta (float): (optional) The eta coefficient.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.PixelAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.PixelAttack wrapper class.
Attack description: This attack was originally implemented by Vargas et al. (2019). It is a generalisation of the One Pixel Attack originally implemented by Su et al. (2019).
One Pixel Attack paper link: https://ieeexplore.ieee.org/abstract/document/8601309/citations#citations (arXiv link: https://arxiv.org/pdf/1710.08864.pdf)
Pixel Attack paper link: https://arxiv.org/abs/1906.06026
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- th: (optional) Threshold value of the Pixel/Threshold attack. th=None indicates finding a minimum threshold.
- es (int): (optional) Indicates whether the attack uses CMAES (0) or DE (1) as Evolutionary Strategy.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- verbose (bool): (optional) Indicates whether to print verbose messages of the evolutionary strategy used.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.ThresholdAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.ThresholdAttack wrapper class.
Attack description: This attack was originally implemented by Vargas et al. (2019).
Paper link: https://arxiv.org/abs/1906.06026
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- th: (optional) Threshold value of the Pixel/Threshold attack. th=None indicates finding a minimum threshold.
- es (int): (optional) Indicates whether the attack uses CMAES (0) or DE (1) as Evolutionary Strategy.
- targeted (bool): (optional) Indicates whether the attack is targeted (True) or untargeted (False).
- verbose (bool): (optional) Indicates whether to print verbose messages of the evolutionary strategy used.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.SaliencyMapMethod(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.SaliencyMapMethod wrapper class.
Attack description: Implementation of the Jacobian-based Saliency Map Attack (Papernot et al. 2016).
Paper link: https://arxiv.org/abs/1511.07528
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- theta (float): (optional) Amount of perturbation introduced to each modified feature per step (can be positive or negative).
- gamma (float): (optional) Maximum fraction of features being perturbed (between 0 and 1).
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.SimBA(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.SimBA wrapper class.
Attack description: This class implements the black-box attack SimBA.
Paper link: https://arxiv.org/abs/1905.07121
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- attack (str): (optional) Attack type: pixel (px) or DCT (dct) attacks.
- max_iter (int): (optional) The maximum number of iterations.
- epsilon (float): (optional) Overshoot parameter.
- order (str): (optional) Order of the pixel attacks: random or diagonal (diag).
- freq_dim (int): (optional) Dimensionality of the 2D frequency space (DCT).
- stride (int): (optional) Stride for the block order (DCT).
- targeted (bool): (optional) Perform a targeted attack.
- batch_size (int): (optional) Batch size (batch processing is unavailable in this implementation).
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
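A sketch of a DCT-based SimBA configuration; the values are illustrative and the remaining variables follow the hypothetical setup from the FastGradientMethod example:

    # Illustrative black-box SimBA configuration attacking the DCT frequency space.
    simba_pars = {
        "attack": "dct",    # perturb in the DCT frequency domain instead of raw pixels
        "max_iter": 3000,
        "epsilon": 0.1,     # overshoot parameter
        "freq_dim": 4,      # dimensionality of the 2D frequency space
        "stride": 1,        # stride for the block order
        "targeted": False,
    }

    simba_attack = art_wrapper.SimBA(
        "SimBA (DCT)", simba_pars, data, labels, data_conf, target_models
    )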
class pepr.robustness.art_wrapper.SpatialTransformation(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.SpatialTransformation wrapper class.
Attack description: Implementation of the spatial transformation attack using translation and rotation of inputs. The attack conducts black-box queries to the target model in a grid search over possible translations and rotations to find optimal attack parameters.
Paper link: https://arxiv.org/abs/1712.02779
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- max_translation (float): (optional) The maximum translation in any direction as percentage of image size. The value is expected to be in the range [0, 100].
- num_translations (int): (optional) The number of translations to search on grid spacing per direction.
- max_rotation (float): (optional) The maximum rotation in either direction in degrees. The value is expected to be in the range [0, 180].
- num_rotations (int): (optional) The number of rotations to search on grid spacing.
- verbose (bool): (optional) Show progress bars.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.SquareAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.SquareAttack wrapper class.
Attack description: This class implements the SquareAttack attack.
Paper link: https://arxiv.org/abs/1912.00049
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- norm: (optional) The norm of the adversarial perturbation. Possible values: “inf”, np.inf, 1 or 2.
- max_iter (int): (optional) Maximum number of iterations.
- eps (float): (optional) Maximum perturbation that the attacker can introduce.
- p_init (float): (optional) Initial fraction of elements.
- nb_restarts (int): (optional) Number of restarts.
- batch_size (int): (optional) Batch size for estimator evaluations.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.TargetedUniversalPerturbation(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.TargetedUniversalPerturbation wrapper class.
Attack description: Implementation of the attack from Hirano and Takemoto (2019). Computes a fixed perturbation to be applied to all future inputs. To this end, it can use any adversarial attack method.
Paper link: https://arxiv.org/abs/1911.06502
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- attacker (str): (optional) Adversarial attack name. Default is ‘deepfool’. Supported names: ‘fgsm’.
- attacker_params: (optional) Parameters specific to the adversarial attack. If this parameter is not specified, the default parameters of the chosen attack will be used.
- delta (float): (optional) Desired accuracy.
- max_iter (int): (optional) The maximum number of iterations for computing the universal perturbation.
- eps (float): (optional) Attack step size (input variation).
- norm: (optional) The norm of the adversarial perturbation. Possible values: "inf", np.inf, 2.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.UniversalPerturbation(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.UniversalPerturbation wrapper class.
Attack description: Implementation of the attack from Moosavi-Dezfooli et al. (2016). Computes a fixed perturbation to be applied to all future inputs. To this end, it can use any adversarial attack method.
Paper link: https://arxiv.org/abs/1610.08401
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- attacker (str): (optional) Adversarial attack name. Default is 'deepfool'. Supported names: 'carlini', 'carlini_inf', 'deepfool', 'fgsm', 'bim', 'pgd', 'margin', 'ead', 'newtonfool', 'jsma', 'vat', 'simba'.
- attacker_params: (optional) Parameters specific to the adversarial attack. If this parameter is not specified, the default parameters of the chosen attack will be used.
- delta (float): (optional) Desired accuracy.
- max_iter (int): (optional) The maximum number of iterations for computing the universal perturbation.
- eps (float): (optional) Attack step size (input variation).
- norm: (optional) The norm of the adversarial perturbation. Possible values: "inf", np.inf, 2.
- batch_size (int): (optional) Batch size for model evaluations in UniversalPerturbation.
- verbose (bool): (optional) Show progress bars.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
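UniversalPerturbation delegates the per-sample perturbation to an inner attack; the sketch below shows how attacker and attacker_params might be combined, with illustrative values and the remaining variables as in the hypothetical setup from the FastGradientMethod example:

    # Illustrative configuration: build the universal perturbation with DeepFool.
    up_pars = {
        "attacker": "deepfool",               # inner adversarial attack
        "attacker_params": {"max_iter": 50},  # parameters forwarded to the inner attack
        "delta": 0.2,                         # desired accuracy (see above)
        "max_iter": 20,                       # iterations over the dataset
        "eps": 10.0,                          # attack step size
        "norm": np.inf,
        "verbose": True,
    }

    up_attack = art_wrapper.UniversalPerturbation(
        "Universal Perturbation", up_pars, data, labels, data_conf, target_models
    )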
class pepr.robustness.art_wrapper.VirtualAdversarialMethod(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.VirtualAdversarialMethod wrapper class.
Attack description: This attack was originally proposed by Miyato et al. (2016) and was used for virtual adversarial training.
Paper link: https://arxiv.org/abs/1507.00677
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- eps (float): (optional) Attack step (max input variation).
- finite_diff (float): (optional) The finite difference parameter.
- max_iter (int): (optional) The maximum number of iterations.
- batch_size (int): (optional) Size of the batch on which adversarial samples are generated.
- verbose (bool): (optional) Show progress bars.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.
class pepr.robustness.art_wrapper.ZooAttack(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.evasion.ZooAttack wrapper class.
Attack description: The black-box zeroth-order optimization attack from Pin-Yu Chen et al. (2018). This attack is a variant of the C&W attack which uses ADAM coordinate descent to perform numerical estimation of gradients.
Paper link: https://arxiv.org/abs/1708.03999
Parameters: - attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) –
Dictionary containing all needed attack parameters:
- confidence (float): (optional) Confidence of adversarial examples: a higher value produces examples that are farther away from the original input but classified with higher confidence as the target class.
- targeted (bool): (optional) Should the attack target one specific class.
- learning_rate (float): (optional) The initial learning rate for the attack algorithm. Smaller values produce better results but are slower to converge.
- max_iter (int): (optional) The maximum number of iterations.
- binary_search_steps (int): (optional) Number of times to adjust constant with binary search (positive value).
- initial_const (float): (optional) The initial trade-off constant c to use to tune the relative importance of distance and confidence. If binary_search_steps is large, the initial constant is not important, as discussed in Carlini and Wagner (2016).
- abort_early (bool): (optional) True if gradient descent should be abandoned when it gets stuck.
- use_resize (bool): (optional) True to use the resizing strategy from the paper: first, compute the attack on inputs resized to 32x32, then increase the size if needed to 64x64, followed by 128x128.
- use_importance (bool): (optional) True to use importance sampling when choosing coordinates to update.
- nb_parallel (int): (optional) Number of coordinate updates to run in parallel. A higher value for nb_parallel should be preferred over a large batch size.
- batch_size (int): (optional) Internal size of batches on which adversarial samples are generated. Small batch sizes are encouraged for ZOO, as the algorithm already runs nb_parallel coordinate updates in parallel for each sample. The batch size is a multiplier of nb_parallel in terms of memory consumption.
- variable_h (float): (optional) Step size for numerical estimation of derivatives.
- verbose (bool): (optional) Show progress bars.
- use_labels (bool): (optional) If true, the true labels are passed to the attack as target labels.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) –
Dictionary describing for every target model which record-indices should be used for the attack.
- attack_indices_per_target (numpy.ndarray): Array of indices of images to attack per target model.
- target_models (iterable) – List of target models which should be tested.