ART Extraction Attacks

Base Wrapper Class
The base class does not implement any attack itself. All ART extraction attack wrappers inherit from the BaseExtractionAttack class and share its attributes.
class pepr.privacy.art_extraction_wrapper.BaseExtractionAttack(attack_alias, attack_pars, data, labels, data_conf, target_models, extraction_attacks, pars_descriptors)

Base ART extraction attack class implementing the logic for running an extraction attack and generating a report.
Parameters:
- attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) – Extraction attack specific attack parameters:
  - stolen_models (list): List of untrained substitute models, one per target model, in which the extracted (stolen) model is stored.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) – Record indices for extraction and evaluation:
  - stolen_record_indices (np.ndarray): Indices of records to use for the extraction attack.
  - eval_record_indices (np.ndarray): Indices of records used to measure the accuracy of the extracted model.
- target_models (iterable) – List of target models which should be tested.
- extraction_attacks (list(art.attacks.Attack)) – List of ART extraction attack objects, one per target model, which are wrapped by this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their descriptions shown in the attack report. Example: {"classifier": "A victim classifier"} for the attribute named "classifier" of CopycatCNN.
Attributes:
- attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) – Extraction attack specific attack parameters (see the attack_pars parameter above).
- data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
- labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
- target_models (iterable) – List of target models which should be tested.
- data_conf (dict) – Record indices for extraction and evaluation (see the data_conf parameter above).
- extraction_attacks (list(art.attacks.Attack)) – List of ART attack objects, one per target model, which are wrapped by this class.
- pars_descriptors (dict) – Dictionary of attack parameters and their descriptions shown in the attack report. Example: {"classifier": "A victim classifier"} for the attribute named "classifier" of CopycatCNN.
- attack_results (dict) – Dictionary storing the attack results:
  - extracted_classifiers (list): List of extracted classifiers, one per target model.
  - ec_accuracy (list): Accuracy of the extracted classifiers, one entry per target model.
  - ec_accuracy_list (list): Per-class accuracy of the extracted classifiers. Shape: (target_model, class)
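The sketch below is a minimal, hypothetical example of how the documented dictionaries can be assembled before constructing one of the wrapper subclasses. The placeholder data, the build_untrained_model helper, and the substitute architecture are assumptions made for illustration and are not part of pepr; whether the substitute models must be plain Keras models or ART classifiers depends on the pepr version in use:

    import numpy as np
    import tensorflow as tf

    # Placeholder data standing in for the real attack images and labels.
    data = np.random.rand(6000, 28, 28, 1).astype(np.float32)
    labels = np.random.randint(0, 10, size=6000)

    def build_untrained_model():
        # Hypothetical substitute architecture; any untrained model matching
        # the target's input/output shape should be usable here.
        return tf.keras.Sequential([
            tf.keras.layers.Input(shape=(28, 28, 1)),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(10, activation="softmax"),
        ])

    # attack_pars: one untrained substitute model per target model.
    attack_pars = {
        "stolen_models": [build_untrained_model()],
    }

    # data_conf: records queried during extraction vs. records used only for
    # evaluating the accuracy of the extracted model.
    data_conf = {
        "stolen_record_indices": np.arange(0, 5000),
        "eval_record_indices": np.arange(5000, 6000),
    }

    # pars_descriptors: maps ART attack attribute names to report descriptions.
    pars_descriptors = {"classifier": "A victim classifier"}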
ART Extraction Attack Wrappers

CopycatCNN   | art.attacks.extraction.CopycatCNN wrapper class.
KnockoffNets | art.attacks.extraction.KnockoffNets wrapper class.
class pepr.privacy.art_extraction_wrapper.CopycatCNN(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.extraction.CopycatCNN wrapper class.

Attack description: Implementation of the Copycat CNN attack from Rodrigues Correia-Silva et al. (2018).
Paper link: https://arxiv.org/abs/1806.05476
Parameters:
- attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) – Dictionary containing all needed attack parameters:
  - batch_size_fit (int): (optional) Size of batches used for fitting the thieved classifier.
  - batch_size_query (int): (optional) Size of batches used for querying the victim classifier.
  - nb_epochs (int): (optional) Number of epochs to use for training.
  - nb_stolen (int): (optional) Number of queries submitted to the victim classifier to steal it.
  - use_probability (bool): (optional) Use probability.
  - stolen_record_indices (np.ndarray): Indices of records to use for the extraction attack.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) – Dictionary specifying for every target model which record indices should be used for the attack:
  - stolen_record_indices (np.ndarray): Indices of records to use for the extraction attack.
  - eval_record_indices (np.ndarray): Indices of records used to measure the accuracy of the extracted model.
- target_models (iterable) – List of target models which should be tested.
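Below is a minimal, hypothetical construction of the wrapper. It reuses the placeholder data, data_conf, and build_untrained_model helper from the sketch above; target_model stands for an already trained victim model, the alias string is arbitrary, and placing stolen_models inside attack_pars follows the BaseExtractionAttack documentation rather than a guaranteed requirement of this subclass:

    import numpy as np
    from pepr.privacy.art_extraction_wrapper import CopycatCNN

    copycat_pars = {
        "batch_size_fit": 64,
        "batch_size_query": 64,
        "nb_epochs": 10,
        "nb_stolen": 5000,
        "use_probability": False,
        "stolen_record_indices": np.arange(0, 5000),
        "stolen_models": [build_untrained_model()],  # assumption, per BaseExtractionAttack docs
    }

    copycat = CopycatCNN(
        attack_alias="copycat-example",   # free-form name shown in the report
        attack_pars=copycat_pars,
        data=data,
        labels=labels,
        data_conf=data_conf,
        target_models=[target_model],     # hypothetical trained victim model
    )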
class pepr.privacy.art_extraction_wrapper.KnockoffNets(attack_alias, attack_pars, data, labels, data_conf, target_models)

art.attacks.extraction.KnockoffNets wrapper class.

Attack description: Implementation of the Knockoff Nets attack from Orekondy et al. (2018).
Paper link: https://arxiv.org/abs/1812.02766
Parameters:
- attack_alias (str) – Alias for a specific instantiation of the class.
- attack_pars (dict) – Dictionary containing all needed attack parameters:
  - batch_size_fit (int): (optional) Size of batches used for fitting the thieved classifier.
  - batch_size_query (int): (optional) Size of batches used for querying the victim classifier.
  - nb_epochs (int): (optional) Number of epochs to use for training.
  - nb_stolen (int): (optional) Number of queries submitted to the victim classifier to steal it.
  - use_probability (bool): (optional) Use probability.
  - sampling_strategy (str): Sampling strategy, either 'random' or 'adaptive'.
  - reward (str): Reward type, one of 'cert', 'div', 'loss', or 'all'.
  - verbose (bool): Show progress bars.
  - stolen_record_indices (np.ndarray): Indices of records to use for the extraction attack.
- data (numpy.ndarray) – Dataset with all input images used to attack the target models.
- labels (numpy.ndarray) – Array of all labels used to attack the target models.
- data_conf (dict) – Dictionary specifying for every target model which record indices should be used for the attack:
  - stolen_record_indices (np.ndarray): Indices of records to use for the extraction attack.
  - eval_record_indices (np.ndarray): Indices of records used to measure the accuracy of the extracted model.
- target_models (iterable) – List of target models which should be tested.
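The following sketch mirrors the CopycatCNN example and is equally hypothetical; only the attack-specific entries (sampling_strategy, reward, verbose) differ, and the data, data_conf, build_untrained_model, and target_model placeholders are assumed from the earlier sketches:

    import numpy as np
    from pepr.privacy.art_extraction_wrapper import KnockoffNets

    knockoff_pars = {
        "batch_size_fit": 64,
        "batch_size_query": 64,
        "nb_epochs": 10,
        "nb_stolen": 5000,
        "use_probability": False,
        "sampling_strategy": "adaptive",  # 'random' or 'adaptive'
        "reward": "all",                  # 'cert', 'div', 'loss' or 'all'
        "verbose": True,
        "stolen_record_indices": np.arange(0, 5000),
        "stolen_models": [build_untrained_model()],  # assumption, per BaseExtractionAttack docs
    }

    knockoff = KnockoffNets(
        attack_alias="knockoff-example",
        attack_pars=knockoff_pars,
        data=data,
        labels=labels,
        data_conf=data_conf,              # same structure as for CopycatCNN
        target_models=[target_model],
    )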