Membership Inference Attack

class pepr.privacy.mia.Mia(attack_alias, attack_pars, data, labels, data_conf, target_models)

Membership Inference Attacks (MIA) Against Machine Learning Models.

Attack-Steps:

  1. Create dataset mapping for shadow models.
  2. Train shadow models.
  3. Generate attack model dataset.
  4. Train attack models.
  5. Evaluate attack models.
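The idea behind steps 1–3 can be sketched in plain NumPy: each shadow model is trained on an "in" split and evaluated on a disjoint "out" split, and its prediction vectors on both splits, labeled 1 (member) or 0 (non-member), form the attack-model training data. All names and sizes below are illustrative and not part of the pepr API; the random softmax stands in for a trained shadow model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_records, n_classes = 200, 10
data = rng.normal(size=(n_records, 5))

# Step 1: disjoint "in"/"out" splits for one shadow model.
idx = rng.permutation(n_records)
in_idx, out_idx = idx[:100], idx[100:]

def shadow_predict(x):
    # Stand-in for the shadow model's softmax output.
    logits = rng.normal(size=(len(x), n_classes))
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Steps 2-3: attack-model dataset of prediction vectors
# with membership labels (1 = member, 0 = non-member).
attack_x = np.vstack([shadow_predict(data[in_idx]),
                      shadow_predict(data[out_idx])])
attack_y = np.concatenate([np.ones(len(in_idx)),
                           np.zeros(len(out_idx))])
```

Steps 4–5 then train one attack model per class on this data and evaluate it against the target models.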
Parameters:
  • attack_alias (str) – Alias for a specific instantiation of the mia class.
  • attack_pars (dict) –

    Dictionary containing all needed attack parameters:

    • number_classes (int): Number of different classes the target model predicts.
    • number_shadow_models (int): Number of shadow models to be trained.
    • shadow_training_set_size (int): Size of the training set for each shadow model. The corresponding evaluation sets will have the same size.
    • create_compile_shadow_model (function): Function that returns a compiled TensorFlow model (typically identical to the target model) used in the training of the shadow models.
    • create_compile_attack_model (function): Function that returns a compiled TensorFlow model used for the attack models. The model output is expected to be a single floating-point value per prediction.
    • shadow_epochs (int): Number of training epochs of the shadow models.
    • shadow_batch_size (int): Batch size used in the training of the shadow models.
    • attack_epochs (int): Number of training epochs of the attack models.
    • attack_batch_size (int): Batch size used in the training of the attack models.
  • data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
  • labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
  • data_conf (dict) –

    Dictionary describing which record indices are used to train the shadow models and the target model(s), and which are used to evaluate the attack.

    • shadow_indices (list): List of indices describing which of the records from data are used to train the shadow models.
    • target_indices (list): List of indices describing which of the records from data were used to train the target model(s).
    • evaluation_indices (list): List of indices describing which of the records from data are used to evaluate the attack.
    • record_indices_per_target (numpy.ndarray): n*m array describing, for all n target models, which m indices were used in the training.
  • target_models (iterable) – List of target models which should be tested.
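A plausible attack_pars dictionary is sketched below. The layer architectures inside the two factory functions are placeholders chosen for illustration (not the models from the original attack), and the numeric values are arbitrary; in practice create_compile_shadow_model should mirror the target model's architecture.

```python
def create_compile_shadow_model():
    # Placeholder architecture; in practice, mirror the target model.
    # TensorFlow is imported lazily so the dict itself can be built
    # without TF installed.
    import tensorflow as tf
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

def create_compile_attack_model():
    # Binary membership classifier: one float output per prediction.
    import tensorflow as tf
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

attack_pars = {
    "number_classes": 10,
    "number_shadow_models": 5,
    "shadow_training_set_size": 2500,
    "create_compile_shadow_model": create_compile_shadow_model,
    "create_compile_attack_model": create_compile_attack_model,
    "shadow_epochs": 50,
    "shadow_batch_size": 32,
    "attack_epochs": 50,
    "attack_batch_size": 32,
}
```
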
attack_alias

str – Alias for a specific instantiation of the attack class.

attack_pars

dict – Dictionary containing all needed parameters for the attack.

data

numpy.ndarray – Dataset with all training samples used in the given pentesting setting.

labels

numpy.ndarray – Array of all labels used in the given pentesting setting.

data_conf

dict – Dictionary describing the data configuration of the given pentesting setting by specifying which record-indices are used to train the shadow models, the target model(s) and which are used for the evaluation of the attack.

  • shadow_indices (list): List of indices describing which of the records from data are used to train the shadow models.
  • target_indices (list): List of indices describing which of the records from data were used to train the target model(s).
  • evaluation_indices (list): List of indices describing which of the records from data are used to evaluate the attack.
  • record_indices_per_target (numpy.ndarray): n*m array describing, for all n target models, which m indices were used in the training.
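For concreteness, a data_conf for 10,000 records and two target models might be assembled as follows. The split sizes are arbitrary, and whether record_indices_per_target indexes into target_indices or directly into data is an assumption here; verify it against your pentesting setup.

```python
import numpy as np

rng = np.random.default_rng(42)
n_records, n_targets, target_train_size = 10000, 2, 2500

indices = rng.permutation(n_records)

# Disjoint pools: the shadow models never see target training data.
shadow_indices = indices[:5000].tolist()
target_indices = indices[5000:].tolist()
# Spans both pools, so it contains members and non-members.
evaluation_indices = indices[2500:7500].tolist()

# For each of the n target models, the m indices used in its training
# (assumed here to index into target_indices; verify for your setup).
record_indices_per_target = np.array(
    [rng.choice(len(target_indices), target_train_size, replace=False)
     for _ in range(n_targets)]
)

data_conf = {
    "shadow_indices": shadow_indices,
    "target_indices": target_indices,
    "evaluation_indices": evaluation_indices,
    "record_indices_per_target": record_indices_per_target,
}
```
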
target_models

iterable – List of target models which should be tested.

attack_results

dict – Dictionary storing the attack model results. Lists reported per attack model and target model have the shape (attack model, target model): the first index selects the attack model, the second the target model.

  • tp_list (numpy.ndarray): True positives per attack model and target model.
  • fp_list (numpy.ndarray): False positives per attack model and target model.
  • fn_list (numpy.ndarray): False negatives per attack model and target model.
  • tn_list (numpy.ndarray): True negatives per attack model and target model.
  • eval_accuracy_list (numpy.ndarray): Evaluation accuracy on evaluation records per attack model and target model.
  • precision_list (numpy.ndarray): Attack precision per attack model and target model.
  • recall_list (numpy.ndarray): Attack recall per attack model and target model.
  • eval_accuracy (numpy.ndarray): Evaluation accuracy averaged over all attack models per target model.
  • precision (numpy.ndarray): Attack precision averaged over all attack models per target model.
  • recall (numpy.ndarray): Attack recall averaged over all attack models per target model.
  • overall_eval_accuracy (float): Evaluation accuracy averaged over all target models.
  • overall_precision (float): Attack precision averaged over all target models.
  • overall_recall (float): Attack recall averaged over all target models.
  • shadow_train_accuracy_list (list): Accuracy on training records per shadow model and target model.
  • shadow_test_accuracy_list (list): Accuracy on test records per shadow model and target model.
  • shadow_train_accuracy (float): Accuracy on train records averaged over all shadow models per target model.
  • shadow_test_accuracy (float): Accuracy on test records averaged over all shadow models per target model.
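The derived metrics follow from the raw count arrays in the usual way. The counts below are randomly generated stand-ins; only the (attack model, target model) shape convention and the averaging order are taken from the description above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_attack_models, n_target_models = 10, 2

# Hypothetical confusion counts, shape (attack model, target model).
tp = rng.integers(50, 100, size=(n_attack_models, n_target_models))
fp = rng.integers(0, 50, size=(n_attack_models, n_target_models))
fn = rng.integers(0, 50, size=(n_attack_models, n_target_models))
tn = rng.integers(50, 100, size=(n_attack_models, n_target_models))

precision_list = tp / (tp + fp)
recall_list = tp / (tp + fn)
eval_accuracy_list = (tp + tn) / (tp + fp + fn + tn)

# Average over attack models -> one value per target model ...
precision = precision_list.mean(axis=0)
recall = recall_list.mean(axis=0)
eval_accuracy = eval_accuracy_list.mean(axis=0)

# ... then over target models -> a single overall value.
overall_precision = precision.mean()
overall_recall = recall.mean()
overall_eval_accuracy = eval_accuracy.mean()
```
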

References

Implementation of the basic membership inference attack by Reza Shokri, Marco Stronati, Congzheng Song and Vitaly Shmatikov: “Membership Inference Attacks Against Machine Learning Models.” 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 2017.

create_attack_report(save_path='mia_report', pdf=False)

Create an attack report just for the given attack instantiation.

Parameters:
  • save_path (str) – Path to save the tex, pdf and asset files of the attack report.
  • pdf (bool) – If set, generate a PDF from the LaTeX file.
create_attack_section(save_path)

Create an attack section for the given attack instantiation.

Parameters:
  • save_path (str) – Path to save the tex, pdf and asset files of the attack report.
run(save_path=None, load_pars=None)

Run membership inference attack.

Parameters:
  • save_path (str) –

    If a path is given, the following (partly computationally expensive) intermediate results are saved to disk:

    • The mapping of training-records to shadow models.
    • The trained shadow models.
    • The attack datasets for training the attack model.
    • The trained attack models.
  • load_pars (dict) –

    If this dictionary is given, the following computationally expensive intermediate results can be loaded from disk instead of being recomputed:

    • shadow_data_indices (str): Path to the shadow data mapping.
    • shadow_models (list): List of paths to the shadow models.
    • attack_datasets (str): Path to the attack datasets.
    • attack_models (list): List of paths to the attack models.
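A load_pars dictionary for reusing cached results might look as follows. The file names are hypothetical; the actual on-disk layout is whatever a previous run(save_path=...) produced, so adjust the paths accordingly.

```python
# A first run, e.g. mia.run(save_path="mia_intermediate"), would have
# cached the intermediate results; a later run can reuse them via:
load_pars = {
    "shadow_data_indices": "mia_intermediate/shadow_data_indices",
    "shadow_models": [f"mia_intermediate/shadow_model_{i}"
                      for i in range(5)],
    "attack_datasets": "mia_intermediate/attack_datasets",
    "attack_models": [f"mia_intermediate/attack_model_{i}"
                      for i in range(10)],
}
# mia.run(load_pars=load_pars)  # assuming `mia` is a Mia instance
```
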