Generalized Membership Inference Attack (Direct)
class pepr.privacy.gmia.DirectGmia(attack_alias, attack_pars, data, labels, data_conf, target_models)

Direct Generalized Membership Inference Attack (d-GMIA).
Attack-Steps:
- Create mapping of records to reference models.
- Train the reference models.
- Generate intermediate models.
- Extract reference high-level features.
- Extract target high-level features.
- Compute pairwise distances between reference and target high-level features.
- Determine potential vulnerable target records.
- Infer log losses of reference models.
- Infer log losses of target model.
- Sample reference losses, approximate the empirical cumulative distribution function (ECDF), and smooth the ECDF with piecewise cubic interpolation.
- Determine members and non-members with a left-tailed hypothesis test.
- Evaluate the attack results.
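The last few steps of this pipeline (ECDF approximation, PCHIP smoothing, left-tailed test) can be sketched as follows. This is a minimal illustration, not pepr's internal code: the reference losses, the target loss, and the cut-off p-value are made-up values.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical per-record log losses from the reference models
# (illustrative values only, not produced by pepr itself).
rng = np.random.default_rng(42)
reference_losses = np.sort(rng.normal(loc=1.0, scale=0.3, size=200))

# Empirical CDF of the sampled reference losses.
ecdf_values = np.arange(1, reference_losses.size + 1) / reference_losses.size

# Smooth the ECDF with monotone piecewise cubic interpolation (PCHIP).
smooth_cdf = PchipInterpolator(reference_losses, ecdf_values)

# Left-tailed test: an unusually small log loss on the target model
# suggests the record was a training member.
target_loss = 0.25                        # hypothetical target-model log loss
p_value = float(np.clip(smooth_cdf(target_loss), 0.0, 1.0))
is_member = p_value <= 0.05               # example cut-off p-value
```

Sweeping the cut-off over 0, 0.01, …, 1 produces the per-cut-off confusion counts reported in attack_results.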
Parameters:

- attack_alias (str) – Alias for a specific instantiation of the GMIA.
- attack_pars (dict) – Dictionary containing all needed attack parameters:
- number_classes (int): Number of different classes the target model predicts.
- number_reference_models (int): Number of reference models to be trained.
- reference_training_set_size (int): Size of the training set for each reference model.
- create_compile_model (function): Function that returns a compiled TensorFlow model (typically identical to the target model) used in the training of the reference models.
- reference_epochs (int): Number of training epochs of the reference models.
- reference_batch_size (int): Batch size used in the training of the reference models.
- hlf_metric (str): Metric (typically ‘cosine’) used for the distance calculations in the high-level feature space. For valid metrics see documentation of sklearn.metrics.pairwise_distances.
- hlf_layer_number (int): If value is n, the n-th layer of the model returned by create_compile_model is used to extract the high-level feature vectors.
- neighbor_threshold (float): If the distance is smaller than the neighbor threshold, the record is selected as a target record.
- probability_threshold (float): For details see section 4.3 from the paper.
- number_target_records (int): If set, the selection algorithm performs up to max_search_rounds to find a neighbor_threshold that yields number_target_records target records. These target records are the most vulnerable with respect to the selection criterion.
- max_search_rounds (int): If number_target_records is given, at most max_search_rounds are performed to find number_target_records potentially vulnerable target records.
- data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
- labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
- data_conf (dict) – Dictionary describing which record indices are used to train the reference models and the target model(s), and which are used for the evaluation of the attack:
- reference_indices (list): List of indices describing which of the records from data are used to train the reference models.
- target_indices (list): List of indices describing which of the records from data were used to train the target model(s).
- evaluation_indices (list): List of indices describing which of the records from data are used to evaluate the attack. Typically, one half of these records were used to train the target model(s) and the other half were used to train neither the target model(s) nor the reference models.
- record_indices_per_target (numpy.ndarray): n*m array describing for all n target models which m indices were used in the training.
- target_models (iterable) – List of target models which should be tested.
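Putting the parameter and data-configuration dictionaries together might look like the sketch below. All concrete values (class count, model count, index ranges, data shape) are hypothetical, and create_compile_model is a stub; a real pentesting setting supplies a compiled TensorFlow model and actual data. The DirectGmia call is shown commented out because it requires pepr and a trained target model.

```python
import numpy as np

# Hypothetical model factory; in practice this must return a compiled
# TensorFlow model, typically with the same architecture as the target model.
def create_compile_model():
    raise NotImplementedError("return a compiled TensorFlow model here")

attack_pars = {
    "number_classes": 10,                # illustrative values throughout
    "number_reference_models": 100,
    "reference_training_set_size": 1000,
    "create_compile_model": create_compile_model,
    "reference_epochs": 50,
    "reference_batch_size": 32,
    "hlf_metric": "cosine",
    "hlf_layer_number": 10,
    "number_target_records": 25,
    "max_search_rounds": 100,
}

# Toy data; real settings use the actual training samples and labels.
data = np.zeros((3000, 28, 28))
labels = np.zeros(3000, dtype=int)

data_conf = {
    "reference_indices": list(range(1000)),
    "target_indices": list(range(1000, 2000)),
    # Half trained on by the target model, half seen by no model:
    "evaluation_indices": list(range(1500, 2500)),
    # Illustrative 1*m array for a single target model:
    "record_indices_per_target": np.array([list(range(1000, 2000))]),
}

# attack = DirectGmia("gmia-demo", attack_pars, data, labels, data_conf,
#                     target_models=[target_model])  # needs pepr + a trained model
```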
- attack_alias (str) – Alias for a specific instantiation of the attack class.
- attack_pars (dict) – Dictionary containing all needed parameters for the attack.
- data (numpy.ndarray) – Dataset with all training samples used in the given pentesting setting.
- labels (numpy.ndarray) – Array of all labels used in the given pentesting setting.
- data_conf (dict) – Dictionary describing the data configuration of the given pentesting setting.
- target_models (iterable) – List of target models which should be tested.
- attack_results (dict) –
- selected_target_records (numpy.ndarray): List of record indices selected as potentially vulnerable.
- neighbor_threshold (float): If the distance is smaller than the neighbor threshold, the record is selected as a target record.
- probability_threshold (float): For details see section 4.3 from the original publication.
- reference_inferences (numpy.ndarray): Array of log losses of the predictions on the reference models.
- target_inferences (numpy.ndarray): Array of log losses of the predictions on the target models.
- used_target_records (numpy.ndarray): Target records finally used for the attack.
- pchip_references (list): PCHIP-interpolated ECDFs of the sampled log losses.
- ecdf_references (list): Estimated CDFs of the sampled log losses.
- tp_list (list): True positives per cut-off-p-value (0, 0.01, 0.02, …, 1) and target model.
- fp_list (list): False positives per cut-off-p-value and target model.
- fn_list (list): False negatives per cut-off-p-value and target model.
- tn_list (list): True negatives per cut-off-p-value and target model.
- precision_list (list): Attack precision per cut-off-p-value and target model.
- recall_list (list): Attack recall per cut-off-p-value and target model.
- overall_precision (list): Attack precision averaged over all target models per cut-off-p-value.
- overall_recall (list): Attack recall averaged over all target models per cut-off-p-value.
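How the per-cut-off precision values relate to the confusion counts above can be sketched as follows. The counts are made-up illustrative numbers, and treating an empty prediction set as precision 1.0 is an assumption for the sketch, not necessarily pepr's convention.

```python
# Hypothetical confusion counts for two target models at three cut-off
# p-values (illustrative numbers, not real attack output).
tp_list = [[3, 5, 8], [2, 6, 7]]   # one row per target model
fp_list = [[0, 1, 4], [1, 1, 3]]

def precision_per_cutoff(tp_row, fp_row):
    # Precision = TP / (TP + FP); defined here as 1.0 when nothing was
    # flagged (assumed convention for the empty prediction set).
    return [tp / (tp + fp) if (tp + fp) > 0 else 1.0
            for tp, fp in zip(tp_row, fp_row)]

precision_list = [precision_per_cutoff(tp, fp)
                  for tp, fp in zip(tp_list, fp_list)]

# overall_precision averages over the target models per cut-off p-value.
n_models = len(precision_list)
overall_precision = [sum(row[i] for row in precision_list) / n_models
                     for i in range(len(precision_list[0]))]
```

Recall follows the same pattern with TP / (TP + FN) over fn_list.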
References
Implementation of the direct GMIA from Long, Yunhui; Bindschaedler, Vincent; Wang, Lei; Bu, Diyue; Wang, Xiaofeng; Tang, Haixu; Gunter, Carl A.; and Chen, Kai (2018). Understanding Membership Inferences on Well-Generalized Learning Models. arXiv preprint arXiv:1802.04889.
create_attack_report(save_path='gmia_report', pdf=False)

Create an attack report just for the given attack instantiation.
Parameters:

- save_path (str) – Path to save the TeX and PDF files of the attack report.
- pdf (bool) – If set, generate a PDF from the LaTeX file.
create_attack_section(save_path)

Create a report section for the GMIA attack instantiation.
run(save_path=None, load_pars=None)

Run the direct generalized membership inference attack.
Parameters:

- save_path (str) – If a path is given, the following (partly computationally expensive) intermediate results are saved to disk:
- The mapping of training-records to reference models
- The trained reference models
- The reference high-level features
- The target high-level features
- The matrix containing all pairwise distances between the reference- and target high-level features.
- load_pars (dict) – If this dictionary is given, the following computationally expensive intermediate results can be loaded from disk:
- records_per_reference_model (str): Path to the mapping.
- reference_models (list): List of paths to the reference models.
- pairwise_distance_hlf_<hlf_metric> (str): Path to the pairwise distance matrix between the reference and target high-level features using hlf_metric (e.g. cosine).
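A load_pars dictionary matching the documented keys might look like the sketch below. The file names and extensions are assumptions for illustration (not pepr's guaranteed layout), and the distance-matrix key is templated on the chosen hlf_metric, here filled with "cosine" as in the example above.

```python
# Hypothetical paths from a previous run(save_path="gmia_data") call;
# the exact file names and extensions are assumed for illustration.
load_pars = {
    "records_per_reference_model": "gmia_data/records_per_reference_model.npy",
    "reference_models": [f"gmia_data/reference_model_{i}" for i in range(100)],
    # Key is templated on the chosen hlf_metric, e.g. "cosine":
    "pairwise_distance_hlf_cosine": "gmia_data/pairwise_distance_hlf_cosine.csv",
}

# attack.run(save_path="gmia_data", load_pars=load_pars)  # needs a DirectGmia instance
```

Reusing these intermediate results skips retraining the reference models and recomputing the pairwise distances on repeated runs.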