mindarmour.detectors
This module provides detector methods for distinguishing adversarial examples from benign examples.
- class mindarmour.detectors.ErrorBasedDetector(auto_encoder, false_positive_rate=0.01, bounds=(0.0, 1.0))[source]
The detector reconstructs input samples, measures reconstruction errors and rejects samples with large reconstruction errors.
Reference: MagNet: a Two-Pronged Defense against Adversarial Examples, by Dongyu Meng and Hao Chen, at CCS 2017.
- Parameters
auto_encoder (Model) – The auto-encoder model used to reconstruct input samples.
false_positive_rate (float) – Detector's false positive rate. Default: 0.01.
bounds (tuple) – Upper and lower bounds of data. In form of (clip_min, clip_max). Default: (0.0, 1.0).
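Examples
(A minimal usage sketch; ae_model stands for a trained auto-encoder Model, and benign and suspicious for numpy.ndarray batches. These names are placeholders, not part of the API.)
>>> detector = ErrorBasedDetector(ae_model, false_positive_rate=0.01)
>>> threshold = detector.fit(benign)
>>> adv_ids = detector.detect(suspicious)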
detect(inputs)[source]
Detect if input samples are adversarial or not.
- Parameters
inputs (numpy.ndarray) – Suspicious samples to be judged.
Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial.
detect_diff(inputs)[source]
Detect the distance between the original samples and reconstructed samples.
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- float, the distance between reconstructed and original samples.
fit(inputs, labels=None)[source]
Find a threshold for a given dataset to distinguish adversarial examples.
- Parameters
inputs (numpy.ndarray) – Input samples.
labels (numpy.ndarray) – Labels of input samples. Default: None.
Returns
- float, threshold to distinguish adversarial samples from benign ones.
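One plausible way to choose such a threshold, following the MagNet paper, is to take the (1 - false_positive_rate) quantile of the reconstruction errors on benign data. A minimal numpy sketch (errs is a hypothetical array of per-sample reconstruction errors; this illustrates the idea, not necessarily the exact implementation):
>>> import numpy as np
>>> errs = np.array([0.01, 0.02, 0.5, 0.03])  # hypothetical benign reconstruction errors
>>> fpr = 0.01
>>> threshold = np.percentile(errs, 100 * (1 - fpr))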
set_threshold(threshold)[source]
Set the detection threshold.
- Parameters
- threshold (float) – Detection threshold. Default: None.
transform(inputs)[source]
Reconstruct input samples.
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- numpy.ndarray, reconstructed images.
- class mindarmour.detectors.DivergenceBasedDetector(auto_encoder, model, option='jsd', t=1, bounds=(0.0, 1.0))[source]
This class implements a divergence-based detector.
Reference: MagNet: a Two-Pronged Defense against Adversarial Examples, by Dongyu Meng and Hao Chen, at CCS 2017.
- Parameters
auto_encoder (Model) – Encoder model.
model (Model) – Targeted model.
option (str) – Method used to calculate divergence. Default: “jsd”.
t (int) – Temperature used to overcome numerical problems. Default: 1.
bounds (tuple) – Upper and lower bounds of data. In form of (clip_min, clip_max). Default: (0.0, 1.0).
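Examples
(A minimal usage sketch; ae_model, target_model and suspicious are placeholder names for an auto-encoder Model, the targeted Model and a numpy.ndarray batch, not part of the API.)
>>> detector = DivergenceBasedDetector(ae_model, target_model, option='jsd')
>>> distance = detector.detect_diff(suspicious)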
detect_diff(inputs)[source]
Detect the distance between original samples and reconstructed samples. The distance is calculated by JSD (Jensen-Shannon divergence).
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- float, the distance.
Raises
- NotImplementedError – If the param option is not supported.
- class mindarmour.detectors.RegionBasedDetector(model, number_points=10, initial_radius=0.0, max_radius=1.0, search_step=0.01, degrade_limit=0.0, sparse=False)[source]
This class implements a region-based detector.
Reference: Mitigating evasion attacks to deep neural networks via region-based classification
- Parameters
model (Model) – Target model.
number_points (int) – The number of samples generated from the hypercube of the original sample. Default: 10.
initial_radius (float) – Initial radius of the hypercube. Default: 0.0.
max_radius (float) – Maximum radius of the hypercube. Default: 1.0.
search_step (float) – Increment of the radius during search. Default: 0.01.
degrade_limit (float) – Acceptable decrease of classification accuracy. Default: 0.0.
sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: False.
Examples
>>> detector = RegionBasedDetector(model)
>>> detector.fit(Tensor(ori), Tensor(labels))
>>> adv_ids = detector.detect(Tensor(adv))
detect(inputs)[source]
Tell whether input samples are adversarial or not.
- Parameters
inputs (numpy.ndarray) – Suspicious samples to be judged.
Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial.
detect_diff(inputs)[source]
Return raw prediction results and region-based prediction results.
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- numpy.ndarray, raw prediction results and region-based prediction results of input samples.
fit(inputs, labels=None)[source]
Train the detector to decide the best radius.
- Parameters
inputs (numpy.ndarray) – Benign samples.
labels (numpy.ndarray) – Ground truth labels of the input samples. Default: None.
Returns
- float, the best radius.
set_radius(radius)[source]
Set the radius.
transform(inputs)[source]
Generate hypercubes for input samples.
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- numpy.ndarray, the hypercube corresponding to every sample.
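The region-based scheme itself can be sketched in plain numpy. This illustrates the majority-vote idea from the reference paper rather than the MindArmour implementation; predict_fn is a hypothetical function returning class scores for a batch:
>>> import numpy as np
>>> def region_predict(predict_fn, x, radius=0.1, number_points=10):
...     # Sample points uniformly from the hypercube of the given radius around x.
...     noise = np.random.uniform(-radius, radius, size=(number_points,) + x.shape)
...     cube = np.clip(x[np.newaxis] + noise, 0.0, 1.0)
...     # Majority vote over the predicted labels of the sampled points.
...     votes = predict_fn(cube).argmax(axis=1)
...     return np.bincount(votes).argmax()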
- class mindarmour.detectors.SpatialSmoothing(model, ksize=3, is_local_smooth=True, metric='l1', false_positive_ratio=0.05)[source]
Detection method based on spatial smoothing.
- Parameters
model (Model) – Target model.
ksize (int) – Smooth window size. Default: 3.
is_local_smooth (bool) – If True, trigger local smoothing. If False, no local smoothing. Default: True.
metric (str) – Distance method. Default: ‘l1’.
false_positive_ratio (float) – False positive rate over benign samples. Default: 0.05.
Examples
>>> detector = SpatialSmoothing(model)
>>> detector.fit(Tensor(ori), Tensor(labels))
>>> adv_ids = detector.detect(Tensor(adv))
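The underlying test can be sketched with numpy and scipy. This illustrates the spatial-smoothing idea (median-filter the input, then compare predictions), not the exact MindArmour implementation; predict_fn is a hypothetical batch-prediction function and an NCHW image layout is assumed:
>>> import numpy as np
>>> from scipy.ndimage import median_filter
>>> def smooth_distance(predict_fn, x, ksize=3):
...     # Median-smooth each image over its spatial dimensions (NCHW layout assumed).
...     smoothed = median_filter(x, size=(1, 1, ksize, ksize))
...     # L1 distance between predictions on original and smoothed inputs.
...     return np.abs(predict_fn(x) - predict_fn(smoothed)).sum(axis=1)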
detect(inputs)[source]
Detect if an input sample is an adversarial example.
- Parameters
inputs (numpy.ndarray) – Suspicious samples to be judged.
Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial.
detect_diff(inputs)[source]
Return the raw distance value (before applying the threshold) between the input sample and its smoothed counterpart.
- Parameters
inputs (numpy.ndarray) – Suspicious samples to be judged.
Returns
- float, distance.
fit(inputs, labels=None)[source]
Train the detector to decide the threshold. A proper threshold makes sure the actual false positive rate over benign samples is less than the given value.
- Parameters
inputs (numpy.ndarray) – Benign samples.
labels (numpy.ndarray) – Labels of the input samples. Default: None.
Returns
- float, the threshold; a distance larger than it is reported as positive, i.e. adversarial.
set_threshold(threshold)[source]
Set the detection threshold.
- Parameters
- threshold (float) – Detection threshold. Default: None.
- class mindarmour.detectors.EnsembleDetector(detectors, policy='vote')[source]
Ensemble detector.
- Parameters
detectors (Union[list, tuple]) – List of detector instances to be ensembled.
policy (str) – Decision policy used to combine the detectors' results. Default: ‘vote’.
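Examples
(A minimal usage sketch; detector1 and detector2 stand for already constructed detectors from this module, and suspicious for a numpy.ndarray batch. These names are placeholders, not part of the API.)
>>> detector = EnsembleDetector([detector1, detector2], policy='vote')
>>> adv_ids = detector.detect(suspicious)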
detect(inputs)[source]
Detect adversarial examples from input samples.
- Parameters
inputs (numpy.ndarray) – Input samples.
Returns
- list[int], whether a sample is adversarial. If res[i]=1, the input sample with index i is adversarial.
Raises
- ValueError – If policy is not supported.
detect_diff(inputs)[source]
This method is not available in this class.
- Parameters
inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples.
Raises
- NotImplementedError – This function is not available in ensemble.
fit(inputs, labels=None)[source]
Fit the detector like a machine learning model. This method is not available in this class.
- Parameters
inputs (numpy.ndarray) – Data to calculate the threshold.
labels (numpy.ndarray) – Labels of data.
Raises
- NotImplementedError – This function is not available in ensemble.
transform(inputs)[source]
Filter adversarial noise in input samples. This method is not available in this class.
- Raises
- NotImplementedError – This function is not available in ensemble.
- class mindarmour.detectors.SimilarityDetector(trans_model, max_k_neighbor=1000, chunk_size=1000, max_buffer_size=10000, tuning=False, fpr=0.001)[source]
The detector measures similarity among adjacent queries and rejects queries which are remarkably similar to previous queries.
- Parameters
trans_model (Model) – A MindSpore model to encode input data into a lower-dimensional vector.
max_k_neighbor (int) – The maximum number of the nearest neighbors. Default: 1000.
chunk_size (int) – Buffer size. Default: 1000.
max_buffer_size (int) – Maximum buffer size. Default: 10000.
tuning (bool) – Calculate the average distance for the nearest k neighbours. If tuning is True, k=K; if False, k=1,…,K. Default: False.
fpr (float) – False positive ratio on legitimate query sequences. Default: 0.001.
Examples
>>> detector = SimilarityDetector(model)
>>> detector.fit(Tensor(ori), Tensor(labels))
>>> adv_ids = detector.detect(Tensor(adv))
clear_buffer()[source]
Clear the buffer memory.
detect(inputs)[source]
Process queries to detect black-box attacks.
- Parameters
inputs (numpy.ndarray) – Query sequence.
Raises
- ValueError – If the parameter threshold or num_of_neighbors is not available.
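The similarity measure behind this check can be sketched in plain numpy. This illustrates the average k-nearest-neighbour distance over encoded queries rather than the exact MindArmour implementation; encoded is a hypothetical array of encoder outputs for the buffered queries:
>>> import numpy as np
>>> def avg_knn_distance(encoded, k):
...     # Pairwise Euclidean distances among the encoded queries.
...     diff = encoded[:, np.newaxis, :] - encoded[np.newaxis, :, :]
...     dist = np.sqrt((diff ** 2).sum(axis=-1))
...     # Mean distance to the k nearest neighbours, excluding self (column 0).
...     return np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)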
detect_diff(inputs)[source]
Detect adversarial samples from input samples, like the predict_proba function in common machine learning models.
- Parameters
inputs (Union[numpy.ndarray, list, tuple]) – Data used as references to create adversarial examples.
Raises
- NotImplementedError – This function is not available in class SimilarityDetector.
fit(inputs, labels=None)[source]
Process input training data to calculate the threshold. A proper threshold should make sure the false positive rate is under a given value.
- Parameters
inputs (numpy.ndarray) – Training data to calculate the threshold.
labels (numpy.ndarray) – Labels of training data.
Returns
- list[int], number of the nearest neighbors.
- list[float], calculated thresholds for different K.
Raises
- ValueError – If the number of training samples is less than max_k_neighbor.
get_detected_queries()[source]
Get the indexes of detected queries.
- Returns
- list[int], sequence number of detected malicious queries.
get_detection_interval()[source]
Get the interval between adjacent detections.
- Returns
- list[int], number of queries between adjacent detections.
set_threshold(num_of_neighbors, threshold)[source]
Set the parameters num_of_neighbors and threshold.
transform(inputs)[source]
Filter adversarial noise in input samples.
- Raises
- NotImplementedError – This function is not available in class SimilarityDetector.