aggmap.aggmodel package
Submodules
aggmap.aggmodel.cbks module
- class aggmap.aggmodel.cbks.CLA_EarlyStoppingAndPerformance(train_data, valid_data, MASK=-1, patience=5, criteria='val_loss', metric='ROC', last_avf=None, verbose=0)[source]
Bases:
Callback
- on_epoch_end(epoch, logs={})[source]
Called at the end of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Parameters:
epoch – Integer, index of epoch.
logs – Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For the training epoch, the values of the Model's metrics are returned. Example: {'loss': 0.2, 'accuracy': 0.7}.
- on_train_begin(logs=None)[source]
Called at the beginning of training.
Subclasses should override for any actions to run.
- Parameters:
logs – Dict. Currently no data is passed to this argument for this method but that may change in the future.
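The constructor arguments (patience, criteria='val_loss') suggest patience-based early stopping on a key of logs. The sketch below illustrates that pattern without any framework. It is not the aggmap implementation: the class name PatienceStopper and its attributes are hypothetical, and it assumes the monitored value should decrease.

```python
class PatienceStopper:
    """Hypothetical, framework-free sketch of patience-based early stopping."""

    def __init__(self, patience=5, criteria="val_loss"):
        self.patience = patience
        self.criteria = criteria

    def on_train_begin(self, logs=None):
        # Reset state at the start of every training run.
        self.best = float("inf")
        self.best_epoch = 0
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current = logs[self.criteria]
        if current < self.best:
            # Improvement: remember it and reset the patience counter.
            self.best, self.best_epoch, self.wait = current, epoch, 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stop_training = True


# Simulated validation losses: best value at epoch 1, then no improvement.
stopper = PatienceStopper(patience=2)
stopper.on_train_begin()
for epoch, loss in enumerate([0.9, 0.5, 0.6, 0.7, 0.4]):
    stopper.on_epoch_end(epoch, {"val_loss": loss})
    if stopper.stop_training:
        break
```

With patience=2 the loop stops after two consecutive epochs without improvement, so the final loss of 0.4 is never seen. A larger patience trades longer training for a lower risk of stopping on a temporary plateau.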
- class aggmap.aggmodel.cbks.Reg_EarlyStoppingAndPerformance(train_data, valid_data, MASK=-1, patience=5, criteria='val_loss', verbose=0)[source]
Bases:
Callback
- on_epoch_end(epoch, logs={})[source]
Called at the end of an epoch.
Subclasses should override for any actions to run. This function should only be called during TRAIN mode.
- Parameters:
epoch – Integer, index of epoch.
logs – Dict, metric results for this training epoch, and for the validation epoch if validation is performed. Validation result keys are prefixed with val_. For the training epoch, the values of the Model's metrics are returned. Example: {'loss': 0.2, 'accuracy': 0.7}.
- on_train_begin(logs=None)[source]
Called at the beginning of training.
Subclasses should override for any actions to run.
- Parameters:
logs – Dict. Currently no data is passed to this argument for this method but that may change in the future.
aggmap.aggmodel.explain_dev module
Created on Tue Feb 2 14:54:38 2021
@author: wanxiang.shen@u.nus.edu
- aggmap.aggmodel.explain_dev.GlobalIMP(clf, mp, X, Y, task_type='classification', binary_task=False, sigmoidy=False, apply_logrithm=False, apply_smoothing=False, kernel_size=5, sigma=1.6)[source]
Feature importance computed by forward propagation.
apply_smoothing: apply smoothing to the map.
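The apply_smoothing, kernel_size, and sigma arguments point to Gaussian smoothing of the importance map. A minimal NumPy sketch of that idea follows. The helper names are hypothetical, and the zero padding and kernel normalization are assumptions rather than details confirmed by the aggmap source.

```python
import numpy as np


def gaussian_kernel(kernel_size=5, sigma=1.6):
    """Return a normalized 2D Gaussian kernel (hypothetical helper)."""
    half = (kernel_size - 1) / 2.0
    ax = np.arange(kernel_size) - half
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def smooth_map(imp_map, kernel_size=5, sigma=1.6):
    """Smooth a 2D importance map with zero padding (same output shape)."""
    kernel = gaussian_kernel(kernel_size, sigma)
    pad = kernel_size // 2
    padded = np.pad(imp_map, pad, mode="constant")
    out = np.zeros_like(imp_map, dtype=float)
    h, w = imp_map.shape
    for i in range(h):
        for j in range(w):
            # The kernel is symmetric, so correlation equals convolution.
            window = padded[i:i + kernel_size, j:j + kernel_size]
            out[i, j] = np.sum(window * kernel)
    return out


# A single spike is spread over its neighbourhood while the total is kept.
imp = np.zeros((9, 9))
imp[4, 4] = 1.0
smoothed = smooth_map(imp)
```

A larger sigma spreads importance over more neighbouring grid cells, which can make a saliency map easier to read at the cost of blurring sharp, single-feature signals.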
aggmap.aggmodel.explainer module
Created on Fri Sep. 17 17:10:53 2021
@author: wanxiang.shen@u.nus.edu
- class aggmap.aggmodel.explainer.shapley_explainer(estimator, mp, backgroud='min', k_means_sampling=True, link='identity', **args)[source]
Bases:
object
Kernel SHAP-based model explanation. Its limitations are discussed here: https://christophm.github.io/interpretable-ml-book/shapley.html#disadvantages-16 <Problems with Shapley-value-based explanations as feature importance measures>. SHAP values do not identify causality. Global mean absolute Deep SHAP feature importance is the average impact on model output magnitude.
- Parameters:
estimator – model with a predict or predict_proba method
mp – aggmap object
backgroud (string or int) – {'min', 'global_min', 'all', int}. If 'min', use the min value as the background data (equivalent to 1 sample). If 'global_min', use the min value of all data as the background data. If an int K, sample K samples as the background data. If 'all', use all of the training data as the background data for SHAP.
k_means_sampling (bool) – whether to use k-means to sample the background values
link – {“identity”, “logit”}. A generalized linear model link to connect the feature importance values to the model output. Since the feature importance values, phi, sum up to the model output, it often makes sense to connect them to the output with a link function where link(output) = sum(phi). If the model output is a probability then the LogitLink link function makes the feature importance values have log-odds units.
args – Other parameters for shap.KernelExplainer.
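The link description can be made concrete with a small worked example. Under the stated relation link(output) = sum(phi), a logit link means the importance values add up in log-odds space. The sketch assumes the sum also includes a base value (the expected model output in link space), as in SHAP's additive formulation. All numbers are made up for illustration.

```python
import math


def logit(p):
    """Map a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative values only: a base rate and three per-feature contributions,
# all expressed in log-odds units because the link is "logit".
base_value = logit(0.5)          # expected output in link space
phi = [1.2, -0.4, 0.3]           # made-up feature importance values

log_odds = base_value + sum(phi)
probability = sigmoid(log_odds)  # model output recovered from the sum

# With link="identity" the same sum would be read directly as the output.
```

Log-odds units keep the contributions additive even near probabilities of 0 or 1, where differences on the probability scale are compressed.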
Examples
>>> import seaborn as sns
>>> import shap
>>> from aggmap.aggmodel.explainer import shapley_explainer
>>> ## shapley explainer
>>> shap_explainer = shapley_explainer(estimator, mp)
>>> global_imp_shap = shap_explainer.global_explain(clf.X_)
>>> local_imp_shap = shap_explainer.local_explain(clf.X_[[0]])
>>> ## S-map of shapley explainer
>>> sns.heatmap(local_imp_shap.shapley_importance_class_1.values.reshape(mp.fmap_shape),
...             cmap='rainbow')
>>> ## shapley plot (for a global bar plot, pass plot_type='bar')
>>> shap.summary_plot(shap_explainer.shap_values,
...                   feature_names=shap_explainer.feature_names)
>>> shap.initjs()
>>> shap.force_plot(shap_explainer.explainer.expected_value[1],
...                 shap_explainer.shap_values[1],
...                 feature_names=shap_explainer.feature_names)
- global_explain(X=None, nsamples='auto', **args)[source]
Explanation of many samples.
- Parameters:
X (None or 4D array, where the shape is (n, w, h, c)) – the 4D array of AggMap multi-channel fmaps. Note that if X is None, estimator.X_ is used instead, i.e. the training set of the estimator is explained.
nsamples ({'auto', int}) – Number of times to re-evaluate the model when explaining each prediction. More samples lead to lower variance estimates of the SHAP values. The “auto” setting uses nsamples = 2 * X.shape[1] + 2048
args – other parameters for the shap_values method of shap.KernelExplainer.
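The class description characterizes global importance as the mean absolute SHAP value, i.e. the average magnitude of each feature's impact on the model output. A minimal sketch of that aggregation follows, assuming a hypothetical (n_samples, n_features) array of per-sample values for one class. The array contents are made up, and the layout of the values returned by aggmap is not confirmed here.

```python
import numpy as np

# Hypothetical per-sample SHAP values for one class: 3 samples, 4 features.
shap_values = np.array([
    [0.2, -0.1, 0.0, 0.5],
    [-0.4, 0.1, 0.0, 0.3],
    [0.0, -0.1, 0.0, -0.1],
])

# Global importance: average magnitude of each feature's contribution.
global_importance = np.abs(shap_values).mean(axis=0)

# Rank features from most to least important.
ranking = np.argsort(-global_importance)
```

Taking the absolute value first prevents positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.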
- local_explain(X=None, idx=0, nsamples='auto', **args)[source]
Explanation of one sample only.
- Parameters:
X (None or 4D array, where the shape is (n, w, h, c)) – the 4D array of AggMap multi-channel fmaps. Note that if X is None, estimator.X_[[idx]] is used instead; with idx=0, the first sample is explained.
nsamples ({'auto', int}) – Number of times to re-evaluate the model when explaining each prediction. More samples lead to lower variance estimates of the SHAP values. The “auto” setting uses nsamples = 2 * X.shape[1] + 2048
args – other parameters for the shap_values method of shap.KernelExplainer.
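The "auto" rule for nsamples is easy to evaluate by hand. The helper below is a hypothetical illustration of the documented formula, where n_features stands for X.shape[1] of the 2D feature matrix passed to the Kernel SHAP explainer. That the 4D fmaps are flattened to such a matrix first is an assumption.

```python
def resolve_nsamples(nsamples, n_features):
    """Return the number of model re-evaluations per explained sample."""
    if nsamples == "auto":
        return 2 * n_features + 2048
    return int(nsamples)


# For example, 100 features under the "auto" setting:
auto_budget = resolve_nsamples("auto", 100)   # 2 * 100 + 2048 = 2248
fixed_budget = resolve_nsamples(500, 100)     # explicit value is kept
```

Since every re-evaluation is a forward pass of the model, the cost of explaining many samples grows with both the number of samples and this budget.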
- class aggmap.aggmodel.explainer.simply_explainer(estimator, mp, backgroud='min', apply_logrithm=False, apply_smoothing=False, kernel_size=5, sigma=1.0)[source]
Bases:
object
Simply-explainer for model explanation.
- Parameters:
estimator (object) – model with a predict or predict_proba method
mp (object) – aggmap object
backgroud ({'min', 'global_min', 'zeros'}, default: 'min') – if 'min', use the min value of a vector of the training set; if 'global_min', use the min value of the whole training set; if 'zeros', use all zeros as the background data.
apply_logrithm (bool, default: False) – apply the logarithm to the feature importance score
apply_smoothing (bool, default: False) – apply Gaussian smoothing to the feature importance score (saliency map)
kernel_size (int, default: 5) – the kernel size for the smoothing
sigma (float, default: 1.0) – the sigma for the smoothing
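The three backgroud options can be illustrated on a flattened training matrix. This sketch reflects one reading of the descriptions above ('min' as the per-feature minimum, 'global_min' as a single minimum over the whole training set). The function name make_background and the (n_samples, n_features) layout are assumptions, not the aggmap implementation; the keyword is spelled backgroud only to mirror the documented parameter.

```python
import numpy as np


def make_background(X_train, backgroud="min"):
    """Build a one-sample background vector of shape (1, n_features)."""
    n_features = X_train.shape[1]
    if backgroud == "min":
        # Minimum of each feature column across the training set.
        return X_train.min(axis=0, keepdims=True)
    if backgroud == "global_min":
        # One scalar minimum, repeated for every feature.
        return np.full((1, n_features), X_train.min())
    if backgroud == "zeros":
        return np.zeros((1, n_features))
    raise ValueError(f"unknown backgroud option: {backgroud!r}")


X_train = np.array([[1.0, 5.0, -2.0],
                    [3.0, 4.0, 0.5]])
bg_min = make_background(X_train, "min")            # [[ 1.  4. -2.]]
bg_global = make_background(X_train, "global_min")  # [[-2. -2. -2.]]
bg_zeros = make_background(X_train, "zeros")        # [[0. 0. 0.]]
```

The choice matters because importance is measured relative to the background: a feature whose value equals its background value contributes nothing by construction.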
Examples
>>> import seaborn as sns
>>> from aggmap.aggmodel.explainer import simply_explainer
>>> simp_explainer = simply_explainer(estimator, mp)
>>> global_imp_simp = simp_explainer.global_explain(clf.X_, clf.y_)
>>> local_imp_simp = simp_explainer.local_explain(clf.X_[[0]], clf.y_[[0]])
>>> ## S-map of simply explainer
>>> sns.heatmap(local_imp_simp.simply_importance.values.reshape(mp.fmap_shape),
...             cmap='rainbow')
- global_explain(X=None, y=None)[source]
Explanation of many samples.
- Parameters:
X (None or 4D array, where the shape is (n, w, h, c)) – the 4D array of AggMap multi-channel fmaps
y (None or 2D array, where the shape is (n, class_num)) – the true labels
Note that if X and y are None, estimator.X and estimator.y are used instead, i.e. the training set of the estimator is explained.
- local_explain(X=None, y=None, idx=0)[source]
Explanation of one sample only.
- Parameters:
X (None or 4D array, where the shape is (1, w, h, c)) – the 4D array of AggMap multi-channel fmaps
y (None or 2D array, where the shape is (1, class_num)) – the true label
idx (int) – index of the sample to interpret
Note that if X and y are None, estimator.X_[[idx]] and estimator.y_[[idx]] are used instead; with idx=0, the first sample is explained.
- Return type:
Feature importance of the current class