Sensitivity indices computation
Correlation coefficients
- gsa_framework.sensitivity_methods.correlations.correlation_coefficients(filepath_Y, filepath_X_rescaled, cpus=None, selected_iterations=None)
Compute estimates of Pearson and Spearman correlation coefficients between vector Y and all columns of X.
- Parameters
filepath_Y (Path or str) – Filepath to model outputs saved in .hdf5 format.
filepath_X_rescaled (Path or str) – Filepath to rescaled model inputs sampling in .hdf5 format.
- Returns
sa_dict – Dictionary that contains Pearson and Spearman correlation coefficients.
- Return type
dict
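The quantities returned by this function can be illustrated with a minimal NumPy sketch. It works on in-memory arrays rather than .hdf5 files, and the helper names and the dictionary keys `"pearson"` and `"spearman"` are assumptions made for illustration, not the library's actual layout.

```python
import numpy as np


def pearson_columns(X, y):
    """Pearson correlation between y and each column of X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return (Xc.T @ yc) / np.sqrt((Xc**2).sum(axis=0) * (yc**2).sum())


def rank(a):
    """Ranks 0..n-1 along axis 0 (assumes no ties, e.g. continuous samples)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)


def spearman_columns(X, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson_columns(rank(X), rank(y))


rng = np.random.default_rng(0)
X = rng.random((2000, 3))  # stand-in for the rescaled model inputs
# Hypothetical model: linear in X[:, 0], monotone nonlinear in X[:, 1], X[:, 2] unused
y = 3.0 * X[:, 0] + np.exp(2.0 * X[:, 1]) + 0.1 * rng.standard_normal(2000)

sa_dict = {"pearson": pearson_columns(X, y), "spearman": spearman_columns(X, y)}
```

Pearson measures linear association, while Spearman captures any monotone relationship, so the two can rank nonlinear inputs differently.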
- gsa_framework.sensitivity_methods.correlations.get_corrcoef_interval_width(theta=None, iterations=100, confidence_level=0.95)
Compute the confidence interval width given the number of iterations, the “true” value of the correlation coefficient theta, and confidence_level.
- Parameters
theta (float) – “True” correlation coefficient value that the estimator should approach. Can be Pearson, Kendall or Spearman.
iterations (int) – Number of iterations.
confidence_level (float) – Desired confidence level.
- Returns
interval_width_dict – Dictionary with analytical confidence interval width for Pearson, Kendall and Spearman coefficients.
- Return type
dict
References
- Paper:
Bonett and Wright [BW00]
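As a rough illustration of an analytical interval width, the sketch below uses Fisher z-transformed confidence intervals with the variance factors and bias terms commonly attributed to Bonett and Wright. Whether the library uses exactly these constants is an assumption, and the function name and dictionary keys are hypothetical.

```python
import math
from statistics import NormalDist


def corrcoef_interval_width(theta, iterations, confidence_level=0.95):
    """Width of Fisher-z confidence intervals; theta must lie strictly in (-1, 1)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # name: (variance factor c^2, bias term b), assumed from Bonett and Wright
    settings = {
        "pearson": (1.0, 3),
        "spearman": (1.0 + theta**2 / 2, 3),
        "kendall": (0.437, 4),
    }
    center = math.atanh(theta)
    widths = {}
    for name, (c2, b) in settings.items():
        half = z * math.sqrt(c2 / (iterations - b))  # half-width on the z scale
        widths[name] = math.tanh(center + half) - math.tanh(center - half)
    return widths


interval_width_dict = corrcoef_interval_width(theta=0.5, iterations=100)
```

The interval is symmetric on the transformed scale and is mapped back with `tanh`, so the width shrinks roughly with the square root of the number of iterations.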
- gsa_framework.sensitivity_methods.correlations.get_corrcoef_num_iterations(theta=None, interval_width=0.01, confidence_level=0.95)
Compute the number of iterations needed for a confident estimate of a correlation coefficient whose true value equals theta.
- Parameters
theta (float) – “True” correlation coefficient value that the estimator should approach. Can be Pearson, Kendall or Spearman.
interval_width (float) – Desired width of the confidence interval.
confidence_level (float) – Desired confidence level.
- Returns
iterations_dict – Dictionary with number of iterations for Pearson, Kendall and Spearman coefficients.
- Return type
dict
References
- Paper:
Bonett and Wright [BW00]
- Remark for testing:
num_iterations should agree with the values from Table 1 of the paper. Part of the table is checked in the tests. Sometimes there is a difference of ±1 iteration, probably due to minor numerical imprecision.
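The inverse problem, finding the number of iterations for a target interval width, can be sketched as follows. The closed-form first guess and the constants are assumed from Bonett and Wright. The step-wise adjustment afterwards is a simple substitute chosen for clarity and is not necessarily the second stage used in the paper or in the library. All names are hypothetical.

```python
import math
from statistics import NormalDist

# name: (variance factor c^2 as a function of theta, bias term b), assumed constants
SETTINGS = {
    "pearson": (lambda t: 1.0, 3),
    "spearman": (lambda t: 1.0 + t**2 / 2, 3),
    "kendall": (lambda t: 0.437, 4),
}


def fisher_width(theta, n, z, c2, b):
    """Width of the Fisher-z confidence interval for n iterations."""
    half = z * math.sqrt(c2 / (n - b))
    return math.tanh(math.atanh(theta) + half) - math.tanh(math.atanh(theta) - half)


def corrcoef_num_iterations(theta, interval_width=0.01, confidence_level=0.95):
    """Smallest n whose interval is no wider than interval_width; theta in (-1, 1)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    result = {}
    for name, (c2_fun, b) in SETTINGS.items():
        c2 = c2_fun(theta)
        # First guess from the approximate width 2 z c (1 - theta^2) / sqrt(n - b)
        guess = 4 * c2 * (1 - theta**2) ** 2 * (z / interval_width) ** 2 + b
        n = max(b + 1, math.ceil(guess))
        # Adjust the guess until n is the smallest value meeting the target width
        while fisher_width(theta, n, z, c2, b) > interval_width:
            n += 1
        while n - 1 > b and fisher_width(theta, n - 1, z, c2, b) <= interval_width:
            n -= 1
        result[name] = n
    return result


iterations_dict = corrcoef_num_iterations(theta=0.5, interval_width=0.1)
```

Because the result is rounded to an integer, small floating-point differences in the width can shift it by one iteration.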
Sobol indices
Saltelli estimators
- gsa_framework.sensitivity_methods.saltelli_sobol.sobol_indices(filepath_Y, num_params, selected_iterations=None)
Compute estimations of Sobol’ first and total order indices.
High values of the Sobol first order index signify important parameters, while low values of the total indices point to non-important parameters. First order computes main effects only, total order takes into account interactions between parameters.
- Parameters
filepath_Y (Path or str) – Filepath to model outputs y in .hdf5 format obtained by running model according to Saltelli samples.
num_params (int) – Number of model inputs.
selected_iterations (array of ints) – Iterations that should be included to compute Sobol indices.
- Returns
sa_dict – Dictionary that contains computed first and total order Sobol indices.
- Return type
dict
References
- Paper:
Saltelli, Annoni, Azzini, Campolongo, Ratto, and Tarantola [SAA+10]
- Link to the original implementation:
https://github.com/SALib/SALib/blob/master/src/SALib/analyze/sobol.py
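To make the estimators concrete, here is a minimal in-memory sketch of first and total order estimators of the kind described by Saltelli et al., applied to a hypothetical additive model whose analytical first order indices are 0.2, 0.8 and 0. It uses plain random sampling and NumPy arrays instead of .hdf5 files. The model, variable names and dictionary keys are assumptions for illustration.

```python
import numpy as np


def model(X):
    # Hypothetical additive test model: Var(Y) = 1/12 + 4/12, so S = (0.2, 0.8, 0)
    return X[:, 0] + 2.0 * X[:, 1] + 0.0 * X[:, 2]


rng = np.random.default_rng(42)
N, num_params = 20000, 3
A = rng.random((N, num_params))  # two independent sample matrices
B = rng.random((N, num_params))
y_A, y_B = model(A), model(B)

# Centre the outputs to reduce the variance of the first order estimator
mean_y = np.mean(np.concatenate([y_A, y_B]))
var_y = np.var(np.concatenate([y_A, y_B]))
y_A, y_B = y_A - mean_y, y_B - mean_y

first, total = np.empty(num_params), np.empty(num_params)
for i in range(num_params):
    AB = A.copy()
    AB[:, i] = B[:, i]  # matrix A with column i taken from B
    y_AB = model(AB) - mean_y
    first[i] = np.mean(y_B * (y_AB - y_A)) / var_y
    total[i] = 0.5 * np.mean((y_A - y_AB) ** 2) / var_y

sa_dict = {"First order": first, "Total order": total}
```

For an additive model there are no interactions, so the first and total order estimates should roughly coincide.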
Extended Fourier Amplitude Sensitivity Test (eFAST)
- gsa_framework.sensitivity_methods.extended_FAST.eFAST_indices(filepath_Y, num_params, M=4, selected_iterations=None)
Compute estimations of Sobol’ first and total order indices with extended Fourier Amplitude Sensitivity Test (eFAST).
High values of the Sobol first order index signify important parameters, while low values of the total indices point to non-important parameters. First order computes main effects only, total order takes into account interactions between parameters.
- Parameters
filepath_Y (Path or str) – Filepath to model outputs y in .hdf5 format obtained by running model according to eFAST samples.
num_params (int) – Number of model inputs.
M (int) – Interference factor, usually 4 or higher, should be consistent with eFAST sampling.
selected_iterations (array of ints) – Iterations that should be included to compute eFAST Sobol indices.
- Returns
sa_dict – Dictionary that contains computed first and total order Sobol indices.
- Return type
dict
References
- Paper:
Saltelli, Tarantola, and Chan [STC99]
- Link to the original implementation:
https://github.com/SALib/SALib/blob/master/src/SALib/analyze/fast.py
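The idea behind eFAST can be sketched in a few lines: each parameter in turn is assigned a high frequency along a search curve, and the Fourier spectrum of the model output is split into the part at that frequency's harmonics (first order) and the part at low frequencies (everything not involving the parameter). The frequency selection, sample size and names below are simplifying assumptions for illustration and may differ from the library's sampling and analysis code.

```python
import numpy as np


def model(X):
    # Hypothetical additive test model with analytical indices 0.2, 0.8, 0
    return X[:, 0] + 2.0 * X[:, 1] + 0.0 * X[:, 2]


def efast_sketch(model, num_params, N=1025, M=4, seed=0):
    rng = np.random.default_rng(seed)
    omega_max = (N - 1) // (2 * M)  # frequency of the parameter of interest
    m = omega_max // (2 * M)        # upper bound for the remaining frequencies
    # Assumes m >= num_params - 1 so that the remaining frequencies stay distinct
    omega_others = np.floor(np.linspace(1, m, num_params - 1))
    s = 2 * np.pi / N * np.arange(N)  # positions along the search curve
    first, total = np.empty(num_params), np.empty(num_params)
    for i in range(num_params):
        omega = np.empty(num_params)
        omega[i] = omega_max
        omega[np.arange(num_params) != i] = omega_others
        phi = 2 * np.pi * rng.random(num_params)  # random phase shifts
        # Search curve mapping s to points in the unit cube
        X = 0.5 + np.arcsin(np.sin(np.outer(s, omega) + phi)) / np.pi
        y = model(X)
        # Power spectrum at frequencies 1 .. ceil(N/2) - 1 (index k - 1 holds frequency k)
        spectrum = np.abs(np.fft.fft(y)[1 : int(np.ceil(N / 2))]) ** 2 / N**2
        V = 2 * spectrum.sum()
        harmonics = omega_max * np.arange(1, M + 1) - 1
        first[i] = 2 * spectrum[harmonics].sum() / V
        total[i] = 1 - 2 * spectrum[: omega_max // 2].sum() / V
    return {"First order": first, "Total order": total}


sa_dict = efast_sketch(model, num_params=3)
```

The interference factor M sets how many harmonics of the high frequency are counted, which is why the same value has to be used for sampling and analysis.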
Delta moment-independent indices
- gsa_framework.sensitivity_methods.delta.delta_indices(filepath_Y, filepath_X_rescaled, num_resamples=1, conf_level=0.95, seed=None, cpus=None)
Compute estimations of delta moment-independent indices.
- Parameters
filepath_Y (Path or str) – Filepath to model outputs y in .hdf5 format.
filepath_X_rescaled (Path or str) – Filepath to rescaled model inputs sampling in .hdf5 format.
num_resamples (int) – Number of bootstrap resamples to employ bias reduction bootstrap approach.
conf_level (float) – Desired confidence level.
seed (int) – Random seed.
cpus (int) – Number of cpus for parallel computation of delta indices with the multiprocessing library.
- Returns
sa_dict – Dictionary that contains computed delta indices with their confidence intervals.
- Return type
dict
References
- Paper:
Borgonovo [Bor07]
- Link to the original implementation:
https://github.com/SALib/SALib/blob/master/src/SALib/analyze/delta.py
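The delta index of an input is half the expected L1 distance between the unconditional density of the output and its density conditional on that input. The sketch below approximates both densities with plain histograms. This is a deliberate simplification: it is not the density estimator used by the library, and it omits the bootstrap bias reduction and confidence intervals, so small positive values appear even for inputs with no influence. All names are hypothetical.

```python
import numpy as np


def delta_histogram(X, y, num_classes=10, num_bins=10):
    """Histogram-based estimate of the delta index of every column of X."""
    N, num_params = X.shape
    edges = np.linspace(y.min(), y.max(), num_bins + 1)
    p_y = np.histogram(y, bins=edges)[0] / N  # unconditional distribution of y
    delta = np.zeros(num_params)
    for i in range(num_params):
        order = np.argsort(X[:, i])
        # Equal-frequency classes of X_i stand in for conditioning on X_i
        for chunk in np.array_split(order, num_classes):
            p_cond = np.histogram(y[chunk], bins=edges)[0] / len(chunk)
            # Half the L1 distance, weighted by the probability of the class
            delta[i] += 0.5 * np.abs(p_y - p_cond).sum() * len(chunk) / N
    return delta


rng = np.random.default_rng(1)
X = rng.random((20000, 3))
y = X[:, 0] + 2.0 * X[:, 1] + 0.0 * X[:, 2]  # hypothetical test model
delta = delta_histogram(X, y)
```

Unlike variance-based indices, this measure compares whole output distributions, so it does not depend on any particular moment of the output.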
Feature importance with gradient boosting
- gsa_framework.sensitivity_methods.gradient_boosting.xgboost_indices(filepath_Y, filepath_X, tuning_parameters=None, test_size=0.2, xgb_model=None, importance_types=None, flag_return_xgb_model=True)
Compute fscores obtained from gradient boosted trees regression using the XGBoost library.
- Parameters
filepath_Y (Path or str) – Filepath to model outputs y in .hdf5 format.
filepath_X (Path or str) – Filepath to unitcube or rescaled model inputs sampling in .hdf5 format.
tuning_parameters (dict) – Dictionary with XGBoost tuning parameters.
test_size (float) – Fraction of samples for test set.
xgb_model (Path or Booster object) – Model that can be used as warm start.
importance_types (list) – List of feature importance types to compute, by default computes everything.
flag_return_xgb_model (bool) – Specify whether the Booster model should be saved after training.
- Returns
sa_dict – Dictionary that contains computed sensitivity indices.
- Return type
dict
References
- Paper:
Chen and Guestrin [CG16]
- Link to XGBoost library: