Sensitivity analysis

Method base class

class gsa_framework.sensitivity_analysis.method_base.SensitivityAnalysisMethod(model, write_dir, iterations=None, seed=None, cpus=None, available_memory=2, bytes_per_entry=8, use_parallel=True, tag=None)

Base class to define sensitivity analysis. Should be subclassed to define specific sensitivity methods.

Has the following components:

A sampling strategy.
A model execution step computed in parallel.
An analysis function to calculate sensitivity indices.

This class provides a common interface for these components, and utility functions to save data at each step.

Parameters

model (object) – Model object that implements the __len__, __call__ and rescale methods; it can be a subclass of ModelBase.
write_dir (Path or str) – Directory to store all generated data. It is advisable to have separate directories for each model.
iterations (int) – Number of Monte Carlo simulations. If not specified, should be assigned automatically based on requirements of specific GSA methods.
seed (int) – Random seed.
cpus (int) – Number of CPUs for parallel computations with the multiprocessing library.
available_memory (int) – Available RAM in GB.
bytes_per_entry (int) – Desired precision of the data. The default is 8 bytes per array entry, which corresponds to float64.
use_parallel (bool) – Flag to use parallel computations.

Returns

sa_dict – Dictionary with sensitivity indices.

Return type

dict
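The model requirements listed above (__len__, __call__ and rescale) can be illustrated with a small stand-in. This is only a sketch: the class name ToyModel, its constructor arguments and the uniform rescaling are hypothetical, and plain lists are used to keep the example dependency-free, whereas the framework itself works with NumPy arrays.

```python
class ToyModel:
    """Hypothetical linear model exposing __len__, __call__ and rescale."""

    def __init__(self, lower, upper, weights):
        self.lower = list(lower)
        self.upper = list(upper)
        self.weights = list(weights)

    def __len__(self):
        # Number of uncertain model inputs.
        return len(self.weights)

    def rescale(self, X_unitcube):
        # Map each sample from [0, 1] to [lower_i, upper_i]
        # (uniformly distributed inputs are an assumption of this sketch).
        return [
            [lo + x * (hi - lo) for x, lo, hi in zip(row, self.lower, self.upper)]
            for row in X_unitcube
        ]

    def __call__(self, X_rescaled):
        # One output value per row of the rescaled sampling matrix.
        return [sum(w * x for w, x in zip(self.weights, row)) for row in X_rescaled]


model = ToyModel(lower=[0.0, 10.0], upper=[1.0, 20.0], weights=[1.0, 2.0])
X_rescaled = model.rescale([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
Y = model(X_rescaled)
```

Here len(model) gives the number of inputs, rescale turns unit-cube samples into model inputs, and calling the model returns one output per sample.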

generate_gsa_indices(**kwargs)

Computation of GSA indices.

Returns: gsa_indices_dict – Keys are the names of the GSA indices, values are the sensitivity indices for all parameters.
Return type: dict

generate_model_output(return_Y=True)

Wrapper function to obtain model outputs from the X_rescaled parameter sampling matrix.

Run Monte Carlo simulations in parallel or sequentially, and write results to a file.

Returns: Y – Model outputs
Return type: np.array

generate_model_output_parallel(return_Y=True): Obtain model outputs from the X_rescaled in parallel and write them to a file.

generate_model_output_sequential(return_Y=True): Obtain model outputs from the X_rescaled sequentially and write them to a file.

generate_rescaled_samples(return_X=True): Rescale samples from standard uniform to appropriate distributions and write X_rescaled to a file.
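Rescaling from the standard uniform distribution is commonly done with the inverse cumulative distribution function of the target distribution. The sketch below assumes independent normally distributed inputs; the function name rescale_to_normal and its arguments are hypothetical and do not reflect how a particular model implements rescale.

```python
from statistics import NormalDist


def rescale_to_normal(X_unitcube, means, stds):
    """Map unit-cube samples to independent normal inputs via the inverse CDF.

    Every entry of X_unitcube must lie strictly between 0 and 1.
    """
    dists = [NormalDist(mu, sigma) for mu, sigma in zip(means, stds)]
    return [[d.inv_cdf(u) for u, d in zip(row, dists)] for row in X_unitcube]


X_unitcube = [[0.5, 0.5], [0.975, 0.025]]
X_rescaled = rescale_to_normal(X_unitcube, means=[0.0, 10.0], stds=[1.0, 2.0])
```

A unit-cube value of 0.5 maps to the mean of each input, while values near 0 or 1 map to the tails of the distribution.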

generate_unitcube_samples(return_X=True): Generate samples in the [0,1] range and write X_unitcube to a file.

generate_unitcube_samples_based_on_method(iterations): Generate samples in the [0,1] range that follow sampling designs, redefined by specific sensitivity methods if needed.

get_parallel_params(across_iterations): Compute parameters necessary for parallel computations, e.g. chunk_sizes, num_chunks.
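One plausible way to derive such chunking parameters from cpus, available_memory and bytes_per_entry is sketched below. The function get_chunk_params and its logic are assumptions made for illustration, not the framework's implementation: the sampling matrix is split evenly across workers, and chunks are shrunk further if they would not fit in memory together.

```python
import math


def get_chunk_params(iterations, num_params, cpus, available_memory_gb=2, bytes_per_entry=8):
    """Hypothetical chunking rule: return (chunk_size, num_chunks)."""
    # Memory needed for one row of the sampling matrix.
    bytes_per_row = num_params * bytes_per_entry
    # Number of rows that fit in the available memory at once.
    max_rows_in_memory = max(1, int(available_memory_gb * 1024**3) // bytes_per_row)
    # Even split across workers, capped so that all workers fit in memory together.
    rows_per_worker = math.ceil(iterations / cpus)
    chunk_size = min(rows_per_worker, max(1, max_rows_in_memory // cpus))
    num_chunks = math.ceil(iterations / chunk_size)
    return chunk_size, num_chunks


small = get_chunk_params(iterations=1000, num_params=10, cpus=4)
large = get_chunk_params(iterations=10**6, num_params=1000, cpus=4, available_memory_gb=1)
```

In the first call memory is not a constraint, so each of the four workers receives one chunk; in the second call the memory cap forces smaller chunks and therefore more of them.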

make_dirs(): Create subdirectories where intermediate results, such as arrays and figures, will be stored.

perform_gsa(**kwargs): Performs sensitivity analysis from sampling to model runs and computation of indices, and displays required time in seconds.

plot_sa_results(S_dict, S_boolean=None, S_dict_analytical=None, fig_format=[])

Plotting of computed sensitivity indices vs parameters. Figure is saved in the write_dir.

Parameters

S_dict (dict) – Keys are the names of the GSA indices, values are the estimated sensitivity indices for all parameters.
S_boolean (boolean array) – True for known influential inputs and False for non-influential ones.
S_dict_analytical (dict) – Keys are the names of the GSA indices, values are the analytical sensitivity indices for all parameters.
fig_format (list) – List of formats to save figure, can be “pickle”, “jpeg”, “pdf”, etc.

Returns

fig – Figure built with plotly graph objects.

Return type

plotly go.Figure object

Correlation coefficients

class gsa_framework.sensitivity_analysis.correlations.Correlations(**kwargs)

Global sensitivity analysis with correlation coefficients.

calculate_iterations(interval_width=0.1): Computes minimum number of iterations to obtain confidence intervals of width interval_width.
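A common way to link sample size to the confidence interval width of a correlation coefficient is the Fisher z-transformation. The sketch below is an assumption about how such a calculation could look, not the framework's formula; the function name and the confidence_level argument are hypothetical. It uses the fact that the interval is widest when the correlation is zero, where its width is at most 2z/sqrt(n - 3).

```python
import math
from statistics import NormalDist


def min_iterations_for_correlation(interval_width=0.1, confidence_level=0.95):
    """Smallest n whose Fisher-z confidence interval is no wider than interval_width."""
    # Two-sided normal quantile, about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    # Solve 2 * z / sqrt(n - 3) <= interval_width for n.
    return math.ceil((2 * z / interval_width) ** 2 + 3)


n_default = min_iterations_for_correlation()
n_wide = min_iterations_for_correlation(interval_width=0.2)
```

Halving the interval width roughly quadruples the required number of iterations, which is why narrow intervals become expensive quickly.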

generate_gsa_indices_based_on_method(**kwargs): Uses random samples to compute correlation coefficients.
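Correlation-based sensitivity can be sketched as computing, for each input column, its correlation with the model output. The example below assumes Pearson and Spearman coefficients are the quantities of interest; the helper functions and the toy model are hypothetical, and a real analysis would typically rely on NumPy or SciPy instead of plain lists.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Rank of each value (0 = smallest); ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(2000)]
Y = [row[0] + 0.2 * row[1] for row in X]  # the third input has no influence

indices = {
    "pearson": [pearson(column, Y) for column in zip(*X)],
    "spearman": [spearman(column, Y) for column in zip(*X)],
}
```

The first input dominates the output and receives a coefficient close to 1, while the coefficient of the third input stays near zero.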

Sobol indices

Saltelli estimators

class gsa_framework.sensitivity_analysis.saltelli_sobol.SaltelliSobol(**kwargs)

Global sensitivity analysis with Sobol indices estimated by Saltelli and Jansen estimators.

References

Paper: Saltelli, Annoni, Azzini, Campolongo, Ratto, and Tarantola [SAA+10]

calculate_iterations(): Compute number of iterations closest to iterations (if given) and consistent with Saltelli sampling.

generate_gsa_indices_based_on_method(**kwargs): Uses Saltelli samples to compute first and total order Sobol indices.

generate_unitcube_samples_based_on_method(iterations): Generate samples in [0,1] range based on Saltelli block sampling design.
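The block design and the estimators can be sketched together. The example assumes the usual layout with two independent matrices A and B and, for each parameter i, a matrix AB_i equal to A with column i taken from B. First order indices use the estimator recommended in [SAA+10] and total order indices use the Jansen estimator. Function names are hypothetical, and pseudo-random numbers stand in for the quasi-random sequences often used in practice.

```python
import random


def saltelli_blocks(n_base, num_params, seed=None):
    """Return A, B and the list of AB_i matrices with samples in [0, 1)."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(num_params)] for _ in range(n_base)]
    B = [[rng.random() for _ in range(num_params)] for _ in range(n_base)]
    # AB[i] equals A except that column i is copied from B.
    AB = [
        [a[:i] + [b[i]] + a[i + 1:] for a, b in zip(A, B)]
        for i in range(num_params)
    ]
    return A, B, AB


def sobol_indices(model, n_base, num_params, seed=None):
    """Estimate first and total order Sobol indices for a scalar model."""
    A, B, AB = saltelli_blocks(n_base, num_params, seed)
    y_A = [model(x) for x in A]
    y_B = [model(x) for x in B]
    y_all = y_A + y_B
    mean = sum(y_all) / len(y_all)
    variance = sum((y - mean) ** 2 for y in y_all) / len(y_all)
    first, total = [], []
    for i in range(num_params):
        y_ABi = [model(x) for x in AB[i]]
        # First order: mean of f(B) * (f(AB_i) - f(A)), divided by the variance.
        first.append(
            sum(b * (ab - a) for a, b, ab in zip(y_A, y_B, y_ABi)) / n_base / variance
        )
        # Total order (Jansen): mean of (f(A) - f(AB_i))^2 / 2, divided by the variance.
        total.append(
            sum((a - ab) ** 2 for a, ab in zip(y_A, y_ABi)) / (2 * n_base) / variance
        )
    return first, total


# Additive toy model with analytical indices 0.2 and 0.8.
first, total = sobol_indices(lambda x: x[0] + 2 * x[1], n_base=50000, num_params=2, seed=0)
```

For this additive model the first and total order indices coincide, so both estimates should land near 0.2 for the first input and 0.8 for the second.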

Extended Fourier Amplitude Sensitivity Test (eFAST)

class gsa_framework.sensitivity_analysis.extended_FAST.eFAST(M=4, **kwargs)

Global sensitivity analysis with Sobol indices estimated by extended Fourier Amplitude Sensitivity Test.

References

Paper: Saltelli, Tarantola, and Chan [STC99]

generate_unitcube_samples_based_on_method(iterations): Generate samples in the [0,1] range that follow sampling designs, redefined by specific sensitivity methods if needed.
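In eFAST, unit-cube samples are drawn along a periodic search curve, x_i(s) = 0.5 + arcsin(sin(w_i s + phi_i)) / pi, as proposed in [STC99]. The sketch below shows only this transformation. The function name, the example frequencies and the phases are hypothetical, and the selection of frequencies (which in eFAST depends on the number of samples and on an interference factor, presumably the M argument above) is left out.

```python
import math


def efast_search_curve(num_samples, frequencies, phases):
    """Sample the eFAST search curve; returns num_samples rows of values in [0, 1]."""
    # Equally spaced points of the scalar variable s over one period.
    s_values = [2 * math.pi * k / num_samples for k in range(num_samples)]
    return [
        [
            0.5 + math.asin(math.sin(w * s + phi)) / math.pi
            for w, phi in zip(frequencies, phases)
        ]
        for s in s_values
    ]


# One high frequency for the parameter of interest, low ones for the others.
X_unitcube = efast_search_curve(num_samples=65, frequencies=[8, 1, 1], phases=[0.0, 0.3, 1.1])
```

Each column oscillates between 0 and 1 at its own frequency, which is what allows the output variance to be attributed to individual inputs through a Fourier decomposition.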

Delta moment-independent indices

class gsa_framework.sensitivity_analysis.delta.Delta(num_resamples=1, **kwargs)

Global sensitivity analysis with delta moment-independent indices and Latin hypercube sampling.

References

Paper: Borgonovo [Bor07]

generate_gsa_indices_based_on_method(**kwargs): Uses latin hypercube samples to compute Borgonovo delta indices.

generate_unitcube_samples_based_on_method(iterations): Generate samples in [0,1] range based on latin hypercube sampling design.
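Latin hypercube sampling divides the range of each parameter into as many equal strata as there are samples and places exactly one sample in each stratum. A minimal sketch, with a hypothetical function name and plain lists instead of arrays:

```python
import random


def latin_hypercube(iterations, num_params, seed=None):
    """Return iterations rows with num_params values in [0, 1) each."""
    rng = random.Random(seed)
    columns = []
    for _ in range(num_params):
        # One point drawn uniformly inside each of the equal-width strata.
        points = [(k + rng.random()) / iterations for k in range(iterations)]
        # Shuffle so that strata are paired randomly across parameters.
        rng.shuffle(points)
        columns.append(points)
    # Transpose the per-parameter columns into sample rows.
    return [list(row) for row in zip(*columns)]


X_unitcube = latin_hypercube(iterations=10, num_params=3, seed=1)
```

Compared with plain random sampling, every tenth of each parameter range is covered exactly once in this example.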

Feature importance with gradient boosting

class gsa_framework.sensitivity_analysis.gradient_boosting.GradientBoosting(tuning_parameters=None, test_size=0.2, xgb_model=None, **kwargs)

Global sensitivity analysis with feature importance measures from gradient boosted trees.

Computed sensitivity indices include:

weight: the number of times a feature is used to split the data across all trees.
gain: the average gain across all splits the feature is used in.
cover: the average coverage across all splits the feature is used in.
total_gain: the total gain across all splits the feature is used in.
total_cover: the total coverage across all splits the feature is used in.
fscore: how many times each feature is split on.
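The relation between these importance types can be shown on a handful of split records. The sketch below uses made-up (feature, gain, cover) tuples rather than a trained XGBoost model, and the function name is hypothetical; it only mirrors the definitions listed above.

```python
from collections import defaultdict


def importance_from_splits(splits):
    """Aggregate (feature, gain, cover) split records into importance scores."""
    weight = defaultdict(int)
    total_gain = defaultdict(float)
    total_cover = defaultdict(float)
    for feature, gain, cover in splits:
        weight[feature] += 1           # number of splits that use the feature
        total_gain[feature] += gain    # summed gain over those splits
        total_cover[feature] += cover  # summed coverage over those splits
    return {
        "weight": dict(weight),
        "total_gain": dict(total_gain),
        "total_cover": dict(total_cover),
        # Averages are the totals divided by the number of splits.
        "gain": {f: total_gain[f] / weight[f] for f in weight},
        "cover": {f: total_cover[f] / weight[f] for f in weight},
    }


# Illustrative split records, not the output of a real model.
splits = [("x1", 4.0, 100.0), ("x1", 2.0, 50.0), ("x2", 1.0, 80.0)]
scores = importance_from_splits(splits)
```

A feature that is split on often but with small gains can rank high by weight and low by gain, so the importance types do not necessarily agree on the ranking of inputs.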

References

Paper: Chen and Guestrin [CG16]
Useful links: https://xgboost.readthedocs.io/en/latest/python/python_api.html

generate_gsa_indices_based_on_method(**kwargs): Uses XGBoost gradient boosted trees and random samples to compute feature importances.