Sensitivity analysis

Method base class

class gsa_framework.sensitivity_analysis.method_base.SensitivityAnalysisMethod(model, write_dir, iterations=None, seed=None, cpus=None, available_memory=2, bytes_per_entry=8, use_parallel=True, tag=None)

Base class to define sensitivity analysis. Should be subclassed to define specific sensitivity methods.

Has the following components:

  • A sampling strategy.

  • A model execution step computed in parallel.

  • An analysis function to calculate sensitivity indices.

This class provides a common interface for these components, and utility functions to save data at each step.

Parameters
  • model (object) – Model that implements the __len__, __call__ and rescale methods; it can be a subclass of ModelBase.

  • write_dir (Path or str) – Directory to store all generated data. It is advisable to have separate directories for each model.

  • iterations (int) – Number of Monte Carlo simulations. If not specified, it should be assigned automatically based on the requirements of the specific GSA method.

  • seed (int) – Random seed.

  • cpus (int) – Number of CPUs for parallel computations with the multiprocessing library.

  • available_memory (int) – Available RAM in GB.

  • bytes_per_entry (int) – Desired precision of the data. The default is 8 bytes per array entry, which corresponds to float64.

  • use_parallel (bool) – Flag to use parallel computations.

Returns

sa_dict – Dictionary with sensitivity indices.

Return type

dict
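The model requirements listed under Parameters can be illustrated with a minimal sketch. The class name ToyModel, its uniform bounds, and its additive response are hypothetical and are not part of gsa_framework; the sketch only assumes that rows of the sampling matrix are iterations and columns are model inputs.

```python
import numpy as np


class ToyModel:
    """Hypothetical model exposing __len__, __call__ and rescale."""

    def __init__(self, lower, upper):
        self.lower = np.asarray(lower, dtype=float)
        self.upper = np.asarray(upper, dtype=float)

    def __len__(self):
        # Number of uncertain model inputs.
        return self.lower.size

    def rescale(self, X_unitcube):
        # Map samples from [0, 1] to uniform ranges [lower, upper].
        return self.lower + X_unitcube * (self.upper - self.lower)

    def __call__(self, X_rescaled):
        # One output per row (assumed convention: rows are iterations).
        return X_rescaled.sum(axis=1)


model = ToyModel(lower=[0.0, 10.0], upper=[1.0, 20.0])
X_unitcube = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
Y = model(model.rescale(X_unitcube))
```

An object of this shape could then be passed as the model argument, provided the real class expects the same row and column convention.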

generate_gsa_indices(**kwargs)

Computation of GSA indices.

Returns

gsa_indices_dict – Keys are the names of the GSA indices; values are the sensitivity indices for all parameters.

Return type

dict

generate_model_output(return_Y=True)

Wrapper function to obtain model outputs from the X_rescaled parameter sampling matrix.

Run Monte Carlo simulations in parallel or sequentially, and write results to a file.

Returns

Y – Model outputs

Return type

np.array

generate_model_output_parallel(return_Y=True)

Obtain model outputs from X_rescaled in parallel and write them to a file.

generate_model_output_sequential(return_Y=True)

Obtain model outputs from X_rescaled sequentially and write them to a file.

generate_rescaled_samples(return_X=True)

Rescale samples from standard uniform to appropriate distributions and write X_rescaled to a file.
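Rescaling from the standard uniform distribution is commonly done with the inverse CDF of each target distribution. The sketch below shows that idea for two hypothetical inputs (one uniform on [a, b], one exponential with rate lam); the distributions and variable values are assumptions for illustration, and the actual rescaling is delegated to the model's own rescale method.

```python
import numpy as np

rng = np.random.default_rng(42)
X_unitcube = rng.random((1000, 2))  # samples in [0, 1)

# Column 0: uniform on [a, b], inverse CDF is a + u * (b - a).
a, b = 2.0, 5.0
# Column 1: exponential with rate lam, inverse CDF is -ln(1 - u) / lam.
lam = 0.5

X_rescaled = np.empty_like(X_unitcube)
X_rescaled[:, 0] = a + X_unitcube[:, 0] * (b - a)
X_rescaled[:, 1] = -np.log1p(-X_unitcube[:, 1]) / lam
```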

generate_unitcube_samples(return_X=True)

Generate samples in [0,1] range and write X_unitcube to a file.

generate_unitcube_samples_based_on_method(iterations)

Generate samples in [0,1] range that follow sampling designs, redefined by specific sensitivity methods if needed.

get_parallel_params(across_iterations)

Compute parameters necessary for parallel computations, e.g. chunk_sizes, num_chunks.
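One plausible way to derive such parameters from available_memory and bytes_per_entry is sketched below. The helper chunk_plan and its formula are assumptions made for illustration; the source does not state how get_parallel_params computes its values.

```python
import math


def chunk_plan(iterations, num_params, available_memory=2, bytes_per_entry=8):
    """Hypothetical helper: split `iterations` rows into memory-sized chunks."""
    bytes_available = available_memory * 1024**3  # GB to bytes
    bytes_per_row = num_params * bytes_per_entry  # one sample of all inputs
    # Largest number of rows that fits in memory, capped at `iterations`.
    chunk_size = max(1, min(iterations, bytes_available // bytes_per_row))
    num_chunks = math.ceil(iterations / chunk_size)
    return chunk_size, num_chunks


chunk_size, num_chunks = chunk_plan(iterations=10_000_000, num_params=100)
```

With 2 GB of memory and float64 entries, ten million samples of 100 inputs do not fit at once, so the sketch splits them into several chunks.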

make_dirs()

Create subdirectories where intermediate results will be stored, such as arrays and figures.

perform_gsa(**kwargs)

Performs sensitivity analysis from sampling to model runs and computation of indices, and displays required time in seconds.

plot_sa_results(S_dict, S_boolean=None, S_dict_analytical=None, fig_format=[])

Plots the computed sensitivity indices against the parameters. The figure is saved in write_dir.

Parameters
  • S_dict (dict) – Keys are the names of the GSA indices; values are the estimated sensitivity indices for all parameters.

  • S_boolean (boolean array) – True for known influential inputs and False for non-influential ones.

  • S_dict_analytical (dict) – Keys are the names of the GSA indices; values are the analytical sensitivity indices for all parameters.

  • fig_format (list) – List of formats to save figure, can be “pickle”, “jpeg”, “pdf”, etc.

Returns

fig – Graphical objects figure from plotly.

Return type

plotly go.Figure object

Correlation coefficients

class gsa_framework.sensitivity_analysis.correlations.Correlations(**kwargs)

Global sensitivity analysis with correlation coefficients.

calculate_iterations(interval_width=0.1)

Computes minimum number of iterations to obtain confidence intervals of width interval_width.
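The source does not say how this bound is derived. One textbook approach, shown here purely as an assumption, uses the Fisher z-transformation of a correlation coefficient: the confidence interval is widest at r = 0, where its width is 2 tanh(z / sqrt(n - 3)), and solving for n gives a conservative sample size. The function name and the confidence argument are hypothetical.

```python
import math
from statistics import NormalDist


def iterations_for_interval_width(interval_width=0.1, confidence=0.95):
    """Hypothetical sketch: sample size from the Fisher z-transformation."""
    # Two-sided standard normal quantile, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Worst case r = 0: require 2 * tanh(z / sqrt(n - 3)) <= interval_width.
    return math.ceil(3 + (z / math.atanh(interval_width / 2)) ** 2)


n = iterations_for_interval_width(0.1)
```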

generate_gsa_indices_based_on_method(**kwargs)

Uses random samples to compute correlation coefficients.
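As a rough illustration of correlation-based indices, the sketch below computes Pearson and Spearman coefficients between each input column and the model output using NumPy only. The function names, the dictionary keys, and the choice of these two coefficients are assumptions; the class may compute a different set.

```python
import numpy as np


def ranks(a):
    # Rank of each entry within its column (ties are not averaged here).
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)


def correlation_indices(X, Y):
    """Hypothetical sketch: per-input Pearson and Spearman coefficients."""
    num_params = X.shape[1]
    pearson = np.array([np.corrcoef(X[:, i], Y)[0, 1] for i in range(num_params)])
    RX, RY = ranks(X), ranks(Y)
    # Spearman is the Pearson coefficient of the ranks.
    spearman = np.array([np.corrcoef(RX[:, i], RY)[0, 1] for i in range(num_params)])
    return {"pearson": pearson, "spearman": spearman}


rng = np.random.default_rng(0)
X = rng.random((2000, 3))
Y = 4.0 * X[:, 0] + 1.0 * X[:, 1]  # the third input has no influence
indices = correlation_indices(X, Y)
```

For this linear toy model the first input should receive the largest coefficient and the third a value near zero.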

Sobol indices

Saltelli estimators

class gsa_framework.sensitivity_analysis.saltelli_sobol.SaltelliSobol(**kwargs)

Global sensitivity analysis with Sobol indices estimated by Saltelli and Jansen estimators.

References

Paper:

Saltelli, Annoni, Azzini, Campolongo, Ratto, and Tarantola [SAA+10]

calculate_iterations()

Compute number of iterations closest to iterations (if given) and consistent with Saltelli sampling.

generate_gsa_indices_based_on_method(**kwargs)

Uses Saltelli samples to compute first and total order Sobol indices.

generate_unitcube_samples_based_on_method(iterations)

Generate samples in [0,1] range based on Saltelli block sampling design.
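A minimal NumPy sketch of a Saltelli-style block design and the corresponding estimators follows. It assumes two independent base matrices A and B plus one matrix AB_i per input (A with column i taken from B), the first order estimator from Saltelli et al. [SAA+10], and the Jansen total order estimator. The function names and the use of plain pseudo-random base samples are illustrative assumptions and need not match the class internals.

```python
import numpy as np


def saltelli_blocks(n_base, num_params, rng):
    """Hypothetical sketch: base blocks A, B and mixed blocks AB_i."""
    A = rng.random((n_base, num_params))
    B = rng.random((n_base, num_params))
    AB = []
    for i in range(num_params):
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]  # only column i comes from B
        AB.append(AB_i)
    return A, B, AB


def sobol_indices(model, A, B, AB):
    yA, yB = model(A), model(B)
    var = np.var(np.concatenate([yA, yB]))
    first = np.empty(len(AB))
    total = np.empty(len(AB))
    for i, AB_i in enumerate(AB):
        yAB = model(AB_i)
        first[i] = np.mean(yB * (yAB - yA)) / var        # Saltelli et al. 2010
        total[i] = 0.5 * np.mean((yA - yAB) ** 2) / var  # Jansen
    return first, total


def linear_model(X):
    # Analytical indices are 0.2, 0.8 and 0 (third input unused).
    return (X[:, 0] - 0.5) + 2.0 * (X[:, 1] - 0.5)


rng = np.random.default_rng(1)
A, B, AB = saltelli_blocks(n_base=2**15, num_params=3, rng=rng)
first, total = sobol_indices(linear_model, A, B, AB)
total_iterations = A.shape[0] * (A.shape[1] + 2)  # n_base * (k + 2) model runs
```

In this design the total number of model runs is a multiple of k + 2, which is one reason a requested number of iterations may need to be adjusted to the nearest consistent value.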

Extended Fourier Amplitude Sensitivity Test (eFAST)

class gsa_framework.sensitivity_analysis.extended_FAST.eFAST(M=4, **kwargs)

Global sensitivity analysis with Sobol indices estimated by extended Fourier Amplitude Sensitivity Test.

References

Paper:

Saltelli, Tarantola, and Chan [STC99]

generate_unitcube_samples_based_on_method(iterations)

Generate samples in [0,1] range that follow sampling designs, redefined by specific sensitivity methods if needed.
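For orientation, FAST-type designs sample each input along a periodic search curve. The sketch below uses the transformation x_i(s) = 1/2 + arcsin(sin(omega_i s + phi_i)) / pi from Saltelli, Tarantola, and Chan [STC99] with a random phase shift per input. The function name and the example frequencies are hypothetical; how the class chooses frequencies from M and the number of iterations is not shown here.

```python
import numpy as np


def search_curve(num_samples, frequencies, rng):
    """Hypothetical sketch: sample inputs along a FAST search curve."""
    s = np.linspace(-np.pi, np.pi, num_samples, endpoint=False)
    omega = np.asarray(frequencies, dtype=float)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=omega.size)  # random phase shifts
    # x_i(s) = 1/2 + arcsin(sin(omega_i * s + phi_i)) / pi, values in [0, 1]
    return 0.5 + np.arcsin(np.sin(np.outer(s, omega) + phi)) / np.pi


rng = np.random.default_rng(3)
# Illustrative frequencies: a high one for the input of interest, low for the rest.
X_unitcube = search_curve(num_samples=65, frequencies=[8, 1, 1], rng=rng)
```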

Delta moment-independent indices

class gsa_framework.sensitivity_analysis.delta.Delta(num_resamples=1, **kwargs)

Global sensitivity analysis with delta moment-independent indices and Latin hypercube sampling.

References

Paper:

Borgonovo [Bor07]

generate_gsa_indices_based_on_method(**kwargs)

Uses Latin hypercube samples to compute Borgonovo delta indices.

generate_unitcube_samples_based_on_method(iterations)

Generate samples in [0,1] range based on Latin hypercube sampling design.
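A basic Latin hypercube design can be sketched in a few lines of NumPy: each input range is split into as many equal strata as there are iterations, one point is drawn in every stratum, and the strata are shuffled independently per input. The function name is hypothetical, and the class may rely on a different implementation.

```python
import numpy as np


def latin_hypercube(iterations, num_params, rng):
    """Hypothetical sketch: one sample per stratum in every dimension."""
    # Random point inside each of `iterations` equal-width strata of [0, 1).
    strata = np.arange(iterations)[:, None]
    points = (strata + rng.random((iterations, num_params))) / iterations
    # Shuffle the strata independently for each input.
    for j in range(num_params):
        points[:, j] = rng.permutation(points[:, j])
    return points


rng = np.random.default_rng(7)
X_unitcube = latin_hypercube(iterations=10, num_params=3, rng=rng)
```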

Feature importance with gradient boosting

class gsa_framework.sensitivity_analysis.gradient_boosting.GradientBoosting(tuning_parameters=None, test_size=0.2, xgb_model=None, **kwargs)

Global sensitivity analysis with feature importance measures from gradient boosted trees.

Computed sensitivity indices include:

  • weight: the number of times a feature is used to split the data across all trees.

  • gain: the average gain across all splits the feature is used in.

  • cover: the average coverage across all splits the feature is used in.

  • total_gain: the total gain across all splits the feature is used in.

  • total_cover: the total coverage across all splits the feature is used in.

  • fscore: how many times each feature is split on.

References

Paper:

Chen and Guestrin [CG16]

Useful links:

https://xgboost.readthedocs.io/en/latest/python/python_api.html

generate_gsa_indices_based_on_method(**kwargs)

Uses XGBoost gradient boosted trees and random samples to compute feature importances.