ibicus.evaluate module

The evaluate module provides a set of functionalities to assess the performance of your bias adjustment method.

Bias adjustment is prone to misuse and requires careful evaluation, as demonstrated and argued in Maraun et al. 2017. In particular, the bias adjustment methods implemented in this package operate on a marginal level, meaning that they correct the distribution of individual variables at individual locations. There is therefore only a subset of climate model biases that these debiasers will be able to correct. Biases in the temporal or spatial structure of climate models, or in the feedbacks to large-scale weather patterns, might not be well corrected.

The evaluate module attempts to provide the user with the functionality to make an informed decision whether a chosen bias adjustment method is fit for purpose: whether it corrects marginal as well as spatial and temporal statistical properties in the desired manner, how it modifies the multivariate structure, if and how it modifies the climate change trend, and how it changes the bias in selected climate impact metrics.

There are three components to the evaluation module:

1. Evaluating the bias adjusted model on a validation period

In order to assess the performance of a bias adjustment method, the bias adjusted model data is compared to observational / reanalysis data. The historical period for which observations exist is therefore split into two datasets during pre-processing: a reference period and a validation period.

Both statistical properties, such as quantiles or the mean of the bias adjusted variables, and tailored threshold metrics of particular relevance to the use-case can be investigated. A threshold metric is an instance of the ThresholdMetric class. A number of threshold metrics, such as dry days, are pre-defined in the package. The user can modify existing metrics or create new metrics from scratch. Threshold metrics are defined by the variable they refer to, an absolute threshold value that can also vary by location or time period (such as day of year, or season), a name, and whether the threshold sets a lower, higher, outer or inner bound on the variable of interest. An example of a threshold metric is:

>>> frost_days = ThresholdMetric(name="Frost days (tasmin<0°C)", variable="tasmin", threshold_value=273.13, threshold_type="lower")

The bias before and after bias adjustment in both statistical properties as well as threshold metrics can be evaluated marginally (i.e. location-wise). Furthermore, the temporal spell length, the spatial extent and the spatiotemporal cluster size of threshold metrics such as hot days can be analysed and plotted. Spatial and multivariate statistical properties can also be evaluated in an experimental setting. The following table provides an overview of the different components that can be analysed in each of these two categories:

                  Statistical properties    Threshold metrics
Marginal          x                         x
Temporal                                    x (spell length)
Spatial           x (RMSE)                  x (spatial extent)
Spatiotemporal                              x (cluster size)
Multivariate      x (correlation)           x (joint exceedance)

Within the metrics class, the following functions are available:

metrics.ThresholdMetric.from_quantile(x, q, ...)

Creates a threshold metric from a quantile of an array x.

metrics.ThresholdMetric.calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

metrics.ThresholdMetric.filter_threshold_exceedances(dataset)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

metrics.ThresholdMetric.calculate_exceedance_probability(dataset)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

metrics.ThresholdMetric.calculate_number_annual_days_beyond_threshold(...)

Calculates number of days beyond threshold for each year in the dataset.

metrics.ThresholdMetric.calculate_spell_length(...)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

metrics.ThresholdMetric.calculate_spatial_extent(...)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

metrics.ThresholdMetric.calculate_spatiotemporal_clusters(...)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.
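
Under the hood, the exceedance computations above reduce to simple array operations. The following is a minimal numpy sketch of what calculate_instances_of_threshold_exceedance and calculate_exceedance_probability compute for a "lower"-type metric such as frost days; the data array and its shape are made up for illustration and are not part of the package:

```python
import numpy as np

# Hypothetical daily minimum temperatures [time, lat, lon] in Kelvin
tasmin = np.array([[[270.0, 275.0]], [[272.0, 280.0]], [[274.0, 276.0]]])

threshold = 273.13  # frost-day threshold value used in the docs above

# 1 where the "lower" threshold condition is met (strict <), 0 otherwise
exceedance = (tasmin < threshold).astype(int)

# Probability of metric occurrence at each location, across the time axis
probability = exceedance.mean(axis=0)

print(exceedance[:, 0, 0])  # -> [1 1 0]
```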

The AccumulativeThresholdMetric class is a child class of ThresholdMetric that adds functionality for variables and metrics where the total accumulated amount beyond a given threshold is of interest. This is the case for precipitation, for example, but not for temperature. The following functions are added:

metrics.AccumulativeThresholdMetric.calculate_percent_of_total_amount_beyond_threshold(dataset)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

metrics.AccumulativeThresholdMetric.calculate_annual_value_beyond_threshold(...)

Calculates amount beyond threshold for each year in the dataset.

metrics.AccumulativeThresholdMetric.calculate_intensity_index(dataset)

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.
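
As a sketch of the accumulative quantities in plain numpy, using made-up precipitation values at a single location (the SDII-style reading of the intensity index below is an assumption for illustration, not taken from the package):

```python
import numpy as np

pr = np.array([0.0, 2.0, 12.0, 30.0, 1.0])  # hypothetical daily precipitation totals
threshold = 10.0

exceeding = pr[pr > threshold]

# Percentage of the total amount that falls on days beyond the threshold
percent_beyond = 100 * exceeding.sum() / pr.sum()

# Intensity index: amount on exceeding days per exceedance instance (SDII-style)
intensity = exceeding.sum() / len(exceeding)

print(round(percent_beyond, 1))  # -> 93.3
print(intensity)                 # -> 21.0
```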

For the evaluation of marginal properties, the following functions are currently available:

marginal.calculate_marginal_bias(obs[, ...])

Returns a pd.DataFrame containing the location-wise percentage bias of different metrics: mean, 5th and 95th percentile, as well as the metrics specified in metrics, comparing observations to climate model output during a validation period.

marginal.plot_marginal_bias(variable, bias_df)

Returns boxplots showing distribution of the percentage bias over locations of different metrics, based on calculation performed in calculate_marginal_bias().

marginal.plot_bias_spatial(variable, metric, ...)

Returns spatial plots of bias at each location with respect to one specified metric, based on calculation performed in calculate_marginal_bias().

marginal.calculate_bias_days_metrics(obs_data)

Returns a pd.DataFrame containing location-wise mean number of yearly threshold exceedances.

marginal.plot_spatiotemporal([data, ...])

Plots empirical CDFs of spatiotemporal cluster sizes over the entire area.

marginal.plot_histogram(variable, data_obs)

Plots histogram over entire area or at single location.
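
The percentage bias itself is straightforward; here is a numpy sketch of the location-wise quantity that calculate_marginal_bias evaluates for the mean and 95th percentile (both arrays are synthetic stand-ins for observations and bias adjusted model output in the validation period):

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, size=(365, 2, 2))  # synthetic observations [time, lat, lon]
cm = rng.normal(11.0, 2.0, size=(365, 2, 2))   # synthetic bias adjusted model output

# Location-wise percentage bias of the mean
bias_mean = 100 * (cm.mean(axis=0) - obs.mean(axis=0)) / obs.mean(axis=0)

# Location-wise percentage bias of the 95th percentile
obs_q95 = np.quantile(obs, 0.95, axis=0)
bias_q95 = 100 * (np.quantile(cm, 0.95, axis=0) - obs_q95) / obs_q95

print(bias_mean.shape, bias_q95.shape)  # -> (2, 2) (2, 2)
```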

The following functions are available to analyse the bias in spatial correlation structure:

correlation.rmse_spatial_correlation_distribution(...)

Calculates Root-Mean-Squared-Error between observed and modelled spatial correlation matrix at each location.

correlation.rmse_spatial_correlation_boxplot(...)

Boxplot of RMSE of spatial correlation across locations.
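
Roughly, the RMSE quantity works as follows; this numpy sketch compares the rows of observed and modelled spatial correlation matrices over flattened locations (the data are synthetic and uncorrelated, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.normal(size=(500, 4))  # synthetic data [time, location], 4 flattened locations
cm = rng.normal(size=(500, 4))   # synthetic model output, same shape

corr_obs = np.corrcoef(obs, rowvar=False)  # observed spatial correlation matrix
corr_cm = np.corrcoef(cm, rowvar=False)    # modelled spatial correlation matrix

# Row-wise RMSE between the two correlation matrices: one value per location
rmse = np.sqrt(((corr_obs - corr_cm) ** 2).mean(axis=1))
print(rmse.shape)  # -> (4,)
```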

To analyse the multivariate correlation structure, as well as joint threshold exceedances:

multivariate.calculate_conditional_joint_threshold_exceedance(...)

Returns a pd.DataFrame containing location-wise conditional exceedance probability.

multivariate.plot_conditional_joint_threshold_exceedance(...)

Accepts output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.

multivariate.plot_conditional_probability_spatial(bias_df)

Spatial plot of bias at each location with respect to one specified metric.

multivariate.calculate_and_spatialplot_multivariate_correlation(...)

Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs spatial plot.

multivariate.plot_correlation_single_location(...)

Uses seaborn.regplot and output of create_multivariate_dataframes() to plot scatterplot and Pearson correlation estimate of the two specified variables.

multivariate.plot_bootstrap_correlation_replicates(...)

Plots histograms of correlation between variables in input dataframes estimated via bootstrap using _calculate_bootstrap_correlation_replicates().
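
The per-location correlation underlying these functions can be sketched in numpy; here tas and pr are two synthetic, deliberately correlated arrays standing in for two variables:

```python
import numpy as np

rng = np.random.default_rng(2)
tas = rng.normal(288.0, 3.0, size=(365, 2, 2))           # synthetic temperature
pr = 0.5 * (tas - 288.0) + rng.normal(size=(365, 2, 2))  # correlated synthetic proxy

nlat, nlon = tas.shape[1], tas.shape[2]
corr = np.empty((nlat, nlon))
for i in range(nlat):
    for j in range(nlon):
        # Pearson correlation between the two variables at this location
        corr[i, j] = np.corrcoef(tas[:, i, j], pr[:, i, j])[0, 1]

print(corr.shape)  # -> (2, 2)
```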

2. Investigating whether the climate change trend is preserved

Bias adjustment methods can significantly modify the trend projected in the climate model simulation (Switanek 2017). If the user does not consider the simulated trend to be credible, then modifying it can be a good thing to do. However, any trend modification should always be a conscious and informed choice, and the belief that a bias adjustment method will improve the trend should be justified. Otherwise, the trend modification introduced by applying a bias adjustment method should be considered an artifact.

This component helps the user assess whether a certain method preserves the climate model trend or not. Some methods implemented in this package are explicitly trend preserving; for more details see the methodologies and descriptions of the individual debiasers.

trend.calculate_future_trend_bias(...[, ...])

For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default) as well as metrics of class ThresholdMetric (from ibicus.evaluate.metrics) passed as arguments to the function.

trend.plot_future_trend_bias_boxplot(...[, ...])

Accepts output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of different metrics.

trend.plot_future_trend_bias_spatial(...[, ...])

Accepts output given by calculate_future_trend_bias() and creates a spatial plot of the trend bias for one chosen metric.
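
The trend-bias idea can be sketched in numpy: compute the change in a statistic between a historical and a future period for both the raw and the bias corrected model, then compare the two changes. All arrays and the +2 K / +1.5 K trends below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
raw_hist = rng.normal(288.0, 2.0, size=(365, 2, 2))  # synthetic raw model, historical
raw_fut = raw_hist + 2.0   # raw model projects a +2 K change everywhere
bc_hist = raw_hist - 1.0   # synthetic bias corrected model, historical
bc_fut = bc_hist + 1.5     # bias corrected model ends up with a +1.5 K change

trend_raw = raw_fut.mean(axis=0) - raw_hist.mean(axis=0)
trend_bc = bc_fut.mean(axis=0) - bc_hist.mean(axis=0)

# Percentage bias in the trend of the mean, at each location
trend_bias = 100 * (trend_bc - trend_raw) / trend_raw
print(round(trend_bias[0, 0], 6))  # -> -25.0
```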

3. Testing assumptions of different debiasers

Different debiasers rely on different assumptions: some are parametric, others non-parametric; some bias correct each day or month of the year separately, others are applied to all days of the year in the same way.

This component enables the user to check some of these assumptions. It can, for example, help the user choose an appropriate distribution to fit the data to or an appropriate application window (entire year vs. each day or month individually), and rule out the use of debiasers that are not fit for purpose in a specific application.

The current version of this component can help answer the following two questions:

  • Is the fit of the default distribution ‘good enough’, or should a different distribution be used?

  • Is there any seasonality in the data that should be accounted for, for example by applying a ‘running window mode’ (meaning that the bias adjustment is fitted separately for different parts of the year, i.e. windows)?

The following functions are currently available:

assumptions.calculate_aic(variable, dataset, ...)

Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

assumptions.plot_aic(variable, aic_values[, ...])

Creates a boxplot of AIC values across all locations.

assumptions.plot_fit_worst_aic(variable, ...)

Plots a histogram and overlaid fit at the location with the worst AIC.

assumptions.plot_quantile_residuals(...[, ...])

Plots timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals at one location.
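
As a numpy sketch of the AIC comparison that calculate_aic performs, here is a maximum-likelihood normal fit at a single location with synthetic data; in practice several candidate distributions would be fitted and their AIC values compared:

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(285.0, 3.0, size=1000)  # synthetic daily values at one location

# Maximum-likelihood fit of a normal distribution
mu, sigma = data.mean(), data.std()

# Log-likelihood of the fitted normal, and AIC with k = 2 fitted parameters
loglik = np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (data - mu) ** 2 / (2 * sigma**2))
aic = 2 * 2 - 2 * loglik

# A lower AIC across candidate distributions indicates a better fit
print(f"AIC = {aic:.1f}")
```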

ibicus.evaluate.metrics

Metrics module - Provides the possibility to define threshold-sensitive climate metrics that are analysed here and used further in ibicus.marginal, ibicus.multivariate and ibicus.trend.

ibicus.evaluate.metrics.ThresholdMetric

class ibicus.evaluate.metrics.ThresholdMetric(threshold_value, threshold_type, threshold_scope='overall', threshold_locality='global', name='unknown', variable='unknown')

Generic climate metric defined by exceedance or underceedance of a threshold, or by values between an upper and a lower threshold; this is determined by threshold_type. These metrics can be defined either overall or daily, monthly or seasonally (threshold_scope), and either globally or location-wise (threshold_locality).

Organises the definition and functionalities of such metrics. Among other things, this makes it possible to implement the ETCCDI / Climdex climate extreme indices.

Examples

>>> warm_days = ThresholdMetric(threshold_value = 295, threshold_type = "higher", name = "Mean warm days (K)", variable = "tas")
>>> warm_days_by_season = ThresholdMetric(threshold_value = {"Winter": 290, "Spring": 292, "Summer": 295, "Autumn": 292}, threshold_type = "higher", threshold_scope = "season", name = "Mean warm days (K)", variable = "tas")
>>> q_90 = ThresholdMetric.from_quantile(obs, 0.9, threshold_type = "higher", name = "90th observational quantile", variable = "tas")
>>> q_10_season = ThresholdMetric.from_quantile(obs, 0.1, threshold_type = "lower", threshold_scope = "season", time = time_obs, name = "10th quantile by season", variable = "tas")
>>> outside_10_90_month_local = ThresholdMetric.from_quantile(obs, [0.1, 0.9], threshold_type = "outside", threshold_scope = "month", threshold_locality = "local", time = time_obs, name = "Outside 10th, 90th quantile by month", variable = "tas")
Attributes:
threshold_value : Union[np.array, float, list, dict]

Threshold value(s) for the variable (in the correct unit).

  • If threshold_type = "higher" or threshold_type = "lower", this is just a single float value and the metric is defined as exceedance or underceedance of that value (if threshold_scope = ‘overall’ and threshold_locality = ‘global’).

  • If threshold_type = "between" or threshold_type = "outside", then this needs to be a list in the form: [lower_bound, upper_bound] and the metric is defined as falling in between, or falling outside these values (if threshold_scope = ‘overall’ and threshold_locality = ‘global’).

  • If threshold_locality = "local" then instead of a single element (within a list, depending on threshold_type) a np.ndarray is stored here for locally defined threshold.

  • If threshold_scope is one of ["day", "month", "season"] then instead of a (list of) single element(s) or a np.ndarray a dict is stored whose keys are the times (for example the seasons) and values contain the thresholds (either locally or globally).

threshold_type : str

One of ["higher", "lower", "between", "outside"]. Indicates whether we are interested in values above the threshold value (“higher”, strict >), values below the threshold value (“lower”, strict <), values between the threshold values (“between”, non-strict, including the bounds) or outside the threshold values (“outside”, strict, not including the bounds).

threshold_scope : str = "overall"

One of ["day", "month", "season", "overall"]. Indicates whether thresholds are defined irrespective of time or on a daily, monthly or seasonal basis.

threshold_locality : str = "global"

One of ["global", "local"]. Indicates whether thresholds are defined globally or location-wise.

name : str = "unknown"

Metric name. Will be used in dataframes, plots etc. It is recommended to include the threshold value and units. Example: ‘Frost days (tasmin < 0°C)’. Default: "unknown".

variable : str = "unknown"

Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: "unknown".

Methods

calculate_exceedance_probability(dataset[, time])

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_number_annual_days_beyond_threshold(...)

Calculates number of days beyond threshold for each year in the dataset.

calculate_spatial_extent(**climate_data)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset[, time])

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

from_quantile(x, q, threshold_type[, ...])

Creates a threshold metric from a quantile of an array x.

classmethod from_quantile(x, q, threshold_type, threshold_scope='overall', threshold_locality='global', time=None, name='unknown', variable='unknown')

Creates a threshold metric from a quantile of an array x.

Parameters:
x : np.ndarray

Array with respect to which the quantile is calculated.

q : Union[int, float, list]

Quantile (or list of lower and upper quantiles if threshold_type in ["between", "outside"]) at which the threshold is instantiated.

threshold_type : str

One of ["higher", "lower", "between", "outside"]. Indicates whether we are interested in values above the threshold value (“higher”, strict >), values below the threshold value (“lower”, strict <), values between the threshold values (“between”, strict, not including the bounds) or outside the threshold values (“outside”, strict, not including the bounds).

threshold_scope : str = "overall"

One of ["day", "month", "season", "overall"]. Indicates whether thresholds (and the quantiles calculated) are defined irrespective of time or on a daily, monthly or seasonal basis.

threshold_locality : str = "global"

One of ["global", "local"]. Indicates whether thresholds (and the quantiles calculated) are defined globally or location-wise.

time: Optional[np.ndarray] = None

If the threshold is time-sensitive (threshold_scope in [“day”, “month”, “season”]) then time information corresponding to x is required. Should be a numpy 1d array of times.

name : str = "unknown"

Metric name. Will be used in dataframes, plots etc. It is recommended to include the threshold value and units. Example: ‘Frost days (tasmin < 0°C)’. Default: "unknown".

variable : str = "unknown"

Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: "unknown".

Examples

>>> m1 = ThresholdMetric.from_quantile(obs, 0.8, threshold_type = "higher", name = "m1")
>>> m2 = ThresholdMetric.from_quantile(obs, 0.2, threshold_type = "lower", threshold_scope = "season", threshold_locality = "local", time = time_obs, name = "m2")
>>> m3 = ThresholdMetric.from_quantile(obs, [0.2, 0.8], threshold_type = "outside", threshold_scope="month", threshold_locality = "local", time=time_obs, name = "m3")
calculate_instances_of_threshold_exceedance(dataset, time=None)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray = None

Time corresponding to each observation in dataset; required only for time-sensitive thresholds (threshold_scope in ["day", "month", "season"]).

filter_threshold_exceedances(dataset, time=None)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray = None

Time corresponding to each observation in dataset; required only for time-sensitive thresholds (threshold_scope in ["day", "month", "season"]).

calculate_exceedance_probability(dataset, time=None)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray = None

Time corresponding to each observation in dataset; required only for time-sensitive thresholds (threshold_scope in ["day", "month", "season"]).

Returns:
np.ndarray

Probability of metric occurrence at each location.

calculate_number_annual_days_beyond_threshold(dataset, time)

Calculates number of days beyond threshold for each year in the dataset.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray

Time corresponding to each observation in dataset; required to calculate annual threshold occurrences.

Returns:
np.ndarray

3d array - [years, lat, long]

calculate_spell_length(minimum_length, **climate_data)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

A spell length is defined as the number of days that a threshold is continuously exceeded or underceeded, or that values are continuously between or outside the thresholds (depending on self.threshold_type). The output dataframe has three columns: ‘Correction Method’ - obs/raw or the name of the debiaser as specified in **climate_data, ‘Metric’ - the name of the threshold metric, and ‘Spell length - individual spell length counts’.

Parameters:
minimum_length : int

Minimum spell length (in days) investigated.

climate_data

Keyword arguments providing the input data to investigate. Should be np.ndarrays of observations or, if the threshold is time sensitive (threshold_scope in ["day", "month", "season"]), lists of the form [cm_data, time_cm_data], where time_cm_data is a 1d numpy array of times corresponding to the values in cm_data.

Returns:
pd.DataFrame

Dataframe of spell lengths of metric occurrences.

Examples

>>> dry_days.calculate_spell_length(minimum_length = 4, obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
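
The spell-length computation itself amounts to run-length encoding of the exceedance indicator; a numpy sketch with a made-up indicator series at one location:

```python
import numpy as np

# Hypothetical daily exceedance indicator (1 = threshold condition met)
exceedance = np.array([0, 1, 1, 1, 0, 0, 1, 1, 0, 1])

# Find run boundaries by padding with zeros and differencing
padded = np.concatenate([[0], exceedance, [0]])
starts = np.where(np.diff(padded) == 1)[0]
ends = np.where(np.diff(padded) == -1)[0]

spell_lengths = ends - starts
print(spell_lengths)  # -> [3 2 1]

# Keep only spells of at least a given length, analogous to minimum_length = 2
long_spells = spell_lengths[spell_lengths >= 2]
print(long_spells)  # -> [3 2]
```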
calculate_spatial_extent(**climate_data)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

The spatial extent is defined as the percentage of the area where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type), given that it is exceeded in at least one location. The output dataframe has three columns: ‘Correction Method’ - obs/raw or the name of the debiaser, ‘Metric’ - the name of the threshold metric, ‘Spatial extent (% of area)’.

Parameters:
**climate_data

Keyword arguments providing the input data to investigate. Should be np.ndarrays of observations or, if the threshold is time sensitive (threshold_scope in ["day", "month", "season"]), lists of the form [cm_data, time_cm_data], where time_cm_data is a 1d numpy array of times corresponding to the values in cm_data.

Returns:
pd.DataFrame

Dataframe of spatial extents of metric occurrences.

Examples

>>> dry_days.calculate_spatial_extent(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
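
The spatial extent at each timestep can be sketched as the fraction of grid cells where the condition holds, restricted to timesteps with at least one occurrence (the indicator array below is invented):

```python
import numpy as np

# Hypothetical exceedance indicator [time, lat, lon]
exceedance = np.array([
    [[1, 0], [0, 0]],  # 1 of 4 cells
    [[0, 0], [0, 0]],  # no occurrence: excluded
    [[1, 1], [1, 0]],  # 3 of 4 cells
])

per_timestep = exceedance.reshape(exceedance.shape[0], -1).mean(axis=1)
# % of area, given the condition holds in at least one location
spatial_extent = per_timestep[per_timestep > 0] * 100

print(spatial_extent)  # -> [25. 75.]
```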
calculate_spatiotemporal_clusters(**climate_data)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

A spatiotemporal cluster is defined as a connected set (in time and/or space) where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type). The output dataframe has three columns: ‘Correction Method’ - obs/raw or the name of the debiaser, ‘Metric’ - the name of the threshold metric, ‘Spatiotemporal cluster size’.

Parameters:
climate_data

Keyword arguments providing the input data to investigate. Should be np.ndarrays of observations or, if the threshold is time sensitive (threshold_scope in ["day", "month", "season"]), lists of the form [cm_data, time_cm_data], where time_cm_data is a 1d numpy array of times corresponding to the values in cm_data.

Returns:
pd.DataFrame

Dataframe of sizes of individual spatiotemporal clusters of metric occurrences.

Examples

>>> dry_days.calculate_spatiotemporal_clusters(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)

ibicus.evaluate.metrics.AccumulativeThresholdMetric

class ibicus.evaluate.metrics.AccumulativeThresholdMetric(threshold_value, threshold_type, threshold_scope='overall', threshold_locality='global', name='unknown', variable='unknown')

Class for climate metrics that are defined by thresholds (child class of ThresholdMetric) but are accumulative. This mainly concerns precipitation metrics.

An example of such a metric is “total precipitation by very wet days (days > 10mm precipitation)”.

Examples

>>> R10mm = AccumulativeThresholdMetric(name="Very wet days (> 10 mm/day)", variable="pr", threshold_value=10 / 86400, threshold_type="higher")

Methods

calculate_annual_value_beyond_threshold(...)

Calculates amount beyond threshold for each year in the dataset.

calculate_intensity_index(dataset[, time])

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.

calculate_percent_of_total_amount_beyond_threshold(dataset)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

from_quantile(x, q, threshold_type[, ...])

Creates a threshold metric from a quantile of an array x.

calculate_percent_of_total_amount_beyond_threshold(dataset, time=None)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray = None

Time corresponding to each observation in dataset; required only for time-sensitive thresholds (threshold_scope in ["day", "month", "season"]).

Returns:
np.ndarray

2d array with percentage of total amount above threshold at each location.

calculate_annual_value_beyond_threshold(dataset, time)

Calculates amount beyond threshold for each year in the dataset.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray

Time corresponding to each observation in dataset; required to calculate annual threshold occurrences.

Returns:
np.ndarray

3d array - [years, lat, long]

calculate_intensity_index(dataset, time=None)

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.

Designed to calculate the simple precipitation intensity index but can be used for other variables.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed; numeric entries expected.

time : np.ndarray = None

Time corresponding to each observation in dataset; required only for time-sensitive thresholds (threshold_scope in ["day", "month", "season"]).

classmethod from_quantile(x, q, threshold_type, threshold_scope='overall', threshold_locality='global', time=None, name='unknown', variable='unknown')

Creates a threshold metric from a quantile of an array x.

Parameters:
x : np.ndarray

Array with respect to which the quantile is calculated.

q : Union[int, float, list]

Quantile (or list of lower and upper quantiles if threshold_type in ["between", "outside"]) at which the threshold is instantiated.

threshold_type : str

One of ["higher", "lower", "between", "outside"]. Indicates whether we are interested in values above the threshold value (“higher”, strict >), values below the threshold value (“lower”, strict <), values between the threshold values (“between”, strict, not including the bounds) or outside the threshold values (“outside”, strict, not including the bounds).

threshold_scope : str = "overall"

One of ["day", "month", "season", "overall"]. Indicates whether thresholds (and the quantiles calculated) are defined irrespective of time or on a daily, monthly or seasonal basis.

threshold_locality : str = "global"

One of ["global", "local"]. Indicates whether thresholds (and the quantiles calculated) are defined globally or location-wise.

time: Optional[np.ndarray] = None

If the threshold is time-sensitive (threshold_scope in [“day”, “month”, “season”]) then time information corresponding to x is required. Should be a numpy 1d array of times.

name : str = "unknown"

Metric name. Will be used in dataframes, plots etc. It is recommended to include the threshold value and units. Example: ‘Frost days (tasmin < 0°C)’. Default: "unknown".

variable : str = "unknown"

Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: "unknown".

Examples

>>> m1 = ThresholdMetric.from_quantile(obs, 0.8, threshold_type = "higher", name = "m1")
>>> m2 = ThresholdMetric.from_quantile(obs, 0.2, threshold_type = "lower", threshold_scope = "season", threshold_locality = "local", time = time_obs, name = "m2")
>>> m3 = ThresholdMetric.from_quantile(obs, [0.2, 0.8], threshold_type = "outside", threshold_scope="month", threshold_locality = "local", time=time_obs, name = "m3")

Concrete metrics

ibicus.evaluate.metrics.dry_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='lower', threshold_scope='overall', threshold_locality='global', name='Dry days \n (< 1 mm/day)', variable='pr')

Dry days (< 1 mm/day) for pr.

ibicus.evaluate.metrics.wet_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Wet days \n (> 1 mm/day)', variable='pr')

Wet days (> 1 mm/day) for pr.

ibicus.evaluate.metrics.R10mm = AccumulativeThresholdMetric(threshold_value=0.00011574074074074075, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Very wet days \n (> 10 mm/day)', variable='pr')

Very wet days (> 10 mm/day) for pr.

ibicus.evaluate.metrics.R20mm = AccumulativeThresholdMetric(threshold_value=0.0002314814814814815, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Extremely wet days \n (> 20 mm/day)', variable='pr')

Extremely wet days (> 20 mm/day) for pr.

ibicus.evaluate.metrics.warm_days = ThresholdMetric(threshold_value=295, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Mean warm days (K)', variable='tas')

Warm days (>295K) for tas.

ibicus.evaluate.metrics.cold_days = ThresholdMetric(threshold_value=275, threshold_type='lower', threshold_scope='overall', threshold_locality='global', name='Mean cold days (K)', variable='tas')

Cold days (<275K) for tas.

ibicus.evaluate.metrics.frost_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', threshold_scope='overall', threshold_locality='global', name='Frost days \n  (tasmin<0°C)', variable='tasmin')

Frost days (<0°C) for tasmin.

ibicus.evaluate.metrics.tropical_nights = ThresholdMetric(threshold_value=293.13, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Tropical Nights \n (tasmin>20°C)', variable='tasmin')

Tropical Nights (>20°C) for tasmin.

ibicus.evaluate.metrics.summer_days = ThresholdMetric(threshold_value=298.15, threshold_type='higher', threshold_scope='overall', threshold_locality='global', name='Summer days \n  (tasmax>25°C)', variable='tasmax')

Summer days (>25°C) for tasmax.

ibicus.evaluate.metrics.icing_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', threshold_scope='overall', threshold_locality='global', name='Icing days \n (tasmax<0°C)', variable='tasmax')

Icing days (<0°C) for tasmax.
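The threshold values above are given in the SI units used by CMIP models, e.g. precipitation in kg m-2 s-1, so 1 mm/day corresponds to 1/86400 ≈ 1.1574e-05. The following is a minimal numpy sketch, not the ibicus implementation, of what a 'higher'-type threshold metric counts at each location (the synthetic data and array shapes are illustrative assumptions):

```python
import numpy as np

# 1 mm/day converted to kg m-2 s-1, matching wet_days' threshold_value.
wet_day_threshold = 1.0 / 86400.0

# Synthetic precipitation data with shape [time, lat, lon] (assumption).
rng = np.random.default_rng(0)
pr = rng.exponential(scale=2e-05, size=(365, 2, 2))

# threshold_type='higher': count days on which the value exceeds the threshold.
wet_day_mask = pr > wet_day_threshold
wet_days_per_location = wet_day_mask.sum(axis=0)

print(wet_days_per_location.shape)  # one count per (lat, lon) location
```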


ibicus.evaluate.marginal

Marginal module - Calculate and plot the location-wise bias of the climate model before and after applying different bias adjustment methods. Provides the possibility to calculate either the absolute or percentage bias at each location, for both statistical properties of the variable and specified threshold metrics, and to plot either a boxplot across locations or a spatial heatmap.

ibicus.evaluate.marginal.calculate_marginal_bias(obs, statistics=['mean', 0.05, 0.95], metrics=[], percentage_or_absolute='percentage', **cm_data)

Returns a pd.DataFrame containing the location-wise percentage bias of different metrics: the mean, 5th and 95th percentile, as well as the metrics specified in metrics, comparing observations to climate model output during a validation period.

The output dataframe contains four columns: ‘Correction Method’ (str), corresponding to the cm_data keys; ‘Metric’, the name of the threshold metric or statistic calculated; ‘Type’, specifying whether the absolute or percentage bias is calculated; and ‘Bias’, containing a np.ndarray which in turn contains the output values at each location.

The output can be plotted using plot_marginal_bias() or plot_bias_spatial(), or analyzed further manually.

Parameters:
obsnp.ndarray

Observational dataset in validation period. If one of the metrics is time sensitive (defined daily, monthly or seasonally) this needs to be a list of the form [obs_data, time_obs_data] where time_obs_data is a 1d numpy array of times corresponding to the values in obs_data.

statisticslist

List containing float values as well as “mean” specifying for which distributional aspects the bias shall be calculated. Default: ["mean", 0.05, 0.95].

metricslist

List of ThresholdMetric metrics whose bias shall be calculated. Example: metrics = [ibicus.metrics.dry_days, ibicus.metrics.wet_days].

percentage_or_absolutestr

Specifies whether, for the threshold metrics, the percentage bias (p(cm)-p(obs))/p(obs) or the absolute bias is computed. For threshold metrics, the absolute bias is the difference in the mean number of days per year on which the metric threshold is exceeded; for statistics it is the absolute difference in [physical units] of the two statistics.

**cm_data

Keyword arguments of type debiaser_name = debiased_dataset in validation period (example: QM = tas_val_debiased_QM), covering all debiasers that are to be compared. If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) this needs to be a list of the form [cm_data, time_cm_data] where time_cm_data is a 1d numpy array of times corresponding to the values in cm_data.

Returns:
pd.DataFrame

DataFrame with marginal bias at all locations, for all metrics specified.

Examples

>>> tas_marginal_bias_df = marginal.calculate_marginal_bias(obs = tas_obs_validate, metrics = tas_metrics, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
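To make the percentage_or_absolute option concrete, here is a minimal numpy sketch (not the ibicus implementation; data and shapes are synthetic assumptions) of the location-wise percentage bias of the mean, the quantity the plotting functions below display:

```python
import numpy as np

# Synthetic tas-like data with shape [time, lat, lon] (assumption).
rng = np.random.default_rng(1)
obs = rng.normal(loc=285.0, scale=5.0, size=(1000, 4, 4))
cm = rng.normal(loc=287.0, scale=5.0, size=(1000, 4, 4))

# Location-wise statistics over the time axis.
mean_obs = obs.mean(axis=0)
mean_cm = cm.mean(axis=0)

# Percentage bias: 100 * (cm - obs) / obs; absolute bias in physical units (here K).
percentage_bias = 100.0 * (mean_cm - mean_obs) / mean_obs
absolute_bias = mean_cm - mean_obs

print(percentage_bias.shape)  # one bias value per location
```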
ibicus.evaluate.marginal.plot_marginal_bias(variable, bias_df, statistics=['Mean', '0.95 qn', '0.05 qn'], manual_title=' ', remove_outliers=False, outlier_threshold_statistics=100, outlier_threshold_metrics=100, color_palette='tab10', metrics_title=' ', statistics_title=' ')

Returns boxplots showing distribution of the percentage bias over locations of different metrics, based on calculation performed in calculate_marginal_bias().

Two boxplots are created: one for the descriptive statistics and one for threshold metrics present in the bias_df dataframe.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

bias_dfpd.DataFrame

pd.DataFrame containing percentage bias for descriptive statistics and specified metrics. Output of calculate_marginal_bias().

statisticslist

List of strings specifying summary statistics computed on the data. Strings have to be equal to entry in the ‘Metric’ column of bias_df.

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.

remove_outliersbool

If set to True, values above the threshold specified through the next argument are removed

outlier_threshold_statisticsint,

Threshold above which to remove values from the plot for bias statistics (mean, quantiles)

outlier_threshold_metricsint

Threshold above which to remove values from the plot for bias in metrics (such as dry days, hot days, etc)

color_palettestr

Seaborn color palette to use for the boxplot.

Examples

>>> tas_marginal_bias_plot = marginal.plot_marginal_bias(variable = 'tas', bias_df = tas_marginal_bias)
ibicus.evaluate.marginal.plot_bias_spatial(variable, metric, bias_df, remove_outliers=False, outlier_threshold=100, manual_title=' ')

Returns spatial plots of bias at each location with respect to one specified metric, based on calculation performed in calculate_marginal_bias().

Parameters:
variable: str

Variable name, has to be given in standard form following CMIP convention.

metric: str

Specifies the metric analysed. Has to exactly match the name of this metric in the bias_df DataFrame.

bias_df: pd.DataFrame

pd.DataFrame containing percentage bias for descriptive statistics and specified metrics. Output of calculate_marginal_bias().

remove_outliers: bool

If set to True, values above the threshold specified through the next argument are removed.

outlier_threshold: int,

Threshold above which to remove values from the plot.

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.

Examples

>>> tas_marginal_bias_plot_mean = marginal.plot_bias_spatial(variable = 'tas', metric = 'Mean', bias_df = tas_marginal_bias)
ibicus.evaluate.marginal.calculate_bias_days_metrics(obs_data, metrics=[], **cm_data)

Returns a pd.DataFrame containing location-wise mean number of yearly threshold exceedances.

The output dataframe contains five columns: ‘Correction Method’ (str), corresponding to the cm_data keys; ‘Metric’, the name of the metric assessed; ‘CM’, containing the mean number of days of threshold exceedance in the climate model; ‘Obs’, containing the mean number of days of threshold exceedance in the observations; and ‘Bias’, containing the difference (CM-Obs) between the mean number of threshold exceedance days in the climate model and the observations.

Parameters:
obs_datanp.ndarray

List containing the observational dataset in the validation period and corresponding time information: [obs_data, time_obs_data]. Here time_obs_data is a 1d numpy array of times corresponding to the values in obs_data.

metricslist

Array of strings containing the names of the metrics that are to be assessed.

**cm_data

Keyword arguments of type debiaser_name = [cm_data, time_cm_data] covering all debiasers to be compared. Here time_cm_data is a 1d numpy array of times corresponding to the values in cm_data, and cm_data refers to a debiased dataset in a validation period. Example: QM = [tas_val_debiased_QM, time_val].

Returns:
pd.DataFrame

DataFrame with marginal bias at all locations, for all metrics specified.

Examples

>>> tas_bias_days_df = marginal.calculate_bias_days_metrics(obs_data = tas_obs_validate, metrics = tas_metrics, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
ibicus.evaluate.marginal.plot_spatiotemporal(data=[], column_names=['Spell length (days)', 'Spatiotemporal cluster size', 'Spatial extent (% of area)'], xlims=[30, 30, 1])

Plots empirical CDFs of spatiotemporal clustersizes over entire area.

Parameters:
datalist = []

List of dataframes; output of the type produced by metrics.calculate_spell_length, metrics.calculate_spatial_extent and metrics.calculate_spatiotemporal_clusters is expected as elements of the list.

column_nameslist = [“Spell length (days)”, “Spatiotemporal cluster size”, “Spatial extent (% of area)”,]

Names of the columns containing spatiotemporal cluster sizes corresponding to the dataframes given to the argument data.

xlimslist = [30, 30, 1]

xlim for each of the plots, corresponding to the dataframes given to the argument data.

Examples

>>> spatiotemporal_figure = marginal.plot_spatiotemporal(data = [spelllength_dry, spatiotemporal_dry, spatial_dry])
ibicus.evaluate.marginal.plot_histogram(variable, data_obs, bin_number=100, manual_title=' ', **cm_data)

Plots histogram over entire area or at single location. Expects a one-dimensional array as input.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

data_obsnp.ndarray

1d-array - either observational data specified at one location, or flattened array of all observed values over the area. Numeric values expected.

bin_numberint

Number of bins plotted in histogram, set to 100 by default

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.

Examples

>>> histogram = plot_histogram(variable='tas', data_obs=tas_obs_validate[:, 0,0], raw = tas_cm_validate[:, 0,0],  ISIMIP = tas_val_debiased_ISIMIP[:, 0,0], CDFt = tas_val_debiased_CDFT[:, 0,0])

ibicus.evaluate.multivariate

Multivariate module - calculate and plot conditional threshold exceedances, and analyse and plot the correlation between two variables at each location before and after bias adjustment to check for changes in the multivariate structure.

ibicus.evaluate.multivariate.calculate_conditional_joint_threshold_exceedance(metric1, metric2, **climate_data)

Returns a pd.DataFrame containing location-wise conditional exceedance probability. Calculates:

\[p (\text{Metric1} | \text{Metric2}) = p (\text{Metric1} , \text{Metric2}) / p(\text{Metric2})\]

Output is a pd.DataFrame with 3 columns:

  • Correction Method: Type of climate data - obs, raw, bias_correction_name. Given through keys of climate_data.

  • Compound metric: str reading Metric1.name given Metric2.name.

  • Conditional exceedance probability: 2d numpy array with conditional exceedance probability at each location.

Parameters:
metric1ThresholdMetric

Metric 1 whose exceedance conditional on metric 2 shall be assessed.

metric2ThresholdMetric

Metric 2 on which metric 1 is conditioned.

**climate_data

Keyword arguments of type key = [variable1_debiased_dataset, variable2_debiased_dataset]. Here the exceedance of metric 1 is calculated on the first dataset (variable1_debiased_dataset) and the exceedance of metric 2 on the second one (variable2_debiased_dataset) to calculate the conditional exceedance. Example: obs = [pr_obs_validate, tasmin_obs_validate] or ISIMIP = [pr_val_debiased_ISIMIP, tasmin_val_debiased_ISIMIP].

If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) a third list element needs to be passed: a 1d np.ndarray containing the time information corresponding to the entries in the variable1 and variable2 datasets. Example: obs = [pr_obs_validate, tasmin_obs_validate, time_obs_validate].

Warning

Datasets for variable1 and variable2 need to cover the same time period and their entries need to correspond to the same dates.

Returns:
pd.DataFrame

DataFrame with conditional exceedance probability at all locations for the combination of metrics chosen.

Examples

>>> dry_frost_data = calculate_conditional_joint_threshold_exceedance(metric1 = dry_days, metric2 = frost_days, obs = [pr_obs_validate, tasmin_obs_validate], raw = [pr_cm_validate, tasmin_cm_validate], ISIMIP = [pr_val_debiased_ISIMIP, tasmin_val_debiased_ISIMIP])
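The conditional probability formula above can be sketched in plain numpy for a single location (this is an illustrative sketch with synthetic data, not the ibicus implementation):

```python
import numpy as np

# Synthetic daily data at one location (assumption): precipitation and tasmin.
rng = np.random.default_rng(2)
pr = rng.exponential(scale=2e-05, size=5000)
tasmin = rng.normal(loc=278.0, scale=6.0, size=5000)

dry = pr < 1.0 / 86400.0   # metric 1: dry days (< 1 mm/day)
frost = tasmin < 273.13    # metric 2: frost days

# p(Metric1 | Metric2) = p(Metric1, Metric2) / p(Metric2)
p_joint = np.mean(dry & frost)
p_frost = np.mean(frost)
p_dry_given_frost = p_joint / p_frost

print(round(p_dry_given_frost, 3))
```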
ibicus.evaluate.multivariate.plot_conditional_joint_threshold_exceedance(conditional_exceedance_df)

Accepts output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.

Parameters:
conditional_exceedance_df: pd.DataFrame

Output of calculate_conditional_joint_threshold_exceedance(): pd.DataFrame containing location-wise conditional exceedance probability.

ibicus.evaluate.multivariate.plot_conditional_probability_spatial(bias_df, remove_outliers=False, outlier_threshold=100, plot_title=' ')

Spatial plot of bias at each location with respect to one specified metric.

Parameters:
bias_df: pd.DataFrame

pd.DataFrame containing conditional exceedance probability, expects output of calculate_conditional_joint_threshold_exceedance().

remove_outliers: bool

If set to True, values above the threshold specified through the next argument are removed.

outlier_threshold: int,

Threshold above which to remove values from the plot.

plot_titlestr

No default plot title is set within the function; plot_title will be used as the title of the plot.

Examples

>>> warm_wet_vis = plot_conditional_probability_spatial(bias_df=warm_wet, plot_title ="Conditional probability of warm days (>20°C) given wet days (>1mm)")
ibicus.evaluate.multivariate.calculate_and_spatialplot_multivariate_correlation(variables, manual_title=' ', **kwargs)

Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs spatial plot.

Parameters:
variableslist

List of two variable names, has to be given in standard form specified in documentation.

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.

kwargs

Keyword arguments specifying a list of two np.ndarrays containing the two variables of interest.

Examples

>>> multivariate.calculate_and_spatialplot_multivariate_correlation(variables = ['tas', 'pr'], obs = [tas_obs_validate, pr_obs_validate], raw = [tas_cm_validate, pr_cm_validate], ISIMIP = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP])
ibicus.evaluate.multivariate.create_multivariate_dataframes(variables, datasets_obs, datasets_bc, gridpoint=(0, 0))

Helper function creating a joint pd.DataFrame of the two variables specified, for an observational dataset as well as one bias corrected dataset, at one gridpoint.

Parameters:
variableslist

List of two variable names, has to be given in standard form following CMIP convention

datasets_obslist

List of two observational datasets during same period for the two variables.

datasets_bclist

List of two bias corrected datasets during same period for the two variables.

gridpointtuple

Tuple specifying the location from which data will be extracted.

Examples

>>> tas_pr_obs, tas_pr_isimip = create_multivariate_dataframes(variables = ['tas', 'pr'], datasets_obs = [tas_obs_validate, pr_obs_validate], datasets_bc = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP], gridpoint = (1,1))
ibicus.evaluate.multivariate.plot_correlation_single_location(variables, obs_df, bc_df)

Uses seaborn.regplot and output of create_multivariate_dataframes() to plot scatterplot and Pearson correlation estimate of the two specified variables. Offers visual comparison of correlation at single location.

Parameters:
variableslist

List of two variable names, has to be given in standard form following CMIP convention.

obs_dfpd.DataFrame

First element of the output of create_multivariate_dataframes().

bc_dfpd.DataFrame

Second element of the output of create_multivariate_dataframes().

Examples

>>> plot_correlation_single_location(variables = ['tas', 'pr'], obs_df = tas_pr_obs, bc_df = tas_pr_isimip)
ibicus.evaluate.multivariate.plot_bootstrap_correlation_replicates(obs_df, bc_df, bc_name, size)

Plots histograms of correlation between variables in input dataframes estimated via bootstrap using _calculate_bootstrap_correlation_replicates().

Parameters:
obs_dfpd.DataFrame

First element of the output of create_multivariate_dataframes().

bc_dfpd.DataFrame

Second element of the output of create_multivariate_dataframes().

bc_name: str

Name of bias correction method

size: int

Number of draws in bootstrapping procedure

Examples

>>> plot_bootstrap_correlation_replicates(obs_df = tas_pr_obs, bc_df = tas_pr_isimip, bc_name = 'ISIMIP', size=500)
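The bootstrap behind these histograms can be sketched in plain numpy (an illustrative sketch with synthetic data; not the private ibicus helper itself): resample time steps with replacement and recompute the Pearson correlation on each replicate.

```python
import numpy as np

# Synthetic correlated pair of variables at one location (assumption).
rng = np.random.default_rng(3)
n = 2000
tas = rng.normal(size=n)
pr = 0.3 * tas + rng.normal(size=n)

def bootstrap_correlation_replicates(x, y, size, rng):
    """Resample (x, y) pairs with replacement; return Pearson correlations."""
    replicates = np.empty(size)
    for i in range(size):
        idx = rng.integers(0, len(x), len(x))  # bootstrap indices
        replicates[i] = np.corrcoef(x[idx], y[idx])[0, 1]
    return replicates

reps = bootstrap_correlation_replicates(tas, pr, size=500, rng=rng)
print(reps.mean())  # distribution of correlation estimates, e.g. for a histogram
```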

ibicus.evaluate.correlation

Correlation module - Calculate and plot the RMSE between spatial correlation matrices at each location.

ibicus.evaluate.correlation.rmse_spatial_correlation_distribution(variable, obs_data, **cm_data)

Calculates Root-Mean-Squared-Error between observed and modelled spatial correlation matrix at each location.

The computation involves the following steps: at each location, calculate the correlation to each other location in both the observed and the climate model dataset. Then calculate the root-mean-squared error between these two correlation matrices.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

obs_datanp.ndarray

Observational dataset in validation period.

cm_data

Keyword arguments specifying climate model datasets, for example: QM = tas_debiased_QM

Examples

>>> tas_rmsd_spatial = rmse_spatial_correlation_distribution(variable = 'tas', obs_data = tas_obs_validate, raw = tas_cm_future, QDM = tas_val_debiased_QDM)
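The steps above can be sketched in plain numpy (an illustrative sketch with synthetic data and shapes, not the ibicus implementation): build one correlation matrix per dataset over all locations, then take the RMSE between corresponding rows.

```python
import numpy as np

# Synthetic datasets with shape [time, lat, lon] (assumption).
rng = np.random.default_rng(4)
obs = rng.normal(size=(500, 3, 3))
cm = rng.normal(size=(500, 3, 3))

# Flatten space so each column is one location's time series.
obs_flat = obs.reshape(obs.shape[0], -1)
cm_flat = cm.reshape(cm.shape[0], -1)

# Location-by-location correlation matrices (9 x 9 for a 3 x 3 grid).
corr_obs = np.corrcoef(obs_flat, rowvar=False)
corr_cm = np.corrcoef(cm_flat, rowvar=False)

# RMSE at each location between its row of the observed and modelled matrices.
rmse = np.sqrt(np.mean((corr_obs - corr_cm) ** 2, axis=1)).reshape(3, 3)
print(rmse.shape)
```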
ibicus.evaluate.correlation.rmse_spatial_correlation_boxplot(variable, dataset, manual_title=' ')

Boxplot of RMSE of spatial correlation across locations.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

datasetpd.DataFrame

Output of the function rmse_spatial_correlation_distribution().

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.


ibicus.evaluate.trend

Trend module - Calculate and plot changes to the climate model trend through application of different bias adjustment methods. Changes in the trend in both statistical properties (mean, quantiles), as well as threshold metrics (ThresholdMetric) can be calculated. Trends are defined here between a validation and future period.

ibicus.evaluate.trend.calculate_future_trend_bias(raw_validate, raw_future, statistics=['mean', 0.05, 0.95], trend_type='additive', metrics=[], time_validate=None, time_future=None, **debiased_cms)

For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default) as well as metrics of class ThresholdMetric (from ibicus.evaluate.metrics) passed as arguments to the function.

The trend can be specified as either additive or multiplicative by setting trend_type (default: additive).

The function returns a pd.DataFrame with three columns: [Correction Method: str, Metric: str, Bias: 2d np.ndarray containing the trend bias at each location].

Parameters:
raw_validatenp.ndarray

Raw climate data set in validation period.

raw_future: np.ndarray

Raw climate data set in future period.

statisticslist

List containing float values as well as “mean” specifying for which distributional aspects the trend bias shall be calculated.

trend_typestr

Determines the type of trend that is analysed. Has to be one of [‘additive’, ‘multiplicative’].

metricslist

List of ThresholdMetric metrics whose trend bias shall be calculated. Example: metrics = [ibicus.metrics.dry_days, ibicus.metrics.wet_days].

time_validatenp.ndarray

If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) time information needs to be passed to calculate it. This is a 1d numpy array of times to which the values in raw_validate and the first entry of each debiased_cms keyword argument correspond.

time_futurenp.ndarray

If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) time information needs to be passed to calculate it. This is a 1d numpy array of times to which the values in raw_future and the second entry of each debiased_cms keyword argument correspond.

debiased_cmsnp.ndarray

Keyword arguments given in format debiaser_name = [debiased_dataset_validation_period, debiased_dataset_future_period] specifying the climate models to be analysed for trends in biases. Example: QM = [tas_val_debiased_QM, tas_future_debiased_QM].

Examples

>>> tas_trend_bias_data = trend.calculate_future_trend_bias(raw_validate = tas_cm_validate, raw_future = tas_cm_future, metrics = [ibicus.metrics.warm_days, ibicus.metrics.cold_days], trend_type = "additive", QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM], CDFT = [tas_val_debiased_CDFT, tas_fut_debiased_CDFT])
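The additive and multiplicative trend definitions can be sketched in plain numpy (an illustrative sketch with synthetic data, not the ibicus implementation): a trend is the change in a statistic between the validation and future period, and the trend bias is how the bias-corrected trend deviates from the raw model trend.

```python
import numpy as np

# Synthetic tas-like data with shape [time, lat, lon] (assumption).
rng = np.random.default_rng(5)
shape = (1000, 4, 4)
raw_validate = rng.normal(285.0, 5.0, shape)
raw_future = rng.normal(288.0, 5.0, shape)
bc_validate = rng.normal(284.5, 5.0, shape)
bc_future = rng.normal(287.2, 5.0, shape)

def trend(validate, future, trend_type="additive"):
    """Location-wise trend in the mean between validation and future period."""
    if trend_type == "additive":
        return future.mean(axis=0) - validate.mean(axis=0)
    return future.mean(axis=0) / validate.mean(axis=0)  # multiplicative

trend_raw = trend(raw_validate, raw_future)
trend_bc = trend(bc_validate, bc_future)
trend_bias = trend_bc - trend_raw  # deviation of the corrected trend

print(trend_bias.shape)
```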
ibicus.evaluate.trend.calculate_future_trend(statistics=['mean', 0.05, 0.95], trend_type='additive', metrics=[], time_validate=None, time_future=None, **debiased_cms)

For each location, calculates the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default) as well as metrics of class ThresholdMetric (from ibicus.evaluate.metrics) passed as arguments to the function.

The trend can be specified as either additive or multiplicative by setting trend_type (default: additive).

The function returns a pd.DataFrame with three columns: [Correction Method: str, Metric: str, Bias: 2d np.ndarray containing the calculated trend at each location].

Parameters:
statisticslist

List containing float values as well as “mean” specifying for which distributional aspects the trend bias shall be calculated.

trend_typestr

Determines the type of trend that is analysed. Has to be one of [‘additive’, ‘multiplicative’].

metricslist

List of ThresholdMetric metrics whose trend shall be calculated. Example: metrics = [ibicus.metrics.dry_days, ibicus.metrics.wet_days].

time_validatenp.ndarray

If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) time information needs to be passed to calculate it. This is a 1d numpy array of times to which the first entry of each debiased_cms keyword argument corresponds.

time_futurenp.ndarray

If one of the metrics is time sensitive (defined daily, monthly, seasonally: metric.threshold_scope = ['day', 'month', 'year']) time information needs to be passed to calculate it. This is a 1d numpy array of times to which the second entry of each debiased_cms keyword argument corresponds.

debiased_cmsnp.ndarray

Keyword arguments given in format debiaser_name = [debiased_dataset_validation_period, debiased_dataset_future_period] specifying the climate models to be analysed for trends in biases. Example: QM = [tas_val_debiased_QM, tas_future_debiased_QM].

Examples

>>> tas_trend_data = trend.calculate_future_trend(metrics = [ibicus.metrics.warm_days, ibicus.metrics.cold_days], trend_type = "additive", QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM], CDFT = [tas_val_debiased_CDFT, tas_fut_debiased_CDFT])
ibicus.evaluate.trend.plot_future_trend_bias_boxplot(variable, bias_df, manual_title=' ', remove_outliers=False, outlier_threshold=100, color_palette='tab10')

Accepts output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of different metrics.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

bias_dfpd.DataFrame

DataFrame with three columns: [Correction Method, Metric, Bias values at each location]. Output from calculate_future_trend_bias().

manual_titlestr

Manual title to replace the automatically generated one.

remove_outliersbool

If set to True, values above the threshold specified through the next argument are removed

outlier_threshold: int,

Threshold above which to remove values from the plot

color_palettestr

Seaborn color palette to use for the boxplot.

ibicus.evaluate.trend.plot_future_trend_bias_spatial(variable, metric, bias_df, manual_title=' ', remove_outliers=False, outlier_threshold=100)

Accepts output given by calculate_future_trend_bias() and creates a spatial plot of the trend bias for one chosen metric.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

metricstr

Metric in bias_df to plot.

bias_df: pd.DataFrame

Dataframe with three columns: [Bias correction method, Metric, Bias value at certain location]. Output from calculate_future_trend_bias().

manual_titlestr

Optional argument present in all plot functions: manual_title will be used as title of the plot.


ibicus.evaluate.assumptions

Assumptions module - test assumptions of bias adjustment methods. Currently allows the user to fit different distributions to the data, calculate and plot the Akaike Information Criterion (AIC) to compare distributions, and plot the timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals at one location.

ibicus.evaluate.assumptions.calculate_aic(variable, dataset, *distributions)

Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

Warning

*distributions can currently only contain scipy.stats.rv_continuous objects; StatisticalModel, as accepted elsewhere, is not yet supported.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

datasetnp.ndarray

Input data, either observations or climate projections dataset to be analysed, numeric entries expected.

*distributionslist[scipy.stats.rv_continuous]

Distributions to be tested, elements are scipy.stats.rv_continuous

Returns:
pd.DataFrame

DataFrame with all locations, distributions and associated AIC values.
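For one location and one candidate distribution, the AIC computed here follows the standard definition AIC = 2k - 2 ln(L_max). A minimal numpy sketch for a normal fit (k = 2 parameters; synthetic data, not the ibicus implementation, which fits arbitrary scipy.stats.rv_continuous distributions):

```python
import numpy as np

# Synthetic time series at one location (assumption).
rng = np.random.default_rng(6)
data = rng.normal(loc=285.0, scale=5.0, size=2000)

# For a normal distribution fitted by maximum likelihood, the maximised
# log-likelihood has a closed form in terms of the MLE variance (ddof=0).
n = data.size
sigma2_mle = data.var()
log_likelihood = -0.5 * n * (np.log(2.0 * np.pi * sigma2_mle) + 1.0)

# AIC = 2k - 2 ln(L_max), with k = 2 (mean and variance).
aic = 2 * 2 - 2 * log_likelihood

print(np.isfinite(aic))
```

Lower AIC indicates a better trade-off between fit and number of parameters when comparing candidate distributions.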

ibicus.evaluate.assumptions.plot_aic(variable, aic_values, manual_title=' ')

Creates a boxplot of AIC values across all locations.

Parameters:
variablestr

Variable name, has to be given in standard form specified in documentation.

aic_valuespd.DataFrame

Pandas dataframe of the type output by calculate_aic().

ibicus.evaluate.assumptions.plot_fit_worst_aic(variable, dataset, data_type, distribution, nr_bins='auto', aic_values=None, manual_title=' ')

Plots a histogram and overlaid fit at the location with the worst AIC.

Warning

distribution can currently only be a scipy.stats.rv_continuous object; StatisticalModel, as accepted elsewhere, is not yet supported.

Parameters:
variablestr

Variable name, has to be given in standard CMIP convention

datasetnp.ndarray

3d-input data [time, lat, long], numeric entries expected. Either observations or climate projections dataset to be analysed.

data_typestr

Data type analysed - can be observational data or raw / debiased climate model data. Used to generate title only.

distributionscipy.stats.rv_continuous

Distribution providing fit to the data

nr_binsUnion[int, str] = “auto”

Number of bins used for the histogram. Either int or “auto” (default).

aic_valuesOptional[pd.DataFrame] = None

Pandas dataframe of the type output by calculate_aic(). If None, the AIC values are recalculated.

manual_title: str = “ “

Optional argument present in all plot functions: manual_title will be used as title of the plot.

ibicus.evaluate.assumptions.plot_quantile_residuals(variable, dataset, distribution, data_type, manual_title=' ')

Plots timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals at one location.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

datasetnp.ndarray

1d numpy array. Input data, either observations or climate projections dataset at one location, numeric entries expected.

distribution: scipy.stats.rv_continuous

Distribution analysed; used for the title only.

data_type: str

Data type analysed - can be observational data or raw / debiased climate model data. Used to generate title only.

manual_title: str = “ “

Allows to set plot title manually.

Examples

>>> tas_obs_plot_gof = assumptions.plot_quantile_residuals(variable = 'tas', dataset = tas_obs[:,0,0], distribution = scipy.stats.norm, data_type = 'observation data')