ibicus.evaluate module

The evaluate module provides a set of functionalities to assess the performance of your bias correction method.

Bias correction is prone to misuse and requires careful evaluation, as demonstrated and argued in Maraun et al. 2017. In particular, the bias correction methods implemented in this package operate on a marginal level: they correct the distribution of individual variables at individual locations. There is therefore only a subset of climate model biases that these debiasers will be able to correct. Biases in the temporal or spatial structure of climate models, or in the feedbacks to large-scale weather patterns, might not be well corrected.

The evaluate module attempts to provide the user with the functionality to make an informed decision on whether a chosen bias correction method is fit for purpose: whether it corrects marginal as well as spatial and temporal statistical properties in the desired manner, how it modifies the multivariate structure, if and how it modifies the climate change trend, and how it changes the bias in selected climate impact metrics.

There are three components to the evaluation module:

1. Testing assumptions of different debiasers

Different debiasers rely on different assumptions - some are parametric, others non-parametric; some bias correct each day or month of the year separately, others are applied to all days of the year in the same way.

This component is meant to check some of these assumptions and, for example, help the user choose an appropriate function to fit the data to, an appropriate application window (entire year vs. each day or month individually), and rule out debiasers that are not fit for purpose in a specific application.

The current version of this component can analyse the following two questions:

- Is the fit of the default distribution 'good enough', or should a different distribution be used?
- Is there any seasonality in the data that should be accounted for, for example by applying a 'running window mode' (meaning that the bias correction is fitted separately for different parts of the year, i.e. windows)?

The following functions are currently available:

assumptions.calculate_aic(variable, dataset, ...)

Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

assumptions.plot_aic(variable, aic_values[, ...])

Creates a boxplot of AIC values across all locations.

assumptions.plot_fit_worst_aic(variable, ...)

Plots a histogram and overlaid fit at the location of worst AIC.

assumptions.plot_quantile_residuals(...[, ...])

Plots timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals at one location.
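
For example, a minimal workflow sketch - dataset names such as tas_obs_validate are placeholders for a 3d [time, lat, long] array of observations, and the candidate distributions are illustrative:

>>> import scipy.stats
>>> aic_values = assumptions.calculate_aic('tas', tas_obs_validate, scipy.stats.norm, scipy.stats.gamma)
>>> assumptions.plot_aic('tas', aic_values)
>>> assumptions.plot_fit_worst_aic('tas', tas_obs_validate, data_type = 'observation data', distribution = scipy.stats.norm, aic_values = aic_values)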

2. Evaluating the bias corrected model on a validation period

In order to assess the performance of a bias correction method, the bias corrected model data has to be compared to observational / reanalysis data. The historical period for which observations exist is therefore split into two datasets in pre-processing - a reference period and a validation period.

There are two types of analysis that the evaluation module enables you to conduct:

  1. Statistical properties: this includes the marginal bias of descriptive statistics such as the mean or the 5th and 95th percentiles, as well as differences in the spatial and multivariate correlation structure.

  2. Threshold metrics: A threshold metric is an instance of the ThresholdMetric class and needs to be one of four types: exceedance of a specified threshold value ('higher'), underceedance of the threshold value ('lower'), falling within two specified bounds ('between'), or falling outside two specified bounds ('outside'). With the functionalities provided as part of the ThresholdMetric class, the marginal exceedance probability as well as the temporal spell length, the spatial extent and the spatiotemporal cluster size can be analysed. Some threshold metrics are pre-specified, and the user can add further metrics in the following way:

>>> frost_days = ThresholdMetric(name="Frost days (tasmin<0°C)", variable="tasmin", threshold_value=273.13, threshold_type="lower")
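
A metric defined by two bounds works analogously - a sketch with illustrative bounds:

>>> mild_days = ThresholdMetric(name="Mild days (10°C<tas<20°C)", variable="tas", threshold_value=[283.15, 293.15], threshold_type="between")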

The following table provides an overview of the different components that can be analysed in each of these two categories:

                    Statistical properties    Threshold metrics
Marginal            x                         x
Temporal                                      x (spell length)
Spatial             x (RMSE)                  x (spatial extent)
Spatiotemporal                                x (cluster size)
Multivariate        x (correlation)           x (joint exceedance)

Within the ThresholdMetric class, the following functions are available:

metrics.ThresholdMetric.calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

metrics.ThresholdMetric.filter_threshold_exceedances(dataset)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

metrics.ThresholdMetric.calculate_exceedance_probability(dataset)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location (across the entire time period).

metrics.ThresholdMetric.calculate_number_annual_days_beyond_threshold(...)

Calculates number of days beyond threshold for each year in the dataset.

metrics.ThresholdMetric.calculate_spell_length(...)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

metrics.ThresholdMetric.calculate_spatial_extent(...)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

metrics.ThresholdMetric.calculate_spatiotemporal_clusters(...)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

metrics.ThresholdMetric.violinplots_clusters(...)

Returns three violinplots with distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.

AccumulativeThresholdMetric is a child class of ThresholdMetric that adds functionality for variables and metrics where the total accumulated amount beyond a given threshold is of interest - this is the case for precipitation, for example, but not for temperature. The following functions are added:

metrics.AccumulativeThresholdMetric.calculate_percent_of_total_amount_beyond_threshold(dataset)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

metrics.AccumulativeThresholdMetric.calculate_annual_value_beyond_threshold(...)

Calculates amount beyond threshold for each year in the dataset.

metrics.AccumulativeThresholdMetric.calculate_intensity_index(dataset)

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.
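
Applied to precipitation, this yields the simple precipitation intensity index; a usage sketch with the pre-defined wet_days metric and a placeholder dataset pr_obs_validate:

>>> sdii = wet_days.calculate_intensity_index(pr_obs_validate)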

Functions for the evaluation of marginal properties are provided in the ibicus.evaluate.marginal module.

The following functions are available to analyse the bias in spatial correlation structure:

correlation.rmse_spatial_correlation_distribution(...)

Calculates Root-Mean-Squared-Error between observed and modelled spatial correlation matrix at each location.

correlation.rmse_spatial_correlation_boxplot(...)

Boxplot of RMSE of spatial correlation across locations.
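
The two functions are designed to be chained; a sketch with placeholder dataset names:

>>> tas_rmsd_spatial = correlation.rmse_spatial_correlation_distribution(variable = 'tas', obs_data = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
>>> correlation.rmse_spatial_correlation_boxplot(variable = 'tas', dataset = tas_rmsd_spatial)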

To analyse the multivariate correlation structure, as well as joint threshold exceedances:

multivariate.calculate_conditional_joint_threshold_exceedance(...)

Returns a pd.DataFrame containing location-wise conditional exceedance probability.

multivariate.plot_conditional_joint_threshold_exceedance(...)

Accepts output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.

multivariate.calculate_and_spatialplot_multivariate_correlation(...)

Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs a spatial plot.

multivariate.plot_correlation_single_location(...)

Uses seaborn.regplot and output of create_multivariate_dataframes() to plot scatterplot and Pearson correlation estimate of the two specified variables.

multivariate.plot_bootstrap_correlation_replicates(...)

Plots histograms of correlation between variables in input dataframes estimated via bootstrap using _calculate_bootstrap_correlation_replicates().
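
For a visual comparison of the correlation structure at a single location, the helper and plot functions can be combined - a sketch with placeholder dataset names:

>>> tas_pr_obs, tas_pr_isimip = multivariate.create_multivariate_dataframes(variables = ['tas', 'pr'], datasets_obs = [tas_obs_validate, pr_obs_validate], datasets_bc = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP], gridpoint = (1, 1))
>>> multivariate.plot_correlation_single_location(variables = ['tas', 'pr'], obs_df = tas_pr_obs, bc_df = tas_pr_isimip)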

3. Investigating whether the climate change trend is preserved

Bias correction methods can significantly modify the trend projected in the climate model simulation (Switanek et al. 2017). If the user does not consider the simulated trend to be credible, then modifying it can be a good thing to do. However, any trend modification should always be a conscious and informed choice, and the belief that a bias correction method will improve the trend should be justified. Otherwise, the trend modification introduced by the application of a bias correction method should be considered an artifact.

This component helps the user assess whether a given method preserves the climate model trend or not. Some methods implemented in this package are explicitly trend-preserving; for more details see the methodologies and descriptions of the individual debiasers.

trend.calculate_future_trend_bias(variable, ...)

For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default) as well as metrics passed as arguments to the function.

trend.plot_future_trend_bias_boxplot(...[, ...])

Accepts output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of metrics.

trend.plot_future_trend_bias_spatial(...[, ...])

Accepts output given by calculate_future_trend_bias() and creates a spatial plot of trend bias for one chosen metric.
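
A sketch of the intended workflow, with placeholder dataset names (calculate_future_trend_bias() is documented in full in the trend module section below):

>>> tas_trend_bias_data = trend.calculate_future_trend_bias(variable = 'tas', raw_validate = tas_cm_validate, raw_future = tas_cm_future, QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM])
>>> trend.plot_future_trend_bias_boxplot(variable = 'tas', bias_df = tas_trend_bias_data)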

ibicus.evaluate.metrics

Metrics module - Standard metric definitions

class ibicus.evaluate.metrics.AccumulativeThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')

Bases: ibicus.evaluate.metrics.ThresholdMetric

Class for climate metrics that are defined by thresholds (child class of ThresholdMetric), but are accumulative. This mainly concerns precipitation metrics.

An example of such a metric is total precipitation on very wet days (days with > 10 mm of precipitation).

Attributes:
name
threshold_type
threshold_value
variable

Methods

calculate_annual_value_beyond_threshold(...)

Calculates amount beyond threshold for each year in the dataset.

calculate_exceedance_probability(dataset)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location (across the entire time period).

calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_intensity_index(dataset)

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.

calculate_number_annual_days_beyond_threshold(...)

Calculates number of days beyond threshold for each year in the dataset.

calculate_percent_of_total_amount_beyond_threshold(dataset)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

calculate_spatial_extent(**climate_data)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...)

Returns three violinplots with distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.

calculate_annual_value_beyond_threshold(dataset, dates_array, time_func=<function year>)

Calculates amount beyond threshold for each year in the dataset.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed, numeric entries expected.

dates_array : np.ndarray

Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification] - for example: tas_dates_validate['time_obs'].

time_func : function

Points to a utils function to extract either days or months.

Returns:
np.ndarray

3d array - [years, lat, long]

calculate_intensity_index(dataset)

Calculates the amount beyond a threshold divided by the number of instances the threshold is exceeded.

Designed to calculate the simple precipitation intensity index but can be used for other variables.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections, to be analysed; numeric entries expected.

calculate_percent_of_total_amount_beyond_threshold(dataset)

Calculates percentage of total amount beyond threshold for each location over all timesteps.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections, to be analysed; numeric entries expected.

Returns:
np.ndarray

2d array with percentage of total amount above threshold at each location.
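
Examples

A usage sketch: pr_obs_validate is a placeholder precipitation array and wet_days is the pre-defined AccumulativeThresholdMetric listed below.

>>> wet_days.calculate_percent_of_total_amount_beyond_threshold(pr_obs_validate)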

class ibicus.evaluate.metrics.ThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')

Bases: object

Generic climate metric defined by exceedance or underceedance of a threshold, or by values falling between or outside an upper and a lower threshold.

Organises the definition and functionalities of such metrics. Enables the implementation of a subset of the Climdex climate extreme indices (https://www.climdex.org/learn/indices/).

Examples

>>> warm_days = ThresholdMetric(name = 'Mean warm days (K)', variable = 'tas', threshold_value = 295, threshold_type = 'higher')
Attributes:
threshold_value : Union[float, list[float], tuple[float]]

Threshold value(s) for the variable (in the correct unit). If threshold_type = “higher” or threshold_type = “lower” this is just a single float value and the metric is defined as exceedance or underceedance of that value. If threshold_type = “between” or threshold_type = “outside” then this needs to be a list in the form: [lower_bound, upper_bound] and the metric is defined as falling in between, or falling outside these values.

threshold_type : str

One of [“higher”, “lower”, “between”, “outside”]. Indicates whether we are either interested in values above the threshold value (“higher”, strict >), values below the threshold value (“lower”, strict <), values between the threshold values (“between”, not strict including the bounds) or outside the threshold values (“outside”, strict not including the bounds).

name : str = "unknown"

Metric name. Will be used in dataframes, plots etc. Recommended to include threshold value and units. Example: 'Frost days (tasmin < 0°C)'. Default: "unknown".

variable : str = "unknown"

Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: “unknown”.

Methods

calculate_exceedance_probability(dataset)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location (across the entire time period).

calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_number_annual_days_beyond_threshold(...)

Calculates number of days beyond threshold for each year in the dataset.

calculate_spatial_extent(**climate_data)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...)

Returns three violinplots with distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.

calculate_exceedance_probability(dataset)

Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location (across the entire time period).

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed, numeric entries expected.

Returns:
np.ndarray

Probability of metric occurrence at each location.
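
Examples

A usage sketch, with tas_obs_validate as a placeholder observational array and warm_days as the pre-defined ThresholdMetric listed below:

>>> warm_days.calculate_exceedance_probability(tas_obs_validate)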

calculate_instances_of_threshold_exceedance(dataset)

Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed, numeric entries expected.

calculate_number_annual_days_beyond_threshold(dataset, dates_array, time_func=<function year>)

Calculates number of days beyond threshold for each year in the dataset.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed, numeric entries expected.

dates_array : np.ndarray

Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification] - for example: tas_dates_validate['time_obs'].

time_func : function

Points to a utils function to extract either days or months.

Returns:
np.ndarray

3d array - [years, lat, long]
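
Examples

A usage sketch with placeholder names, following the dates_array convention described above:

>>> warm_days.calculate_number_annual_days_beyond_threshold(tas_obs_validate, tas_dates_validate['time_obs'])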

calculate_spatial_extent(**climate_data)

Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

The spatial extent is defined as the percentage of the area where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type), given that it occurs at at least one location. The output dataframe has three columns: 'Correction Method' (obs/raw or name of debiaser), 'Metric' (name of the threshold metric), and 'Spatial extent (% of area)'.

Parameters:
**climate_data

Keyword arguments, providing the input data to investigate.

Returns:
pd.DataFrame

Dataframe of spatial extents of metric occurrences.

Examples

>>> dry_days.calculate_spatial_extent(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
calculate_spatiotemporal_clusters(**climate_data)

Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.

A spatiotemporal cluster is defined as a connected set (in time and/or space) where the threshold is exceeded/underceeded or values are between or outside the bounds (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or name of debiaser), 'Metric' (name of the threshold metric), and 'Spatiotemporal cluster size'.

Parameters:
**climate_data

Keyword arguments, providing the input data to investigate.

Returns:
pd.DataFrame

Dataframe of sizes of individual spatiotemporal clusters of metric occurrences.

Examples

>>> dry_days.calculate_spatiotemporal_clusters(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
calculate_spell_length(minimum_length, **climate_data)

Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.

A spell length is defined as the number of days that a threshold is continuously exceeded or underceeded, or where values are continuously between or outside the threshold (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or name of debiaser as specified in **climate_data), 'Metric' (name of the threshold metric), and 'Spell length - individual spell length counts'.

Parameters:
minimum_length : int

Minimum spell length (in days) investigated.

**climate_data

Keyword arguments, providing the input data to investigate.

Returns:
pd.DataFrame

Dataframe of spell lengths of metric occurrences.

Examples

>>> dry_days.calculate_spell_length(minimum_length = 4, obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
filter_threshold_exceedances(dataset)

Returns an array containing the values of dataset where the threshold condition is met and zero where not.

Parameters:
dataset : np.ndarray

Input data, either observations or climate projections to be analysed, numeric entries expected.

violinplots_clusters(minimum_length, **climate_data)

Returns three violinplots with distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.

Parameters:
minimum_length : int

Minimum spell length (in days) investigated for temporal extents.
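
Examples

A usage sketch with placeholder dataset names, analogous to calculate_spell_length():

>>> frost_days.violinplots_clusters(minimum_length = 4, obs = tasmin_obs_validate, raw = tasmin_cm_validate, ISIMIP = tasmin_val_debiased_ISIMIP)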

ibicus.evaluate.metrics.R10mm = AccumulativeThresholdMetric(threshold_value=0.00011574074074074075, threshold_type='higher', name='Very wet days \n (> 10 mm/day)', variable='pr')

Very wet days (> 10 mm/day) for pr.

ibicus.evaluate.metrics.R20mm = AccumulativeThresholdMetric(threshold_value=0.0002314814814814815, threshold_type='higher', name='Extremely wet days \n (> 20 mm/day)', variable='pr')

Extremely wet days (> 20 mm/day) for pr.

ibicus.evaluate.metrics.cold_days = ThresholdMetric(threshold_value=275, threshold_type='lower', name='Mean cold days (K)', variable='tas')

Cold days (< 275 K) for tas.

ibicus.evaluate.metrics.dry_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='lower', name='Dry days \n (< 1 mm/day)', variable='pr')

Dry days (< 1 mm/day) for pr.

ibicus.evaluate.metrics.frost_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Frost days \n  (tasmin<0°C)', variable='tasmin')

Frost days (<0°C) for tasmin.

ibicus.evaluate.metrics.icing_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Icing days \n (tasmax<0°C)', variable='tasmax')

Icing days (<0°C) for tasmax.

ibicus.evaluate.metrics.summer_days = ThresholdMetric(threshold_value=298.15, threshold_type='higher', name='Summer days \n  (tasmax>25°C)', variable='tasmax')

Summer days (>25°C) for tasmax.

ibicus.evaluate.metrics.tropical_nights = ThresholdMetric(threshold_value=293.13, threshold_type='higher', name='Tropical Nights \n (tasmin>20°C)', variable='tasmin')

Tropical Nights (>20°C) for tasmin.

ibicus.evaluate.metrics.warm_days = ThresholdMetric(threshold_value=295, threshold_type='higher', name='Mean warm days (K)', variable='tas')

Warm days (>295K) for tas.

ibicus.evaluate.metrics.wet_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='higher', name='Wet days \n (> 1 mm/day)', variable='pr')

Wet days (> 1 mm/day) for pr.

ibicus.evaluate.marginal

ibicus.evaluate.multivariate

ibicus.evaluate.multivariate.calculate_and_spatialplot_multivariate_correlation(variables, manual_title=' ', **kwargs)

Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs a spatial plot.

Parameters:
variables : list

List of two variable names; have to be given in the standard form specified in the documentation.

manual_title : str

Optional argument present in all plot functions: manual_title will be used as title of the plot.

**kwargs

Keyword arguments specifying a list of two np.ndarrays containing the two variables of interest.

Examples

>>> multivariate.calculate_and_spatialplot_multivariate_correlation(variables = ['tas', 'pr'], obs = [tas_obs_validate, pr_obs_validate], raw = [tas_cm_validate, pr_cm_validate], ISIMIP = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP])
ibicus.evaluate.multivariate.calculate_conditional_joint_threshold_exceedance(metric1, metric2, **climate_data)

Returns a pd.DataFrame containing location-wise conditional exceedance probability.

Calculates:

\[p (\text{Metric1} | \text{Metric2}) = p (\text{Metric1} , \text{Metric2}) / p(\text{Metric2})\]

Output is a pd.DataFrame with 3 columns:

- Correction Method: type of climate data - obs, raw, or the name of the bias correction method; given through the keys of **climate_data
- Compound metric: str reading 'Metric1.name given Metric2.name'
- Conditional exceedance probability: 2d numpy array with the conditional exceedance probability at each location

Parameters:
metric1 : ThresholdMetric

Metric whose occurrence probability is evaluated, conditional on the occurrence of metric2.

metric2 : ThresholdMetric

Metric on whose occurrence the probability of metric1 is conditioned.

**climate_data

Keyword arguments of type key = debiased_dataset in validation period (example: ‘QM = tas_val_debiased_QM’, or ‘obs = tas_val_debiased_obs’).

Returns:
pd.DataFrame

DataFrame with conditional exceedance probability at all locations for the combination of metrics chosen.

Examples

>>> dry_frost_data = calculate_conditional_joint_threshold_exceedance(metric1 = dry_days, metric2 = frost_days, obs = [pr_obs_validate, tasmin_obs_validate], raw = [pr_cm_validate, tasmin_cm_validate], ISIMIP = [pr_val_debiased_ISIMIP, tasmin_val_debiased_ISIMIP])
ibicus.evaluate.multivariate.create_multivariate_dataframes(variables, datasets_obs, datasets_bc, gridpoint=(0, 0))

Helper function creating two joint pd.DataFrames of the two variables specified, for an observational dataset as well as one bias corrected dataset, at one gridpoint.

Parameters:
variables : list

List of two variable names; have to be given in standard form following the CMIP convention.

datasets_obs : list

List of two observational datasets during same period for the two variables.

datasets_bc : list

List of two bias corrected datasets during same period for the two variables.

gridpoint : tuple

Tuple specifying the location from which the data will be extracted.

Examples

>>> tas_pr_obs, tas_pr_isimip = create_multivariate_dataframes(variables = ['tas', 'pr'], datasets_obs = [tas_obs_validate, pr_obs_validate], datasets_bc = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP], gridpoint = (1,1))
ibicus.evaluate.multivariate.plot_bootstrap_correlation_replicates(obs_df, bc_df, bc_name, size)

Plots histograms of correlation between variables in input dataframes estimated via bootstrap using _calculate_bootstrap_correlation_replicates().

Parameters:
obs_df : pd.DataFrame

First argument of output of create_multivariate_dataframes()

bc_df : pd.DataFrame

Second argument of output of create_multivariate_dataframes()

bc_name: str

Name of bias correction method

size: int

Number of draws in bootstrapping procedure

Examples

>>> plot_bootstrap_correlation_replicates(obs_df = tas_pr_obs, bc_df = tas_pr_isimip, bc_name = 'ISIMIP', size=500)
ibicus.evaluate.multivariate.plot_conditional_joint_threshold_exceedance(conditional_exceedance_df)

Accepts output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.

Parameters:
conditional_exceedance_df : pd.DataFrame

Output of calculate_conditional_joint_threshold_exceedance().

ibicus.evaluate.multivariate.plot_correlation_single_location(variables, obs_df, bc_df)

Uses seaborn.regplot and output of create_multivariate_dataframes() to plot scatterplot and Pearson correlation estimate of the two specified variables. Offers visual comparison of correlation at single location.

Parameters:
variables : list

List of two variable names; have to be given in standard form following the CMIP convention.

obs_df : pd.DataFrame

First argument of output of create_multivariate_dataframes()

bc_df : pd.DataFrame

Second argument of output of create_multivariate_dataframes()

Examples

>>> plot_correlation_single_location(variables = ['tas', 'pr'], obs_df = tas_pr_obs, bc_df = tas_pr_isimip)

ibicus.evaluate.correlation

ibicus.evaluate.correlation.rmse_spatial_correlation_boxplot(variable, dataset, manual_title=' ')

Boxplot of RMSE of spatial correlation across locations.

Parameters:
variable : str

Variable name, has to be given in standard form specified in documentation.

dataset : pd.DataFrame

Output of the function rmse_spatial_correlation_distribution().

manual_title : str

Optional argument present in all plot functions: manual_title will be used as title of the plot.

ibicus.evaluate.correlation.rmse_spatial_correlation_distribution(variable, obs_data, **cm_data)

Calculates Root-Mean-Squared-Error between observed and modelled spatial correlation matrix at each location.

The computation involves the following steps: at each location, calculate the correlation to each other location in the observed as well as the climate model dataset, then calculate the root mean squared error between these two matrices.

Parameters:
variable : str

Variable name, has to be given in standard form specified in documentation.

obs_data : np.ndarray

Observational dataset to compare the climate model data against.

**cm_data

Keyword arguments specifying climate model datasets, for example: QM = tas_debiased_QM

Examples

>>> tas_rmsd_spatial = rmse_spatial_correlation_distribution(variable = 'tas', obs_data = tas_obs_validate, raw = tas_cm_future, QDM = tas_val_debiased_QDM)

ibicus.evaluate.trend

ibicus.evaluate.trend.calculate_future_trend_bias(variable, raw_validate, raw_future, metrics=[], trend_type='additive', remove_outliers=True, **debiased_cms)

For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default) as well as metrics passed as arguments to the function.

Trend can be specified as either additive or multiplicative.

The function returns a pd.DataFrame with three columns: [Correction Method: str, Metric: str, Relative change bias (%): 2d np.ndarray containing the trend bias at each location].

Parameters:
variable : str

Variable name, has to be given in standard form following CMIP convention.

raw_validate : np.ndarray

Raw climate data set in validation period

raw_future: np.ndarray

Raw climate data set in future period

metrics : np.ndarray

1d numpy array of strings containing the keys of the metrics to be analysed. Example: metrics = [‘dry’, ‘wet’]

trend_type: str

Determines whether additive or multiplicative trend is analysed. Has to be one of [‘additive’, ‘multiplicative’]

debiased_cms : np.ndarray

Keyword arguments given in format debiaser_name = [debiased_dataset_validation_period, debiased_dataset_future_period] Example: QM = [tas_val_debiased_QM, tas_future_debiased_QM].

Examples

>>> tas_trend_bias_data = trend.calculate_future_trend_bias(variable = 'tas', raw_validate = tas_cm_validate, raw_future = tas_cm_future, metrics = ['warm_days', 'cold_days'], trend_type = 'additive', QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM], CDFT = [tas_val_debiased_CDFT, tas_fut_debiased_CDFT])
ibicus.evaluate.trend.plot_future_trend_bias_boxplot(variable, bias_df, manual_title=' ')

Accepts output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of metrics.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

bias_df : pd.DataFrame

DataFrame with three columns [Correction Method, Metric, Bias value at each location], as output by calculate_future_trend_bias().

manual_title : str

Optional argument present in all plot functions: manual_title will be used as title of the plot.

ibicus.evaluate.trend.plot_future_trend_bias_spatial(variable, metric, bias_df, manual_title=' ')

Accepts output given by calculate_future_trend_bias() and creates a spatial plot of trend bias for one chosen metric.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

metric : str

Metric for which the spatial plot of trend bias is created; has to be one of the metrics contained in bias_df.

bias_df : pd.DataFrame

DataFrame with three columns [Correction Method, Metric, Bias value at each location], as output by calculate_future_trend_bias().

manual_title : str

Optional argument present in all plot functions: manual_title will be used as title of the plot.

ibicus.evaluate.assumptions

ibicus.evaluate.assumptions.calculate_aic(variable, dataset, *distributions)

Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

Warning

*distributions can currently only contain scipy.stats.rv_continuous objects and not, as elsewhere in the package, also StatisticalModel.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

dataset : np.ndarray

Input data, either observations or climate projections dataset to be analysed, numeric entries expected.

*distributions : list[scipy.stats.rv_continuous]

Distributions to be tested, elements are scipy.stats.rv_continuous

Returns:
pd.DataFrame

DataFrame with all locations, distributions and associated AIC values.

ibicus.evaluate.assumptions.plot_aic(variable, aic_values, manual_title=' ')

Creates a boxplot of AIC values across all locations.

Parameters:
variable : str

Variable name, has to be given in standard form specified in documentation.

aic_values : pd.DataFrame

Pandas dataframe of the type output by calculate_aic().

ibicus.evaluate.assumptions.plot_fit_worst_aic(variable, dataset, data_type, distribution, nr_bins='auto', aic_values=None, manual_title=' ')

Plots a histogram and overlaid fit at the location of worst AIC.

Warning

distribution can currently only be a scipy.stats.rv_continuous distribution and not, as elsewhere in the package, also StatisticalModel.

Parameters:
variable : str

Variable name, has to be given in standard CMIP convention

dataset : np.ndarray

3d-input data [time, lat, long], numeric entries expected. Either observations or climate projections dataset to be analysed.

data_type : str

Data type analysed - can be observational data or raw / debiased climate model data. Used to generate title only.

distribution : scipy.stats.rv_continuous

Distribution providing the fit to the data.

nr_bins : Union[int, str] = "auto"

Number of bins used for the histogram. Either int or "auto" (default).

aic_values : Optional[pd.DataFrame] = None

Pandas dataframe of the type output by calculate_aic(). If None, the AIC values are recalculated.

ibicus.evaluate.assumptions.plot_quantile_residuals(variable, dataset, distribution, data_type, manual_title=' ')

Plots timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals at one location.

Parameters:
variable: str

Variable name, has to be given in standard form specified in documentation.

dataset : np.ndarray

1d numpy array. Input data, either observations or climate projections dataset at one location, numeric entries expected.

distribution: scipy.stats.rv_continuous

Name of the distribution analysed, used for title only.

data_type: str

Data type analysed - can be observational data or raw / debiased climate model data. Used to generate title only.

Examples

>>> tas_obs_plot_gof = assumptions.plot_quantile_residuals(variable = 'tas', dataset = tas_obs[:,0,0], distribution = scipy.stats.norm, data_type = 'observation data')