ibicus.evaluate module¶
The evaluate module provides a set of functionalities to assess the performance of your bias correction method.
Bias correction is prone to misuse and requires careful evaluation, as demonstrated and argued in Maraun et al. 2017. In particular, the bias correction methods implemented in this package operate on a marginal level: they correct the distribution of individual variables at individual locations. There is therefore only a subset of climate model biases that these debiasers will be able to correct. Biases in the temporal or spatial structure of climate models, or in the feedbacks to large-scale weather patterns, might not be well corrected.
The evaluate module attempts to provide the user with the functionality to make an informed decision on whether a chosen bias correction method is fit for purpose: whether it corrects marginal as well as spatial and temporal statistical properties in the desired manner, how it modifies the multivariate structure, if and how it modifies the climate change trend, and how it changes the bias in selected climate impact metrics.
There are three components to the evaluation module:
1. Testing assumptions of different debiasers
Different debiasers rely on different assumptions: some are parametric, others non-parametric; some bias correct each day or month of the year separately, others are applied to all days of the year in the same way.
This component is meant to check some of these assumptions. It can, for example, help the user choose an appropriate function to fit the data to, choose an appropriate application window (entire year vs. each day or month individually), and rule out debiasers that are not fit for purpose in a specific application.
The current version of this component can analyse the following two questions:
- Is the fit of the default distribution 'good enough', or should a different distribution be used?
- Is there any seasonality in the data that should be accounted for, for example by applying a 'running window mode' (meaning that the bias correction is fitted separately for different parts of the year, i.e. windows)?
The following functions are currently available:

calculate_aic : Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

plot_aic : Creates a boxplot of AIC values across all locations.

plot_fit_worst_aic : Plots a histogram and overlaid fit at the location of worst AIC.

plot_quantile_residuals : Plots the timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals, at one location.
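As a sketch of what the AIC comparison measures (an illustration, not the ibicus implementation): for a fitted distribution, AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximised likelihood; lower values indicate a better fit after penalising model complexity. A minimal pure-Python example for a maximum-likelihood normal fit:

```python
import math

def normal_aic(data):
    """AIC of a maximum-likelihood normal fit: AIC = 2*k - 2*ln(L), with k = 2 (mean, variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE variance
    # Gaussian log-likelihood evaluated at the MLE
    log_l = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return 2 * 2 - 2 * log_l

# hypothetical data; a more dispersed sample yields a higher (worse) AIC for the same n
aic = normal_aic([0.9, 1.1, 1.0, 0.95, 1.05, 1.0])
```

In the package itself, calculate_aic performs such a comparison per location across all candidate scipy.stats distributions.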
2. Evaluating the bias corrected model on a validation period
In order to assess the performance of a bias correction method, the bias corrected model data has to be compared to observational / reanalysis data. The historical period for which observations exist is therefore split into two datasets in preprocessing: a reference period and a validation period.
There are two types of analysis that the evaluation module enables you to conduct:
Statistical properties: this includes the marginal bias of descriptive statistics such as the mean, or 5th and 95th percentile, as well as the difference in spatial and multivariate correlation structure.
Threshold metrics: A threshold metric is an instance of the ThresholdMetric class and needs to be one of four types: exceedance of the specified threshold value ('higher'), underceedance of the threshold value ('lower'), falling within two specified bounds ('between'), or falling outside two specified bounds ('outside'). With the functionalities provided as part of the ThresholdMetric class, the marginal exceedance probability as well as the temporal spell length, the spatial extent and the spatiotemporal cluster size can be analysed. Some threshold metrics are prespecified, and the user can add further metrics in the following way:
>>> frost_days = ThresholdMetric(name="Frost days (tasmin<0°C)", variable="tasmin", threshold_value=273.13, threshold_type="lower")
The following table provides an overview of the different components that can be analysed in each of these two categories:
                 Statistical properties    Threshold metrics
Marginal         x                         x
Temporal                                   x (spell length)
Spatial          x (RMSE)                  x (spatial extent)
Spatiotemporal                             x (cluster size)
Multivariate     x (correlation)           x (joint exceedance)
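The four threshold_type conditions can be sketched in plain Python (a hypothetical re-implementation for illustration only; 'higher' and 'lower' are strict, 'between' includes the bounds and 'outside' excludes them, matching the ThresholdMetric attribute documentation below):

```python
def threshold_condition(value, threshold_type, threshold_value):
    # threshold_value is a single float for "higher"/"lower",
    # and a [lower_bound, upper_bound] pair for "between"/"outside"
    if threshold_type == "higher":
        return value > threshold_value          # strict >
    if threshold_type == "lower":
        return value < threshold_value          # strict <
    lower, upper = threshold_value
    if threshold_type == "between":
        return lower <= value <= upper          # bounds included
    if threshold_type == "outside":
        return value < lower or value > upper   # bounds excluded
    raise ValueError(f"Unknown threshold_type: {threshold_type}")

# frost days: tasmin below 0 degC (273.13 K in the prespecified metric)
is_frost = threshold_condition(270.0, "lower", 273.13)
```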
Within the metrics class, the following functions are available:

calculate_instances_of_threshold_exceedance : Returns an array of the same size as dataset, containing 1 when the threshold condition is met and 0 when not.

filter_threshold_exceedances : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

calculate_exceedance_probability : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_number_annual_days_beyond_threshold : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_spell_length : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

calculate_spatial_extent : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

violinplots_clusters : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
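To illustrate what a spell-length count measures, here is a plain-Python sketch under the assumption that a spell is a maximal run of consecutive timesteps meeting the threshold condition (the ibicus implementation operates on full [time, lat, long] arrays and returns a pd.DataFrame):

```python
def spell_lengths(exceedance_flags, minimum_length=1):
    """Lengths of maximal runs of 1s in a 0/1 time series, keeping runs >= minimum_length."""
    runs, current = [], 0
    for flag in exceedance_flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # close a run that reaches the end of the series
        runs.append(current)
    return [r for r in runs if r >= minimum_length]

# hypothetical occurrence series with two spells, of lengths 3 and 2
lengths = spell_lengths([0, 1, 1, 1, 0, 0, 1, 1, 0])
```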
The AccumulativeThresholdMetric class is a child class of ThresholdMetric that adds functionalities for variables and metrics where the total accumulative amount beyond a given threshold is of interest; this is the case for precipitation, for example, but not for temperature. The following functions are added:

calculate_percent_of_total_amount_beyond_threshold : Calculates the percentage of the total amount beyond the threshold for each location, over all timesteps.

calculate_annual_value_beyond_threshold : Calculates the amount beyond the threshold for each year in the dataset.

calculate_intensity_index : Calculates the amount beyond a threshold divided by the number of instances in which the threshold is exceeded.
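For example, the intensity index (amount beyond the threshold divided by the number of exceedances, cf. the simple precipitation intensity index SDII) can be sketched for a single time series. This is an illustration with hypothetical values, assuming "amount beyond" means the accumulated amount on exceedance timesteps; the package works on full arrays:

```python
def intensity_index(values, threshold):
    """Mean amount on timesteps exceeding the threshold (sketch of an SDII-style index)."""
    exceedances = [v for v in values if v > threshold]
    if not exceedances:
        return 0.0  # threshold never exceeded
    return sum(exceedances) / len(exceedances)

# hypothetical daily precipitation in mm/day with a 1 mm/day wet-day threshold:
# wet days are 5, 11 and 2 mm -> (5 + 11 + 2) / 3 = 6 mm/day
sdii = intensity_index([0.0, 5.0, 0.2, 11.0, 2.0], threshold=1.0)
```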
For the evaluation of marginal properties, see the functions in the ibicus.evaluate.marginal module.

The following functions are available to analyse the bias in the spatial correlation structure:

rmse_spatial_correlation_distribution : Calculates the root mean squared error (RMSE) between the observed and modelled spatial correlation matrix at each location.

rmse_spatial_correlation_boxplot : Boxplot of the RMSE of spatial correlation across locations.

To analyse the multivariate correlation structure, as well as joint threshold exceedances, the following functions are available:

calculate_conditional_joint_threshold_exceedance : Returns a pd.DataFrame containing the location-wise conditional exceedance probability.

plot_conditional_joint_threshold_exceedance : Accepts the output of calculate_conditional_joint_threshold_exceedance and creates an overview boxplot of the conditional exceedance probability across locations.

calculate_and_spatialplot_multivariate_correlation : Calculates the correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs a spatial plot.

plot_correlation_single_location : Uses seaborn.regplot and the output of create_multivariate_dataframes to plot a scatterplot and Pearson correlation estimate of the two specified variables at a single location.

plot_bootstrap_correlation_replicates : Plots histograms of the correlation between variables in the input dataframes, estimated via bootstrap.
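The conditional joint exceedance computed by calculate_conditional_joint_threshold_exceedance, p(Metric1 | Metric2) = p(Metric1, Metric2) / p(Metric2), can be sketched for two 0/1 occurrence series at one location (a pure-Python illustration with hypothetical data, not the package implementation):

```python
def conditional_exceedance_probability(occ1, occ2):
    """p(M1 | M2) = p(M1 and M2) / p(M2) from two equal-length 0/1 occurrence series."""
    n_m2 = sum(occ2)
    if n_m2 == 0:
        return float("nan")  # conditioning event never occurs
    n_joint = sum(a and b for a, b in zip(occ1, occ2))
    return n_joint / n_m2

# hypothetical example: metric 1 occurs jointly with metric 2 on 2 of the
# 3 timesteps on which metric 2 occurs -> p = 2/3
p = conditional_exceedance_probability([1, 0, 1, 1], [1, 1, 0, 1])
```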
3. Investigating whether the climate change trend is preserved
Bias correction methods can significantly modify the trend projected in the climate model simulation (Switanek 2017). If the user does not consider the simulated trend to be credible, then modifying it can be a good thing to do. However, any trend modification should always be a conscious and informed choice, and the belief that a bias correction method will improve the trend should be justified. Otherwise, the trend modification introduced through the application of a bias correction method should be considered an artifact.
This component helps the user assess whether a certain method preserves the climate model trend or not. Some methods implemented in this package are explicitly trend preserving; for more details see the methodologies and descriptions of the individual debiasers.

calculate_future_trend_bias : For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default), as well as any metrics passed as arguments to the function.

plot_future_trend_bias_boxplot : Accepts the output of calculate_future_trend_bias and creates an overview boxplot of the bias in the trend of the metrics.

plot_future_trend_bias_spatial : Accepts the output of calculate_future_trend_bias and creates a spatial plot of the trend bias for one chosen metric.
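The trend-preservation check compares the climate change trend in the raw model against that in the bias corrected model. A minimal sketch for one location and the mean metric, using an additive trend and a percentage bias (hypothetical numbers; not the package's exact computation):

```python
from statistics import mean

def additive_trend(validate_values, future_values):
    """Additive climate change trend: change in the mean from validation to future period."""
    return mean(future_values) - mean(validate_values)

def trend_bias_percent(raw_validate, raw_future, bc_validate, bc_future):
    """Percentage bias of the bias corrected trend relative to the raw model trend."""
    raw_trend = additive_trend(raw_validate, raw_future)
    bc_trend = additive_trend(bc_validate, bc_future)
    return 100.0 * (bc_trend - raw_trend) / raw_trend

# hypothetical tas values in K: the raw model warms by 2 K, the bias corrected
# model by only 1.5 K, i.e. a trend bias of -25 %
bias = trend_bias_percent(
    raw_validate=[285.0, 287.0], raw_future=[287.0, 289.0],
    bc_validate=[284.0, 286.0], bc_future=[285.5, 287.5],
)
```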
ibicus.evaluate.metrics¶
Metrics module: standard metric definitions
 class ibicus.evaluate.metrics.AccumulativeThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')¶
Bases:
ibicus.evaluate.metrics.ThresholdMetric
Class for climate metrics that are defined by thresholds (child class of
ThresholdMetric
), but are accumulative. This mainly concerns precipitation metrics. An example of such a metric is the total precipitation on very wet days (days with > 10 mm precipitation).
 Attributes:
 name
 threshold_type
 threshold_value
 variable
Methods

calculate_annual_value_beyond_threshold(dataset, ...) : Calculates the amount beyond the threshold for each year in the dataset.

calculate_exceedance_probability(dataset) : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_instances_of_threshold_exceedance(dataset) : Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_intensity_index(dataset) : Calculates the amount beyond a threshold divided by the number of instances in which the threshold is exceeded.

calculate_number_annual_days_beyond_threshold(...) : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_percent_of_total_amount_beyond_threshold(dataset) : Calculates the percentage of the total amount beyond the threshold for each location, over all timesteps.

calculate_spatial_extent(**climate_data) : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data) : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...) : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset) : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...) : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
 calculate_annual_value_beyond_threshold(dataset, dates_array, time_func=<function year>)¶
Calculates amount beyond threshold for each year in the dataset.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
dates_array : np.ndarray
    Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification], for example: tas_dates_validate['time_obs'].
time_func : function
    Points to a utils function to extract either days or months.
Returns:
np.ndarray
    3d array [years, lat, long]
 calculate_intensity_index(dataset)¶
Calculates the amount beyond a threshold divided by the number of instance the threshold is exceeded.
Designed to calculate the simple precipitation intensity index but can be used for other variables.
Parameters:
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
 calculate_percent_of_total_amount_beyond_threshold(dataset)¶
Calculates percentage of total amount beyond threshold for each location over all timesteps.
Parameters:
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
Returns:
np.ndarray
    2d array with the percentage of the total amount above the threshold at each location.
 class ibicus.evaluate.metrics.ThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')¶
Bases:
object
Generic climate metric defined by the exceedance or underceedance of a threshold, or by values falling between or outside an upper and lower threshold.
Organises the definition and functionalities of such metrics. Enables the implementation of a subset of the Climdex climate extreme indices (https://www.climdex.org/learn/indices/).
Examples
>>> warm_days = ThresholdMetric(name = 'Mean warm days (K)', variable = 'tas', threshold_value = [295], threshold_type = 'higher')
Attributes:
threshold_value : Union[float, list[float], tuple[float]]
    Threshold value(s) for the variable (in the correct unit). If threshold_type = "higher" or threshold_type = "lower", this is a single float value and the metric is defined as exceedance or underceedance of that value. If threshold_type = "between" or threshold_type = "outside", this needs to be a list of the form [lower_bound, upper_bound] and the metric is defined as falling between, or falling outside, these values.
threshold_type : str
    One of ["higher", "lower", "between", "outside"]. Indicates whether we are interested in values above the threshold value ("higher", strict >), values below the threshold value ("lower", strict <), values between the threshold values ("between", not strict, including the bounds) or values outside the threshold values ("outside", strict, not including the bounds).
name : str = "unknown"
    Metric name. Will be used in dataframes, plots etc. Recommended to include the threshold value and units. Example: 'Frost days (tasmin < 0°C)'. Default: "unknown".
variable : str = "unknown"
    Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: "unknown".
Methods

calculate_exceedance_probability(dataset) : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_instances_of_threshold_exceedance(dataset) : Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_number_annual_days_beyond_threshold(...) : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_spatial_extent(**climate_data) : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data) : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...) : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset) : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...) : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
 calculate_exceedance_probability(dataset)¶
Returns the probability of metrics occurrence (threshold exceedance/underceedance or inside/outside range), at each location (across the entire time period).
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
Returns:
np.ndarray
    Probability of metric occurrence at each location.
 calculate_instances_of_threshold_exceedance(dataset)¶
Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
 calculate_number_annual_days_beyond_threshold(dataset, dates_array, time_func=<function year>)¶
Calculates number of days beyond threshold for each year in the dataset.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
dates_array : np.ndarray
    Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification], for example: tas_dates_validate['time_obs'].
time_func : function
    Points to a utils function to extract either days or months.
Returns:
np.ndarray
    3d array [years, lat, long]
 calculate_spatial_extent(**climate_data)¶
Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.
The spatial extent is defined as the percentage of the area where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type), given that it is exceeded at at least one location. The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser), 'Metric' (the name of the threshold metric), and 'Spatial extent (% of area)'.
Parameters:
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of spatial extents of metric occurrences.
Examples
>>> dry_days.calculate_spatial_extent(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
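The spatial-extent definition above can be sketched for a single timestep on a 2d grid of 0/1 occurrence flags (a plain-Python illustration with hypothetical data; the package computes this over full [time, lat, long] arrays and returns a pd.DataFrame):

```python
def spatial_extent(occurrence_grid):
    """Percentage of grid cells where the metric occurs, given it occurs somewhere."""
    flat = [flag for row in occurrence_grid for flag in row]
    if not any(flat):
        return None  # metric does not occur anywhere at this timestep
    return 100.0 * sum(flat) / len(flat)

# the metric occurs in 2 of 4 cells -> spatial extent of 50 % of the area
extent = spatial_extent([[1, 0], [0, 1]])
```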
 calculate_spatiotemporal_clusters(**climate_data)¶
Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.
A spatiotemporal cluster is defined as a connected set (in time and/or space) where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser), 'Metric' (the name of the threshold metric), and 'Spatiotemporal cluster size'.
Parameters:
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of sizes of individual spatiotemporal clusters of metric occurrences.
Examples
>>> dry_days.calculate_spatiotemporal_clusters(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
 calculate_spell_length(minimum_length, **climate_data)¶
Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.
A spell length is defined as the number of days that a threshold is continuously exceeded or underceeded, or where values are continuously between or outside the threshold (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser as specified in **climate_data), 'Metric' (the name of the threshold metric), and 'Spell length - individual spell length counts'.
Parameters:
minimum_length : int
    Minimum spell length (in days) investigated.
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of spell lengths of metric occurrences.
Examples
>>> dry_days.calculate_spell_length(minimum_length = 4, obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
 filter_threshold_exceedances(dataset)¶
Returns an array containing the values of dataset where the threshold condition is met and zero where not.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
 violinplots_clusters(minimum_length, **climate_data)¶
Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
Parameters:
minimum_length : int
    Minimum spell length (in days) investigated for temporal extents.
 name¶
 threshold_type¶
 threshold_value¶
 variable¶
 ibicus.evaluate.metrics.R10mm = AccumulativeThresholdMetric(threshold_value=0.00011574074074074075, threshold_type='higher', name='Very wet days \n (> 10 mm/day)', variable='pr')¶
Very wet days (> 10 mm/day) for pr.
 ibicus.evaluate.metrics.R20mm = AccumulativeThresholdMetric(threshold_value=0.0002314814814814815, threshold_type='higher', name='Extremely wet days \n (> 20 mm/day)', variable='pr')¶
Extremely wet days (> 20 mm/day) for pr.
 ibicus.evaluate.metrics.cold_days = ThresholdMetric(threshold_value=275, threshold_type='lower', name='Mean cold days (K)', variable='tas')¶
Cold days (<275 K) for tas.
 ibicus.evaluate.metrics.dry_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='lower', name='Dry days \n (< 1 mm/day)', variable='pr')¶
Dry days (< 1 mm/day) for pr.
 ibicus.evaluate.metrics.frost_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Frost days \n (tasmin<0°C)', variable='tasmin')¶
Frost days (<0°C) for tasmin.
 ibicus.evaluate.metrics.icing_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Icing days \n (tasmax<0°C)', variable='tasmax')¶
Icing days (<0°C) for tasmax.
 ibicus.evaluate.metrics.summer_days = ThresholdMetric(threshold_value=298.15, threshold_type='higher', name='Summer days \n (tasmax>25°C)', variable='tasmax')¶
Summer days (>25°C) for tasmax.
 ibicus.evaluate.metrics.tropical_nights = ThresholdMetric(threshold_value=293.13, threshold_type='higher', name='Tropical Nights \n (tasmin>20°C)', variable='tasmin')¶
Tropical Nights (>20°C) for tasmin.
 ibicus.evaluate.metrics.warm_days = ThresholdMetric(threshold_value=295, threshold_type='higher', name='Mean warm days (K)', variable='tas')¶
Warm days (>295K) for tas.
 ibicus.evaluate.metrics.wet_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='higher', name='Wet days \n (> 1 mm/day)', variable='pr')¶
Wet days (> 1 mm/day) for pr.
ibicus.evaluate.marginal¶
ibicus.evaluate.multivariate¶
 ibicus.evaluate.multivariate.calculate_and_spatialplot_multivariate_correlation(variables, manual_title=' ', **kwargs)¶
Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs spatial plot.
Parameters:
variables : list
    List of two variable names; has to be given in the standard form specified in the documentation.
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
**kwargs
    Keyword arguments, each specifying a list of two np.ndarrays containing the two variables of interest.
Examples
>>> multivariate.calculate_and_spatialplot_multivariate_correlation(variables = ['tas', 'pr'], obs = [tas_obs_validate, pr_obs_validate], raw = [tas_cm_validate, pr_cm_validate], ISIMIP = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP])
 ibicus.evaluate.multivariate.calculate_conditional_joint_threshold_exceedance(metric1, metric2, **climate_data)¶
Returns a pd.DataFrame containing the location-wise conditional exceedance probability. Calculates:
\[p(\text{Metric1} \mid \text{Metric2}) = p(\text{Metric1}, \text{Metric2}) / p(\text{Metric2})\]
The output is a pd.DataFrame with three columns: 'Correction Method' (the type of climate data: obs, raw, or the name of the bias correction, given through the keys of **climate_data), 'Compound metric' (a string reading 'Metric1.name given Metric2.name'), and 'Conditional exceedance probability' (a 2d numpy array with the conditional exceedance probability at each location).
Parameters:
metric1 : ThresholdMetric
    First threshold metric of the compound metric; its occurrence probability is conditioned on metric2.
metric2 : ThresholdMetric
    Second threshold metric of the compound metric; the conditioning metric.
**climate_data
    Keyword arguments of the form key = dataset in the validation period (example: 'QM = tas_val_debiased_QM').
 Returns:
 pd.DataFrame
DataFrame with conditional exceedance probability at all locations for the combination of metrics chosen.
Examples
>>> dry_frost_data = calculate_conditional_joint_threshold_exceedance(metric1 = dry_days, metric2 = frost_days, obs = [pr_obs_validate, tasmin_obs_validate], raw = [pr_cm_validate, tasmin_cm_validate], ISIMIP = [pr_val_debiased_ISIMIP, tasmin_val_debiased_ISIMIP])
 ibicus.evaluate.multivariate.create_multivariate_dataframes(variables, datasets_obs, datasets_bc, gridpoint=(0, 0))¶
Helper function creating two joint pd.DataFrames for the two variables specified: one for the observational dataset and one for a bias corrected dataset, at one gridpoint.
Parameters:
variables : list
    List of two variable names; has to be given in the standard form following the CMIP convention.
datasets_obs : list
    List of two observational datasets covering the same period, one for each of the two variables.
datasets_bc : list
    List of two bias corrected datasets covering the same period, one for each of the two variables.
gridpoint : tuple
    Tuple that specifies the location from which the data will be extracted.
Examples
>>> tas_pr_obs, tas_pr_isimip = create_multivariate_dataframes(variables = ['tas', 'pr'], datasets_obs = [tas_obs_validate, pr_obs_validate], datasets_bc = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP], gridpoint = (1,1))
 ibicus.evaluate.multivariate.plot_bootstrap_correlation_replicates(obs_df, bc_df, bc_name, size)¶
Plots histograms of the correlation between the variables in the input dataframes, estimated via bootstrap using _calculate_bootstrap_correlation_replicates().
Parameters:
obs_df : pd.DataFrame
    First element of the output of create_multivariate_dataframes()
bc_df : pd.DataFrame
    Second element of the output of create_multivariate_dataframes()
bc_name : str
    Name of the bias correction method
size : int
    Number of draws in the bootstrapping procedure
Examples
>>> plot_bootstrap_correlation_replicates(obs_df = tas_pr_obs, bc_df = tas_pr_isimip, bc_name = 'ISIMIP', size=500)
 ibicus.evaluate.multivariate.plot_conditional_joint_threshold_exceedance(conditional_exceedance_df)¶
Accepts the output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.
Parameters:
conditional_exceedance_df : pd.DataFrame
    Output of calculate_conditional_joint_threshold_exceedance()
 ibicus.evaluate.multivariate.plot_correlation_single_location(variables, obs_df, bc_df)¶
Uses seaborn.regplot and the output of create_multivariate_dataframes() to plot a scatterplot and Pearson correlation estimate of the two specified variables. Offers a visual comparison of the correlation at a single location.
Parameters:
variables : list
    List of variable names; has to be given in the standard form following the CMIP convention.
obs_df : pd.DataFrame
    First element of the output of create_multivariate_dataframes()
bc_df : pd.DataFrame
    Second element of the output of create_multivariate_dataframes()
Examples
>>> plot_correlation_single_location(variables = ['tas', 'pr'], obs_df = tas_pr_obs, bc_df = tas_pr_isimip)
ibicus.evaluate.correlation¶
 ibicus.evaluate.correlation.rmse_spatial_correlation_boxplot(variable, dataset, manual_title=' ')¶
Boxplot of the RMSE of spatial correlation across locations.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : pd.DataFrame
    Output format of the function rmse_spatial_correlation_distribution()
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
 ibicus.evaluate.correlation.rmse_spatial_correlation_distribution(variable, obs_data, **cm_data)¶
Calculates the root mean squared error (RMSE) between the observed and modelled spatial correlation matrix at each location.
The computation involves the following steps: at each location, calculate the correlation to each other location in the observed as well as in the climate model dataset; then calculate the mean squared error between these two matrices.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
obs_data : np.ndarray
    Observational dataset to compare the climate model data against.
**cm_data
    Keyword arguments specifying climate model datasets, for example: QM = tas_debiased_QM
Examples
>>> tas_rmsd_spatial = rmse_spatial_correlation_distribution(variable = 'tas', obs_data = tas_obs_validate, raw = tas_cm_future, QDM = tas_val_debiased_QDM)
ibicus.evaluate.trend¶
 ibicus.evaluate.trend.calculate_future_trend_bias(variable, raw_validate, raw_future, metrics=[], trend_type='additive', remove_outliers=True, **debiased_cms)¶
For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default), as well as any metrics passed as arguments to the function.
The trend can be specified as either additive or multiplicative.
The function returns an array with three columns: [Correction Method: str, Metric: str, Relative change bias (%): a 2d np.ndarray containing the trend bias at each location].
Parameters:
variable : str
    Variable name, has to be given in the standard form following the CMIP convention.
raw_validate : np.ndarray
    Raw climate dataset in the validation period
raw_future : np.ndarray
    Raw climate dataset in the future period
metrics : np.ndarray
    1d numpy array of strings containing the keys of the metrics to be analysed. Example: metrics = ['dry', 'wet']
trend_type : str
    Determines whether the additive or multiplicative trend is analysed. Has to be one of ['additive', 'multiplicative']
**debiased_cms
    Keyword arguments given in the format debiaser_name = [debiased_dataset_validation_period, debiased_dataset_future_period]. Example: QM = [tas_val_debiased_QM, tas_future_debiased_QM].
Examples
>>> tas_trend_bias_data = trend.calculate_future_trend_bias(variable = 'tas', raw_validate = tas_cm_validate, raw_future = tas_cm_future, metrics = ['warm_days', 'cold_days'], trend_type = 'additive', QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM], CDFT = [tas_val_debiased_CDFT, tas_fut_debiased_CDFT])
 ibicus.evaluate.trend.plot_future_trend_bias_boxplot(variable, bias_df, manual_title=' ')¶
Accepts the output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of the metrics.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
bias_df : pd.DataFrame
    DataFrame with three columns: [Correction Method, Metric, Bias value at each location]
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
 ibicus.evaluate.trend.plot_future_trend_bias_spatial(variable, metric, bias_df, manual_title=' ')¶
Accepts the output given by calculate_future_trend_bias() and creates a spatial plot of the trend bias for one chosen metric.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
metric : str
    Name of the metric whose trend bias is plotted.
bias_df : pd.DataFrame
    DataFrame with three columns: [Correction Method, Metric, Bias value at each location]
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
ibicus.evaluate.assumptions¶
 ibicus.evaluate.assumptions.calculate_aic(variable, dataset, *distributions)¶
Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.
Warning
*distributions can currently only contain scipy.stats.rv_continuous objects, and not, as elsewhere, also StatisticalModel objects.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
*distributions : list[scipy.stats.rv_continuous]
    Distributions to be tested; elements are scipy.stats.rv_continuous
Returns:
pd.DataFrame
    DataFrame with all locations, distributions and associated AIC values.
 ibicus.evaluate.assumptions.plot_aic(variable, aic_values, manual_title=' ')¶
Creates a boxplot of AIC values across all locations.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
aic_values : pd.DataFrame
    Pandas dataframe of the type output by calculate_aic.
 ibicus.evaluate.assumptions.plot_fit_worst_aic(variable, dataset, data_type, distribution, nr_bins='auto', aic_values=None, manual_title=' ')¶
Plots a histogram and overlaid fit at the location of worst AIC.
Warning
distribution can currently only be a scipy.stats.rv_continuous object, and not, as elsewhere, also a StatisticalModel object.
Parameters:
variable : str
    Variable name, has to be given in the standard CMIP convention
dataset : np.ndarray
    3d input data [time, lat, long], numeric entries expected. Either observations or a climate projections dataset to be analysed.
data_type : str
    Data type analysed: can be observational data or raw/debiased climate model data. Used to generate the title only.
distribution : scipy.stats.rv_continuous
    Distribution providing the fit to the data
nr_bins : Union[int, str] = "auto"
    Number of bins used for the histogram. Either int or "auto" (default).
aic_values : Optional[pd.DataFrame] = None
    Pandas dataframe of the type output by calculate_aic. If None, the AIC values are recalculated.
 ibicus.evaluate.assumptions.plot_quantile_residuals(variable, dataset, distribution, data_type, manual_title=' ')¶
Plots the timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals, at one location.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : np.ndarray
    1d numpy array. Input data, either observations or a climate projections dataset at one location; numeric entries expected.
distribution : scipy.stats.rv_continuous
    Name of the distribution analysed, used for the title only.
data_type : str
    Data type analysed: can be observational data or raw/debiased climate model data. Used to generate the title only.
Examples
>>> tas_obs_plot_gof = assumptions.plot_quantile_residuals(variable = 'tas', dataset = tas_obs[:,0,0], distribution = scipy.stats.norm, data_type = 'observation data')