ibicus.evaluate module¶
The evaluate module provides a set of functionalities to assess the performance of your bias correction method.
Bias correction is prone to misuse and requires careful evaluation, as demonstrated and argued in Maraun et al. 2017. In particular, the bias correction methods implemented in this package operate on a marginal level: they correct the distribution of individual variables at individual locations. There is therefore only a subset of climate model biases that these debiasers will be able to correct. Biases in the temporal or spatial structure of climate models, or in the feedbacks to large-scale weather patterns, might not be well corrected.
The evaluate module attempts to provide the user with the functionality to make an informed decision on whether a chosen bias correction method is fit for purpose: whether it corrects marginal as well as spatial and temporal statistical properties in the desired manner, how it modifies the multivariate structure, if and how it modifies the climate change trend, and how it changes the bias in selected climate impact metrics.
There are three components to the evaluation module:
1. Testing assumptions of different debiasers
Different debiasers rely on different assumptions: some are parametric, others non-parametric; some bias correct each day or month of the year separately, others are applied to all days of the year in the same way.
This component is meant to check some of these assumptions. It can, for example, help the user choose an appropriate function to fit the data to, choose an appropriate application window (entire year vs. each day or month individually), and rule out debiasers that are not fit for purpose in a specific application.
The current version of this component can analyse the following two questions:
- Is the fit of the default distribution 'good enough', or should a different distribution be used?
- Is there any seasonality in the data that should be accounted for, for example by applying a 'running window mode' (meaning that the bias correction is fitted separately for different parts of the year, i.e. windows)?
The following functions are currently available:

calculate_aic : Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.

plot_aic : Creates a boxplot of AIC values across all locations.

plot_fit_worst_aic : Plots a histogram and overlaid fit at the location of worst AIC.

plot_quantile_residuals : Plots the timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals, at one location.
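As a sketch of what the AIC comparison measures (an illustration, not the ibicus implementation): for a fitted distribution, AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximised likelihood; lower values indicate a better fit after penalising model complexity. A minimal pure-Python example for a maximum-likelihood normal fit:

```python
import math

def normal_aic(data):
    """AIC of a maximum-likelihood normal fit: AIC = 2*k - 2*ln(L), with k = 2 (mean, variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE variance
    # Gaussian log-likelihood evaluated at the MLE
    log_l = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return 2 * 2 - 2 * log_l

# hypothetical data; a more dispersed sample yields a higher (worse) AIC for the same n
aic = normal_aic([0.9, 1.1, 1.0, 0.95, 1.05, 1.0])
```

In the package itself, calculate_aic performs such a comparison per location across all candidate scipy.stats distributions.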
2. Evaluating the bias corrected model on a validation period
In order to assess the performance of a bias correction method, the bias corrected model data has to be compared to observational / reanalysis data. The historical period for which observations exist is therefore split into two datasets in preprocessing: a reference period and a validation period.
There are two types of analysis that the evaluation module enables you to conduct:
Statistical properties: this includes the marginal bias of descriptive statistics such as the mean, or 5th and 95th percentile, as well as the difference in spatial and multivariate correlation structure.
Threshold metrics: A threshold metric is an instance of the ThresholdMetric class and needs to be one of four types: exceedance of the specified threshold value ('higher'), underceedance of the threshold value ('lower'), falling within two specified bounds ('between'), or falling outside two specified bounds ('outside'). With the functionalities provided as part of the ThresholdMetric class, the marginal exceedance probability as well as the temporal spell length, the spatial extent and the spatiotemporal cluster size can be analysed. Some threshold metrics are prespecified, and the user can add further metrics in the following way:
>>> frost_days = ThresholdMetric(name="Frost days (tasmin<0°C)", variable="tasmin", threshold_value=273.13, threshold_type="lower")
The following table provides an overview of the different components that can be analysed in each of these two categories:
                 Statistical properties    Threshold metrics
Marginal         x                         x
Temporal                                   x (spell length)
Spatial          x (RMSE)                  x (spatial extent)
Spatiotemporal                             x (cluster size)
Multivariate     x (correlation)           x (joint exceedance)
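The four threshold_type conditions can be sketched in plain Python (a hypothetical re-implementation for illustration only; 'higher' and 'lower' are strict, 'between' includes the bounds and 'outside' excludes them, matching the ThresholdMetric attribute documentation below):

```python
def threshold_condition(value, threshold_type, threshold_value):
    # threshold_value is a single float for "higher"/"lower",
    # and a [lower_bound, upper_bound] pair for "between"/"outside"
    if threshold_type == "higher":
        return value > threshold_value          # strict >
    if threshold_type == "lower":
        return value < threshold_value          # strict <
    lower, upper = threshold_value
    if threshold_type == "between":
        return lower <= value <= upper          # bounds included
    if threshold_type == "outside":
        return value < lower or value > upper   # bounds excluded
    raise ValueError(f"Unknown threshold_type: {threshold_type}")

# frost days: tasmin below 0 degC (273.13 K in the prespecified metric)
is_frost = threshold_condition(270.0, "lower", 273.13)
```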
Within the metrics class, the following functions are available:

calculate_instances_of_threshold_exceedance : Returns an array of the same size as dataset, containing 1 when the threshold condition is met and 0 when not.

filter_threshold_exceedances : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

calculate_exceedance_probability : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_number_annual_days_beyond_threshold : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_spell_length : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

calculate_spatial_extent : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

violinplots_clusters : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
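To illustrate what a spell-length count measures, here is a plain-Python sketch under the assumption that a spell is a maximal run of consecutive timesteps meeting the threshold condition (the ibicus implementation operates on full [time, lat, long] arrays and returns a pd.DataFrame):

```python
def spell_lengths(exceedance_flags, minimum_length=1):
    """Lengths of maximal runs of 1s in a 0/1 time series, keeping runs >= minimum_length."""
    runs, current = [], 0
    for flag in exceedance_flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # close a run that reaches the end of the series
        runs.append(current)
    return [r for r in runs if r >= minimum_length]

# hypothetical occurrence series with two spells, of lengths 3 and 2
lengths = spell_lengths([0, 1, 1, 1, 0, 0, 1, 1, 0])
```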
The AccumulativeThresholdMetric class is a child class of ThresholdMetric that adds functionalities for variables and metrics where the total accumulative amount beyond a given threshold is of interest; this is the case for precipitation, for example, but not for temperature. The following functions are added:

calculate_percent_of_total_amount_beyond_threshold : Calculates the percentage of the total amount beyond the threshold for each location, over all timesteps.

calculate_annual_value_beyond_threshold : Calculates the amount beyond the threshold for each year in the dataset.

calculate_intensity_index : Calculates the amount beyond a threshold divided by the number of instances in which the threshold is exceeded.
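For example, the intensity index (amount beyond the threshold divided by the number of exceedances, cf. the simple precipitation intensity index SDII) can be sketched for a single time series. This is an illustration with hypothetical values, assuming "amount beyond" means the accumulated amount on exceedance timesteps; the package works on full arrays:

```python
def intensity_index(values, threshold):
    """Mean amount on timesteps exceeding the threshold (sketch of an SDII-style index)."""
    exceedances = [v for v in values if v > threshold]
    if not exceedances:
        return 0.0  # threshold never exceeded
    return sum(exceedances) / len(exceedances)

# hypothetical daily precipitation in mm/day with a 1 mm/day wet-day threshold:
# wet days are 5, 11 and 2 mm -> (5 + 11 + 2) / 3 = 6 mm/day
sdii = intensity_index([0.0, 5.0, 0.2, 11.0, 2.0], threshold=1.0)
```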
For the evaluation of marginal properties, see the functions in the ibicus.evaluate.marginal module.

The following functions are available to analyse the bias in the spatial correlation structure:

rmse_spatial_correlation_distribution : Calculates the root mean squared error (RMSE) between the observed and modelled spatial correlation matrix at each location.

rmse_spatial_correlation_boxplot : Boxplot of the RMSE of spatial correlation across locations.

To analyse the multivariate correlation structure, as well as joint threshold exceedances, the following functions are available:

calculate_conditional_joint_threshold_exceedance : Returns a pd.DataFrame containing the location-wise conditional exceedance probability.

plot_conditional_joint_threshold_exceedance : Accepts the output of calculate_conditional_joint_threshold_exceedance and creates an overview boxplot of the conditional exceedance probability across locations.

calculate_and_spatialplot_multivariate_correlation : Calculates the correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs a spatial plot.

plot_correlation_single_location : Uses seaborn.regplot and the output of create_multivariate_dataframes to plot a scatterplot and Pearson correlation estimate of the two specified variables at a single location.

plot_bootstrap_correlation_replicates : Plots histograms of the correlation between variables in the input dataframes, estimated via bootstrap.
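The conditional joint exceedance computed by calculate_conditional_joint_threshold_exceedance, p(Metric1 | Metric2) = p(Metric1, Metric2) / p(Metric2), can be sketched for two 0/1 occurrence series at one location (a pure-Python illustration with hypothetical data, not the package implementation):

```python
def conditional_exceedance_probability(occ1, occ2):
    """p(M1 | M2) = p(M1 and M2) / p(M2) from two equal-length 0/1 occurrence series."""
    n_m2 = sum(occ2)
    if n_m2 == 0:
        return float("nan")  # conditioning event never occurs
    n_joint = sum(a and b for a, b in zip(occ1, occ2))
    return n_joint / n_m2

# hypothetical example: metric 1 occurs jointly with metric 2 on 2 of the
# 3 timesteps on which metric 2 occurs -> p = 2/3
p = conditional_exceedance_probability([1, 0, 1, 1], [1, 1, 0, 1])
```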
3. Investigating whether the climate change trend is preserved
Bias correction methods can significantly modify the trend projected in the climate model simulation (Switanek 2017). If the user does not consider the simulated trend to be credible, then modifying it can be a good thing to do. However, any trend modification should always be a conscious and informed choice, and the belief that a bias correction method will improve the trend should be justified. Otherwise, the trend modification introduced through the application of a bias correction method should be considered an artifact.
This component helps the user assess whether a certain method preserves the climate model trend or not. Some methods implemented in this package are explicitly trend preserving; for more details see the methodologies and descriptions of the individual debiasers.

calculate_future_trend_bias : For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default), as well as any metrics passed as arguments to the function.

plot_future_trend_bias_boxplot : Accepts the output of calculate_future_trend_bias and creates an overview boxplot of the bias in the trend of the metrics.

plot_future_trend_bias_spatial : Accepts the output of calculate_future_trend_bias and creates a spatial plot of the trend bias for one chosen metric.
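The trend-preservation check compares the climate change trend in the raw model against that in the bias corrected model. A minimal sketch for one location and the mean metric, using an additive trend and a percentage bias (hypothetical numbers; not the package's exact computation):

```python
from statistics import mean

def additive_trend(validate_values, future_values):
    """Additive climate change trend: change in the mean from validation to future period."""
    return mean(future_values) - mean(validate_values)

def trend_bias_percent(raw_validate, raw_future, bc_validate, bc_future):
    """Percentage bias of the bias corrected trend relative to the raw model trend."""
    raw_trend = additive_trend(raw_validate, raw_future)
    bc_trend = additive_trend(bc_validate, bc_future)
    return 100.0 * (bc_trend - raw_trend) / raw_trend

# hypothetical tas values in K: the raw model warms by 2 K, the bias corrected
# model by only 1.5 K, i.e. a trend bias of -25 %
bias = trend_bias_percent(
    raw_validate=[285.0, 287.0], raw_future=[287.0, 289.0],
    bc_validate=[284.0, 286.0], bc_future=[285.5, 287.5],
)
```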
ibicus.evaluate.metrics¶
Metrics module: standard metric definitions
 class ibicus.evaluate.metrics.AccumulativeThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')¶
Bases:
ibicus.evaluate.metrics.ThresholdMetric
Class for climate metrics that are defined by thresholds (child class of
ThresholdMetric
), but are accumulative. This mainly concerns precipitation metrics. An example of such a metric is the total precipitation on very wet days (days with > 10 mm precipitation).
 Attributes:
 name
 threshold_type
 threshold_value
 variable
Methods

calculate_annual_value_beyond_threshold(dataset, ...) : Calculates the amount beyond the threshold for each year in the dataset.

calculate_exceedance_probability(dataset) : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_instances_of_threshold_exceedance(dataset) : Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_intensity_index(dataset) : Calculates the amount beyond a threshold divided by the number of instances in which the threshold is exceeded.

calculate_number_annual_days_beyond_threshold(...) : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_percent_of_total_amount_beyond_threshold(dataset) : Calculates the percentage of the total amount beyond the threshold for each location, over all timesteps.

calculate_spatial_extent(**climate_data) : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data) : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...) : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset) : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...) : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
 calculate_annual_value_beyond_threshold(dataset, dates_array, time_func=<function year>)¶
Calculates amount beyond threshold for each year in the dataset.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
dates_array : np.ndarray
    Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification], for example: tas_dates_validate['time_obs'].
time_func : function
    Points to a utils function to extract either days or months.
Returns:
np.ndarray
    3d array [years, lat, long]
 calculate_intensity_index(dataset)¶
Calculates the amount beyond a threshold divided by the number of instance the threshold is exceeded.
Designed to calculate the simple precipitation intensity index but can be used for other variables.
Parameters:
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
 calculate_percent_of_total_amount_beyond_threshold(dataset)¶
Calculates percentage of total amount beyond threshold for each location over all timesteps.
Parameters:
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
Returns:
np.ndarray
    2d array with the percentage of the total amount above the threshold at each location.
 class ibicus.evaluate.metrics.ThresholdMetric(threshold_value, threshold_type, name='unknown', variable='unknown')¶
Bases:
object
Generic climate metric defined by the exceedance or underceedance of a threshold, or by values falling between or outside an upper and lower threshold.
Organises the definition and functionalities of such metrics. Enables the implementation of a subset of the Climdex climate extreme indices (https://www.climdex.org/learn/indices/).
Examples
>>> warm_days = ThresholdMetric(name = 'Mean warm days (K)', variable = 'tas', threshold_value = [295], threshold_type = 'higher')
Attributes:
threshold_value : Union[float, list[float], tuple[float]]
    Threshold value(s) for the variable (in the correct unit). If threshold_type = "higher" or threshold_type = "lower", this is a single float value and the metric is defined as exceedance or underceedance of that value. If threshold_type = "between" or threshold_type = "outside", this needs to be a list of the form [lower_bound, upper_bound] and the metric is defined as falling between, or falling outside, these values.
threshold_type : str
    One of ["higher", "lower", "between", "outside"]. Indicates whether we are interested in values above the threshold value ("higher", strict >), values below the threshold value ("lower", strict <), values between the threshold values ("between", not strict, including the bounds) or values outside the threshold values ("outside", strict, not including the bounds).
name : str = "unknown"
    Metric name. Will be used in dataframes, plots etc. Recommended to include the threshold value and units. Example: 'Frost days (tasmin < 0°C)'. Default: "unknown".
variable : str = "unknown"
    Unique variable that this threshold metric refers to. Example for frost days: tasmin. Default: "unknown".
Methods

calculate_exceedance_probability(dataset) : Returns the probability of metric occurrence (threshold exceedance/underceedance or inside/outside range) at each location, across the entire time period.

calculate_instances_of_threshold_exceedance(dataset) : Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.

calculate_number_annual_days_beyond_threshold(...) : Calculates the number of days beyond the threshold for each year in the dataset.

calculate_spatial_extent(**climate_data) : Returns a pd.DataFrame of spatial extents of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spatiotemporal_clusters(**climate_data) : Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences, for each climate dataset specified in **climate_data.

calculate_spell_length(minimum_length, ...) : Returns a pd.DataFrame of individual spell lengths of metric occurrences, counted across locations, for each climate dataset specified in **climate_data.

filter_threshold_exceedances(dataset) : Returns an array containing the values of dataset where the threshold condition is met and zero where not.

violinplots_clusters(minimum_length, ...) : Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
 calculate_exceedance_probability(dataset)¶
Returns the probability of metrics occurrence (threshold exceedance/underceedance or inside/outside range), at each location (across the entire time period).
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
Returns:
np.ndarray
    Probability of metric occurrence at each location.
 calculate_instances_of_threshold_exceedance(dataset)¶
Returns an array of the same size as dataset containing 1 when the threshold condition is met and 0 when not.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
 calculate_number_annual_days_beyond_threshold(dataset, dates_array, time_func=<function year>)¶
Calculates number of days beyond threshold for each year in the dataset.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
dates_array : np.ndarray
    Array of dates matching the time dimension of dataset. Has to be of the form time_dictionary[time_specification], for example: tas_dates_validate['time_obs'].
time_func : function
    Points to a utils function to extract either days or months.
Returns:
np.ndarray
    3d array [years, lat, long]
 calculate_spatial_extent(**climate_data)¶
Returns a pd.DataFrame of spatial extents of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.
The spatial extent is defined as the percentage of the area where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type), given that it is exceeded at at least one location. The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser), 'Metric' (the name of the threshold metric), and 'Spatial extent (% of area)'.
Parameters:
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of spatial extents of metric occurrences.
Examples
>>> dry_days.calculate_spatial_extent(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
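The spatial-extent definition above can be sketched for a single timestep on a 2d grid of 0/1 occurrence flags (a plain-Python illustration with hypothetical data; the package computes this over full [time, lat, long] arrays and returns a pd.DataFrame):

```python
def spatial_extent(occurrence_grid):
    """Percentage of grid cells where the metric occurs, given it occurs somewhere."""
    flat = [flag for row in occurrence_grid for flag in row]
    if not any(flat):
        return None  # metric does not occur anywhere at this timestep
    return 100.0 * sum(flat) / len(flat)

# the metric occurs in 2 of 4 cells -> spatial extent of 50 % of the area
extent = spatial_extent([[1, 0], [0, 1]])
```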
 calculate_spatiotemporal_clusters(**climate_data)¶
Returns a pd.DataFrame of sizes of individual spatiotemporal clusters of metric occurrences (threshold exceedance/underceedance or inside/outside range), for each climate dataset specified in **climate_data.
A spatiotemporal cluster is defined as a connected set (in time and/or space) where the threshold is exceeded/underceeded or where values are between or outside the bounds (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser), 'Metric' (the name of the threshold metric), and 'Spatiotemporal cluster size'.
Parameters:
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of sizes of individual spatiotemporal clusters of metric occurrences.
Examples
>>> dry_days.calculate_spatiotemporal_clusters(obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
 calculate_spell_length(minimum_length, **climate_data)¶
Returns a pd.DataFrame of individual spell lengths of metric occurrences (threshold exceedance/underceedance or inside/outside range), counted across locations, for each climate dataset specified in **climate_data.
A spell length is defined as the number of days that a threshold is continuously exceeded or underceeded, or where values are continuously between or outside the threshold (depending on self.threshold_type). The output dataframe has three columns: 'Correction Method' (obs/raw or the name of the debiaser as specified in **climate_data), 'Metric' (the name of the threshold metric), and 'Spell length - individual spell length counts'.
Parameters:
minimum_length : int
    Minimum spell length (in days) investigated.
**climate_data
    Keyword arguments providing the input data to investigate.
Returns:
pd.DataFrame
    Dataframe of spell lengths of metric occurrences.
Examples
>>> dry_days.calculate_spell_length(minimum_length = 4, obs = tas_obs_validate, raw = tas_cm_validate, ISIMIP = tas_val_debiased_ISIMIP)
 filter_threshold_exceedances(dataset)¶
Returns an array containing the values of dataset where the threshold condition is met and zero where not.
Parameters:
dataset : np.ndarray
    Input data, either observations or climate projections to be analysed; numeric entries expected.
 violinplots_clusters(minimum_length, **climate_data)¶
Returns three violinplots with the distributions of temporal, spatial and spatiotemporal extents of metric occurrences, comparing all climate datasets specified in **climate_data.
Parameters:
minimum_length : int
    Minimum spell length (in days) investigated for temporal extents.
 name¶
 threshold_type¶
 threshold_value¶
 variable¶
 ibicus.evaluate.metrics.R10mm = AccumulativeThresholdMetric(threshold_value=0.00011574074074074075, threshold_type='higher', name='Very wet days \n (> 10 mm/day)', variable='pr')¶
Very wet days (> 10 mm/day) for pr.
 ibicus.evaluate.metrics.R20mm = AccumulativeThresholdMetric(threshold_value=0.0002314814814814815, threshold_type='higher', name='Extremely wet days \n (> 20 mm/day)', variable='pr')¶
Extremely wet days (> 20 mm/day) for pr.
 ibicus.evaluate.metrics.cold_days = ThresholdMetric(threshold_value=275, threshold_type='lower', name='Mean cold days (K)', variable='tas')¶
Cold days (<275 K) for tas.
 ibicus.evaluate.metrics.dry_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='lower', name='Dry days \n (< 1 mm/day)', variable='pr')¶
Dry days (< 1 mm/day) for pr.
 ibicus.evaluate.metrics.frost_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Frost days \n (tasmin<0°C)', variable='tasmin')¶
Frost days (<0°C) for tasmin.
 ibicus.evaluate.metrics.icing_days = ThresholdMetric(threshold_value=273.13, threshold_type='lower', name='Icing days \n (tasmax<0°C)', variable='tasmax')¶
Icing days (<0°C) for tasmax.
 ibicus.evaluate.metrics.summer_days = ThresholdMetric(threshold_value=298.15, threshold_type='higher', name='Summer days \n (tasmax>25°C)', variable='tasmax')¶
Summer days (>25°C) for tasmax.
 ibicus.evaluate.metrics.tropical_nights = ThresholdMetric(threshold_value=293.13, threshold_type='higher', name='Tropical Nights \n (tasmin>20°C)', variable='tasmin')¶
Tropical Nights (>20°C) for tasmin.
 ibicus.evaluate.metrics.warm_days = ThresholdMetric(threshold_value=295, threshold_type='higher', name='Mean warm days (K)', variable='tas')¶
Warm days (>295K) for tas.
 ibicus.evaluate.metrics.wet_days = AccumulativeThresholdMetric(threshold_value=1.1574074074074073e-05, threshold_type='higher', name='Wet days \n (> 1 mm/day)', variable='pr')¶
Wet days (> 1 mm/day) for pr.
ibicus.evaluate.marginal¶
ibicus.evaluate.multivariate¶
 ibicus.evaluate.multivariate.calculate_and_spatialplot_multivariate_correlation(variables, manual_title=' ', **kwargs)¶
Calculates correlation between the two variables specified in keyword arguments (such as tas and pr) at each location and outputs spatial plot.
Parameters:
variables : list
    List of two variable names; has to be given in the standard form specified in the documentation.
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
**kwargs
    Keyword arguments, each specifying a list of two np.ndarrays containing the two variables of interest.
Examples
>>> multivariate.calculate_and_spatialplot_multivariate_correlation(variables = ['tas', 'pr'], obs = [tas_obs_validate, pr_obs_validate], raw = [tas_cm_validate, pr_cm_validate], ISIMIP = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP])
 ibicus.evaluate.multivariate.calculate_conditional_joint_threshold_exceedance(metric1, metric2, **climate_data)¶
Returns a pd.DataFrame containing the location-wise conditional exceedance probability. Calculates:
\[p(\text{Metric1} \mid \text{Metric2}) = p(\text{Metric1}, \text{Metric2}) / p(\text{Metric2})\]
The output is a pd.DataFrame with three columns: 'Correction Method' (the type of climate data: obs, raw, or the name of the bias correction, given through the keys of **climate_data), 'Compound metric' (a string reading 'Metric1.name given Metric2.name'), and 'Conditional exceedance probability' (a 2d numpy array with the conditional exceedance probability at each location).
Parameters:
metric1 : ThresholdMetric
    First threshold metric of the compound metric; its occurrence probability is conditioned on metric2.
metric2 : ThresholdMetric
    Second threshold metric of the compound metric; the conditioning metric.
**climate_data
    Keyword arguments of the form key = dataset in the validation period (example: 'QM = tas_val_debiased_QM').
 Returns:
 pd.DataFrame
DataFrame with conditional exceedance probability at all locations for the combination of metrics chosen.
Examples
>>> dry_frost_data = calculate_conditional_joint_threshold_exceedance(metric1 = dry_days, metric2 = frost_days, obs = [pr_obs_validate, tasmin_obs_validate], raw = [pr_cm_validate, tasmin_cm_validate], ISIMIP = [pr_val_debiased_ISIMIP, tasmin_val_debiased_ISIMIP])
 ibicus.evaluate.multivariate.create_multivariate_dataframes(variables, datasets_obs, datasets_bc, gridpoint=(0, 0))¶
Helper function creating two joint pd.DataFrames for the two variables specified: one for the observational dataset and one for a bias corrected dataset, at one gridpoint.
Parameters:
variables : list
    List of two variable names; has to be given in the standard form following the CMIP convention.
datasets_obs : list
    List of two observational datasets covering the same period, one for each of the two variables.
datasets_bc : list
    List of two bias corrected datasets covering the same period, one for each of the two variables.
gridpoint : tuple
    Tuple that specifies the location from which the data will be extracted.
Examples
>>> tas_pr_obs, tas_pr_isimip = create_multivariate_dataframes(variables = ['tas', 'pr'], datasets_obs = [tas_obs_validate, pr_obs_validate], datasets_bc = [tas_val_debiased_ISIMIP, pr_val_debiased_ISIMIP], gridpoint = (1,1))
 ibicus.evaluate.multivariate.plot_bootstrap_correlation_replicates(obs_df, bc_df, bc_name, size)¶
Plots histograms of the correlation between the variables in the input dataframes, estimated via bootstrap using _calculate_bootstrap_correlation_replicates().
Parameters:
obs_df : pd.DataFrame
    First element of the output of create_multivariate_dataframes()
bc_df : pd.DataFrame
    Second element of the output of create_multivariate_dataframes()
bc_name : str
    Name of the bias correction method
size : int
    Number of draws in the bootstrapping procedure
Examples
>>> plot_bootstrap_correlation_replicates(obs_df = tas_pr_obs, bc_df = tas_pr_isimip, bc_name = 'ISIMIP', size=500)
 ibicus.evaluate.multivariate.plot_conditional_joint_threshold_exceedance(conditional_exceedance_df)¶
Accepts the output given by calculate_conditional_joint_threshold_exceedance() and creates an overview boxplot of the conditional exceedance probability across locations in the chosen datasets.
Parameters:
conditional_exceedance_df : pd.DataFrame
    Output of calculate_conditional_joint_threshold_exceedance()
 ibicus.evaluate.multivariate.plot_correlation_single_location(variables, obs_df, bc_df)¶
Uses seaborn.regplot and the output of create_multivariate_dataframes() to plot a scatterplot and Pearson correlation estimate of the two specified variables. Offers a visual comparison of the correlation at a single location.
Parameters:
variables : list
    List of variable names; has to be given in the standard form following the CMIP convention.
obs_df : pd.DataFrame
    First element of the output of create_multivariate_dataframes()
bc_df : pd.DataFrame
    Second element of the output of create_multivariate_dataframes()
Examples
>>> plot_correlation_single_location(variables = ['tas', 'pr'], obs_df = tas_pr_obs, bc_df = tas_pr_isimip)
ibicus.evaluate.correlation¶
 ibicus.evaluate.correlation.rmse_spatial_correlation_boxplot(variable, dataset, manual_title=' ')¶
Boxplot of the RMSE of spatial correlation across locations.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : pd.DataFrame
    Output format of the function rmse_spatial_correlation_distribution()
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
 ibicus.evaluate.correlation.rmse_spatial_correlation_distribution(variable, obs_data, **cm_data)¶
Calculates the root mean squared error (RMSE) between the observed and modelled spatial correlation matrix at each location.
The computation involves the following steps: at each location, calculate the correlation to each other location in the observed as well as in the climate model dataset; then calculate the mean squared error between these two matrices.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
obs_data : np.ndarray
    Observational dataset to compare the climate model data against.
**cm_data
    Keyword arguments specifying climate model datasets, for example: QM = tas_debiased_QM
Examples
>>> tas_rmsd_spatial = rmse_spatial_correlation_distribution(variable = 'tas', obs_data = tas_obs_validate, raw = tas_cm_future, QDM = tas_val_debiased_QDM)
ibicus.evaluate.trend¶
 ibicus.evaluate.trend.calculate_future_trend_bias(variable, raw_validate, raw_future, metrics=[], trend_type='additive', remove_outliers=True, **debiased_cms)¶
For each location, calculates the bias in the trend of the bias corrected model compared to the raw climate model for the following metrics: mean, 5% and 95% quantile (default), as well as any metrics passed as arguments to the function.
The trend can be specified as either additive or multiplicative.
The function returns an array with three columns: [Correction Method: str, Metric: str, Relative change bias (%): a 2d np.ndarray containing the trend bias at each location].
Parameters:
variable : str
    Variable name, has to be given in the standard form following the CMIP convention.
raw_validate : np.ndarray
    Raw climate dataset in the validation period
raw_future : np.ndarray
    Raw climate dataset in the future period
metrics : np.ndarray
    1d numpy array of strings containing the keys of the metrics to be analysed. Example: metrics = ['dry', 'wet']
trend_type : str
    Determines whether the additive or multiplicative trend is analysed. Has to be one of ['additive', 'multiplicative']
**debiased_cms
    Keyword arguments given in the format debiaser_name = [debiased_dataset_validation_period, debiased_dataset_future_period]. Example: QM = [tas_val_debiased_QM, tas_future_debiased_QM].
Examples
>>> tas_trend_bias_data = trend.calculate_future_trend_bias(variable = 'tas', raw_validate = tas_cm_validate, raw_future = tas_cm_future, metrics = ['warm_days', 'cold_days'], trend_type = 'additive', QDM = [tas_val_debiased_QDM, tas_fut_debiased_QDM], CDFT = [tas_val_debiased_CDFT, tas_fut_debiased_CDFT])
 ibicus.evaluate.trend.plot_future_trend_bias_boxplot(variable, bias_df, manual_title=' ')¶
Accepts the output given by calculate_future_trend_bias() and creates an overview boxplot of the bias in the trend of the metrics.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
bias_df : pd.DataFrame
    DataFrame with three columns: [Correction Method, Metric, Bias value at each location]
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
 ibicus.evaluate.trend.plot_future_trend_bias_spatial(variable, metric, bias_df, manual_title=' ')¶
Accepts the output given by calculate_future_trend_bias() and creates a spatial plot of the trend bias for one chosen metric.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
metric : str
    Name of the metric whose trend bias is plotted.
bias_df : pd.DataFrame
    DataFrame with three columns: [Correction Method, Metric, Bias value at each location]
manual_title : str
    Optional argument present in all plot functions: manual_title will be used as the title of the plot.
ibicus.evaluate.assumptions¶
 ibicus.evaluate.assumptions.calculate_aic(variable, dataset, *distributions)¶
Calculates the Akaike Information Criterion (AIC) at each location for each of the distributions specified.
Warning
*distributions can currently only contain scipy.stats.rv_continuous objects, and not, as elsewhere, also StatisticalModel objects.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : np.ndarray
    Input data, either observations or a climate projections dataset to be analysed; numeric entries expected.
*distributions : list[scipy.stats.rv_continuous]
    Distributions to be tested; elements are scipy.stats.rv_continuous
Returns:
pd.DataFrame
    DataFrame with all locations, distributions and associated AIC values.
 ibicus.evaluate.assumptions.plot_aic(variable, aic_values, manual_title=' ')¶
Creates a boxplot of AIC values across all locations.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
aic_values : pd.DataFrame
    Pandas dataframe of the type output by calculate_aic.
 ibicus.evaluate.assumptions.plot_fit_worst_aic(variable, dataset, data_type, distribution, nr_bins='auto', aic_values=None, manual_title=' ')¶
Plots a histogram and overlaid fit at the location of worst AIC.
Warning
distribution can currently only be a scipy.stats.rv_continuous object, and not, as elsewhere, also a StatisticalModel object.
Parameters:
variable : str
    Variable name, has to be given in the standard CMIP convention
dataset : np.ndarray
    3d input data [time, lat, long], numeric entries expected. Either observations or a climate projections dataset to be analysed.
data_type : str
    Data type analysed: can be observational data or raw/debiased climate model data. Used to generate the title only.
distribution : scipy.stats.rv_continuous
    Distribution providing the fit to the data
nr_bins : Union[int, str] = "auto"
    Number of bins used for the histogram. Either int or "auto" (default).
aic_values : Optional[pd.DataFrame] = None
    Pandas dataframe of the type output by calculate_aic. If None, the AIC values are recalculated.
 ibicus.evaluate.assumptions.plot_quantile_residuals(variable, dataset, distribution, data_type, manual_title=' ')¶
Plots the timeseries and autocorrelation function of quantile residuals, as well as a QQ-plot of normalized quantile residuals, at one location.
Parameters:
variable : str
    Variable name, has to be given in the standard form specified in the documentation.
dataset : np.ndarray
    1d numpy array. Input data, either observations or a climate projections dataset at one location; numeric entries expected.
distribution : scipy.stats.rv_continuous
    Name of the distribution analysed, used for the title only.
data_type : str
    Data type analysed: can be observational data or raw/debiased climate model data. Used to generate the title only.
Examples
>>> tas_obs_plot_gof = assumptions.plot_quantile_residuals(variable = 'tas', dataset = tas_obs[:,0,0], distribution = scipy.stats.norm, data_type = 'observation data')