ibicus.utils module

utils-module: provides functionality to help with data processing, debiasing and evaluation.

ibicus.utils Convert variables

ibicus.utils.get_tasrange(tasmin, tasmax)

Calculates numpy array of tasrange from arrays of tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tasmin: np.ndarray

Numpy array of tasmin-values.

tasmax: np.ndarray

Numpy array of tasmax-values.

Returns:
tasrange: np.ndarray

Numpy array of tasrange values

ibicus.utils.get_tasskew(tas, tasmin, tasmax)

Calculates numpy array of tasskew from arrays of tas, tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tas: np.ndarray

Numpy array of tas-values.

tasmin: np.ndarray

Numpy array of tasmin-values.

tasmax: np.ndarray

Numpy array of tasmax-values.

Returns:
tasskew: np.ndarray

Numpy array of tasskew values
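
As a minimal sketch of the two formulas above, the following recomputes tasrange and tasskew directly with NumPy. It does not call ibicus, and the sample temperatures are made up for illustration.

```python
import numpy as np

# Made-up daily temperatures in Kelvin (illustrative values only).
tas = np.array([280.0, 285.0, 290.0])
tasmin = np.array([275.0, 281.0, 284.0])
tasmax = np.array([285.0, 291.0, 292.0])

# tasrange = tasmax - tasmin
tasrange = tasmax - tasmin

# tasskew = (tas - tasmin) / tasrange
tasskew = (tas - tasmin) / tasrange
```

All three arrays share one shape, so the operations are element-wise, which is why the inputs need values at the same timesteps and locations.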

ibicus.utils.get_tasmin(tas, tasrange, tasskew)

Calculates numpy array of tasmin from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tas: np.ndarray

Numpy array of tas-values.

tasrange: np.ndarray

Numpy array of tasrange-values.

tasskew: np.ndarray

Numpy array of tasskew-values.

Returns:
tasmin: np.ndarray

Numpy array of tasmin values

ibicus.utils.get_tasmax(tas, tasrange, tasskew)

Calculates numpy array of tasmax from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tas: np.ndarray

Numpy array of tas-values.

tasrange: np.ndarray

Numpy array of tasrange-values.

tasskew: np.ndarray

Numpy array of tasskew-values.

Returns:
tasmax: np.ndarray

Numpy array of tasmax values

ibicus.utils.get_tasmin_tasmax(tas, tasrange, tasskew)

Calculates numpy arrays of both tasmin and tasmax from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tas: np.ndarray

Numpy array of tas-values.

tasrange: np.ndarray

Numpy array of tasrange-values.

tasskew: np.ndarray

Numpy array of tasskew-values.

Returns:
tasmin: np.ndarray

Numpy array of tasmin values

tasmax: np.ndarray

Numpy array of tasmax values
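
Solving the two formulas for tasmin and tasmax gives tasmin = tas - tasskew * tasrange and tasmax = tasmin + tasrange. The sketch below applies this inversion with plain NumPy; it is derived from the formulas rather than taken from the ibicus source, and the input values are made up.

```python
import numpy as np

# Made-up inputs (illustrative values only).
tas = np.array([280.0, 285.0, 290.0])
tasrange = np.array([10.0, 10.0, 8.0])
tasskew = np.array([0.5, 0.4, 0.75])

# From tasskew = (tas - tasmin) / tasrange:
tasmin = tas - tasskew * tasrange

# From tasrange = tasmax - tasmin:
tasmax = tasmin + tasrange
```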

ibicus.utils.get_tasrange_tasskew(tas, tasmin, tasmax)

Calculates numpy arrays of both tasrange and tasskew from arrays of tas, tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
Parameters:
tas: np.ndarray

Numpy array of tas-values.

tasmin: np.ndarray

Numpy array of tasmin-values.

tasmax: np.ndarray

Numpy array of tasmax-values.

Returns:
tasrange: np.ndarray

Numpy array of tasrange values

tasskew: np.ndarray

Numpy array of tasskew values

ibicus.utils.get_prsnratio(pr, prsn)

Calculates numpy array of prsnratio from arrays of pr and prsn.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
Parameters:
pr: np.ndarray

Numpy array of pr-values.

prsn: np.ndarray

Numpy array of prsn-values.

Returns:
prsnratio: np.ndarray

Numpy array of prsnratio values

ibicus.utils.get_pr(prsn, prsnratio)

Calculates numpy array of pr from arrays of prsn and prsnratio.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
Parameters:
prsn: np.ndarray

Numpy array of prsn-values.

prsnratio: np.ndarray

Numpy array of prsnratio-values.

Returns:
pr: np.ndarray

Numpy array of pr values

ibicus.utils.get_prsn(pr, prsnratio)

Calculates numpy array of prsn from arrays of pr and prsnratio.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
Parameters:
pr: np.ndarray

Numpy array of pr-values.

prsnratio: np.ndarray

Numpy array of prsnratio-values.

Returns:
prsn: np.ndarray

Numpy array of prsn values
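
The prsnratio formula and its inversion can be sketched with plain NumPy as follows. This does not call ibicus. How ibicus treats timesteps with pr == 0 is not stated here, so leaving NaN there is an assumption of this sketch, and the sample values are made up.

```python
import numpy as np

# Made-up precipitation fluxes (illustrative values only).
pr = np.array([0.0, 2.0, 4.0])     # total precipitation
prsn = np.array([0.0, 0.5, 4.0])   # snowfall

# prsnratio = prsn / pr. The ratio is undefined where pr == 0;
# this sketch leaves NaN there (an assumption, not ibicus' behaviour).
prsnratio = np.full_like(pr, np.nan)
np.divide(prsn, pr, out=prsnratio, where=pr > 0)

# Inverting the formula: prsn = pr * prsnratio (0 on dry timesteps).
prsn_back = np.where(pr > 0, pr * prsnratio, 0.0)
```

The other inversion, pr = prsn / prsnratio, is only defined where prsnratio is non-zero.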

ibicus.utils StatisticalModel abstract-class

class ibicus.utils.StatisticalModel

Abstract functionality to wrap an arbitrary statistical model given by a fit-method, a cdf and a ppf.

This can be used to pass a self-defined model to a debiaser, which is then fitted and used at each location. In principle this is similar to scipy.stats.rv_continuous, but the user has the option to provide their own fit-method. It can therefore represent a broader class of statistical models.

Methods

cdf(x, *fit, **kwargs)

Returns cdf-values of a vector x for the cdf of a statistical model.

fit(data, **kwargs)

Fits a statistical model and returns parameter estimates.

ppf(q, *fit, **kwargs)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.

Examples

gen_PrecipitationHurdleModel is a child-class of StatisticalModel used to generate a precipitation hurdle model. For example if we want to use a precipitation hurdle model in Quantile mapping, assuming a generalised gamma distribution for amounts we can do:

>>> hurdle_model = gen_PrecipitationHurdleModel(distribution = scipy.stats.gengamma)
>>> debiaser = QuantileMapping.from_variable("pr", distribution = hurdle_model)

Warning

This is an advanced feature and requires some knowledge of how the debiaser and the statistical model passed to it work. For example, CDFt does not take a model as a parameter, and in ISIMIP pr values are split into zero and non-zero values prior to fitting, so statistical models used there do not need to account for zeros.

abstract fit(data, **kwargs)

Fits a statistical model and returns parameter estimates.

Parameters:
data: np.ndarray

Array containing values to which the model is fitted.

Returns:
tuple

Tuple containing parameter estimates.

abstract cdf(x, *fit, **kwargs)

Returns cdf-values of a vector x for the cdf of a statistical model.

Parameters:
x: np.ndarray

Values for which the cdf shall be evaluated.

fit: tuple

Parameters controlling the model fit. Return value of fit.

Returns:
np.ndarray

Array containing cdf-values for x.

abstract ppf(q, *fit, **kwargs)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.

Parameters:
q: np.ndarray

Values for which the ppf shall be evaluated.

fit: tuple

Parameters controlling the model fit. Return value of fit.

Returns:
np.ndarray

Array containing ppf-values for q.
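
To illustrate the fit/cdf/ppf interface, here is a minimal sketch of a self-defined exponential model. ExponentialModel is a hypothetical name. In real use the class would presumably subclass ibicus.utils.StatisticalModel; it is written as a plain class here so that the sketch runs without ibicus installed.

```python
import numpy as np


class ExponentialModel:
    """Hypothetical model following the fit/cdf/ppf interface described above."""

    def fit(self, data, **kwargs):
        # Maximum-likelihood estimate of the exponential rate: 1 / mean.
        return (1.0 / np.mean(data),)

    def cdf(self, x, *fit, **kwargs):
        (rate,) = fit
        return 1.0 - np.exp(-rate * np.asarray(x, dtype=float))

    def ppf(self, q, *fit, **kwargs):
        (rate,) = fit
        return -np.log1p(-np.asarray(q, dtype=float)) / rate


model = ExponentialModel()
params = model.fit(np.array([0.5, 1.0, 1.5, 2.0]))  # tuple of estimates

x = np.array([0.1, 1.0, 3.0])
# The tuple returned by fit is unpacked into cdf and ppf, matching *fit.
roundtrip = model.ppf(model.cdf(x, *params), *params)
```

Applying ppf to the output of cdf recovers x, which is the property quantile mapping relies on.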

ibicus.utils gen_PrecipitationIgnoreZeroValuesModel-class

class ibicus.utils.gen_PrecipitationIgnoreZeroValuesModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None})

Represents a precipitation model where zero values are ignored and a cdf is only fitted to amounts.

In the cdf zero values are mapped to -np.inf and in the inverse cdf values of -np.inf are mapped to zero.

PrecipitationGammaModelIgnoreZeroValues is a concrete precipitation model ignoring zero-values with a gamma distribution for amounts.

Parameters:
distribution: scipy.stats.rv_continuous

Distribution assumed for the precipitation amounts.

Methods

cdf(x, *fit)

Returns cdf-values for the precipitation amounts and -np.inf for zero.

fit(data)

Fits a precipitation model to the amounts (ignoring zero values).

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.

fit(data)

Fits a precipitation model to the amounts (ignoring zero values).

Parameters:
data: np.ndarray

Array containing precipitation values.

Returns:
tuple

Tuple containing parameter estimates for the amounts-distribution.

cdf(x, *fit)

Returns cdf-values for the precipitation amounts and -np.inf for zero.

Parameters:
x: np.ndarray

Values for which the cdf shall be evaluated.

fit: tuple

Parameters controlling the amounts model. Return value of fit.

Returns:
np.ndarray

Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.

Parameters:
q: np.ndarray

Values for which the ppf shall be evaluated.

fit: tuple

Parameters controlling the amounts model. Return value of fit.

Returns:
np.ndarray

Array containing ppf-values for q.
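
The zero handling described for this class (zeros map to -np.inf in the cdf, and -np.inf maps back to zero in the ppf) can be sketched as follows. This is not the ibicus implementation: the function names are hypothetical, and an exponential distribution stands in for the amounts distribution so that the example needs only NumPy.

```python
import numpy as np


def cdf_ignore_zeros(x, rate):
    """Exponential cdf for amounts, -inf for zero values."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, -np.inf)
    wet = x != 0
    out[wet] = 1.0 - np.exp(-rate * x[wet])
    return out


def ppf_ignore_zeros(q, rate):
    """Exponential ppf for finite q, zero for q == -inf."""
    q = np.asarray(q, dtype=float)
    out = np.zeros(q.shape)
    finite = np.isfinite(q)
    out[finite] = -np.log1p(-q[finite]) / rate
    return out


pr = np.array([0.0, 1.0, 0.0, 2.5])

# Fit on the amounts only, ignoring zeros (exponential MLE: 1 / mean).
rate = 1.0 / pr[pr != 0].mean()

q = cdf_ignore_zeros(pr, rate)
pr_back = ppf_ignore_zeros(q, rate)
```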

ibicus.utils gen_PrecipitationHurdleModel-class

class ibicus.utils.gen_PrecipitationHurdleModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None}, cdf_randomization=True)

Represents a precipitation hurdle model.

A hurdle model is a two-step process: first, a Bernoulli draw determines whether it rains (with probability \(p_0\) of no rain); then the amounts on rainy days are assumed to follow a given distribution (often gamma) described by a cdf \(F_A\). Mathematically:

\[P(X = 0) = p_0,\]
\[P(X \leq x) = p_0 + (1-p_0) \cdot F_A(x) \quad \text{for } x > 0\]
Attributes:
distribution: scipy.stats.rv_continuous

Distribution assumed for the precipitation amounts.

cdf_randomization: bool

Whether cdf-values for x == 0 (no rain) shall be randomized uniformly within (0, p0). Helps for quantile mapping and controlling the zero-inflation.

Methods

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model.

fit(data)

Fits a precipitation hurdle model and returns parameter estimates.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.

fit(data)

Fits a precipitation hurdle model and returns parameter estimates.

Parameters:
data: np.ndarray

Array containing precipitation values.

Returns:
tuple

Tuple containing parameter estimates: (p0, tuple of parameter estimates for the amounts-distribution).

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model. If self.cdf_randomization = True then cdf-values for x == 0 (no rain) are randomized between (0, p0).

Parameters:
x: np.ndarray

Values for which the cdf shall be evaluated.

fit: tuple

Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.

Returns:
np.ndarray

Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.

Parameters:
q: np.ndarray

Values for which the ppf shall be evaluated.

fit: tuple

Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.

Returns:
np.ndarray

Array containing ppf-values for q.
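
A hurdle cdf and ppf with randomization of the zeros can be sketched as below. This is a simplified illustration under stated assumptions, not the ibicus implementation: the function names are hypothetical, an exponential distribution stands in for the amounts distribution, and p0 is estimated as the fraction of zero values.

```python
import numpy as np

rng = np.random.default_rng(0)


def hurdle_cdf(x, p0, rate, randomize=True):
    """p0 + (1 - p0) * F_A(x) for x > 0; zeros optionally randomized in [0, p0)."""
    x = np.asarray(x, dtype=float)
    out = p0 + (1.0 - p0) * (1.0 - np.exp(-rate * x))
    dry = x == 0
    if randomize:
        out[dry] = rng.uniform(0.0, p0, size=dry.sum())
    return out


def hurdle_ppf(q, p0, rate):
    """Zero for q <= p0, otherwise the amounts ppf at the rescaled quantile."""
    q = np.asarray(q, dtype=float)
    out = np.zeros(q.shape)
    wet = q > p0
    out[wet] = -np.log1p(-(q[wet] - p0) / (1.0 - p0)) / rate
    return out


pr = np.array([0.0, 0.0, 1.0, 3.0])
p0 = np.mean(pr == 0)              # estimated probability of no rain
rate = 1.0 / pr[pr > 0].mean()     # exponential MLE on the amounts

q = hurdle_cdf(pr, p0, rate)
pr_back = hurdle_ppf(q, p0, rate)
```

Without randomization every dry value would receive the same cdf-value p0; spreading them over [0, p0) avoids a point mass in the quantiles while the ppf still maps all of them back to zero.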

ibicus.utils gen_PrecipitationGammaLeftCensoredModel-class

class ibicus.utils.gen_PrecipitationGammaLeftCensoredModel(censoring_threshold=0.1, censor_in_ppf=True)

Represents a left censored precipitation gamma model.

A left censored gamma model is a gamma distribution where all values under a given threshold are censored, i.e. not observed. Those values are represented by zero. This is useful when a slightly higher threshold is used to account for the drizzle effect in climate models.

In the cdf, all values below the censoring threshold are first randomized within (0, censoring_threshold) before the gamma cdf is evaluated. In the ppf, values below the censoring_threshold are set to zero again. This handles possible inflation at the censoring value in quantile mapping.

Attributes:
censoring_threshold: float

Value under which observations are censored.

censor_in_ppf: bool

Whether values under the threshold are censored (set to zero) in the ppf.

Methods

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model.

fit(data)

Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model.

fit(data)

Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.

Parameters:
data: np.ndarray

Data on which to fit the censored gamma distribution.

Returns:
tuple

Parameter estimates for the gamma distribution.

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model. Values x below the censoring value (mainly zeros) are first randomized between (0, censoring_threshold) before calculating the gamma-cdf.

Parameters:
x: np.ndarray

Values for which the cdf shall be evaluated.

fit: tuple

Parameters controlling the censored gamma-distribution: shape and scale

Returns:
np.ndarray

Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model. Values generated by the gamma ppf below the censoring value are set to zero.

Parameters:
q: np.ndarray

Values for which the ppf shall be evaluated.

fit: tuple

Parameters controlling the censored gamma-distribution: shape and scale

Returns:
np.ndarray

Array containing ppf-values for q.
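
The two threshold-handling steps described for this class can be sketched in isolation. The helper names below are hypothetical and the gamma cdf and ppf themselves are omitted; in the model, the first step would run before the gamma cdf is evaluated and the second after the gamma ppf.

```python
import numpy as np

rng = np.random.default_rng(0)
censoring_threshold = 0.1


def randomize_censored(x, threshold):
    """Replace values below the threshold by uniform draws from [0, threshold)."""
    x = np.array(x, dtype=float)  # copy so the input is not modified
    censored = x < threshold
    x[censored] = rng.uniform(0.0, threshold, size=censored.sum())
    return x


def censor(values, threshold):
    """Set values below the threshold to zero."""
    values = np.asarray(values, dtype=float)
    return np.where(values < threshold, 0.0, values)


pr = np.array([0.0, 0.05, 0.4, 2.0])
randomized = randomize_censored(pr, censoring_threshold)
censored_again = censor(randomized, censoring_threshold)
```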

ibicus.utils Mathematical helpers

ibicus.utils.ecdf(x, y, method='step_function')

Return the values of the empirical CDF of x evaluated at y.

Three methods exist, selected by the method argument.

  1. method = "kernel_density": A kernel density estimate of the ecdf is used, using scipy.stats.rv_histogram.

  2. method = "linear_interpolation": Linear interpolation is used, starting from a grid of CDF-values.

  3. method = "step_function": The classical step-function.
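
The step-function variant listed above can be sketched in a few lines of NumPy. ecdf_step is a hypothetical helper, not the ibicus function, and the convention used here (the fraction of x-values less than or equal to each y) is an assumption about the classical definition rather than a statement about the ibicus source.

```python
import numpy as np


def ecdf_step(x, y):
    """Fraction of values in x that are <= each value in y."""
    x_sorted = np.sort(x)
    # side="right" counts the elements of x that are <= y.
    return np.searchsorted(x_sorted, y, side="right") / len(x_sorted)


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, 2.0, 2.5, 4.0])
values = ecdf_step(x, y)  # array([0. , 0.5, 0.5, 1. ])
```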

Parameters:
x: np.ndarray

Array containing values with which the empirical cdf is defined.

y: np.ndarray

Array containing values on which the empirical cdf is evaluated.

method: str

Method with which the ecdf is calculated. One of ["kernel_density", "linear_interpolation", "step_function"].

Returns:
np.ndarray

Values of the empirical cdf of x evaluated at y.

Examples

>>> x = np.random.random(1000)
>>> y = np.random.random(100)
>>> ecdf(x, y)

ibicus.utils.iecdf(x, p, method='inverted_cdf', **kwargs)

Return the values of the inverse empirical CDF of x evaluated at p.

The call is delegated to np.quantile() with the method-argument determining what method is used.
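
Since the call is delegated to np.quantile(), the default behaviour can be illustrated with np.quantile directly. This sketch assumes NumPy 1.22 or later, where the method keyword is available; the sample values are made up.

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 2.0])
p = np.array([0.0, 0.25, 0.5, 1.0])

# "inverted_cdf" returns observed values of x rather than interpolating.
values = np.quantile(x, p, method="inverted_cdf")
```

With this method every returned quantile is one of the observed values in x, which is what distinguishes it from NumPy's default linear interpolation.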

Parameters:
x: np.ndarray

Array containing values with which the inverse empirical cdf is defined.

p: np.ndarray

Array containing values in [0, 1] for which the inverse empirical cdf is evaluated.

method: str

Method string for np.quantile().

**kwargs

Passed to np.quantile().

Returns:
array

Values of the inverse empirical cdf of x evaluated at p.

Examples

>>> x = np.random.normal(size = 1000)
>>> p = np.linspace(0, 1, 100)
>>> iecdf(x, p)

ibicus.utils.quantile_map_non_parametically(x, y, vals, ecdf_method='step_function', iecdf_method='inverted_cdf', **kwargs)

Quantile maps a vector of values vals using empirical distributions defined by vectors x and y. Quantiles of the values in vals are first found using the ecdf of the values in x. Afterwards they are transformed onto y using the empirical inverse cdf of y.

Parameters:
x: np.ndarray

Values defining an empirical distribution with whose ecdf the quantiles are transformed.

y: np.ndarray

Values defining an empirical distribution with whose iecdf the quantiles are transformed.

vals: np.ndarray

Values to quantile map non-parametrically.

ecdf_method: str

Method to use for the ecdf (transformation of x). Passed to ecdf.

iecdf_method: str

Method to use for the iecdf (transformation of the quantiles). Passed to iecdf.

**kwargs:

Passed to iecdf.
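
The two-step procedure (ecdf of x, then inverse ecdf of y) can be sketched with NumPy as follows. quantile_map_sketch is a hypothetical helper combining a step-function ecdf with np.quantile(method="inverted_cdf"); it mirrors the default arguments shown in the signature but is not the ibicus implementation, and it assumes NumPy 1.22 or later.

```python
import numpy as np


def quantile_map_sketch(x, y, vals):
    # Step 1: quantiles of vals under the empirical cdf of x.
    q = np.searchsorted(np.sort(x), vals, side="right") / len(x)
    # Step 2: map those quantiles onto y with the inverse empirical cdf of y.
    return np.quantile(y, q, method="inverted_cdf")


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])
mapped = quantile_map_sketch(x, y, np.array([2.0, 4.0]))  # array([20., 40.])
```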

ibicus.utils Logging

ibicus.utils.get_library_logger()

Returns the library logger used by the ibicus package.

ibicus.utils.get_verbosity_library_logger()

Returns the verbosity/level for the library logger as int.

ibicus.utils.set_verbosity_library_logger(verbosity)

Sets the verbosity/level for the library logger.

Parameters:
verbosity

Logging level, e.g. logging.INFO, logging.WARNING, logging.ERROR, ...
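
The levels are the integer constants of Python's standard logging module. The sketch below shows the analogous standard-library operations on a stand-in logger; the logger name "my_library" is hypothetical and is not the name ibicus uses, and ibicus itself is not called.

```python
import logging

# Stand-in logger (hypothetical name) to illustrate logging levels.
logger = logging.getLogger("my_library")

# Only messages of level WARNING and above are emitted afterwards.
logger.setLevel(logging.WARNING)

# The level is reported as an int, as get_verbosity_library_logger does.
level = logger.getEffectiveLevel()
```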