ibicus.utils module

utils-module: provides functionality to help with data processing, debiasing and evaluation.

ibicus.utils Convert variables

ibicus.utils.get_tasrange(tasmin, tasmax)

Calculates numpy array of tasrange from arrays of tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tasmin: np.ndarray: Numpy array of tasmin-values.
tasmax: np.ndarray: Numpy array of tasmax-values.

Returns:

tasrange: np.ndarray: Numpy array of tasrange values

ibicus.utils.get_tasskew(tas, tasmin, tasmax)

Calculates numpy array of tasskew from arrays of tas, tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tas: np.ndarray: Numpy array of tas-values.
tasmin: np.ndarray: Numpy array of tasmin-values.
tasmax: np.ndarray: Numpy array of tasmax-values.

Returns:

tasskew: np.ndarray: Numpy array of tasskew values

ibicus.utils.get_tasmin(tas, tasrange, tasskew)

Calculates numpy array of tasmin from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tas: np.ndarray: Numpy array of tas-values.
tasrange: np.ndarray: Numpy array of tasrange-values.
tasskew: np.ndarray: Numpy array of tasskew-values.

Returns:

tasmin: np.ndarray: Numpy array of tasmin values

ibicus.utils.get_tasmax(tas, tasrange, tasskew)

Calculates numpy array of tasmax from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tas: np.ndarray: Numpy array of tas-values.
tasrange: np.ndarray: Numpy array of tasrange-values.
tasskew: np.ndarray: Numpy array of tasskew-values.

Returns:

tasmax: np.ndarray: Numpy array of tasmax values

ibicus.utils.get_tasmin_tasmax(tas, tasrange, tasskew)

Calculates numpy arrays of both tasmin and tasmax from arrays of tas, tasrange and tasskew.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tas: np.ndarray: Numpy array of tas-values.
tasrange: np.ndarray: Numpy array of tasrange-values.
tasskew: np.ndarray: Numpy array of tasskew-values.

Returns:

tasmin: np.ndarray: Numpy array of tasmin values
tasmax: np.ndarray: Numpy array of tasmax values

ibicus.utils.get_tasrange_tasskew(tas, tasmin, tasmax)

Calculates numpy arrays of both tasrange and tasskew from arrays of tas, tasmin and tasmax.

All input arrays need to have values at the same timesteps and locations.

Formulas:

\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]

\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]

Parameters:

tas: np.ndarray: Numpy array of tas-values.
tasmin: np.ndarray: Numpy array of tasmin-values.
tasmax: np.ndarray: Numpy array of tasmax-values.

Returns:

tasrange: np.ndarray: Numpy array of tasrange values
tasskew: np.ndarray: Numpy array of tasskew values
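
The two formulas shared by the functions above can be solved for tasmin and tasmax, which is what makes the conversion reversible. The following is a minimal sketch in plain NumPy with made-up temperature values. It does not call the ibicus functions and is not taken from their implementation.

```python
import numpy as np

# Hypothetical daily temperatures in Kelvin at three timesteps.
tas = np.array([283.0, 285.5, 279.0])
tasmin = np.array([278.0, 281.0, 275.0])
tasmax = np.array([290.0, 291.0, 283.0])

# Forward transformation, following the two formulas above.
tasrange = tasmax - tasmin
tasskew = (tas - tasmin) / tasrange

# Inverse transformation, obtained by solving the formulas for tasmin and tasmax.
tasmin_back = tas - tasskew * tasrange
tasmax_back = tasmin_back + tasrange
```

Because tasskew divides by tasrange, the sketch assumes tasmax is strictly greater than tasmin at every timestep and location.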

ibicus.utils.get_prsnratio(pr, prsn)

Calculates numpy array of prsnratio from arrays of pr and prsn.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]

Parameters:

pr: np.ndarray: Numpy array of pr-values.
prsn: np.ndarray: Numpy array of prsn-values.

Returns:

prsnratio: np.ndarray: Numpy array of prsnratio values

ibicus.utils.get_pr(prsn, prsnratio)

Calculates numpy array of pr from arrays of prsn and prsnratio.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]

Parameters:

prsn: np.ndarray: Numpy array of prsn-values.
prsnratio: np.ndarray: Numpy array of prsnratio-values.

Returns:

pr: np.ndarray: Numpy array of pr values

ibicus.utils.get_prsn(pr, prsnratio)

Calculates numpy array of prsn from arrays of pr and prsnratio.

All input arrays need to have values at the same timesteps and locations.

Formula:

\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]

Parameters:

pr: np.ndarray: Numpy array of pr-values.
prsnratio: np.ndarray: Numpy array of prsnratio-values.

Returns:

prsn: np.ndarray: Numpy array of prsn values
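
The prsnratio formula can be illustrated with a short NumPy sketch. The values are made up, and the handling of dry timesteps (pr equal to zero) is an assumption of this sketch, not a statement about what the ibicus functions do in that case.

```python
import numpy as np

# Hypothetical total precipitation and snowfall at four timesteps.
pr = np.array([0.0, 2.0, 5.0, 4.0])
prsn = np.array([0.0, 0.5, 5.0, 0.0])

# Ratio of snowfall to total precipitation.
# Assumption: the ratio is set to 0 where pr == 0 to avoid dividing by zero.
prsnratio = np.divide(prsn, pr, out=np.zeros_like(prsn), where=pr > 0)

# Solving the formula for prsn recovers the snowfall everywhere.
prsn_back = pr * prsnratio

# Solving the formula for pr only works where prsnratio > 0;
# elsewhere pr cannot be recovered and is left as NaN in this sketch.
pr_back = np.divide(prsn, prsnratio, out=np.full_like(prsn, np.nan), where=prsnratio > 0)
```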

ibicus.utils StatisticalModel abstract-class

class ibicus.utils.StatisticalModel

Abstract functionality to wrap an arbitrary statistical model given by a fit-method, a cdf and a ppf.

This can be used to pass a self-defined model to a debiaser, that is then fitted and used at each location. In principle this is similar to scipy.stats.rv_continuous, however the user has the option to provide an own fit-method. Thus this is able to represent a broader class of statistical models.

Methods

`cdf`(x, *fit, **kwargs)	Returns cdf-values of a vector x for the cdf of a statistical model.
`fit`(data, **kwargs)	Fits a statistical model and returns parameter estimates.
`ppf`(q, *fit, **kwargs)	Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.

Examples

gen_PrecipitationHurdleModel is a child class of StatisticalModel used to generate a precipitation hurdle model. For example, to use a precipitation hurdle model in quantile mapping, assuming a generalised gamma distribution for the amounts, we can do:

>>> hurdle_model = gen_PrecipitationHurdleModel(distribution = scipy.stats.gengamma)
>>> debiaser = QuantileMapping.from_variable("pr", distribution = hurdle_model)

Warning

This is an advanced feature and requires some knowledge of the workings of the debiaser and the statistical model passed/fitted. For example, CDFt does not require a model as a parameter, and in ISIMIP pr values are split into zero and non-zero values prior to fitting, so statistical models used there do not need to account for the zero values.

abstractmethod fit(data, **kwargs)

Fits a statistical model and returns parameter estimates.

Parameters:

data: np.ndarray: Array containing the values on which the model is fitted.

Returns:

tuple: Tuple containing parameter estimates.

abstractmethod cdf(x, *fit, **kwargs)

Returns cdf-values of a vector x for the cdf of a statistical model.

Parameters:

x: np.ndarray: Values for which the cdf shall be evaluated.
fit: tuple: Parameters controlling the model fit. Return value of fit.

Returns:

np.ndarray: Array containing cdf-values for x.

abstractmethod ppf(q, *fit, **kwargs)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.

Parameters:

q: np.ndarray: Values for which the ppf shall be evaluated.
fit: tuple: Parameters controlling the model fit. Return value of fit.

Returns:

np.ndarray: Array containing ppf-values for q.
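
To make the fit/cdf/ppf contract concrete, here is a minimal sketch of a model exposing the three methods with the signatures listed above. The class name ExponentialModel and the choice of an exponential distribution are hypothetical. The class deliberately does not subclass ibicus.utils.StatisticalModel so that the snippet runs without ibicus installed; a model passed to a debiaser would presumably inherit from it.

```python
import numpy as np


class ExponentialModel:
    """Hypothetical model exposing the fit/cdf/ppf interface described above."""

    def fit(self, data, **kwargs):
        # The maximum-likelihood estimate of the exponential scale is the sample mean.
        return (float(np.mean(data)),)

    def cdf(self, x, *fit, **kwargs):
        # The parameter tuple returned by fit is unpacked into positional arguments.
        (scale,) = fit
        return 1.0 - np.exp(-np.asarray(x, dtype=float) / scale)

    def ppf(self, q, *fit, **kwargs):
        (scale,) = fit
        return -scale * np.log1p(-np.asarray(q, dtype=float))


model = ExponentialModel()
data = np.array([0.5, 1.0, 1.5, 3.0])
params = model.fit(data)            # tuple of parameter estimates
q = model.cdf(data, *params)        # quantiles of the data under the fitted model
roundtrip = model.ppf(q, *params)   # the ppf inverts the cdf
```

The key point is that fit returns a tuple and that cdf and ppf receive this tuple unpacked via `*fit`, so the same fitted parameters can be reused for both directions of a quantile mapping.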

ibicus.utils gen_PrecipitationIgnoreZeroValuesModel-class

class ibicus.utils.gen_PrecipitationIgnoreZeroValuesModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None})

Represents a precipitation model where zero values are ignored and a cdf is only fitted to amounts.

In the cdf zero values are mapped to -np.inf and in the inverse cdf values of -np.inf are mapped to zero.

PrecipitationGammaModelIgnoreZeroValues is a concrete precipitation model ignoring zero-values with a gamma distribution for amounts.

Parameters:

distribution: scipy.stats.rv_continuous: Distribution assumed for the precipitation amounts.

Methods

`cdf`(x, *fit)	Returns cdf-values for the precipitation amounts and -np.inf for zero.
`fit`(data)	Fits a precipitation model to the amounts (ignoring zero values).
`ppf`(q, *fit)	Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.

fit(data)

Fits a precipitation model to the amounts (ignoring zero values).

Parameters:

data: np.ndarray: Array containing precipitation values.

Returns:

tuple: Tuple containing parameter estimates for the amounts-distribution.

cdf(x, *fit)

Returns cdf-values for the precipitation amounts and -np.inf for zero.

Parameters:

x: np.ndarray: Values for which the cdf shall be evaluated.
fit: tuple: Parameters controlling the amounts model. Return value of fit.

Returns:

np.ndarray: Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.

Parameters:

q: np.ndarray: Values for which the ppf shall be evaluated.
fit: tuple: Parameters controlling the amounts model. Return value of fit.

Returns:

np.ndarray: Array containing ppf-values for q.
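
The zero-handling convention described above (zeros map to -np.inf in the cdf, and -np.inf maps back to zero in the ppf) can be sketched as follows. The function names are hypothetical and the amounts are modelled with an exponential distribution purely to keep the example NumPy-only; the class itself defaults to a gamma distribution.

```python
import numpy as np


def fit_amounts(data):
    # Fit only to the non-zero amounts (exponential scale = mean of the amounts).
    amounts = data[data > 0]
    return (float(np.mean(amounts)),)


def cdf_ignore_zeros(x, scale):
    x = np.asarray(x, dtype=float)
    # Zero values are mapped to -inf, positive values to the amounts cdf.
    return np.where(x > 0, 1.0 - np.exp(-x / scale), -np.inf)


def ppf_ignore_zeros(q, scale):
    q = np.asarray(q, dtype=float)
    is_zero = np.isneginf(q)
    # Replace -inf by a harmless placeholder before evaluating the amounts ppf.
    safe_q = np.where(is_zero, 0.0, q)
    return np.where(is_zero, 0.0, -scale * np.log1p(-safe_q))


pr = np.array([0.0, 1.0, 0.0, 3.0])
(scale,) = fit_amounts(pr)
q = cdf_ignore_zeros(pr, scale)
pr_back = ppf_ignore_zeros(q, scale)
```

Dry timesteps therefore pass through a cdf/ppf round trip unchanged, and only the amounts are transformed.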

ibicus.utils gen_PrecipitationHurdleModel-class

class ibicus.utils.gen_PrecipitationHurdleModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None}, cdf_randomization=True)

Represents a precipitation hurdle model.

A hurdle-model is a two-step process: first it is determined binomially whether it rains (with probability \(p_0\) of no rain), and then the amounts are assumed to follow a given distribution (often gamma) described by a cdf \(F_A\). Mathematically:

\[P(X = 0) = p_0,\]

\[P(X \leq x) = p_0 + (1-p_0) \cdot F_A(x) \quad \text{for } x > 0\]

Attributes:

distribution: scipy.stats.rv_continuous: Distribution assumed for the precipitation amounts.
cdf_randomization: bool: Whether cdf-values for x == 0 (no rain) shall be randomized uniformly within (0, p0). Helps for quantile mapping and controlling the zero-inflation.

Methods

`cdf`(x, *fit)	Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model.
`fit`(data)	Fits a precipitation hurdle model and returns parameter estimates.
`ppf`(q, *fit)	Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.

fit(data)

Fits a precipitation hurdle model and returns parameter estimates.

Parameters:

data: np.ndarray: Array containing precipitation values.

Returns:

tuple: Tuple containing parameter estimates: (p0, tuple of parameter estimates for the amounts-distribution).

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model. If self.cdf_randomization = True then cdf-values for x == 0 (no rain) are randomized between (0, p0).

Parameters:

x: np.ndarray: Values for which the cdf shall be evaluated.
fit: tuple: Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.

Returns:

np.ndarray: Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.

Parameters:

q: np.ndarray: Values for which the ppf shall be evaluated.
fit: tuple: Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.

Returns:

np.ndarray: Array containing ppf-values for q.
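
The following sketch illustrates the hurdle-model cdf and ppf, including the randomization of cdf-values for dry timesteps. It is not the ibicus implementation: the function names are hypothetical, the amounts follow an exponential distribution to keep the example NumPy-only, the value returned for dry timesteps without randomization (p0) is an assumption, and p0 is assumed to be strictly below 1.

```python
import numpy as np

rng = np.random.default_rng(0)


def hurdle_fit(data):
    # p0: share of dry timesteps; the amounts model is fitted to the wet ones.
    p0 = float(np.mean(data == 0))
    scale = float(np.mean(data[data > 0]))
    return p0, (scale,)


def hurdle_cdf(x, p0, amounts_fit, randomize=True):
    (scale,) = amounts_fit
    x = np.asarray(x, dtype=float)
    # Wet timesteps: p0 + (1 - p0) * F_A(x).
    wet = p0 + (1.0 - p0) * (1.0 - np.exp(-x / scale))
    # Dry timesteps: uniformly randomized in (0, p0), or p0 (assumption) otherwise.
    dry = rng.uniform(0.0, p0, size=x.shape) if randomize else np.full(x.shape, p0)
    return np.where(x > 0, wet, dry)


def hurdle_ppf(q, p0, amounts_fit):
    (scale,) = amounts_fit
    q = np.asarray(q, dtype=float)
    # Rescale quantiles above p0 to [0, 1]; clipping keeps the dry branch finite.
    q_amounts = np.clip((q - p0) / (1.0 - p0), 0.0, None)
    return np.where(q > p0, -scale * np.log1p(-q_amounts), 0.0)


data = np.array([0.0, 0.0, 1.0, 3.0])
p0, amounts_fit = hurdle_fit(data)
q = hurdle_cdf(data, p0, amounts_fit)
roundtrip = hurdle_ppf(q, p0, amounts_fit)
```

Randomizing the dry cdf-values spreads them over the interval below p0 instead of stacking them on a single quantile, which is why it helps when the observed and modelled dry-day frequencies differ.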

ibicus.utils gen_PrecipitationGammaLeftCensoredModel-class

class ibicus.utils.gen_PrecipitationGammaLeftCensoredModel(censoring_threshold=0.1, censor_in_ppf=True)

Represents a left censored precipitation gamma model.

A left censored gamma model is a gamma distribution where all values under a given threshold are censored, that is, not observed. Those values are represented by zero. This is useful when a slightly higher threshold is used to account for the drizzle effect in climate models.

In the cdf, all values below the censoring threshold are first randomized between (0, censoring_threshold) before the gamma cdf is calculated. In the ppf, values below the censoring_threshold are set to zero again. This handles possible inflation at the censoring value in quantile mapping.

Attributes:

censoring_threshold: float: Value under which observations are censored.
censor_in_ppf: bool: Whether values under the threshold are censored in the ppf mapping.

Methods

`cdf`(x, *fit)	Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model.
`fit`(data)	Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.
`ppf`(q, *fit)	Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model.

fit(data)

Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.

Parameters:

data: np.ndarray: Data on which to fit the censored gamma distribution.

Returns:

tuple: Parameter estimates for the gamma distribution.

cdf(x, *fit)

Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model. Values x below the censoring value (mainly zeros) are first randomized between (0, censoring_threshold) before calculating the gamma-cdf.

Parameters:

x: np.ndarray: Values for which the cdf shall be evaluated.
fit: tuple: Parameters controlling the censored gamma-distribution: shape and scale.

Returns:

np.ndarray: Array containing cdf-values for x.

ppf(q, *fit)

Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model. Values generated by the gamma ppf below the censoring value are set to zero.

Parameters:

q: np.ndarray: Values for which the ppf shall be evaluated.
fit: tuple: Parameters controlling the censored gamma-distribution: shape and scale.

Returns:

np.ndarray: Array containing ppf-values for q.
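
The two threshold-related steps described above, randomizing values below the threshold before the cdf and setting values below the threshold to zero after the ppf, can be sketched in isolation. The gamma cdf and ppf themselves are left out, the function names are hypothetical, and treating "below" as a strict inequality is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
censoring_threshold = 0.1


def randomize_below_threshold(x, threshold):
    # Step applied before the cdf: spread censored values over [0, threshold).
    x = np.asarray(x, dtype=float)
    draws = rng.uniform(0.0, threshold, size=x.shape)
    return np.where(x < threshold, draws, x)


def censor_below_threshold(x, threshold):
    # Step applied after the ppf: values below the threshold become zero again.
    x = np.asarray(x, dtype=float)
    return np.where(x < threshold, 0.0, x)


pr = np.array([0.0, 0.05, 0.1, 2.0])
randomized = randomize_below_threshold(pr, censoring_threshold)
censored = censor_below_threshold(randomized, censoring_threshold)
```

Values at or above the threshold pass through both steps unchanged, while everything below it ends up as zero.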

ibicus.utils Mathematical helpers

ibicus.utils.ecdf(x, y, method='step_function')

Return the values of the empirical CDF of x evaluated at y.

Three methods exist, selected by the method argument:

method = "kernel_density": A kernel density estimate of the ecdf is used, using scipy.stats.rv_histogram.
method = "linear_interpolation": Linear interpolation is used, starting from a grid of CDF-values.
method = "step_function": The classical step-function.

Parameters:

x: np.ndarray: Array containing values with which the empirical cdf is defined.
y: np.ndarray: Array containing values on which the empirical cdf is evaluated.
method: str: Method with which the ecdf is calculated. One of ["kernel_density", "linear_interpolation", "step_function"].

Returns:

np.ndarray: Values of the empirical cdf of x evaluated at y.

Examples

>>> import numpy as np
>>> from ibicus.utils import ecdf
>>> x = np.random.random(1000)
>>> y = np.random.random(100)
>>> ecdf(x, y)
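
For intuition, the classical step-function ecdf evaluates, for each value in y, the fraction of values in x that are less than or equal to it. The following is a minimal NumPy sketch of that definition; the helper name step_ecdf is hypothetical and the code is not taken from the ibicus source.

```python
import numpy as np


def step_ecdf(x, y):
    """Fraction of values in x that are <= each value in y."""
    x_sorted = np.sort(np.asarray(x))
    # side="right" counts the number of x-values less than or equal to each y.
    return np.searchsorted(x_sorted, y, side="right") / x_sorted.size


x = np.array([1.0, 2.0, 2.0, 4.0])
y = np.array([0.0, 2.0, 3.0, 5.0])
values = step_ecdf(x, y)  # array([0.  , 0.75, 0.75, 1.  ])
```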

ibicus.utils.iecdf(x, p, method='inverted_cdf', **kwargs)

Return the values of the inverse empirical CDF of x evaluated at p.

The call is delegated to np.quantile() with the method-argument determining what method is used.

Parameters:

x: np.ndarray: Array containing values with which the inverse empirical cdf is defined.
p: np.ndarray: Array containing values in [0, 1] for which the inverse empirical cdf is evaluated.
method: str: Method string for np.quantile().
**kwargs: Passed to np.quantile().

Returns:

array: Values of the inverse empirical cdf of x evaluated at p.

Examples

>>> import numpy as np
>>> from ibicus.utils import iecdf
>>> x = np.random.normal(size = 1000)
>>> p = np.linspace(0, 1, 100)
>>> iecdf(x, p)
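
Since the call is delegated to np.quantile(), the behaviour of the default method can be checked directly on a small array. This sketch calls NumPy only (the `method` argument requires NumPy 1.22 or later) and uses made-up values.

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 2.0])
p = np.array([0.0, 0.25, 0.5, 1.0])

# "inverted_cdf" returns, for each p, the smallest observed value whose
# step-function ecdf is at least p, so results are always elements of x.
values = np.quantile(x, p, method="inverted_cdf")
```

Unlike the interpolating methods of np.quantile(), this method never produces values that were not observed in x.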

ibicus.utils.quantile_map_non_parametically(x, y, vals, ecdf_method='step_function', iecdf_method='inverted_cdf', **kwargs)

Quantile maps a vector of values vals using empirical distributions defined by vectors x and y. Quantiles of values in vals are first found using the ecdf of the values in x. Afterwards they are transformed onto y using the empirical inverse cdf of y.

Parameters:

x: np.ndarray: Values defining an empirical distribution with whose ecdf the values in vals are transformed into quantiles.
y: np.ndarray: Values defining an empirical distribution with whose iecdf the quantiles are transformed.
vals: np.ndarray: Values to quantile map non-parametrically.
ecdf_method: str: Method to use for the ecdf (transformation of vals into quantiles). Passed to ecdf.
iecdf_method: str: Method to use for the iecdf (transformation of the quantiles). Passed to iecdf.
**kwargs: Passed to iecdf.
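
The two steps described above can be sketched with NumPy alone, using a step-function ecdf and np.quantile() with method="inverted_cdf" to mirror the default arguments. The helper name quantile_map_sketch is hypothetical and the code is an illustration of the technique, not the ibicus implementation.

```python
import numpy as np


def quantile_map_sketch(x, y, vals):
    x_sorted = np.sort(np.asarray(x))
    # Step 1: quantiles of vals under the step-function ecdf of x.
    q = np.searchsorted(x_sorted, vals, side="right") / x_sorted.size
    # Step 2: map the quantiles onto y with the inverse empirical cdf of y.
    return np.quantile(y, q, method="inverted_cdf")


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])
vals = np.array([0.5, 2.0, 3.5, 9.0])
mapped = quantile_map_sketch(x, y, vals)  # array([10., 20., 30., 40.])
```

Values outside the range of x receive quantiles of 0 or 1 and are therefore mapped to the minimum or maximum of y.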

ibicus.utils Logging

ibicus.utils.get_library_logger(): Returns the library logger used by the ibicus package.

ibicus.utils.get_verbosity_library_logger(): Returns the verbosity/level for the library logger as int.

ibicus.utils.set_verbosity_library_logger(verbosity)

Sets the verbosity/level for the library logger.

Parameters:

verbosity: Logging level: [logging.INFO, logging.WARNING, logging.ERROR, ...].
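
The levels referred to above are the integer constants of the standard-library logging module. The sketch below uses only the standard library to show their values and how a level is set and read back on a logger. The logger name "example_library" is hypothetical and stands in for the logger returned by get_library_logger().

```python
import logging

# Logging levels are plain integers, so they can be passed wherever
# an integer verbosity is expected.
levels = {name: getattr(logging, name) for name in ("DEBUG", "INFO", "WARNING", "ERROR")}

# Hypothetical logger, used only to illustrate setting and reading a level.
logger = logging.getLogger("example_library")
logger.setLevel(logging.WARNING)
current_level = logger.getEffectiveLevel()
```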