ibicus.utils module
The utils module provides functionality to help with data processing, debiasing and evaluation.
ibicus.utils Convert variables
- ibicus.utils.get_tasrange(tasmin, tasmax)
Calculates a numpy array of tasrange from arrays of tasmin and tasmax. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tasmin: np.ndarray
Numpy array of tasmin values.
- tasmax: np.ndarray
Numpy array of tasmax values.
- Returns:
- tasrange: np.ndarray
Numpy array of tasrange values.
- ibicus.utils.get_tasskew(tas, tasmin, tasmax)
Calculates a numpy array of tasskew from arrays of tas, tasmin and tasmax. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tas: np.ndarray
Numpy array of tas values.
- tasmin: np.ndarray
Numpy array of tasmin values.
- tasmax: np.ndarray
Numpy array of tasmax values.
- Returns:
- tasskew: np.ndarray
Numpy array of tasskew values.
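The two formulas can be checked with plain NumPy. The snippet below is a minimal sketch that applies them directly to made-up temperatures; it does not call ibicus, and the sample values are illustrative only.

```python
import numpy as np

# Made-up daily temperatures in kelvin (illustrative values only).
tas = np.array([280.0, 285.0])
tasmin = np.array([275.0, 282.0])
tasmax = np.array([285.0, 290.0])

# tasrange = tasmax - tasmin
tasrange = tasmax - tasmin            # values: 10 and 8

# tasskew = (tas - tasmin) / tasrange
tasskew = (tas - tasmin) / tasrange   # values: 0.5 and 0.375
```

A tasskew of 0.5 means that tas lies exactly halfway between tasmin and tasmax.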
- ibicus.utils.get_tasmin(tas, tasrange, tasskew)
Calculates a numpy array of tasmin from arrays of tas, tasrange and tasskew. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tas: np.ndarray
Numpy array of tas values.
- tasrange: np.ndarray
Numpy array of tasrange values.
- tasskew: np.ndarray
Numpy array of tasskew values.
- Returns:
- tasmin: np.ndarray
Numpy array of tasmin values.
- ibicus.utils.get_tasmax(tas, tasrange, tasskew)
Calculates a numpy array of tasmax from arrays of tas, tasrange and tasskew. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tas: np.ndarray
Numpy array of tas values.
- tasrange: np.ndarray
Numpy array of tasrange values.
- tasskew: np.ndarray
Numpy array of tasskew values.
- Returns:
- tasmax: np.ndarray
Numpy array of tasmax values.
- ibicus.utils.get_tasmin_tasmax(tas, tasrange, tasskew)
Calculates numpy arrays of both tasmin and tasmax from arrays of tas, tasrange and tasskew. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tas: np.ndarray
Numpy array of tas values.
- tasrange: np.ndarray
Numpy array of tasrange values.
- tasskew: np.ndarray
Numpy array of tasskew values.
- Returns:
- tasmin: np.ndarray
Numpy array of tasmin values.
- tasmax: np.ndarray
Numpy array of tasmax values.
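Solving the two formulas for tasmin and tasmax gives tasmin = tas - tasskew * tasrange and tasmax = tasmin + tasrange. The sketch below applies this rearrangement in plain NumPy to made-up values; it illustrates the algebra and is not taken from the ibicus implementation.

```python
import numpy as np

# Made-up inputs (illustrative values only).
tas = np.array([280.0, 285.0])
tasrange = np.array([10.0, 8.0])
tasskew = np.array([0.5, 0.375])

# From tasskew = (tas - tasmin) / tasrange:
tasmin = tas - tasskew * tasrange   # values: 275 and 282

# From tasrange = tasmax - tasmin:
tasmax = tasmin + tasrange          # values: 285 and 290
```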
- ibicus.utils.get_tasrange_tasskew(tas, tasmin, tasmax)
Calculates numpy arrays of both tasrange and tasskew from arrays of tas, tasmin and tasmax. All input arrays need to have values at the same timesteps and locations.
Formulas:
\[\text{tasrange} = \text{tasmax} - \text{tasmin}\]
\[\text{tasskew} = \frac{\text{tas} - \text{tasmin}}{\text{tasrange}}\]
- Parameters:
- tas: np.ndarray
Numpy array of tas values.
- tasmin: np.ndarray
Numpy array of tasmin values.
- tasmax: np.ndarray
Numpy array of tasmax values.
- Returns:
- tasrange: np.ndarray
Numpy array of tasrange values.
- tasskew: np.ndarray
Numpy array of tasskew values.
- ibicus.utils.get_prsnratio(pr, prsn)
Calculates a numpy array of prsnratio from arrays of pr and prsn. All input arrays need to have values at the same timesteps and locations.
Formula:
\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
- Parameters:
- pr: np.ndarray
Numpy array of pr values.
- prsn: np.ndarray
Numpy array of prsn values.
- Returns:
- prsnratio: np.ndarray
Numpy array of prsnratio values.
- ibicus.utils.get_pr(prsn, prsnratio)
Calculates a numpy array of pr from arrays of prsn and prsnratio. All input arrays need to have values at the same timesteps and locations.
Formula:
\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
- Parameters:
- prsn: np.ndarray
Numpy array of prsn values.
- prsnratio: np.ndarray
Numpy array of prsnratio values.
- Returns:
- pr: np.ndarray
Numpy array of pr values.
- ibicus.utils.get_prsn(pr, prsnratio)
Calculates a numpy array of prsn from arrays of pr and prsnratio. All input arrays need to have values at the same timesteps and locations.
Formula:
\[\text{prsnratio} = \frac{\text{prsn}}{\text{pr}}\]
- Parameters:
- pr: np.ndarray
Numpy array of pr values.
- prsnratio: np.ndarray
Numpy array of prsnratio values.
- Returns:
- prsn: np.ndarray
Numpy array of prsn values.
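The prsnratio formula and its two rearrangements can be sketched in plain NumPy as follows. The values are made up, and the snippet does not call ibicus. The formula alone leaves pr == 0 (and, when recovering pr, prsnratio == 0) undefined; how ibicus treats those cases is not stated in this section, so the sketch simply avoids them.

```python
import numpy as np

# Made-up precipitation (pr) and snowfall (prsn), all with pr > 0.
pr = np.array([2.0, 4.0, 5.0])
prsn = np.array([0.0, 1.0, 5.0])

# prsnratio = prsn / pr
prsnratio = prsn / pr                 # values: 0, 0.25 and 1

# Rearranged for prsn: prsn = pr * prsnratio
prsn_back = pr * prsnratio

# Rearranged for pr: pr = prsn / prsnratio, only defined where prsnratio > 0.
has_snow = prsnratio > 0
pr_back = prsn[has_snow] / prsnratio[has_snow]
```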
ibicus.utils StatisticalModel abstract-class
- class ibicus.utils.StatisticalModel
Abstract functionality to wrap an arbitrary statistical model given by a fit-method, a cdf and a ppf.
This can be used to pass a self-defined model to a debiaser, which then fits and applies it at each location. In principle this is similar to scipy.stats.rv_continuous; however, the user can provide their own fit-method, so a broader class of statistical models can be represented.
Methods
cdf(x, *fit, **kwargs): Returns cdf-values of a vector x for the cdf of a statistical model.
fit(data, **kwargs): Fits a statistical model and returns parameter estimates.
ppf(q, *fit, **kwargs): Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.
Examples
gen_PrecipitationHurdleModel is a child-class of StatisticalModel used to generate a precipitation hurdle model. For example, to use a precipitation hurdle model in quantile mapping, assuming a generalised gamma distribution for the amounts:
>>> hurdle_model = gen_PrecipitationHurdleModel(distribution = scipy.stats.gengamma)
>>> debiaser = QuantileMapping.from_variable("pr", distribution = hurdle_model)
Warning
This is an advanced feature and requires some knowledge of the workings of the debiaser and the statistical model passed/fitted. For example, CDFt does not require a model as parameter, and in ISIMIP pr values are split into zero and non-zero values prior to fitting, so statistical models do not need to account for the zero values.
- abstract fit(data, **kwargs)
Fits a statistical model and returns parameter estimates.
- Parameters:
- data: np.ndarray
Array containing the values to which the model is fitted.
- Returns:
- tuple
Tuple containing parameter estimates.
- abstract cdf(x, *fit, **kwargs)
Returns cdf-values of a vector x for the cdf of a statistical model.
- Parameters:
- x: np.ndarray
Values for which the cdf shall be evaluated.
- fit: tuple
Parameters controlling the model fit. Return value of fit.
- Returns:
- np.ndarray
Array containing cdf-values for x.
- abstract ppf(q, *fit, **kwargs)
Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a statistical model.
- Parameters:
- q: np.ndarray
Values for which the ppf shall be evaluated.
- fit: tuple
Parameters controlling the model fit. Return value of fit.
- Returns:
- np.ndarray
Array containing ppf-values for q.
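The fit/cdf/ppf contract can be illustrated with a small self-contained sketch. StatisticalModelSketch below is a hypothetical stand-in that only mirrors the documented interface so the example runs without ibicus; in real use one would subclass ibicus.utils.StatisticalModel instead. The exponential distribution is chosen purely because its cdf and ppf have closed forms.

```python
from abc import ABC, abstractmethod

import numpy as np


class StatisticalModelSketch(ABC):
    """Hypothetical stand-in mirroring the documented fit/cdf/ppf interface."""

    @abstractmethod
    def fit(self, data, **kwargs): ...

    @abstractmethod
    def cdf(self, x, *fit, **kwargs): ...

    @abstractmethod
    def ppf(self, q, *fit, **kwargs): ...


class ExponentialModel(StatisticalModelSketch):
    """Exponential distribution with a single scale parameter."""

    def fit(self, data, **kwargs):
        # The maximum-likelihood estimate of the scale is the sample mean.
        return (float(np.mean(data)),)

    def cdf(self, x, *fit, **kwargs):
        (scale,) = fit
        return 1.0 - np.exp(-np.asarray(x, dtype=float) / scale)

    def ppf(self, q, *fit, **kwargs):
        (scale,) = fit
        return -scale * np.log1p(-np.asarray(q, dtype=float))


model = ExponentialModel()
data = np.array([1.0, 2.0, 3.0, 6.0])
fit = model.fit(data)            # tuple of parameter estimates: (3.0,)
q = model.cdf(data, *fit)        # the fit tuple is unpacked into *fit
roundtrip = model.ppf(q, *fit)   # recovers data up to rounding
```

Returning the parameters as a tuple and unpacking them into cdf and ppf matches the signatures documented above, where fit is described as the return value of the fit-method.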
ibicus.utils gen_PrecipitationIgnoreZeroValuesModel-class
- class ibicus.utils.gen_PrecipitationIgnoreZeroValuesModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None})
Represents a precipitation model where zero values are ignored and a cdf is only fitted to amounts.
In the cdf, zero values are mapped to -np.inf, and in the inverse cdf, values of -np.inf are mapped to zero.
PrecipitationGammaModelIgnoreZeroValues is a concrete precipitation model ignoring zero values, with a gamma distribution for the amounts.
- Parameters:
- distribution: scipy.stats.rv_continuous
Distribution assumed for the precipitation amounts.
Methods
cdf(x, *fit): Returns cdf-values for the precipitation amounts and -np.inf for zero.
fit(data): Fits a precipitation model to the amounts (ignoring zero values).
ppf(q, *fit): Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.
- fit(data)
Fits a precipitation model to the amounts (ignoring zero values).
- Parameters:
- data: np.ndarray
Array containing precipitation values.
- Returns:
- tuple
Tuple containing parameter estimates for the amounts-distribution.
- cdf(x, *fit)
Returns cdf-values for the precipitation amounts and -np.inf for zero.
- Parameters:
- x: np.ndarray
Values for which the cdf shall be evaluated.
- fit: tuple
Parameters controlling the amounts model. Return value of fit.
- Returns:
- np.ndarray
Array containing cdf-values for x.
- ppf(q, *fit)
Returns ppf (quantile / inverse cdf)-values of a vector q of the amounts ppf and 0 for -np.inf.
- Parameters:
- q: np.ndarray
Values for which the ppf shall be evaluated.
- fit: tuple
Parameters controlling the amounts model. Return value of fit.
- Returns:
- np.ndarray
Array containing ppf-values for q.
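The zero handling described for this class can be sketched without ibicus or scipy. The functions below are hypothetical illustrations, and they assume an exponential distribution for the amounts (instead of the default gamma) so that cdf and ppf have closed forms.

```python
import numpy as np


def fit_amounts(data):
    """Fit an exponential scale to the strictly positive values only."""
    data = np.asarray(data, dtype=float)
    return (float(np.mean(data[data > 0])),)


def cdf_ignore_zeros(x, scale):
    """Exponential cdf for the amounts; zeros are mapped to -inf."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, -np.inf)
    wet = x > 0
    out[wet] = 1.0 - np.exp(-x[wet] / scale)
    return out


def ppf_ignore_zeros(q, scale):
    """Exponential ppf for the amounts; -inf is mapped back to zero."""
    q = np.asarray(q, dtype=float)
    out = np.zeros(q.shape)
    wet = np.isfinite(q)
    out[wet] = -scale * np.log1p(-q[wet])
    return out


pr = np.array([0.0, 1.0, 0.0, 3.0])
(scale,) = fit_amounts(pr)             # mean of the wet values: 2.0
q = cdf_ignore_zeros(pr, scale)        # -inf at the two dry entries
pr_back = ppf_ignore_zeros(q, scale)   # dry entries come back as 0
```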
ibicus.utils gen_PrecipitationHurdleModel-class
- class ibicus.utils.gen_PrecipitationHurdleModel(distribution=<scipy.stats._continuous_distns.gamma_gen object>, fit_kwds={'floc': 0, 'fscale': None}, cdf_randomization=True)
Represents a precipitation hurdle model.
A hurdle model is a two-step process: first, a binomial draw determines whether it rains (with probability \(p_0\) of no rain); then, if it rains, the amounts are assumed to follow a given distribution (often gamma) described by a cdf \(F_A\). Mathematically:
\[P(X = 0) = p_0,\]
\[P(X \leq x) = p_0 + (1 - p_0) \cdot F_A(x), \quad x > 0.\]
- Attributes:
- distribution: scipy.stats.rv_continuous
Distribution assumed for the precipitation amounts.
- cdf_randomization: bool
Whether cdf-values for x == 0 (no rain) shall be randomized uniformly within (0, p0). Helps for quantile mapping and controlling the zero-inflation.
Methods
cdf(x, *fit): Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model.
fit(data): Fits a precipitation hurdle model and returns parameter estimates.
ppf(q, *fit): Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.
- fit(data)
Fits a precipitation hurdle model and returns parameter estimates.
- Parameters:
- data: np.ndarray
Array containing precipitation values.
- Returns:
- tuple
Tuple containing parameter estimates: (p0, tuple of parameter estimates for the amounts-distribution).
- cdf(x, *fit)
Returns cdf-values of a vector x for the cdf of a precipitation hurdle-model. If self.cdf_randomization = True then cdf-values for x == 0 (no rain) are randomized between (0, p0).
- Parameters:
- x: np.ndarray
Values for which the cdf shall be evaluated.
- fit: tuple
Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.
- Returns:
- np.ndarray
Array containing cdf-values for x.
- ppf(q, *fit)
Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation hurdle-model.
- Parameters:
- q: np.ndarray
Values for which the ppf shall be evaluated.
- fit: tuple
Parameters controlling the hurdle model: (p0, tuple of parameter estimates for the amounts-distribution). Return value of fit.
- Returns:
- np.ndarray
Array containing ppf-values for q.
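A hurdle model of this kind can be sketched in a few lines of NumPy. The functions below are hypothetical illustrations, not the ibicus implementation, and they assume exponential amounts (instead of the default gamma) so that \(F_A\) has a closed form. The fit returns (p0, tuple of amount parameters), as documented above.

```python
import numpy as np


def hurdle_fit(data):
    """Return (p0, amounts_fit): dry-day share and exponential scale."""
    data = np.asarray(data, dtype=float)
    p0 = float(np.mean(data == 0))
    scale = float(np.mean(data[data > 0]))
    return p0, (scale,)


def hurdle_cdf(x, p0, amounts_fit, randomize=False, rng=None):
    """p0 + (1 - p0) * F_A(x) for x > 0; p0 or uniform(0, p0) for x == 0."""
    (scale,) = amounts_fit
    x = np.asarray(x, dtype=float)
    wet = x > 0
    out = np.full(x.shape, p0)
    out[wet] = p0 + (1.0 - p0) * (1.0 - np.exp(-x[wet] / scale))
    if randomize:
        rng = np.random.default_rng() if rng is None else rng
        out[~wet] = rng.uniform(0.0, p0, size=int(np.sum(~wet)))
    return out


def hurdle_ppf(q, p0, amounts_fit):
    """Zero for q <= p0; otherwise the amounts ppf of the rescaled quantile."""
    (scale,) = amounts_fit
    q = np.asarray(q, dtype=float)
    out = np.zeros(q.shape)
    wet = q > p0
    out[wet] = -scale * np.log1p(-(q[wet] - p0) / (1.0 - p0))
    return out


pr = np.array([0.0, 0.0, 1.0, 3.0])
p0, amounts_fit = hurdle_fit(pr)           # p0 = 0.5, scale = 2.0
q = hurdle_cdf(pr, p0, amounts_fit)        # dry days get exactly p0
pr_back = hurdle_ppf(q, p0, amounts_fit)   # recovers pr up to rounding
```

With randomize=True the dry days are spread over (0, p0) rather than all sitting at p0, which is the behaviour the cdf_randomization option describes.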
ibicus.utils gen_PrecipitationGammaLeftCensoredModel-class
- class ibicus.utils.gen_PrecipitationGammaLeftCensoredModel(censoring_threshold=0.1, censor_in_ppf=True)
Represents a left censored precipitation gamma model.
A left censored gamma model is a gamma distribution where all values under a given threshold are censored, that is, not observed. Those values are represented by zero. This is useful when a slightly higher threshold is used to account for the drizzle effect in climate models.
Before the cdf is calculated, all values below the censoring threshold are first randomized between (0, censoring_threshold). In the ppf, values below censoring_threshold are set to zero again. This handles possible inflation at the censoring value in quantile mapping.
- Attributes:
- censoring_threshold: float
Value under which observations are censored.
- censor_in_ppf: bool
Whether values under the threshold are censored in the ppf mapping.
Methods
cdf(x, *fit): Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model.
fit(data): Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.
ppf(q, *fit): Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model.
- fit(data)
Fits a censored gamma distribution to precipitation data where everything under self.censoring_threshold is assumed to be a censored observation.
- Parameters:
- data: np.ndarray
Data on which to fit the censored gamma distribution.
- Returns:
- tuple
Parameter estimates for the gamma distribution.
- cdf(x, *fit)
Returns cdf-values of a vector x for the cdf of a precipitation left censored gamma-model. Values x below the censoring value (mainly zeros) are first randomized between (0, censoring_threshold) before calculating the gamma-cdf.
- Parameters:
- x: np.ndarray
Values for which the cdf shall be evaluated.
- fit: tuple
Parameters controlling the censored gamma-distribution: shape and scale.
- Returns:
- np.ndarray
Array containing cdf-values for x.
- ppf(q, *fit)
Returns ppf (quantile / inverse cdf)-values of a vector q for the cdf of a precipitation left-censored gamma model. Values generated by the gamma ppf below the censoring value are set to zero.
- Parameters:
- q: np.ndarray
Values for which the ppf shall be evaluated.
- fit: tuple
Parameters controlling the censored gamma-distribution: shape and scale.
- Returns:
- np.ndarray
Array containing ppf-values for q.
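The two threshold steps that surround the gamma cdf and ppf can be illustrated in isolation. The sketch below covers only those steps, not the censored fit or the gamma distribution itself. It assumes uniform randomization, which the text above does not state explicitly, and the array standing in for the ppf output is made up.

```python
import numpy as np

censoring_threshold = 0.1
rng = np.random.default_rng(0)

x = np.array([0.0, 0.05, 0.5, 2.0])

# Step before the cdf: values below the threshold are randomized
# within (0, censoring_threshold); all other values are left untouched.
# Uniform sampling is an assumption of this sketch.
censored = x < censoring_threshold
x_randomized = x.copy()
x_randomized[censored] = rng.uniform(
    0.0, censoring_threshold, size=int(censored.sum())
)

# Step after the ppf: values below the threshold are set to zero again.
ppf_output = np.array([0.02, 0.09, 0.5, 2.0])   # made-up gamma ppf output
ppf_censored = np.where(ppf_output < censoring_threshold, 0.0, ppf_output)
```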
ibicus.utils Mathematical helpers
- ibicus.utils.ecdf(x, y, method='step_function')
Return the values of the empirical CDF of x evaluated at y.
Three methods exist, selected by method.
method = "kernel_density": A kernel density estimate of the ecdf is used, using scipy.stats.rv_histogram.
method = "linear_interpolation": Linear interpolation is used, starting from a grid of CDF-values.
method = "step_function": The classical step-function.
- Parameters:
- x: np.ndarray
Array containing values with which the empirical cdf is defined.
- y: np.ndarray
Array containing values on which the empirical cdf is evaluated.
- method: str
Method with which the ecdf is calculated. One of ["kernel_density", "linear_interpolation", "step_function"].
- Returns:
- np.ndarray
Values of the empirical cdf of x evaluated at y.
Examples
>>> import numpy as np
>>> from ibicus.utils import ecdf
>>> x = np.random.random(1000)
>>> y = np.random.random(100)
>>> ecdf(x, y)
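For intuition, the classical step-function ecdf can be written in a few lines of NumPy. The function step_ecdf below is a hypothetical sketch of the textbook definition (the share of values in x that are less than or equal to each value in y); whether the "step_function" method in ibicus matches it in every detail is not stated here.

```python
import numpy as np


def step_ecdf(x, y):
    """Fraction of the values in x that are <= each value in y."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    # side="right" counts the elements of x_sorted that are <= y.
    return np.searchsorted(x_sorted, y, side="right") / x_sorted.size


x = np.array([3.0, 1.0, 2.0, 4.0])
y = np.array([0.5, 2.0, 2.5, 4.0])
values = step_ecdf(x, y)   # values: 0, 0.5, 0.5 and 1
```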
- ibicus.utils.iecdf(x, p, method='inverted_cdf', **kwargs)
Return the values of the inverse empirical CDF of x evaluated at p.
The call is delegated to np.quantile(), with the method argument determining which method is used.
- Parameters:
- x: np.ndarray
Array containing values with which the inverse empirical cdf is defined.
- p: np.ndarray
Array containing values in [0, 1] for which the inverse empirical cdf is evaluated.
- method: str
Method string for np.quantile().
- **kwargs:
Passed to np.quantile().
- Returns:
- array
Values of the inverse empirical cdf of x evaluated at p.
Examples
>>> import numpy as np
>>> from ibicus.utils import iecdf
>>> x = np.random.normal(size = 1000)
>>> p = np.linspace(0, 1, 100)
>>> iecdf(x, p)
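Because the call is delegated to np.quantile(), the default behaviour can be previewed with NumPy alone. The sketch below uses a tiny made-up sample and does not call ibicus.

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0, 4.0])
p = np.array([0.25, 0.5, 0.75, 1.0])

# "inverted_cdf" returns the smallest sample value whose ecdf is >= p.
# The method keyword requires NumPy 1.22 or newer.
quantiles = np.quantile(x, p, method="inverted_cdf")   # values: 1, 2, 3 and 4
```

With this method every returned value is an element of x; no interpolation between sample values takes place.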
- ibicus.utils.quantile_map_non_parametically(x, y, vals, ecdf_method='step_function', iecdf_method='inverted_cdf', **kwargs)
Quantile maps a vector of values vals using empirical distributions defined by vectors x and y. Quantiles of the values in vals are first found using the ecdf of the values in x. Afterwards they are transformed onto y using the empirical inverse cdf of y.
- Parameters:
- x: np.ndarray
Values defining the empirical distribution whose ecdf is used to find the quantiles of vals.
- y: np.ndarray
Values defining the empirical distribution whose iecdf is used to transform the quantiles.
- vals: np.ndarray
Values to quantile map non-parametrically.
- ecdf_method: str
Method to use for the ecdf (transformation of x). Passed to ecdf.
- iecdf_method: str
Method to use for the iecdf (transformation of the quantiles). Passed to iecdf.
- **kwargs:
Passed to iecdf.
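The two steps described above (ecdf of x, then inverse ecdf of y) can be sketched in plain NumPy. The function quantile_map_sketch is a hypothetical illustration that hard-codes a step-function ecdf and the "inverted_cdf" quantile method; it is not the ibicus implementation.

```python
import numpy as np


def quantile_map_sketch(x, y, vals):
    """Map vals from the empirical distribution of x onto that of y."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    # Step 1: quantiles of vals under the step-function ecdf of x.
    q = np.searchsorted(x_sorted, vals, side="right") / x_sorted.size
    # Step 2: inverse empirical cdf of y evaluated at those quantiles.
    return np.quantile(y, q, method="inverted_cdf")


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])
vals = np.array([1.0, 2.0, 4.0])
mapped = quantile_map_sketch(x, y, vals)   # values: 10, 20 and 40
```

Each value keeps its rank: the smallest value of x maps to the smallest value of y, the median to the median, and so on.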
ibicus.utils Logging
- ibicus.utils.get_library_logger()
Returns the library logger used by the ibicus package.
- ibicus.utils.get_verbosity_library_logger()
Returns the verbosity/level for the library logger as int.
- ibicus.utils.set_verbosity_library_logger(verbosity)
Sets the verbosity/level for the library logger.
- Parameters:
- verbosity
Logging level: one of [logging.INFO, logging.WARNING, logging.ERROR, ...].
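The verbosity values are the integer levels of Python's standard logging module. The sketch below mirrors the three documented helpers with hypothetical stand-ins built on logging alone; the logger name "example_library" is an assumption made for illustration and is not the name ibicus uses.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
_LOGGER_NAME = "example_library"


def get_library_logger():
    """Return the logger shared by the whole library."""
    return logging.getLogger(_LOGGER_NAME)


def get_verbosity_library_logger():
    """Return the current level of the library logger as an int."""
    return get_library_logger().getEffectiveLevel()


def set_verbosity_library_logger(verbosity):
    """Set the level of the library logger."""
    get_library_logger().setLevel(verbosity)


set_verbosity_library_logger(logging.WARNING)
level = get_verbosity_library_logger()   # 30, the value of logging.WARNING
```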