pycaps.verify package

Submodules

pycaps.verify.ContingencyTable module

class ContingencyTable(tp, fp, fn, tn)[source]

Bases: object

Initializes a contingency table of the following form:

                    Event
                  Yes    No
Forecast   Yes     a      b
           No      c      d

where a = tp (hits), b = fp (false alarms), c = fn (misses), and d = tn (correct negatives).

tp (int): Number of true positives or hits.
fp (int): Number of false positives.
fn (int): Number of false negatives or misses.
tn (int): Number of true negatives.
__add__(other)[source]

Add two contingency tables together and return a combined one.

Parameters:other – Another ContingencyTable object.
Returns:A combined ContingencyTable containing the summed counts of both tables.
accuracy()[source]

Finley’s measure, fraction correct, or accuracy. Formula: (a+d)/N

bias()[source]

Returns Frequency Bias. Formula: (a+b)/(a+c)

csi()[source]

Critical Success Index (CSI), Threat Score, or Gilbert’s Score. Formula: a/(a+b+c)

css()[source]

Clayton Skill Score. Formula: (ad - bc)/((a+b)(c+d))

dfr()[source]

Returns Detection Failure Ratio (DFR). Formula: c/(c+d)

ets()[source]

Equitable Threat Score (ETS) or Gilbert Skill Score. Formula: (a - R)/(a + b + c - R), where R = (a+b)(a+c)/N

far()[source]

Returns False Alarm Ratio (FAR). Formula: b/(a+b)

focn()[source]

Returns Frequency of Correct Null (FOCN). Formula: d/(c+d)

foh()[source]

Returns Frequency of Hits (FOH) or Success Ratio. Formula: a/(a+b)

fom()[source]

Returns Frequency of Misses (FOM). Formula: c/(a+c).

hss()[source]

Doolittle (Heidke) Skill Score. Formula: 2(ad - bc)/((a+b)(b+d) + (a+c)(c+d))

pod()[source]

Returns Probability of Detection (POD) or Hit Rate. Formula: a/(a+c)

pofd()[source]

Returns Probability of False Detection (POFD). Formula: b/(b+d)

pon()[source]

Returns Probability of Null (PON). Formula: d/(b+d)

pss()[source]

Peirce (Hanssen-Kuipers, True) Skill Score. Formula: (ad - bc)/((a+c)(b+d))

update(tp, fp, fn, tn)[source]

Update contingency table with new values without creating a new object.
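
A minimal usage sketch (the import path and the example counts are illustrative assumptions, not part of the documented API):

    from pycaps.verify.ContingencyTable import ContingencyTable  # import path assumed

    # Counts of hits, false alarms, misses, and correct negatives for two samples.
    sample1 = ContingencyTable(tp=28, fp=72, fn=23, tn=2680)
    sample2 = ContingencyTable(tp=15, fp=40, fn=10, tn=1500)

    print(sample1.pod())   # a/(a+c)    probability of detection
    print(sample1.far())   # b/(a+b)    false alarm ratio
    print(sample1.csi())   # a/(a+b+c)  critical success index

    # __add__ lets small samples be aggregated before scoring.
    combined = sample1 + sample2
    print(combined.ets(), combined.pss())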

pycaps.verify.MulticlassContingencyTable module

class MulticlassContingencyTable(table=None, n_classes=2, class_names=('1', '0'))[source]

Bases: object

This class is a container for a contingency table containing more than 2 classes. The contingency table is stored in table as a numpy array with the rows corresponding to forecast categories, and the columns corresponding to observation categories.

gerrity_score()[source]

Gerrity Score, which weights each cell in the contingency table by its observed relative frequency.

heidke_skill_score()[source]
peirce_skill_score()[source]

Multiclass Peirce Skill Score (also Hanssen and Kuipers score, True Skill Score)

mct_main()[source]
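
A sketch of typical usage, assuming the module-level import path shown and that table accepts an (n_classes, n_classes) numpy array laid out as described above (rows = forecast categories, columns = observed categories):

    import numpy as np
    from pycaps.verify.MulticlassContingencyTable import MulticlassContingencyTable  # path assumed

    # 3-category example: rows are forecast categories, columns are observed categories.
    counts = np.array([[50, 10,  5],
                       [12, 40,  8],
                       [ 3,  9, 30]])

    mct = MulticlassContingencyTable(table=counts, n_classes=3,
                                     class_names=('low', 'mid', 'high'))

    print(mct.peirce_skill_score())   # multiclass Hanssen and Kuipers score
    print(mct.heidke_skill_score())
    print(mct.gerrity_score())        # weights cells by observed relative frequency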

pycaps.verify.ProbabilityMetrics module

class DistributedCRPS(thresholds=None, input_str=None)[source]

Bases: object

A container for the statistics used to calculate the Continuous Ranked Probability Score

Parameters:
  • thresholds (numpy.ndarray) – Array of the intensity threshold bins
  • input_str (str) – String containing the information for initializing the object from a storable text format.
crps()[source]

Calculates the continuous ranked probability score.

Returns:The continuous ranked probability score.
from_str(in_str)[source]
merge(other_crps)[source]
update(forecasts, observations)[source]

Update the statistics with forecasts and observations.

Parameters:
  • forecasts
  • observations
Returns:
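
A hedged usage sketch; the threshold values, the synthetic placeholder data, and the shapes expected by update() are assumptions for illustration only:

    import numpy as np
    from pycaps.verify.ProbabilityMetrics import DistributedCRPS  # import path assumed

    thresholds = np.arange(0.0, 80.0, 5.0)      # intensity threshold bins (assumed values)
    crps_stats = DistributedCRPS(thresholds=thresholds)

    # Accumulate statistics case by case with placeholder data, then score once at the end.
    rng = np.random.default_rng(0)
    for _ in range(10):
        forecasts = 75.0 * rng.random(500)       # assumed 1D forecast values
        observations = 75.0 * rng.random(500)    # assumed 1D observed values
        crps_stats.update(forecasts, observations)

    print(crps_stats.crps())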

class DistributedROC(thresholds=None, obs_threshold=None, input_str=None)[source]

Bases: object

Store statistics for calculating receiver operating characteristic (ROC) curves and performance diagrams and permit easy aggregation of ROC curves from many small datasets.

Parameters:
  • thresholds (numpy.ndarray of floats) – List of probability thresholds in increasing order.
  • obs_threshold (float) – Observation value used as the split point for determining positives.
  • input_str (str) – String in the format output by the __str__ method so that initialization of the object can be done from items in a text file.
__add__(other)[source]

Add two DistributedROC objects together and combine their contingency table values.

Parameters:other – Another DistributedROC object.
Returns:A new DistributedROC object with the combined contingency table values.
__str__()[source]

Output the information within the DistributedROC object to a string.

Returns:A string representation of the DistributedROC object.
auc()[source]

Calculate the Area Under the ROC Curve (AUC).

Returns:The area under the ROC curve.
from_str(in_str)[source]

Read the object string and parse the contingency table values from it.

Parameters:in_str – String in the format output by the __str__ method.

merge(other_roc)[source]

Ingest the values of another DistributedROC object into this one and update the statistics in place.

Parameters:other_roc – another DistributedROC object.
Returns:
performance_curve()[source]

Calculate the Probability of Detection and False Alarm Ratio in order to output a performance diagram.

Returns:pandas.DataFrame containing POD, FAR, and probability thresholds.
roc_curve()[source]

Generate a ROC curve from the contingency table by calculating the probability of detection (TP/(TP+FN)) and the probability of false detection (FP/(FP+TN)).

Returns:A pandas.DataFrame containing the POD, POFD, and the corresponding probability thresholds.
update(forecasts, observations)[source]

Update the ROC curve with a set of forecasts and observations.

Parameters:
  • forecasts – 1D array of forecast values
  • observations – 1D array of observation values.
Returns:
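
A sketch of the aggregation workflow this class is designed for; the thresholds, the obs_threshold value, and the synthetic placeholder data are illustrative assumptions:

    import numpy as np
    from pycaps.verify.ProbabilityMetrics import DistributedROC  # import path assumed

    prob_thresholds = np.arange(0.0, 1.01, 0.05)   # increasing probability thresholds
    rng = np.random.default_rng(0)

    # One DistributedROC per (small) dataset, aggregated afterwards.
    per_case = []
    for _ in range(5):
        forecasts = rng.random(1000)               # 1D probabilistic forecasts (placeholder)
        observations = 50.0 * rng.random(1000)     # 1D observed values (placeholder)
        roc = DistributedROC(thresholds=prob_thresholds, obs_threshold=25.0)
        roc.update(forecasts, observations)
        per_case.append(roc)

    # Combine the per-case contingency tables before computing any statistics.
    total = per_case[0]
    for roc in per_case[1:]:
        total = total + roc

    print(total.auc())                 # area under the aggregated ROC curve
    print(total.roc_curve())           # DataFrame of POD, POFD, and thresholds
    print(total.performance_curve())   # DataFrame of POD, FAR, and thresholds

    # The string round trip allows intermediate results to be stored in text files.
    restored = DistributedROC(input_str=str(total))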

class DistributedReliability(thresholds=None, obs_threshold=None, input_str=None)[source]

Bases: object

A container for the statistics required to generate reliability diagrams and calculate the Brier Score.

Parameters:
  • thresholds (numpy.ndarray) – Array of probability thresholds
  • obs_threshold (float) – Split value for the observations
  • input_str (str) – String containing information to initialize the object from a text representation.
__add__(other)[source]

Add two DistributedReliability objects together and combine their values.

Parameters:other – a DistributedReliability object
Returns:a DistributedReliability Object
brier_score()[source]

Calculate the Brier Score

Returns:The Brier Score.
brier_score_components()[source]

Calculate the components of the Brier score decomposition: reliability, resolution, and uncertainty.

Returns:The reliability, resolution, and uncertainty components of the Brier Score.
brier_skill_score()[source]

Calculate the Brier Skill Score

Returns:The Brier Skill Score.
climatology()[source]
from_str(in_str)[source]
merge(other_rel)[source]
reliability_curve()[source]

Calculate the reliability diagram statistics.

Returns:
update(forecasts, observations)[source]

Update the statistics with a set of forecasts and observations.

Parameters:
  • forecasts
  • observations
Returns:
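
A usage sketch parallel to the DistributedROC example above (the thresholds, obs_threshold, and placeholder data are again illustrative assumptions):

    import numpy as np
    from pycaps.verify.ProbabilityMetrics import DistributedReliability  # import path assumed

    rng = np.random.default_rng(0)
    rel = DistributedReliability(thresholds=np.arange(0.0, 1.01, 0.1), obs_threshold=25.0)
    for _ in range(5):
        forecasts = rng.random(1000)              # probabilistic forecasts (placeholder)
        observations = 50.0 * rng.random(1000)    # observed values (placeholder)
        rel.update(forecasts, observations)

    print(rel.brier_score())
    print(rel.brier_score_components())   # reliability, resolution, and uncertainty
    print(rel.brier_skill_score())
    print(rel.reliability_curve())        # statistics for a reliability diagram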

class ROC(forecasts, observations, thresholds, obs_threshold)[source]

Bases: object

auc()[source]
calc_roc()[source]

class Reliability(forecasts, observations, thresholds, obs_threshold)[source]

Bases: object

brier_score()[source]
brier_score_components()[source]
brier_skill_score()[source]
calc_reliability_curve()[source]

bootstrap(score_objs, n_boot=1000)[source]

pycaps.verify.verif_modules module

NEP_2D(var2D_ens, threshold, kernel)[source]

A function for performing neighborhood ensemble probability (NEP)

Given an ensemble of 2D slices (var2D_ens), this function will compute the NEP of P[var2D_ens > threshold] and return a corresponding 2D array with the NEP values. The NEP will use a neighborhood defined by kernel. The kernel is a 2D array and is defined using the define_kernel function (also contained in this library).

Parameters:
  • var2D_ens – The 2D ensemble of x-y slices to calculate NEP for [n_ens, ny, nx]
  • threshold – NEP will be calculated for P[var2D_ens > threshold] within the neighborhood where threshold can be any real number.
  • kernel – a 2D array of some size < (ny, nx) with values between 0 and 1 to serve as the kernel for the 2D convolution.
Returns:

a 2D [ny, nx] field containing NEP for P[var2D_ens > threshold] for the given neighborhood (defined by kernel)

Return type:

var_NEP
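
A usage sketch combining define_kernel (documented below) with NEP_2D; the synthetic ensemble, the 40.0 threshold, and the 5-gridpoint radius are illustrative assumptions:

    import numpy as np
    from pycaps.verify.verif_modules import NEP_2D, define_kernel  # import path assumed

    # Synthetic 20-member ensemble on a 100 x 120 grid (placeholder data).
    n_ens, ny, nx = 20, 100, 120
    var2D_ens = 75.0 * np.random.rand(n_ens, ny, nx)

    kernel = define_kernel(radius=5)                      # circular 5-gridpoint neighborhood
    nep = NEP_2D(var2D_ens, threshold=40.0, kernel=kernel)

    print(nep.shape)               # (ny, nx)
    print(nep.min(), nep.max())    # NEP values lie between 0 and 1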

define_kernel(radius, dropoff=0.0)[source]

Defines a 2D kernel suitable for use in neighborhood ensemble probability calculations (or other verification methods requiring the use of a 2D convolution).

Parameters:
  • radius – The radius of the kernel, in gridpoints (can be any whole number)
  • dropoff – Set this to a value between 0.0 and 1.0 to have points near the edge of the radius receive a lower weight. By default, dropoff is set to 0.0, giving all points within the radius equal weight.
Returns:

a 2D array containing the kernel needed for performing convolutions

Return type:

Kernel

define_kernel_ellipse(xradius, yradius, dropoff=0.0)[source]

Defines an elliptical 2D kernel suitable for use in neighborhood ensemble probability calculations (or other verification methods requiring the use of a 2D convolution).

Parameters:
  • xradius – The semi-major (or semi-minor) axis in the east-west direction, in gridpoints (can be any whole number)
  • yradius – The semi-major (or semi-minor) axis in the north-south direction, in gridpoints (can be any whole number)
  • dropoff – Set this to a value between 0.0 and 1.0 to have points near the edge of the radius receive a lower weight. By default, dropoff is set to 0.0, giving all points within the radius equal weight.
Returns:

a 2D array containing the kernel needed for performing convolutions

Return type:

Kernel
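
A brief sketch contrasting the two kernel constructors; the returned shapes and weight values are implementation details not documented above, so the example only inspects them:

    from pycaps.verify.verif_modules import define_kernel, define_kernel_ellipse  # path assumed

    circle = define_kernel(radius=4)                        # equal weights inside the radius
    tapered = define_kernel(radius=4, dropoff=0.5)          # lower weight near the edge
    ellipse = define_kernel_ellipse(xradius=6, yradius=3)   # wider east-west than north-south

    for name, k in (("circle", circle), ("tapered", tapered), ("ellipse", ellipse)):
        print(name, k.shape, k.min(), k.max())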

ens_prob_within_dist(var2D_ens, threshold, kernel)[source]

A function for calculating ensemble probability of the form “probability of X within Y km of a point”. Similar to, but distinct from, NEP_2D.

Given an ensemble of 2D slices (var2D_ens), this function will compute the ensemble probability of P[var2D_ens > threshold (anywhere within kernel)] and return a corresponding 2D array with the probability values. The kernel is a 2D array and is defined using the define_kernel function (also contained in this library).

Parameters:
  • var2D_ens – The 2D ensemble of x-y slices to calculate the probability for [n_ens, ny, nx]
  • threshold – The probability will be calculated for P[var2D_ens > threshold] occurring anywhere within the neighborhood, where threshold can be any real number.
  • kernel – a 2D array of some size < (ny, nx) with values of 0 or 1 to serve as the kernel for the 2D convolution.
Returns:

a 2D [ny, nx] field containing the ensemble probability of P[var2D_ens > threshold] occurring anywhere within the given neighborhood (defined by kernel)

Return type:

ens_prob
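
A sketch highlighting the difference from NEP_2D: with a 0/1 kernel, the result at each point is the fraction of members exceeding the threshold anywhere in the neighborhood, rather than the neighborhood-averaged exceedance fraction (the data and radius are illustrative assumptions):

    import numpy as np
    from pycaps.verify.verif_modules import ens_prob_within_dist, define_kernel  # path assumed

    n_ens, ny, nx = 20, 100, 120
    var2D_ens = 75.0 * np.random.rand(n_ens, ny, nx)     # placeholder ensemble

    # Per the note above, the kernel passed here should contain 0/1 values.
    kernel = define_kernel(radius=8)
    prob = ens_prob_within_dist(var2D_ens, threshold=40.0, kernel=kernel)

    print(prob.shape)   # (ny, nx), values between 0 and 1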

pm_mean_2D(var2D_ens)[source]

A function for computing probability-matched mean (PM mean)

Given an ensemble of 2D slices (var2D_ens), this function will perform the PM mean calculation and return the PM mean as a 2D array over the same slice.

var2D_ens should have dimensions of (n_ens, ny, nx).

Parameters:var2D_ens – an ensemble of 2D slices of a variable field [n_ens, ny, nx]
Returns:the probability-matched mean of var2D_ens
Return type:pm_mean
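
A usage sketch (the synthetic ensemble is a placeholder); the comments summarize the usual motivation for the probability-matched mean rather than implementation specifics:

    import numpy as np
    from pycaps.verify.verif_modules import pm_mean_2D  # import path assumed

    n_ens, ny, nx = 20, 100, 120
    var2D_ens = 75.0 * np.random.rand(n_ens, ny, nx)   # placeholder ensemble

    # The PM mean keeps the spatial pattern of the ensemble mean but draws its
    # amplitude distribution from the full ensemble, so intense local maxima are
    # not smoothed away as in a simple arithmetic mean.
    pm = pm_mean_2D(var2D_ens)

    print(pm.shape)                                    # (ny, nx)
    print(var2D_ens.mean(axis=0).max(), pm.max())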

Module contents