# Data Assimilation Diagnostics

With version 1.12 of PDAF, we introduce a first set of routines that compute diagnostics on the ensemble. These diagnostics can be used to assess the quality of the ensemble. The routines have been ported from the tool developed by the SANGOMA project, in which the PDAF developers were involved. We plan to provide further diagnostic routines in future releases of PDAF.

For now there are three routines:

## PDAF_diag_effsample

This routine computes the effective sample size as used in particle filters. The effective sample size is defined as the inverse of the sum of the squared particle weights: **n_eff = 1 / sum[(w_i)^2]**, where the weights w_i are normalized so that they sum to one. The effective sample size ranges between one (if a single particle carries all the weight and all other particles have zero weight) and the actual sample size (if all particles have the same weight).
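
The formula can be illustrated with a minimal Python sketch. This is not the PDAF routine and does not reflect its interface; the function name `effective_sample_size` is hypothetical, and the sketch assumes that the weights are non-negative and are normalized before the sum is taken.

```python
def effective_sample_size(weights):
    """Illustrative only: n_eff = 1 / sum(w_i^2) for normalized weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    # Normalize so that the weights sum to one (assumption of this sketch).
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


# One particle carries all the weight: n_eff equals one.
degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])

# All particles have the same weight: n_eff equals the sample size.
uniform = effective_sample_size([0.25, 0.25, 0.25, 0.25])
```

The two calls reproduce the limiting cases described above: a degenerate set of weights and a uniform set of weights.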

For documentation of `PDAF_diag_effsample`, see the detail page on PDAF_diag_effsample. The routine is used in the NETF and LNETF filter methods of PDAF.

## PDAF_diag_histogram

Rank histograms are frequently used to assess the distribution of an ensemble around an observation or, in twin experiments, the true state. The histogram bins are defined by the sorted ensemble members, and the histogram counts how frequently the observation (or the true state) falls into each bin. A flat histogram typically indicates a good ensemble. A U-shaped histogram indicates too little ensemble spread, while a dome-shaped histogram is obtained when the ensemble spread is too large. Further, a sloped histogram indicates bias. (A discussion of the interpretation of rank histograms can be found in Hamill, Monthly Weather Review, 129 (2001) 550-560.)
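
The counting behind a rank histogram can be sketched in a few lines of Python. This is an illustration of the general technique, not the implementation or interface of `PDAF_diag_histogram`; the function name `rank_histogram` and the handling of ties (a member equal to the observation is not counted as smaller) are assumptions of this sketch.

```python
def rank_histogram(ensembles, observations):
    """Illustrative only: count the rank of each observation in its ensemble.

    ensembles    -- list of ensembles, each a list of member values
    observations -- one observed (or true) value per ensemble
    Returns a list with len(members) + 1 bin counts.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        if len(members) != n_members:
            raise ValueError("all ensembles must have the same size")
        # The rank is the number of members smaller than the observation.
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


members = [1.0, 2.0, 3.0]

# Observations spread evenly over the bins give a flat histogram.
flat = rank_histogram([members] * 4, [0.5, 1.5, 2.5, 3.5])

# Observations outside the ensemble range fill only the outer bins,
# as happens when the ensemble spread is too small.
u_shaped = rank_histogram([members] * 2, [-5.0, 10.0])
```

In practice the counts are accumulated over many state elements or analysis times before the shape of the histogram is interpreted.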

For documentation of `PDAF_diag_histogram`, see the detail page on PDAF_diag_histogram. The routine is used in the Lorenz-96 model example in `testsuite/src/lorenz96/compute_truerms.F90`.

## PDAF_diag_ensstats

Ensemble Kalman filters assume that the ensemble is Gaussian distributed. In this case the distribution is symmetric and fully determined by its first two moments (the mean and the variance). The routine `PDAF_diag_ensstats` allows a data assimilation program to check the values of the third (skewness) and fourth (kurtosis) moments of the distribution. As there are different definitions of the kurtosis, please note that PDAF uses the definition used by Lawson and Hansen, Mon. Wea. Rev. 132 (2004) 1966.
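
As an illustration of these higher moments, the following Python sketch computes the skewness and kurtosis of a single ensemble of scalar values. It is not the PDAF routine: the function name `ensemble_moments` is hypothetical, the moments are normalized by the ensemble size, and the kurtosis is given in the excess convention (3 is subtracted so that a Gaussian yields zero). These choices are assumptions of the sketch and may differ from the definition that PDAF follows.

```python
import math


def ensemble_moments(values):
    """Illustrative only: mean, standard deviation, skewness, excess kurtosis."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two ensemble members")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    # Central moments, normalized by n (assumption of this sketch).
    m2 = sum(d ** 2 for d in deviations) / n
    m3 = sum(d ** 3 for d in deviations) / n
    m4 = sum(d ** 4 for d in deviations) / n
    if m2 == 0.0:
        raise ValueError("ensemble has zero variance")
    skewness = m3 / m2 ** 1.5
    # Excess kurtosis: zero for a Gaussian distribution.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, math.sqrt(m2), skewness, excess_kurtosis


# A symmetric ensemble has zero skewness.
mean, std, skew, kurt = ensemble_moments([-2.0, -1.0, 0.0, 1.0, 2.0])

# A single large outlier on the upper side produces positive skewness.
_, _, skew_outlier, _ = ensemble_moments([0.0, 0.0, 0.0, 10.0])
```

Values of the skewness or excess kurtosis far from zero suggest that the ensemble deviates from a Gaussian distribution, although estimates from small ensembles are noisy.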

For documentation of `PDAF_diag_ensstats`, see the detail page on PDAF_diag_ensstats. The routine is used in the Lorenz-96 model example in `testsuite/src/lorenz96/compute_truerms.F90`.