Adding a Data Assimilation Method to PDAF

This page describes the implementation for PDAF3. The previous implementation for PDAF2 is described on the Page on adding a DA Method to PDAF 2.

PDAF provides an internal interface to add a data assimilation (DA) method to PDAF. Here we describe the implementation strategy and internal structure of PDAF valid for version 3.0 and later. In this text, we assume that the reader is already familiar with PDAF to the extent of knowing how PDAF is connected to a model, as described in the Implementation Guide.

The internal structure of PDAF is organized into a generic part providing the infrastructure to perform ensemble forecasts and the actual analysis step of the DA method. This generic part is independent of the particular filter algorithm. The specific routines for a DA method are called through the internal interface.

In PDAF, each DA algorithm consists of 5 Fortran modules including different subroutines for configuring the DA method, for the handling of the ensemble forecasts, and for the analysis step. The modules and routines are described below.

We provide templates for the implementation of global and local ensemble filters in the sub-directories of templates/analysis_step/.

PDAF's Internal Interface

Here, we first provide an overview of the internal interface routines of PDAF. The structure of the internal interface of PDAF is depicted in Figure 1 (For the method-specific routines, 'X' is the name of the DA method).

The left column in Fig. 1 shows the generic PDAF routines, which are called from the user code. Here, PDAF_init calls 5 interface routines to perform the specific initialization of the DA method. The other generic routines are more focused and call only one interface routine each.

The internal interface routines allow the user to call the generic routines, which are then mapped to a specific routine of the DA method. The internal interface routines are depicted in the middle column of Fig. 1. All these routines are collected in the module PDAF_utils_filters in the file PDAF_utils_filters.F90. For each interface routine there is a specific routine of the DA method shown in the right-most column. These specific routines are collected in the module PDAF_X in the file PDAF_X.F90. The interface routines perform the initialization of a DA method, set parameters, or print information about the configuration or the available options.

The assimilation routines, here the universal routines PDAF3_assim_offline and PDAF3_assimilate, directly call the specific assimilation routine of the DA method. These are stored in the module in the file PDAF_assimilate_X.F90.

/pics/internal_interface_PDAF3.png
Figure 1: Structure of the internal interface of PDAF. There are 7 internal interface routines (middle column) that connect the generic part with filter-specific routines. All these interface routines are collected in the module PDAF_utils_filters. Each of the internal interface routines calls one routine that is specific to the DA method. These routines are collected in the module PDAF_X, where 'X' would be the name of the DA method. The assimilation routines for the online and offline coupled modes are collected in the module PDAFassimilate_X.

The separate routines are the following:

Internal interface routines

The purpose of the internal interface routines is as follows:

Interface routine (in PDAF_utils_filters) Called specific routine (in PDAF_X) Description
PDAF_init_filters PDAF_X_init Perform the filter-specific initialization of parameters and call the user-supplied routine that initializes the initial ensemble of model states.
PDAF_alloc_filters PDAF_X_alloc Allocate the filter-specific arrays.
PDAF_options_filters PDAF_X_options Display an overview of available options for the filter algorithm.
PDAF_set_iparam_filters PDAF_X_set_iparam Set an integer parameter for the DA method.
PDAF_set_rparam_filters PDAF_X_set_rparam Set a real (floating point) parameter for the DA method.
PDAF_print_info_filters PDAF_X_memtime Display information on the run time of the different parts of the DA method as well as information on the amount of allocated memory.
PDAF_configinfo_filters PDAF_X_config Display the current configuration of the DA method.

When PDAF_init is called, the DA method is chosen by its ID number or its name parameter (see page on specific options of DA method). Internally to PDAF, each DA method is identified by a string that is defined in the module PDAF_DA in PDAF_da.F90. The interface routines have a very simple structure. In general, they select the method-specific routine based on the string identifying the filters.

Collecting all interface routines in the file PDAF_utils_filters.F90 yields a single place in which additions for the configuration functionality for a new DA method need to be done.

When adding a DA method, a line for the corresponding method-specific routine has to be inserted into each of the interface routines in PDAF_utils_filters.F90. Further, one has to add to PDAF_da.F90 a line for the DA method declaring its name in the form PDAF_DA_X and a corresponding index.
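
The selection of a method-specific routine by an identifying string can be illustrated with a minimal, self-contained sketch. All names in it (demo_x, demo_utils_filters, demo_options_filters, DEMO_DA_X) and the flag values are hypothetical stand-ins chosen for illustration; they are not the actual PDAF modules or argument lists.

```fortran
! Toy stand-in for a method-specific module such as PDAF_X (hypothetical names)
MODULE demo_x
  IMPLICIT NONE
CONTAINS
  SUBROUTINE demo_x_options()
    WRITE (*,'(a)') 'Available options for demo method X'
  END SUBROUTINE demo_x_options
END MODULE demo_x

! Toy stand-in for the module collecting the interface routines (hypothetical)
MODULE demo_utils_filters
  IMPLICIT NONE
  ! Identifier string of the DA method; assumed to be declared centrally
  CHARACTER(len=*), PARAMETER :: DEMO_DA_X = 'X'
CONTAINS
  SUBROUTINE demo_options_filters(filterstr, flag)
    USE demo_x, ONLY: demo_x_options
    CHARACTER(len=*), INTENT(in) :: filterstr  ! String identifying the DA method
    INTEGER, INTENT(out) :: flag               ! 0 on success, 1 for unknown method

    flag = 0
    IF (TRIM(filterstr) == DEMO_DA_X) THEN
       CALL demo_x_options()
    ELSE
       ! A further DA method would get its own ELSE IF branch above
       flag = 1
    END IF
  END SUBROUTINE demo_options_filters
END MODULE demo_utils_filters

PROGRAM demo_dispatch
  USE demo_utils_filters, ONLY: demo_options_filters
  IMPLICIT NONE
  INTEGER :: flag

  CALL demo_options_filters('X', flag)
  WRITE (*,'(a,i2)') 'flag for known method:   ', flag
  CALL demo_options_filters('UNKNOWN', flag)
  WRITE (*,'(a,i2)') 'flag for unknown method: ', flag
END PROGRAM demo_dispatch
```

In this sketch, adding a method means adding one branch per interface routine, which mirrors the single-place-of-change idea described above.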

Internal code structure of a DA method

Fortran Modules

Each DA method in PDAF consists of 5 standard modules. These are:

PDAF_X This module declares variables specific to the DA method, e.g. the variable for the ensemble inflation. Further, it contains the different subroutines called by the routines in PDAF_utils_filters for the configuration of the DA method.
PDAFassimilate_X This module contains the assimilation interface routines. Usually these are PDAF_assimilate_X and PDAF_assim_offline_X.
PDAFput_state_X This module contains the routine PDAF_put_state_X. This routine controls the ensemble integrations for the flexible parallelization variant. It exists as a separate routine for backward-compatibility with PDAF2.
PDAF_X_update This module contains the main routine PDAFX_update for the analysis update. This controls the actual update, e.g. by performing a local analysis loop for domain-localized filter methods.
PDAF_X_analysis This module contains the routine that computes and applies the actual ensemble analysis increment.

Note on naming modules and subroutines: Fortran does not allow the name of a module to be identical to the name of a subroutine contained in the module. We handle this issue by using or omitting underscores, e.g. the module PDAFassimilate_X contains the subroutine PDAF_assimilate_X. This limitation is somewhat inconvenient and might lead to inconsistencies in the naming schemes (for example, there is the module PDAFassimilate_X without underscore following PDAF, but the module PDAF_X_update with underscore).
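
As a small illustration of this naming rule, the following self-contained sketch uses a module and a contained subroutine whose names differ only by an underscore. The prefix DEMO and the single argument are hypothetical and only serve the example.

```fortran
! Module name without underscore after the prefix (hypothetical example)
MODULE DEMOassimilate_X
  IMPLICIT NONE
CONTAINS
  ! Subroutine name with underscore, so it differs from the module name
  SUBROUTINE DEMO_assimilate_X(outflag)
    INTEGER, INTENT(out) :: outflag   ! Status flag
    outflag = 0
  END SUBROUTINE DEMO_assimilate_X
END MODULE DEMOassimilate_X

PROGRAM demo_naming
  USE DEMOassimilate_X, ONLY: DEMO_assimilate_X
  IMPLICIT NONE
  INTEGER :: outflag

  CALL DEMO_assimilate_X(outflag)
  WRITE (*,'(a,i2)') 'outflag = ', outflag
END PROGRAM demo_naming
```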

Call structure for analysis step

The call structure of an analysis step is shown in Figure 2. The interface routines PDAF3_assimilate and PDAF3_assim_offline (or the different variants of 3D-Var or the more specific routines for ensemble filters) are called directly from the model code. Internally, these generic routines call the method-specific routine (PDAF_assimilate_X or PDAF_assim_offline_X) according to the chosen filter. These routines control the ensemble forecasting for the online coupled mode and the ensemble handling for the offline mode. In implementations done for PDAF2, the routine PDAF_put_state_X might be called. PDAF_assimilate_X calls PDAF_put_state_X, controls the ensemble for the case that multiple ensemble states are propagated by a single model task (i.e. the flexible parallelization variant), collects the ensemble from the ensemble tasks before the analysis update, and distributes the ensemble to the model tasks afterwards.

At the time of an analysis step, the actual analysis routines are called. Here, PDAFX_update is the actual main routine for the DA method. In this routine, the observations are initialized and, for domain-localized filters, the local analysis loop is performed. The routine PDAFX_analysis then computes and applies the assimilation increment or computes the ensemble transformation of transform filters.

/pics/analysis_call_structure_PDAF3.png
Figure 2: Internal call structure for the analysis step. The universal interface routines PDAF3_assimilate and PDAF3_assim_offline call a corresponding specific routine of the DA method. These specific routines are PDAF_assimilate_X and PDAF_assim_offline_X, which are members of the module PDAFassimilate_X. PDAF_assimilate_X calls PDAF_put_state_X. These two routines together control the online coupled mode, while PDAF_assim_offline_X controls the offline coupled mode. The actual analysis update is performed by the routines PDAFX_update and PDAFX_analysis, each in their own module.

Further below we provide a detailed description of the different routines. The routines PDAF_assimilate_X, PDAF_assim_offline_X, and PDAF_put_state_X are framework routines, and the templates need only minimal changes when implementing a new DA method. PDAFX_update, as the control routine for the analysis, likely also needs only minor changes. The main work when implementing a new DA method is in PDAFX_analysis, where the actual analysis algorithm is implemented.

PDAF's framework infrastructure

When adding a new DA method to PDAF, one utilizes the framework infrastructure provided by PDAF. PDAF provides dimensions and arrays for the data assimilation. It also handles the ensemble forecasting before providing the ensemble to the DA method for the analysis step.

Internal dimensions

PDAF internally stores the dimensions of the assimilation system. The dimensions are declared in the Fortran module PDAF_mod_core. Important are the following dimensions:

Variable Description
dim_p The size of the state vector (with parallelization the size of the local state vector for the current process)
dim_ens The overall size of the ensemble
dim_lag The lag as number of previous analysis steps; only used if a smoother is implemented
dim_bias_p Dimension of a bias vector; only used if a bias estimation is implemented

Internal arrays

When running PDAF_init, a DA method allocates in its routine PDAF_X_alloc several arrays of the PDAF infrastructure. These arrays are declared in PDAF_mod_core. These arrays remain allocated throughout the assimilation process.

For the processes that compute the analysis (those with filterpe=.true.) the following arrays are defined:

Array Dimension Comment
state dim_p State vector. Used in all DA methods. Inside the filter code, it's usually called state_p to indicate parallelization.
ens dim_p x dim_ens Ensemble array. Used in all DA methods. Inside some filters the name is ens_p to indicate parallelization.
Ainv (dim_ens-1) x (dim_ens-1) for SEIK and ESTKF; dim_ens x dim_ens for ETKF Transform matrix U^-1 (in SEIK) or A^-1 (in ETKF, ESTKF). Not used in EnKF.
sens dim_p x dim_ens x dim_lag Ensemble array for smoothing, storing the ensembles of previous analysis steps. Only used if DA method supports smoothing and dim_lag>0.
bias dim_bias_p Bias vector. Only used if a DA method implements bias estimation.

For the processes that only compute model forecasts but are not involved in the analysis step (i.e. filterpe=.false.), only the following arrays are defined:

Array Dimension Comment
ens dim_p x dim_ens_l Ensemble array on processes that are not filter processes. Used in all DA methods.
state dim_p State vector. Only used if a state vector is integrated separately from the ensemble states.

PDAF provides the routine PDAF_alloc to perform the allocation of these arrays, thus PDAF_X_alloc calls PDAF_alloc providing dimension information for the allocation. For more information see PDAF_alloc.
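
To illustrate the kind of persistent allocation described here, the following self-contained sketch allocates a state vector, an ensemble array, and a transform matrix and checks the allocation status. It does not call PDAF_alloc; the program name, the example dimensions, and the error flag value are assumptions made only for this illustration.

```fortran
PROGRAM demo_alloc
  IMPLICIT NONE
  INTEGER, PARAMETER :: dim_p = 4     ! Process-local state dimension (example value)
  INTEGER, PARAMETER :: dim_ens = 3   ! Ensemble size (example value)
  REAL, ALLOCATABLE :: state(:)       ! State vector
  REAL, ALLOCATABLE :: ens(:,:)       ! Ensemble array
  REAL, ALLOCATABLE :: Ainv(:,:)      ! Transform matrix, sized dim_ens x dim_ens as for ETKF
  INTEGER :: allocstat                ! Status returned by ALLOCATE
  INTEGER :: outflag                  ! Error flag of this sketch (hypothetical value 1)

  outflag = 0

  ! Allocate each array and record a failure in the error flag
  ALLOCATE(state(dim_p), STAT=allocstat)
  IF (allocstat /= 0) outflag = 1
  ALLOCATE(ens(dim_p, dim_ens), STAT=allocstat)
  IF (allocstat /= 0) outflag = 1
  ALLOCATE(Ainv(dim_ens, dim_ens), STAT=allocstat)
  IF (allocstat /= 0) outflag = 1

  IF (outflag == 0) THEN
     ! Initialize the arrays so that they hold defined values
     state = 0.0
     ens = 0.0
     Ainv = 0.0
     WRITE (*,'(a,2i4)') 'shape(ens)  = ', SHAPE(ens)
     WRITE (*,'(a,2i4)') 'shape(Ainv) = ', SHAPE(Ainv)
  ELSE
     WRITE (*,'(a)') 'Allocation failed'
  END IF
END PROGRAM demo_alloc
```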

Configuration routines of DA method

The configuration routines are those included in the module PDAF_X in file PDAF_X.F90.

When a filter algorithm is added, the following routines have to be implemented, and a call to each of them has to be inserted into the corresponding interface routine described above.

These routines are contained in the module PDAF_X.

The templates in the sub-directories of templates/analysis_step provide commented template source code to assist in implementing a new DA method.

Module PDAF_X

This module contains the different configuration routines.

In the header part of the module, the variables controlling a DA method are declared. These are independent for each DA method. Here, one has the freedom to declare any required parameter. They have to be set using the routines PDAF_X_set_iparam and PDAF_X_set_rparam.

PDAF_X_init

The routine PDAF_X_init performs the initialization of filter-specific parameters.

The routine usually performs the following operations:

  • Print a message with information on the DA method
  • Initialize the PDAF-internal parameter variables specific for the DA method from the provided values of subtype, param_int, and param_real.
  • Set the logical flags ensemblefilter and fixedbasis.

The existing implementations also include some screen output about the configuration.

  • Call PDAF_X_set_iparam in a loop from index 3 to dim_pint to initialize integer parameters for the DA method
  • Call PDAF_X_set_rparam in a loop from index 3 to dim_preal to initialize real-valued parameters for the DA method
  • Initialize different flags and values
    • localfilter: Set to (1) for domain-localized methods, (0) otherwise
    • fixedbasis: Set to .true. for ensemble OI schemes
    • ensemblefilter: Only .false. for mode-based filters; can be omitted if .true.
    • dim_lag: Usually =0 since smoothing is activated by an integer parameter set with PDAF_X_set_iparam
    • observe_ens: Usually .false. since this option is commonly set with integer parameter 8 (see page on available options)
  • Check if the provided value of subtype is valid. If this is not the case, the error flag should be set to 3.
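
The steps listed above can be sketched in a strongly simplified, self-contained form. The module demo_x_cfg, the toy parameter storage, the reduced argument list, and the error value 8 are hypothetical; only the loop starting at index 3 and the error value 3 for an invalid subtype follow the description on this page.

```fortran
MODULE demo_x_cfg
  IMPLICIT NONE
  INTEGER :: iparams(10) = 0   ! Toy storage for integer parameters (hypothetical)
  INTEGER :: localfilter = 0   ! 1 for domain-localized methods, 0 otherwise
CONTAINS
  ! Minimal setter; a real method would map each index to a named variable
  SUBROUTINE demo_x_set_iparam(id, val, flag)
    INTEGER, INTENT(in)  :: id, val
    INTEGER, INTENT(out) :: flag
    flag = 0
    IF (id >= 1 .AND. id <= SIZE(iparams)) THEN
       iparams(id) = val
    ELSE
       flag = 8   ! Hypothetical error value for an invalid index
    END IF
  END SUBROUTINE demo_x_set_iparam

  SUBROUTINE demo_x_init(subtype, dim_pint, param_int, outflag)
    INTEGER, INTENT(in)  :: subtype               ! Sub-type of the DA method
    INTEGER, INTENT(in)  :: dim_pint              ! Number of integer parameters
    INTEGER, INTENT(in)  :: param_int(dim_pint)   ! Integer parameter array
    INTEGER, INTENT(out) :: outflag               ! Status flag
    INTEGER :: i, flag

    outflag = 0
    WRITE (*,'(a)') 'Initialize demo method X'

    ! Indices 1 and 2 hold dim_p and dim_ens, so the loop starts at 3
    DO i = 3, dim_pint
       CALL demo_x_set_iparam(i, param_int(i), flag)
       IF (flag /= 0) outflag = flag
    END DO

    localfilter = 0   ! This toy method is a global filter

    ! The toy method is assumed to support only subtypes 0 and 1
    IF (subtype < 0 .OR. subtype > 1) outflag = 3
  END SUBROUTINE demo_x_init
END MODULE demo_x_cfg

PROGRAM demo_init
  USE demo_x_cfg, ONLY: demo_x_init
  IMPLICIT NONE
  INTEGER :: outflag

  CALL demo_x_init(0, 5, (/ 4, 3, 0, 0, 1 /), outflag)
  WRITE (*,'(a,i2)') 'outflag for valid subtype:   ', outflag
  CALL demo_x_init(5, 5, (/ 4, 3, 0, 0, 1 /), outflag)
  WRITE (*,'(a,i2)') 'outflag for invalid subtype: ', outflag
END PROGRAM demo_init
```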

PDAF_X_alloc

The routine PDAF_X_alloc allocates arrays for the DA method. These are the arrays that are used in the forecasting and hence need to be persistently allocated, like the ensemble array and a state vector. The success of the allocation is checked.

For the arrays provided by the PDAF framework, the allocation is usually done by calling PDAF_alloc, see page on PDAF_alloc.

PDAF_X_config

This routine prints the information on the chosen configuration. It is called at the end of the configuration phase when PDAF_init_forecast is called for the online mode, or one of the 'assim_offline' routines (e.g. PDAF3_assim_offline) for the offline mode.

The template shows the typical form of the output.

Note that a DA method can also run without implementing this routine. However, for consistency it should be implemented.

PDAF_X_set_iparam

This routine sets integer parameters for the DA method. It is called either by PDAF_X_init or by PDAF_set_iparam via PDAF_set_iparam_filters.

The parameters in PDAF are linked to an index. This index is explicitly used when one creates the integer parameter array in the user code (filter_param_i in the template code and tutorials). In PDAF_X_set_iparam one uses the index id to select which parameter is initialized.

There are a few parameters which are used with the same id for all DA methods. We recommend sticking to these indices to avoid confusion:

id parameter Description
1 dim_p State dimension, set via PDAF_reset_dim_p
2 dim_ens Ensemble size, set via PDAF_reset_dim_ens
8 observe_ens Whether to apply the observation operator to the ensemble mean or to all ensemble states
9 type_obs_init Whether to initialize observations (0) before or (1) after prepoststep

For additional parameters, we recommend the following indices. The 3D-Var methods use partly different indices because the solvers have to be configured:

id parameter Description
3 dim_lag Smoother lag
5 type_forget Type for ensemble inflation
6 type_trans Type of ensemble transformation

For further parameters we recommend to check the page on available options for each DA method.
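
The mapping from the index id to a parameter variable can be sketched as follows. This is a minimal illustration with a hypothetical module demo_x_params and a hypothetical error value 8; only the indices 3, 5, and 6 and their parameter names are taken from the table above.

```fortran
MODULE demo_x_params
  IMPLICIT NONE
  INTEGER :: dim_lag = 0       ! Smoother lag (index 3)
  INTEGER :: type_forget = 0   ! Type of ensemble inflation (index 5)
  INTEGER :: type_trans = 0    ! Type of ensemble transformation (index 6)
CONTAINS
  SUBROUTINE demo_x_set_iparam(id, val, flag)
    INTEGER, INTENT(in)  :: id     ! Index of the parameter
    INTEGER, INTENT(in)  :: val    ! Value to assign
    INTEGER, INTENT(out) :: flag   ! 0 on success, 8 (hypothetical) on error

    flag = 0
    SELECT CASE (id)
    CASE (1, 2)
       ! dim_p and dim_ens are not changed in this sketch
    CASE (3)
       IF (val >= 0) THEN
          dim_lag = val
       ELSE
          flag = 8   ! A negative lag is rejected
       END IF
    CASE (5)
       type_forget = val
    CASE (6)
       type_trans = val
    CASE DEFAULT
       flag = 8      ! Index not used by this toy method
    END SELECT
  END SUBROUTINE demo_x_set_iparam
END MODULE demo_x_params

PROGRAM demo_set_iparam
  USE demo_x_params
  IMPLICIT NONE
  INTEGER :: flag

  CALL demo_x_set_iparam(3, 2, flag)
  WRITE (*,'(a,i2,a,i2)') 'dim_lag = ', dim_lag, '  flag = ', flag
  CALL demo_x_set_iparam(42, 1, flag)
  WRITE (*,'(a,i2)') 'flag for unused index = ', flag
END PROGRAM demo_set_iparam
```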

PDAF_X_set_rparam

This routine sets real-valued parameters for the DA method. It is called either by PDAF_X_init or by PDAF_set_rparam via PDAF_set_rparam_filters.

The parameters in PDAF are linked to an index. This index is explicitly used when one creates in the user code the real parameter array (filter_param_r in the template code and tutorials). In PDAF_X_set_rparam one uses the index id to select which parameter is initialized.

There is one parameter which is used with the same id for all DA methods:

id parameter Description
1 forget Value of inflation parameter (usually the 'forgetting factor')

Some DA methods, in particular the nonlinear filters and the 3D-Var methods, use additional real-valued parameters. We recommend checking the page on available options for each DA method to obtain an overview.

PDAF_X_options

The routine PDAF_X_options displays information on the available options for the filter algorithm.

The template shows the typical form of the output.

Note that a DA method can also run without implementing this routine. However, for consistency it should be implemented.

PDAF_X_memtime

The routine PDAF_X_memtime displays information about allocated memory and the execution time of different parts of the filter algorithm.

PDAF provides prepared timers in the different framework routines. For the analysis step, one can add more timers to obtain detailed runtime information. Timers with an index between 53 and 66 can be freely used.

The timing operations are implemented using the module PDAF_timer, which provides the function PDAF_timeit. Memory allocation is computed using PDAF_memcount, which is provided by the module PDAF_memcounting.

Analysis routines of DA method

PDAF_assimilate_X

This routine is called by PDAF3_assimilate and by others of the advanced 'assimilate' routines (see further below for information on how to integrate a new DA method into the advanced interface routines). It provides the full interface in which all user-supplied routines are specified as arguments. For detailed information on using the routines with the full interface, please see the Page on Implementing the Analysis Step using PDAF's full interface.

Except for the status flag outflag, all arguments of PDAF_assimilate_X are the names of user-provided call-back routines.

The routine is an infrastructure routine which is nearly identical for all DA methods. The main functionality is to count time steps during the forecast time, perform possible operations during the forecast (for example apply incremental updates), and then call PDAF_put_state_X when the forecast of a state from the ensemble is complete. Afterwards, the routine PDAF_get_state is called to initialize the next forecast phase.
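
The step-counting logic described above can be sketched in a strongly reduced form. The program below is a toy model of the control flow only: the routine demo_assimilate, the fixed forecast length, and the counters are hypothetical, and the places where PDAF_put_state_X and PDAF_get_state would be called are marked by comments rather than actual calls.

```fortran
PROGRAM demo_assimilate_flow
  IMPLICIT NONE
  INTEGER, PARAMETER :: nsteps_forecast = 3   ! Time steps per forecast phase (example value)
  INTEGER :: cnt_steps    ! Time steps counted in the current forecast phase
  INTEGER :: n_analysis   ! Number of analysis steps performed so far
  INTEGER :: step

  cnt_steps = 0
  n_analysis = 0

  ! The model calls the assimilation routine once after each time step
  DO step = 1, 7
     CALL demo_assimilate(cnt_steps, n_analysis)
  END DO

  WRITE (*,'(a,i3)') 'Number of analysis steps: ', n_analysis

CONTAINS

  SUBROUTINE demo_assimilate(cnt, nana)
    INTEGER, INTENT(inout) :: cnt    ! Step counter
    INTEGER, INTENT(inout) :: nana   ! Analysis counter

    ! Count the time step that was just computed by the model
    cnt = cnt + 1

    IF (cnt == nsteps_forecast) THEN
       ! Forecast phase complete: here the put_state routine of the
       ! DA method would be called, followed by the get_state routine
       ! that initializes the next forecast phase
       nana = nana + 1
       cnt = 0
    END IF
  END SUBROUTINE demo_assimilate

END PROGRAM demo_assimilate_flow
```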

The template marks the lines which are generic and those which are specific to a DA method. Specific is the call to PDAF_put_state_X, both with regard to its name and to its arguments. Thus, when implementing a new DA method one has to adapt the argument list to those call-back routines that are used by the DA method.

Most of the specified call-back routines are used for all DA methods. For example, the interface in the template for the global ensemble filter (templates/analysis_step/global/PDAF_assimilate_GLOBALTEMPLATE.F90) is:

  SUBROUTINE PDAF_assimilate_GLOBALTEMPLATE(U_collect_state, U_distribute_state, &
       U_init_dim_obs, U_obs_op, U_init_obs, U_prodRinvA, &
       U_init_obsvar, U_next_observation, U_prepoststep, outflag)

Here, U_ marks the user-provided call-back routines. This interface is identical to that in PDAF_assimilate_etkf.F90, PDAF_assimilate_estkf.F90 and PDAF_assimilate_seik.F90.

The only routines that are specific to the DA method are U_prodRinvA and U_init_obsvar. Thus, only these should be changed. For example, some DA methods do not use the product of R^-1 with some matrix, which is provided by U_prodRinvA, but compute an observation likelihood. Then one would rather use U_likelihood.

The generic routines, which are always needed, are:

U_collect_state Write model fields into a state vector
U_distribute_state Write a state vector into model fields
U_init_dim_obs Initialize observations, return observation dimension to PDAF. With PDAF-OMI, this is init_dim_obs_pdafomi.
U_obs_op Observation operator, return observed model state to PDAF. With PDAF-OMI, this is obs_op_pdafomi.
U_init_obs Return vector of observations to PDAF. When using PDAF-OMI, this routine is not visible to the user.
U_prodRinvA Compute product of R^-1 with some matrix A. Return R^-1 A to PDAF. When using PDAF-OMI, this routine is not visible to the user.
U_init_obsvar Return mean observation error variance to PDAF. When using PDAF-OMI, this routine is not visible to the user.
U_next_observation Return number of time steps in next forecast phase, current model time and exit flag to PDAF
U_prepoststep Pre/poststep routine

The domain-local filters (template/analysis_step/global/PDAF_assimilate_LOCALTEMPLATE.F90) have additional arguments to handle the localization of the state vectors and the localization of observations. However, they use the same generic routines listed above.

Some routines are hidden from the user when using the advanced interfaces like PDAF3_assimilate because they are provided by PDAF-OMI or PDAFlocal. However, the analysis routines are implemented without the assumption that PDAF-OMI or PDAFlocal are used.

To get an overview of the available user-supplied call-back routines, you can look into the file src/PDAF_cb_procedures.F90, which declares the interfaces of all call-back routines.

PDAF_assim_offline_X

This routine is the counterpart of PDAF_assimilate_X for the offline-coupled implementation. It is called by PDAF3_assim_offline and by others of the advanced 'assimilate' routines.

The routine is contained in the same file as PDAF_assimilate_X and is also an infrastructure routine. The routine is simpler, since no timestepping is done for the offline-coupled mode. As such the routine only writes out the configuration information and then calls the specific routine PDAFX_update for the analysis step.

Except for the status flag outflag, all arguments of PDAF_assim_offline_X are the names of user-provided call-back routines. The user-supplied call-back routines specified as arguments are the same as for PDAF_assimilate_X, except that the routines U_collect_state, U_distribute_state, and U_next_observation are not present because these routines are only used for the online coupling.

For implementing a new DA method, one mainly needs to adapt the call to PDAFX_update and the related specific call-back routines.

PDAF_put_state_X

This routine is the third infrastructure routine. It is usually called by PDAF_assimilate_X. PDAF3_put_state and others of the advanced 'assimilate' routines also call this routine.

It provides the full interface in which all user-supplied routines are specified as arguments. For detailed information on using the routines with the full interface, please see the Page on Implementing the Analysis Step using PDAF's full interface.

The routine is an infrastructure routine which is nearly identical for all DA methods. Its functionality is to write the forecasted fields into a state vector from the ensemble array, and check for the completeness of the forecast phase (particularly relevant for the 'flexible parallelization' variant). When the analysis has to be computed, the routine gathers the ensemble on the filter processes and then calls PDAFX_update for the analysis step.

Except for the status flag outflag, all arguments of PDAF_put_state_X are the names of user-provided call-back routines. The user-supplied call-back routines specified as arguments are the same as for PDAF_assimilate_X, except that the routines U_distribute_state and U_next_observation are not present because these routines are already used in PDAF_assimilate_X or in PDAF_get_state in implementations for PDAF2.

For implementing a new DA method, one mainly needs to adapt the call to PDAFX_update and the related specific call-back routines.

PDAFX_update

This is the main routine for the data assimilation update, i.e. the analysis step. The routine initializes the observations by calling PDAFobs_init. It also calls U_prepoststep before and after the actual analysis update.

For the domain-local filters, the local analysis loop is also performed in this routine. In this loop, the local state vector and the local observations are initialized before the routine for the actual local analysis update is called.
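
The general pattern of such a local analysis loop can be sketched with a self-contained toy program. The decomposition into equally sized contiguous local domains, the routine demo_local_analysis, and its placeholder update are assumptions for illustration; the handling of local observations is omitted.

```fortran
PROGRAM demo_local_loop
  IMPLICIT NONE
  INTEGER, PARAMETER :: dim_p = 6     ! Process-local state dimension (example value)
  INTEGER, PARAMETER :: dim_ens = 2   ! Ensemble size (example value)
  INTEGER, PARAMETER :: dim_l = 2     ! Local state dimension (example value)
  INTEGER, PARAMETER :: n_domains_p = dim_p / dim_l   ! Number of local domains
  REAL :: ens_p(dim_p, dim_ens)       ! Full ensemble on this process
  REAL :: ens_l(dim_l, dim_ens)       ! Ensemble on one local domain
  INTEGER :: domain_p, off, i

  ens_p = 0.0

  DO domain_p = 1, n_domains_p
     ! Offset of the local domain in the full state vector
     off = (domain_p - 1) * dim_l

     ! Initialize the local ensemble from the full ensemble
     ens_l = ens_p(off+1:off+dim_l, :)

     ! Compute the local analysis update
     CALL demo_local_analysis(domain_p, ens_l)

     ! Write the updated local ensemble back into the full ensemble
     ens_p(off+1:off+dim_l, :) = ens_l
  END DO

  DO i = 1, dim_p
     WRITE (*,'(a,i2,a,2f6.1)') 'row ', i, ': ', ens_p(i, :)
  END DO

CONTAINS

  ! Placeholder for the local analysis: adds the domain index to all values
  SUBROUTINE demo_local_analysis(idomain, ens_loc)
    INTEGER, INTENT(in) :: idomain
    REAL, INTENT(inout) :: ens_loc(:,:)
    ens_loc = ens_loc + REAL(idomain)
  END SUBROUTINE demo_local_analysis

END PROGRAM demo_local_loop
```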

The arguments of this routine are the call-back routines which are needed for the analysis step, and the framework arrays from PDAF (i.e. state_p, ens_p, Ainv, for smoothers also sens_p) that are used in the analysis step.

The structure of the operations in PDAFX_update can be designed freely when implementing a new DA method. To keep the initializations clearly separate from the numerical calculations, we recommend the provided structure in which the actual analysis update is computed in PDAFX_analysis. Note that it is possible to have multiple PDAFX_analysis routines with unique names. Then, one can call different analysis algorithms according to the subtype. This is, for example, used for the EnSRF and EAKF, which are implemented as different subtypes in src/PDAF_assimilate_ensrf.F90.

The template files PDAF_GLOBALTEMPLATE_update.F90 and PDAF_LOCALTEMPLATE_update.F90 explain the typical steps, which should be usable for other DA methods.

PDAFX_analysis

This routine computes the actual analysis update. Thus, the actual numerics are in this routine. For ensemble-based Kalman filters and nonlinear filters, this is mainly linear algebra. For these, we recommend using BLAS and LAPACK library functions, because they usually lead to optimal compute performance.
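
As an illustration of the kind of linear algebra involved, the following self-contained sketch computes the ensemble mean and the ensemble perturbations and then applies a transform matrix to the perturbations. It is not one of the PDAF analysis routines: the matrix W is a placeholder (here the identity), the mean is left unchanged, and Fortran intrinsics are used instead of BLAS so that the example compiles without external libraries. In a real implementation the matrix product would typically be done with a BLAS routine.

```fortran
PROGRAM demo_analysis
  IMPLICIT NONE
  INTEGER, PARAMETER :: dim_p = 3     ! State dimension (example value)
  INTEGER, PARAMETER :: dim_ens = 4   ! Ensemble size (example value)
  REAL :: ens_p(dim_p, dim_ens)       ! Forecast ensemble, overwritten by the result
  REAL :: state_p(dim_p)              ! Ensemble mean state
  REAL :: pert(dim_p, dim_ens)        ! Ensemble perturbations
  REAL :: W(dim_ens, dim_ens)         ! Placeholder transform matrix
  INTEGER :: i, member

  ! Fill a toy forecast ensemble
  DO member = 1, dim_ens
     DO i = 1, dim_p
        ens_p(i, member) = REAL(i) + 0.1 * REAL(member)
     END DO
  END DO

  ! Ensemble mean and perturbations
  state_p = SUM(ens_p, DIM=2) / REAL(dim_ens)
  pert = ens_p - SPREAD(state_p, DIM=2, NCOPIES=dim_ens)

  ! Placeholder transform: the identity matrix leaves the ensemble unchanged.
  ! A DA method would compute W from the observations and the ensemble.
  W = 0.0
  DO member = 1, dim_ens
     W(member, member) = 1.0
  END DO

  ! Transformed ensemble: mean plus transformed perturbations
  ens_p = SPREAD(state_p, DIM=2, NCOPIES=dim_ens) + MATMUL(pert, W)

  WRITE (*,'(a,3f8.3)') 'mean state: ', state_p
  DO member = 1, dim_ens
     WRITE (*,'(a,i2,a,3f8.3)') 'member ', member, ': ', ens_p(:, member)
  END DO
END PROGRAM demo_analysis
```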

The provided templates contain some of the common parts of the analysis step computation, which are explained step by step. This is based on the computations in the ETKF and LETKF methods.

Note that for the 3D-Var schemes, different solver algorithms are used. These are organized in further subroutines (see e.g. src/PDAF_3dvar_analysis_cvt.F90).

Integration into PDAF3 interface

The PDAF3 interface routines provide the advanced interface in which functionality is provided by PDAF-OMI and PDAFlocal, so that the interface routines have a minimum number of arguments. This also allowed us to provide the universal routines PDAF3_assimilate and PDAF3_assim_offline (see the Page on implementing the analysis step in PDAF3 and the Page on the PDAF3 interface).

A new DA method should be integrated into the PDAF3 interface routines. If the new DA method uses only call-back routines that are already available and defined in src/PDAF_cb_procedures.F90, one can directly add the call to, e.g., PDAF_assimilate_X into PDAF3_assimilate (in the file src/PDAF3_assimilate_ens.F90) and the call to PDAF_assim_offline_X into PDAF3_assim_offline (in the file src/PDAF3_assim_offline_ens.F90).

For 3D-Var methods there are separate files src/PDAF3_assimilate_3dvars.F90 and src/PDAF3_assim_offline_3dvar.F90.

If one has introduced a new call-back routine for the DA method, one should check whether it can be integrated into PDAF-OMI (if it is related to observations) or into PDAFlocal (if it is related to state localization). If this is not the case, the DA method is most likely not compatible with the universal interface routines. In this case, one has to create new interface routines as counterparts to PDAF3_assimilate and PDAF3_assim_offline. An example of such a routine is PDAF3_assimilate_lenkf in src/PDAF3_assimilate_ens.F90, which contains the additional routine localize_covar_pdaf for covariance localization. However, the LEnKF is also an example showing that one might be able to avoid additional call-back routines. The LEnKF is also integrated in the universal PDAF3 interface routine PDAF3_assimilate, despite the additional routine localize_covar_pdaf. For this we created a call-back routine within PDAF-OMI which is used in combination with the routine PDAFomi_set_localize_covar. This routine is called in the observation-initialization routine (U_init_dim_obs) of each PDAF-OMI observation module and provides PDAF-OMI with the information necessary to perform the covariance localization.

Last modified on Jun 5, 2025, 7:56:43 PM