= Offline Mode: Initializing PDAF =
[[PageOutline(2-3,Contents of this page)]]
== Overview ==
The PDAF release provides example code for the offline mode in `tutorial/offline_2D_parallel`. We refer to this code here and recommend using it as a basis for your own implementation.
The routine `PDAF_init` is called to initialize PDAF. This call sets parameters for the data assimilation, chooses the data assimilation method, and initializes the ensemble. In the tutorial code in `tutorial/offline_2D_parallel`, we collect the initialization of all variables required for the call to `PDAF_init` in the single subroutine `init_pdaf_offline`, which keeps the code clean. The file `init_pdaf_offline.F90` in `templates/offline` provides a commented template for this routine, which can be used as the basis of the implementation.
`PDAF_init` itself calls a user-supplied call-back routine to initialize the ensemble of model states. In the example, this is the routine in the file `init_ens_offline.F90`.
== Using `init_pdaf_offline` ==
In the offline mode, the routine `init_pdaf_offline` is executed after the initialization of the parallelization. (Note: In the main program of the example implementation (`main_offline.F90`), we added a call to a routine `initialize` in between for clarity of the implementation. This routine simply initializes the size of the 2-dimensional model grid and then the size of the model state.)
In the routine `init_pdaf_offline`, a number of variables are defined that are used in the call to `PDAF_init` as described below. There are also a few variables that are initialized in `init_pdaf_offline` but not used in the call to `PDAF_init`. These variables are specific to the data assimilation system and are shared only among the user-supplied routines. For the tutorial example, these variables are described below in the section '[#Othervariablesfortheassimilation Other variables for the assimilation]'.
The example implementation and the template version allow all variables to be set through a command line parser. This method provides a convenient way to define an experiment and could also be used for other models. The parser module is provided by the file `tutorial/offline_2D_serial/parser_mpi.F90`.
== Required arguments for `PDAF_init` ==
The call to `PDAF_init` has the following structure:
{{{
CALL PDAF_init(filtertype, subtype, step_null, &
filter_param_i, length_filter_param_i, &
filter_param_r, length_filter_param_r, &
COMM_model, COMM_filter, COMM_couple, &
task_id, n_modeltasks, filterpe, &
init_ens_pdaf, screen, status_pdaf)
}}}
The required variables are the following:
* `filtertype`:[[BR]] An integer defining the type of filter algorithm. Available are:
  * 1: SEIK
  * 2: EnKF
  * 3: LSEIK
  * 4: ETKF
  * 5: LETKF
  * 6: ESTKF
  * 7: LESTKF
  * 8: LEnKF
  * 9: NETF
  * 10: LNETF
  * 11: LKNETF
  * 12: PF
  * 13: ENSRF/EAKF
  * 100: GENOBS
  * 200: 3DVAR
* `subtype`:[[BR]] An integer defining the sub-type of the filter algorithm.
* `step_null`:[[BR]] Always 0 for the offline mode.
* `filter_param_i`:[[BR]] Integer array collecting options for PDAF. The first two entries are mandatory and have the same meaning for all filters. Further entries are optional (see example code). The mandatory entries are, in this order:
  * The size of the local state vector for the current process.
  * The ensemble size for all ensemble-based filters (or the rank of the state covariance matrix for mode-based filters like SEEK).
* `length_filter_param_i`:[[BR]] An integer defining the length of the array `filter_param_i`. The entries in the array are parsed up to this index.
* `filter_param_r`:[[BR]] Array of reals collecting real-valued options for PDAF. The first entry is mandatory and has the same meaning for all filters. Further entries are optional (see example code). The mandatory entry is:
  * The value of the forgetting factor (required to be larger than zero).
* `length_filter_param_r`:[[BR]] An integer defining the length of the array `filter_param_r`. The entries in the array are parsed up to this index.
* `COMM_model`:[[BR]] The communicator variable `COMM_model` as initialized by `init_parallel_pdaf`. If the model communicator is named differently in the actual program, the name has to be adapted.
* `COMM_filter`:[[BR]] The communicator variable `COMM_filter` as initialized by `init_parallel_pdaf`.
* `COMM_couple`:[[BR]] The communicator variable `COMM_couple` as initialized by `init_parallel_pdaf`.
* `task_id`:[[BR]] The index of the model task as initialized by `init_parallel_pdaf`.
* `n_modeltasks`:[[BR]] The number of model tasks as defined before the call to `init_parallel_pdaf`.
* `filterpe`:[[BR]] The flag showing if a process belongs to `COMM_filter` as initialized by `init_parallel_pdaf`.
* `init_ens_pdaf`:[[BR]] The name of the user-supplied routine that is called by `PDAF_init` to initialize the ensemble of model states. (See below: '[#User-suppliedroutineinit_ens_pdaf User-supplied routine init_ens_pdaf]')
* `screen`:[[BR]] An integer defining whether information output is written to the screen (i.e. standard output). The following choices are available:
  * 0: Quiet mode, no information is displayed
  * 1: Display standard information (recommended)
  * 2: As 1, plus display of timing information during the assimilation process
* `status_pdaf`:[[BR]] An integer used as status flag of PDAF. If `status_pdaf` is zero upon exit from `PDAF_init` the initialization was successful. An error occurred for non-zero values. (The error codes are documented in the routine PDAF_init.)
PDAF uses the two arrays `filter_param_i` and `filter_param_r` to specify integer and real-valued options, respectively. As described above, two integer values (state vector size, ensemble size) and one real value (forgetting factor for inflation) are mandatory. Additional options can be set by specifying a larger array and setting the corresponding size value (`length_filter_param_i`, `length_filter_param_r`). However, with PDAF V3.0 it can be more convenient to use the subroutines `PDAF_set_iparam` and `PDAF_set_rparam`, which are explained further below.
An overview of available integer and real-valued options for each DA method can be found on the page [wiki:AvailableOptionsforInitPDAF Available options for the different DA methods]. The available options for a specific DA method can also be displayed by running the assimilation program for the selected DA method with `subtype = -1`. (In the tutorial and template codes one can set `-subtype -1` on the command line.)
It is recommended to check the value of `status_pdaf` in the program after `PDAF_init` (and potentially `PDAF_set_iparam` and `PDAF_set_rparam`) has been executed. The initialization was successful only if its value is 0.
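As a minimal sketch of the steps described in this section, the following fragment fills the two option arrays with only the mandatory entries, calls `PDAF_init`, and checks the status flag. The wrapper name `sketch_init_pdaf`, the chosen values (ESTKF, forgetting factor 1.0), and the passing of the parallelization variables as arguments are assumptions for illustration; the tutorial code is organized differently. How the interface of `PDAF_init` is made available (for example through a `USE` statement) depends on the PDAF version and is not shown here; `init_pdaf_offline.F90` in `templates/offline` is the reference.
{{{
SUBROUTINE sketch_init_pdaf(dim_state_p, dim_ens, COMM_model, COMM_filter, &
     COMM_couple, task_id, n_modeltasks, filterpe)

  IMPLICIT NONE

  INTEGER, INTENT(in) :: dim_state_p    ! Size of the process-local state vector
  INTEGER, INTENT(in) :: dim_ens        ! Ensemble size
  INTEGER, INTENT(in) :: COMM_model, COMM_filter, COMM_couple
  INTEGER, INTENT(in) :: task_id, n_modeltasks
  LOGICAL, INTENT(in) :: filterpe

  EXTERNAL :: init_ens_pdaf             ! User-supplied call-back routine

  INTEGER :: filtertype, subtype, screen, status_pdaf
  INTEGER :: filter_param_i(2)          ! Only the mandatory integer entries
  REAL    :: filter_param_r(1)          ! Only the mandatory real entry
                                        ! (assumes the REAL kind PDAF was compiled with)

  filtertype = 6                        ! Example choice: ESTKF
  subtype    = 0                        ! Example choice of sub-type
  screen     = 1                        ! Standard information (recommended)
  status_pdaf = 0

  filter_param_i(1) = dim_state_p       ! Mandatory: local state dimension
  filter_param_i(2) = dim_ens           ! Mandatory: ensemble size
  filter_param_r(1) = 1.0               ! Mandatory: forgetting factor (> 0)

  CALL PDAF_init(filtertype, subtype, 0, &
       filter_param_i, 2, &
       filter_param_r, 1, &
       COMM_model, COMM_filter, COMM_couple, &
       task_id, n_modeltasks, filterpe, &
       init_ens_pdaf, screen, status_pdaf)

  ! A non-zero status indicates an error. A real MPI program would
  ! abort all processes here instead of using ERROR STOP.
  IF (status_pdaf /= 0) THEN
     WRITE (*,'(A,I0)') 'ERROR in PDAF_init, status_pdaf = ', status_pdaf
     ERROR STOP
  END IF

END SUBROUTINE sketch_init_pdaf
}}}
Passing the literal `0` for `step_null` reflects that this value is always 0 in the offline mode.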
== Other variables for the assimilation ==
The routine `init_pdaf_offline` in the example also initializes several variables that are not used in the call to `PDAF_init`. These variables control some functionality of the user-supplied routines for the data assimilation system and are shared with these routines through `mod_assimilation`. Examples of these variables are:
* `cradius`: Localization cut-off radius (here in grid points) for the local observation domain
* `sradius`: Support radius if the observation errors are weighted with distance (for `locweight>0`)
* `locweight`: Type of localizing weight
It is useful to define variables like these at this central position. Of course, this definition has to be adapted to the particular model used.
Apart from the generic variables for localization, we also specify variables that are specific to each observation type, for example
* `assim_A`: Flag whether to assimilate observations of type A
* `rms_obs_A`: Assumed observation error standard deviation of observation type A
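The idea of sharing such variables through a module can be sketched as follows. The module name `mod_assimilation_sketch`, the variable types, and the default values are assumptions for illustration; in the tutorial, the variables may be declared in different modules and are set in `init_pdaf_offline` or through the command line parser.
{{{
MODULE mod_assimilation_sketch

  IMPLICIT NONE
  SAVE

  ! Generic localization settings (illustrative default values)
  REAL    :: cradius   = 5.0     ! Cut-off radius in grid points
  REAL    :: sradius   = 5.0     ! Support radius, used for locweight > 0
  INTEGER :: locweight = 0       ! Type of localizing weight

  ! Settings specific to observation type A (illustrative default values)
  LOGICAL :: assim_A   = .TRUE.  ! Whether to assimilate observations of type A
  REAL    :: rms_obs_A = 0.5     ! Assumed observation error standard deviation

END MODULE mod_assimilation_sketch

! Example of a user-supplied routine accessing the shared variables
SUBROUTINE print_localization_settings()

  USE mod_assimilation_sketch, ONLY: cradius, sradius, locweight

  IMPLICIT NONE

  WRITE (*,'(A,F8.2)') ' cut-off radius: ', cradius
  WRITE (*,'(A,F8.2)') ' support radius: ', sradius
  WRITE (*,'(A,I4)')   ' locweight:      ', locweight

END SUBROUTINE print_localization_settings
}}}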
== User-supplied routine `init_ens_pdaf` ==
The user-supplied routine, which we named `init_ens_pdaf` here, is the call-back routine that is called by PDAF through the defined interface described below. The routine is called by all MPI processes that compute the filter analysis step (i.e. those for which `filterpe` is set to true; in the standard configuration of `init_parallel_pdaf` these are all processes of the first model task, i.e. `task_id=1`). `init_ens_pdaf` is only called by `PDAF_init` if no error occurred before, i.e. if the status flag is zero.
The interface can be looked up in the template and tutorial codes. It is the following:
{{{
SUBROUTINE init_ens_pdaf(filtertype, dim_p, dim_ens, &
state_p, Uinv, ens_p, flag)
}}}
with
* `filtertype`:[[BR]] The integer defining the type of filter algorithm as given in the call to `PDAF_init`
* `dim_p`:[[BR]] An integer holding the size of the state dimension for the calling process as specified in the call to `PDAF_init`
* `dim_ens`:[[BR]] An integer holding the size of the ensemble
* `state_p`:[[BR]] A real array of size (`dim_p`) for the local model state of the calling process (can be used freely for ensemble-based methods)
* `Uinv`:[[BR]] A real array of size (`dim_ens-1`, `dim_ens-1`) for the inverse of matrix '''U''' from the decomposition of the state error covariance matrix '''P''' = '''VUV^T^''' (Not relevant for ensemble-based methods)
* `ens_p`:[[BR]] A real array of size (`dim_p`, `dim_ens`), which has to be filled with the ensemble of model states.
* `flag`:[[BR]] Status flag for PDAF. It is 0 upon entry and can be set in the user-supplied routine, depending on the success of the ensemble initialization. Preferably, values above 102 should be used for failures to avoid conflicts with the error codes defined within `PDAF_init`.
=== Initialization for ensemble-based filters ===
Most data assimilation methods are based on ensembles (the only exception is the parameterized 3D-Var). For the ensemble-based methods, only the array `ens_p` needs to be filled with the ensemble of model states. If a model with domain decomposition is used, each MPI process has to initialize the full ensemble for its local sub-domain.
The arrays `state_p` and `Uinv` are allocated to their correct sizes because they are used during the assimilation cycles. They are not yet initialized, and these arrays may be used freely in the initialization. An exception is the EnKF, for which `Uinv` is allocated only with size (`1`,`1`) because `Uinv` is not used by the EnKF.
For the offline mode, one will usually read the ensemble states from output files of the model used to perform the ensemble integrations separately (i.e. 'offline'). Thus, one has to implement a reading routine for the model files.
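A minimal sketch of such a reading routine is shown below. It assumes a hypothetical file layout that is not the one used in the tutorial: one text file per ensemble member, named `ens_1.txt`, `ens_2.txt`, and so on, each holding the `dim_p` values of the process-local state. With domain decomposition, each process would have to read only the part belonging to its sub-domain, which is not handled here. The error codes 103 and 104 are arbitrary values above 102.
{{{
SUBROUTINE init_ens_pdaf(filtertype, dim_p, dim_ens, &
     state_p, Uinv, ens_p, flag)

  IMPLICIT NONE

  INTEGER, INTENT(in)    :: filtertype                 ! Type of filter
  INTEGER, INTENT(in)    :: dim_p                      ! Process-local state dimension
  INTEGER, INTENT(in)    :: dim_ens                    ! Ensemble size
  REAL,    INTENT(inout) :: state_p(dim_p)             ! Not initialized in this sketch
  REAL,    INTENT(inout) :: Uinv(dim_ens-1, dim_ens-1) ! Not used in this sketch
  REAL,    INTENT(out)   :: ens_p(dim_p, dim_ens)      ! Ensemble to be filled
  INTEGER, INTENT(inout) :: flag                       ! Status flag (0 on entry)

  INTEGER :: member, fileunit, ios
  CHARACTER(len=64) :: fname

  WRITE (*,'(A,I0)') ' --- read ensemble from files, size: ', dim_ens

  DO member = 1, dim_ens

     ! Hypothetical file name: ens_<member>.txt
     WRITE (fname, '(A,I0,A)') 'ens_', member, '.txt'

     OPEN (NEWUNIT=fileunit, FILE=TRIM(fname), STATUS='old', &
          ACTION='read', IOSTAT=ios)
     IF (ios /= 0) THEN
        flag = 103          ! File could not be opened
        RETURN
     END IF

     ! List-directed read of dim_p values into one ensemble column
     READ (fileunit, *, IOSTAT=ios) ens_p(:, member)
     CLOSE (fileunit)
     IF (ios /= 0) THEN
        flag = 104          ! File was too short or not readable
        RETURN
     END IF

  END DO

END SUBROUTINE init_ens_pdaf
}}}
Setting `flag` to a non-zero value reports the failure back to `PDAF_init`, so that the error shows up in `status_pdaf` in the calling program.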
== Setting additional options ==
In PDAF V3.0 we added the possibility to set options for PDAF after the call to `PDAF_init`. For this purpose, there are the subroutines
{{{
PDAF_set_iparam(id, value, status_pdaf)
}}}
to set integer parameters and
{{{
PDAF_set_rparam(id, value, status_pdaf)
}}}
to set REAL (floating point) parameters. The arguments are
* `id`:[[BR]] The index value of a parameter
* `value`:[[BR]] The value of the parameter with index `id`
* `status_pdaf`:[[BR]] Status flag for PDAF. Both routines increment the input value. The increment is 0 if no error occurred. (This allows one to check `status_pdaf` once after all calls to `PDAF_init`, `PDAF_set_iparam`, and `PDAF_set_rparam`.)
An overview of available integer and real-valued options for each DA method can be found on the page [wiki:AvailableOptionsforInitPDAF Available options for the different DA methods]. The available options for a specific DA method can also be displayed by running the assimilation program for the selected DA method with `subtype = -1`. (In the tutorial and template codes one can set `-subtype -1` on the command line.)
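The following fragment sketches how the two routines could be used after `PDAF_init`, with a single check of the accumulated status at the end. The wrapper name `sketch_set_options` and its arguments are hypothetical; the parameter indices and values are passed in rather than fixed here, because they depend on the DA method and have to be taken from the overview page. As for `PDAF_init`, the way the interfaces are made available depends on the PDAF version and is not shown.
{{{
SUBROUTINE sketch_set_options(id_int, value_int, id_real, value_real, &
     status_pdaf)

  IMPLICIT NONE

  INTEGER, INTENT(in)    :: id_int       ! Index of an integer parameter
  INTEGER, INTENT(in)    :: value_int    ! Value for that parameter
  INTEGER, INTENT(in)    :: id_real      ! Index of a real parameter
  REAL,    INTENT(in)    :: value_real   ! Value for that parameter
  INTEGER, INTENT(inout) :: status_pdaf  ! Status as returned by PDAF_init

  ! Both routines increment status_pdaf; the increment is 0 without error
  CALL PDAF_set_iparam(id_int, value_int, status_pdaf)
  CALL PDAF_set_rparam(id_real, value_real, status_pdaf)

  ! One check covers PDAF_init and both calls above. A real MPI
  ! program would abort all processes instead of using ERROR STOP.
  IF (status_pdaf /= 0) THEN
     WRITE (*,'(A,I0)') 'ERROR in PDAF initialization, status_pdaf = ', &
          status_pdaf
     ERROR STOP
  END IF

END SUBROUTINE sketch_set_options
}}}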
== Testing the PDAF initialization ==
The PDAF initialization can be tested by compiling the assimilation program and executing it. (One can comment out the call to `PDAF3_assim_offline` to focus on the initialization.)
Standard output from PDAF_init should look like the following:
{{{
PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF +++ PDAF +++
PDAF +++ Parallel Data Assimilation Framework +++
PDAF +++ +++
PDAF +++ Version 3.0beta +++
PDAF +++ +++
PDAF +++ Please cite +++
PDAF +++ L. Nerger and W. Hiller, Computers and Geosciences +++
PDAF +++ 2013, 55, 110-118, doi:10.1016/j.cageo.2012.03.026 +++
PDAF +++ when publishing work resulting from using PDAF +++
PDAF +++ +++
PDAF +++ PDAF itself can also be cited as +++
PDAF +++ L. Nerger. Parallel Data Assimilation Framework +++
PDAF +++ (PDAF). Zenodo. 2024. doi:10.5281/zenodo.7861812 +++
PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF: Initialize filter
PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF +++ Error Subspace Transform Kalman Filter (ESTKF) +++
PDAF +++ +++
PDAF +++ Nerger et al., Mon. Wea. Rev. 140 (2012) 2335 +++
PDAF +++ doi:10.1175/MWR-D-11-00102.1 +++
PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF: Initialize Parallelization
PDAF Parallelization - Filter on model PEs:
PDAF Total number of PEs: 1
PDAF Number of parallel model tasks: 1
PDAF PEs for Filter: 1
PDAF # PEs per ensemble task and local ensemble sizes:
PDAF Task 1
PDAF #PEs 1
PDAF N 9
PDAF: Call ensemble initialization
Initialize state ensemble
--- read ensemble from files
--- Ensemble size: 9
PDAF: Initialization completed
}}}
The correctness of the ensemble initialization in the user-supplied routine (`U_init_ens`, named `init_ens_pdaf` here) should be checked by the user.
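One simple way to do such a check is to print a few statistics of the ensemble, for example at the end of the ensemble initialization routine. The helper below is a hypothetical sketch and not part of PDAF: it warns about NaN values and prints the range of the ensemble mean and the RMS ensemble spread. These numbers only allow a plausibility check; they do not prove that the ensemble is correct.
{{{
SUBROUTINE check_ensemble(dim_p, dim_ens, ens_p)

  USE, INTRINSIC :: ieee_arithmetic, ONLY: ieee_is_nan

  IMPLICIT NONE

  INTEGER, INTENT(in) :: dim_p                  ! Process-local state dimension
  INTEGER, INTENT(in) :: dim_ens                ! Ensemble size
  REAL,    INTENT(in) :: ens_p(dim_p, dim_ens)  ! Ensemble to be checked

  REAL, ALLOCATABLE :: mean_p(:)   ! Ensemble mean state
  REAL    :: var_sum               ! Sum of squared deviations from the mean
  INTEGER :: member

  IF (dim_p < 1 .OR. dim_ens < 1) RETURN

  IF (ANY(ieee_is_nan(ens_p))) THEN
     WRITE (*,'(A)') ' WARNING: ensemble contains NaN values'
  END IF

  ! Ensemble mean for each state element
  ALLOCATE(mean_p(dim_p))
  mean_p = SUM(ens_p, DIM=2) / REAL(dim_ens)
  WRITE (*,'(A,2ES12.4)') ' ensemble mean min/max: ', &
       MINVAL(mean_p), MAXVAL(mean_p)

  ! RMS spread: square root of the mean sample variance over all elements
  IF (dim_ens > 1) THEN
     var_sum = 0.0
     DO member = 1, dim_ens
        var_sum = var_sum + SUM((ens_p(:, member) - mean_p)**2)
     END DO
     WRITE (*,'(A,ES12.4)') ' RMS ensemble spread:   ', &
          SQRT(var_sum / (REAL(dim_ens - 1) * REAL(dim_p)))
  END IF

  DEALLOCATE(mean_p)

END SUBROUTINE check_ensemble
}}}
A spread of zero would indicate that all members are identical, which usually points to an error in reading the files.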