Version 5 (modified by lnerger, 5 months ago)

Offline Mode: Initialization of PDAF and the ensemble by PDAF_init (PDAF2)

Offline Mode: Implementation Guide for PDAF 2

  1. Main page
  2. Adaptation of the parallelization
  3. Initialization of PDAF
  4. Implementation of the analysis step
  5. Memory and timing information

This Implementation Guide describes the implementation of the offline mode as of PDAF V2.3. It is intended as a reference for existing implementations that predate the release of PDAF V3.0. For new implementations, we recommend following the updated Implementation Guide for PDAF 3.

Overview

After the initialization of the parallelization for the assimilation program, the initialization of PDAF has to be implemented. Internally to PDAF, the initialization is performed by the routine PDAF_init. Typically, we collect the initialization of all variables required for the call to PDAF_init in a single subroutine, which keeps the code clean. In the example in tutorial/offline_2D_serial the routine in the file init_pdaf_offline.F90 shows this strategy. The file init_pdaf.F90 in templates/offline_omi provides a commented template for this routine, which can be used as the basis of the implementation.

PDAF_init itself calls a user-supplied routine to initialize the ensemble of model states through its interface. In the example, this is the routine in the file init_ens_offline.F90.

Using init_pdaf

In the offline mode, the routine init_pdaf_offline is executed after the initialization of the parallelization. (Note: In the main program of the example implementation (main_offline.F90) we added a call to a routine initialize in between to keep the implementation clear. As no real model initialization is conducted, this routine simply initializes the size of the model state. This initialization could also be performed in init_pdaf_offline.)

In the routine init_pdaf_offline a number of variables are defined that are used in the call to PDAF_init, as described below in 'Required arguments for PDAF_init'. (Please note: All subroutines whose names start with PDAF_ are core routines of PDAF, while subroutines whose names end with _pdaf are generally user-supplied call-back routines.) There are also a few variables that are initialized in init_pdaf_offline but not used in the call to PDAF_init. These variables are specific to the data assimilation system and are only shared among the user-supplied routines. For the tutorial example, these variables are described below in the section 'Other variables for the assimilation'.

The PDAF offline mode is activated by calling the routine PDAF_set_offline_mode, described below. This routine is usually called at the end of the routine init_pdaf_offline.

The example implementation and the template version allow all variables to be set through a command line parser. This method provides a convenient way to define an experiment and could also be used for other models. The parser module is provided by the file tutorial/offline_2D_serial/parser_mpi.F90.
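
As a minimal sketch of this approach, the following routine reads two settings from the command line. It assumes that the module in parser_mpi.F90 is named parser and provides a generic routine parse(handle, variable), as in the tutorial code; please check the module source for the exact interface. The routine name read_config_sketch is hypothetical.

```fortran
SUBROUTINE read_config_sketch(dim_ens, filtertype)

  USE parser, ONLY: parse     ! Assumption: module provided by parser_mpi.F90

  IMPLICIT NONE

  INTEGER, INTENT(inout) :: dim_ens      ! Ensemble size
  INTEGER, INTENT(inout) :: filtertype   ! Type of filter algorithm

  CHARACTER(len=32) :: handle            ! Name of the command line option

  ! The incoming values serve as defaults. They are assumed to be
  ! overwritten only if the option is present on the command line.
  handle = 'dim_ens'
  CALL parse(handle, dim_ens)

  handle = 'filtertype'
  CALL parse(handle, filtertype)

END SUBROUTINE read_config_sketch
```

Under these assumptions, an experiment would be configured by appending options such as -dim_ens 9 -filtertype 6 to the command that starts the assimilation program.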

Required arguments for PDAF_init

The call to PDAF_init has the following structure:

CALL PDAF_init(filtertype, subtype, step_null, &
               filter_param_i, length_filter_param_i, &
               filter_param_r, length_filter_param_r, &
               COMM_model, COMM_filter, COMM_couple, &
               task_id, n_modeltasks, filterpe, &
               U_init_ens, screen, status_pdaf)

The required variables are the following:

  • filtertype: An integer defining the type of filter algorithm. Available are
    • 1: SEIK
    • 2: EnKF
    • 3: LSEIK
    • 4: ETKF
    • 5: LETKF
    • 6: ESTKF
    • 7: LESTKF
    • 8: LEnKF
    • 9: NETF
    • 10: LNETF
    • 11: LKNETF
    • 12: PF
    • 100: GENOBS
    • 200: 3DVAR
  • subtype: An integer defining the sub-type of the filter algorithm
  • step_null: Always 0 for the offline mode.
  • filter_param_i: Integer array collecting several variables for PDAF. The first two variables are mandatory and equal for all filters. Further variables are optional (see example code). The mandatory variables are in the following order:
    • The size of the local state vector for the current process.
    • The ensemble size for all ensemble-based filters (or the rank of the state covariance matrix for mode-based filters like SEEK)
  • length_filter_param_i: An integer defining the length of the array filter_param_i. The entries in the array are parsed up to this index.
  • filter_param_r: Array of reals collecting floating point variables for PDAF. The first variable is mandatory and equal for all filters. Further variables are optional (see example code). The mandatory variable is:
    • The value of the forgetting factor (required to be larger than zero)
  • length_filter_param_r: An integer defining the length of the array filter_param_r. The entries in the array are parsed up to this index.
  • COMM_model: The communicator variable COMM_model as initialized by init_parallel_pdaf. If the model communicator is named differently in the actual program, the name has to be adapted.
  • COMM_filter: The communicator variable COMM_filter as initialized by init_parallel_pdaf.
  • COMM_couple: The communicator variable COMM_couple as initialized by init_parallel_pdaf.
  • task_id: The index of the model tasks as initialized by init_parallel_pdaf.
  • n_modeltasks: The number of model tasks as defined before the call to init_parallel_pdaf.
  • filterpe: The flag showing if a process belongs to COMM_filter as initialized by init_parallel_pdaf.
  • U_init_ens: The name of the user-supplied routine that is called by PDAF_init to initialize the ensemble of model states. (See 'User-supplied routine U_init_ens'.)
  • screen: An integer defining whether information output is written to the screen (i.e. standard output). The following choices are available:
    • 0: quiet mode - no information is displayed.
    • 1: Display standard information (recommended)
    • 2: as 1 plus display of timing information during the assimilation process
  • status_pdaf: An integer used as status flag of PDAF. If status_pdaf is zero upon exit from PDAF_init the initialization was successful. An error occurred for non-zero values. (The error codes are documented in the routine PDAF_init.)

An overview of the available options for each filter can be found on the overview page on options.

It is recommended to check the value of status_pdaf in the program after PDAF_init has been executed. The initialization was successful only if its value is 0.
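
The following sketch puts these pieces together: it fills the mandatory entries of filter_param_i and filter_param_r, calls PDAF_init with the argument list shown above, and checks status_pdaf. Several names are assumptions for illustration: the routine name init_pdaf_sketch, the module name mod_parallel_pdaf as the holder of the variables set by init_parallel_pdaf, and init_ens_offline as the name of the user-supplied ensemble initialization. The filter choice (ESTKF with subtype 0) and the forgetting factor of 1.0 mirror the example output shown later on this page.

```fortran
SUBROUTINE init_pdaf_sketch(dim_state_p, dim_ens, screen)

  ! Assumption: module holding the variables set by init_parallel_pdaf
  USE mod_parallel_pdaf, &
       ONLY: COMM_model, COMM_filter, COMM_couple, &
       task_id, n_modeltasks, filterpe

  IMPLICIT NONE

  INTEGER, INTENT(in) :: dim_state_p   ! Size of the process-local state vector
  INTEGER, INTENT(in) :: dim_ens       ! Ensemble size
  INTEGER, INTENT(in) :: screen        ! Verbosity of PDAF output

  INTEGER :: filtertype         ! Type of filter algorithm
  INTEGER :: subtype            ! Sub-type of the filter algorithm
  INTEGER :: filter_param_i(2)  ! Integer parameters (mandatory entries only)
  REAL    :: filter_param_r(1)  ! Real parameters (mandatory entry only)
  INTEGER :: status_pdaf        ! Status flag returned by PDAF_init

  ! Note: REAL has to match the precision PDAF was compiled with

  EXTERNAL :: init_ens_offline  ! User-supplied ensemble initialization

  filtertype = 6                ! ESTKF
  subtype    = 0                ! Standard ESTKF
  filter_param_i(1) = dim_state_p
  filter_param_i(2) = dim_ens
  filter_param_r(1) = 1.0       ! Forgetting factor (has to be > 0)

  ! step_null is always 0 in the offline mode
  CALL PDAF_init(filtertype, subtype, 0, &
       filter_param_i, 2, &
       filter_param_r, 1, &
       COMM_model, COMM_filter, COMM_couple, &
       task_id, n_modeltasks, filterpe, &
       init_ens_offline, screen, status_pdaf)

  ! Stop if the initialization failed
  IF (status_pdaf /= 0) THEN
     WRITE (*,'(a,i6)') 'ERROR in PDAF_init, status flag: ', status_pdaf
     STOP 1    ! A parallel program would rather abort through MPI
  END IF

END SUBROUTINE init_pdaf_sketch
```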

Other variables for the assimilation

The routine init_pdaf in the example also initializes several variables that are not used to call PDAF_init. These variables control some functionality of the user-supplied routines for the data assimilation system and are shared with these routines through mod_assimilation. These variables are for example:

  • rms_obs: Assumed observation error
  • cradius: Localization cut-off radius (here in grid points) for the local observation domain
  • sradius: Support radius, if the observation errors are weighted with distance (for locweight>0)
  • locweight: Type of localizing weight

It is useful to define variables like these at this central position. Of course, this definition has to be adapted to the particular model used.
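
A minimal sketch of such a shared module is given below. The module name mod_assim_sketch and all values are placeholders for illustration; the module mod_assimilation of the tutorial contains further variables.

```fortran
MODULE mod_assim_sketch

  IMPLICIT NONE
  SAVE

  ! Placeholder values; they have to be adapted to the actual model
  REAL    :: rms_obs   = 0.5   ! Assumed observation error
  REAL    :: cradius   = 5.0   ! Localization cut-off radius (in grid points)
  REAL    :: sradius   = 5.0   ! Support radius for the weighting (locweight>0)
  INTEGER :: locweight = 0     ! Type of localizing weight

END MODULE mod_assim_sketch
```

A user-supplied routine would then access a variable with a statement like USE mod_assim_sketch, ONLY: rms_obs.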

User-supplied routine U_init_ens

The user-supplied routine, which we name U_init_ens here, is called by PDAF through the interface described below. The routine is called by all MPI processes that compute the filter analysis step, i.e. those for which filterpe is set to true. (In the standard configuration of init_parallel_pdaf these are all processes of the first model task, i.e. task_id=1.) U_init_ens is only called by PDAF_init if no error occurred before, i.e. if the status flag is still zero.

The interface is the following:

SUBROUTINE U_init_ens(filtertype, dim_p, dim_ens, &
                           state_p, Uinv, ens_p, flag)

with

  • filtertype: The integer defining the type of filter algorithm as given in the call to PDAF_init
  • dim_p: An integer holding the size of the state dimension for the calling process as specified in the call to PDAF_init
  • dim_ens: An integer holding the size of the ensemble (or the rank of the state covariance matrix for SEEK)
  • state_p: A real array of size (dim_p) for the local model state of the calling process (Only relevant for mode-based filters)
  • Uinv: A real array of size (dim_ens-1, dim_ens-1) for the inverse of matrix U from the decomposition of the state error covariance matrix P = VUV^T (Only relevant for mode-based filters like SEEK.)
  • ens_p: A real array of size (dim_p, dim_ens) that has to hold the ensemble of model states upon exit.
  • flag: Status flag for PDAF. It is 0 upon entry and can be set in the user-supplied routine, depending on the success of the ensemble initialization. Preferably, values above 102 should be used for failures to avoid conflicts with the error codes defined within PDAF_init.

In the initialization routine U_init_ens one has to distinguish between ensemble-based and mode-based filters. The only mode-based filter supplied with PDAF is SEEK, while all other methods are ensemble-based.

Initialization for ensemble-based filters

Generally, we work with ensemble-based filters (the only exception is the parameterized 3D-Var). For the ensemble-based filters only the array ens_p needs to be initialized with the ensemble of model states. If a model with domain decomposition is used, the full ensemble for the local sub-domain of the MPI process has to be initialized.

The arrays state_p and Uinv are allocated to their correct sizes because they are used during the assimilation cycles. They are not yet initialized, and these arrays may be used as work space in the initialization. An exception is the EnKF, for which Uinv is allocated only with size (1,1), because Uinv is not used by the EnKF.

For the offline mode, one will usually read the ensemble states from output files of the model used to perform the ensemble integrations separately (i.e. 'offline'). Thus, one has to implement a reading routine for the model files.
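
A minimal sketch of such a routine for an ensemble-based filter is shown below. It follows the interface given above and rests on several assumptions that are not part of PDAF: each ensemble member is stored in a plain-text file with the hypothetical name ens_<member>.txt that holds dim_p values, no domain decomposition is used, and the failure codes 103 and 104 are chosen arbitrarily above 102.

```fortran
SUBROUTINE init_ens_sketch(filtertype, dim_p, dim_ens, &
     state_p, Uinv, ens_p, flag)

  IMPLICIT NONE

  INTEGER, INTENT(in)    :: filtertype                 ! Type of filter algorithm
  INTEGER, INTENT(in)    :: dim_p                      ! Process-local state dimension
  INTEGER, INTENT(in)    :: dim_ens                    ! Ensemble size
  REAL,    INTENT(inout) :: state_p(dim_p)             ! Not initialized here
  REAL,    INTENT(inout) :: Uinv(dim_ens-1, dim_ens-1) ! Not initialized here
  REAL,    INTENT(out)   :: ens_p(dim_p, dim_ens)      ! Ensemble of model states
  INTEGER, INTENT(inout) :: flag                       ! Status flag

  INTEGER :: member             ! Counter for the ensemble members
  INTEGER :: ios                ! Status of the I/O operations
  INTEGER :: iunit              ! File unit
  CHARACTER(len=32) :: fname    ! Hypothetical file name

  DO member = 1, dim_ens

     ! Build the file name, e.g. ens_1.txt
     WRITE (fname, '(a,i0,a)') 'ens_', member, '.txt'

     OPEN (newunit=iunit, file=TRIM(fname), status='old', &
          action='read', iostat=ios)
     IF (ios /= 0) THEN
        flag = 103              ! File could not be opened
        RETURN
     END IF

     ! Read the state of one ensemble member into its column
     READ (iunit, *, iostat=ios) ens_p(:, member)
     CLOSE (iunit)
     IF (ios /= 0) THEN
        flag = 104              ! File content could not be read
        RETURN
     END IF

  END DO

END SUBROUTINE init_ens_sketch
```

With domain decomposition, each process would instead read only the part of every state that belongs to its local sub-domain.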

Activating the offline mode with PDAF_set_offline_mode

To use the offline mode, it has to be activated by calling PDAF_set_offline_mode. This is usually done at the end of init_pdaf. The routine has a simple interface as follows:

  SUBROUTINE PDAF_set_offline_mode(screen)

    INTEGER, INTENT(in) :: screen            ! Control verbosity of routine
                                             ! >0: display information output

Here, screen is usually the same variable that is used as argument in the call to PDAF_init.

Note:

  • Before PDAF V2.2 one had to activate the offline mode by setting subtype=5. This has been replaced by the call to PDAF_set_offline_mode to give users flexibility in specifying subtype. Up to PDAF V2.3.1 the use of subtype=5 was still possible. In PDAF V3.0 we removed this possibility and the call to PDAF_set_offline_mode is mandatory.

Testing the PDAF initialization

The PDAF initialization can be tested by compiling the assimilation program (without the later call to PDAF_put_state_*) and executing it. The Makefile of the model has to be extended to include the additional files. The core part of PDAF can be compiled separately as a library and can then simply be linked to the model code. This is the strategy followed in the PDAF package. One can test whether the initialization in PDAF_init is successful and whether the ensemble array is correctly initialized.

Standard output from PDAF_init should look like the following:

PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF    +++                       PDAF                         +++
PDAF    +++       Parallel Data Assimilation Framework         +++
PDAF    +++                                                    +++
PDAF    +++                  Version 2.3.1                     +++
PDAF    +++                                                    +++
PDAF    +++                   Please cite                      +++
PDAF    +++ L. Nerger and W. Hiller, Computers and Geosciences +++
PDAF    +++ 2013, 55, 110-118, doi:10.1016/j.cageo.2012.03.026 +++
PDAF    +++   when publishing work resulting from using PDAF   +++
PDAF    +++                                                    +++
PDAF    +++          PDAF itself can also be cited as          +++
PDAF    +++  L. Nerger. Parallel Data Assimilation Framework   +++
PDAF    +++  (PDAF). Zenodo. 2024. doi:10.5281/zenodo.7861812  +++
PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


PDAF: Initialize filter

PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF    +++ Error Subspace Transform Kalman Filter (ESTKF) +++
PDAF    +++                                                +++
PDAF    +++  Nerger et al., Mon. Wea. Rev. 140 (2012) 2335 +++
PDAF    +++           doi:10.1175/MWR-D-11-00102.1         +++
PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++

PDAF    ESTKF configuration
PDAF          filter sub-type = 0
PDAF            --> Standard ESTKF
PDAF            --> Deterministic ensemble transformation
PDAF            --> Use fixed forgetting factor: 1.00
PDAF            --> ensemble size:   40

PDAF: Initialize Parallelization
PDAF     Parallelization - Filter on model PEs:
PDAF                 Total number of PEs:      1
PDAF      Number of parallel model tasks:      1
PDAF                      PEs for Filter:      1
PDAF     # PEs per ensemble task and local ensemble sizes: 
PDAF     Task     1
PDAF     #PEs     1
PDAF        N    40

PDAF: Call routine for ensemble initialization

PDAF: Initialization completed
PDAF    Activate PDAF offline mode

The correctness of the ensemble initialization in U_init_ens should be checked by the user.
