wiki:InitPdaf

Context Navigation

Version 61 (modified by lnerger, 5 months ago) ( diff )
--

Initialization of PDAF and the ensemble by PDAF_init

Implementation Guide

Contents of this page

Overview
Inserting init_pdaf into the model code
Using init_pdaf
Calling PDAF_init
1. Note on available options
Other variables for the assimilation
User-supplied routine U_init_ens (init_ens_pdaf.F90)
Testing the PDAF initialization

Overview

The PDAF release provides example code for the online mode in tutorial/online_2D_parallelmodel and tutorial/online_2D_serialmodel. We refer to these codes to use it as a basis.

The actual initialization of PDAF is done in the subroutine PDAF_init. This routine sets the internal parameters for the data assimilation, chooses the data assimilation method, allocates the internal ensemble array, and initializes the ensemble. The subroutine init_pdaf, which is called from the model code, initializes all variables required for the call to PDAF_init. By using init_pdaf, only a single additional subroutine call has to be inserted to the model source code for the initialization. The file templates/online/init_pdaf.F90 provides a commented template for this routine, which can be used as the basis of the implementation.

PDAF_init itself calls a user-supplied call-back routine to initialize the ensemble of model states. In the example and templates, this routine can be found in init_ens_pdaf.F90.

Inserting `init_pdaf` into the model code

The right place to insert the routine init_pdaf into the model code is in between the initialization part of the model and the time stepping loop. At this point, the regular model initialization is completed, which allows PDAF to initialize the ensemble of model states.

Using `init_pdaf`

In init_pdaf a number of variables are defined that are used in the call to PDAF_init as described below. There are also a few variables that are initialized in init_pdaf for later use in user-supplied call-back routies. For the tutorial example, these variables are described below in the section 'Other variables for the assimilation'. The template file templates/online/init_pdaf.F90 provide some more functionality which can be adapted.

The example implementation and the template code allow to specify all options at run time using a command line parser. These options are specified as the combination -VARIABLE VALUE. This method provides a convenient way to define an experiment and could also be used for other models. The parser module is provided by the file tutorial/offline_2D_parallel/parser_mpi.F90. An alterantive could be to use a configuration file.

Calling `PDAF_init`

In the tutorial codes and the template, the call to PDAF_init is fully implemented. Here, we provide an overview of the arguments that are set in the call to PDAF_init.

The call to PDAF_init has the following structure:

CALL PDAF_init(filtertype, subtype, step_null, &
               filter_param_i, length_filter_param_i, &
               filter_param_r, length_filter_param_r, &
               COMM_model, COMM_filter, COMM_couple, &
               task_id, n_modeltasks, filterpe, &
               init_ens_offline, screen, status_pdaf)

The required arguments are described below. In the list, we mark those variables bold, which one might like to change, like the type of the DA method. The other variables are required, but usually not changed by the user.

filtertype:
An integer defining the type of the DA method. (See the Note on Available Options)
subtype:
An integer defining the sub-type of the filter algorithm. (See the Note on Available Options)
step_null: An integer defining the initial time step. For some cases it can use useful to set step_null larger to 0.
filter_param_i:
Integer array collecting options for PDAF. The first two variables are mandatory and equal for all filters. Further variables are optional (See the Note on Available Options). The mandatory variables are in the following order:
1. The size of the local state vector for the current process.
2. The ensemble size for all ensemble-based filters
length_filter_param_i:
An integer defining the length of the array filter_param_i. The entries in the array are parsed up to this index.
filter_param_r:
Array collecting real-valued options for PDAF. The first value is mandatory and equal for all filters. Further variables are optional (See the Note on Available Options). The mandatory variable is:
1. The value of the forgetting factor controlling inflation (required to be larger than zero)
length_filter_param_r:
An integer defining the length of the array filter_param_r. The entries in the array are parsed up to this index.
COMM_model:
The communicator variable COMM_model as initialized by init_parallel_pdaf. If the model-communicator is named differently in the actual program, the name has to be adapted
COMM_filter:
The communicator variable COMM_filter as initialized by init_parallel_pdaf.
COMM_couple:
The communicator variable COMM_couple as initialized by init_parallel_pdaf.
task_id:
The index of the model tasks as initialized by init_parallel_pdaf. Always 1 for the offline mode
n_modeltasks:
The number of model tasks as defined before the call to init_parallel_pdaf.
filterpe:
The flag showing if a process belongs to COMM_filter as initialized by init_parallel_pdaf.
init_ens_pdaf:
The name of the user-supplied routine that is called by PDAF_init to initialize the ensemble of model states. (See below: 'User-supplied routine init_ens_pdaf'
screen:
An integer defining whether information output is written to the screen (i.e. standard output). The following choices are available:
- 0: quite mode - no information is displayed.
- 1: Display standard information (recommended)
- 2: as 1 plus display of timing information during the assimilation process
status_pdaf:
An integer used as status flag of PDAF. If status_pdaf is zero upon exit from PDAF_init, the initialization was successful. An error occurred for non-zero values. (The error codes are documented in the routine PDAF_init.)

PDAF uses two arrays filter_param_i and filter_param_r to respectively specify integer and real-valued options for PDAF. As described above, 2 integer values (state vector size, ensemble size) and 1 real value (forgetting factor) are mandatory. Additional options can be set by specifying a larger array and setting the corresponding size value (length_filter_param_i, length_filter_param_r). However, with PDAF V3.0 it can be more convenient to use the subroutines PDAF_set_iparam and PDAF_set_rparam`, which are explained further below.

We recommended to check the value of status_pdaf in the program after PDAF_init (and potentially PDAF_set_iparam and PDAF_set_rparam`) are executed. Only if its value is 0, the initialization was successful.

Note on available options

A list of available values of filtertype as well as an overview of available integer and real-valued options for each DA method can be found on the page Available options for the different DA methods.

The available options for a specific DA method can also be displayed by running the assimilation program for the selected DA method setting subtype = -1. (In the tutorial and template codes one can set -subtype -1 on the command line). Generally, available options and valid settings are also listed in mod_assimilation.F90 of the tutorials and template codes, but this might not be up-to-date in all cases.

Other variables for the assimilation

The routine init_pdaf in the example also initializes several variables that are not used to call PDAF_init. These variables control some functionality of the user-supplied routines for the data assimilation system and are shared with these routines through the Fortran module mod_assimilation. These variables are for example:

delt_obs: An integer specifying the number of time steps between two analysis steps
rms_obs: Assumed observation error (when PDAF-OMI is used the observation error is usually defined separately for each observation, see the description of OMI)
cradius: Localization cut-off radius in grid points for the observation domain
sradius: support radius, if observation errors are weighted (i.e. locweight>0)
locweight: Type of localizing weight

It is useful to define variables like these at this central position. Of course, their definition has to be adapted to the particular model used. The example codes describe the options for locweight.

The setting of locweight influences the weight function for the localization. If PDAF-OMI is used, the choices are standardized as follows

locweight	0	1	2	3	4
function	unit weight	exponential	5-th order polynomial	5-th order polynomial	5-th order polynomial
regulation	-	-	-	regulation using mean variance	regulation using variance of single observation point
cradius	weight=0 if distance > cradius
sradius	no impact	weight = exp(-d / sradius)	weight = 0 if d >= sradius else weight = f(sradius, distance)

Here, 'regulation' refers to the regulated localization introduced in Nerger, L., Janjić, T., Schröter, J., Hiller, W. (2012). A regulated localization scheme for ensemble-based Kalman filters. Quarterly Journal of the Royal Meteorological Society, 138, 802-812. doi:10.1002/qj.945.

User-supplied routine `U_init_ens` (init_ens_pdaf.F90)

The user-supplied routine the we named U_init_ens in the interface description above, is called by PDAF through the defined interface described below (Here, U_ marks the variable to describe a user-supplied routine while in the actual code we use the name init_ens_pdaf). The routine is called by all MPI processes that compute the filter analysis step (i.e. those for which 'filterpe' is set to true. In the standard configuration of init_parallel_pdaf these are all processes of the first model task, i.e. task_id=1.) init_ens_pdaf is only called by PDAF_init if no error occurred before; thus the status flag is zero.

The interface is the following:

SUBROUTINE U_init_ens(filtertype, dim_p, dim_ens, &
                           state_p, Uinv, ens_p, flag)

with

filtertype, integer, intent(in):
The type of filter algorithm as given in the call to PDAF_init
dim_p, integer, intent(in):
The size of the state dimension for the calling process as specified in the call to PDAF_init
dim_ens, integer, intent(in):
The size of the ensemble (or the rank of the state covariance matrix for SEEK)
state_p, real, dimension(dim_p), intent(inout):
Array for the local model state of the calling process (Only relevant for mode-based filters)
Uinv, real, dimension(dim_ens-1, dim_ens-1), intent(inout):
A real array for the inverse of matrix U from the decomposition of the state error covariance matrix P = VUV^T (Only relevant for the SEEK filter.)
ens_p, real, dimension(dim_p, dim_ens), intent(inout):
The ensemble array. It has to hold upon exit the ensemble of model states.
flag, integer, intent(inout):
Status flag for PDAF. It is 0 upon entry and can be set by in the user-supplied routine, depending on the success of the ensemble initialization. Preferably, values above 102 should be used for failures to avoid conflicts with the error codes defined within PDAF_init.

Defining the state vector

The ensemble initialization routine is the first location at which the user has to fill a state vector (or array of state vectors). A state vector is the collection of all model fields that are handled in the analysis step of the assimilation procedure into a single vector. Usually one concatenates the different model fields as complete fields. Thus, the vector could contain a full 3-dimensional temperature field, followed by the salinity field (in case of an ocean model), and then followed by the 3 fields of the velocity components.

The logical definition of the state vector will also be utilized in several other user-supplied routines. E.g. in routines that fill model fields from a state vector or in the routine providing the observation operator.

Initialization for ensemble-based filters

For the ensemble-based filters and the ensembel/hybrid 3D-Var methods only the array ens_p needs to be initialized by the ensemble of model states. If a parallel model with domain decomposition is used, the full ensemble for the local sub-domain has to be initialized.

The arrays state_p and Uinv are allocated to their correct sizes because they are used during the assimilation cycles. They are not yet initialized and it is allowed to use these arrays in the initialization. An exception from this is EnKF for which Uinv is allocated only with size (1,1), because Uinv is not used for EnKF.

Initialization for mode-based filters

The only mode-based filter supplied with PDAF is currenly the SEEK filter. For this filter, the initialization bases on the decomposition of the state error covariance matrix in the form P = VUV^T. According to this decomposition, the array ens_p has to be initialized to hold the modes from matrix V and Uinv holds the inverse of matrix U. In addition state_p has to be initialized with the initial state estimate. If a parallel model with domain decomposition is used, the part of all modes for the local sub-domain of the MPI process and the corresponding part of the state vector has to be initialized. Uinv will be identical for all MPI processes.

Testing the PDAF initialization

The PDAF initialization can be tested by compiling the program and executing it. The Makefile of the model has to be extended to include the additional files. The core part of PDAF can be compiled separately as a library and can then simply be linked to the model code. This is the strategy followed in the PDAF-package.

Remark: For the compilation with a real MPI library, one has to ensure that MPI-module (USE MPI) of the MPI-library is USED for both the model and PDAF. (Thus in the include file for make in make.arch, one might have set MPI_INC to the directory holding the module.)

At this stage it will not be meaningful to perform an actual model time stepping. However, one can test whether the initialization in PDAF_init is successful and whether the ensemble array is correctly initialized. For this one can also activate the PDAF debugging output.

Standard output from PDAF_init should look like the following:

PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF    +++                      PDAF                      +++
PDAF    +++      Parallel Data Assimilation Framework      +++
PDAF    +++                                                +++
PDAF    +++                 Version 2.1                    +++
PDAF    +++                                                +++
PDAF    +++                 Please cite                    +++
PDAF    +++     L. Nerger and W. Hiller, Computers and     +++
PDAF    +++         Geosciences, 2013, 55, 110-118,        +++
PDAF    +++         doi:10.1016/j.cageo.2012.03.026        +++
PDAF    +++ when publishing work resulting from using PDAF +++
PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++


PDAF: Initialize filter

PDAF    +++++++++++++++++++++++++++++++++++++++++++++++++++++++
PDAF    +++  Local Error Subspace Transform Kalman Filter   +++
PDAF    +++                    (LESTKF)                     +++
PDAF    +++                                                 +++
PDAF    +++ Domain-localized implementation of the ESTKF by +++
PDAF    +++  Nerger et al., Mon. Wea. Rev. 140 (2012) 2335  +++
PDAF    +++           doi:10.1175/MWR-D-11-00102.1          +++
PDAF    +++++++++++++++++++++++++++++++++++++++++++++++++++++++

PDAF    LESTKF configuration
PDAF          filter sub-type = 0
PDAF            --> Standard LESTKF
PDAF            --> Transform ensemble with deterministic Omega
PDAF            --> Use fixed forgetting factor: 1.00
PDAF            --> ensemble size:  50

PDAF: Initialize Parallelization
PDAF     Parallelization - Filter on model PEs:
PDAF                 Total number of PEs:      1
PDAF      Number of parallel model tasks:      1
PDAF                      PEs for Filter:      1
PDAF     # PEs per ensemble task and local ensemble sizes: 
PDAF     Task     1
PDAF     #PEs     1
PDAF        N    50

The correctness of the ensemble initialization in U_init_ens should be checked by the user.

Note: See TracWiki for help on using the wiki.

Download in other formats:

Plain Text