Version 52 (modified by 21 months ago) (diff) | ,
---|
Initialization of PDAF and the ensemble by PDAF_init
Implementation Guide
Contents of this page
Overview
After the adaption of the parallelization, the initialization of PDAF has to be implemented. It is performed by the routine PDAF_init
. This subroutine has several arguments. To ensure a clean code, one can collect the initializations of all variables required for the call to PDAF_init
a single subroutine. With this only a single additional subroutine call has to be inserted to the model source code for the initialization. In the example in tutorial/online_2D_serialmodel
the routine in the file init_pdaf.F90
collects all initializations for PDAF. The routine has no arguments. Likewise, the templates (in sub-directories of templates/
) provide a commented template for this routine, which can be used as the basis of the implementation.
PDAF_init
itself calls a user-supplied call-back routine to initialize the ensemble of model states. In the example and templates, this routine can be found in init_ens_pdaf.F90
.
Inserting init_pdaf
The right place to insert a routine like init_pdaf
into the model code is in between the initialization part of the model and the time stepping loop. Thus, the regular model initialization should be completed, which allows PDAF to initialize the ensemble of model states.
In init_pdaf
a number of variables have to be defined that are used in the call to PDAF_init
as described below. (Please note: All names of subroutines that start with PDAF_
are core routines of PDAF, while subroutines whose name end with _pdaf
are generally user-supplied interface routines) There are also a few variables that are initialized in init_pdaf
but not used in the call to PDAF_init
. These are variables that are specific for the data assimilation system, but only shared in between the user-supplied routines. For the example case, these variables are described below in 'Other variables for the assimilation'.
The tutorial implementations as well as the template routines allow to parse all variables through a command line parser by calls in init_pdaf_parse.F90
. This method provides a convenient way to define an experiment and could also be used for other models. The parser module is provided as templates/online_omi/parser_mpi.F90
and also used in the tutorials. Alternatively, one could implement an initialization of the variables with Fortran namelists or could just rely on hard-coded numbers in the source code. Generally we recommend to use a configuration strategy consistent with the model code used.
Required arguments for PDAF_init
In init_pdaf
the variables for the call to PDAF_init
are set. The call to PDAF_init
has the following structure:
CALL PDAF_init(filtertype, subtype, step_null, & filter_param_i, length_filter_param_i, & filter_param_r, length_filter_param_r, & COMM_model, COMM_filter, COMM_couple, & task_id, n_modeltasks, filterpe, & U_init_ens, screen, status_pdaf)
The required variables are the following:
filtertype
: An integer defining the type of filter algorithm. Available are- 0: SEEK
- 1: SEIK
- 2: EnKF
- 3: LSEIK
- 4: ETKF
- 5: LETKF
- 6: ESTKF
- 7: LESTKF
- 8: local EnKF
- 9: NETF
- 10: LNETF
- 11: LKNETF (new in PDAF V2.1)
- 12: PF
- 100: GENOBS (generate synthetic observations, was filtertype=11 before PDAF V2.0)
- 200: 3D-Var methods (new in PDAF V2.0)
subtype
: An integer defining the sub-type of the filter algorithm (see the example code intemplates/online_omi/init_pdaf.F90
for choices).- If
PDAF_init
is called withsubtype=-1
the available options are shown for the selected filter algorithm.
- If
step_null
: An integer defining the initial time step. For some cases it can use useful to setstep_null
larger to 0.filter_param_i
: Integer array collecting several variables for PDAF. The first two variables are mandatory and equal for all filters. Further variables are optional (see example code or usesubtype=-1
to display available options.). The mandatory variables are in the following order:- The size of the local state vector for the current process.
- The ensemble size for all ensemble-based filters
length_filter_param_i
: An integer defining the length of the arrayfilter_param_i
. The entries in the array are parsed up to this index.filter_param_r
: Array of reals collecting floating point variables for PDAF. The first variable is mandatory and equal for all filters. Further variables are optional (seetemplates/online_omi/init_pdaf.F90
or usesubtype=-1
to display available options.). The mandatory variable is:- The value of the forgetting factor controlling covariance inflation (required to be larger than zero; common are values between 0.9 and 1.0. For 1.0 the ensemble is not inflated.)
length_filter_param_r
: An Integer defining the length of the arrayfilter_param_r
. The entries in the array are parsed up to this index.COMM_model
: The communicator variableCOMM_model
as initialized byinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)COMM_filter
: The communicator variableCOMM_filter
as initialized byinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)COMM_couple
: The communicator variableCOMM_couple
as initialized byinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)task_id
: The model task index as initialized byinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)n_modeltasks
: The number of model tasks as defined before the call toinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)filterpe
: A logical flag showing whether a process belongs toCOMM_filter
as initialized byinit_parallel_pdaf
. (Usually stored in the modulemod_assimilation
)U_init_ens
: The name of the user-supplied routine that is called byPDAF_init
to initialize the ensemble of model states. (See 'User-supplied routine U_init_ens')screen
: An integer defining whether information output is written to the screen (i.e. standard output). The following choices are available:- 0: quiet mode - no information is displayed.
- 1: Display standard information (recommended)
- 2: as 1 plus display of timing information during the assimilation process
status_pdaf
: An integer used as status flag of PDAF. Ifstatus_pdaf
is zero upon exit fromPDAF_init
the initialization was successful. An error occurred for non-zero values. (The error codes are documented in the routinePDAF_init
.)
An overview of available options for each filter an be found on the overview page on options.
Notes:
- It is recommended that the value of
status_pdaf
is checked in the program after PDAF_init is executed. Only if its value is 0 the initialization was successful. - PDAF also has a Simplified Interface providing the routine
PDAF_init_si
. In the simplified interface, the name of the user-supplied routineU_init_ens
is predefined toinit_ens_pdaf
such that it does not appear in the call toPDAF_init_si
. More information on the pre-defined names is provided in the documentation of PDAF's simplified interface.
Other variables for the assimilation
The routine init_pdaf
in the example also initializes several variables that are not used to call PDAF_init
. These variables control some functionality of the user-supplied routines for the data assimilation system and are shared with these routines through the Fortran module mod_assimilation
. These variables are for example:
delt_obs
: An integer specifying the number of time steps between two analysis stepsrms_obs
: Assumed observation errorcradius
: Localization cut-off radius in grid points for the observation domainsradius
: support radius, if observation errors are weighted (i.e.locweight>0
)locweight
: Type of localizing weight
It is useful to define variables like these at this central position. Of course, their definition has to be adapted to the particular model used. The example codes describe the options for locweight
.
The setting of locweight
influence the weight function for the localization. If PDAF-OMI is used, the choices are standardized as follows
locweight | 0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|---|
function | unit weight | exponential | 5-th order polynomial | |||
regulation | - | - | - | regulation using mean variance | regulation using variance of single observation point | |
cradius | weight=0 if distance > cradius | |||||
sradius | no impact | weight = exp(-d / sradius) | weight = 0 if d >= sradius else weight = f(sradius, distance) |
User-supplied routine U_init_ens
(init_ens_pdaf.F90)
The user-supplied routine the we named U_init_ens
in the interface description above, is called by PDAF through the defined interface described below (Here, U_
marks the variable to describe a user-supplied routine while in the actual code we use the name init_ens_pdaf
). The routine is called by all MPI processes that compute the filter analysis step (i.e. those for which 'filterpe' is set to true. In the standard configuration of init_parallel_pdaf
these are all processes of the first model task, i.e. task_id=1.) init_ens_pdaf
is only called by PDAF_init
if no error occurred before; thus the status flag is zero.
The interface is the following:
SUBROUTINE U_init_ens(filtertype, dim_p, dim_ens, & state_p, Uinv, ens_p, flag)
with
filtertype
,integer, intent(in)
:
The type of filter algorithm as given in the call toPDAF_init
dim_p
,integer, intent(in)
:
The size of the state dimension for the calling process as specified in the call toPDAF_init
dim_ens
,integer, intent(in)
:
The size of the ensemble (or the rank of the state covariance matrix for SEEK)state_p
,real, dimension(dim_p), intent(inout)
:
Array for the local model state of the calling process (Only relevant for mode-based filters)Uinv
,real, dimension(dim_ens-1, dim_ens-1), intent(inout)
:
A real array for the inverse of matrix U from the decomposition of the state error covariance matrix P = VUVT (Only relevant for the SEEK filter.)ens_p
,real, dimension(dim_p, dim_ens), intent(inout)
:
The ensemble array. It has to hold upon exit the ensemble of model states.flag
,integer, intent(inout)
:
Status flag for PDAF. It is 0 upon entry and can be set by in the user-supplied routine, depending on the success of the ensemble initialization. Preferably, values above 102 should be used for failures to avoid conflicts with the error codes defined within PDAF_init.
Defining the state vector
The ensemble initialization routine is the first location at which the user has to fill a state vector (or array of state vectors). A state vector is the collection of all model fields that are handled in the analysis step of the assimilation procedure into a single vector. Usually one concatenates the different model fields as complete fields. Thus, the vector could contain a full 3-dimensional temperature field, followed by the salinity field (in case of an ocean model), and then followed by the 3 fields of the velocity components.
The logical definition of the state vector will also be utilized in several other user-supplied routines. E.g. in routines that fill model fields from a state vector or in the routine providing the observation operator.
Initialization for ensemble-based filters
For the ensemble-based filters and the ensembel/hybrid 3D-Var methods only the array ens_p
needs to be initialized by the ensemble of model states. If a parallel model with domain decomposition is used, the full ensemble for the local sub-domain has to be initialized.
The arrays state_p
and Uinv
are allocated to their correct sizes because they are used during the assimilation cycles. They are not yet initialized and it is allowed to use these arrays in the initialization. An exception from this is EnKF for which Uinv
is allocated only with size (1,1), because Uinv
is not used for EnKF.
Initialization for mode-based filters
The only mode-based filter supplied with PDAF is currenly the SEEK filter. For this filter, the initialization bases on the decomposition of the state error covariance matrix in the form P = VUVT. According to this decomposition, the array ens_p
has to be initialized to hold the modes from matrix V and Uinv
holds the inverse of matrix U. In addition state_p
has to be initialized with the initial state estimate. If a parallel model with domain decomposition is used, the part of all modes for the local sub-domain of the MPI process and the corresponding part of the state vector has to be initialized. Uinv will be identical for all MPI processes.
Testing the PDAF initialization
The PDAF initialization can be tested by compiling the program and executing it. The Makefile of the model has to be extended to include the additional files. The core part of PDAF can be compiled separately as a library and can then simply be linked to the model code. This is the strategy followed in the PDAF-package.
Remark: For the compilation with a real MPI library, one has to ensure that MPI-module (USE MPI
) of the MPI-library is USED for both the model and PDAF. (Thus in the include file for make in make.arch
, one might have set MPI_INC
to the directory holding the module.)
At this stage it will not be meaningful to perform an actual model time stepping. However, one can test whether the initialization in PDAF_init is successful and whether the ensemble array is correctly initialized. For this one can also activate the PDAF debugging output.
Standard output from PDAF_init should look like the following:
PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++ PDAF +++ PDAF +++ PDAF +++ Parallel Data Assimilation Framework +++ PDAF +++ +++ PDAF +++ Version 2.1 +++ PDAF +++ +++ PDAF +++ Please cite +++ PDAF +++ L. Nerger and W. Hiller, Computers and +++ PDAF +++ Geosciences, 2013, 55, 110-118, +++ PDAF +++ doi:10.1016/j.cageo.2012.03.026 +++ PDAF +++ when publishing work resulting from using PDAF +++ PDAF ++++++++++++++++++++++++++++++++++++++++++++++++++++++ PDAF: Initialize filter PDAF +++++++++++++++++++++++++++++++++++++++++++++++++++++++ PDAF +++ Local Error Subspace Transform Kalman Filter +++ PDAF +++ (LESTKF) +++ PDAF +++ +++ PDAF +++ Domain-localized implementation of the ESTKF by +++ PDAF +++ Nerger et al., Mon. Wea. Rev. 140 (2012) 2335 +++ PDAF +++ doi:10.1175/MWR-D-11-00102.1 +++ PDAF +++++++++++++++++++++++++++++++++++++++++++++++++++++++ PDAF LESTKF configuration PDAF filter sub-type = 0 PDAF --> Standard LESTKF PDAF --> Transform ensemble with deterministic Omega PDAF --> Use fixed forgetting factor: 1.00 PDAF --> ensemble size: 50 PDAF: Initialize Parallelization PDAF Parallelization - Filter on model PEs: PDAF Total number of PEs: 1 PDAF Number of parallel model tasks: 1 PDAF PEs for Filter: 1 PDAF # PEs per ensemble task and local ensemble sizes: PDAF Task 1 PDAF #PEs 1 PDAF N 50
The correctness of the ensemble initialization in U_init_ens
should be checked by the user.