Changes between Initial Version and Version 1 of OfflineInitPdaf_PDAF3


Timestamp:
May 25, 2025, 6:11:03 PM (6 days ago)
Author:
lnerger
Comment:

--

  • OfflineInitPdaf_PDAF3

    v1 v1  
     1= Offline Mode: Initializing PDAF =
     2
     3{{{
     4#!html
     5<div class="wiki-toc">
     6<h4>Offline Mode: Implementation Guide</h4>
     7<ol><li><a href="OfflineImplementationGuide_PDAF3">Main page</a></li>
     8<li><a href="OfflineAdaptParallelization_PDAF3">Initializing  the parallelization</a></li>
     9<li>Initializing PDAF</li>
     10<li><a href="ImplementationofAnalysisStep_PDAF3">Implementing the analysis step</a></li>
     11<li><a href="OfflineAddingMemoryandTimingInformation_PDAF3">Memory and timing information</a></li>
     12</ol>
     13</div>
     14}}}
     15
     16[[PageOutline(2-3,Contents of this page)]]
     17
     18== Overview ==
     19
     20The PDAF release provides example code for the offline mode in `tutorial/offline_2D_parallel`. We refer to this code throughout this page; it can be used as a basis for an implementation.
     21
     22The routine `PDAF_init` is called to initialize PDAF. This call sets parameters for the data assimilation, chooses the data assimilation method, and initializes the ensemble. In the tutorial and template codes, the initialization of all variables required for the call to `PDAF_init` is collected in the single subroutine `init_pdaf_offline`, which keeps the code clean. The file `templates/offline/init_pdaf_offline.F90` provides a commented template for this routine, which can be used as the basis of the implementation.
     23
     24`PDAF_init` itself calls a user-supplied call-back routine to initialize the ensemble of model states. In the example, this is the routine in the file `init_ens_offline.F90`.
     25
     26== Using `init_pdaf_offline` ==
     27
     28In the offline mode, the routine `init_pdaf_offline` is executed after the initialization of the parallelization.
     29
     30In `init_pdaf_offline`, a number of variables are defined that are used in the call to `PDAF_init` as described below.  There are also a few variables that are initialized in `init_pdaf_offline` but not used in the call to `PDAF_init`. These variables are specific to the data assimilation system and are only shared between the user-supplied routines. For the tutorial example, these variables are described below in the section '[#Othervariablesfortheassimilation Other variables for the assimilation]'.
     31
     32The example implementation and the template code allow one to specify all options at run time using a command line parser. Each option is specified as the combination `-VARIABLE VALUE`. This method provides a convenient way to define an experiment and could also be used for other models. The parser module is provided by the file `tutorial/offline_2D_parallel/parser_mpi.F90`.
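
The `-VARIABLE VALUE` convention can be illustrated with a small standalone program. This is only a sketch under stated assumptions: it does not reproduce the parser in `parser_mpi.F90`, it uses the Fortran intrinsics `COMMAND_ARGUMENT_COUNT` and `GET_COMMAND_ARGUMENT`, and the helper `parse_int` as well as the default values are hypothetical names and numbers chosen for illustration.

{{{
PROGRAM parse_demo
  ! Standalone illustration of "-VARIABLE VALUE" command line options.
  ! NOT the PDAF parser; parse_int is a hypothetical helper.
  IMPLICIT NONE

  INTEGER :: dim_ens    = 9   ! default, kept if -dim_ens is not given
  INTEGER :: filtertype = 6   ! default, kept if -filtertype is not given

  CALL parse_int('dim_ens', dim_ens)
  CALL parse_int('filtertype', filtertype)

  WRITE (*, '(a, i0)') 'dim_ens    = ', dim_ens
  WRITE (*, '(a, i0)') 'filtertype = ', filtertype

CONTAINS

  SUBROUTINE parse_int(name, val)
    ! Search the command line for "-name" and read the following
    ! argument as an integer. val is unchanged if the option is
    ! absent or if its value cannot be read as an integer.
    CHARACTER(len=*), INTENT(in)    :: name
    INTEGER,          INTENT(inout) :: val

    CHARACTER(len=256) :: arg, valstr
    INTEGER :: i, ios, tmp

    DO i = 1, COMMAND_ARGUMENT_COUNT() - 1
       CALL GET_COMMAND_ARGUMENT(i, arg)
       IF (TRIM(arg) == '-' // name) THEN
          CALL GET_COMMAND_ARGUMENT(i + 1, valstr)
          READ (valstr, *, iostat=ios) tmp
          IF (ios == 0) val = tmp
          RETURN
       END IF
    END DO
  END SUBROUTINE parse_int

END PROGRAM parse_demo
}}}

Running the compiled program as `./parse_demo -dim_ens 20` would report an ensemble size of 20 while keeping the default for `filtertype`. Keeping defaults in the code and overriding them on the command line is what makes it easy to define experiments without recompiling.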
     33
     34== Arguments of `PDAF_init` ==
     35
     36In the tutorial codes and the template, the call to `PDAF_init` is fully implemented. Here, we provide an overview of the arguments that are set in the call to `PDAF_init`.
     37
     38The call to `PDAF_init` has the following structure:
     39{{{
     40CALL PDAF_init(filtertype, subtype, step_null, &
     41               filter_param_i, length_filter_param_i, &
     42               filter_param_r, length_filter_param_r, &
     43               COMM_model, COMM_filter, COMM_couple, &
     44               task_id, n_modeltasks, filterpe, &
     45               init_ens_offline, screen, status_pdaf)
     46}}}
     47
     48The required arguments are described below. In the list, the variables that one might want to change, such as the type of the DA method, are marked in bold. The other variables are required, but usually not changed by the user.
     49
     50 * **filtertype**:[[BR]] An integer defining the type of the DA method.  (See the [#Noteonavailableoptions Note on Available Options])
     51 * **subtype**:[[BR]] An integer defining the sub-type of the filter algorithm. (See the [#Noteonavailableoptions Note on Available Options])
     52 * `step_null`:[[BR]] Always 0 for the offline mode.
     53 * **filter_param_i**:[[BR]] Integer array collecting options for PDAF. The first two variables are mandatory and equal for all filters. Further variables are optional (See the [#Noteonavailableoptions Note on Available Options]). The mandatory variables are in the following order:
     54  1. The size of the state vector for the current process (see [#Definingthestatevector information on defining the state vector])
     55  1. The ensemble size for all ensemble-based filters
     56 * **length_filter_param_i**:[[BR]] An integer defining the length of the array `filter_param_i`. The entries in the array are parsed up to this index.
     57 * **filter_param_r**:[[BR]] Array collecting real-valued options for PDAF. The first value is mandatory and equal for all filters.  Further variables are optional (See the [#Noteonavailableoptions Note on Available Options]). The mandatory variable is:
     58  1. The value of the forgetting factor controlling inflation (required to be larger than zero)
     59 * **length_filter_param_r**:[[BR]] An integer defining the length of the array `filter_param_r`. The entries in the array are parsed up to this index.
     60 * `COMM_model`:[[BR]] The communicator variable `COMM_model` as initialized by `init_parallel_pdaf`. If the model communicator is named differently in the actual program, the name has to be adapted.
     61 * `COMM_filter`:[[BR]] The communicator variable `COMM_filter` as initialized by `init_parallel_pdaf`.
     62 * `COMM_couple`:[[BR]] The communicator variable `COMM_couple` as initialized by `init_parallel_pdaf`.
     63 * `task_id`:[[BR]] The index of the model task as initialized by `init_parallel_pdaf`. Always 1 for the offline mode.
     64 * `n_modeltasks`:[[BR]] The number of model tasks as defined before the call to `init_parallel_pdaf`.
     65 * `filterpe`:[[BR]] The flag showing if a process belongs to `COMM_filter` as initialized by `init_parallel_pdaf`.
     66 * **init_ens_offline**:[[BR]] The name of the user-supplied routine that is called by `PDAF_init` to initialize the ensemble of model states. (See below: '[#User-suppliedroutineinit_ens_offline User-supplied routine init_ens_offline]')
     67 * `screen`:[[BR]] An integer defining whether information output is written to the screen (i.e. standard output). The following choices are available:
     68  * 0: quiet mode - no information is displayed.
     69  * 1: Display standard information (recommended)
     70  * 2: as 1 plus display of timing information during the assimilation process
     71 * `status_pdaf`:[[BR]] An integer used as status flag of PDAF. If `status_pdaf` is zero upon exit from `PDAF_init`, the initialization was successful. An error occurred for non-zero values. (The error codes are documented in the routine PDAF_init.)
     72
     73PDAF uses two arrays **filter_param_i** and **filter_param_r** to respectively specify integer and real-valued options for PDAF. As described above, 2 integer values (state vector size, ensemble size) and 1 real value (forgetting factor) are mandatory. Additional options can be set by specifying a larger array and setting the corresponding size value (`length_filter_param_i`, `length_filter_param_r`). However, with PDAF V3.0 it can be more convenient to use the subroutines `PDAF_set_iparam` and `PDAF_set_rparam`, which are explained further below.
     74
     75An **overview of available integer and real-valued options** for each DA method can be found on the page [wiki:AvailableOptionsforInitPDAF Available options for the different DA methods]. The available options for a specific DA method can also be displayed by running the assimilation program for the selected DA method setting `subtype = -1`. (In the tutorial and template codes one can set `-subtype -1` on the command line). Generally, available options and valid settings are also listed in `mod_assimilation.F90` of the tutorials and template codes.
     76
     77We recommend checking the value of `status_pdaf` in the program after `PDAF_init` (and potentially `PDAF_set_iparam` and `PDAF_set_rparam`) has been executed. The initialization was successful only if its value is 0.
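
The following sketch puts the pieces described above together: it fills the mandatory entries of `filter_param_i` and `filter_param_r`, calls `PDAF_init` with the argument list shown above, and checks `status_pdaf`. It is not the tutorial code. It assumes that the PDAF interfaces are available through a module named `PDAF` and that the parallelization variables are provided by a module named `mod_parallel_pdaf`; the routine name `init_pdaf_sketch` and all numerical values are illustrative only. The code needs the PDAF library and MPI to compile.

{{{
SUBROUTINE init_pdaf_sketch()
  ! Sketch only. Assumptions: module names PDAF and mod_parallel_pdaf,
  ! and all numerical values below are example settings.
  USE mpi
  USE PDAF, ONLY: PDAF_init
  USE mod_parallel_pdaf, ONLY: COMM_model, COMM_filter, COMM_couple, &
       task_id, n_modeltasks, filterpe
  IMPLICIT NONE

  EXTERNAL :: init_ens_offline     ! user-supplied call-back routine

  INTEGER :: filtertype, subtype   ! DA method and its sub-type
  INTEGER :: filter_param_i(2)     ! mandatory integer options
  REAL    :: filter_param_r(1)     ! mandatory real-valued option
  INTEGER :: screen, status_pdaf, ierr

  filtertype = 6                   ! example value, see the options page
  subtype    = 0                   ! example value, see the options page
  screen     = 1                   ! display standard information

  filter_param_i(1) = 648          ! example: state dimension of this process
  filter_param_i(2) = 9            ! example: ensemble size
  filter_param_r(1) = 1.0          ! forgetting factor (must be > 0)

  status_pdaf = 0
  CALL PDAF_init(filtertype, subtype, 0, &   ! step_null = 0 in offline mode
       filter_param_i, 2, &
       filter_param_r, 1, &
       COMM_model, COMM_filter, COMM_couple, &
       task_id, n_modeltasks, filterpe, &
       init_ens_offline, screen, status_pdaf)

  ! Stop all processes if the initialization failed
  IF (status_pdaf /= 0) THEN
     WRITE (*, '(a, i0)') 'ERROR in PDAF initialization, status_pdaf = ', &
          status_pdaf
     CALL MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  END IF
END SUBROUTINE init_pdaf_sketch
}}}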
     78
     79=== Note on available options ===
     80
     81 A **list of available values of `filtertype`** as well as an **overview of available integer and real-valued options** for each DA method can be found on the page [wiki:AvailableOptionsforInitPDAF Available options for the different DA methods].
     82
     83 The **available options for a specific DA method** can also be displayed by running the assimilation program for the selected DA method setting `subtype = -1`. (In the tutorial and template codes one can set `-subtype -1` on the command line). Generally, available options and valid settings are also listed in `mod_assimilation.F90` of the tutorials and template codes, but this might not be up-to-date in all cases.
     84
     85== Other variables for the assimilation ==
     86
     87The routine `init_pdaf_offline` in the example also initializes several variables that are not used to call `PDAF_init`. These variables control some functionality of the user-supplied routines for the data assimilation system and are shared with these routines through `mod_assimilation`. These variables are for example:
     88 * `cradius`: Localization cut-off radius in grid points for the observation domain
     89 * `sradius`: support radius, if observation errors are weighted (i.e. `locweight>0`)
     90 * `locweight`: Type of localizing weight (see further below)
     91These localization parameters are used later in the subroutines called by `init_dim_obs_l_pdafomi`.
     92
     93We recommend defining such configuration variables at this central position, so that all configuration is done in one place. Of course, their definition should be adapted to the particular model used.
     94
     95The setting of `locweight` determines the weight function for the localization. With the PDAF3 interface (and generally with PDAF-OMI), the choices are standardized as follows:
     96
     97||= '''locweight''' =||= '''0''' =||= '''1''' =||= '''2''' =||= '''3''' =||= '''4''' =||
     98||= function =|| unit weight ||  exponential  ||  5-th order[[BR]]polynomial  ||  5-th order[[BR]]polynomial  ||  5-th order[[BR]]polynomial  ||
     99||= regulation =||  -  ||  -  ||  -  ||  regulation using[[BR]]mean variance  ||  regulation using variance[[BR]]of single observation point  ||
     100||= '''cradius''' =||||||||||  weight = 0 if distance > cradius  ||
     101||= '''sradius''' =||  no impact  ||  weight = exp(-distance / sradius)  ||||||  weight = 0 if distance >= sradius[[BR]] else[[BR]] weight = f(sradius, distance)  ||
     102
     103Here, 'regulation' refers to the regulated localization introduced in Nerger, L., Janjić, T., Schröter, J., Hiller, W. (2012). A regulated localization scheme for ensemble-based Kalman filters. Quarterly Journal of the Royal Meteorological Society, 138, 802-812. [https://doi.org/10.1002/qj.945 doi:10.1002/qj.945].
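
To make the roles of `cradius` and `sradius` concrete, the following standalone program evaluates the distance-dependent weight for `locweight` 0, 1 and 2. It is a sketch under assumptions: the table does not spell out the polynomial `f(sradius, distance)`, so the widely used 5th-order function of Gaspari and Cohn with half-width `sradius/2` is assumed here. The function name `loc_weight` is hypothetical, and the regulation used for `locweight` 3 and 4 is not covered.

{{{
PROGRAM locweight_demo
  ! Sketch of the distance-dependent localization weight.
  ! Assumption: f(sradius, distance) is the 5th-order Gaspari-Cohn
  ! polynomial with half-width sradius/2.
  IMPLICIT NONE
  REAL    :: cradius, sradius, d
  INTEGER :: i

  cradius = 5.0
  sradius = 5.0

  WRITE (*, '(a)') '  dist      lw=0      lw=1      lw=2'
  DO i = 0, 6
     d = REAL(i)
     WRITE (*, '(f6.1, 3f10.4)') d, &
          loc_weight(0, cradius, sradius, d), &
          loc_weight(1, cradius, sradius, d), &
          loc_weight(2, cradius, sradius, d)
  END DO

CONTAINS

  FUNCTION loc_weight(locweight, cradius, sradius, d) RESULT(w)
    INTEGER, INTENT(in) :: locweight   ! type of weight function
    REAL,    INTENT(in) :: cradius     ! cut-off radius
    REAL,    INTENT(in) :: sradius     ! support radius
    REAL,    INTENT(in) :: d           ! distance
    REAL :: w, c, r

    ! Beyond the cut-off radius the weight is always zero
    IF (d > cradius) THEN
       w = 0.0
       RETURN
    END IF

    SELECT CASE (locweight)
    CASE (0)                 ! unit weight, sradius has no impact
       w = 1.0
    CASE (1)                 ! exponential decrease
       w = EXP(-d / sradius)
    CASE (2)                 ! 5th-order polynomial (assumed Gaspari-Cohn)
       c = sradius / 2.0
       r = d / c
       IF (d >= sradius) THEN
          w = 0.0
       ELSE IF (r <= 1.0) THEN
          w = -0.25*r**5 + 0.5*r**4 + 0.625*r**3 - (5.0/3.0)*r**2 + 1.0
       ELSE
          w = r**5/12.0 - 0.5*r**4 + 0.625*r**3 + (5.0/3.0)*r**2 &
               - 5.0*r + 4.0 - 2.0/(3.0*r)
       END IF
    CASE DEFAULT             ! regulated variants are not sketched here
       ERROR STOP 'locweight value not covered by this sketch'
    END SELECT
  END FUNCTION loc_weight

END PROGRAM locweight_demo
}}}

With these settings the polynomial weight starts at 1 for zero distance and reaches 0 at `sradius`, while the exponential weight is still positive at `cradius` and is then cut off abruptly.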
     104
     105
     106Apart from the generic variables for localization, we also specify variables that are specific to each observation type. For example, in the tutorial code we specify
     107 * `assim_A`: Flag whether to assimilate observations of type A
     108 * `rms_obs_A`: Assumed observation error standard deviation of observation type A
     109
     110== User-supplied routine `init_ens_offline` ==
     111
     112The user-supplied routine, which we named `init_ens_offline` here, is the call-back routine that is called by PDAF through the defined interface described below. For the offline mode the routine is called by all processes. `init_ens_offline` is only called by `PDAF_init` if no error occurred before, i.e. if the status flag is zero.
     113
     114The interface details can be looked up in the template and tutorial codes. It is the following:
     115{{{
     116SUBROUTINE init_ens_offline(filtertype, dim_p, dim_ens, &
     117                           state_p, Uinv, ens_p, flag)
     118}}}
     119with
     120 * `filtertype`, `integer, intent(in)`:[[BR]]The type of filter algorithm as given in the call to `PDAF_init`
     121 * `dim_p`, `integer, intent(in)`:[[BR]] The size of the state dimension for the calling process as specified in the call to `PDAF_init`
     122 * `dim_ens`, `integer, intent(in)`:[[BR]]The size of the ensemble
     123 * `state_p`, `real, dimension(dim_p), intent(inout)`:[[BR]]Array for the local model state of the calling process (can be used freely for ensemble-based methods)
     124 * `Uinv`, `real, dimension(dim_ens-1, dim_ens-1), intent(inout)`:[[BR]]A possible weight matrix (Not relevant for ensemble-based methods)
     125 * `ens_p`, `real, dimension(dim_p, dim_ens), intent(inout)`:[[BR]] The ensemble array, which has to be filled with the ensemble of model states.
     126 * `flag`, `integer, intent(inout)`:[[BR]]Status flag for PDAF. It is 0 upon entry and can be set in the user-supplied routine, depending on the success of the ensemble initialization.  Preferably, values above 102 should be used for failures to avoid conflicts with the error codes defined within `PDAF_init`.
     127
     128=== Defining the state vector ===
     129
     130The ensemble initialization routine `init_ens_offline` is the first location at which the user has to fill a state vector (or array of state vectors). A state vector is the collection of all model fields that are handled in the analysis step of the assimilation procedure into a single vector. Usually one concatenates the different model fields as complete fields. Thus, the vector could contain a full 3-dimensional temperature field, followed by the salinity field (in case of an ocean model), and then followed by the 3 fields of the velocity components.
     131
     132The logical definition of the state vector will also be utilized in several other user-supplied routines, for example in routines that fill model fields from a state vector or in the routine providing the observation operator.
     133
     134The actual setup of the state vector should be done in `init_pdaf_offline`.  The tutorial example `tutorial/online_2D_serialmodel_2fields` demonstrates a possible setup of the state vector for 2 fields. Here, one defines the number of fields, the dimension of each included field as well as the offset of each field in the state vector.
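
The bookkeeping with field dimensions and offsets can be sketched in a few lines. The program below is a standalone illustration and not the tutorial code: the variable names `n_fields`, `dim_fields` and `off_fields`, the grid size and the field values are assumptions made for this example.

{{{
PROGRAM state_vector_layout
  ! Sketch: concatenate two 2D fields into one state vector using
  ! per-field dimensions and offsets. All names and sizes are
  ! illustrative assumptions.
  IMPLICIT NONE
  INTEGER, PARAMETER :: n_fields = 2
  INTEGER, PARAMETER :: nx = 36, ny = 18

  INTEGER :: dim_fields(n_fields)   ! number of elements of each field
  INTEGER :: off_fields(n_fields)   ! offset of each field in the state vector
  INTEGER :: dim_state_p, i
  REAL    :: field_a(ny, nx), field_b(ny, nx)
  REAL, ALLOCATABLE :: state_p(:)

  ! Both fields live on the same grid in this example
  dim_fields = nx * ny

  ! The offset of a field is the summed size of all fields before it
  off_fields(1) = 0
  DO i = 2, n_fields
     off_fields(i) = off_fields(i-1) + dim_fields(i-1)
  END DO
  dim_state_p = SUM(dim_fields)
  ALLOCATE(state_p(dim_state_p))

  ! Fill the state vector from the fields
  field_a = 1.0
  field_b = 2.0
  state_p(off_fields(1)+1 : off_fields(1)+dim_fields(1)) = &
       RESHAPE(field_a, [dim_fields(1)])
  state_p(off_fields(2)+1 : off_fields(2)+dim_fields(2)) = &
       RESHAPE(field_b, [dim_fields(2)])

  ! The inverse operation: recover the second field from the state vector
  field_b = RESHAPE(state_p(off_fields(2)+1 : off_fields(2)+dim_fields(2)), &
       [ny, nx])

  WRITE (*, '(a, i0)')  'dim_state_p = ', dim_state_p
  WRITE (*, '(a, 2i6)') 'off_fields  = ', off_fields
END PROGRAM state_vector_layout
}}}

Storing the offsets once and using them in every routine that touches the state vector keeps the field ordering consistent across the ensemble initialization, the observation operator and any routine that writes fields back.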
     135
     136
     137=== Initialization for ensemble-based filters ===
     138
     139For the ensemble-based filters and the ensemble/hybrid 3D-Var methods, only the array `ens_p` needs to be initialized by the ensemble of model states. If a parallel model with domain decomposition is used, the full ensemble for the local sub-domain has to be initialized.
     140
     141The arrays `state_p` and `Uinv` are allocated to their correct sizes because they are used during the assimilation cycles. They are not yet initialized, and it is allowed to use these arrays in the initialization. An exception is the EnKF, for which `Uinv` is allocated only with size (`1`,`1`), because `Uinv` is not used by the EnKF.
     142
     143For the offline mode, one will usually read the ensemble states from output files of the model used to perform the ensemble integrations separately (i.e. 'offline'). Thus, one has to implement a reading routine for the model files.
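
A reading routine with the interface described above might look like the following sketch. It is not the tutorial routine: it assumes one plain-text file per ensemble member with the hypothetical name `ens_N.txt` in the working directory, each holding `dim_p` values, and it assumes no domain decomposition. The failure codes 103 and 104 are arbitrary choices above 102.

{{{
SUBROUTINE init_ens_offline(filtertype, dim_p, dim_ens, &
     state_p, Uinv, ens_p, flag)
  ! Sketch: read each ensemble member from its own text file.
  ! Assumptions: file names ens_1.txt, ens_2.txt, ... (hypothetical),
  ! dim_p values per file, no domain decomposition.
  IMPLICIT NONE
  INTEGER, INTENT(in)    :: filtertype               ! type of filter
  INTEGER, INTENT(in)    :: dim_p                    ! local state dimension
  INTEGER, INTENT(in)    :: dim_ens                  ! ensemble size
  REAL,    INTENT(inout) :: state_p(dim_p)           ! not used here
  REAL,    INTENT(inout) :: Uinv(dim_ens-1, dim_ens-1) ! not used here
  REAL,    INTENT(inout) :: ens_p(dim_p, dim_ens)    ! ensemble to be filled
  INTEGER, INTENT(inout) :: flag                     ! status flag

  INTEGER :: member, iunit, ios
  CHARACTER(len=64) :: fname

  WRITE (*, '(9x, a)') 'Initialize state ensemble'

  DO member = 1, dim_ens
     WRITE (fname, '(a, i0, a)') 'ens_', member, '.txt'

     OPEN (newunit=iunit, file=TRIM(fname), status='old', &
          action='read', iostat=ios)
     IF (ios /= 0) THEN
        flag = 103            ! file could not be opened
        RETURN
     END IF

     ! Read all dim_p values of this member into its ensemble column
     READ (iunit, *, iostat=ios) ens_p(:, member)
     CLOSE (iunit)
     IF (ios /= 0) THEN
        flag = 104            ! file is too short or not readable
        RETURN
     END IF
  END DO
END SUBROUTINE init_ens_offline
}}}

With domain decomposition, each process would instead read only the part of each member that belongs to its local sub-domain.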
     144
     145== Setting additional options ==
     146
     147In PDAF V3.0 we added the possibility to set options for PDAF after the call to `PDAF_init`. For this purpose, there are the subroutines
     148{{{
     149   PDAF_set_iparam(id, value, status_pdaf)
     150}}}
     151to set integer parameters and
     152{{{
     153   PDAF_set_rparam(id, value, status_pdaf)
     154}}}
     155to set REAL (floating point) parameters. The arguments are
     156 * `id`:[[BR]] The index value of a parameter
     157 * `value`:[[BR]] The value of the parameter with index `id`
     158 * `status_pdaf`:[[BR]] Status flag for PDAF. Both routines increment the input value; the increment is 0 if no error occurred. This allows one to check `status_pdaf` once after all calls to `PDAF_init`, `PDAF_set_iparam`, and `PDAF_set_rparam`.
     159
     160The tutorial code uses these routines for a few settings, while the template code includes an extended set of calls specific to different DA methods.
     161
     162An overview of available integer and real-valued options for each DA method can be found on the page [wiki:AvailableOptionsforInitPDAF Available options for the different DA methods]. The available options for a specific DA method can also be displayed by running the assimilation program for the selected DA method setting `subtype = -1`. (In the tutorial and template codes one can set `-subtype -1` on the command line).
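
As a sketch of how these calls combine with a single status check, consider the routine below. It is not taken from the tutorial or template code. It assumes that the interfaces are available through a module named `PDAF`; the routine name `set_pdaf_options_sketch` is hypothetical, and the integer index 5 and its value are placeholders whose meaning depends on the chosen DA method and must be looked up on the options page. Real-valued index 1 is used because the forgetting factor is described above as the first entry of `filter_param_r`.

{{{
SUBROUTINE set_pdaf_options_sketch(status_pdaf)
  ! Sketch only. Assumptions: module name PDAF; the integer index and
  ! value are placeholders to be replaced by entries from the options
  ! page for the chosen DA method.
  USE PDAF, ONLY: PDAF_set_iparam, PDAF_set_rparam
  IMPLICIT NONE
  INTEGER, INTENT(inout) :: status_pdaf   ! status as returned by PDAF_init

  INTEGER, PARAMETER :: id_int_option = 5 ! placeholder option index
  INTEGER :: int_value
  REAL    :: forget

  int_value = 0        ! placeholder option value
  forget    = 0.95     ! example forgetting factor (must be > 0)

  ! Both routines increment status_pdaf; the increment is 0 without error
  CALL PDAF_set_iparam(id_int_option, int_value, status_pdaf)
  CALL PDAF_set_rparam(1, forget, status_pdaf)

  ! One check covers PDAF_init and all subsequent option settings
  IF (status_pdaf /= 0) THEN
     WRITE (*, '(a, i0)') 'ERROR in PDAF setup, status_pdaf = ', status_pdaf
  END IF
END SUBROUTINE set_pdaf_options_sketch
}}}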
     163
     164
     165== Testing the PDAF initialization ==
     166
     167The PDAF initialization can be tested by compiling the assimilation program (one can comment out the call to `PDAF3_assim_offline` if one wants to focus on the initialization) and executing it.
     168
     169Standard output from `PDAF_init` looks like the following:
     170{{{
     171PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     172PDAF    +++                       PDAF                         +++
     173PDAF    +++       Parallel Data Assimilation Framework         +++
     174PDAF    +++                                                    +++
     175PDAF    +++                 Version 3.0beta                    +++
     176PDAF    +++                                                    +++
     177PDAF    +++                   Please cite                      +++
     178PDAF    +++ L. Nerger and W. Hiller, Computers and Geosciences +++
     179PDAF    +++ 2013, 55, 110-118, doi:10.1016/j.cageo.2012.03.026 +++
     180PDAF    +++   when publishing work resulting from using PDAF   +++
     181PDAF    +++                                                    +++
     182PDAF    +++          PDAF itself can also be cited as          +++
     183PDAF    +++  L. Nerger. Parallel Data Assimilation Framework   +++
     184PDAF    +++  (PDAF). Zenodo. 2024. doi:10.5281/zenodo.7861812  +++
     185PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     186
     187
     188PDAF: Initialize filter
     189
     190PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++
     191PDAF    +++ Error Subspace Transform Kalman Filter (ESTKF) +++
     192PDAF    +++                                                +++
     193PDAF    +++  Nerger et al., Mon. Wea. Rev. 140 (2012) 2335 +++
     194PDAF    +++           doi:10.1175/MWR-D-11-00102.1         +++
     195PDAF    ++++++++++++++++++++++++++++++++++++++++++++++++++++++
     196
     197PDAF: Initialize Parallelization
     198PDAF     Parallelization - Filter on model PEs:
     199PDAF                 Total number of PEs:      1
     200PDAF      Number of parallel model tasks:      1
     201PDAF                      PEs for Filter:      1
     202PDAF     # PEs per ensemble task and local ensemble sizes:
     203PDAF     Task     1
     204PDAF     #PEs     1
     205PDAF        N     9
     206
     207PDAF: Call ensemble initialization
     208
     209         Initialize state ensemble
     210         --- read ensemble from files
     211         --- Ensemble size:      9
     212
     213PDAF: Initialization completed
     214}}}
     215
     216The correctness of the ensemble initialization in `init_ens_offline` should be checked by the user.