Changes between Version 4 and Version 5 of AddFilterAlgorithm_PDAF3


Ignore:
Timestamp:
Jun 5, 2025, 1:30:18 PM (6 weeks ago)
Author:
lnerger
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • AddFilterAlgorithm_PDAF3

    v4 v5  
    3838|| `PDAF_set_rparam_filters` || `PDAF_X_set_rparam` || Set real (floating point) parameter for the DA method ||
    3939|| `PDAF_print_info_filters` || `PDAF_x_memtime` || Display information on the run time of the different parts of the DA method as well as information on the amount of allocated memory.  ||
    40 || `PDAF_configinfo_filters` || `PDAF_x_config` || Display the current configuration of the DA method ||
     40|| `PDAF_configinfo_filters` || `PDAF_X_config` || Display the current configuration of the DA method ||
    4141
    4242When `PDAF_init` is called, the DA method is chosen by its ID number or its name parameter (see [wiki:AvailableOptionsforInitPDAF page on specific options of DA method]). Internally to PDAF, each DA method is identified by a string that is defined in the module `PDAF_DA` in `PDAF_da.F90`. The interface routines have a very simple structure. In general, they select the method-specific routine based on the string identifying the filters.
     
    4646|| When adding a DA method, a line for the corresponding method-specific routine has to be inserted to each of the interface routines in `PDAF_utils_filters.F90`. Further, one has to add to `PDAF_da.F90` a line for the DA method declaring its name in the form `PDAF_DA_X` and a correspondig index. ||
    4747
    48 == Internal structure of a DA method's code ==
     48== Internal code structure of a DA method ==
    4949
    5050=== Modules of a DA method ===
     
    5858|| `PDAF_X_analysis` || This module contains the routine that computes and applies the actual ensemble analysis increment. ||
    5959
    60 Note on naming modules and subroutines: Fortran does not allow that the name of a module is idential to the name of a subroutine contained in the module. We we handle this issue with using or omitting understores, e.g. the module `PDAFassimilate_X` contains the subroutine `PDAF_assimilate_X`.
    61 
    62 
    63 === Calls structure for analysis step ===
    64 
    65 The routine `PDAFomi_assimilate_local`, for local ensemble filters and smoothers, or analogously the other available `PDAF_omi_assimilate` routines, like for global ensemble filters and smoothers (`PDAFomi_assimilate_global`) or the different variants of 3D-Var (`PDAFomi_assimilate_*3dvar*`) are called directly from the model code. These are used for the fully-parallel implementation variant. For the flexible-parallelization variant the analogous routines `PDAFomi_put_state_*` are provided. These are generic routines which internally call the method-specific routine (PDAF_assimilate_X or PDAF_put_state_X) according to the chosen filter which then calls the actual update routine (PDAF_X_update). This call structure is explained in Figure 2.
    66 
    67 [[Image(//pics/analysis_call_structure_PDAF3.png)]]
     60Note on naming modules and subroutines: Fortran does not allow that the name of a module is identical to the name of a subroutine contained in the module. We we handle this issue with using or omitting underscores, e.g. the module `PDAFassimilate_X` contains the subroutine `PDAF_assimilate_X`. This limitation is somewhat inconvenient and might lead to inconsistencies in the naming schemes (like that there is the module `PDAFassimilate_X` without underscore following PDAF, but the module `PDAF_X_update` with underscore).
     61
     62
     63=== Call structure for analysis step ===
     64
     65The call structure of an analysis step is shown in Figure 2.
     66The interface routines `PDAF3_assimilate` and `PDAF3_assim_offline` (or the different variants of 3D-Var or the more specific routines for ensemble filters) are called directly from the model code. These generic routines call internally the method-specific routine (`PDAF_assimilate_X` or `PDAF_assim_offline_X`) according to the chosen filter. These routines control the ensemble forecasting for the online coupled mode and the ensemble handling for the offline mode. In implementations done for PDAF2, the routine `PDAF_put_state_X` might be called. `PDAF_assimilate_X` calls `PDAF_put_state_X`, controls the ensemble for the case that multiple ensemble states are propagated by a single model task (i.e. the `flexible parallelization` variant, and collects the ensemble from the  ensemble tasks before the assimilate update and distributes the ensemble to the model tasks afterwards.
     67
     68At the time of an analysis step, the actual analysis routines are called. Here, `PDAFX_update` is the actual main routine for the DA method. In this routine, the observations are initialied and for domain-locallized filters, the local analysis loop is performed. The routine `PDAFX_analysis` then computes and applied the assimilation increment or computes the ensemble transformation of transform filters.
     69
     70[[Image(//pics/analysis_call_structure_PDAF3_.png)]]
    6871[[BR]]'''Figure 2:''' Internal call structure for the analysis step. The universal interface routines `PDAF3_assimilate` and `PDAF3_Assim_offline` call a corresponding specific routine of the DA method. These specific routines  are `PDAF_assimilate_X` and `PDAF_assim_offline_X` which are members of the module `PDAFassimilate_X`. `PDAF_assimilate_X` calls `PDAF_put_state_X`. These two routines together control the online coupled mode, while `PDAF_assim_offline_X` controls the offline coupled mode. The actual analysis update is performed by the routines, `PDAFX_update` and `PDAFX_analysis`, each in their own module.
    6972
    70 The universal routine `PDAF3_assimilatel` as example) calls the method-specific routine. Here `PDAF_assimilate_X` is the routine that controls the ensemble run in case of the fully-parallel implementation. The routine `PDAF_put_state_X` is used for both the fully-parallel and flexible parallelization variants. For the flexible parallelization one uses `PDAFomi_put_state_local` which directly calls `PDAF_put_state_X`, while for the fully parallel variant this routine is called by `PDAF_assimilate_X`. `PDAF_put_state_X` controls the ensemble for the case that multiple ensmeble states are propagated by a single model task and collects the ensemble from the  ensemble tasks before the assimilate update and distributes the ensemble to the model tasks afterwards. The actual main routine for the DA method is `PDAF_X_update` which is called by `PDAF_put_state_X`. The routines on the right without 'omi' in their name are those with the full interface so that they are usable without OMI.
    71 
    72 See the [https://pdaf.awi.de/trac/wiki/AddFilterAlgorithm#Filter-specificroutines section on filter-specific routines] for a detailed description for `PDAF_assimilate_X` and `PDAF_put_state_X`.
    73 
    74 
    75 == Internal dimensions ==
    76 
    77 PDAF internally stores the dimensions of the assimilation system. The dimensions are declared in the Fortran module `PDAF_mod_filter`. Important are the following dimensions:
     73Further below we provide a detailed description for the different routines. The routines `PDAF_assimilate_X`, `PDAF_assim_offline_X`, and `PDAF_put_state_X` are framework routines and the templates only need minimal changes when implementing a new DA method. Also `PDAFX_update` as control routine for the analysis needs likely only minor changes. The main work when implementing a new DA method is in `PDAFX_analysis` where the actual analysis algorithm is implemented.
     74
     75
     76== PDAF's framework infrastructure ==
     77
     78When adding a new DA method to PDAF, one utlizes the framework infrastructure provided by PDAF. PDAF provides dimensions and arrays for the data assimilation. It also handles the ensemble forecasting before providing the ensemble to the DA method for the analysis step.
     79
     80=== Internal dimensions ===
     81
     82PDAF internally stores the dimensions of the assimilation system. The dimensions are declared in the Fortran module `PDAF_mod_core`. Important are the following dimensions:
     83||= Variable =||= Description =||
    7884|| `dim_p` || The size of the state vector (with parallelization the size of the local state vector for the current process) ||
    7985|| `dim_ens` || The overall size of the ensemble ||
    80 || `dim_ens_l` || If the ensemble integration is distributed over several ensemble tasks, this variable stores the size of the sub-ensemble handled by the current process. (`dim_ens_l` equals `dim_ens` if no parallelization or if only a single model task is used.) For the fully parallel implementation it is din_ens_l=1. Note that this variable is only used in the ensemble handling of PDAF, but not in the DA update. ||
    81 || `rank` ||The maximum rank of the ensemble covariance matrix. In almost all cases, it is `dim_ens-1`. Used in error-subspace filters, (L)ESTKF and (L)SEIK ||
    82 
    83 == Internal arrays ==
    84 
    85 Several internal arrays are allocated when PDAF is initialized. These arrays are declared in `PDAF_mod_filter`. They are allocated in `PDAF_X_alloc` (see below for details) and remain allocated throughout the assimilation process.
     86|| `dim_lag` || The lag as number of previous analysis steps; only used if a smoother is implemented ||
     87|| `dim_bias_p` || Dimension of a bias vector; only used if a bias estimation is implemented ||
     88
     89=== Internal arrays ===
     90
     91When running `PDAF_init`, a DA method allocates in its routine `PDAF_X_alloc` several arrays of the PDAF infrastructure. These arrays are declared in `PDAF_mod_core`. These arrays remain allocated throughout the assimilation process.
     92
    8693For the processes that computes the analysis (those with `filterpe=.true.`) the following arrays are defined:
    87 || Array || Dimension || Comment ||
    88 ||`state`|| `dim_p`   || State vector. Used in all filters. Inside the filter code, it's usually called `state_p` ||
    89 ||`eofV` || `dim_p` x `dim_ens` || Ensemble array. Used in all filters. Inside some filters the name is `ens_p` ||
    90 ||`eofU` || `dim_ens-1` x `dim_ens-1` (SEEK, SEIK, ESTKF)[[BR]] `dim_ens` x `dim_ens` (ETKF) ||Eigenvalue matrix '''U''' from '''P'''='''VUV'''^T^ (SEEK, SEIK) or transform matrix '''A''' (ETKF, ESTKF). Not used in EnKF. In some routines the matrix is called `Uinv` to indicate the inverse ||
    91 ||`state_inc` || `dim_p` || state increment vector. Only allocated if incremental analysis updates are used ||
     94||= Array =||= Dimension =||= Comment =||
     95|| `state`|| `dim_p`   || State vector. Used in all DA methods. Inside the filter code, it's usually called `state_p` to indicate parallelization. ||
     96|| `ens` || `dim_p` x `dim_ens` || Ensemble array. Used in all DA methods. Inside some filters the name is `ens_p` to indicate parallelization. ||
     97|| `Ainv` || `dim_ens-1` x `dim_ens-1` (SEIK, ESTKF)[[BR]] `dim_ens` x `dim_ens` (ETKF) || transform matrix **U**^-1^ (in SEIK) or **A**^-1^ (in ETKF, ESTKF). Not used in EnKF. ||
     98|| `sens` || `dim_p` x `dim_ens` x `dim_lag` || Ensemble array for smoothing, storing the ensembles of previous analysis steps. Only used if DA method supports smoothing and `dim_lag>0`. ||
     99|| `bias` || `dim_bias_p` || Bias vector. Only used if a DA method implements bias estimation. ||
    92100
    93101For the processes that only compute model forecasts but are not involved in the analysis step (i.e. `filterpe=.false.`), only one array is defined:
    94102|| Array || Dimension || Comment ||
    95 ||`eofV` || `dim_p` x `dim_ens_l` || Ensemble array on non-filter processes. Used in all filters. ||
    96 
    97 == Filter-specific routines ==
     103|| `ens` || `dim_p` x `dim_ens_l` || Ensemble array on processes that are not filter processes. Used in all DA methods. ||
     104|| `state` || `dim_p` || State vector. Only used if a stae vector is integrated separately from the ensemble states. ||
     105
     106PDAF provides the routine `PDAF_alloc` to perform the allocation of these arrays, thus `PDAF_X_alloc` calls `PDAF_alloc` providing dimension information for the allocation. For more information see [wiki:PDAF_alloc].
     107
     108== Configuration routines of DA method ==
     109
     110The configuration routines are those included in the module `PDAF_X` in file `PDAF_X.F90`.
    98111
    99112When a filter algorithm is added, the following filter routines have to be implemented and inserted to each interface routines described above.
     
    101114 * `PDAF_X_init`
    102115 * `PDAF_X_alloc`
    103  * `PDAF_X_options` (optional)
    104  * `PDAF_X_memtime` (optional)
    105 
    106 The routines are very similar for all DA methods.
    107 
    108 In addition, one has to implement the routines
    109  * `PDAF_put_state_X` and `PDAF_assimilate_X`
    110 
    111 As an example we recommend to see e.g. the routines for the ETKF or LETKF.
    112 
    113 We recommend to base on the routines of an existing filter, as most of the routines can be easily adapted to a new filter method.
     116 * `PDAF_X_options`
     117 * `PDAF_X_set_iparam`
     118 * `PDAF_X_set_rparam`
     119 * `PDAF_X_config
     120 * `PDAF_X_memtime`
     121
     122These routines are contained in the module `PDAF_X`.
     123
     124The templates in the sub-directories of `templates/analysis_step` provide commented template source codes to assistent in implementing a new DA method.
     125
     126=== Module `PDAF_X` ===
     127
     128This module contains the different configuration routines.
     129
     130In the header part of the module, the variables controlling a DA method are declared. These are independent for each DA method. Here, one has the freedom to declare any required parameter. They have to be set using the routine `PDAF_X_set_iparam` and `PDAF_X_set_rparam`.
     131
    114132
    115133=== `PDAF_X_init` ===
    116134
    117 The routine `PDAF_X_init` performs the initialization of filter-specific parameters. In addition, it prints information about the configuration.
    118 
    119 The interface is as follows:
    120 {{{
    121   SUBROUTINE PDAF_X_init(subtype, param_int, dim_pint, param_real, dim_preal, &
    122                                  ensemblefilter, fixedbasis, verbose, outflag)
    123 
    124   INTEGER, INTENT(inout) :: subtype                ! Sub-type of filter
    125   INTEGER, INTENT(in)    :: dim_pint               ! Number of parameters in param_int
    126   INTEGER, INTENT(inout) :: param_int(dim_pint)    ! The array of integer parameters
    127   INTEGER, INTENT(in)    :: dim_preal              ! Number of parameters in param_real
    128   REAL, INTENT(inout)    :: param_real(dim_preal)  ! The array of real parameters
    129   LOGICAL, INTENT(out)   :: ensemblefilter         ! Flag, whether the filter is an ensemble filter or a mode-based filter
    130   LOGICAL, INTENT(out)   :: fixedbasis             ! Flag, whether the chosen `subtype` uses a fixed ensemble (only the ensemble mean is integrated by the model)
    131   INTEGER, INTENT(in)    :: verbose                ! Control screen output
    132   INTEGER, INTENT(inout) :: outflag                ! Status flag
    133 }}}
    134 
    135 The routine has to perform the following operations:
     135The routine `PDAF_X_init` performs the initialization of filter-specific parameters.
     136
     137The routine usually performs the following operations:
     138 * Print a message with information on the DA method
     139
    136140 * initialize the PDAF-internal parameter variables specific for the DA method from the provided values of `subtype`, `param_int`, and `param_real`.
    137141 * set the logical flags `ensemblefilter` and `fixedbasis`.
    138142The existing implementations also include some screen output about the configuration.
    139 
    140 Please note:
    141  * The routine should check, whether the provided value of `subtype` is a valid choice. If this is not the case, the error flag should be set to 2.
    142  * Only parameters from `param_int` and `param_real` up to the value `dim_pint` and `dim_preal`, respectively, should be considered in the initialization. The arrays may be bigger, but the user defines which parameters are to be used be setting the values of `dim_pint` and `dim_preal`.
    143  * The error flag `outflag` is initially set to 0.
    144  * The internal parameters are declared in the Fortran module `PDAF_mod_filter`. If a new filter algorithm requires additional parameters, their declaration should be added to the module. Alternatively, one can introduce a new module the holds the parameters specific to the new DA method.
     143 * Call `PDAF_X_set_iparam` in a loop from index 3 to `dim_pint` to initialize integer parameters for the DA method
     144 * Call `PDAF_X_set_rparam` in a loop from index 3 to `dim_preal` to initialize real-valued parameters for the DA method
     145 * Initialize different flags and values
     146   * `localfilter`: Set to (1) for domain-localized methods, (0) otherwise
     147   * `fixedbasis`: Set to .true. for ensemble OI schemes
     148   * `ensemblefiles`: Only .false. for mode-based filters; can be omitted if .true.
     149   * `dim_lag`: Usually =0 since smoothing is activated by an integer parameter set with `PDAF_X_set_iparam`
     150   * `observed_ens`: Usually .false. since this option is commonly set with integer parameter 9 (see [wiki:AvailableOptionsforInitPDAF page on available options])
     151 * Check if provided value of `subtype` is valid. If this is not the case, set error flag should to 3.
    145152
    146153
    147154=== `PDAF_X_alloc` ===
    148155
    149 The routine `PDAF_X_alloc` allocates arrays for the data assimilation. These are the arrays that are used in the forecasting and hence need to be persistently allocated, like the ensemble array and a state vector. The success of the allocation is checked.
    150 
    151 The interface is as follows:
    152 {{{
    153   SUBROUTINE PDAF_X_alloc(subtype, outflag)
    154 
    155   INTEGER, INTENT(in) :: subtype        ! Sub-type of filter
    156   INTEGER, INTENT(out):: outflag        ! Status flag
    157 }}}
    158 
    159 All arrays that need to be allocated are declared in the Fortran module `PDAF_mod_filter`. Here, also the shapes of the arrays are declared. For the allocation of arrays, one has to distinguish between processes that compute the analysis step (`filterpe=.true.`) and those that only participate in the ensemble forecast (`filterpe=.false.`).
    160 
    161 For the processes that compute the analysis (those with `filterpe=.true.`) it is mandatory to allocate the following two arrays:
    162  * `state`: The state vector of size `dim_p`.
    163  * `eofV`: This is the ensemble matrix in all ensemble-based filters. For SEEK it is the matrix holding eigenvectors. `eofV` has size (`dim_p`, `dim_ens`).
    164 Depending on the filter algorithm some of the following arrays also need to be allocated:
    165  * `eofU`: This is the eigenvalue matrix '''U''' used in the SEIK and SEEK filters and the transform matrix of ESTKF (here, its size is (`rank`,`rank`)). For ETKF, it is the transform matrix '''A''' of size (`dim_ens`,`dim_ens`). The array should always be allocated. For method that don't use such transform matrix, for example the EnKF, it can be allocated with size (1,1).
    166  * `state_inc`: The increment to the state vector computed in the analysis step. It only needs to be allocated in this routine, if incremental analysis updating is implemented. Otherwise, it is sufficient to allocate and deallocate `state_inc` in the routine performing the analysis step. The size of `state_inc` is `dim_p`.
    167  * `bias`: If the filter algorithm is implemented with bias correction, the vector `bias` with size `dim_bias_p` is allocated.
    168 
    169 Processes that only participate in the computation of the ensemble forecast, but are not involved in computing the analysis step (those with `filterpe=.false.`), operate only on a sub-ensemble. Accordingly, an ensemble array for this sub-ensemble has to be allocated. This is:
    170  * `eofV`: This is the ensemble matrix in all ensemble-based filters. For SEEK it is the matrix holding eigenvectors. For the processes with `filterpe=.false.`, `eofV` has size (`dim_p`, `dim_ens_l`).
    171 
     156The routine `PDAF_X_alloc` allocates arrays for the DA method. These are the arrays that are used in the forecasting and hence need to be persistently allocated, like the ensemble array and a state vector. The success of the allocation is checked.
     157
     158For the arrays provided by the PDAF framework, the allocation is usually done by calling `PDAF_alloc`, see [wiki:PDAF_alloc page on PDAF_alloc].
     159
     160
     161=== `PDAF_X_config` ===
     162
     163The routine prints the informstion on the chosen configuration. It is called at the end of the configuration phase when `PDAF_init_forecast` is claled for the online mode, or one of the 'assim_offline' routines (e.g. PDAF3_assim_offline) for the offline mode.
     164
     165The tempalte show the typical form of the output.
    172166
    173167