Changes between Version 7 and Version 8 of ImplementAnalysisENSRF_EAKF


Ignore:
Timestamp:
May 16, 2025, 6:39:35 PM (41 hours ago)
Author:
lnerger
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ImplementAnalysisENSRF_EAKF

    v7 v8  
    5555For completeness we discuss here all user-supplied routines that are specified in the interface to `PDAF_assimilate_ensrf`. Thus, some of the user-supplied routines that are explained on the page about the modification of the model code for the ensemble integration are repeated here.
    5656
    57 In our study, Nerger et al. (2015), we discussed that applying localization can lead to stability issues of the ENSRF. The filter performs a loop over all single observations. With localization the assimilation result depends on the order in which these observations are assimilated. This actually leads to the effect that the assimilation result at some grid point does not only depend on the observations within the localization radius **r**, but also on observations further away. This happens if these observation at larger distance are assimilated before the observations within the radius **r** and if they influence the state close to distance **r**. In this case the innovation for the state close to distance **r** is already changed when assimilating the observation within the radius and hence leads to a different result. This effect has implications on the parallelization since keeping the observation order constant over the full model domain leads to a partial serialization of the algorithm. In the implementation of the serial observation processing in PDAF we use the parallelization approach that does not guarantee the same order of the observations. Usually, the differences when changing the observation order are small, but the benefit is a better scaling because the serialization is avoided. Nonetheless, we generally recommend using LESTKF or LETKF, or their global variants ESTKF or ETKF, since they do not depend explicitly on the observation order, and they allow for non-diagonal observation error covariance matrices. However, the ENSRF/EAKF might have a good compute performance.
     57In our study, Nerger et al. (2015), we discussed that applying localization can lead to stability issues of the ENSRF. The filter performs a loop over all single observations. With localization the assimilation result depends on the order in which these observations are assimilated. This actually leads to the effect that the assimilation result at some grid point does not only depend on the observations within the localization radius **r**, but also on observations further away. This happens if these observation at larger distance are assimilated before the observations within the radius **r** and if they influence the state close to distance **r**. In this case the innovation for the state close to distance **r** is already changed when assimilating the observation within the radius and hence leads to a different result. This effect has implications on the parallelization since keeping the observation order constant over the full model domain leads to a partial serialization of the algorithm. In the implementation of the serial observation processing in PDAF we use the parallelization approach that does not guarantee the same order of the observations. Usually, the differences when changing the observation order are small, but the benefit is a better scaling because the serialization is avoided. Nonetheless, we generally recommend using LESTKF or LETKF, or their global variants ESTKF or ETKF, since they do not depend explicitly on the observation order, and they allow for non-diagonal observation error covariance matrices. Nonetheless, the ENSRF/EAKF methods might have a good compute performance.
    5858
    5959== `PDAF_assimilate_ensrf` ==
     
    8787The interface is the following:
    8888{{{
    89 SUBROUTINE PDAF_assim_offline_ensrf(&
     89SUBROUTINE PDAF_assim_offline_ensrf( &
    9090     U_init_dim_obs_f, U_obs_op_f,  U_init_obs_f, U_init_obsvars_f, &
    9191     U_localize_covar_serial, &
     
    9797== `PDAF_put_state_ensrf` ==
    9898
    99 For the offline mode of PDAF, the routine `PDAF_put_state_ensrf` has to be used instead of `PDAF_assimilate_ensrf`. This routine can also be used when the 'flexible' implementation variant is chosen for the assimilation system,  The general aspects of the filter specific routines `PDAF_put_state_*` have been described on the page [ModifyModelforEnsembleIntegration Modification of the model code for the ensemble integration]. The interface of the routine is identical with that of `PDAF_assimilate_ensrf` with the exception that the arguments of the user-supplied routines `U_distribute_state` and `U_next_observation` are missing.
     99In PDAF3 this routine exists for backward-compatibility to support implementations done using PDAF2 standards.
     100This routine was used in implementations before the release of PDAF V3.0 for the offline mode of PDAF and for the 'flexible' implementation variant of the assimilation system. The general aspects of the filter specific routines `PDAF_put_state_*` have been described on the page [ModifyModelforEnsembleIntegration Modification of the model code for the ensemble integration]. The interface of the routine is identical with that of `PDAF_assimilate_ensrf` with the exception that the arguments of the user-supplied routines `U_distribute_state` and `U_next_observation` are missing.
    100101
    101102The interface is the following:
     
    110111== User-supplied routines ==
    111112
    112 Here all user-supplied routines are described that are required in the call to `PDAF_assimilate_ensrf`. For some of the generic routines, we link to the page on [ModifyModelforEnsembleIntegration modifying the model code for the ensemble integration].
    113 
    114 To indicate user-supplied routines we use the prefix `U_`. In the tutorials in `tutorial/` and in the template directory `templates/` these routines exist without the prefix, but with the extension `_pdaf`. The files are named correspondingly. In the section titles below we provide the name of the template file in parentheses.
    115 
    116 In the subroutine interfaces some variables appear with the suffix `_p`. This suffix indicates that the variable is particular to a model sub-domain, if a domain decomposed model is used. Thus, the value(s) in the variable will be different for different model sub-domains.
     113Here, all user-supplied routines are described that are required in the call to `PDAF_assimilate_ensrf`. For some of the generic routines, we link to the page on [ModifyModelforEnsembleIntegration modifying the model code for the ensemble integration].
     114
     115To indicate user-supplied routines we use the prefix `U_`. In the tutorials in `tutorial/` and in the template directory `templates/` these routines exist without the prefix, but with the extension `_pdaf`. The files are named correspondingly. In the section titles below, we provide the name of the template file in parentheses.
     116
     117In the subroutine interfaces, some variables appear with the suffix `_p`. This suffix indicates that the variable is particular to a model sub-domain, if a domain decomposed model is used. Thus, the value(s) in the variable will be different for different model sub-domains.
    117118
    118119
     
    143144
    144145Some hints:
    145  * We recommend to not only determine the size of the observation vector at this point. The routine is a good place to also already gather information about the corresponding indices of the state vector needed later to implement the observation operator. In addition, one can already prepare an array that holds the full observation vector, an array storing the coordinates of the observations and possible an array storing observation error variances (if the observation error covariance matrix is diagonal). The required arrays can be defined in a module like `mod_assimilation`. The information can be used later in `U_localize_covar_serial`.
    146     || **Note**: PDAF-OMI provides a structured approach for implementing the observation functionality. Because of this we generally recommend to ue the PDAF3 interface that uses PDAF-OMI. ||
     146 * We recommend to not only determine the size of the observation vector at this point. The routine is a good place to also already gather information about the corresponding indices of the state vector needed later to implement the observation operator. In addition, one can already prepare an array that holds the full observation vector, an array storing the coordinates of the observations and possibly an array storing observation error variances (if the observation error covariance matrix is diagonal). The required arrays can be defined in a module like `mod_assimilation`. The information can be used later in `U_localize_covar_serial`.
     147    || **Note**: PDAF-OMI provides a structured approach for implementing the observation functionality. This approach simpified the implentation. Because of this, we generally recommend to use the [wiki:PDAF3_interface PDAF3 interface] that uses PDAF-OMI. ||
     148
    147149 * The routine is similar to `init_dim_obs` used in the global filters. However, if the global filter is used with a domain-decomposed model, it only initializes the size of the observation vector for the local model sub-domain. This is different for the local filters, as the local analysis also requires observational data from neighboring model sub-domains. Nonetheless, one can base on an implemented routine `init_dim_obs` to implement `init_dim_obs_f`.
    148150
     
    166168
    167169Hint:
    168  * The routine is similar to `init_dim_obs` used for the global filters. However, with a domain-decomposed model `m_state_f` will need to contain parts of the state vector from neighboring model sub-domains. Thus, one needs to collect this information which resides in the memory of other processes. PDAF provides the routine [wiki:PDAF_gather_obs_f PDAF_gather_obs_f] for this task. The example implementation in `tutorial/classical/online_2D_parallelmodel` shows the use of `PDAF_gather_obs_f`.
     170 * The routine is similar to `init_dim_obs` used for the global filters. However, with a domain-decomposed model `m_state_f` will need to contain parts of the state vector from neighboring model sub-domains. Thus, one needs to collect this information which resides in the memory of other processes. PDAF provides the routine [wiki:PDAF_gather_obs_f PDAF_gather_obs_f] for this task. The example implementation in `tutorial/PDAF1_full_interface/online_2D_parallelmodel` shows the use of `PDAF_gather_obs_f`.
    169171
    170172
     
    184186The routine is called during the analysis step before the loop over the single observations is entered. It has to provide the full vector of observations in `observation_f` for the current time step.
    185187
    186 Hints:
    187  * As for the other 'full' routines: While the global counterpart of this routine (`init_obs`) has to initialize the observation vector only for the local model sub-domain, the 'full' routine has to include observations that spatially belong to neighboring model sub-domains. As an easy choice one can simply initialize a vector of all globally available observations.
     188Hint:
     189 * As for the other 'full' routines: While the global counterpart of this routine (`init_obs`) has to initialize the observation vector only for the local model sub-domain, the 'full' routine has to include observations that spatially belong to neighboring model sub-domains. As an easy choice, one can simply initialize a vector of all globally available observations.
    188190
    189191
     
    215217  INTEGER, INTENT(in) :: dim_obs_f             !< Number of full observations
    216218  REAL, INTENT(inout) :: HP_p(dim_p)           !< Process-local part of matrix HP for observation iobs
    217   REAL, INTENT(inout) :: HXY_p(dim_obs_F)      !< Process-local part of matrix HX(HX_full) for full observations}}}
    218 }}}
    219 
    220 The routine is called during the loop over all single observations. The purpose of the routine is to apply covariance localization to the vectors '''Hi P''' and '''Hi PH^T^''' for the assimilation of a single observation (determined by the index `iobs` related to the observation operator '''Hi'''). Here '''Hi PH^T^''' is the vector relating to the observed covariance matrix for the full observation vector. This vector is required for parallelization.
     219  REAL, INTENT(inout) :: HXY_p(dim_obs_f)      !< Process-local part of matrix HX(HX_full) for full observations}}}
     220}}}
     221
     222The routine is called during the loop over all single observations. The purpose of the routine is to apply covariance localization to the vectors '''Hi P''' and '''Hi PH^T^''' for the assimilation of a single observation (determined by the index `iobs` related to the observation operator '''Hi'''). Here, '''Hi PH^T^''' (`HXY_p`) is the vector relating to the observed covariance matrix for the full observation vector. This vector is required for parallelization.
    221223
    222224Hints:
    223  * To compute the localization one can use the routine `PDAF_local_weight` after computing the distance between two elements in the vector '''Hi P''' or '''Hi PH^T'''.
     225 * To compute the localization, one can use the routine `PDAF_local_weight` after computing the distance between two elements in the vector '''Hi P''' or '''Hi PH^T'''.
    224226
    225227
    226228=== `U_prepoststep` (prepoststep_ens_pdaf.F90) ===
    227229
    228 The general aspects of this routines have already been described on the [ModifyModelforEnsembleIntegration#U_prepoststepprepoststep_ens_pdaf.F90 page on modifying the model code for the ensemble integration] for the SEIK filter. For completeness, the description is repeated specifically for the EnKF:
    229 
    230 The interface of the routine is identical for all filters, but sizes can vary. Also, the particular operations that are performed in the routine can be specific for each filter algorithm.
    231 
    232 The interface for this routine is for the LEnKF
    233 {{{
    234 SUBROUTINE prepoststep(step, dim_p, dim_ens, dim_ens_p, dim_obs_p, &
     230The general aspects of this routines have already been described on the [ModifyModelforEnsembleIntegration#U_prepoststepprepoststep_ens_pdaf.F90 page on modifying the model code for the ensemble integration] for the SEIK filter. For completeness, the description is repeated specifically for the ENSRF:
     231
     232The interface of the routine is identical for all filters, but sizes can vary. Additionally, the specific operations performed in the routine can vary for each filter algorithm.
     233
     234The interface for this routine is for the ENSRF
     235{{{
     236SUBROUTINE U_prepoststep(step, dim_p, dim_ens, dim_ens_p, dim_obs_p, &
    235237                       state_p, Uinv, ens_p, flag)
    236238
     
    249251}}}
    250252
    251 The routine `U_prepoststep` is called once at the beginning of the assimilation process. In addition, it is called during the assimilation cycles before the analysis step and after the ensemble transformation. The routine is called by all filter processes (that is `filterpe=1`).
    252 
    253 The routine provides for the user the full access to the ensemble of model states. Thus, user-controlled pre- and post-step operations can be performed.  For example the forecast and the analysis states and ensemble covariance matrix can be analyzed, e.g. by computing the estimated variances. In addition, the estimates can be written to disk.
     253The routine `U_prepoststep` is called once at the beginning of the assimilation process and subsequently during the assimilation cycles before the analysis step and after the ensemble transformation. The routine is called by all filter processes (that is `filterpe=1`).
     254
     255The routine provides for the user with full access to the ensemble of model states. Thus, user-controlled pre- and post-step operations can be performed. 
     256For example, the forecast and analysis states, as well as the ensemble covariance matrix, can be analyzed, e.g., by computing the estimated variances.
     257In addition, the estimates can be written to disk.
    254258
    255259Hint:
    256  * If a user considers to perform adjustments to the estimates (e.g. for balances), this routine is the right place for it.
    257  * Only for the SEEK filter the state vector (`state_p`) is initialized. For all other filters, the array is allocated, but it can be used freely during the execution of `U_prepoststep`.
     260 * If a user considers to perform adjustments to the estimates (e.g., for balances), this routine is the right place for it.
     261 * The state vector (`state_p`) is allocated for all other filters, but it can be used freely during the execution of `U_prepoststep`. One cannot assume that the vector holds the correct state vector.
    258262 * The array `Uinv` is not used in the EnKF. Internally to PDAF, it is allocated to be of size (1,1).
    259  * Apart from the size of the array `Uinv`, the interface is identical for all ensemble filters (SEIK/ETKF/EnKF/LSEIK/LETKF/LEnKF). In general it should be possible to use an identical pre/poststep routine for all these filters.
    260  * The interface through which `U_prepoststep` is called does not include the array of smoothed ensembles. In order to access the smoother ensemble array one has to set a pointer to it using a call to the routine `PDAF_get_smootherens` (see page on [AuxiliaryRoutines auxiliary routines])
     263 * Apart from the size of the array `Uinv`, the interface is identical for all ensemble filters. In general, it should be possible to use an identical pre/poststep routine for all these filters.
     264 * The interface through which `U_prepoststep` is called does not include the array of smoothed ensembles. In order to access the smoother ensemble array one has to set a pointer to it using a call to the routine `PDAF_get_smootherens` (see the page on [AuxiliaryRoutines auxiliary routines])
    261265
    262266
     
    272276For the ENSRF/EAKF methods, the user-supplied routines are essentially executed in the order they are listed in the interface to `PDAF_assimilate_ensrf`. The order can be important as some routines can perform preparatory work for later routines. For example, `U_init_dim_obs` can prepare an index array that provides the information for executing the observation operator in `U_obs_op`.
    273277
    274 Before the analysis step is called the following routine is executed:
     278Before the analysis step is called, the following routine is executed:
    275279 1. [#U_collect_statecollect_state_pdaf.F90 U_collect_state]
    276280
    277 The analysis step is executed when the ensemble integration of the forecast is completed. During the analysis step the following routines are executed in the given order:
    278  1. [#U_prepoststepprepoststep_ens_pdaf.F90 U_prepoststep] (Call to act on the forecast ensemble, called with negative value of the time step)
     281The analysis step is executed when the ensemble integration of the forecast is completed. During the analysis step, the following routines are executed in the given order:
     282 1. [#U_prepoststepprepoststep_ens_pdaf.F90 U_prepoststep] (Call to act on the forecast ensemble, called with a negative value of the time step)
    279283 1. [#U_init_dim_obsinit_dim_obs_pdaf.F90 U_init_dim_obs]
    280284 1. [#U_obs_opobs_op_pdaf.F90 U_obs_op] (`dim_ens` calls: one call for each ensemble member; one more call if also applied to ensemble mean)