Context Navigation

Changes between Version 7 and Version 8 of ImplementAnalysisENSRF_EAKF

Timestamp:: May 16, 2025, 6:39:35 PM (3 months ago)
Author:: lnerger
Comment:: --

Legend:

: Unmodified
: Added
: Removed
: Modified

ImplementAnalysisENSRF_EAKF

-              v7
+              v8
 For completeness we discuss here all user-supplied routines that are specified in the interface to `PDAF_assimilate_ensrf`. Thus, some of the user-supplied routines that are explained on the page about the modification of the model code for the ensemble integration are repeated here.
 In our study, Nerger et al. (2015), we discussed that applying localization can lead to stability issues of the ENSRF. The filter performs a loop over all single observations. With localization the assimilation result depends on the order in which these observations are assimilated. This actually leads to the effect that the assimilation result at some grid point does not only depend on the observations within the localization radius **r**, but also on observations further away. This happens if these observation at larger distance are assimilated before the observations within the radius **r** and if they influence the state close to distance **r**. In this case the innovation for the state close to distance **r** is already changed when assimilating the observation within the radius and hence leads to a different result. This effect has implications on the parallelization since keeping the observation order constant over the full model domain leads to a partial serialization of the algorithm. In the implementation of the serial observation processing in PDAF we use the parallelization approach that does not guarantee the same order of the observations. Usually, the differences when changing the observation order are small, but the benefit is a better scaling because the serialization is avoided. Nonetheless, we generally recommend using LESTKF or LETKF, or their global variants ESTKF or ETKF, since they do not depend explicitly on the observation order, and they allow for non-diagonal observation error covariance matrices. However, the ENSRF/EAKF might have a good compute performance.
+In our study, Nerger et al. (2015), we discussed that applying localization can lead to stability issues of the ENSRF. The filter performs a loop over all single observations. With localization the assimilation result depends on the order in which these observations are assimilated. This actually leads to the effect that the assimilation result at some grid point does not only depend on the observations within the localization radius **r**, but also on observations further away. This happens if these observation at larger distance are assimilated before the observations within the radius **r** and if they influence the state close to distance **r**. In this case the innovation for the state close to distance **r** is already changed when assimilating the observation within the radius and hence leads to a different result. This effect has implications on the parallelization since keeping the observation order constant over the full model domain leads to a partial serialization of the algorithm. In the implementation of the serial observation processing in PDAF we use the parallelization approach that does not guarantee the same order of the observations. Usually, the differences when changing the observation order are small, but the benefit is a better scaling because the serialization is avoided. Nonetheless, we generally recommend using LESTKF or LETKF, or their global variants ESTKF or ETKF, since they do not depend explicitly on the observation order, and they allow for non-diagonal observation error covariance matrices. Nonetheless, the ENSRF/EAKF methods might have a good compute performance.
 == `PDAF_assimilate_ensrf` ==
 …
 The interface is the following:
 {{{
 SUBROUTINE PDAF_assim_offline_ensrf(&
+SUBROUTINE PDAF_assim_offline_ensrf( &
      U_init_dim_obs_f, U_obs_op_f,  U_init_obs_f, U_init_obsvars_f, &
      U_localize_covar_serial, &
 …
 == `PDAF_put_state_ensrf` ==
+For the offline mode of PDAF, the routine `PDAF_put_state_ensrf` has to be used instead of `PDAF_assimilate_ensrf`. This routine can also be used when the 'flexible' implementation variant is chosen for the assimilation system,  The general aspects of the filter specific routines `PDAF_put_state_*` have been described on the page [ModifyModelforEnsembleIntegration Modification of the model code for the ensemble integration]. The interface of the routine is identical with that of `PDAF_assimilate_ensrf` with the exception that the arguments of the user-supplied routines `U_distribute_state` and `U_next_observation` are missing.
+In PDAF3 this routine exists for backward-compatibility to support implementations done using PDAF2 standards.
+This routine was used in implementations before the release of PDAF V3.0 for the offline mode of PDAF and for the 'flexible' implementation variant of the assimilation system. The general aspects of the filter specific routines `PDAF_put_state_*` have been described on the page [ModifyModelforEnsembleIntegration Modification of the model code for the ensemble integration]. The interface of the routine is identical with that of `PDAF_assimilate_ensrf` with the exception that the arguments of the user-supplied routines `U_distribute_state` and `U_next_observation` are missing.
 The interface is the following:
 …
 == User-supplied routines ==
 Here all user-supplied routines are described that are required in the call to `PDAF_assimilate_ensrf`. For some of the generic routines, we link to the page on [ModifyModelforEnsembleIntegration modifying the model code for the ensemble integration].
 To indicate user-supplied routines we use the prefix `U_`. In the tutorials in `tutorial/` and in the template directory `templates/` these routines exist without the prefix, but with the extension `_pdaf`. The files are named correspondingly. In the section titles below we provide the name of the template file in parentheses.
 In the subroutine interfaces some variables appear with the suffix `_p`. This suffix indicates that the variable is particular to a model sub-domain, if a domain decomposed model is used. Thus, the value(s) in the variable will be different for different model sub-domains.
+Here, all user-supplied routines are described that are required in the call to `PDAF_assimilate_ensrf`. For some of the generic routines, we link to the page on [ModifyModelforEnsembleIntegration modifying the model code for the ensemble integration].
+To indicate user-supplied routines we use the prefix `U_`. In the tutorials in `tutorial/` and in the template directory `templates/` these routines exist without the prefix, but with the extension `_pdaf`. The files are named correspondingly. In the section titles below, we provide the name of the template file in parentheses.
+In the subroutine interfaces, some variables appear with the suffix `_p`. This suffix indicates that the variable is particular to a model sub-domain, if a domain decomposed model is used. Thus, the value(s) in the variable will be different for different model sub-domains.
 …
 Some hints:
+ * We recommend to not only determine the size of the observation vector at this point. The routine is a good place to also already gather information about the corresponding indices of the state vector needed later to implement the observation operator. In addition, one can already prepare an array that holds the full observation vector, an array storing the coordinates of the observations and possible an array storing observation error variances (if the observation error covariance matrix is diagonal). The required arrays can be defined in a module like `mod_assimilation`. The information can be used later in `U_localize_covar_serial`.
+    || **Note**: PDAF-OMI provides a structured approach for implementing the observation functionality. Because of this we generally recommend to ue the PDAF3 interface that uses PDAF-OMI. ||
+ * We recommend to not only determine the size of the observation vector at this point. The routine is a good place to also already gather information about the corresponding indices of the state vector needed later to implement the observation operator. In addition, one can already prepare an array that holds the full observation vector, an array storing the coordinates of the observations and possibly an array storing observation error variances (if the observation error covariance matrix is diagonal). The required arrays can be defined in a module like `mod_assimilation`. The information can be used later in `U_localize_covar_serial`.
+    || **Note**: PDAF-OMI provides a structured approach for implementing the observation functionality. This approach simpified the implentation. Because of this, we generally recommend to use the [wiki:PDAF3_interface PDAF3 interface] that uses PDAF-OMI. ||
  * The routine is similar to `init_dim_obs` used in the global filters. However, if the global filter is used with a domain-decomposed model, it only initializes the size of the observation vector for the local model sub-domain. This is different for the local filters, as the local analysis also requires observational data from neighboring model sub-domains. Nonetheless, one can base on an implemented routine `init_dim_obs` to implement `init_dim_obs_f`.
 …
 Hint:
  * The routine is similar to `init_dim_obs` used for the global filters. However, with a domain-decomposed model `m_state_f` will need to contain parts of the state vector from neighboring model sub-domains. Thus, one needs to collect this information which resides in the memory of other processes. PDAF provides the routine [wiki:PDAF_gather_obs_f PDAF_gather_obs_f] for this task. The example implementation in `tutorial/classical/online_2D_parallelmodel` shows the use of `PDAF_gather_obs_f`.
+ * The routine is similar to `init_dim_obs` used for the global filters. However, with a domain-decomposed model `m_state_f` will need to contain parts of the state vector from neighboring model sub-domains. Thus, one needs to collect this information which resides in the memory of other processes. PDAF provides the routine [wiki:PDAF_gather_obs_f PDAF_gather_obs_f] for this task. The example implementation in `tutorial/PDAF1_full_interface/online_2D_parallelmodel` shows the use of `PDAF_gather_obs_f`.
 …
 The routine is called during the analysis step before the loop over the single observations is entered. It has to provide the full vector of observations in `observation_f` for the current time step.
 Hints:
  * As for the other 'full' routines: While the global counterpart of this routine (`init_obs`) has to initialize the observation vector only for the local model sub-domain, the 'full' routine has to include observations that spatially belong to neighboring model sub-domains. As an easy choice one can simply initialize a vector of all globally available observations.
+Hint:
+ * As for the other 'full' routines: While the global counterpart of this routine (`init_obs`) has to initialize the observation vector only for the local model sub-domain, the 'full' routine has to include observations that spatially belong to neighboring model sub-domains. As an easy choice, one can simply initialize a vector of all globally available observations.
 …
   INTEGER, INTENT(in) :: dim_obs_f             !< Number of full observations
   REAL, INTENT(inout) :: HP_p(dim_p)           !< Process-local part of matrix HP for observation iobs
   REAL, INTENT(inout) :: HXY_p(dim_obs_F)      !< Process-local part of matrix HX(HX_full) for full observations}}}
 }}}
 The routine is called during the loop over all single observations. The purpose of the routine is to apply covariance localization to the vectors '''Hi P''' and '''Hi PH^T^''' for the assimilation of a single observation (determined by the index `iobs` related to the observation operator '''Hi'''). Here '''Hi PH^T^''' is the vector relating to the observed covariance matrix for the full observation vector. This vector is required for parallelization.
+  REAL, INTENT(inout) :: HXY_p(dim_obs_f)      !< Process-local part of matrix HX(HX_full) for full observations}}}
+}}}
+The routine is called during the loop over all single observations. The purpose of the routine is to apply covariance localization to the vectors '''Hi P''' and '''Hi PH^T^''' for the assimilation of a single observation (determined by the index `iobs` related to the observation operator '''Hi'''). Here, '''Hi PH^T^''' (`HXY_p`) is the vector relating to the observed covariance matrix for the full observation vector. This vector is required for parallelization.
 Hints:
  * To compute the localization one can use the routine `PDAF_local_weight` after computing the distance between two elements in the vector '''Hi P''' or '''Hi PH^T'''.
+ * To compute the localization, one can use the routine `PDAF_local_weight` after computing the distance between two elements in the vector '''Hi P''' or '''Hi PH^T'''.
 === `U_prepoststep` (prepoststep_ens_pdaf.F90) ===
 The general aspects of this routines have already been described on the [ModifyModelforEnsembleIntegration#U_prepoststepprepoststep_ens_pdaf.F90 page on modifying the model code for the ensemble integration] for the SEIK filter. For completeness, the description is repeated specifically for the EnKF:
 The interface of the routine is identical for all filters, but sizes can vary. Also, the particular operations that are performed in the routine can be specific for each filter algorithm.
 The interface for this routine is for the LEnKF
 {{{
 SUBROUTINE prepoststep(step, dim_p, dim_ens, dim_ens_p, dim_obs_p, &
+The general aspects of this routines have already been described on the [ModifyModelforEnsembleIntegration#U_prepoststepprepoststep_ens_pdaf.F90 page on modifying the model code for the ensemble integration] for the SEIK filter. For completeness, the description is repeated specifically for the ENSRF:
+The interface of the routine is identical for all filters, but sizes can vary. Additionally, the specific operations performed in the routine can vary for each filter algorithm.
+The interface for this routine is for the ENSRF
+{{{
+SUBROUTINE U_prepoststep(step, dim_p, dim_ens, dim_ens_p, dim_obs_p, &
                        state_p, Uinv, ens_p, flag)
 …
 }}}
+The routine `U_prepoststep` is called once at the beginning of the assimilation process. In addition, it is called during the assimilation cycles before the analysis step and after the ensemble transformation. The routine is called by all filter processes (that is `filterpe=1`).
+The routine provides for the user the full access to the ensemble of model states. Thus, user-controlled pre- and post-step operations can be performed.  For example the forecast and the analysis states and ensemble covariance matrix can be analyzed, e.g. by computing the estimated variances. In addition, the estimates can be written to disk.
+The routine `U_prepoststep` is called once at the beginning of the assimilation process and subsequently during the assimilation cycles before the analysis step and after the ensemble transformation. The routine is called by all filter processes (that is `filterpe=1`).
+The routine provides for the user with full access to the ensemble of model states. Thus, user-controlled pre- and post-step operations can be performed.
+For example, the forecast and analysis states, as well as the ensemble covariance matrix, can be analyzed, e.g., by computing the estimated variances.
+In addition, the estimates can be written to disk.
 Hint:
  * If a user considers to perform adjustments to the estimates (e.g. for balances), this routine is the right place for it.
  * Only for the SEEK filter the state vector (`state_p`) is initialized. For all other filters, the array is allocated, but it can be used freely during the execution of `U_prepoststep`.
+ * If a user considers to perform adjustments to the estimates (e.g., for balances), this routine is the right place for it.
+ * The state vector (`state_p`) is allocated for all other filters, but it can be used freely during the execution of `U_prepoststep`. One cannot assume that the vector holds the correct state vector.
  * The array `Uinv` is not used in the EnKF. Internally to PDAF, it is allocated to be of size (1,1).
  * Apart from the size of the array `Uinv`, the interface is identical for all ensemble filters (SEIK/ETKF/EnKF/LSEIK/LETKF/LEnKF). In general it should be possible to use an identical pre/poststep routine for all these filters.
  * The interface through which `U_prepoststep` is called does not include the array of smoothed ensembles. In order to access the smoother ensemble array one has to set a pointer to it using a call to the routine `PDAF_get_smootherens` (see page on [AuxiliaryRoutines auxiliary routines])
+ * Apart from the size of the array `Uinv`, the interface is identical for all ensemble filters. In general, it should be possible to use an identical pre/poststep routine for all these filters.
+ * The interface through which `U_prepoststep` is called does not include the array of smoothed ensembles. In order to access the smoother ensemble array one has to set a pointer to it using a call to the routine `PDAF_get_smootherens` (see the page on [AuxiliaryRoutines auxiliary routines])
 …
 For the ENSRF/EAKF methods, the user-supplied routines are essentially executed in the order they are listed in the interface to `PDAF_assimilate_ensrf`. The order can be important as some routines can perform preparatory work for later routines. For example, `U_init_dim_obs` can prepare an index array that provides the information for executing the observation operator in `U_obs_op`.
 Before the analysis step is called the following routine is executed:
+Before the analysis step is called, the following routine is executed:
 . [#U_collect_statecollect_state_pdaf.F90 U_collect_state]
 The analysis step is executed when the ensemble integration of the forecast is completed. During the analysis step the following routines are executed in the given order:
 . [#U_prepoststepprepoststep_ens_pdaf.F90 U_prepoststep] (Call to act on the forecast ensemble, called with negative value of the time step)
+The analysis step is executed when the ensemble integration of the forecast is completed. During the analysis step, the following routines are executed in the given order:
+. [#U_prepoststepprepoststep_ens_pdaf.F90 U_prepoststep] (Call to act on the forecast ensemble, called with a negative value of the time step)
 . [#U_init_dim_obsinit_dim_obs_pdaf.F90 U_init_dim_obs]
 . [#U_obs_opobs_op_pdaf.F90 U_obs_op] (`dim_ens` calls: one call for each ensemble member; one more call if also applied to ensemble mean)