
Generating Synthetic Observations with PDAF

This page describes the implementation for PDAF3. The documentation for PDAF2 is still available.

Overview

Twin data assimilation experiments are a common approach to assess data assimilation methods. In twin experiments one uses the model to generate a true model state. Further, one generates synthetic observations by adding random perturbations to the true state. Then, in the actual twin experiment one starts the data assimilation with a state estimate that is different from the true state and assimilates the synthetic observations. One can analyze the assimilation result by comparing the state estimate from the twin experiment with the previously generated true state. One can also further modify the model or the observations to simulate deficiencies in real systems.

PDAF provides functionality to generate synthetic observations. The functionality is based on the usual implementation of the assimilation used with PDAF's online coupled mode. However, one can run the observation generation with an ensemble of just one member, which should be initialized with the initial true state. PDAF provides the routines PDAF3_generate_obs and PDAF3_put_state_generate_obs to generate the observations. These routines use the observation operator routines which the user also implements for assimilating real observations. Thus, one can use characteristics of real observations to generate the synthetic observations.

An example implementation can be found in models/lorenz96/, where synthetic observations can be generated for the Lorenz-96 model. In addition, the implementation is shown in the templates in templates/online/.

Here we describe the steps needed to generate synthetic observations.

Initialization

The implementation of the initialization of PDAF is explained on the page on 'init_pdaf' and 'PDAF_init'.

For the observation generation one just has to set filtertype = PDAF_DA_GENOBS or filtertype = 100 as argument to PDAF_init.

To set options, using the common names in the tutorial and template codes, one just has to specify the mandatory values of the state dimension and the ensemble size in filter_param_i. In filter_param_r one has to specify the mandatory value of the forgetting factor, even though this value is ignored for the observation generation.

There is one additional integer option: seedset (iparam(3)). It selects a seed set for the random number generation; see the options listed on the page on available options. An overview of the options can also be obtained by running the program with subtype=-1.
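The parameter setup described above can be sketched as follows. This is a minimal illustration, not code from the PDAF templates: the module name genobs_params_sketch, the routine set_genobs_params, and the constant filtertype_genobs are hypothetical, and the ordering of the entries in filter_param_i (state dimension, ensemble size, seedset) is an assumption based on the naming used in this section. The sketch only fills the arrays; the actual call to PDAF_init is left to the existing init_pdaf code.

```fortran
MODULE genobs_params_sketch
  IMPLICIT NONE
  ! Value given in the text for filtertype (equivalent to PDAF_DA_GENOBS);
  ! the constant name itself is hypothetical
  INTEGER, PARAMETER :: filtertype_genobs = 100
CONTAINS
  SUBROUTINE set_genobs_params(dim_state_p, dim_ens, seedset, &
       filter_param_i, filter_param_r)
    INTEGER, INTENT(in)  :: dim_state_p        ! Process-local state dimension
    INTEGER, INTENT(in)  :: dim_ens            ! Ensemble size (1 is sufficient)
    INTEGER, INTENT(in)  :: seedset            ! Seed set for the random numbers
    INTEGER, INTENT(out) :: filter_param_i(3)  ! Integer options for PDAF_init
    REAL,    INTENT(out) :: filter_param_r(1)  ! Real options for PDAF_init

    filter_param_i(1) = dim_state_p   ! Mandatory: state dimension
    filter_param_i(2) = dim_ens       ! Mandatory: ensemble size
    filter_param_i(3) = seedset       ! Optional: seed set, iparam(3)
    filter_param_r(1) = 1.0           ! Mandatory forgetting factor; ignored here
  END SUBROUTINE set_genobs_params
END MODULE genobs_params_sketch
```

In an existing init_pdaf, one would pass these arrays together with filtertype = 100 to PDAF_init in place of the settings used for a regular filter.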

Observation Generation Step

This step replaces the analysis step. The implementation is analogous to implementing the analysis step as described on the page on implementing the analysis step.

The observation generation is only available in PDAF's online mode.

PDAF3_generate_obs

This routine is used in the same way as the analysis routine PDAF3_assimilate. This routine can be used in both the fully-parallel and the flexible implementation variants of the data assimilation system. (See the page Modification of the model code for the ensemble integration for these variants).

Here, we list the full interface of the routine. Subsequently, the user-supplied routines specified in the call are explained.

The interface is

  SUBROUTINE PDAF3_generate_obs(collect_state_pdaf, distribute_state_pdaf, &
                                  init_dim_obs_pdafomi, obs_op_pdafomi, get_obs_f_pdaf, &
                                  prepoststep_pdaf, next_observation_pdaf, status_pdaf)

with the following arguments:

  • Routines to transfer between model fields and state vector:
    • collect_state_pdaf:
      The name of the user-supplied routine that initializes a state vector, stored in the array holding the ensemble of model states, from the model fields.
    • distribute_state_pdaf:
      The name of the user-supplied routine that initializes the model fields from the array holding the ensemble of model state vectors. (The same routine is also used in PDAF_init_forecast.)
  • Observation routines using PDAF-OMI:
    • init_dim_obs_pdafomi:
      The name of the user-supplied routine that initializes the observation information and provides the size of the observation vector.
    • obs_op_pdafomi:
      The name of the user-supplied routine that acts as the observation operator on some state vector.
  • Particular routine to access the generated observations:
    • get_obs_f_pdaf:
      The name of the user-supplied routine that receives the full vector of generated synthetic observations from PDAF.
  • Prepoststep and initialization for the next forecast phase:
    • prepoststep_pdaf:
      The name of the pre/poststep routine. (The same routine is also used in PDAF_init_forecast.)
    • next_observation_pdaf:
      The name of the user-supplied routine that initializes the variables nsteps, timenow, and doexit. (The same routine is also used in PDAF_init_forecast.)
  • Status flag:
    • status_pdaf:
      The integer status flag. It is zero if the routine is exited without errors.

PDAF3_put_state_generate_obs

This routine exists for backward compatibility. In implementations that were done before the release of PDAF V3.0, a 'put_state' routine was used for the flexible parallelization variant and for the offline mode. The routine PDAF3_put_state_generate_obs allows porting such implementations to the PDAF3 interface with minimal changes. Its interface is identical to that of PDAF3_generate_obs, except that the user-supplied routines distribute_state_pdaf and next_observation_pdaf are missing.

The interface is the following:

  SUBROUTINE PDAF3_put_state_generate_obs(collect_state_pdaf, &
                                  init_dim_obs_pdafomi, obs_op_pdafomi, get_obs_f_pdaf, &
                                  prepoststep_pdaf, status_pdaf)

User-supplied routines

Here, all user-supplied routines are described that are required in the call to PDAF3_generate_obs or PDAF3_put_state_generate_obs. For some of the generic routines, we link to the page on modifying the model code for the ensemble integration.

In the subroutine interfaces some variables appear with the suffix _p (short for 'process'). This suffix indicates that the variable is particular to a model sub-domain, if a domain-decomposed model is used.

In addition, there will be variables with the suffix _f (for 'full').

Call-back routines whose names end in _pdaf are regular call-back routines from the core part of PDAF, while call-back routines whose names end in _pdafomi handle observations within PDAF-OMI.

collect_state_pdaf (collect_state_pdaf.F90)

This routine is independent of the filter algorithm used.

See the page on modifying the model code for the ensemble integration for the description of this routine.

distribute_state_pdaf (distribute_state_pdaf.F90)

This routine is independent of the filter algorithm used.

See the page on modifying the model code for the ensemble integration for the description of this routine.

init_dim_obs_pdafomi (callback_obs_pdafomi.F90)

This is a call-back routine initializing the observation information. The routine just calls a routine from the observation module for each observation type.

See the documentation on callback_obs_pdafomi.F90 for more information.

obs_op_pdafomi (callback_obs_pdafomi.F90)

This is a call-back routine applying the observation operator to the state vector. The routine calls a routine from the observation module for each observation type.

See the documentation on callback_obs_pdafomi.F90 for more information.

get_obs_f_pdaf (get_obs_f_pdaf.F90)

This routine is specific to the observation generation. PDAF provides to this routine the vector of synthetic observations generated by PDAF. One can then, e.g., write the observation vector into a file so that one can use it later in a twin experiment. (The template file readwrite_obs.F90 provides functionality for reading and writing as described on the page on readwrite_obs.)

The interface is the following:

SUBROUTINE get_obs_f_pdaf(step, dim_obs_f, observation_f)

with

  • step : integer, intent(in)
    Current time step
  • dim_obs_f : integer, intent(in)
    Size of the full observation vector
  • observation_f : real, intent(out), dimension(dim_obs_f)
    Full vector of synthetic observations (process-local)

Hints:

  • For generating synthetic observations, PDAF does not apply any data assimilation, but just calls the observation routines. The returned observation vector is for the global domain if no parallelization is used. If parallelization is used with a domain decomposition, the observation operator used for generating the observations will define whether different process domains have the same or distinct observation vectors (i.e. covering the global domain or different process-specific domains). The observation operators provided with PDAF-OMI return the observations for a sub-domain.
  • The usual operation performed in this routine is to write the generated synthetic observations into a file. The PDAF package provides the template routine readwrite_obs for this. Depending on the parallelization, discussed above, one either writes a single file or a different file for each process (in this case, each process calls the routine with a different file name, usually indicating the process-rank number).
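The following is a minimal sketch of such a routine that writes one plain-text file per process and time step. It does not use the template file readwrite_obs.F90; the file naming, the text format (observation count followed by one value per line), and the module variable mype_sketch holding the process rank are all assumptions made for this illustration. The sketch declares observation_f as intent(in) because it only reads the vector, whereas the interface listing above gives intent(out); one should match whatever the PDAF version in use expects.

```fortran
MODULE genobs_io_sketch
  IMPLICIT NONE
  INTEGER :: mype_sketch = 0   ! Hypothetical: rank of this process, set by the user
END MODULE genobs_io_sketch

SUBROUTINE get_obs_f_pdaf(step, dim_obs_f, observation_f)
  USE genobs_io_sketch, ONLY: mype_sketch
  IMPLICIT NONE
  INTEGER, INTENT(in) :: step                      ! Current time step
  INTEGER, INTENT(in) :: dim_obs_f                 ! Size of the full observation vector
  REAL, INTENT(in)    :: observation_f(dim_obs_f)  ! Synthetic observations (sketch: read-only)

  CHARACTER(len=64) :: fname   ! Name of the output file
  INTEGER :: iunit, i

  ! Build a file name containing the process rank and the time step
  WRITE (fname, '(a,i0,a,i0,a)') 'syn_obs_pe', mype_sketch, '_step', step, '.txt'

  ! Write the number of observations followed by one value per line
  OPEN (newunit=iunit, file=TRIM(fname), status='replace', action='write')
  WRITE (iunit, *) dim_obs_f
  DO i = 1, dim_obs_f
     WRITE (iunit, *) observation_f(i)
  END DO
  CLOSE (iunit)
END SUBROUTINE get_obs_f_pdaf
```

With mype_sketch = 0 and step = 3, this sketch writes the file syn_obs_pe0_step3.txt. A binary or NetCDF format would be the more likely choice for large observation vectors.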

prepoststep_pdaf (prepoststep_ens_pdaf.F90)

The routine has already been described for modifying the model for the ensemble integration and for inserting the analysis step.

See the page on modifying the model code for the ensemble integration for the description of this routine.

next_observation_pdaf (next_observation_pdaf.F90)

This routine is independent of the filter algorithm used.

See the page on modifying the model code for the ensemble integration for the description of this routine.

Recommendations for using PDAF3_generate_obs

The observation generation with PDAF3_generate_obs or PDAF3_put_state_generate_obs works analogously to the observation handling in ensemble filters. However, the observation generation does not modify the ensemble states, and prepoststep_pdaf is only called once before each observation generation, but not afterwards. The usual observation functionality of init_dim_obs_pdafomi and obs_op_pdafomi is used to obtain the observed model state.

One can run the observation generation with a single ensemble member (dim_ens=1) or a larger ensemble. If dim_ens>1, the observation operator is applied to the ensemble mean state. The observation error information initialized in init_dim_obs_pdafomi is used in combination with Gaussian random noise to compute the perturbations that are added to the observed true state to generate the observations. Finally, get_obs_f_pdaf gives the user access to the generated synthetic observation vector so that one can write it to a file for later use. (See the page on the template file readwrite_obs.F90 for a description of how the observations can be written to a file and used later on.)
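The computation described in this paragraph can be illustrated with a small standalone sketch. It is not PDAF's internal code: the routine names, the observation operator (selecting state elements through an index array), the single error standard deviation rms_obs, and the Box-Muller random number generator are assumptions chosen to keep the example short. The sketch computes the ensemble mean, applies the observation operator, and adds Gaussian noise scaled by the observation error.

```fortran
MODULE synobs_sketch
  IMPLICIT NONE
CONTAINS
  ! Standard normal deviate via the Box-Muller transform
  FUNCTION gauss_sketch() RESULT(g)
    REAL :: g, u1, u2
    REAL, PARAMETER :: twopi = 6.2831853
    CALL RANDOM_NUMBER(u1)
    CALL RANDOM_NUMBER(u2)
    u1 = MAX(u1, TINY(u1))   ! Guard against log(0)
    g = SQRT(-2.0 * LOG(u1)) * COS(twopi * u2)
  END FUNCTION gauss_sketch

  SUBROUTINE generate_obs_sketch(dim_p, dim_ens, ens_p, dim_obs, &
       obs_index, rms_obs, obs)
    INTEGER, INTENT(in) :: dim_p                ! State dimension
    INTEGER, INTENT(in) :: dim_ens              ! Ensemble size (1 or larger)
    REAL, INTENT(in)    :: ens_p(dim_p, dim_ens)! Ensemble of true states
    INTEGER, INTENT(in) :: dim_obs              ! Number of observations
    INTEGER, INTENT(in) :: obs_index(dim_obs)   ! Observed state elements
    REAL, INTENT(in)    :: rms_obs              ! Observation error std. deviation
    REAL, INTENT(out)   :: obs(dim_obs)         ! Synthetic observations

    REAL :: state_p(dim_p)   ! Ensemble mean state
    INTEGER :: i

    ! The observation operator acts on the ensemble mean
    state_p = SUM(ens_p, DIM=2) / REAL(dim_ens)

    ! Observed state plus Gaussian perturbation
    DO i = 1, dim_obs
       obs(i) = state_p(obs_index(i)) + rms_obs * gauss_sketch()
    END DO
  END SUBROUTINE generate_obs_sketch
END MODULE synobs_sketch
```

Setting rms_obs to zero returns the observed mean state without perturbation, which is a convenient check of the observation operator.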

If one has access to real observations, one can use the implementation of init_dim_obs_pdafomi and obs_op_pdafomi for these observations to generate synthetic observations simulating these real observations. Thus, one runs the observation generation using these routines without any modifications.

Note: The observation generation should always be performed for a single observation type at a time. Thus, one generates separate observation files for each observation type.

Using the synthetic observations in twin experiments

To perform a twin experiment using the synthetic observations generated by PDAF, one runs the data assimilation as one would with real observations. If one already initializes the vector of actual observations in the routines init_dim_obs_OBSTYPE in the observation modules, only a small modification of this routine is required: at the end of init_dim_obs_OBSTYPE, one overwrites the vector of real observations with the values of the synthetic observations. If one uses the template file readwrite_obs.F90, one can call read_syn_obs from this file at the end of init_dim_obs_OBSTYPE to overwrite the observation vector. To allow for flexible switching between the case using real observations and the twin experiment, one can, for example, introduce a flag twin_experiment that controls whether the real observation values are overwritten. This reading is already included, but commented out, in the templates.
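The switching logic can be sketched as follows. This is an illustration under stated assumptions rather than the template code: read_obs_sketch is a hypothetical stand-in for read_syn_obs (whose actual interface is documented with readwrite_obs.F90), it assumes a plain-text file holding the observation count followed by the values, and overwrite_obs_sketch represents the few lines one would place at the end of init_dim_obs_OBSTYPE.

```fortran
MODULE twin_switch_sketch
  IMPLICIT NONE
  LOGICAL :: twin_experiment = .FALSE.   ! Set to .TRUE. to run the twin experiment
CONTAINS
  ! Hypothetical reader: file holds the count, then the observation values
  SUBROUTINE read_obs_sketch(fname, dim_obs, obs)
    CHARACTER(len=*), INTENT(in) :: fname   ! File with synthetic observations
    INTEGER, INTENT(in) :: dim_obs          ! Expected number of observations
    REAL, INTENT(out)   :: obs(dim_obs)     ! Values read from the file
    INTEGER :: iunit, n

    OPEN (newunit=iunit, file=fname, status='old', action='read')
    READ (iunit, *) n
    IF (n /= dim_obs) ERROR STOP 'read_obs_sketch: observation count mismatch'
    READ (iunit, *) obs
    CLOSE (iunit)
  END SUBROUTINE read_obs_sketch

  ! Logic for the end of init_dim_obs_OBSTYPE: obs enters holding the
  ! real observations and is overwritten only in the twin experiment
  SUBROUTINE overwrite_obs_sketch(fname, dim_obs, obs)
    CHARACTER(len=*), INTENT(in) :: fname
    INTEGER, INTENT(in)  :: dim_obs
    REAL, INTENT(inout)  :: obs(dim_obs)

    IF (twin_experiment) CALL read_obs_sketch(fname, dim_obs, obs)
  END SUBROUTINE overwrite_obs_sketch
END MODULE twin_switch_sketch
```

Because only the observation values are replaced, the observation coordinates, errors, and operators of the real observations remain in use during the twin experiment.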

Example implementations using PDAF3_put_state_generate_obs and readwrite_obs.F90 are provided by the implementation of PDAF with the Lorenz-96 model in models/lorenz96/. These also use the flag twin_experiment to activate the twin experiment. (Note: The Lorenz-96 model case always uses simulated observations. Nonetheless, it shows how the synthetic observations are generated with PDAF and how they are used in a twin experiment.)

Last modified on Jun 3, 2025, 10:57:21 AM