Implementation of Observation Generation with PDAF
Contents of this page
 Overview
 Initialization
 Observation Generation Step
  PDAF_generate_obs
  PDAF_put_state_generate_obs
 User-supplied routines
  U_collect_state (collect_state_pdaf.F90)
  U_distribute_state (distribute_state_pdaf.F90)
  U_init_dim_obs_f (init_dim_obs_f_pdaf.F90)
  U_obs_op_f (obs_op_f_pdaf.F90)
  U_init_obserr_f (init_obserr_f_pdaf.F90)
  U_get_obs_f (get_obs_f_pdaf.F90)
  U_prepoststep (prepoststep_ens_pdaf.F90)
  U_next_observation (next_observation_pdaf.F90)
 Recommendations for using PDAF_generate_obs
 Using the synthetic observations in twin experiments
This page describes the implementation of the analysis step without using PDAF-OMI. Please see the page on the analysis with PDAF-OMI for the more modern and efficient implementation variant using PDAF-OMI.
The observation generation functionality was added with Version 1.14 of PDAF.
Overview
Twin data assimilation experiments are a common approach to assess data assimilation methods. In twin experiments one uses the model to generate a true model state. Further, one generates synthetic observations by adding random perturbations to the true state. Then, in the actual twin experiment, one starts the data assimilation with a state estimate that is different from the true state and assimilates the synthetic observations. One can analyze the assimilation result by comparing the state estimate from the twin experiment with the previously generated true state.
Starting with version 1.14, PDAF provides functionality to generate synthetic observations. The functionality is based on the normal implementation of the assimilation used with PDAF. However, one can run the observation generation with an ensemble of just one member, which should be initialized with the initial true state. PDAF provides the routines PDAF_generate_obs and PDAF_put_state_generate_obs to perform the observation generation. These routines use the observation operator routines which the user implements e.g. for assimilating real observations.
Here we describe the steps needed to generate synthetic observations.
Initialization
The implementation of the initialization of PDAF is explained on the page on init_pdaf and PDAF_init.
For the observation generation one just has to set filtertype = 11.
There are no particular options for the observation generation functionality. So for filter_param_i one just has to specify the mandatory values of the state dimension and the ensemble size. For filter_param_r one has to specify the mandatory value of the forgetting factor (even though this value is ignored for the observation generation).
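The settings described above can be sketched as follows. This is only an illustration: the subroutine name setup_obsgen_params and its dummy arguments are hypothetical, and the position of the entries in filter_param_i (state dimension first, then ensemble size) is assumed to follow the order given in the text. The arrays would then be passed to PDAF_init as described on the page on init_pdaf and PDAF_init.

```fortran
! Sketch: fill the parameter arrays for the observation generation.
! The subroutine name and the dummy arguments are hypothetical; only
! filtertype = 11 and the meaning of the entries follow the text above.
SUBROUTINE setup_obsgen_params(dim_state_p, dim_ens, filtertype, &
     filter_param_i, filter_param_r)

  IMPLICIT NONE

  INTEGER, INTENT(in)  :: dim_state_p        ! Size of the state vector
  INTEGER, INTENT(in)  :: dim_ens            ! Ensemble size (1 is sufficient here)
  INTEGER, INTENT(out) :: filtertype         ! Selects the functionality in PDAF
  INTEGER, INTENT(out) :: filter_param_i(2)  ! Integer parameters
  REAL, INTENT(out)    :: filter_param_r(1)  ! Real parameters

  filtertype = 11                  ! Observation generation
  filter_param_i(1) = dim_state_p  ! Mandatory: state dimension (assumed first)
  filter_param_i(2) = dim_ens      ! Mandatory: ensemble size (assumed second)
  filter_param_r(1) = 1.0          ! Mandatory: forgetting factor (ignored here)

END SUBROUTINE setup_obsgen_params
```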
Observation Generation Step
This step replaces the analysis step. The implementation is analogous to implementing the analysis step as described on the page on implementing the analysis step.
PDAF_generate_obs
This routine is used in the same way as the filter-specific routines PDAF_assimilate_*. Thus the general aspects have been described on the page Modification of the model code for the ensemble integration and its subpage on inserting the analysis step. The routine PDAF_generate_obs is used in the fully-parallel implementation variant of the data assimilation system. When the 'flexible' implementation variant is used, the routine PDAF_put_state_generate_obs is used as described further below. Here, we list once more the full interface. Subsequently, the full set of user-supplied routines specified in the call to PDAF_generate_obs is explained. Apart from two callback routines, the routines are identical to e.g. those used for the LESTKF and LETKF filters.
SUBROUTINE PDAF_generate_obs(U_collect_state, U_distribute_state, &
     U_init_dim_obs_f, U_obs_op_f, U_init_obserr_f, U_get_obs_f, &
     U_prepoststep, U_next_observation, status_pdaf)
with the following arguments:
 U_collect_state: The name of the user-supplied routine that initializes a state vector in the array holding the ensemble of model states from the model fields. This is basically the inverse operation to U_distribute_state used in PDAF_get_state.
 U_distribute_state: The name of a user-supplied routine that initializes the model fields from the array holding the ensemble of model state vectors.
 U_init_dim_obs_f: The name of the user-supplied routine that provides the size of the full observation vector.
 U_obs_op_f: The name of the user-supplied routine that acts as the full observation operator on some state vector.
 U_init_obserr_f: The name of the user-supplied routine that initializes the vector of observation error standard deviations for the full observation vector.
 U_get_obs_f: The name of the user-supplied routine that receives the full vector of generated synthetic observations from PDAF.
 U_prepoststep: The name of the pre/poststep routine as in PDAF_get_state.
 U_next_observation: The name of a user-supplied routine that initializes the variables nsteps, timenow, and doexit. The same routine is also used in PDAF_get_state.
 status_pdaf: The integer status flag. It is zero if PDAF_generate_obs is exited without errors.
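A call for the fully-parallel variant could look like the following sketch. The wrapper name generate_obs_step is hypothetical, and the routine names passed to PDAF are the template names (without the prefix U_, with the extension _pdaf). The block compiles on its own, but it can only be linked and run together with the PDAF library and the eight user-supplied routines. The error handling with STOP is a placeholder; a parallel program would rather abort through its MPI setup.

```fortran
! Sketch of the observation generation call in the fully-parallel variant.
! Linking requires the PDAF library and the user-supplied routines below.
SUBROUTINE generate_obs_step()   ! hypothetical wrapper name

  IMPLICIT NONE

  INTEGER :: status_pdaf   ! Status flag returned by PDAF

  ! User-supplied routines, named as in the template files
  EXTERNAL :: collect_state_pdaf, distribute_state_pdaf, &
       init_dim_obs_f_pdaf, obs_op_f_pdaf, init_obserr_f_pdaf, &
       get_obs_f_pdaf, prepoststep_ens_pdaf, next_observation_pdaf

  CALL PDAF_generate_obs(collect_state_pdaf, distribute_state_pdaf, &
       init_dim_obs_f_pdaf, obs_op_f_pdaf, init_obserr_f_pdaf, &
       get_obs_f_pdaf, prepoststep_ens_pdaf, next_observation_pdaf, &
       status_pdaf)

  ! The status flag is zero if the routine is exited without errors
  IF (status_pdaf /= 0) THEN
     WRITE (*, '(a, i6)') 'ERROR in PDAF_generate_obs, status: ', status_pdaf
     STOP 1
  END IF

END SUBROUTINE generate_obs_step
```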
PDAF_put_state_generate_obs
When the 'flexible' implementation variant is chosen for the assimilation system, the routine PDAF_put_state_generate_obs has to be used instead of PDAF_generate_obs. The general aspects of the filter-specific routines PDAF_put_state_* have been described on the page Modification of the model code for the ensemble integration. The interface of the routine is identical to that of PDAF_generate_obs, with the exception that the user-supplied routines U_distribute_state and U_next_observation are missing.
The interface is the following:
SUBROUTINE PDAF_put_state_generate_obs(U_collect_state, U_init_dim_obs_f, &
     U_obs_op_f, U_init_obserr_f, U_get_obs_f, U_prepoststep, status_pdaf)
User-supplied routines
Here, all user-supplied routines are described that are required in the call to PDAF_generate_obs. For some of the generic routines, we link to the page on modifying the model code for the ensemble integration.
To indicate user-supplied routines we use the prefix U_. In the tutorials in tutorial/ and in the template directory templates/ these routines exist without the prefix, but with the extension _pdaf. The files are named correspondingly. In the section titles below we provide the name of the template file in parentheses.
In the subroutine interfaces some variables appear with the suffix _p (short for 'process'). This suffix indicates that the variable is particular to a model subdomain if a domain-decomposed model is used. Thus, the value(s) in the variable will be different for different model subdomains. In addition, there will be variables with the suffix _f (for 'full').
U_collect_state
(collect_state_pdaf.F90)
This routine is independent of the filter algorithm used. See the page on inserting the analysis step for the description of this routine.
U_distribute_state
(distribute_state_pdaf.F90)
This routine is independent of the filter algorithm used. See the page on inserting the analysis step for the description of this routine.
U_init_dim_obs_f
(init_dim_obs_f_pdaf.F90)
This routine has to initialize the size dim_obs_f of the full observation vector according to the current time step. For simplicity, dim_obs_f can be the size for the global model domain. The routine is described in detail on the page on implementing the analysis step for LESTKF.
U_obs_op_f
(obs_op_f_pdaf.F90)
This routine has to perform the operation of the observation operator acting on a state vector, which is provided as state_p. The observed state has to be returned in m_state_f. It is the observed state corresponding to the 'full' observation vector. The routine is described in detail on the page on implementing the analysis step for LESTKF.
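As an illustration, a very simple observation operator without parallelization might look like the sketch below. The argument list (step, dim_p, dim_obs_f, state_p, m_state_f) is an assumption here, since the interface is only given on the page for LESTKF, and the observation layout (every second element of the state vector is observed) is purely hypothetical.

```fortran
! Sketch of a full observation operator that observes every second
! element of the state vector. The argument list is assumed to follow
! the LESTKF page; the observation layout is purely illustrative.
SUBROUTINE obs_op_f_pdaf(step, dim_p, dim_obs_f, state_p, m_state_f)

  IMPLICIT NONE

  INTEGER, INTENT(in) :: step                  ! Current time step (unused here)
  INTEGER, INTENT(in) :: dim_p                 ! Size of the process-local state vector
  INTEGER, INTENT(in) :: dim_obs_f             ! Size of the full observation vector
  REAL, INTENT(in)    :: state_p(dim_p)        ! Process-local state vector
  REAL, INTENT(out)   :: m_state_f(dim_obs_f)  ! Observed state

  INTEGER :: i   ! Counter

  ! Assumed layout: observation i sits at state element 2*i-1,
  ! so dim_obs_f must not exceed (dim_p+1)/2
  DO i = 1, dim_obs_f
     m_state_f(i) = state_p(2*i - 1)
  END DO

END SUBROUTINE obs_op_f_pdaf
```

With a domain-decomposed model, a full observation operator would usually also have to combine the contributions of the different processes, which is not shown here.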
U_init_obserr_f
(init_obserr_f_pdaf.F90)
This routine is specific for the observation generation. The routine is called by PDAF during the observation generation. Its purpose is to fill the provided vector of observation error standard deviations.
The interface is the following:
SUBROUTINE init_obserr_f_pdaf(step, dim_obs_f, obs_f, rms_obs)
with
 step: integer, intent(in). Current time step
 dim_obs_f: integer, intent(in). Size of full observation vector
 obs_f: real, intent(in), dimension(dim_obs_f). Full vector of observations
 rms_obs: real, intent(out), dimension(dim_obs_f). Full vector of observation error standard deviations
Notes:
 The routine handles the 'full' observation vector as in localized filters. As described for the observation generation functionality, one can also use it for global filters. In this case the 'full' vector would just contain the observations local to a process subdomain.
 The observation vector obs_f is provided to the routine for the case that the observation error is relative to the value of the observations.
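A minimal sketch of this routine, using the interface given above, is shown below. The three parameters relative_error, rms_abs and rms_rel are hypothetical; in a real implementation such settings would typically be read from a module of the assimilation system rather than being hard-coded.

```fortran
! Sketch: fill the vector of observation error standard deviations,
! either with a constant value or relative to the observation values.
SUBROUTINE init_obserr_f_pdaf(step, dim_obs_f, obs_f, rms_obs)

  IMPLICIT NONE

  INTEGER, INTENT(in) :: step                ! Current time step (unused here)
  INTEGER, INTENT(in) :: dim_obs_f           ! Size of full observation vector
  REAL, INTENT(in)    :: obs_f(dim_obs_f)    ! Full vector of observations
  REAL, INTENT(out)   :: rms_obs(dim_obs_f)  ! Observation error standard deviations

  ! Hypothetical settings; in practice they would come from a module
  LOGICAL, PARAMETER :: relative_error = .FALSE.  ! Switch between the two cases
  REAL, PARAMETER    :: rms_abs = 0.5             ! Constant standard deviation
  REAL, PARAMETER    :: rms_rel = 0.1             ! Fraction of the observed value

  IF (relative_error) THEN
     ! Error proportional to the magnitude of each observation
     rms_obs = rms_rel * ABS(obs_f)
  ELSE
     ! Same error for all observations
     rms_obs = rms_abs
  END IF

END SUBROUTINE init_obserr_f_pdaf
```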
U_get_obs_f
(get_obs_f_pdaf.F90)
This routine is specific for the observation generation. In this routine PDAF provides the user with the vector of synthetic observations generated by PDAF. One can then e.g. write the observation vector into a file so that one can use it later in a twin experiment (the template file readwrite_obs.F90 provides functionality for reading and writing, as described on the page on readwrite_obs).
The interface is the following:
SUBROUTINE get_obs_f_pdaf(step, dim_obs_f, observation_f)
with
 step: integer, intent(in). Current time step
 dim_obs_f: integer, intent(in). Size of the full observation vector
 observation_f: real, intent(out), dimension(dim_obs_f). Full vector of synthetic observations (process-local)
Hints:
 For the generation of synthetic observations, PDAF does not distinguish between local and global filters. Without parallelization, the full observation vector would be the same for both types of filters. With parallelization, the implementation of the observation operator used for generating the observations will define whether different process domains have the same or distinct observation vectors (i.e. covering the global domain or different process-specific domains).
 In case of the global filters, one uses the functionality of the observation operator for this filter type. With parallelization, the observation operator will initialize an observation vector specifically for each process domain.
 The usual operation performed in this routine is to write the generated synthetic observations into a file. The PDAF package provides the template routine readwrite_obs for this. Depending on the parallelization, discussed above, one either writes a single file (if the full observation vector is the same for all processes; in this case only a single process calls the writing routine) or a different file for each process (in this case, each process calls the routine with a different file name, usually indicating the process rank number).
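The writing step can be sketched with plain Fortran I/O as follows. This sketch does not use the template routine readwrite_obs; the file name, the text file layout and the hard-coded rank mype are hypothetical (the rank would normally come from the parallelization module of the assimilation system). The dummy argument observation_f is declared without an INTENT attribute because the sketch only reads the values provided by PDAF.

```fortran
! Sketch: append the synthetic observations of one observation time
! to a text file whose name contains the process rank.
SUBROUTINE get_obs_f_pdaf(step, dim_obs_f, observation_f)

  IMPLICIT NONE

  INTEGER, INTENT(in) :: step                      ! Current time step
  INTEGER, INTENT(in) :: dim_obs_f                 ! Size of the full observation vector
  REAL                :: observation_f(dim_obs_f)  ! Synthetic observations (only read here)

  INTEGER, PARAMETER :: mype = 0   ! Hypothetical process rank
  INTEGER :: fileid                ! Unit number assigned by OPEN
  CHARACTER(len=32) :: filename    ! Hypothetical file name

  ! One file per process, e.g. syn_obs_0000.txt for rank 0
  WRITE (filename, '(a, i4.4, a)') 'syn_obs_', mype, '.txt'

  ! Append one record per observation time: step and size, then the values
  OPEN (newunit=fileid, file=TRIM(filename), status='unknown', &
       position='append', action='write')
  WRITE (fileid, *) step, dim_obs_f
  WRITE (fileid, *) observation_f
  CLOSE (fileid)

END SUBROUTINE get_obs_f_pdaf
```

If the full observation vector is identical on all processes, one would let only a single process execute the writing part.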
U_prepoststep
(prepoststep_ens_pdaf.F90)
This routine can be identical to that used for the global ESTKF algorithm, which has already been described on the page on modifying the model code for the ensemble integration.
U_next_observation
(next_observation_pdaf.F90)
This routine is independent of the filter algorithm used. See the page on inserting the analysis step for the description of this routine.
Recommendations for using PDAF_generate_obs
The observation generation with PDAF_generate_obs or PDAF_put_state_generate_obs works analogously to the observation handling in the localized filters like LESTKF and LETKF. However, the observation generation does not modify the ensemble states, and prepoststep_pdaf is only called once before each observation generation, but not afterwards. The routine init_dim_obs_f_pdaf can be identical to the actual assimilation case. It initializes the full observation dimension and usually also some more observation information (as described e.g. on the detail page on init_dim_obs_f_pdaf). Subsequently obs_op_f_pdaf is applied. One can run the observation generation with a single ensemble member (dim_ens=1) or a larger ensemble. If dim_ens>1, the observation operator is applied to the ensemble mean state. The routine init_obserr_f_pdaf provides PDAF with the vector of observation error standard deviations. This is used in combination with Gaussian random noise to compute the perturbations that are added to the observed true state to generate the observations. Finally get_obs_f_pdaf gives the user access to the generated synthetic observation vector so that one can write it to a file for later use (see the page on the template file readwrite_obs.F90 for a description of how the observations can be written to a file and used later on).
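Conceptually, the perturbation step described above amounts to adding Gaussian noise, scaled by the observation error standard deviation, to each element of the observed state. The following sketch illustrates this idea only; it is not PDAF's internal code. The subroutine name is hypothetical, and the Box-Muller transform is just one possible way to obtain Gaussian random numbers.

```fortran
! Conceptual sketch only, not PDAF's internal code:
! obs_f(i) = m_state_f(i) + rms_obs(i) * (standard Gaussian random number)
SUBROUTINE perturb_observed_state(dim_obs_f, m_state_f, rms_obs, obs_f)

  IMPLICIT NONE

  INTEGER, INTENT(in) :: dim_obs_f             ! Size of the full observation vector
  REAL, INTENT(in)    :: m_state_f(dim_obs_f)  ! Observed true state
  REAL, INTENT(in)    :: rms_obs(dim_obs_f)    ! Observation error standard deviations
  REAL, INTENT(out)   :: obs_f(dim_obs_f)      ! Synthetic observations

  REAL, PARAMETER :: twopi = 6.2831853
  REAL :: u1, u2, gauss
  INTEGER :: i

  DO i = 1, dim_obs_f
     ! Two uniform random numbers in [0,1)
     CALL RANDOM_NUMBER(u1)
     CALL RANDOM_NUMBER(u2)
     ! Box-Muller transform; 1.0-u1 lies in (0,1] so that LOG is defined
     gauss = SQRT(-2.0 * LOG(1.0 - u1)) * COS(twopi * u2)
     obs_f(i) = m_state_f(i) + rms_obs(i) * gauss
  END DO

END SUBROUTINE perturb_observed_state
```

In this picture m_state_f is the result of the observation operator, rms_obs is what init_obserr_f_pdaf provides, and obs_f is what get_obs_f_pdaf receives.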
If one has access to real observations, one can use the implementation of init_dim_obs_f_pdaf and obs_op_f_pdaf for these observations to generate synthetic observations simulating these real observations. Thus one runs the observation generation using these routines without any modifications.
Using the synthetic observations in twin experiments
To perform a twin experiment using the synthetic observations generated by PDAF, one runs the data assimilation as one would with real observations. If one already initializes the vector of actual observations in the routine init_dim_obs_f, one only needs a small modification of this routine. Namely, the only required modification is that at the end of init_dim_obs_f one overwrites the vector of real observations with the values from the synthetic observations. If one uses the template file readwrite_obs.F90 for this, one can use read_syn_obs from this file at the end of init_dim_obs_f to overwrite the observation vector. To allow for a flexible switching between the case using real observations and the twin experiment, one can for example introduce a flag twin_experiment that controls whether the real observation values are overwritten.
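The overwriting step at the end of init_dim_obs_f can be sketched as follows. The interface of read_syn_obs is not given on this page, so the sketch reads a plain text file directly instead. The subroutine name overwrite_obs_for_twin, its arguments and the file layout (a line with step and size, followed by the values) are all hypothetical.

```fortran
! Sketch of the final step of init_dim_obs_f in a twin experiment:
! replace the real observation values by previously stored synthetic ones.
SUBROUTINE overwrite_obs_for_twin(twin_experiment, filename, step, dim_obs_f, obs_f)

  IMPLICIT NONE

  LOGICAL, INTENT(in)          :: twin_experiment   ! Whether to use synthetic observations
  CHARACTER(len=*), INTENT(in) :: filename          ! File holding the synthetic observations
  INTEGER, INTENT(in)          :: step              ! Current time step
  INTEGER, INTENT(in)          :: dim_obs_f         ! Size of the full observation vector
  REAL, INTENT(inout)          :: obs_f(dim_obs_f)  ! In: real observations; out: values to assimilate

  INTEGER :: fileid, step_file, dim_file, ios
  REAL, ALLOCATABLE :: syn_obs(:)

  IF (.NOT. twin_experiment) RETURN   ! Keep the real observations

  OPEN (newunit=fileid, file=filename, status='old', action='read')
  DO
     ! Assumed layout: a line "step size" followed by the values
     READ (fileid, *, iostat=ios) step_file, dim_file
     IF (ios /= 0) EXIT               ! End of file: obs_f stays unchanged
     ALLOCATE (syn_obs(dim_file))
     READ (fileid, *, iostat=ios) syn_obs
     IF (ios == 0 .AND. step_file == step .AND. dim_file == dim_obs_f) THEN
        obs_f = syn_obs               ! Overwrite the real observation values
        ios = 1                       ! Mark the search as finished
     END IF
     DEALLOCATE (syn_obs)
     IF (ios /= 0) EXIT               ! Found the record or hit a read error
  END DO
  CLOSE (fileid)

END SUBROUTINE overwrite_obs_for_twin
```

Setting twin_experiment to .FALSE. leaves the real observations untouched, so the same routine serves both cases.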
Example implementations using PDAF_put_state_generate_obs and readwrite_obs.F90 are provided by the implementation of PDAF with the Lorenz96 model in models/classical/lorenz96/. These also use the flag twin_experiment to activate the twin experiment. (Note: The Lorenz96 model case always uses simulated observations. Nonetheless, it allows one to see how the synthetic observations are generated with PDAF and how they are used in a twin experiment.)