# PDAF-OMI Observation Modules

## Overview

The implementation of the observations with OMI is done in observation modules (obs-modules). For each observation type a separate module should be created.

Each obs-module contains four routines (where 'OBSTYPE' is replaced by the name of the observation type):

- `init_dim_obs_OBSTYPE` initializes all variables holding the information about one observation type. The information about the observation type is stored in a data structure (Fortran derived type).
- `obs_op_OBSTYPE` applies the observation operator to a state vector. One can call an observation operator routine provided by PDAF, or one can implement a new operator.
- `init_dim_obs_l_OBSTYPE` calls a PDAF-OMI routine to initialize the observation information corresponding to a local analysis domain. One can set localization parameters, like the localization radius, for each observation type.
- `localize_covar_OBSTYPE` calls a PDAF-OMI routine to apply covariance localization. One can set localization parameters, like the localization radius, for each observation type.

The template file `obs_OBSTYPE_pdafomi_TEMPLATE.F90` shows the different steps needed when implementing these routines. The main work is to implement `init_dim_obs`, while the other routines mainly call a subroutine provided by PDAF-OMI.

In the obs-module the subroutines are named according to the observation type. The template file uses generic names which can be replaced by the user. Distinct names for each observation type are required so that the subroutines of the different modules can be included in the call-back routines with `use`. In the header of each obs-module, the user can declare further variables, e.g. `assim_OBSTYPE` as a flag to control whether the observation type should be assimilated.

**Note:** In contrast to the 'classical' implementation of observation routines for PDAF, the global and local filters use the same routines `init_dim_obs` and `obs_op`. PDAF-OMI recognizes whether a global or local filter is used and does the necessary operations by itself.

## Data type obs_f

To ensure the functionality within each obs-module, we rely on a derived data type called `obs_f` that contains all information about the observation. One instance of this data type is allocated in each obs-module with the generic variable name `thisobs`. A few of the elements of `obs_f` are initialized by the user when the observation information is initialized in `init_dim_obs`. Further variables are set in a call to the routine `PDAFomi_gather_obs`. This information is then used by all other routines in the obs-module. The template file `obs_OBSTYPE_pdafomi_TEMPLATE.F90` shows the different steps needed to initialize `thisobs`.

The **mandatory variables** in `obs_f` that need to be set by the user are:

    TYPE obs_f
       ! ---- Mandatory variables to be set in INIT_DIM_OBS ----
       INTEGER :: doassim=0                  !< Whether to assimilate this observation type
       INTEGER :: disttype                   !< Type of distance computation to use for localization
                                             !  (0) Cartesian, (1) Cartesian periodic
                                             !  (2) simplified geographic, (3) geographic haversine function
                                             !  (10) Cartesian 2+1D factorized, (11) Cartesian periodic 2+1D factorized
                                             !  (12) simplified geographic 2+1D factorized
                                             !  (13) geographic haversine function 2+1D factorized
       INTEGER :: ncoord                     !< Number of coordinates used for distance computation
       INTEGER, ALLOCATABLE :: id_obs_p(:,:) !< Indices of process-local observed field in state vector
       ...
    END TYPE obs_f

In addition there are **optional variables** that can be used:

    TYPE obs_f
       ...
       ! ---- Optional variables - they can be set in INIT_DIM_OBS ----
       REAL, ALLOCATABLE :: icoeff_p(:,:)    !< Interpolation coefficients for obs. operator (optional)
       REAL, ALLOCATABLE :: domainsize(:)    !< Size of domain for periodicity (<=0 for no periodicity) (optional)

       ! ---- Variables with predefined values - they can be changed in INIT_DIM_OBS ----
       INTEGER :: obs_err_type=0             !< Type of observation error: (0) Gauss, (1) Laplace
       INTEGER :: use_global_obs=1           !< Whether to use (1) global full obs.
                                             !< or (0) obs. restricted to those relevant for a process domain
       ...
    END TYPE obs_f

Apart from these variables, there are a number of variables that are set internally when the routine `PDAFomi_gather_obs` is called. The full data type can be seen on the page on OMI debugging.

Next to the derived data type `obs_f`, there is a derived type `obs_l` for localization. This is only used internally. It will be filled in the routine `init_dim_obs_l` when calling `PDAFomi_init_dim_obs_l`.

## `init_dim_obs_OBSTYPE`

This is the main routine to initialize observation information.

Please see the template file `templates/omi/obs_OBSTYPE_pdafomi_TEMPLATE.F90` for a step-by-step description of the implementation steps.

Each observation module uses the generic name **thisobs** for the variable of type `obs_f`. Elements of `thisobs` are accessed like `thisobs%doassim`.

The main variables that are filled in this routine are

- thisobs%doassim: Specify whether this observation type is assimilated
- thisobs%disttype: Specify the type of distance computation
- thisobs%ncoord: Specify the number of dimensions used to compute distances
- dim_obs_p: Count the number of available observations
- obs_p: Fill the vector of observations
- ocoord_p: Store the coordinates of the observations
- ivar_obs_p: Store the inverse error variance of each observation
- thisobs%id_obs_p: Store the indices of state vector elements that correspond to an observation (a single value for observations at grid points, or multiple values for derived quantities or interpolation)

When the observation operator performs interpolation, one further needs to initialize an array of interpolation coefficients (thisobs%icoeff_p). For Cartesian distance computation with periodicity one also needs to set thisobs%domainsize.

Here one can also activate the omission of observations that are too different from the ensemble mean. This is activated by setting `thisobs%inno_omit>0.0`.

When a parallel model with domain decomposition is used, the variables with suffix `_p` need to describe the observation information for a particular process domain. The routine `PDAFomi_gather_obs` then performs the necessary operations to ensure that the parallelization is taken into account by PDAF.

After these variables are filled, one calls

    CALL PDAFomi_gather_obs(thisobs, dim_obs_p, obs_p, ivar_obs_p, ocoord_p, &
         thisobs%ncoord, cradius, dim_obs)

This routine will complete all required initializations for OMI. As such it is mandatory to call the routine.

The routine `PDAFomi_gather_obs` returns the number of observations `dim_obs`, which is the return variable for PDAF.

Notes:

- The value of `cradius` is only used if `thisobs%use_global_obs=0`.
- Also in the case of non-isotropic localization, `cradius` is a single value. It should be set to the largest radius in the horizontal direction used in the process domain.
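As a rough illustration of the variables listed above, the following self-contained sketch fills the process-local arrays for a hypothetical 4x3 grid in which every second grid point is observed. The grid size, the observation values, the error standard deviation `rms_obs`, and the ordering of the state vector (x-index running fastest) are assumptions made for this example. A local array `id_obs_p` stands in for `thisobs%id_obs_p`, and the call to `PDAFomi_gather_obs` is only indicated as a comment because it requires the PDAF library.

```fortran
PROGRAM sketch_init_dim_obs
  IMPLICIT NONE

  INTEGER, PARAMETER :: nx = 4, ny = 3     ! Hypothetical grid size
  INTEGER, PARAMETER :: ncoord = 2         ! Would be thisobs%ncoord
  REAL, PARAMETER :: rms_obs = 0.5         ! Assumed observation error std. dev.
  INTEGER :: i, j, cnt, dim_obs_p
  INTEGER, ALLOCATABLE :: id_obs_p(:,:)    ! Stand-in for thisobs%id_obs_p
  REAL, ALLOCATABLE :: obs_p(:), ivar_obs_p(:), ocoord_p(:,:)

  ! Count the observations: here every second grid point is observed
  dim_obs_p = 0
  DO j = 1, ny
     DO i = 1, nx
        IF (MOD(i + (j-1)*nx, 2) == 1) dim_obs_p = dim_obs_p + 1
     END DO
  END DO

  ALLOCATE(obs_p(dim_obs_p), ivar_obs_p(dim_obs_p))
  ALLOCATE(ocoord_p(ncoord, dim_obs_p), id_obs_p(1, dim_obs_p))

  ! Fill all arrays in the same observation order
  cnt = 0
  DO j = 1, ny
     DO i = 1, nx
        IF (MOD(i + (j-1)*nx, 2) == 1) THEN
           cnt = cnt + 1
           id_obs_p(1, cnt) = i + (j-1)*nx      ! Index in state vector
           obs_p(cnt) = REAL(i + j)             ! Made-up observation value
           ocoord_p(1, cnt) = REAL(i)           ! x-coordinate
           ocoord_p(2, cnt) = REAL(j)           ! y-coordinate
           ivar_obs_p(cnt) = 1.0 / rms_obs**2   ! Inverse error variance
        END IF
     END DO
  END DO

  ! In an obs-module one would now call, e.g.
  ! CALL PDAFomi_gather_obs(thisobs, dim_obs_p, obs_p, ivar_obs_p, ocoord_p, &
  !      thisobs%ncoord, cradius, dim_obs)

  WRITE (*,*) 'dim_obs_p =', dim_obs_p
  WRITE (*,*) 'id_obs_p  =', id_obs_p(1,:)
END PROGRAM sketch_init_dim_obs
```

The two-pass structure (count first, then allocate and fill) is one simple way to keep `obs_p`, `ivar_obs_p`, `ocoord_p`, and `id_obs_p` in a consistent order.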

## `obs_op_OBSTYPE`

This routine applies the observation operator to a state vector. It returns the observed state vector to PDAF. The routine is used by all filters.

PDAF-OMI provides several observation operators. For example, the observation operator for observations that are grid point values is called as:

    CALL PDAFomi_obs_op_gridpoint(thisobs, state_p, ostate)

Here, `state_p` is the state vector and `ostate` is the observed state vector.
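Conceptually, an observation operator for grid point values selects those elements of the state vector whose indices are stored in the index array. The following minimal sketch shows this selection for a single process, without the parallelization that PDAF-OMI handles internally. It is not the PDAF implementation; the array sizes and values are made up, and the local array `id_obs_p` stands in for `thisobs%id_obs_p`.

```fortran
PROGRAM sketch_obs_op_gridpoint
  IMPLICIT NONE

  INTEGER, PARAMETER :: dim_p = 6, dim_obs = 3
  REAL :: state_p(dim_p)                 ! State vector
  REAL :: ostate(dim_obs)                ! Observed state vector
  INTEGER :: id_obs_p(1, dim_obs)        ! Stand-in for thisobs%id_obs_p
  INTEGER :: i

  state_p = (/ 10.0, 20.0, 30.0, 40.0, 50.0, 60.0 /)
  id_obs_p(1,:) = (/ 2, 3, 6 /)          ! Observed state vector elements

  ! Select the observed elements of the state vector
  DO i = 1, dim_obs
     ostate(i) = state_p(id_obs_p(1, i))
  END DO

  WRITE (*,*) 'ostate =', ostate
END PROGRAM sketch_obs_op_gridpoint
```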

For more information on the available observation operators and on how to implement your own observation operator see the documentation of observation operators for OMI.

## `init_dim_obs_l_OBSTYPE`

This routine initializes local observation information. The routine is only used by the domain-localized filters (LESTKF, LETKF, LSEIK, LNETF, LKNETF).

For the initialization the following routine is called:

    CALL PDAFomi_init_dim_obs_l(thisobs_l, thisobs, coords_l, &
         locweight, cradius, sradius, dim_obs_l)

Here, `thisobs` and `thisobs_l` are the variables of data type `obs_f` and `obs_l`, respectively. `dim_obs_l`, the local size of the observation vector, is the direct output of the routine.

*Implementation steps:*

- Ensure that `coords_l` is filled in `init_dim_l_pdaf` and that the unit of `coords_l` is the same as that used for the observation coordinates.
- Specify the localization variables (these variables are usually set in `init_pdaf` and included with `use mod_assimilation`):
  - `locweight`: Type of localization (see table below)
  - `cradius`: The localization radius or directional radii (cut-off radius for the observations; the weight is always 0 for distances > cradius)
  - `sradius`: The support radius (or directional radii) of the localization weight function

Note that starting with PDAF V2.2.1 these three variables can be either scalar values, for isotropic localization, or arrays, for non-isotropic localization and for additionally choosing separate weight functions for the horizontal and vertical directions (see the notes below for more information).

The setting of `locweight` influences the weight function for the localization. The choices are standardized as follows:

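| locweight | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| function | unit weight | exponential | 5-th order polynomial | 5-th order polynomial | 5-th order polynomial |
| regulation | - | - | - | regulation using mean variance | regulation using variance of single observation point |
| cradius | weight=0 if distance > cradius | weight=0 if distance > cradius | weight=0 if distance > cradius | weight=0 if distance > cradius | weight=0 if distance > cradius |
| sradius | no impact | weight = exp(-d / sradius) | weight = 0 if d >= sradius else weight = f(sradius, distance) | weight = 0 if d >= sradius else weight = f(sradius, distance) | weight = 0 if d >= sradius else weight = f(sradius, distance) |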

Here, 'regulation' refers to the regulated localization introduced in Nerger, L., Janjić, T., Schröter, J., Hiller, W. (2012). A regulated localization scheme for ensemble-based Kalman filters. Quarterly Journal of the Royal Meteorological Society, 138, 802-812. doi:10.1002/qj.945.
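To make the roles of `cradius` and `sradius` more concrete, the following self-contained sketch evaluates the weight for `locweight` 0, 1, and 2 at a few distances. It is only an illustration, not the PDAF code: the radii are example values, regulation (choices 3 and 4) is left out, and it is an assumption of this sketch that the 5-th order polynomial is the function of Gaspari and Cohn (1999) scaled so that the weight reaches zero at `sradius`.

```fortran
PROGRAM sketch_locweight
  IMPLICIT NONE

  REAL, PARAMETER :: cradius = 4.0, sradius = 4.0   ! Example radii
  REAL :: d
  INTEGER :: k

  DO k = 0, 5
     d = REAL(k)
     WRITE (*,'(A,F4.1,3(A,F8.4))') 'd=', d, &
          '  unit:', weight(0, d), '  exp:', weight(1, d), '  poly:', weight(2, d)
  END DO

CONTAINS

  ! Weight for distance d; locweight as in the table (without regulation)
  REAL FUNCTION weight(locweight, d)
    INTEGER, INTENT(in) :: locweight
    REAL, INTENT(in) :: d
    REAL :: r

    weight = 0.0
    IF (d > cradius) RETURN            ! Cut-off: weight is zero beyond cradius

    SELECT CASE (locweight)
    CASE (0)
       weight = 1.0                    ! Unit weight, sradius has no impact
    CASE (1)
       weight = EXP(-d / sradius)      ! Exponential decrease
    CASE DEFAULT
       ! Assumed 5th-order polynomial of Gaspari and Cohn (1999),
       ! with sradius as the distance at which the weight reaches zero
       r = d / (0.5 * sradius)
       IF (r <= 1.0) THEN
          weight = -0.25*r**5 + 0.5*r**4 + 0.625*r**3 - (5.0/3.0)*r**2 + 1.0
       ELSE IF (r < 2.0) THEN
          weight = r**5/12.0 - 0.5*r**4 + 0.625*r**3 + (5.0/3.0)*r**2 &
               - 5.0*r + 4.0 - 2.0/(3.0*r)
       END IF
    END SELECT
  END FUNCTION weight

END PROGRAM sketch_locweight
```

With `cradius=sradius` the polynomial weight decays smoothly to zero exactly at the cut-off, whereas the exponential weight is still clearly non-zero where it is cut off.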

**Notes:**

- **isotropic localization**: If `cradius` and `sradius` are scalar values, the localization is isotropic. Thus, it uses the same `cradius` in all directions. If different localization scales should be applied, e.g. in the vertical compared to the horizontal, one needs to scale the vertical coordinates.
- **non-isotropic localization**: Non-isotropic localization was introduced with PDAF V2.2: `cradius` and `sradius` can be declared as vectors of length `thisobs%ncoord` and each element can get a different value. In this case, the values define a non-isotropic localization according to the values specified in `cradius` and `sradius`. PDAF-OMI will use these values to compute a directional localization radius.
- **2D+1D factorized non-isotropic localization**: With PDAF V2.2.1 a factorized 2D+1D localization can be specified (see the explanation of `disttype`). If the non-isotropic localization is used, one can specify different weight functions for the vertical and horizontal directions. This is achieved by declaring `locweight` as a vector of size 2. Then the first element specifies the weight function (according to the table above) for the horizontal direction and the second element specifies the weight function for the vertical direction. When `locweight` is used as a scalar variable, it specifies the weight function in the horizontal direction, while the weight function in the vertical direction is a constant value of one.
- A common choice is to use `locweight=2` or `locweight=4` in combination with `cradius=sradius`. Choosing `sradius>cradius` is possible, but `sradius<cradius` should be avoided (one would set the weights of distant observations to zero, but would still assimilate them).

## `localize_covar_OBSTYPE`

This routine applies covariance localization. The routine is only used by the local EnKF (LEnKF).

For the localization the following routine is called:

    CALL PDAFomi_localize_covar(thisobs, dim_p, locweight, cradius, sradius, &
         coords_p, HP_p, HPH)

Here, `thisobs` is the variable of data type `obs_f`. `HP_p` and `HPH` are the covariance matrices projected onto the observations. The localization will be applied to these variables.

*Implementation steps:*

- Ensure that `coords_p` is filled in `localize_covar_pdafomi`.
- Specify the localization variables (these variables are usually set in `init_pdaf` and included with `use mod_assimilation`):
  - `locweight`: Type of localization (see table above)
  - `cradius`: The localization radius (cut-off radius for the observations; the weight is always 0 for distances > cradius)
  - `sradius`: The support radius of the localization weight function

Note that starting with PDAF V2.2.1 these three variables can be either scalar values, for isotropic localization, or arrays, for non-isotropic localization and for additionally choosing separate weight functions for the horizontal and vertical directions (see the notes below for more information).

**Notes:**

- **isotropic localization**: If `cradius` and `sradius` are scalar values, the localization is isotropic. Thus, it uses the same `cradius` in all directions. If different localization scales should be applied, e.g. in the vertical compared to the horizontal, one needs to scale the vertical coordinates.
- **non-isotropic localization**: Non-isotropic localization was introduced with PDAF V2.2: `cradius` and `sradius` can be declared as vectors of length `thisobs%ncoord` and each element can get a different value. In this case, the values define a non-isotropic localization according to the values specified in `cradius` and `sradius`. PDAF-OMI will use these values to compute a directional localization radius.
- **2D+1D factorized non-isotropic localization**: With PDAF V2.2.1 a factorized 2D+1D localization can be specified (see the explanation of `disttype`). If the non-isotropic localization is used, one can specify different weight functions for the vertical and horizontal directions. This is achieved by declaring `locweight` as a vector of size 2. Then the first element specifies the weight function (according to the table above) for the horizontal direction and the second element specifies the weight function for the vertical direction. When `locweight` is used as a scalar variable, it specifies the weight function in the horizontal direction, while the weight function in the vertical direction is a constant value of one.
- A common choice for the localization is to use `locweight=2` or `locweight=4` in combination with `cradius=sradius`. Choosing `sradius>cradius` is possible, but `sradius<cradius` should be avoided (one would set the weights of distant observations to zero, but would still assimilate them).
- Particular to the LEnKF: When choosing `locweight=1` (exponential decrease) with a finite value of `cradius`, it might be that the localized covariance matrices are no longer positive semidefinite. The mathematically consistent choice for `locweight=1` is to set `cradius` so that the full model domain is covered. The width of the localization weight function is then defined by `sradius`. For `locweight>1` one should set `cradius=sradius`.

## Additional routines for 3D-Var

For the 3D-Var methods added with PDAF V2.0 two more routines are required in the observation module.

### `obs_op_lin_OBSTYPE`

This routine applies the linearized observation operator to a state vector. It returns the observed state vector to PDAF. The routine is used only by the 3D-Var methods.

**Note:** A separate routine for `obs_op_lin_OBSTYPE` is only required if the full observation operator in `obs_op_OBSTYPE` is nonlinear. If `obs_op_OBSTYPE` is linear, one can just insert calls to this operator in the routine `obs_op_lin_pdafomi` in `callback_obs_pdafomi.F90`.

PDAF-OMI provides several linear observation operators. For example, the observation operator for observations that are grid point values is called as:

    CALL PDAFomi_obs_op_gridpoint(thisobs, state_p, ostate)

Here, `state_p` is the state vector and `ostate` is the observed state vector.

For more information on the available observation operators and on how to implement your own observation operator see the documentation of observation operators for OMI.

### `obs_op_adj_OBSTYPE`

This routine applies the adjoint observation operator to an observation vector. It returns the state vector to PDAF. The routine is used only by the 3D-Var methods.

PDAF-OMI provides consistent pairs of linear observation operators. For example, the adjoint observation operator for observations that are grid point values is called as:

    CALL PDAFomi_obs_op_adj_gridpoint(thisobs, ostate, state_p)

Here, `ostate` is the observation vector and `state_p` is the state vector.

For more information on the available observation operators and on how to implement your own observation operator see the documentation of observation operators for OMI.
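When implementing an own pair of linear and adjoint operators, a common way to check their consistency is the scalar product test: for any state vector x and observation vector y, the scalar product of Hx with y has to equal that of x with the adjoint applied to y. The sketch below performs this test for a hand-written grid point operator and its adjoint. It does not use the PDAF-OMI routines; all sizes, values, and the local array `id_obs_p` (standing in for `thisobs%id_obs_p`) are made up for illustration.

```fortran
PROGRAM sketch_adjoint_test
  IMPLICIT NONE

  INTEGER, PARAMETER :: dim_p = 5, dim_obs = 2
  INTEGER :: id_obs_p(1, dim_obs)        ! Stand-in for thisobs%id_obs_p
  REAL :: x(dim_p), hx(dim_obs)          ! State vector and H x
  REAL :: y(dim_obs), hty(dim_p)         ! Observation vector and H^T y
  INTEGER :: i

  id_obs_p(1,:) = (/ 2, 5 /)
  x = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
  y = (/ 0.5, -1.5 /)

  ! Forward (linear) operator: select observed elements
  DO i = 1, dim_obs
     hx(i) = x(id_obs_p(1, i))
  END DO

  ! Adjoint operator: add each observation value to its state element
  hty = 0.0
  DO i = 1, dim_obs
     hty(id_obs_p(1, i)) = hty(id_obs_p(1, i)) + y(i)
  END DO

  ! For a consistent pair both scalar products agree
  WRITE (*,*) '<H x, y>   =', DOT_PRODUCT(hx, y)
  WRITE (*,*) '<x, H^T y> =', DOT_PRODUCT(x, hty)
END PROGRAM sketch_adjoint_test
```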

## Implementing a new observation type

To implement a new observation type, the approach is generally as follows:

- Create a copy of `obs_OBSTYPE_pdafomi_TEMPLATE.F90`
- Rename the module and its subroutines according to the observation (replacing 'OBSTYPE' by the name of the observation)
- Implement `init_dim_obs` for the observation type following the instructions in the template
- Adapt `obs_op` for the observation type
- Adapt `init_dim_obs_l` for the observation type (if using a domain-localized filter)
- Adapt `localize_covar` for the observation type (if using the local EnKF)
- Add subroutine calls for the new observation type into the routines in `callback_obs_pdafomi.F90`

## Implementation hints for init_dim_obs

### `thisobs%doassim`

Set this variable to 1 to let the filter assimilate this observation. The setting is usually conditional on the value of `assim_OBSTYPE`, which is set in `init_pdaf`:

    IF (assim_OBSTYPE) thisobs%doassim = 1

### `thisobs%ncoord`

This variable specifies the dimension of the distance computations. Thus thisobs%ncoord=2 will lead to distance computations in 2 dimensions.

### `thisobs%disttype`

This variable specifies the type of distance computation. Possible choices are

- 0: Cartesian distance in ncoord dimensions
- 1: Cartesian distance in ncoord dimensions with periodicity (needs specification of thisobs%domainsize(ncoord))
- 2: Approximate geographic distance in meters with horizontal coordinates in radians (latitude: -pi/2 to +pi/2; longitude: -pi to +pi or 0 to 2pi)
- 3: Geographic distance computation in meters using the haversine formula with horizontal coordinates in radians (latitude: -pi/2 to +pi/2; longitude: -pi to +pi or 0 to 2pi)

With PDAF V2.2.1, a **2D+1D factorized localization** was introduced for 3-dimensional applications. With the factorized localization, the horizontal distance (components 1 and 2) is treated separately from the vertical direction (3rd component). This is available for both isotropic and non-isotropic localization and is activated using the choices

- 10: Cartesian distance 2D+1D factorized in 3 dimensions
- 11: Cartesian distance 2D+1D factorized in 3 dimensions with periodicity (needs specification of thisobs%domainsize(ncoord))
- 12: Approximate geographic distance 2D+1D factorized in meters with horizontal coordinates in radians (latitude: -pi/2 to +pi/2; longitude: -pi to +pi or 0 to 2pi) and vertical in a unit chosen by the user
- 13: Geographic distance computation 2D+1D factorized in meters using the haversine formula with horizontal coordinates in radians (latitude: -pi/2 to +pi/2; longitude: -pi to +pi or 0 to 2pi) and vertical in a unit chosen by the user

**Notes:**

- When disttype>=10 is specified with isotropic localization, the weight function for the vertical direction is constant with a value of one. For non-isotropic localization, the weight functions can be specified separately for the vertical and horizontal directions (see the description of `init_dim_obs_l_OBSTYPE` for information on how to specify the non-isotropic localization).
- For 0 and 1 (likewise 10, 11) any distance unit can be used. The computed distance will be in the same unit. For 2 and 3 the horizontal input coordinates are in radians and the distance is computed in meters. It is essential that the grid point coordinates and observation coordinates use the same unit.
- For 3-dimensional localization, the unit of the vertical direction can be chosen by the user. However, for geographic distances, the unit should be chosen to be 'compatible' with the unit in the horizontal (meter). When isotropic localization is used, the vertical coordinate can be scaled so that the length scales in the vertical and horizontal directions are the same (this, e.g., allows one to use pressure as the distance measure in the vertical in atmospheric models). For non-isotropic localization, the units can differ without scaling. In the case of the factorized 2D+1D localization (disttype>=10), the units in the horizontal and vertical directions are independent.

See `/models/lorenz96/` for an example using choice 1 with periodicity in one dimension.
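The following self-contained sketch illustrates the kind of distance computation behind two of the choices: a Cartesian distance with periodicity (choice 1) and a haversine distance in meters from coordinates in radians (choice 3). It is a hand-written illustration rather than the PDAF-OMI code. The Earth radius `r_earth`, the coordinate order (longitude first, latitude second), and the helper names `dist_periodic` and `dist_haversine` are assumptions of this example, and the periodic variant assumes that both points lie inside the domain.

```fortran
PROGRAM sketch_distances
  IMPLICIT NONE

  REAL, PARAMETER :: r_earth = 6.3675e6     ! Assumed Earth radius in meters
  REAL, PARAMETER :: pi = 3.14159265
  REAL :: domainsize(2)

  domainsize = (/ 10.0, 0.0 /)   ! Periodic in x with length 10, not in y

  WRITE (*,*) 'periodic Cartesian:', &
       dist_periodic((/ 1.0, 2.0 /), (/ 9.0, 2.0 /), domainsize)
  WRITE (*,*) 'haversine (m):     ', &
       dist_haversine((/ 0.0, 0.0 /), (/ 0.5*pi, 0.0 /))

CONTAINS

  ! Cartesian distance with periodicity where domainsize > 0
  REAL FUNCTION dist_periodic(a, b, dsize)
    REAL, INTENT(in) :: a(:), b(:), dsize(:)
    REAL :: diff(SIZE(a))

    diff = ABS(a - b)
    WHERE (dsize > 0.0) diff = MIN(diff, dsize - diff)
    dist_periodic = SQRT(SUM(diff**2))
  END FUNCTION dist_periodic

  ! Great-circle distance; a and b hold (longitude, latitude) in radians
  REAL FUNCTION dist_haversine(a, b)
    REAL, INTENT(in) :: a(2), b(2)
    REAL :: h

    h = SIN(0.5*(b(2) - a(2)))**2 &
         + COS(a(2)) * COS(b(2)) * SIN(0.5*(b(1) - a(1)))**2
    dist_haversine = 2.0 * r_earth * ASIN(MIN(1.0, SQRT(h)))
  END FUNCTION dist_haversine

END PROGRAM sketch_distances
```

In the periodic case the two points at x=1 and x=9 are closer across the boundary of the domain of length 10 than directly, which is why the shorter of both separations is used.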

### `dim_obs_p`

This is a single integer value giving the number of observations. With a parallel model using domain decomposition this will be the number of observations for the process sub-domain. For observation files holding all observations, one can read these and then check which observations reside within the process sub-domain. `dim_obs_p` will be used to allocate further arrays and as an input argument to `PDAFomi_gather_obs`.
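One possible way to obtain `dim_obs_p` from a list holding all observations is to build a mask of the observations inside the process sub-domain, count them, and then pack the values into the process-local arrays. The sketch below does this in one dimension. The observation list and the sub-domain limits `x_min` and `x_max` are made up for illustration.

```fortran
PROGRAM sketch_count_obs
  IMPLICIT NONE

  INTEGER, PARAMETER :: nobs_all = 5        ! Observations in the (made-up) file
  REAL :: x_all(nobs_all), val_all(nobs_all)
  REAL, PARAMETER :: x_min = 2.0, x_max = 6.0  ! Assumed sub-domain limits
  LOGICAL :: inside(nobs_all)
  INTEGER :: dim_obs_p
  REAL, ALLOCATABLE :: obs_p(:), ocoord_p(:,:)

  x_all   = (/ 0.5, 2.5, 4.0, 6.5, 5.5 /)  ! Observation coordinates
  val_all = (/ 1.1, 1.2, 1.3, 1.4, 1.5 /)  ! Observation values

  ! Check which observations reside within the process sub-domain
  inside = (x_all >= x_min .AND. x_all < x_max)
  dim_obs_p = COUNT(inside)

  ! Allocate and fill the process-local arrays in a consistent order
  ALLOCATE(obs_p(dim_obs_p), ocoord_p(1, dim_obs_p))
  obs_p = PACK(val_all, inside)
  ocoord_p(1,:) = PACK(x_all, inside)

  WRITE (*,*) 'dim_obs_p =', dim_obs_p
  WRITE (*,*) 'obs_p     =', obs_p
END PROGRAM sketch_count_obs
```

Using the same mask for every array keeps the order of the entries consistent, which is required for the arrays passed to `PDAFomi_gather_obs`.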

### `obs_p`

This should be a vector of real values. It will be used as an argument to `PDAFomi_gather_obs`. The order of the entries has to be consistent in the arrays `thisobs%id_obs_p`, `obs_p`, `ivar_obs_p`, and `ocoord_p`.

### `ocoord_p`

This should be a rank-2 array of real values with size (thisobs%ncoord, dim_obs_p). It will be used as an argument to `PDAFomi_gather_obs`. The order of the entries has to be consistent in the arrays `thisobs%id_obs_p`, `obs_p`, `ivar_obs_p`, and `ocoord_p`.

The coordinates of the observation with index `k` are given by `ocoord_p(:,k)`.

**Note:** The observation coordinate values will only be used in case of the local filters or for computing interpolation coefficients. The array always has to be allocated because it is used in the call to `PDAFomi_gather_obs`.

**Note:** The unit of `ocoord_p` and `coords_l` (in `init_dim_obs_l`) has to be the same. For geographic coordinate computations (thisobs%disttype=2 or =3) the unit used by PDAF-OMI is radian.

### `ivar_obs_p`

This should be a vector of real values. It will be used as an argument to `PDAFomi_gather_obs`. The order of the entries has to be consistent in the arrays `thisobs%id_obs_p`, `obs_p`, `ivar_obs_p`, and `ocoord_p`.

### `thisobs%id_obs_p`

This array is allocated as

    ALLOCATE(thisobs%id_obs_p(NROWS, dim_obs_p))

For a fixed value of the second index the NROWS are the indices of the elements of the state vector that are treated in the observation operator. The value of NROWS depends on the observation operator used for an observation. Examples:

- Using observations that are grid point values:
  - NROWS=1
  - The entry is the index of a single element of the state vector
- Using observations that are determined by bi-linear interpolation of 4 grid points:
  - NROWS=4
  - The entries are the indices of four elements of the state vector

**Note:** This array is only used in the observation operators provided by PDAF-OMI. If you don't use these observation operators, you might not need this array.
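For the case NROWS=4, the sketch below shows one way to determine the four state vector indices for each observation on a regular 2D grid. It assumes, for illustration only, that the grid points sit at integer coordinates 1..nx and 1..ny, that the state vector stores the field with the x-index running fastest, and that the observations lie inside the grid. The order of the four entries chosen here is arbitrary; in practice it has to match the order of the interpolation coefficients expected by the observation operator that is used.

```fortran
PROGRAM sketch_id_obs_bilinear
  IMPLICIT NONE

  INTEGER, PARAMETER :: nx = 5, ny = 4      ! Hypothetical grid size
  INTEGER, PARAMETER :: dim_obs_p = 2
  REAL :: ocoord_p(2, dim_obs_p)            ! Observation coordinates (x, y)
  INTEGER :: id_obs_p(4, dim_obs_p)         ! Stand-in for thisobs%id_obs_p
  INTEGER :: i, ix, iy

  ocoord_p(:,1) = (/ 1.3, 2.6 /)
  ocoord_p(:,2) = (/ 3.5, 1.2 /)

  DO i = 1, dim_obs_p
     ! Lower-left grid point of the cell that contains the observation
     ix = MIN(INT(ocoord_p(1,i)), nx - 1)
     iy = MIN(INT(ocoord_p(2,i)), ny - 1)

     id_obs_p(1,i) = ix     + (iy-1)*nx     ! lower left
     id_obs_p(2,i) = ix + 1 + (iy-1)*nx     ! lower right
     id_obs_p(3,i) = ix     + iy*nx         ! upper left
     id_obs_p(4,i) = ix + 1 + iy*nx         ! upper right
  END DO

  DO i = 1, dim_obs_p
     WRITE (*,*) 'obs', i, ' uses state elements', id_obs_p(:,i)
  END DO
END PROGRAM sketch_id_obs_bilinear
```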

### `thisobs%domainsize`

This array has to be allocated as

    ALLOCATE(thisobs%domainsize(thisobs%ncoord))

Here one has to specify the size of the domain in each of its thisobs%ncoord dimensions. The information is used to compute the Cartesian distance with periodicity.

Setting one dimension to 0 or a negative value indicates that there is no periodicity in this direction.

### `thisobs%icoeff_p`

This array is allocated in the same way as `thisobs%id_obs_p`:

    ALLOCATE(thisobs%icoeff_p(NROWS, dim_obs_p))

The value of NROWS has to be the same as for `thisobs%id_obs_p`. For a fixed value of the second index the NROWS of the array hold the interpolation coefficients corresponding to the indices specified in `thisobs%id_obs_p`.

Please see the documentation of OMI observation operators for information on how to initialize the array `thisobs%icoeff_p` using functions provided by PDAF-OMI.
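To illustrate how the coefficients relate to the indices, the following sketch computes bi-linear interpolation coefficients by hand for a single observation and applies them to a small test field. This is only meant to show the idea; it does not use the PDAF-OMI functions. The grid (unit spacing, grid points at integer coordinates, x-index running fastest in the state vector), the observation position, and the local arrays `id_obs` and `icoeff` (standing in for one column of `thisobs%id_obs_p` and `thisobs%icoeff_p`) are assumptions of this example.

```fortran
PROGRAM sketch_icoeff_bilinear
  IMPLICIT NONE

  INTEGER, PARAMETER :: nx = 3, ny = 3      ! Hypothetical grid size
  REAL :: state_p(nx*ny)                    ! State vector, x-index fastest
  REAL :: ox, oy, wx, wy                    ! Observation position and weights
  INTEGER :: id_obs(4)                      ! One column of thisobs%id_obs_p
  REAL :: icoeff(4)                         ! One column of thisobs%icoeff_p
  INTEGER :: i, j, ix, iy

  ! Test field f(x,y) = x + 10*y, which bilinear interpolation reproduces
  DO j = 1, ny
     DO i = 1, nx
        state_p(i + (j-1)*nx) = REAL(i) + 10.0*REAL(j)
     END DO
  END DO

  ox = 1.25
  oy = 2.5
  ix = INT(ox)
  iy = INT(oy)
  wx = ox - REAL(ix)                        ! Fractional offset in x
  wy = oy - REAL(iy)                        ! Fractional offset in y

  id_obs = (/ ix + (iy-1)*nx, ix+1 + (iy-1)*nx, ix + iy*nx, ix+1 + iy*nx /)
  icoeff = (/ (1.0-wx)*(1.0-wy), wx*(1.0-wy), (1.0-wx)*wy, wx*wy /)

  ! Observed value: coefficient-weighted sum of the four state elements
  WRITE (*,*) 'sum of coefficients =', SUM(icoeff)
  WRITE (*,*) 'interpolated value  =', SUM(icoeff * state_p(id_obs))
  WRITE (*,*) 'exact value         =', ox + 10.0*oy
END PROGRAM sketch_icoeff_bilinear
```

The coefficients sum to one, and each coefficient belongs to the state vector element at the same row position of the index array.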

### `thisobs%obs_err_type`

The particle filter methods NETF, LNETF and PF can handle observations with non-Gaussian errors. PDAF-OMI supports the following two choices:

- 0: Gaussian errors (*default value*)
- 1: double-exponential (Laplace) errors

### `thisobs%use_global_obs`

In the domain-localized filters (LESTKF, LETKF, LSEIK, LNETF), observations are assimilated that are located within the localization radius around some grid point. When a model uses parallelization with domain decomposition, some of these observations might belong to a different process domain. In the default mode (`thisobs%use_global_obs=1`) PDAF-OMI gathers all globally available observations so that each process has access to all observations. It can be more efficient to limit the observations on a process domain to those observations that are located inside the domain or within the localization radius around it. Then, in the local analyses fewer observations have to be checked for their distance. Setting `thisobs%use_global_obs=0` activates this feature. However, it needs additional preparations to make PDAF-OMI aware of the limiting coordinates of a process sub-domain.

The use of this feature is described in the documentation on using domain-limited observations.

### `thisobs%inno_omit`

Setting this variable to a value > 0.0 activates the functionality that observations are omitted (made irrelevant) from the analysis update if the difference between their value and the ensemble mean is too large. For more information see the page on additional OMI functionality.