= PDAF-OMI Observation Diagnostics =

{{{
#!html
<div class="wiki-toc">
<h4>PDAF-OMI Guide</h4>
<ol><li><a href="PDAF_OMI_Overview">Overview</a></li>
<li><a href="OMI_Callback_obs_pdafomi">callback_obs_pdafomi.F90</a></li>
<li><a href="OMI_observation_modules">Observation Modules</a></li>
<li><a href="OMI_observation_operators">Observation operators</a></li>
<li><a href="OMI_error_checking">Checking error status</a></li>
<li><a href="OMI_debugging">Debugging functionality</a></li>
<li><a href="OMI_ImplementationofAnalysisStep">Implementing the analysis step with OMI</a></li>
<ol>
<li> <a href="ImplementFilterAnalysisOverview"> General overview for ensemble filters</a></li>
<ol>
<li><a href="ImplementAnalysisGlobal">Implementation for Global Filters</a></li>
<li><a href="ImplementAnalysisLocal">Implementation for Local Filters</a></li>
<li><a href="ImplementAnalysislenkfOmi">Implementation for LEnKF</a></li>
</ol>
<li> <a href="Implement3DVarAnalysisOverview"> General overview for 3D-Var methods</a></li>
<ol>
<li><a href="ImplementAnalysis_3DVar">Implementation for 3D-Var</a></li>
<li><a href="ImplementAnalysis_3DEnVar">Implementation for 3D Ensemble Var</a></li>
<li><a href="ImplementAnalysis_Hyb3DVar">Implementation for Hybrid 3D-Var</a></li>
</ol>
</ol>
<li><a href="OMI_nondiagonal_observation_error_covariance_matrices">Using nondiagonal R-matrices</a></li>
<li><a href="Porting_to_OMI">Porting an existing implementation to OMI</a></li>
<li><a href="PDAFomi_additional_functionality">Additional OMI Functionality</a></li>
<li>Observation diagnostics</li>
</ol>
</div>
}}}

[[PageOutline(2-3,Contents of this page)]]

|| PDAF-OMI observation diagnostics were introduced with PDAF V3.0 ||

The PDAF-OMI observation diagnostics module provides functionality to obtain statistics about the differences between observations and the observed model state. 
Additionally, there are routines that provide the user with access to observations and observed quantities, such as the observed ensemble mean state.

Here, we describe the functionalities of the observation diagnostics routines.

A common place to call the `PDAFomi_diag` diagnostics routines is in `prepoststep_pdaf`, which is the usual place to also analyze the ensemble. 

By default, PDAF initializes the observations only after `prepoststep_pdaf` has been executed for the forecast ensemble. To compare the observations with the forecast ensemble in `prepoststep_pdaf`, one has to change the point at which the observations are initialized. This is done with 
{{{
  CALL PDAF_set_iparam(9, 0)
}}}
which can be called in `init_pdaf` after the initialization of PDAF with `PDAF_init`.

The routines for observation diagnostics can be organized in four groups:
 * Deactivating or re-activating observation diagnostics
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_set_obs_diag PDAFomi_set_obs_diag]
 * Statistics
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_rmsd PDAFomi_diag_rmsd] - root mean square difference
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_stats PDAFomi_diag_stats] - set of 6 statistics
 * Access to observation dimensions
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_nobstypes PDAFomi_diag_nobstypes] - number of observation types
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_dimobs PDAFomi_diag_dimobs] - vector of observation dimensions
 * Access to observation arrays
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_get_obs PDAFomi_diag_get_obs] - access to observation vector
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_get_HXmean PDAFomi_diag_get_HXmean] - access to observed ensemble mean
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_get_HX PDAFomi_diag_get_HX] - access to observed ensemble
   * [wiki:PDAFomi_observation_diagnostics#PDAFomi_diag_get_ivar PDAFomi_diag_get_ivar] - access to inverse observation error variances


== Deactivating or re-activating observation diagnostics ==

=== PDAFomi_set_obs_diag ===

By default, the observation diagnostics are active. However, as this functionality increases the required memory, it might be desirable to deactivate it. This routine is used to deactivate the observation diagnostics. It is also possible to re-activate the observation diagnostics at a later time. 

The routine can be called by all processes, but it is sufficient to call it on those processes that handle observations, which are usually the filter processes. A common place to call the routine is in `init_pdaf`, after the initialization of PDAF with `PDAF_init`.

The interface is: 
{{{
  SUBROUTINE PDAFomi_set_obs_diag(diag)

    INTEGER, INTENT(in) :: diag   ! Value for observation diagnostics mode
                                  ! =0 deactivates observation diagnostics
                                  ! >0 activates observation diagnostics
}}}
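
A minimal sketch of switching the diagnostics off and on again is shown below. The placement after `PDAF_init` follows the description above. That the interface is made available by `USE PDAF` is an assumption and may need to be adapted to the actual setup.
{{{
  ! Assumption: 'USE PDAF' in the declaration part of init_pdaf
  ! provides the interface of PDAFomi_set_obs_diag

  ! ... after the call to PDAF_init ...

  ! Deactivate the observation diagnostics to save memory
  CALL PDAFomi_set_obs_diag(0)

  ! Re-activate the observation diagnostics at a later time
  CALL PDAFomi_set_obs_diag(1)
}}}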


== Statistics ==

=== PDAFomi_diag_rmsd ===

The routine `PDAFomi_diag_obs_rmsd` returns a pointer to a vector holding the root mean square difference (RMSD) between the observations and the observed ensemble mean for each observation type. 

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_obs_rmsd(nobs, rmsd_pointer, verbose)

    INTEGER, INTENT(inout) :: nobs                   ! Number of observation types
    REAL, POINTER, INTENT(inout) :: rmsd_pointer(:)  ! Pointer to vector of RMSD values
    INTEGER, INTENT(in) :: verbose                   ! Verbosity flag, >0 for output
}}}

**Note:**
 * The computed RMSD is for the global model domain. Thus, in case of a parallelized model, all process sub-domains are taken into account and calling `PDAFomi_diag_obs_rmsd` will return the same value for all processes.
 * In Fortran user code, the pointer should be declared in the form[[BR]] `REAL, POINTER :: rmsd_ptr(:)`[[BR]] It does not need to be allocated. The target vector has the length `nobs`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer array will not be set and `nobs=0` is returned. One can check the value of `nobs` before accessing the pointer array.
 * A more extensive set of statistics can be obtained using the routine `PDAFomi_diag_stats`.
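
The following sketch shows how the RMSD values might be printed, for example from `prepoststep_pdaf`. The routine name `print_obs_rmsd` is hypothetical, and the use of `USE PDAF` to obtain the interface is an assumption.
{{{
  SUBROUTINE print_obs_rmsd()     ! hypothetical helper routine

    USE PDAF                      ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER :: nobs               ! Number of observation types
    INTEGER :: iobs               ! Loop counter
    REAL, POINTER :: rmsd_ptr(:)  ! Pointer to RMSD values; not allocated by the user

    nobs = 0

    ! verbose=0: no screen output from the routine itself
    CALL PDAFomi_diag_obs_rmsd(nobs, rmsd_ptr, 0)

    ! nobs=0 signals deactivated diagnostics; the loop is then skipped
    ! and the unset pointer is never accessed
    DO iobs = 1, nobs
       WRITE (*, '(a, i4, a, es12.4)') 'obs. type', iobs, '  RMSD:', rmsd_ptr(iobs)
    END DO

  END SUBROUTINE print_obs_rmsd
}}}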


=== PDAFomi_diag_stats ===

The routine returns a pointer to an array holding six statistics that compare the observations and the observed ensemble mean for each observation type. The statistics can, for example, be used to plot a Taylor diagram.

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_stats(nobs, obsstats_ptr, verbose)

    INTEGER, INTENT(inout) :: nobs                     ! Number of observation types
    REAL, POINTER, INTENT(inout) :: obsstats_ptr(:,:)  ! Array of observation statistics
          ! Included statistics are:
          !  (1,:) correlations between observation and observed ensemble mean
          !  (2,:) centered RMS difference between observation and observed ensemble mean
          !  (3,:) mean bias (observation minus observed ensemble mean)
          !  (4,:) mean absolute difference between observation and observed ensemble mean
          !  (5,:) variance of observations
          !  (6,:) variance of observed ensemble mean
    INTEGER, INTENT(in) :: verbose                     ! Verbosity flag, >0 to write output
}}}

**Note:**
 * The computed statistics are for the global model domain. Thus, in case of a parallelized model, all process sub-domains are taken into account and calling `PDAFomi_diag_stats` will return the same value for all processes.
 * In Fortran user code, the pointer should be declared in the form[[BR]] `REAL, POINTER :: obsstats_ptr(:,:)`.[[BR]] It does not need to be allocated. The target array has the size `(6, nobs)`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer array will not be set and `nobs=0` is returned. One can check this value before accessing the pointer array.
 * The routine returns the centered RMSD as displayed in Taylor diagrams. The non-centered RMSD can be computed using `PDAFomi_diag_obs_rmsd`.
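
As a sketch of how the statistics might be processed for a Taylor diagram, the example below computes the ratio of the standard deviations from rows 5 and 6. It also reconstructs the non-centered RMSD from the centered RMSD (row 2) and the mean bias (row 3). This reconstruction is exact only if both quantities use the same normalization, which is an assumption here. The routine name `print_taylor_stats` is hypothetical, as is the use of `USE PDAF`.
{{{
  SUBROUTINE print_taylor_stats()      ! hypothetical helper routine

    USE PDAF                           ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER :: nobs                    ! Number of observation types
    INTEGER :: iobs                    ! Loop counter
    REAL, POINTER :: obsstats_ptr(:,:) ! Pointer to statistics, size (6, nobs)
    REAL :: std_ratio                  ! Std. dev. of observed mean relative to observations
    REAL :: rmsd                       ! Non-centered RMSD

    nobs = 0

    CALL PDAFomi_diag_stats(nobs, obsstats_ptr, 0)

    ! For nobs=0 (deactivated diagnostics) the loop is skipped
    DO iobs = 1, nobs
       ! Ratio of standard deviations; guard against zero observation variance
       std_ratio = 0.0
       IF (obsstats_ptr(5, iobs) > 0.0) THEN
          std_ratio = SQRT(obsstats_ptr(6, iobs) / obsstats_ptr(5, iobs))
       END IF

       ! RMSD^2 = (centered RMSD)^2 + bias^2, assuming equal normalization
       rmsd = SQRT(obsstats_ptr(2, iobs)**2 + obsstats_ptr(3, iobs)**2)

       WRITE (*, '(a, i4, a, f8.4, a, es12.4, a, es12.4)') 'obs. type', iobs, &
            '  corr:', obsstats_ptr(1, iobs), '  std ratio:', std_ratio, '  RMSD:', rmsd
    END DO

  END SUBROUTINE print_taylor_stats
}}}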


== Access to observation dimensions ==

=== PDAFomi_diag_nobstypes === 

The routine returns the number of observation types that are active in an assimilation run. 

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_nobstypes(nobstypes)

    INTEGER, INTENT(inout) :: nobstypes   ! Number of observation types
}}}

**Note:**
 * `nobstypes` is commonly used as the upper limit of a loop running over all observation types. In this way, `nobstypes` can be used with the `PDAFomi_diag` routines that return different observation-related arrays for a single observation type.


=== PDAFomi_diag_dimobs ===

The routine returns a pointer to a vector of the number of observations (observation dimension) for each active observation type.

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_dimobs(dim_obs_ptr)

    INTEGER, POINTER, INTENT(inout) :: dim_obs_ptr(:)   ! Pointer to observation dimensions
}}}

**Note:**
 * In Fortran user code, the pointer should be declared in the form[[BR]] `INTEGER, POINTER :: dim_obs_ptr(:)`[[BR]] It does not need to be allocated.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer array will have length 1 and the observation dimension is returned as 0.
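
The two routines of this group can be combined as in the following sketch. The loop uses the actual size of the pointer target as its upper limit, so it also works in the deactivated case described above. The routine name `print_obs_dims` is hypothetical, and `USE PDAF` is an assumption.
{{{
  SUBROUTINE print_obs_dims()         ! hypothetical helper routine

    USE PDAF                          ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER :: nobstypes              ! Number of observation types
    INTEGER :: iobs                   ! Loop counter
    INTEGER, POINTER :: dim_obs_ptr(:) ! Pointer to observation dimensions

    nobstypes = 0

    CALL PDAFomi_diag_nobstypes(nobstypes)
    CALL PDAFomi_diag_dimobs(dim_obs_ptr)

    WRITE (*, '(a, i4)') 'number of observation types:', nobstypes

    DO iobs = 1, SIZE(dim_obs_ptr)
       WRITE (*, '(a, i4, a, i10)') 'obs. type', iobs, '  dimension:', dim_obs_ptr(iobs)
    END DO

  END SUBROUTINE print_obs_dims
}}}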



== Access to observation arrays ==

The routines that provide access to observation arrays each work on a single observation type, which is specified by the first argument. To process all observation types, one can implement a loop `DO iobs = 1, nobstypes`, where `nobstypes` can be obtained with `PDAFomi_diag_nobstypes`, as described above.

=== PDAFomi_diag_get_obs ===

The routine returns a pointer to a vector of observations of the specified observation type (`id_obs`) and a pointer to the corresponding array of observation coordinates.

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_get_obs(id_obs, dim_obs_p_diag, ncoord, obs_p_ptr, ocoord_p_ptr)

    INTEGER, INTENT(in) :: id_obs                    ! Index of observation type to return
    INTEGER, INTENT(out) :: dim_obs_p_diag           ! Observation dimension
    INTEGER, INTENT(out) :: ncoord                   ! Number of coordinate directions
    REAL, POINTER, INTENT(out) :: obs_p_ptr(:)       ! Pointer to observation vector
    REAL, POINTER, INTENT(out) :: ocoord_p_ptr(:,:)  ! Pointer to coordinate array
                                                     ! (index order as in observation modules)
}}}

**Notes:**
 * In case of a parallelized model, the vector `obs_p_ptr` and the array `ocoord_p_ptr` contain the values for the process sub-domain of the calling process.
 * In Fortran user code, the pointer to the observation vector should be declared in the form[[BR]] `REAL, POINTER :: obs_p_ptr(:)`.[[BR]] It does not need to be allocated. The target vector has the length `dim_obs_p_diag`.
 * In Fortran user code, the pointer to the observation coordinates should be declared in the form[[BR]] `REAL, POINTER :: ocoord_p_ptr(:,:)`.[[BR]] It does not need to be allocated. The target array has the size `(ncoord, dim_obs_p_diag)`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointers will not be set and `dim_obs_p_diag=0` and `ncoord=0` will be returned. These values can be checked before accessing the pointer arrays.
 * The array `ocoord_p_ptr(:,:)` is organized as in the observation modules:
   * First index: coordinate direction of the observation specified by the second index
   * Second index: index of the observation
 * One can access the values in `obs_p_ptr` and `ocoord_p_ptr` like those of ordinary arrays. The pointer attribute requires no special handling.
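
The index order of the coordinate array can be illustrated with the following sketch, which prints the values and coordinates of the first few process-local observations of one type. The routine name `print_first_obs` and the limit of five observations are hypothetical; `USE PDAF` is an assumption.
{{{
  SUBROUTINE print_first_obs(id_obs)   ! hypothetical helper routine

    USE PDAF                           ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER, INTENT(in) :: id_obs      ! Index of observation type
    INTEGER :: dim_obs_p               ! Process-local number of observations
    INTEGER :: ncoord                  ! Number of coordinate directions
    INTEGER :: i                       ! Loop counter
    REAL, POINTER :: obs_p_ptr(:)      ! Pointer to observation vector
    REAL, POINTER :: ocoord_p_ptr(:,:) ! Pointer to coordinate array

    CALL PDAFomi_diag_get_obs(id_obs, dim_obs_p, ncoord, obs_p_ptr, ocoord_p_ptr)

    ! With deactivated diagnostics dim_obs_p=0, so that the loop is skipped
    DO i = 1, MIN(dim_obs_p, 5)
       ! First index: coordinate direction; second index: observation
       WRITE (*, '(a, i6, a, es12.4, a, *(es12.4))') 'obs', i, '  value:', obs_p_ptr(i), &
            '  coordinates:', ocoord_p_ptr(1:ncoord, i)
    END DO

  END SUBROUTINE print_first_obs
}}}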



=== PDAFomi_diag_get_HXmean ===

The routine returns a pointer to a vector of the observed ensemble mean state for the specified observation type (`id_obs`).

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_get_HXmean(id_obs, dim_obs_p_diag, HXmean_p_ptr)

    INTEGER, INTENT(in) :: id_obs                    ! Index of observation type to return
    INTEGER, INTENT(out) :: dim_obs_p_diag           ! Observation dimension
    REAL, POINTER, INTENT(out) :: HXmean_p_ptr(:)    ! Pointer to observed ensemble mean
}}}

**Notes:**
 * In case of a parallelized model, the vector `HXmean_p_ptr` contains the observed ensemble mean for the process sub-domain of the calling process.
 * In Fortran user code, the pointer to the observed ensemble mean should be declared in the form[[BR]] `REAL, POINTER :: HXmean_p_ptr(:)`[[BR]] It does not need to be allocated. The target vector has the length `dim_obs_p_diag`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer will not be set and `dim_obs_p_diag=0` will be returned. This value can be checked before accessing the pointer array.
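
Combined with `PDAFomi_diag_get_obs`, the observed ensemble mean allows the innovation (observation minus observed ensemble mean) to be evaluated for each observation type. The following sketch prints its process-local mean. The routine name `print_mean_innovation` is hypothetical, and `USE PDAF` is an assumption.
{{{
  SUBROUTINE print_mean_innovation()   ! hypothetical helper routine

    USE PDAF                           ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER :: nobstypes               ! Number of observation types
    INTEGER :: iobs                    ! Loop counter
    INTEGER :: dim_obs_p, dim_hx_p     ! Process-local observation dimensions
    INTEGER :: ncoord                  ! Number of coordinate directions
    REAL, POINTER :: obs_p_ptr(:)      ! Pointer to observation vector
    REAL, POINTER :: ocoord_p_ptr(:,:) ! Pointer to coordinate array (not used here)
    REAL, POINTER :: HXmean_p_ptr(:)   ! Pointer to observed ensemble mean

    nobstypes = 0
    CALL PDAFomi_diag_nobstypes(nobstypes)

    DO iobs = 1, nobstypes
       CALL PDAFomi_diag_get_obs(iobs, dim_obs_p, ncoord, obs_p_ptr, ocoord_p_ptr)
       CALL PDAFomi_diag_get_HXmean(iobs, dim_hx_p, HXmean_p_ptr)

       ! Only access the pointers if both vectors exist and have equal length
       IF (dim_obs_p > 0 .AND. dim_hx_p == dim_obs_p) THEN
          WRITE (*, '(a, i4, a, es12.4)') 'obs. type', iobs, '  local mean innovation:', &
               SUM(obs_p_ptr(1:dim_obs_p) - HXmean_p_ptr(1:dim_obs_p)) / REAL(dim_obs_p)
       END IF
    END DO

  END SUBROUTINE print_mean_innovation
}}}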



=== PDAFomi_diag_get_HX ===

The routine returns a pointer to the array of the observed ensemble for the specified observation type (`id_obs`).

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_get_HX(id_obs, dim_obs_p_diag, HX_p_ptr)

    INTEGER, INTENT(in)  :: id_obs                   ! Index of observation type to return
    INTEGER, INTENT(out) :: dim_obs_p_diag           ! Observation dimension
    REAL, POINTER, INTENT(out) :: HX_p_ptr(:,:)      ! Pointer to observed ensemble
}}}

**Notes:**
 * In case of a parallelized model, the array `HX_p_ptr` contains the observed ensemble for the process sub-domain of the calling process.
 * In Fortran user code, the pointer to the observed ensemble should be declared in the form[[BR]] `REAL, POINTER :: HX_p_ptr(:,:)`[[BR]] It does not need to be allocated. The target array has the size `(dim_obs_p_diag, dim_ens)`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer will not be set and `dim_obs_p_diag=0` will be returned. This value can be checked before accessing the pointer array.
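
One possible use of the observed ensemble is to estimate the ensemble spread in observation space. The sketch below computes the square root of the mean ensemble variance over the process-local observations of one type. The routine name `obs_space_spread` is hypothetical, `USE PDAF` is an assumption, and the ensemble size is taken from the second dimension of the pointer target.
{{{
  SUBROUTINE obs_space_spread(id_obs, spread)   ! hypothetical helper routine

    USE PDAF                        ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER, INTENT(in) :: id_obs   ! Index of observation type
    REAL, INTENT(out)   :: spread   ! Process-local spread in observation space
    INTEGER :: dim_obs_p            ! Process-local number of observations
    INTEGER :: dim_ens              ! Ensemble size
    INTEGER :: i                    ! Loop counter
    REAL :: mean_i, var_sum         ! Ensemble mean of one observation, sum of variances
    REAL, POINTER :: HX_p_ptr(:,:)  ! Pointer to observed ensemble

    spread = 0.0

    CALL PDAFomi_diag_get_HX(id_obs, dim_obs_p, HX_p_ptr)

    ! Skip if diagnostics are deactivated or no local observations exist
    IF (dim_obs_p > 0) THEN
       dim_ens = SIZE(HX_p_ptr, 2)
       IF (dim_ens > 1) THEN
          var_sum = 0.0
          DO i = 1, dim_obs_p
             mean_i = SUM(HX_p_ptr(i, 1:dim_ens)) / REAL(dim_ens)
             ! Unbiased ensemble variance of observed value i
             var_sum = var_sum + SUM((HX_p_ptr(i, 1:dim_ens) - mean_i)**2) / REAL(dim_ens - 1)
          END DO
          spread = SQRT(var_sum / REAL(dim_obs_p))
       END IF
    END IF

  END SUBROUTINE obs_space_spread
}}}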



=== PDAFomi_diag_get_ivar ===

The routine returns a pointer to a vector of the inverse observation error variances for the specified observation type (`id_obs`).

The interface is: 
{{{
  SUBROUTINE PDAFomi_diag_get_ivar(id_obs, dim_obs_p_diag, ivar_p_ptr)

    INTEGER, INTENT(in)  :: id_obs                   ! Index of observation type to return
    INTEGER, INTENT(out) :: dim_obs_p_diag           ! Observation dimension
    REAL, POINTER, INTENT(out) :: ivar_p_ptr(:)      ! Pointer to inverse observation error variances
}}}

**Notes:**
 * In case of a parallelized model, the vector `ivar_p_ptr` contains the inverse observation error variances for the process sub-domain of the calling process.
 * In Fortran user code, the pointer to the vector of inverse observation error variances should be declared in the form[[BR]] `REAL, POINTER :: ivar_p_ptr(:)`[[BR]] It does not need to be allocated. The target vector has the length `dim_obs_p_diag`.
 * If the observation diagnostics have been deactivated by using `PDAFomi_set_obs_diag`, the pointer will not be set and `dim_obs_p_diag=0` will be returned. This value can be checked before accessing the pointer array.
 * If the feature `thisobs%inno_omit` is used (see the [wiki:PDAFomi_additional_functionality page Additional functionality of PDAF-OMI]), the inverse variance of the omitted observations will show the small value that this feature assigns to them. One can use this information to exclude such observations when analyzing differences between observations and observed ensemble.
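
A sketch of how omitted observations might be identified is given below. It counts the process-local observations whose inverse variance exceeds a threshold. The routine name `count_used_obs` and the argument `ivar_limit` are hypothetical. The threshold has to be chosen by the user so that it lies above the small value assigned to omitted observations and below the regular inverse variances. `USE PDAF` is an assumption.
{{{
  SUBROUTINE count_used_obs(id_obs, ivar_limit, n_used)   ! hypothetical helper routine

    USE PDAF                           ! Assumption: provides the PDAFomi_diag interfaces

    IMPLICIT NONE

    INTEGER, INTENT(in)  :: id_obs     ! Index of observation type
    REAL, INTENT(in)     :: ivar_limit ! User-chosen threshold (hypothetical)
    INTEGER, INTENT(out) :: n_used     ! Number of observations that are not omitted
    INTEGER :: dim_obs_p               ! Process-local number of observations
    REAL, POINTER :: ivar_p_ptr(:)     ! Pointer to inverse observation error variances

    n_used = 0

    CALL PDAFomi_diag_get_ivar(id_obs, dim_obs_p, ivar_p_ptr)

    ! Only access the pointer if observations exist
    IF (dim_obs_p > 0) THEN
       n_used = COUNT(ivar_p_ptr(1:dim_obs_p) > ivar_limit)
    END IF

  END SUBROUTINE count_used_obs
}}}

The same logical expression can serve as the mask in `SUM` or `PACK` when differences between observations and the observed ensemble are evaluated.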