
Version 27 (modified by lnerger, 20 hours ago)

--

General Implementation Concept of PDAF

Components of the assimilation system

Implementation Concept

  1. General Concept
  2. Online Mode
  3. Offline Mode

Data assimilation requires three components:

  • Model:
    The numerical model provides the initialization and integration of all model fields. It defines the dynamics of the system that is simulated. This provides the first source of information on the system.
  • Observations:
    The observations of the system provide the second source of information.
  • Assimilation method:
    The assimilation method combines the information from the model and observations.

Figure 1 shows how these components are combined by PDAF.

/pics/DAstructure_PDAF3_web.png
Figure 1: Components required for a data assimilation system. The data assimilation methods ('DA method') are provided by PDAF's core library. The handling of observations is supported by PDAF-OMI.

PDAF combines these three components, which are otherwise independent.

  • The model is provided by the user. It computes the time evolution of the model fields. For the data assimilation, it provides information on the model state fields, the time, and the model grid to the other components.
  • The assimilation methods (DA methods) are implemented in the core part of PDAF. They obtain information from the model and the observations and operate on abstract state vectors. For this, one has to fill an ensemble of state vectors with the values of the model fields. After the assimilation update, the analysis state vector is written back to the model fields. This functionality is implemented in separate routines that the user supplies to the assimilation system. The DA methods also interact with the observations: they provide the state vectors to the observation operator and obtain the observed model state vectors in return.
  • Information on observations (observation values, their coordinates, and error estimates) has to be read from files. Further, one has to define the observation operator, which computes the model equivalent of the observations. PDAF-OMI provides a structured way to handle observations and to provide PDAF with the required information.
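
The mapping between model fields, state vectors, and observed state vectors described above can be sketched as follows. This is a minimal illustration in plain Python, not PDAF code: the field names, the grid size, and the function names `collect_state`, `distribute_state`, and `observation_operator` are assumptions chosen for this example, and the observation operator simply selects elements of the state vector.

```python
# Hypothetical model fields on a tiny 2x3 grid, stored as nested lists.
fields = {
    "temperature": [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
    "salinity": [[10.0, 11.0, 12.0], [13.0, 14.0, 15.0]],
}
# Fixed ordering of the fields inside the state vector (an assumption).
FIELD_ORDER = ["temperature", "salinity"]


def collect_state(model_fields):
    """Flatten all model fields into one abstract state vector."""
    state = []
    for name in FIELD_ORDER:
        for row in model_fields[name]:
            state.extend(row)
    return state


def distribute_state(state, model_fields):
    """Write a state vector back into the model fields, in place."""
    offset = 0
    for name in FIELD_ORDER:
        for row in model_fields[name]:
            row[:] = state[offset:offset + len(row)]
            offset += len(row)


def observation_operator(state, obs_indices):
    """Return the observed model state: here, selected state elements."""
    return [state[i] for i in obs_indices]


state = collect_state(fields)

# Observe temperature[0][0] (index 0) and salinity[0][1] (index 7).
obs_indices = [0, 7]
observed_state = observation_operator(state, obs_indices)

# Stand-in for an analysis update: add 1.0 to every state element,
# then write the result back to the model fields.
analysis_state = [x + 1.0 for x in state]
distribute_state(analysis_state, fields)
```

An ensemble would be a collection of such state vectors, one per ensemble member. The DA method only ever sees the flat vectors, which is what keeps it independent of the model's own data layout.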

Both the subroutines of PDAF and the user-provided subroutines are called through a well-defined standard interface. To reduce the implementation complexity, the user-provided routines can be implemented like routines of the model code. Thus, a user who has experience with the model should find it rather easy to extend it with the routines required for the assimilation system.

Online and offline assimilation systems

There are two possibilities to build a data assimilation system:

  1. Online mode:
    In the online mode, calls to subroutines for the data assimilation are inserted into the model code. These routines call the PDAF core routines. A single data-assimilative model program is compiled. Running this single program performs both the necessary ensemble integrations and the actual assimilation. The information transfer between the model and the data assimilation functions is performed in memory.
  2. Offline mode:
    In the offline mode, the model is executed separately from the assimilation program. The model writes restart files, which are used as inputs for the assimilation program. After computing the analysis step with the DA method, the assimilation program writes new restart files for the model, which is then started to compute the next forecast phase. In Figure 1, the model on the bottom left side is replaced by the reading and writing of files.
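
The difference in control flow between the two modes can be sketched as follows. This is a toy illustration in plain Python under stated assumptions: `forecast` is a made-up model that damps the state, `analysis` is a simple nudging step standing in for a real DA method, and the JSON restart files and all function names are hypothetical. None of this is PDAF's interface.

```python
import json
import os
import tempfile


def forecast(state, n_steps):
    """Toy model: each time step damps the state by 10 percent."""
    for _ in range(n_steps):
        state = [0.9 * x for x in state]
    return state


def analysis(ensemble, observation):
    """Stand-in for a DA method: move each member halfway to the observation."""
    return [[x + 0.5 * (observation - x) for x in member] for member in ensemble]


def write_restart(path, state):
    with open(path, "w") as handle:
        json.dump(state, handle)


def read_restart(path):
    with open(path) as handle:
        return json.load(handle)


def run_online(ensemble, observations, n_steps):
    """Online mode: one program, forecast and analysis exchange data in memory."""
    for obs in observations:
        ensemble = [forecast(member, n_steps) for member in ensemble]
        ensemble = analysis(ensemble, obs)
    return ensemble


def run_offline(ensemble, observations, n_steps, workdir):
    """Offline mode: model and assimilation exchange data via restart files."""
    paths = [os.path.join(workdir, f"restart_{i}.json") for i in range(len(ensemble))]
    for path, member in zip(paths, ensemble):
        write_restart(path, member)
    for obs in observations:
        # "Model program": read a restart file, integrate, write a restart file.
        for path in paths:
            write_restart(path, forecast(read_restart(path), n_steps))
        # "Assimilation program": read all restart files, compute the analysis,
        # and write new restart files for the next forecast phase.
        analysed = analysis([read_restart(path) for path in paths], obs)
        for path, member in zip(paths, analysed):
            write_restart(path, member)
    return [read_restart(path) for path in paths]


initial_ensemble = [[1.0, 2.0], [3.0, 4.0]]
observations = [1.0, 2.0]

online_result = run_online(initial_ensemble, observations, n_steps=2)
with tempfile.TemporaryDirectory() as workdir:
    offline_result = run_offline(initial_ensemble, observations, 2, workdir)
```

Both functions perform the same forecast and analysis computations. The offline variant adds a file write and a file read around every forecast and every analysis step, which is the overhead the online mode avoids.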

PDAF supports both the online and offline modes. The online mode is usually more efficient on parallel computers, since fewer files have to be written to disk and the model does not need to be restarted after the analysis step. However, the required programming is simpler for the offline mode than for the online mode, since no modification of the model source code is necessary.

The better efficiency of the online mode of the data assimilation system is caused by several factors:

  • The initialization phase of the model program is executed only once.
  • The output files are limited to the necessary outputs for the estimated forecast and analysis states and/or ensembles. In contrast, in the offline mode restart files have to be written and read for each forecast/assimilation cycle, and the model is stopped and restarted each time.
  • The assimilation system for the online mode can make efficient use of a large number of processors by executing a single program containing the full assimilation system. In contrast, in the offline mode, separate programs for the forecasts and the assimilation have to be run, each of which typically uses a smaller number of processors.

Generally, the user-provided program code for the analysis step is very similar for the online and offline modes. The main difference is that in the online mode, one can access model fields and model grid information from the model source code (if the model is programmed in Fortran, the information is usually accessible via Fortran modules), while in the offline mode one needs to read the model fields and grid information from files. One can implement these reading operations so that, in the offline mode, the reading routines are used only in the initialization phase, while all other subroutines remain independent of how the model fields and grid information are obtained.
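
One way to keep the other routines independent of the data source is to fill a common container during initialization and let every other routine read only from that container. The following is a minimal sketch in plain Python, assuming a hypothetical `ModelData` container, a JSON file as the model output, and illustrative function names. It is not taken from PDAF.

```python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class ModelData:
    """Model field and grid information as seen by the assimilation routines."""
    field: list
    grid: list


def init_online(model_memory):
    """Online mode: take field and grid directly from the model's memory."""
    return ModelData(field=model_memory["field"], grid=model_memory["grid"])


def init_offline(path):
    """Offline mode: read field and grid from a file written by the model."""
    with open(path) as handle:
        content = json.load(handle)
    return ModelData(field=content["field"], grid=content["grid"])


def collect_state(data):
    """Shared routine: it only sees ModelData, not where the data came from."""
    return list(data.field)


def nearest_grid_index(data, coordinate):
    """Shared routine: index of the grid point closest to a coordinate."""
    return min(range(len(data.grid)), key=lambda i: abs(data.grid[i] - coordinate))


# Stand-in for the model's in-memory variables (hypothetical values).
model_memory = {"field": [0.5, 1.5, 2.5], "grid": [0.0, 10.0, 20.0]}
online_data = init_online(model_memory)

# Stand-in for the offline case: the model has written its data to a file.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "model_output.json")
    with open(path, "w") as handle:
        json.dump(model_memory, handle)
    offline_data = init_offline(path)
```

Only the two initialization functions differ between the modes. `collect_state` and `nearest_grid_index` work unchanged on either container.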

The implementation concepts of the online and offline modes of PDAF are described further on separate pages (Online Mode and Offline Mode).
