= General Implementation Concept of PDAF =

== Logical separation of the assimilation system ==

{{{
#!html
<div class="wiki-toc">
<h4>Implementation Concept</h4>
<ol>
<li>General Concept</li>
<li><a href="ImplementationConceptOnline">Online Mode</a></li>
<li><a href="ImplementationConceptOffline">Offline Mode</a></li>
</ol>
</div>
}}}

[[PageOutline(2-3,Contents of this page)]]

Data assimilation requires three components:
 * **Model**:[[BR]] The numerical model provides the initialization and integration of all model fields. It defines the dynamics of the system that is simulated.
 * **Observations**:[[BR]] The observations of the system provide additional information. 
 * **Assimilation method**:[[BR]] The assimilation method combines the model and observational information. 

Figure 1 shows how these components are combined by PDAF.

[[Image(//pics/da_structure.png)]]
[[BR]]'''Figure 1:''' Components required for a data assimilation system. The data assimilation methods ('DA method') are provided by PDAF's core library. The handling of observations is supported by PDAF-OMI.


Generally, all three components are independent. In particular, the assimilation methods are implemented in the core part of PDAF. To combine the model and observational information, one has to define the relation of the observations to the model fields. For example, the model fields might be observed directly, or the observed quantities might be more complex functions of the model fields. The observations might also be at locations other than the model grid points, so that interpolation is required. In addition, one has to fill the state vector that is used in the DA methods with the model fields and write the analysis state vector back to the model fields.

These relations and operations are implemented in separate routines that the user supplies to the assimilation system. The routines are called through a well-defined standard interface. To reduce the implementation effort, the user-defined routines can be implemented like routines of the model code. Thus, a user who has experience with the model should find it rather easy to extend it by the routines required for the assimilation system.
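The operations described above can be illustrated with a minimal Python sketch. It is not PDAF code: the names `collect_state`, `distribute_state` and `observe` are hypothetical, the fields are one-dimensional, and the observation operator is assumed to be a plain linear interpolation of one field to the observation locations.

{{{
#!python
# Illustrative sketch only; these helper names are hypothetical, not PDAF routines.

def collect_state(fields):
    """Pack the model fields (dict: name -> list of floats) into one flat state vector."""
    state = []
    for name in sorted(fields):          # fixed field order, here alphabetical
        state.extend(fields[name])
    return state


def distribute_state(state, fields):
    """Write a state vector back into the model fields, using the same field order."""
    offset = 0
    for name in sorted(fields):
        n = len(fields[name])
        fields[name][:] = state[offset:offset + n]
        offset += n


def observe(state, offset, grid, obs_locations):
    """Observation operator: linearly interpolate the field that starts at index
    `offset` of the state vector from the grid points to the observation locations."""
    observed = []
    for x in obs_locations:
        for i in range(len(grid) - 1):
            if grid[i] <= x <= grid[i + 1]:
                w = (x - grid[i]) / (grid[i + 1] - grid[i])
                observed.append((1.0 - w) * state[offset + i] + w * state[offset + i + 1])
                break
        else:
            raise ValueError(f"observation location {x} is outside the grid")
    return observed


if __name__ == "__main__":
    fields = {"temperature": [10.0, 12.0, 14.0], "salinity": [35.0, 35.5, 36.0]}
    grid = [0.0, 1.0, 2.0]
    state = collect_state(fields)        # salinity first, then temperature
    # Temperature observed halfway between the first two grid points
    print(observe(state, 3, grid, [0.5]))
}}}

In this sketch only the three helper functions know how the model stores its fields. The assimilation method itself would operate on the flat state vector and the observed values alone.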

== Online and offline assimilation systems ==

There are two ways to build a data assimilation system:
 1. '''Online mode:''' [[BR]] In this case, the model code is extended by calls to PDAF core routines. A single program is compiled. While this single program runs, it performs both the necessary ensemble integrations and the actual assimilation, and the information transfer between the model and the data assimilation functions is performed in memory.
 1. '''Offline mode:''' [[BR]] The model is executed separately from the assimilation program. Output files from the model are used as inputs for the assimilation program, which writes restart files for the model after computing the analysis step.

PDAF supports both the online and offline modes. The online mode is usually more efficient on parallel computers, since fewer files have to be written to disk and the model does not need to be restarted after the analysis step. However, the required coding is simpler for the offline mode than for the online mode, since no modification to the model source code is necessary.

The better efficiency of the online mode of the data assimilation system is caused by several factors:
 * The initialization phase of the model program is executed only once.
 * The output files are limited to the necessary outputs for the estimated forecast and analysis states and/or ensembles. In contrast, in the offline mode, restart files have to be written and read for each forecast/assimilation cycle.
 * The assimilation system for the online mode can make efficient use of a large number of processors by executing a single program containing the full assimilation system. In contrast, in the offline mode, separate programs for the forecasts and the assimilation have to be run, each of which typically uses fewer processors.
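The difference in control flow between the two modes can be sketched with a toy example in Python. Everything here is an assumption made for illustration: the scalar "model", the JSON files and the function names are hypothetical, and `analysis` is only a placeholder that moves each member halfway towards the observation. It is not one of PDAF's assimilation methods.

{{{
#!python
import json
import os
import tempfile


def initialize_model():
    """Stand-in for the (often expensive) model initialization."""
    return {"growth": 1.1}


def forecast(model, value, nsteps):
    """Toy model integration of a single ensemble member."""
    for _ in range(nsteps):
        value *= model["growth"]
    return value


def analysis(ensemble, observation):
    """Placeholder analysis step (NOT a PDAF filter): move halfway to the observation."""
    return [0.5 * (x + observation) for x in ensemble]


def run_online(ensemble, observations, nsteps):
    """Single program: the model is initialized once, data is exchanged in memory."""
    model = initialize_model()
    for obs in observations:
        ensemble = [forecast(model, x, nsteps) for x in ensemble]
        ensemble = analysis(ensemble, obs)
    return ensemble


def run_offline(ensemble, observations, nsteps, workdir):
    """Separate 'programs' that communicate only through files in `workdir`."""
    restart = os.path.join(workdir, "restart.json")
    output = os.path.join(workdir, "forecast.json")
    with open(restart, "w") as f:
        json.dump(ensemble, f)
    for obs in observations:
        # Model program: initialize again, read restart, integrate, write output
        model = initialize_model()
        with open(restart) as f:
            members = json.load(f)
        with open(output, "w") as f:
            json.dump([forecast(model, x, nsteps) for x in members], f)
        # Assimilation program: read model output, analyze, write new restart
        with open(output) as f:
            members = json.load(f)
        with open(restart, "w") as f:
            json.dump(analysis(members, obs), f)
    with open(restart) as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(run_online([1.0, 2.0], [1.5, 1.8], 3))
        print(run_offline([1.0, 2.0], [1.5, 1.8], 3, workdir))
}}}

Both functions compute the same ensemble. The offline variant repeats the model initialization and the file input/output in every forecast/assimilation cycle, which is where the extra cost listed above comes from.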

Generally, the user-provided code for the analysis step is very similar for the online and offline modes. The main difference is that in the online mode one can access model fields and model grid information directly from the model source code (usually through Fortran modules, if the model is programmed in Fortran), while in the offline mode one needs to read the model fields and grid information from files. One can implement this such that, for the offline mode, these reading routines are used mainly in the initialization phase, while all other subroutines remain independent of how the model fields and grid information are obtained.
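This separation can be illustrated with a short Python sketch. The container `GridInfo`, the JSON file layout and all function names are hypothetical and chosen only for this example. The point is that only the initialization routine differs between the modes, while a routine of the analysis step (here `nearest_grid_index`) works on the same data structure in both cases.

{{{
#!python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class GridInfo:
    """Grid information as seen by the analysis-step routines (hypothetical container)."""
    x: list


def grid_from_model(model):
    """Online mode: take the grid directly from the model's in-memory data."""
    return GridInfo(x=list(model["grid_x"]))


def grid_from_file(path):
    """Offline mode: read the same information from a file written by the model."""
    with open(path) as f:
        return GridInfo(x=json.load(f)["grid_x"])


def nearest_grid_index(grid, location):
    """Example analysis-step routine; it does not depend on where `grid` came from."""
    return min(range(len(grid.x)), key=lambda i: abs(grid.x[i] - location))


if __name__ == "__main__":
    model = {"grid_x": [0.0, 1.0, 2.0]}
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "grid.json")
        with open(path, "w") as f:
            json.dump(model, f)
        for grid in (grid_from_model(model), grid_from_file(path)):
            print(nearest_grid_index(grid, 1.2))
}}}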

The implementation concepts of the online and offline modes of PDAF are described on separate pages:
 * [ImplementationConceptOnline Online mode: Attaching PDAF to a model]
 * [ImplementationConceptOffline Offline mode: Separating model integrations and the assimilation step]