Version 1 (modified by lnerger, 10 years ago)

Online or Offline Implementation

Here, we discuss whether an online or offline implementation of the data assimilation system should be considered.

Generally, we recommend using the online implementation variant, i.e. the strong coupling to the numerical model. (In fact, all implementations that the developers of PDAF did themselves are online implementations.) This recommendation is motivated by the fact that the online implementation is computationally more efficient. In the online implementation, the common memory of a single running executable is used to transfer state information between the model and PDAF. In contrast, the offline implementation uses disk files to transfer the state information between the model executable and the separate assimilation executable containing PDAF. In addition, the offline implementation implies that the full model initialization (start-up) has to be performed each time an ensemble member is integrated by the model. The online implementation avoids this start-up cost. We cannot state precisely how large the overhead of the offline implementation over the online implementation is. However, in general, for a single forecast/analysis cycle it is the cost of writing twice as many output files as there are ensemble members plus reading as many files as there are ensemble members. In addition, there is the cost of the repeated model start-up phase (which again includes reading a file holding state information) for each ensemble member.
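The overhead described above can be put into a rough back-of-the-envelope estimate. The following Python sketch only counts the file operations and model start-ups named in the text. The function name, the class, and the per-operation timings (`t_write`, `t_read`, `t_startup`) are assumptions made for illustration and have to be measured for a real model; they are not part of PDAF.

```python
from dataclasses import dataclass


@dataclass
class OfflineCost:
    """Per-cycle overhead of the offline variant (illustrative only)."""
    files_written: int
    files_read: int
    model_startups: int
    overhead_seconds: float


def offline_overhead_per_cycle(n_members, t_write, t_read, t_startup):
    """Estimate the extra cost of one forecast/analysis cycle.

    n_members : ensemble size
    t_write   : assumed time to write one state file (seconds)
    t_read    : assumed time to read one state file (seconds)
    t_startup : assumed time of one model start-up (seconds),
                including reading the restart file
    """
    if n_members < 1:
        raise ValueError("ensemble size must be at least 1")
    # Forecast files written by the model plus analysis files
    # written by the assimilation program: two per member.
    files_written = 2 * n_members
    # The assimilation program reads one forecast file per member.
    files_read = n_members
    # The model is started once per ensemble member.
    model_startups = n_members
    overhead = (files_written * t_write
                + files_read * t_read
                + model_startups * t_startup)
    return OfflineCost(files_written, files_read, model_startups, overhead)
```

For example, `offline_overhead_per_cycle(10, 1.0, 0.5, 2.0)` counts 20 written files, 10 read files, and 10 start-ups. Whether the resulting overhead matters depends on how it compares to the cost of the model integration itself.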

However, there is also a strong advantage in using the offline implementation variant: one does not need to touch the model code. (The exception is a possible addition of perturbed forcing to simulate model error, which might indeed require an addition to the initialization routines for the forcing.) Instead, the model is called repeatedly, each time providing an initialization or restart file together with the initial model time and the length of the integration. One then has to implement reading routines for the assimilation executable. These routines have to initialize the ensemble information in the assimilation program. In addition, one has to implement a routine that writes the analysis states into restart/initialization files for the model. There are also some user-supplied routines that are not required in the offline implementation, namely U_distribute_state, U_collect_state, and U_next_observation. However, the implementation of these routines should not pose a challenge.
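The offline workflow described above can be sketched as a small, self-contained Python script. Everything in it is a stand-in chosen for illustration: `run_model` plays the role of the separate model executable, `analysis_step` is a simple nudging placeholder rather than a PDAF filter, and the JSON file names are hypothetical. The point is only the data flow through files.

```python
import json
import tempfile
from pathlib import Path


def run_model(restart_file, forecast_file, start_time, n_steps):
    """Stand-in for the separate model executable (toy dynamics).

    Reads its restart file (its own 'start-up'), integrates n_steps,
    writes the forecast state to disk, and returns the end time.
    """
    state = json.loads(restart_file.read_text())
    for _ in range(n_steps):
        state = [0.5 * x for x in state]  # toy model: simple decay
    forecast_file.write_text(json.dumps(state))
    return start_time + n_steps


def read_ensemble(files):
    """Reading routine of the assimilation program."""
    return [json.loads(f.read_text()) for f in files]


def analysis_step(ensemble, observation, weight=0.5):
    """Placeholder analysis: nudge each member toward the observation."""
    return [[x + weight * (y - x) for x, y in zip(member, observation)]
            for member in ensemble]


def write_restart_files(ensemble, files):
    """Write one restart/initialization file per ensemble member."""
    for member, f in zip(ensemble, files):
        f.write_text(json.dumps(member))


def offline_cycle(workdir, ensemble, observation, start_time, n_steps):
    """One forecast/analysis cycle with all state transfer through files."""
    if not ensemble:
        raise ValueError("ensemble must not be empty")
    n = len(ensemble)
    restart = [workdir / f"restart_{i:03d}.json" for i in range(n)]
    forecast = [workdir / f"forecast_{i:03d}.json" for i in range(n)]
    write_restart_files(ensemble, restart)
    end_time = start_time
    for r, f in zip(restart, forecast):
        # The model is started anew for every ensemble member.
        end_time = run_model(r, f, start_time, n_steps)
    forecast_ens = read_ensemble(forecast)
    analysis_ens = analysis_step(forecast_ens, observation)
    # The analysis states become the restart files of the next cycle.
    write_restart_files(analysis_ens, restart)
    return analysis_ens, end_time


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ens, t = offline_cycle(Path(tmp), [[4.0, 8.0], [2.0, 0.0]],
                               [1.0, 1.0], start_time=0, n_steps=2)
        print(t, ens)
```

In a real setup, `run_model` would launch the unmodified model executable, for instance through a job script, and the reading and writing routines would use the model's own file format.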

Avoiding the need to touch the model code generally leads to a working assimilation system more quickly. When the model code is modified for the online implementation, one has to take care that all transfers of information between the model and the data assimilation system are consistent. In addition, some arrays, apart from the model fields in the state vector, might need to be re-initialized before a new ensemble state is integrated. Overall, this should not be a problem if the person who performs the implementation knows the model well or is in close contact with someone who does. If you do not know the model code well, implementing the online variant can be difficult.
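The two pitfalls of the online variant mentioned above, consistent transfers and re-initialization of auxiliary arrays, can be illustrated with a toy example. The Python sketch below is not PDAF code: `ToyModel`, its fields, and the functions `collect_state` and `distribute_state` are hypothetical stand-ins that only mirror the roles of the user-supplied routines U_collect_state and U_distribute_state.

```python
class ToyModel:
    """Hypothetical model with two fields and one auxiliary array."""

    def __init__(self):
        self.temperature = [0.0, 0.0]
        self.salinity = [0.0]
        # Auxiliary array that is NOT part of the state vector.
        self.accumulated_flux = [0.0, 0.0]

    def step(self):
        """Advance the toy model by one time step."""
        self.temperature = [t + 1.0 for t in self.temperature]
        self.salinity = [s + 0.5 for s in self.salinity]
        self.accumulated_flux = [
            a + t for a, t in zip(self.accumulated_flux, self.temperature)
        ]


def collect_state(model):
    """Copy the model fields into a state vector (temperature, salinity)."""
    return list(model.temperature) + list(model.salinity)


def distribute_state(model, state):
    """Write a state vector back into the model fields.

    The field order must match collect_state exactly; otherwise the
    transfer between model and assimilation is inconsistent.
    """
    n_t = len(model.temperature)
    if len(state) != n_t + len(model.salinity):
        raise ValueError("state vector does not match the model fields")
    model.temperature = list(state[:n_t])
    model.salinity = list(state[n_t:])
    # Reset the auxiliary array so that one ensemble member does not
    # inherit accumulated values from the previously integrated member.
    model.accumulated_flux = [0.0] * len(model.accumulated_flux)


def integrate_ensemble(model, ensemble, n_steps):
    """Integrate all members in memory, reusing a single model instance."""
    forecast = []
    for state in ensemble:
        distribute_state(model, state)
        for _ in range(n_steps):
            model.step()
        forecast.append(collect_state(model))
    return forecast
```

If the reset of `accumulated_flux` were omitted, the second member would start from the flux accumulated by the first one. This is the kind of hidden dependency that is hard to spot without knowing the model code well.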