Changes between Version 19 and Version 20 of ImplementationConceptOnline

Timestamp:: May 18, 2025, 2:41:20 PM (3 months ago)
Author:: lnerger
Comment:: --

ImplementationConceptOnline

-              v19
+              v20
 == Online mode: Attaching PDAF to a model ==
 Here we describe the extensions of the model code for the online mode of PDAF. The online mode offers two implementation variants. The first one, called ''fully-parallel'', assumes that you have a sufficient number of processes when running the data assimilation program so that all ensemble states can be propagated concurrently. The parallelism allows for a simplified implementation. The second implementation variant, called ''flexible'', allows to run the assimilation program in a way so that a model task (set of processors running one model integration) can propagate several ensemble states successively. This implementation variant is a bit more complicated, because one has to ensure that the model can jump back in time.
+Here we describe the extensions of the model code for the online mode of PDAF. The online mode offers two implementation variants. The first one, called ''fully-parallel'', assumes that you have a sufficient number of processes when running the data assimilation program so that all ensemble states can be propagated concurrently. This parallelism allows for a simplified implementation. The second implementation variant, called ''flexible'', allows to run the assimilation program in a way so that a model task (set of processors running one model integration) can propagate several ensemble states successively. This implementation variant is a bit more complicated, because one has to ensure that the model can jump back in time.
 If the data assimilation can be run with a sufficient number of processors to use the ''fully-parallel'' variant, we recommend to use it. Here, we will first focus on the ''fully-parallel'' variant. More informatio on the 'flexible'' variant is provided further below.
+If the data assimilation can be run with a sufficient number of processors to use the ''fully-parallel'' variant, we recommend using it. Here, we will first focus on the ''fully-parallel'' variant. More information on the ''flexible'' variant is provided further below.
 The assimilation system is built by adding subroutine calls to the general part of the model code. In these routines one can define variables for PDAF, use-include the PDAF module and call the PDAF core subroutines. Usually only single lines of subroutine calls are inserted into the model code. As only minimal changes to the model code are required, we refer to this as "attaching" PDAF to the model.
 …
  * **init_pdaf**: This subroutine is added after the initialization part of the model, just before the time stepping loop. In this subroutine one defines parameters for PDAF and then calls the core initialization routine `PDAF_init`. This core routine also initializes the array of ensemble states using a user-provided call-back routine. Subsequently, the PDAF-core routine `PDAF_init_forecast` is called (in implementations of PDAF before version 3.0, this routine was called `PDAF_get_state`). This routine initializes model fields from the array of ensemble states using a call-back routine. In addition, it returns the number of time steps that have to be computed in the following forecast phase.
  * **assimilate_pdaf**: This routine is added to the model code at the end of the time stepping loop (usually just before the ''END DO'' in a Fortran program). The routine declares the names of user-supplied subroutines and calls the PDAF-core routine `PDAF3_assimilate`. (In implementations of PDAF before version 3.0, different routines named `PDAFomi_assimilate_X` were used, e.g., with X=`local` for local filters.) This routine has to be called at the end of each time step. It checks whether all time steps of the current forecast phase have been computed. If this is not the case, the program continues integrating the model. If the forecast phase is completed, the analysis step (i.e. the actual assimilation of the observations) is computed. Subsequently, the next forecast phase is initialized by writing the analysis state vector into the model fields and setting the number of time steps in the next forecast phase.
+ * **finalize_pdaf**: This routine is optional. It is used to let PDAF display timing and memory information and to call PDAF to deallocate its internal arrays.
 With the implementation strategy of PDAF, calls to four routines are added to the model code. These are usually only single lines of code and the changes only affect the general part of the model code.
+A code example for the `fully parallel` mode is provided in the tutorial code in tutorial/online_2D_serialmodel. See also the [wiki:PdafTutorial PDAF implementation tutorial].
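The control flow described above can be illustrated with a small stand-in sketch. It is written in Python only to keep it short and runnable; the class `ToyPDAF` and the function `run_model` are hypothetical and merely mimic the roles of `init_pdaf`, `assimilate_pdaf`, and `finalize_pdaf`. They are not part of the PDAF API, and the actual Fortran interfaces are those shown in the tutorial code.

```python
class ToyPDAF:
    """Hypothetical stand-in that only counts time steps; not the PDAF API."""

    def __init__(self, steps_per_forecast):
        self.steps_per_forecast = steps_per_forecast
        self.steps_done = 0
        self.analyses = 0
        self.log = []

    def init_pdaf(self):
        # Role of init_pdaf: set parameters, initialize the ensemble, and
        # obtain the number of time steps of the first forecast phase.
        self.log.append("init_pdaf")
        return self.steps_per_forecast

    def assimilate_pdaf(self):
        # Role of assimilate_pdaf: called after every time step; the
        # analysis is computed only when the forecast phase is complete.
        self.steps_done += 1
        if self.steps_done == self.steps_per_forecast:
            self.analyses += 1
            self.steps_done = 0
            self.log.append("analysis")

    def finalize_pdaf(self):
        # Role of finalize_pdaf: optional clean-up and timing output.
        self.log.append("finalize_pdaf")


def run_model(total_steps, steps_per_forecast):
    """Toy model driver with the three inserted calls."""
    da = ToyPDAF(steps_per_forecast)
    da.init_pdaf()                # after model initialization
    state = 0.0
    for _ in range(total_steps):  # model time stepping loop
        state += 1.0              # placeholder for one model time step
        da.assimilate_pdaf()      # single call at the end of each step
    da.finalize_pdaf()            # after the time stepping loop
    return da
```

In this sketch, `run_model(6, 2)` performs three analysis steps, one after every second time step, while the model loop itself is unchanged apart from the inserted calls.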
 == Important aspects of the implementation concept ==
 …
 == Parallelization of the data assimilation program ==
 …
 || Note: This description is for the updated structure that was introduced with PDAF V3.0. Implementations for PDAF V2 followed the former structure, which is described on the [wiki:FlexibleParallelization_in_PDAF2 Page on the flexible parallelization mode in PDAF2]. ||
+The ''flexible'' parallelization mode allows to run the assimilation program in a way so that a model task (set of processors running one model integration) can propagate several ensemble states on after the other. This approach allows one ot use a smaller number of processes compared to the ''fully parallel'.
+The ''flexible'' parallelization mode allows the assimilation program to be run so that a model task (set of processors running one model integration) can propagate several ensemble states one after the other. This approach allows one to use a smaller number of processes compared to the ''fully parallel'' mode.
+The ''flexible'' parallelization requires that the model can jump back in time. If the number of model tasks used to evolve the ensemble states is smaller than the number of ensemble members, at least one model task has to integrate more than one model state. Such a task has to jump back in time after the integration of each ensemble member in order to integrate the next state over the same time period.
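As a rough illustration of why a task may have to integrate several states, the following sketch distributes ensemble members over model tasks. The even split and the names `dim_ens` and `n_modeltasks` are assumptions made for this example; the distribution actually used by PDAF is not specified here.

```python
def members_per_task(dim_ens, n_modeltasks):
    """Assumed even split of dim_ens members over n_modeltasks tasks."""
    base, extra = divmod(dim_ens, n_modeltasks)
    # The first `extra` tasks take one additional member.
    return [base + 1 if i < extra else base for i in range(n_modeltasks)]
```

With 8 members and 3 tasks, this split gives `[3, 3, 2]`, so every task has to jump back in time at least once per forecast phase. With as many tasks as members, each task integrates a single state, which corresponds to the ''fully parallel'' case.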
 Implementing the ''flexible'' mode requires additional changes to the model code. These are shown in Figure 4. In particular, an external loop has to be added. In addition, a call to `PDAF_get_fcst_info` determines the number of time steps for the next forecast phase.
-The ''flexible'' parallelization requires that the model can jump back in time. Jumping back in time will be required if the number of model tasks used to evolve the ensemble states is smaller than the number of ensemble members. In this case a model task has integrate more than one model state and will have to jump back in time after the integration of each ensemble member.
 [[Image(//pics/DAextension_flexible_PDAF3.png)]]
 …
 '''Extensions for the flexible assimilation system'''[[BR]]
 Figure 4 shows the extensions required for the ''flexible'' assimilation system (marked green):
  * **init_parallel_pdaf**: This routine is inserted close to the start of the model code as for the `fully parallel` mode.
+Figure 4 shows the extensions required for the ''flexible'' assimilation system (marked green, with additional changes compared to the `fully parallel` mode marked orange):
+ * **init_parallel_pdaf**: This routine is inserted close to the start of the model code in the same way as for the `fully parallel` mode.
  * **init_pdaf**: This routine is added after the initialization part of the model in the same way as for the `fully parallel` mode. The routine also determines the number of time steps for the initial forecast phase.
+ * **Ensemble loop**: In order to allow for the integration of the state ensemble, an unconditional loop is added around the time stepping loop of the model. This loop will allow to compute the time stepping loop multiple times to integrate all ensemble states. PDAF provides an exit-flag for this loop. The number of time steps in the forecast (`nsteps`) is initially provided by `init_pdaf` and subsequently by `PDAF_get_fcst_info`.
+ * **assimilate_pdaf**: Inside the ensemble loop, a call to this interface routine is added to the code. In this routine the names of user-supplied routines are declared and the PDAF-core routine `PDAF_assim_offline` is called. This routine initializes model fields from the array of ensemble states and initializes the number of time steps that have to be computed and ensures that the ensemble integration is performed correctly.
+ * **put_state_pdaf**: At the end of the external loop, the call to the interface routine `put_state_pdaf` is added to the model code. The routine declares the names of user-supplied routines and calls a PDAF_core routine that is specific for the DA methods. E.g., the routine `PDAFomi_put_state_local` is called for the local ensemble Kalman filters. This routine writes the propagated model fields back into a state vector of the ensemble array. Also it checks whether the ensemble integration is complete. If not, the next ensemble member will be integrated. If the ensemble integration is complete, the analysis step (i.e. the actual assimilation of the observations) is computed.
+ * **Ensemble loop**: To allow for the integration of the state ensemble, an unconditional loop is added around the time stepping loop of the model. This loop allows the time stepping loop to be executed multiple times to integrate all ensemble states. PDAF provides an exit flag, which is checked to decide when to exit this loop. The number of time steps in the forecast (`nsteps`) is initially provided by `init_pdaf` and subsequently by `PDAF_get_fcst_info`.
+ * **assimilate_pdaf**: Inside the ensemble loop, a call to this interface routine is added to the code in the same way as for the `fully parallel` mode.
+ * **PDAF_get_fcst_info**: The analysis step is computed when the requested number of time steps `steps` is completed. At this point the program steps out of the model time stepping loop. Here, a call to `PDAF_get_fcst_info` is used to obtain the number of time steps for the next forecast phase and the value of the exit flag. Subsequently, one checks the exit flag, and when it indicates that the assimilation process should end, one steps out of the outer unconditional loop. Otherwise, the program continues with the next forecast phase.
+ * **finalize_pdaf**: This routine is optional. It is used to let PDAF display timing and memory information and to call PDAF to deallocate its internal arrays.
+A code example for the `flexible` mode is provided in the template codes in `templates/online_flexible/`.
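The loop structure of the ''flexible'' mode can be sketched in the same stand-in style. The class `ToyFlexibleDA`, its bookkeeping, and the function `run_flexible` are hypothetical; they only imitate the roles of `init_pdaf` and `PDAF_get_fcst_info` as described above and do not reproduce the PDAF interfaces. The sketch assumes a single model task that integrates all of its ensemble members in turn.

```python
class ToyFlexibleDA:
    """Hypothetical bookkeeping for one model task; not the PDAF API."""

    def __init__(self, members_on_task, nsteps, n_cycles):
        self.members_on_task = members_on_task
        self.nsteps = nsteps
        self.n_cycles = n_cycles
        self.member = 0        # index of the member being integrated
        self.cycles_done = 0   # completed forecast/analysis cycles

    def init_pdaf(self):
        # Role of init_pdaf: provides nsteps for the first forecast phase.
        return self.nsteps

    def get_fcst_info(self):
        # Role of PDAF_get_fcst_info: provides nsteps for the next
        # integration and the exit flag for the ensemble loop.
        self.member += 1
        if self.member == self.members_on_task:
            self.member = 0
            self.cycles_done += 1
        doexit = self.cycles_done >= self.n_cycles
        return self.nsteps, doexit


def run_flexible(members_on_task, nsteps, n_cycles):
    """Toy driver: unconditional ensemble loop around the time loop."""
    da = ToyFlexibleDA(members_on_task, nsteps, n_cycles)
    steps = da.init_pdaf()
    history = []
    phase_start = 0
    while True:                    # unconditional ensemble loop
        cycle, member = da.cycles_done, da.member
        t = phase_start            # jump back to the start of the phase
        for _ in range(steps):     # model time stepping loop
            t += 1                 # placeholder for one model time step
            # assimilate_pdaf would be called here, as in fully parallel
        history.append((cycle, member, phase_start, t))
        steps, doexit = da.get_fcst_info()
        if da.member == 0:         # all members done: time moves forward
            phase_start = t
        if doexit:                 # exit flag ends the ensemble loop
            break
    return history
```

For two members on the task, three time steps per forecast phase, and two cycles, `run_flexible(2, 3, 2)` integrates both members over time 0 to 3 and then both over time 3 to 6, which shows where the model has to jump back in time.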