Changes between Version 34 and Version 35 of AdaptParallelization


Timestamp: May 20, 2025, 1:50:55 PM
Author: lnerger

The PDAF release provides example code for the online mode in `tutorial/online_2D_serialmodel` and `tutorial/online_2D_parallelmodel`. We refer to this code and use it as the basis for the description below.

In the tutorial code and the templates in `templates/online`, the parallelization is initialized in the routine `init_parallel_pdaf` (file `init_parallel_pdaf.F90`). The required variables are defined in `mod_parallel.F90`. These files can be used as templates.
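
For orientation, the following is a minimal sketch of the kind of variables such a module provides. The names follow the variables used on this page; the actual `mod_parallel.F90` of the tutorial contains further variables and helper routines.
{{{
MODULE mod_parallel
! Minimal sketch of the parallelization variables for PDAF.
! The tutorial's mod_parallel.F90 contains additional variables
! and helper routines.
  IMPLICIT NONE
  SAVE

  INTEGER :: COMM_model              ! MPI communicator of one model task
  INTEGER :: COMM_filter             ! MPI communicator of the filter processes
  INTEGER :: COMM_couple             ! MPI communicator for model-filter coupling
  INTEGER :: mype_model, npes_model  ! Rank and number of processes in COMM_model
  INTEGER :: n_modeltasks = 1        ! Number of parallel model tasks
END MODULE mod_parallel
}}}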

Like many numerical models, PDAF uses the MPI standard for the parallelization. For the case of a parallelized model, we assume in the description below that the model is parallelized using MPI. If the model is parallelized using OpenMP, one can follow the explanations for a non-parallel model below.

As explained on the page on the [wiki:ImplementationConceptOnline Implementation concept of the online mode], PDAF supports a 2-level parallelization. First, the numerical model can be parallelized and can be executed using several processors. Second, several model tasks can be computed in parallel, i.e. a parallel ensemble integration can be performed. We need to configure the parallelization so that more than one model task can be computed.

There are two possible cases regarding the parallelization when enabling the 2-level parallelization:
 1. The model itself is parallelized using MPI.
   * In this case we need to adapt the parallelization of the model.
 1. The model is not parallelized, or it uses only shared-memory parallelization with OpenMP.
   * In this case we need to add a parallelization.


== The adaptations in short ==

If you are experienced with MPI, the steps are the following (a code sketch follows the list):
 1. Find the call to `MPI_init`.
 2. Check whether the model uses `MPI_COMM_WORLD`.
   * If yes, then replace `MPI_COMM_WORLD` in all places except `MPI_abort` and `MPI_finalize` by a user-defined communicator, e.g. `COMM_mymodel`, which can be initialized as `COMM_mymodel=MPI_COMM_WORLD`.
   * If no, then take note of the name of the communicator variable (we assume it is `COMM_mymodel`).
 3. Insert the call to `init_parallel_pdaf` directly after `MPI_init`.
 4. Adapt `init_parallel_pdaf` so that at the end of this routine you set `COMM_mymodel=COMM_model`. Potentially, also replace the rank and size variables by `mype_model` and `npes_model`, respectively.
 5. The number of model tasks in the variable `n_modeltasks` is required by `init_parallel_pdaf` to perform the communicator splitting. In the tutorial code we added command-line parsing to set this variable (it parses for `dim_ens`); one could also read the value from a configuration file.
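
As an illustration, a sketch of the adapted beginning of a model code is shown below. The names `ierr`, `rank`, and `size` and the argument list `(dim_ens, screen)` of `init_parallel_pdaf` are assumptions here, following the template code and the list above:
{{{
      ! Sketch of an adapted model startup (steps 1-4 of the list above)
      CALL MPI_Init(ierr)

      ! Step 2: user-defined communicator instead of MPI_COMM_WORLD
      COMM_mymodel = MPI_COMM_WORLD

      ! Step 3: initialize the 2-level parallelization of PDAF directly
      ! after MPI_init; this splits the processes into model tasks
      CALL init_parallel_pdaf(dim_ens, screen)

      ! Step 4 is done at the end of init_parallel_pdaf itself:
      !   COMM_mymodel = COMM_model
      !   rank         = mype_model
      !   size         = npes_model

      ! All further model communication then uses COMM_mymodel
      CALL MPI_Comm_Rank(COMM_mymodel, rank, ierr)
      CALL MPI_Comm_Size(COMM_mymodel, size, ierr)
}}}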


== Adapting a parallelized model ==

If the online mode is implemented with a parallelized model, one has to ensure that the parallelization can be split to perform the parallel ensemble forecast. For this, one has to check the model source code and potentially adapt it.

Any program parallelized with MPI needs to call `MPI_Init` for the initialization of MPI. Frequently, the parallelization of a model is initialized by the lines:
{{{
      CALL MPI_Init(ierr)
      CALL MPI_Comm_Rank(MPI_COMM_WORLD, rank, ierr)
      CALL MPI_Comm_Size(MPI_COMM_WORLD, size, ierr)
}}}
Here, the call to `MPI_init` is mandatory, while the two other lines are optional, but common.

In the model code, one has to find the place where `MPI_init` is called to check how the parallelization is set up. In particular, we have to check whether the parallelization is ready to be split into model tasks.

The call to `MPI_init` initializes the parallel region of an MPI-parallel program. This call initializes the communicator `MPI_COMM_WORLD`, which is pre-defined by MPI to contain all processes of the MPI-parallel program. Often, models use only this communicator to control all MPI communication. However, as `MPI_COMM_WORLD` contains all processes of the program, this approach will not allow for parallel model tasks.
Next, one has to check whether the model uses `MPI_COMM_WORLD` directly in its communication calls or whether it already uses a model-specific communicator.

In order to allow parallel model tasks, it is required to replace `MPI_COMM_WORLD` by an alternative communicator that is split for the model tasks. We will denote this communicator `COMM_model`. If a model code already uses a communicator distinct from `MPI_COMM_WORLD`, it should be possible to use that communicator.
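
To illustrate what splitting for model tasks means, below is a simplified sketch of such a communicator splitting. It assumes an even distribution of the processes, and the variable names are chosen for illustration; the actual `init_parallel_pdaf` also creates the communicators for the filter and the coupling:
{{{
      ! Simplified sketch: divide all processes into n_modeltasks groups,
      ! each group forming the COMM_model of one model task
      CALL MPI_Comm_Rank(MPI_COMM_WORLD, mype_world, ierr)
      CALL MPI_Comm_Size(MPI_COMM_WORLD, npes_world, ierr)

      ! Number of processes per model task (assumed to divide evenly)
      npes_model = npes_world / n_modeltasks
      task_id = mype_world / npes_model + 1

      ! Processes with the same task_id form one COMM_model
      CALL MPI_Comm_Split(MPI_COMM_WORLD, task_id, mype_world, COMM_model, ierr)
      CALL MPI_Comm_Rank(COMM_model, mype_model, ierr)
}}}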

Subsequently, one can define `COMM_model` and use it in the model's communication calls in place of `MPI_COMM_WORLD`.
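
A minimal sketch of the initial definition, assuming that `COMM_model` is later overwritten by the splitting in `init_parallel_pdaf`:
{{{
      ! Before the splitting, COMM_model simply contains all processes;
      ! init_parallel_pdaf later replaces it by the communicator of
      ! a single model task
      COMM_model = MPI_COMM_WORLD
}}}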

The splitting performed in `init_parallel_pdaf` yields three communicators, which are illustrated in the figure below:
 * `COMM_model` - defines the groups of processes that each integrate one model task
 * `COMM_filter` - defines the group of processes that perform the filter analysis step
 * `COMM_couple` - defines the groups of processes that are involved when data are transferred between the model and the filter

    6290[[Image(//pics/communicators_PDAFonline.png)]]