Changes between Version 32 and Version 33 of AdaptParallelization
Timestamp: May 20, 2025, 10:52:48 AM
AdaptParallelization
== Overview ==

The PDAF release provides example code for the online mode in `tutorial/online_2D_serialmodel` and `tutorial/online_2D_parallelmodel`. We refer to this code and use it as a basis for the description below.

Like many numerical models, PDAF uses the MPI standard for the parallelization. For the case of a parallelized model, we assume in the description below that the model is parallelized using MPI.

As explained on the page on the [wiki:ImplementationConceptOnline Implementation concept of the online mode], PDAF supports a 2-level parallelization. First, the numerical model can be parallelized and can be executed using several processors. Second, several model tasks can be computed in parallel, i.e. a parallel ensemble integration can be performed. This 2-level parallelization is initialized by the routine `init_parallel_pdaf`. The templates-directory `templates/` contains the file `init_parallel_pdaf.F90` that can be used as a template for the initialization.
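The 2-level decomposition can be illustrated with a small sketch (Python is used here purely for illustration; the function name `split_ranks` and the assumption of equally sized model tasks are ours, not part of PDAF):

```python
def split_ranks(n_procs, n_tasks):
    """Partition world ranks 0..n_procs-1 into n_tasks parallel model tasks.

    Assumes n_procs is divisible by n_tasks; this sketch keeps the
    even-split case for simplicity.
    """
    if n_procs % n_tasks != 0:
        raise ValueError("sketch assumes an even split")
    per_task = n_procs // n_tasks
    return [list(range(t * per_task, (t + 1) * per_task))
            for t in range(n_tasks)]

# 4 processes running 2 parallel model tasks:
print(split_ranks(4, 2))  # [[0, 1], [2, 3]]
```

Each inner list corresponds to one group of `COMM_model`: ranks 0 and 1 integrate the first ensemble task, ranks 2 and 3 the second.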
The required variables are defined in `mod_parallel.F90`, which is stored in the same directory and can also be used as a template.

If the numerical model itself is parallelized, this parallelization has to be adapted and modified for the 2-level parallelization of the data assimilation system generated by adding PDAF to the model. The necessary steps are described below.


== Three communicators ==

MPI uses so-called 'communicators' to define groups of parallel processes. These groups can then conveniently exchange information. In order to provide the 2-level parallelism for PDAF, three communicators are initialized that define the processes that are involved in different tasks of the data assimilation system.
The required communicators are initialized in the routine `init_parallel_pdaf`. They are called
 * `COMM_model` - defines the groups of processes that are involved in the model integrations (one group for each model task)
 * `COMM_filter` - defines the group of processes that perform the filter analysis step
 * `COMM_couple` - defines the groups of processes that are involved when data are transferred between the model and the filter
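To see how the three groupings interlock, the following sketch computes, for each process rank, its model task, whether it joins the filter processes, and its coupling group. This is only a hypothetical model of the grouping logic (it assumes equally sized model tasks and the filter running on the processes of the first model task), not the actual code of `init_parallel_pdaf`:

```python
def communicator_groups(n_procs, n_tasks):
    """Return, per rank, the tuple (model_task, in_filter, couple_group).

    model_task:   index of the model task (the rank's COMM_model group)
    in_filter:    True if the rank belongs to COMM_filter (assumed here:
                  the processes of the first model task)
    couple_group: ranks at the same position within different tasks share
                  one COMM_couple group for ensemble data exchange
    """
    per_task = n_procs // n_tasks  # assumes an even split
    return [(r // per_task, r // per_task == 0, r % per_task)
            for r in range(n_procs)]

# 4 processes split into 2 model tasks:
for rank, groups in enumerate(communicator_groups(4, 2)):
    print(rank, groups)
```

With 4 processes and 2 tasks, ranks 0 and 1 form the filter group, while ranks 0/2 and 1/3 form the two coupling groups.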
The parallel region of an MPI-parallel program is initialized by calling `MPI_init`. This initializes the communicator `MPI_COMM_WORLD`, which is pre-defined by MPI to contain all processes of the MPI-parallel program. Often it is sufficient to conduct all parallel communication using only `MPI_COMM_WORLD`. Thus, numerical models often use only this communicator to control all communication. However, as `MPI_COMM_WORLD` contains all processes of the program, this approach does not allow for parallel model tasks.

In order to allow parallel model tasks, it is required to replace `MPI_COMM_WORLD` by an alternative communicator that is split for the model tasks. We will denote this communicator `COMM_model`. If a model code already uses a communicator distinct from `MPI_COMM_WORLD`, it should be possible to use that communicator.

[[Image(//pics/communicators_PDAFonline.png)]]

== Using COMM_model ==

Frequently, the parallelization of a model is initialized in the model by the lines:
{{{
CALL MPI_Init(ierr)
CALL MPI_Comm_Rank(MPI_COMM_WORLD, rank, ierr)
CALL MPI_Comm_Size(MPI_COMM_WORLD, size, ierr)
}}}
(The call to `MPI_init` is mandatory, while the two other lines are optional.) If the model itself is not parallelized, the MPI-initialization will not be present.
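The splitting of a communicator is done with `MPI_Comm_split`, which assigns each process to a sub-communicator according to a 'color' and orders the processes by a 'key'. The following plain-Python sketch only emulates these semantics for illustration; it is not an MPI binding:

```python
def comm_split(ranks, color, key):
    """Emulate MPI_Comm_split: processes with equal color form one new
    communicator; within it, the new ranks follow the key (ties broken
    by the old rank, as MPI prescribes)."""
    groups = {}
    for r in ranks:
        groups.setdefault(color[r], []).append(r)
    return {c: sorted(members, key=lambda r: (key[r], r))
            for c, members in groups.items()}

# Deriving COMM_model groups for 4 processes and 2 model tasks
# (color = model task index, key = old rank):
color = {0: 0, 1: 0, 2: 1, 3: 1}
key = {r: r for r in range(4)}
print(comm_split(range(4), color, key))  # {0: [0, 1], 1: [2, 3]}
```

Each dictionary entry corresponds to one group of `COMM_model`; every process only sees the group it belongs to.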
Please see the section '[#Non-parallelmodels Non-parallel models]' below for this case.

Subsequently, one can define `COMM_model` by
{{{
COMM_model = MPI_COMM_WORLD
}}}
In addition, the variable `COMM_model` has to be declared in a way such that all routines using the communicator can access it. The parallelization variables of the model are frequently held in a Fortran module. In this case, it is easiest to add `COMM_model` as an integer variable there. (The tutorial declares `COMM_model` and other parallelization-related variables in `mod_parallel.F90`.)

Having defined the communicator `COMM_model`, the communicator `MPI_COMM_WORLD` has to be replaced by `COMM_model` in all routines that perform MPI communication, except in the calls to `MPI_init`, `MPI_finalize`, and `MPI_abort`.
The changes described so far must not influence the execution of the model itself. Thus, after these changes, one can run the model to ensure that it still compiles and runs correctly.

== Initializing the communicators ==