Changes between Version 1 and Version 2 of AddingMemoryandTimingInformation_PDAF23


Timestamp:
Jun 3, 2025, 1:57:45 PM (3 days ago)
Author:
lnerger
Comment:

--

  • AddingMemoryandTimingInformation_PDAF23

    v1 v2  
    2020PDAF provides functions to display the memory required by the arrays allocated inside PDAF. In addition, information about the execution duration of different parts of PDAF can be displayed. This information can be obtained by calling the routine `PDAF_print_info`.
    2121
     22The calls described here are implemented in `finalize_pdaf.F90` in the template and tutorial codes. One can directly use these routines without changes.
     23
    2224== Displaying memory information ==
    2325
    2426Information about the memory required by PDAF through allocated arrays can be obtained by inserting into the program the line
    2527{{{
    26   CALL PDAF_print_info(2)
     28  IF (mype_world==0) CALL PDAF_print_info(10)
    2729}}}
    2830The function displays the following information
    29  * Memory required for the ensemble array, state vector, and matrix '''Uinv'''
     31 * Memory required for the ensemble array, state vector, and matrix '''Ainv'''
    3032 * Memory required by the analysis step
    3133 * Memory required to perform the ensemble transformation
     
    4143}}}
    4244
    43 Currently only the memory required by the first process of the filter processes is displayed. Thus the total required memory should be the displayed memory multiplied by the number of processes in `COMM_filter`.
     45This memory information shows only the memory required by a single filter process. In the example codes, this is the process with `mype_world=0`. One can also display the overall allocated memory by adding
     46{{{
     47  CALL PDAF_print_info(11)
     48}}}
     49to the routine `finalize_pdaf`.
    4450
    4551== Displaying timing information ==
     
    6167More detailed output is obtained with
    6268{{{
    63   CALL PDAF_print_info(3)
     69  IF (mype_world==0) CALL PDAF_print_info(3)
    6470}}}
    6571which will display timing information of each of the call-back routines. E.g. for the LESTKF this might look like:
    6672{{{
    67   PDAF            PDAF Timing information - call-back routines
    68   PDAF        ----------------------------------------------------
    69   PDAF          Initialize PDAF:                     2.007 s
    70   PDAF            init_ens_pdaf:                       2.004 s
    71   PDAF          Ensemble forecast:                 571.850 s
    72   PDAF            MPI communication in PDAF:           0.004 s
    73   PDAF            distribute_state_pdaf:               0.140 s
    74   PDAF            collect_state_pdaf:                  0.001 s
    75   PDAF          LESTKF analysis:                    12.654 s
    76   PDAF            PDAF-internal operations:           10.360 s
    77   PDAF            init_n_domains_pdaf:                 0.000 s
    78   PDAF            init_dim_obs_f_pdaf:                 1.091 s
    79   PDAF            obs_op_f_pdaf:                       0.022 s
    80   PDAF            init_dim_l_pdaf:                     0.001 s
    81   PDAF            init_dim_obs_l_pdaf:                 1.136 s
    82   PDAF            g2l_state_pdaf:                      0.003 s
    83   PDAF            g2l_obs_pdaf:                        0.007 s
    84   PDAF            init_obs_l_pdaf:                     0.004 s
    85   PDAF            prodRinvA_l_pdaf:                    0.023 s
    86   PDAF            l2g_state_pdaf:                      0.002 s
    87   PDAF          prepoststep_pdaf:                   91.396 s
     73PDAF            PDAF Timing information - call-back routines
     74PDAF        ----------------------------------------------------
     75PDAF          Initialize PDAF:                     1.552 s
     76PDAF            init_ens_pdaf:                       1.526 s
     77PDAF          Ensemble forecast:               23847.693 s
     78PDAF            MPI communication in PDAF:         666.890 s
     79PDAF            distribute_state_pdaf:               2.153 s
     80PDAF            collect_state_pdaf:                  0.427 s
     81PDAF          LESTKF analysis:                   191.429 s
     82PDAF            PDAF-internal operations:          157.618 s
     83PDAF            OMI-internal routines:               1.524 s
     84PDAF            init_n_domains_pdaf:                 0.000 s
     85PDAF            init_dim_l_pdaf:                     0.127 s
     86PDAF            g2l_state_pdaf:                      5.190 s
     87PDAF            l2g_state_pdaf:                      3.087 s
     88PDAF            Time in OMI observation module routines
     89PDAF              init_dim_obs_pdafomi:              8.880 s
     90PDAF              obs_op_pdafomi:                    3.913 s
     91PDAF              init_dim_obs_l_pdafomi:           10.750 s
     92PDAF          prepoststep_pdaf:                 9422.757 s
    8893}}}
    89 This example is from one of our real data assimilation application. Most of the time is spent in for ensmeble forecast. The second most time is spent in `prepoststep_pdaf`, which is mainly due to the writing of large output files.
    90 The analysis step  (line `LESTKF analysis`) took only 12.65s. Most of this time was spent for computations inside PDAF (line `PDAF-interal operations`, 10.36s), while also `init_dim_obs_f_pdaf` (the initialization of observation information) and `init_dim_obs_l_pdaf` (the search for observations within the localization cut-off radius) took some time.
     94This example is from one of our real data assimilation applications, in which 13 analysis steps were performed in this run. Most of the time is spent in the ensemble forecast. The second-largest share is spent in `prepoststep_pdaf`, which is mainly due to the parallel writing of large output files in the binary netCDF file format.
     95The analysis steps (line `LESTKF analysis`) took only 191.429s in total. Most of this time was spent on computations inside PDAF (line `PDAF-internal operations`, 157.618s), while `init_dim_obs_l_pdafomi` (the search for observations within the localization cut-off radius, 10.75s) and `init_dim_obs_pdafomi` (the initialization of observation information, 8.88s) also took some time.
    9196
    9297If significant time is spent in one or several of the call-back routines, this gives an indication of which routines might have potential for optimization.
    9398
    94 More detailed information in time spend in different parts of the filter algorithm itself can be obtained using a value of 4 or 5 in the call to `PDAF_print_info`. Only the time from the first process of the filter processes is displayed. However, the time for each process should be similar.
     99More detailed information on the time spent in different parts of the filter algorithm itself can be obtained using a value of 4 or 5 in the call to `PDAF_print_info`. Only the time from the first process of the filter processes is displayed. However, the time for each process should be similar. If one performs the call without `IF (mype_world==0)`, each process writes its timing information.
    95100
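As a rough illustration of how the calls from version 2 of the page might sit together, the following sketch collects them in one place. It is not taken from `finalize_pdaf.F90`. The program name `print_info_sketch` is hypothetical, and `PDAF_print_info` is replaced here by a local stand-in that only reports the requested level, so that the example compiles without PDAF or MPI. The variable `mype_world` is set by hand, on the assumption that it holds the rank of the process as in the example codes.

```fortran
! Sketch only: the routine PDAF_print_info below is a local stand-in,
! so that this example compiles without PDAF or MPI. In a real program
! the routine is provided by PDAF.
PROGRAM print_info_sketch

  IMPLICIT NONE

  INTEGER :: mype_world   ! Assumed: rank of this process, set by hand here

  mype_world = 0

  ! Memory required by a single filter process
  IF (mype_world==0) CALL PDAF_print_info(10)

  ! Overall allocated memory (called without the rank check, as on the page)
  CALL PDAF_print_info(11)

  ! Timing information of the call-back routines
  IF (mype_world==0) CALL PDAF_print_info(3)

CONTAINS

  ! Hypothetical stand-in for the PDAF routine: it only prints the level
  SUBROUTINE PDAF_print_info(printtype)
    INTEGER, INTENT(in) :: printtype
    WRITE (*,'(a,i3)') 'PDAF_print_info called with level ', printtype
  END SUBROUTINE PDAF_print_info

END PROGRAM print_info_sketch
```

In an actual application the stand-in routine would be dropped and the three calls placed in `finalize_pdaf`, where `mype_world` is already available.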