Changes between Version 87 and Version 88 of FeaturesofPdaf


Timestamp:
May 18, 2025, 11:38:25 AM (6 hours ago)
Author:
lnerger
Comment:

--

  • FeaturesofPdaf

    v87 v88  
    1212
    1313 1. PDAF provides fully implemented, parallelized, and optimized ensemble-based algorithms for data assimilation. Currently, ensemble-based Kalman filters like the LETKF, LESTKF, and EnKF methods as well as nonlinear filters are provided. Starting from PDAF V2.0, 3D-variational methods are also provided.
    14  1. PDAF is attached to the model source code by minimal changes to the code, which we call 'online mode'. These changes only concern the general part of the code, but not the numerics of the model. In addition, a small set of routines is required that are specific to the model or the observations to be assimilated. These routines can be implemented like routines of the model.
     14 1. PDAF provides two variants to build a data assimilation system:
     15  1. PDAF can be attached to the model source code by minimal changes to the code, which we call ``online mode``. These changes only concern the general part of the code, but not the numerics of the model. In addition, a small set of routines is required that are specific to the model or the observations to be assimilated. These routines can be implemented like routines of the model.
     16  1. PDAF also offers an ``offline mode``. This is for the case that you don't want to (or even cannot) modify your model source code at all. In the offline mode, PDAF is compiled separately from the model together with the supporting routines to handle the observations. Then, the model and the assimilation step are executed separately. This approach is simpler to implement than the ``online mode``, but it is computationally less efficient.
    1517 1. PDAF is called through a well-defined standard interface. This makes it possible, for example, to switch between the LETKF, LESTKF, and LSEIK methods without additional coding.
    1618 1. PDAF provides parallelization support for the data assimilation system. If your numerical model is already parallelized, PDAF enables the data assimilation system to run several model tasks in parallel within a single executable. However, PDAF can also be used without parallelization, for example to test small systems.
    17  1. PDAF does not require that your model can be called as a subroutine. Rather PDAF is added to the model and the formed data assimilation system can be executed pretty much like the model-program would without data assimilation.
    18  1. PDAF also offers an offline mode. This is for the case that you don't want to (or even cannot) modify your model source code at all. In the offline mode, PDAF is compiled separately from the model together with the supporting routines to handle the observations. Then the model and the assimilation step are executed separately. While this strategy is possible, we don't recommend it, because it's computationally less efficient.
    19  1. Starting with PDAF 1.13, the PDAF release also provides bindings to couple PDAF with selected real models. As of PDAF 1.15, modelbindings for the MITgcm ocean circulation model and for the AWI Climate Model (AWI-CM, a coupled model consisting of ECHAM (atmophsere) and FESOM (ocean)) are provided.
     19 1. PDAF does not require that your model can be called as a subroutine. Rather, PDAF is added to the model and the formed data assimilation system can be executed pretty much like the model-program would without data assimilation.
     20 1. The PDAF release also provides bindings to couple PDAF with selected real models. Such modelbindings are, e.g., available for the MITgcm and the NEMO ocean circulation models, for the AWI Climate Model (AWI-CM, a coupled model consisting of ECHAM (atmosphere) and FESOM (ocean)), and for the Weather Research and Forecasting Model (WRF). See the [wiki:ModelsConnectedToPDAF list of models that were already coupled to PDAF] for an overview.
    2021
     22== Data Assimilation Methods ==
    2123
    22 == Filter algorithms ==
     24PDAF provides the following methods for data assimilation. All assimilation methods are fully implemented, optimized, and parallelized. In addition, all ensemble-based methods offer an Ensemble-OI mode in which only a single ensemble state needs to be integrated.
    2325
    24 PDAF provides the following algorithms for data assimilation. All filters are fully implemented, optimized and parallelized. In addition, all filters offer an Ensemble-OI mode in which only a single emseble state needs to be integrated
     26=== Ensemble filters and smoothers ===
    2527
    26 Local filters:
     28Local ensemble filters:
    2729 * LETKF (Hunt et al., 2007)
    2830 * LESTKF (Local Error Subspace Transform Kalman Filter, Nerger et al., 2012, [PublicationsandPresentations see publications])
     
    3234 * LKNETF (Local Kalman-nonlinear Ensemble Transform Filter, Nerger, 2022, [PublicationsandPresentations see publications], added in PDAF V2.1)
    3335
    34 Global filters:
     36Global ensemble filters:
    3537 * ESTKF (Error Subspace Transform Kalman Filter, Nerger et al., 2012, [PublicationsandPresentations see publications])
    3638 * ETKF (The implementation follows Hunt et al. (2007) but without localization, which is available in the LETKF implementation)
     
    4749 * NETF & LNETF
    4850
     51=== 3D variational methods ===
     52
    4953Starting from Version 2.0 of PDAF, 3D variational methods are also provided. The 3D-Var methods are implemented in incremental form using a control vector transformation (following the review by R. Bannister, Q. J. Roy. Meteorol. Soc., 2017) in three different variants:
    5054 * 3D-Var - 3D-Var with parameterized covariance matrix
     
    5458== Requirements ==
    5559
    56  * '''Compiler'''[[BR]]To compile PDAF a Fortran compiler is required which supports Fortran 2003. PDAF has been tested with a variety of compilers like gfortran, ifort, xlf, pgf90, cce.
    57  * '''BLAS''' and '''LAPACK'''[[BR]]The BLAS and LAPACK libraries are used by PDAF. For Linux there are usually packages with these libraries. With commercial compilers the functions are usually provided by optimized libraries (like MKL, ESSL).
    58  * '''MPI''' [[BR]] An MPI library is required (e.g. OpenMPI). [For the PDAF versions before V2.0, the assimilation program can also be compiled and run without parallelization. For this, PDAF <2.0 provides functions that mimic MPI operations for a single process.]
     60 * '''Compiler'''[[BR]]To compile PDAF, a Fortran compiler is required which supports Fortran 2003. PDAF has been tested with a variety of compilers like gfortran, ifort, nfort.
     61 * '''BLAS''' and '''LAPACK'''[[BR]]The BLAS and LAPACK libraries are used by PDAF. For Linux there are usually packages that provide these libraries. With commercial compilers the functions are usually provided by optimized libraries (like MKL, ESSL).
     62 * '''MPI''' [[BR]] An MPI library is required (e.g. OpenMPI).
    5963 * '''make'''[[BR]]PDAF provides Makefile definitions for different compilers and operating systems.
    6064
     
    6266
    6367PDAF has been tested on various machines with different compilers and MPI libraries. Current test machines include:
    64  * Linux Desktop machine, Ubuntu, ifort compiler
    65  * Linux Desktop machine, Ubuntu, gfortran, OpenMPI
    66  * Notebook Apple !MacBook, Mac OS X, gfortran, OpenMPI
    67  * Atos cluster 'Lise' at HLRN (Intel Cascade Lake processors), ifort, IMPI
    68  * Windows 10 with Cygwin, gfortran, OpenMPI
     68 * Linux Desktop computer, Ubuntu, gfortran, OpenMPI
     69 * Notebook Apple !MacBook, MacOS, gfortran, OpenMPI
     70 * Atos cluster 'Lise' at HLRN (Intel Cascade Lake processors), ifort, IMPI and OpenMPI
     71 * Windows with Cygwin, gfortran, OpenMPI
     72 * NEC SX-Aurora vector computer, nfort, NEC MPI
    6973Past test machines also included:
    7074 * NEC SX-ACE, sxf90 compiler (rev 530), sxmpi
     
    7983== Test cases ==
    8084
    81 The regular tests use a rather small configuration with a simulated model. This model is also included in the test suite of the downloadable PDAF package.
    82 In addition, the scalability of PDAF was examined with a real implementation with the finite element ocean model (FEOM, Danilov et al., A finite-element ocean model: Principles and evaluation. Ocean Modeling 6 (2004) 125-150). In these tests up to 4800 processor cores of a supercomputer have been used (see [PublicationsandPresentations Nerger and Hiller (2013)]). In [PublicationsandPresentations Nerger et al., GMD (2020)], the scalability was assessed up to 12144 processor cores for the coupled atmosphere-ocean model AWI-CM (Sidorenko et al., 2015).
     85The regular tests use a rather small configuration with a simulated model. This model is provided in the PDAF tutorial code of the release.
     86In addition, the scalability of PDAF was examined with a real implementation with the finite element sea-ice ocean model (FESOM). In these tests up to 4800 processor cores of a supercomputer have been used (see [PublicationsandPresentations Nerger and Hiller (2013)]). In [PublicationsandPresentations Nerger et al., GMD (2020)], the scalability was assessed up to 12144 processor cores for the coupled atmosphere-ocean model AWI-CM (Sidorenko et al., 2015). Also, [PublicationsandPresentations Kurtz et al., GMD (2016)] assessed the parallel performance up to 32768 processor cores for the TerrSysMP terrestrial model system.
    8387
    84 To examine PDAF's behavior with large-scale cases, experiments with the simulated model have been performed. By now the biggest case had a state dimension of 8.64^.^10^11^. An observation vector of size 1.73^.^10^10^ was assimilated. For these experiments, the computations used 57600 processor cores. In this case, the dimensions were limited by the available memory of the compute nodes. Using an ensemble of 25 states, the distributed ensemble array occupied about 2.9 GBytes of memory for each core (about 165 TBytes in total).
    85 
     88To examine PDAF's behavior with large-scale cases, experiments with the simulated model have been performed. By now the biggest case had a state dimension of 8.64^.^10^11^. An observation vector of size 1.73^.^10^10^ was assimilated. For these experiments, the computations used 57600 processor cores. In this case, the dimensions were limited by the available memory of the compute nodes. Using an ensemble of 25 states, the distributed ensemble array occupied about 2.9 GBytes of memory for each core (about 165 TBytes in total).
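
The memory figures quoted above can be roughly reproduced with a short calculation. The sketch below is only an order-of-magnitude check, not part of PDAF. It assumes 8-byte (double precision) values and a state vector that is split evenly across all cores, with each core holding its slice of every ensemble member. Neither assumption is stated in the text, and all variable names are illustrative.

```python
# Back-of-the-envelope memory estimate for the distributed ensemble array.
# Assumptions (not stated in the text): 8-byte reals, state vector split
# evenly across all cores, each core holding its slice of every member.

STATE_DIM = 864 * 10**9      # 8.64e11 state variables
ENSEMBLE_SIZE = 25           # number of ensemble states
N_CORES = 57600              # processor cores used in the experiment
BYTES_PER_VALUE = 8          # assumed double precision

# Number of state variables held by one core
local_dim = STATE_DIM // N_CORES

# Ensemble array size on one core and summed over all cores
per_core_bytes = local_dim * ENSEMBLE_SIZE * BYTES_PER_VALUE
total_bytes = per_core_bytes * N_CORES

print(f"state values per core:   {local_dim:,}")
print(f"ensemble array per core: {per_core_bytes / 1e9:.2f} GB "
      f"({per_core_bytes / 2**30:.2f} GiB)")
print(f"ensemble array in total: {total_bytes / 1e12:.1f} TB "
      f"({total_bytes / 2**40:.1f} TiB)")
```

Under these assumptions each core holds 15 million state values, and the ensemble array takes about 3.0 GB (2.8 GiB) per core and about 173 TB (157 TiB) in total. This is in the same range as the quoted 2.9 GBytes and 165 TBytes. The remaining difference may come from the unit convention or rounding used for the quoted figures, which the text does not specify.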