Version 48 (modified by lnerger, 21 months ago)

Release notes for PDAF

Version 2.1 - February 21, 2023

Note on a change affecting compatibility with previous versions of PDAF

  • We moved the memory output option in PDAF_print_info from value 2 to 10 to avoid mixing timing and memory outputs

Major additions:

  • Added the hybrid Kalman-Nonlinear Ensemble Transform Filter (LKNETF), see L. Nerger (2022), Data assimilation for nonlinear systems with a hybrid nonlinear-Kalman ensemble transform filter. Q. J. Meteorol. Soc., 148, 620-640, doi:10.1002/qj.4221
  • Added debug outputs for PDAF. Using the routine PDAF_set_debug_flag one can activate debug output printed by the PDAF code routines. This should help to find issues in the user-supplied call-back routines when implementing PDAF with some model or extending an implementation. The debug option is described at https://pdaf.awi.de/trac/wiki/PDAF_debugging

Changes:

  • Added output option for globally allocated memory. This can be printed by calling PDAF_print_info(11)
  • Moved output option for process-local allocated memory from PDAF_print_info(2) to PDAF_print_info(10)
  • Added option for NETF and LNETF to add random noise perturbations to the analysis ensemble analogous to the PF
  • Extended the available configuration parameters for 3D-Var methods. Now parameters controlling the solver methods can be chosen
  • Change of behavior for NETF/LNETF/PF in the case that all weights are zero. Now, the weights are reset to 1/ensemble size and a warning is shown (before PDAF exited with an error message)
  • Added tutorial models without PDAF to allow users to perform the implementation themselves following the tutorial description.
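
The changed zero-weight behavior of NETF/LNETF/PF noted above can be illustrated with a minimal sketch. This is not PDAF code; the function name normalize_weights is hypothetical, and the sketch only shows the idea of falling back to uniform weights of 1/ensemble size with a warning instead of stopping with an error.

```python
import warnings


def normalize_weights(weights):
    """Normalize particle weights; fall back to uniform weights if all are zero.

    Hypothetical helper for illustration only, not a PDAF routine.
    """
    n = len(weights)
    total = sum(weights)
    if total == 0.0:
        # All weights vanished: warn and reset to 1/ensemble size.
        warnings.warn("all weights are zero; resetting to 1/ensemble size")
        return [1.0 / n] * n
    return [w / total for w in weights]


if __name__ == "__main__":
    print(normalize_weights([0.0, 0.0, 0.0, 0.0]))
    print(normalize_weights([1.0, 3.0]))
```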

Code revisions:

  • Renamed the localization variables in example codes from local_range to cradius (c='cut-off') and srange to sradius (s='support')
  • Changed time stepping code of Lorenz model cases to avoid creation of temporary array by the compiler

Bug corrections (the bugs concern rather unusual use cases, so most users will not encounter these)

  • Correction of NETF and PF for the case when running with parallelization: For each process domain, only the domain-local observations were assimilated
  • PDAF-OMI related:
    • Correction in Lorenz96 with OMI: In obs_gp_pdafomi.F90 an incorrect observation dimension was used in call to PDAFomi_gather_obs if incomplete observations are used.
    • Correction using global filters with PDAF-OMI and running with parallelization: The program could get stuck if there were observations available but a process domain exists without observations (dim_obs_p=0).
    • Correction of likelihood computation in the global NETF and the PF for multiple observation types when using OMI. Here, the distinction between the total number of observations and the number of observations per observation type was missing
    • Correction of EnKF/LEnKF with OMI running with parallelization: For the case that dim_obs_p=0 for a process domain, but globally dim_obs_g>0, the array thisobs%ivar_obs_f was allocated with incorrect size


Previous versions

Version 2.0 - December 17, 2021

Note on significant changes influencing the compatibility with previous versions of PDAF

  • We modernized the MPI parallelization. With these changes the MPI stub library that we provided before, and which allowed compiling and running PDAF without an MPI library, is no longer usable. Thus, PDAF now requires an MPI library. This should not have an impact on most users given that today MPI is standard on all cluster computers and an MPI library is available and can be easily installed on virtually any Linux, MacOS or Windows system.
  • We renamed the PDAF library files with prefix PDAF-D_ to the prefix PDAF_ to ensure a consistent naming of the files. If you don't use the Makefile provided in the PDAF package, you likely need to adapt to this change.
  • The observation generation option (GENOBS) was moved from filtertype=11 to filtertype=100
  • Some model implementations (e.g. the Lorenz models) use netcdf for file writing/reading. Here we moved to the NF90 interface. This requires that the netcdf.mod module file of the netcdf library is installed

Changes:

  • Added 3D-Var methods
    • variants: 3D-Var with parameterized covariances, 3D ensemble var (ensemble covariances) and hybrid-var (combined parameterized and ensemble covariances); ensemble perturbations can be transformed using the global ESTKF or the local LESTKF
    • Added tutorials for 3D-Var methods (codes and slide set with explanations)
    • Added template files for 3D-Var methods
  • Added models from Lorenz (2005): model II (state with averaging), model III (two-scale); both with assimilation fully implemented
  • Added tutorial code showing a multivariate implementation with two model fields (/tutorial/online_2D_serialmodel_2fields). The tutorial demonstrates an efficient way to handle multiple model fields
  • Added possibility to reset the MPI world communicator in which PDAF operates (routine PDAF_set_comm_pdaf). This ensures compatibility with e.g. IO-servers that use processes separate from those used by the model integration.
  • Added possibility to let the user force that the analysis step is computed at the next call to PDAF_assimilate/PDAF_put_state (routine PDAF_force_analysis)
  • Added possibility to overwrite PDAF's ensemble member counting in the flexible parallelization variant and force execution of the analysis (PDAF_set_memberid)
  • Added routine to reset the value of the forgetting factor. Can be applied during the analysis step, e.g. to give each local analysis domain a different inflation value (PDAF_reset_forget)
  • Added diagnostic routine to compute the continuous ranked probability score, CRPS (PDAF_diag_crps)
  • Code revisions:
    • Renaming of routines: prefix PDAF-D_ replaced by PDAF_
    • modernized use of MPI: now we use 'USE MPI' instead of 'include mpif.h'. This solves the issue of gfortran 10, which claimed argument mismatches
    • modernized use of netCDF (with 'use netcdf' instead of 'include netcdf.inc'). This also resolves the issue of gfortran 10
    • observation generation (GENOBS) moved to filtertype=100
    • PDAF_sampleens now works without prior call to PDAF_init (removed memory counting)
    • removed the stub MPI library since it is not compatible with 'USE mpi'. Thus, compiling PDAF now requires an MPI library to be installed
  • PDAF-OMI related
    • Calling deallocate_obs_pdafomi is no longer required
    • Model bindings for MITgcm and AWI-CM revised for using PDAF-OMI
    • all model implementations (Lorenz models) now implemented using PDAF-OMI
    • Added OMI adjoint observation operators for use with 3D-Var methods
  • Bug corrections:
    • Correction of initialization of gcoords in tutorial for obs_C_pdafomi (observations with linear interpolation)
    • For compatibility with the Cray compiler, all .mod files of the PDAF library are copied to /include
    • resolved issue of gfortran-10 complaining about argument mismatch by 'USE mpi'
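
For readers unfamiliar with the continuous ranked probability score mentioned for PDAF_diag_crps above, the following minimal sketch computes the CRPS of a scalar ensemble against a single observation using the standard empirical-distribution formula (mean absolute error minus half the mean absolute difference between members). It assumes equally weighted members; the function name crps_ensemble is hypothetical, and the sketch makes no claim about how PDAF_diag_crps is implemented or called.

```python
def crps_ensemble(members, obs):
    """CRPS of an equally weighted scalar ensemble for one observation.

    Uses CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|.
    Hypothetical helper for illustration only, not a PDAF routine.
    """
    n = len(members)
    # Mean absolute error of the members with respect to the observation.
    term_obs = sum(abs(x - obs) for x in members) / n
    # Half the mean absolute difference between all member pairs.
    term_spread = sum(abs(xi - xj) for xi in members for xj in members) / (2.0 * n * n)
    return term_obs - term_spread


if __name__ == "__main__":
    print(crps_ensemble([0.0, 2.0], 1.0))
```

A lower CRPS indicates an ensemble that is both closer to the observation and appropriately spread; for a single member the score reduces to the absolute error.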

Version 1.16 - November 30, 2020

Changes:

  • Added PDAF-OMI (Observation Module Infrastructure). This is a new and improved way to implement observation handling with PDAF. For more information see Documentation on PDAF-OMI
  • Added tutorials and templates for PDAF-OMI: the default tutorial now describes the implementation with PDAF-OMI
  • Revised tutorials and templates to make several routines and files more generic
  • Added weights inflation to NETF/LNETF. This inflates the observation error to ensure that the effective sample size does not shrink below a defined value
  • Added 'schedule(runtime)' to OpenMP parallelization: With this it's now possible to define a 'schedule' as environment variable to tune the parallel performance at run time.
  • added some routines in nullmpi.F90 to simulate MPI-behavior for a single process
  • Bug corrections:
    • PDAF_gather_ens: add MPIstatus to allow correct compilation for blocking MPI
    • corrected skewness/kurtosis computation (not all array indices were used)
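
The idea behind the weights inflation for NETF/LNETF noted above can be sketched as follows. The effective sample size is computed as 1/sum(w_i^2) for normalized weights, and the observation-error variance is inflated until this value reaches a chosen minimum. The step-wise search, the Gaussian likelihood, and all function and parameter names are assumptions made for this illustration; the source does not describe how PDAF determines the inflation.

```python
import math


def effective_sample_size(weights):
    """Effective sample size 1/sum(w_i^2) of the normalized weights."""
    total = sum(weights)
    return 1.0 / sum((w / total) ** 2 for w in weights)


def weights_from_misfits(sq_misfits, obs_var, inflation=1.0):
    """Unnormalized Gaussian likelihood weights for squared misfits.

    The observation-error variance is multiplied by 'inflation'.
    The minimum misfit is subtracted for numerical stability.
    """
    m = min(sq_misfits)
    return [math.exp(-0.5 * (s - m) / (obs_var * inflation)) for s in sq_misfits]


def inflate_until(sq_misfits, obs_var, n_eff_min, factor=1.1, max_iter=1000):
    """Increase the inflation step-wise until the effective sample size
    is at least n_eff_min (assumed search strategy, not PDAF's)."""
    inflation = 1.0
    for _ in range(max_iter):
        w = weights_from_misfits(sq_misfits, obs_var, inflation)
        if effective_sample_size(w) >= n_eff_min:
            break
        inflation *= factor
    return inflation


if __name__ == "__main__":
    misfits = [0.0, 4.0, 9.0, 16.0]
    infl = inflate_until(misfits, obs_var=1.0, n_eff_min=2.0)
    print(infl, effective_sample_size(weights_from_misfits(misfits, 1.0, infl)))
```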

Version 1.15.1 - March 12, 2020

Changes:

  • Revised screen output of number of processes per process domain. Now this is only shown as debug information when setting screen>2
  • Local filters: Revised screen output of number of local analysis domains. The number for each process is now only shown as debug information when setting screen>2. Otherwise a single line with minimum/maximum/average number of local analysis domains is shown.
  • Particle filter: In case of zero weights, the filter no longer stops with an error message. Instead the weights are reset uniformly to 1/(ensemble size).
  • Lorenz-63 model implementation: Added a plot function to plot two states at once, e.g. to compare the true with the assimilated state estimate.
  • Tutorial: Added plot scripts in tutorial/plotting
  • Bug corrections:
    • Particle filter: The filter reused the same random numbers for resampling. Now, a different set of random numbers is used at different analysis times
    • Lorenz96 and 63 model implementations: corrected tools/generate_covar.F90: The variable stddev needed to be REAL, not INTEGER
    • Lorenz96 and 63 model implementations: updated in the plot scripts some directory settings which were not correctly changed when moving the models into the directory models/ in Version 1.15
    • Lorenz96 and 63 model implementations: For global filters the time index for reading observations in the call-back routines was corrected. There was a time-offset by one.

Version 1.15 - December 9, 2019

Changes:

  • New timer mode 3: This timer mode shows the times spent in the different call-back routines. This helps the user to determine which routines take most time and have hence the largest potential for optimization.
  • Given the new timer mode 3, the previous timer mode 3 is now shifted to 4 and the old timer mode 4 to 5.
  • New directory models/: We moved the implementations of the Lorenz-63 and Lorenz96 models with PDAF to the new directory models/. The former directory testsuite/ is now intended for validation tests, while the models that are fully implemented with PDAF reside in a separate directory
  • Revised validation scripts: In both the testsuite and the tutorial directories, one can run automated test runs whose outputs are compared with reference outputs using Python
  • PDAF_get_state and PDAF_put_state_X can now be called in flexible order, each having its own counting for the ensemble member index: Before it was required that a call to PDAF_get_state was followed by a matching call to PDAF_put_state_X. Now one can also call PDAF_get_state for all ensemble members at the beginning of the forecast phase. Later the matching calls to PDAF_put_state_X are done. It's only important to have the same number of calls. This change allows, e.g. to distribute and compute the forecasts on GPUs or let other load-balancing software distribute the forecasts.
  • Following the change in PDAF_get_state/PDAF_put_state_X, the routine PDAF_get_memberid was revised. The routine will return the index of the ensemble member according to the current (i.e. last) call to either PDAF_get_state or PDAF_put_state_X (In practice you should see no difference to before if you call the routines alternately)
  • The SEEK filter is declared deprecated, i.e. we plan to no longer support it in some future release. This mode-based filter usually shows a worse performance compared to the current ensemble filters. (Please let us know if you actually use the SEEK filter)
  • The tutorial case online_2d_parallelmodel_fullpar_1fpe now allows computing the ensemble forecast with a different number of processes for each model task.
  • For PDAF_eofcovar we clarified that the array 'states' is actually destroyed by the singular value decomposition computed in the routine. Accordingly, the meanstate is also no longer added to this array after the SVD.
  • Model-binding for AWI-CM: The directory modelbindings/ now also contains code to use PDAF with the model AWI-CM (the AWI climate model, a coupled atmosphere-ocean model consisting of the atmosphere model ECHAM6 and the ocean model FESOM).
  • Bug corrections:
    • The analysis step of the particle filter is corrected. The analysis ensemble was incorrectly computed for state dimensions >200.
    • To avoid an error message with rather old compilers, we added a 'save' statement to the variable screenout (This is required by OpenMP older than version 4.0, e.g. in gfortran older than version 4.9.1)

Version 1.14 - July 4, 2019

Changes:

  • Added functionality to generate synthetic observations to simplify the application of twin data assimilation experiments (filtertype=11).
  • Added a particle filter with importance resampling (filtertype=12)
  • Added an option for ETKF/SEIK/ESTKF to compute the innovation either from the mean of HX or from H(meanX). Only the latter was supported until now, but using mean(HX) is the correct approach for nonlinear observation operators. This option is specified as param_int(8) (see overview of available filter options)
  • Added timing information on the time required to collect and distribute the ensemble in the online implemented data assimilation.
  • Added more routines to simplify the implementation of parallelized local analysis steps when collecting full observations: 'PDAF_gather_obs_f_flex' and 'PDAF_gather_obs_f2_flex'. These are flexible variants of the routines introduced in version 1.13 as they don't rely on a call to 'PDAF_gather_dim_obs_f'.
  • All template routines now give an output starting with 'TEMPLATE'. Given that one should delete this output line when one implements a routine, this will help to keep track of how far a particular implementation is complete.
  • Revised Lorenz-63 model example and added data assimilation and plotting scripts to it. The use is analogous to the Lorenz-96 model, but without localized filters.
  • Bug corrections:
    • For the 'fullpar' case in which the filter runs on separate processes from the model tasks, we corrected the PDAF-internal parallelization setup, as the process-local ensemble size was not set correctly for task_id=0
    • For the 'fullpar' case, also corrected PDAF_get_state for SEEK. Now distribute_state is only executed for model tasks, but not for the separate filter task
    • Corrected an array allocation in PDAF_enkf_omega. This led to an error for the case that dim_ens-1 was not equal to the specified rank of the set of EOFs. This only happens when using this routine to generate a large ensemble for a small state dimension (as now used in the Lorenz-63 model test case).
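
The difference between the two innovation variants introduced in this version (mean of HX versus H(meanX)) can be seen in a small sketch with a scalar state. The quadratic observation operator, the ensemble values, and the function names are assumptions chosen for illustration; the sketch is independent of PDAF and of the param_int(8) setting.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def innovations(ensemble, obs, h):
    """Return (y - mean(H(x_i)), y - H(mean(x_i))) for a scalar ensemble.

    'h' is a user-chosen observation operator (hypothetical here).
    """
    d_mean_hx = obs - mean([h(x) for x in ensemble])
    d_h_meanx = obs - h(mean(ensemble))
    return d_mean_hx, d_h_meanx


if __name__ == "__main__":
    ens = [1.0, 2.0, 3.0]
    # Linear operator: both variants agree.
    print(innovations(ens, 5.0, lambda x: 2.0 * x))
    # Nonlinear operator (assumed example): the variants differ.
    print(innovations(ens, 5.0, lambda x: x * x))
```

For a linear operator both innovations coincide, which is why the distinction only matters for nonlinear observation operators.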

Version 1.13.2 - September 3, 2018

Changes:

  • Revised Lorenz-96 model example and added a detailed documentation on the web site. Now it is easier to use this fully featured implementation of PDAF with the Lorenz-96 model. Further, plotting scripts for both Matlab and Python are provided.
  • Bug corrections:
    • Corrected the smoother mode of LETKF, subtype 1 (Here the transform matrix for smoothing was not correctly computed)
    • Corrected PDAF_diag_ensstats (For the case that the statistics over all elements of the ensemble array were to be computed, still only element 1 was used)
    • Corrected testsuite/src/main/main.F90 (Here, we added a call to PDAF_deallocate in version 1.13 but did not enclose it in a preprocessor check whether PDAF is active. This led to a compile failure when compiling with deactivated PDAF)

Version 1.13.1 - March 12, 2018

Changes:

  • Added routine PDAF_get_assim_flag. This routine returns the information whether on the last call to a PDAF_assimilate routine the analysis step of a filter was actually computed, i.e. observations were assimilated. (This can be used e.g. in a model with leap frog time stepping to compute an Euler time step directly after the analysis step)
  • Bug corrections: We got notified by a user (thank you) that two bugs we announced to be fixed in Version 1.13 were actually not fixed. Unfortunately, the corrected files didn’t make it into the previous release. Now the following two bugs are really fixed:
    • Corrected order of arguments in 'PDAF_assimilate_lenkf'
    • Corrected 'PDAF_local_weight' (the weight was not always initialized to 0 for distances beyond the localization radius. This only happens if you set the localization radius to be larger than the support-radius of the weight function)
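
The situation behind the weight correction above can be pictured with a minimal sketch. It uses a simple linear taper as the weight function, which is an assumption for illustration and not one of PDAF's weight functions; the names cradius (cut-off radius) and sradius (support radius) follow the naming used in the example codes since Version 2.1. The point is the region between the two radii when cradius is larger than sradius, where the weight has to be set to zero explicitly.

```python
def local_weight(distance, cradius, sradius):
    """Localization weight with an assumed linear taper.

    Returns 1 at distance 0, decreases linearly to 0 at sradius,
    and is 0 for all larger distances. Illustration only, not PDAF code.
    """
    if distance > cradius:
        # Beyond the cut-off radius nothing is used.
        return 0.0
    if distance >= sradius:
        # Inside the cut-off radius but outside the support of the
        # weight function: the weight must still be exactly zero.
        return 0.0
    return 1.0 - distance / sradius


if __name__ == "__main__":
    for d in (0.0, 2.5, 7.0, 12.0):
        print(d, local_weight(d, cradius=10.0, sradius=5.0))
```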

Version 1.13 - February 6, 2018

Changes:

  • Added a model binding for the MITgcm ocean circulation model
  • Added routines to simplify the implementation of parallelized local analysis steps when collecting full observations: 'PDAF_gather_dim_obs_f', 'PDAF_gather_obs_f', 'PDAF_gather_obs_f2' (see updated tutorials on how to use these routines)
  • Updated the tutorials to reflect our updated implementation recommendations. For example, we now recommend to compute the distances of observations for the local analysis step only once
  • Add routine 'PDAF_deallocate' to deallocate the big internal arrays of PDAF at the end of a program. In the tutorials it is called in the new routine 'finalize_pdaf'.
  • In the tutorials, the variables in 'init_pdaf' are reordered to clearly separate the variables that are used in 'PDAF_init' from those that are only used in the call-back routines.
  • Added a verbosity flag for 'PDAF_sampleens' and 'PDAF_eofcovar' to make them better usable in parallel programs (note: the interface has changed)
  • Bug correction: Corrected order of arguments in 'PDAF_assimilate_lenkf'
  • Bug correction: Corrected 'PDAF_local_weight' (the weight was not always initialized to 0 for distances beyond the localization radius)
  • Bug correction: Corrected use of doexit flag in 'PDAF_get_state' (it was very unlikely to cause a problem)

Version 1.12 - December 21, 2016

Changes:

  • New filter method: LEnKF - The classical Ensemble Kalman Filter with perturbed observations (Evensen 1994) now with covariance localization (filtertype 8)
  • New filter: NETF (Nonlinear Ensemble Transform Filter by Toedter and Ahrens, Monthly Weather Review 143 (2015) 1347-1367), including smoother extension (filtertype 9)
  • New filter: LNETF - NETF with local analysis and observation localization, including smoother extension (filtertype 10)
  • revised memory counting to work with more than 2.1 GB per process
  • New routines for ensemble generation: PDAF_eofcovar and PDAF_sampleens. These routines simplify the generation of an ensemble with second-order exact sampling (see documentation on ensemble generation and documentation for each routine linked on that page)
  • New routines for ensemble diagnostics (histograms, skewness and kurtosis, effective sample size). (see documentation on data assimilation diagnostics and documentation for each routine linked on that page)
  • Bug correction: forgetting factor in EnKF smoother was treated incorrectly
  • Bug correction: SEEK filter in single-precision case showed an issue
  • Additional functionality for Lorenz-96 test case: model error can be added to the integration and incomplete observations are supported.
  • revised ensemble generation in testsuite for EnKF with dummy model: for ensembles larger than the state dimension a random sampling is now used, while for ensembles up to the size of the state the mean-preserving initialization is used as before.

Version 1.11.1 - February 28, 2015

This release fixes a few bugs in the compilation with Cray compilers and cleans up the make process.
Changes:

  • Bug fix: The Makefiles for the PDAF library and the main Makefile for the testsuite are corrected to allow for the correct compilation with Cray compilers (CCE).
  • The Makefiles for the testsuite cases are cleaned up to avoid the warning about the missing directory dummympi/ (The directory doesn't exist any more)
  • Typo corrections in screen output of the PDAF core routines

Version 1.11 - December 22, 2014

Changes:

  • Revised the screen output. All output lines from the PDAF code routines now start with 'PDAF'. This will make it easier to grep for these lines when PDAF is used with models that generate a lot of output. Further, the output now also works correctly if the model state or the number of observations are very large (up to O(10^9)) or if the number of processes is large (up to O(10^5)).
  • OpenMP parallelization for the local filters (LESTKF, LSEIK, LETKF) was added. This allows to speed up the analysis step without the more complicated changes required for MPI-parallelization. The tutorials have been updated to explain how to use the OpenMP parallelization.
  • The communication for collecting ensemble states for the assimilation update is improved. We switched to non-blocking MPI communication, which allows to gather the ensemble members in arbitrary order and can speed up the collection of ensemble members. Analogously, the distribution of ensemble members can be improved using non-blocking communication. The old blocking communication is still available by setting the preprocessor flag BLOCKING_MPI_EXCHANGE.
  • The overall configuration possibilities for the MPI parallelization have been revised. Now, it is more easily possible to run the filter using a different set of processors than the one on which the models run. This variant can be useful, e.g. if the memory of a computer is so limited that one cannot store the arrays from the model and the ensemble array at the same time.
  • New routines:
    • PDAF_get_obsmemberid: This routine returns the index of the ensemble member on which an observation operator has to be applied
    • PDAF_prepost: This routine can be used, e.g., like PDAF_assimilate_lestkf. However, the routine only calls the pre/poststep routine once, but does not compute an analysis step. The routine can be used to analyze an ensemble forecast.
    • PDAF_put_state_prepost: This routine can be used, e.g. like PDAF_put_state_lestkf. However, the routine only calls the pre/poststep routine once, but does not compute an analysis step. The routine can be used to analyze an ensemble forecast.

Version 1.10 - October 4, 2013

Changes:

  • Added a simplified implementation variant for the online implementation of PDAF that relies on the parallelization. For the case that there are sufficient processors to integrate all ensemble members in parallel, the new routines PDAF_assimilate_X (with X being replaced by the name of the filter algorithm) can be used instead of PDAF_get_state and PDAF_put_state_X. The use of PDAF_assimilate_X is explained in the tutorial.
  • Added tutorial implementations for an example of the online implementation (coupling model and PDAF into a single assimilation program) of PDAF. The examples demonstrate the implementation with a model that itself is not parallelized as well as a parallelized model. Corresponding tutorials are now available on the tutorial web page.
  • Revised templates for the implementation of the online mode.

Version 1.9 - May 6, 2013

Changes:

  • Added smoothers for ESTKF, ETKF, EnKF and the local filters LESTKF, LETKF.
  • Added fixed basis (subtype=2) and fixed covariance matrix (subtype=3) variants for ESTKF, ETKF and the local filters LESTKF, LETKF.
  • Added an example implementation of the offline mode with a simple 2D model domain and observations with data gaps. This implementation serves for a tutorial, that provides a step-by-step description on how to implement the analysis step in the offline mode.
  • Added a function PDAF_get_memberid to query the index of an ensemble member during the forecast phase.
  • revised the templates and simplified the implementation for the offline mode
  • fixed a bug in SEIK/LSEIK for subtype=3

Version 1.8 - February 12, 2012

Changes:

  • Added Error Subspace Transform Kalman filter (ESTKF) and localized variant LESTKF. In addition a variant of the SEIK filter with symmetric square-root and explicit ensemble transformation is now available. (These filters have been introduced in the paper: "A unification of ensemble square-root filters" by L. Nerger, T. Janjic, J. Schroeter, and W. Hiller, Monthly Weather Review, 140, 2335-2345, doi:10.1175/MWR-D-11-00102.1)
  • Added support to specify the type of the matrix square root in the SEIK filter. (Cholesky decomposition or symmetric square root based on singular value decomposition. The effects of these square root are also discussed in the paper mentioned above.)
  • Revised the internal structure of PDAF to simplify the implementation of additional filters. (See the page about adding a filter algorithm for details.)
  • Added support to compile for either double or single precision.
  • Clean-up of PDAF's internal timers and memory allocation counting.

Version 1.7 - September 16, 2011

Changes:

  • Revised internal structure of PDAF to simplify implementation of additional assimilation methods.
  • Added full data assimilation implementation of Lorenz-96 model with PDAF.
  • Revision of observation localization. It also includes the regulated localization that was introduced in the paper "A regulated localization scheme for ensemble-based Kalman filters" by L. Nerger et al. to appear in Q. J. Roy. Meteor. Soc. (accessible online: DOI:10.1002/qj.945)
  • Added an option to display parameter options for a selected filter using the compiled program.
  • Added routines with a simplified interface. The simplified interface does not require that you provide the names of user-supplied subroutines in the call to PDAF. However, one is restricted to use pre-defined routine names.
  • License change: Now PDAF is licensed with the more flexible GNU Lesser General Public License (older versions of PDAF used the GNU General Public License).

Version 1.6.2 - 10/05/2010

Changes:

  • Change in Makefiles to correct compilation on Linux with gfortran

Version 1.6.1 - 08/27/2010

Changes:

  • Added pre-processor statement PDAF_NO_UPDATE to simplify tests during implementation.
  • Unified interface to pre/poststep routines. For the EnKF Uinv was added. This array is never used in EnKF.
  • Added shortened timer output to PDAF_print_info

Version 1.6.0 - 03/18/2010

Version distributed after presentation at Ocean Sciences Meeting, Portland, OR.

Changes:

  • Added ETKF and LETKF to public release

Version 1.5.0 - 01/19/2010

Changes:

  • Revised directory structure to separate PDAF core routines from test suite.

Versions 1.4.2 to 1.1.0

Version 1.0 - 10/08/2004

Original public release after participating at the GODAE International Summer School of Oceanography, "An Integrated View of Oceanography: Ocean Weather Forecasting in the 21st Century", Lalonde les Maures, France