Generating artificial observations

EMPIRE can generate artificial observations easily and quickly.

Model specific operations required:

In empire.nml set the following variables:

gen_data = .true.

The system should then be run with a single ensemble member and a single EMPIRE process, i.e.

mpirun -np 1 model : -np 1 empire

Observations

To use real observations (i.e. those not generated automatically in twin experiment mode) the user must change the subroutine get_observation_data in model_specific.f90.

When called, get_observation_data must return the vector of observations \(y\) corresponding to the observation time on, or subsequent to, the current timestep.

Running a deterministic ensemble

EMPIRE can simply integrate an ensemble of models forward in time.

In empire.nml set the following variables:

filter = 'DE'

Running a stochastic ensemble

EMPIRE can integrate an ensemble of models forward in time whilst adding stochastic forcing.

Model specific operations required:

qhalf

In empire.nml set the following variables:

filter = 'SE'

Redirecting the output from EMPIRE

This feature can be used to suppress the output that EMPIRE writes to STDOUT.

See open_emp_o for more information.

Outputting ensemble member weights

This is controlled by output_weights in empire.nml. By default the weights are not output. If set to true, EMPIRE creates a number of files named "ensemble_weights_??", where ?? is the rank of the EMPIRE process on pf_mpi_comm. Each file contains the timestep, the particle number and the negative log of the weight. Note that these weights may not be normalised.
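Because the weights are stored as negative logs and may not be normalised, a post-processing step is usually needed before they can be compared. The following is a minimal Python sketch of such a step. It assumes the negative log weights for one timestep have already been read into a list; the function name normalise_weights and the example values are hypothetical, and nothing here reflects the layout of the files EMPIRE writes.

```python
import math


def normalise_weights(neg_log_weights):
    """Turn negative log weights into weights that sum to one.

    The smallest negative log weight (i.e. the largest weight) is
    subtracted first so that the exponentials do not underflow.
    """
    shift = min(neg_log_weights)
    unnormalised = [math.exp(-(w - shift)) for w in neg_log_weights]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical values for three particles at a single timestep.
neg_log_weights = [1000.0, 1000.0 + math.log(2.0), 1000.0 + math.log(2.0)]
weights = normalise_weights(neg_log_weights)
print(weights)
```

Subtracting the minimum before exponentiating matters in practice: exponentiating a value such as -1000 directly would underflow to zero for every particle.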

Outputting rank histograms

This is controlled by use_talagrand in empire.nml. For more information see load_histogram_data.

Outputting trajectories of model variables

This is controlled by use_traj in empire.nml. For more information see setup_traj.

Outputting mean of the ensemble

EMPIRE has the ability to output the mean of the ensemble in each dimension. For each dimension of the state vector \(j\), the ensemble mean \(\bar{x}_j\) is defined as

\( \bar{x}_j := \frac{1}{N_e}\sum_{i=1}^{N_e} x_{i,j} \) where \(x_{i,j}\) is the jth component of ensemble member i and \(N_e\) is the number of ensemble members. To use this feature, set use_mean = true in empire.nml.
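The definition of the ensemble mean can be illustrated with a short Python sketch. This is only an illustration of the formula, not EMPIRE's implementation; the function name ensemble_mean and the nested-list layout ensemble[i][j] (member i, component j) are assumptions made for the example.

```python
def ensemble_mean(ensemble):
    """Return the mean over ensemble members for each state component.

    ensemble[i][j] is component j of ensemble member i.
    """
    n_e = len(ensemble)
    state_dim = len(ensemble[0])
    return [sum(member[j] for member in ensemble) / n_e
            for j in range(state_dim)]


# Three hypothetical ensemble members with a two-dimensional state.
ensemble = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mean = ensemble_mean(ensemble)
print(mean)
```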

Outputting covariances of the ensemble

EMPIRE has the ability to output the ensemble covariance matrices throughout the run. This is controlled by the optional namelist &mat_pf in empire.nml. For more information see matrix_pf::matrix_pf_data. Note, however, that the covariance matrix has one row and one column per state variable, so if the state dimension of the model is large this is likely to be too expensive to compute and store. This feature is not available with empire version 3 communications.

Outputting variances of the ensemble

EMPIRE has the ability to output the variance in the ensemble in each dimension. For each dimension of the state vector \(j\), the ensemble variance \(\sigma_j^2\) is defined as

\( \sigma_j^2 := \frac{1}{N_e-1}\sum_{i=1}^{N_e} (x_{i,j} - \bar{x}_j)^2 \) Note that this is the sample variance. To use this feature, set use_variance = true in empire.nml.
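As an illustration of the sample variance defined above, the following Python sketch divides by \(N_e - 1\) rather than \(N_e\). It is not EMPIRE's implementation; the function name ensemble_variance and the nested-list layout ensemble[i][j] (member i, component j) are assumptions made for the example.

```python
def ensemble_variance(ensemble):
    """Return the sample variance over members for each state component.

    ensemble[i][j] is component j of ensemble member i.
    At least two members are needed because the divisor is N_e - 1.
    """
    n_e = len(ensemble)
    if n_e < 2:
        raise ValueError("sample variance needs at least two ensemble members")
    state_dim = len(ensemble[0])
    variances = []
    for j in range(state_dim):
        mean_j = sum(member[j] for member in ensemble) / n_e
        squared_deviations = sum((member[j] - mean_j) ** 2 for member in ensemble)
        variances.append(squared_deviations / (n_e - 1))
    return variances


# Three hypothetical ensemble members with a two-dimensional state.
ensemble = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
variance = ensemble_variance(ensemble)
print(variance)
```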

Outputting Root Mean Squared Errors

In a twin experiment, where EMPIRE has generated a "truth", EMPIRE can output the following: \( \sqrt{( \frac{1}{N_x}\sum_{i=1}^{N_x}(\bar{x}_i-x^t_i)^2 )} \) where \(N_x\) is the state dimension (state_dim ), \(\bar{x}\) is the ensemble mean, \(x^t\) is the truth, and \(i\) is an index running over each component of the state vector.

Note that where the model has different variables on different scales, this is probably not a good measure. For example, if one component of the state vector is measured in units of "apples per pie" and another is measured in "oranges per country per decade", this measure of RMSE will combine the two. If the latter has a much larger scale than the former, this RMSE measure will be dominated by the errors in the components with greater variability. To use this feature, set use_spatial_rmse = true in empire.nml.
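The spatial RMSE formula, and the way a large-scale variable can dominate it, can be seen in a small Python sketch. This is an illustration only, not EMPIRE's implementation; the function name spatial_rmse and the example numbers are hypothetical.

```python
import math


def spatial_rmse(ensemble_mean, truth):
    """Root mean squared error between the ensemble mean and the truth,
    averaged over all N_x components of the state vector."""
    n_x = len(truth)
    squared_errors = sum((m - t) ** 2 for m, t in zip(ensemble_mean, truth))
    return math.sqrt(squared_errors / n_x)


# Hypothetical two-component state: the first variable is of order 0.1,
# the second of order 1000.
ensemble_mean = [0.1, 1000.0]
truth = [0.2, 1100.0]

rmse_all = spatial_rmse(ensemble_mean, truth)
# The same state with the small-scale variable matching the truth exactly.
rmse_large_only = spatial_rmse([0.2, 1000.0], truth)
print(rmse_all, rmse_large_only)
```

In this example the two printed values are almost identical, even though the first component is wrong by 100% of its magnitude in one case and exact in the other: the error in the large-scale component accounts for practically all of the RMSE.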

There is now the option to compute RMSE fields using the formula: \( f_j(t) = \sqrt{ \frac{1}{N_e}\sum_{i=1}^{N_e} (x_{i,j}(t) - x^t_j(t))^2 } \) where \( f_j(t)\) is the jth component of the RMSE field at time t, \( x^t_j(t)\) is the jth component of the truth at time t, \(x_{i,j}(t)\) is the jth component of ensemble member i at time t and \(N_e\) is the number of ensemble members.

This is controlled by the option use_ens_rmse in empire.nml.
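Unlike the spatial RMSE, the RMSE field averages over ensemble members rather than over state components, so it yields one value per component. A minimal Python sketch of the formula for a single time t follows. It is not EMPIRE's implementation; the function name rmse_field and the nested-list layout ensemble[i][j] (member i, component j) are assumptions made for the example.

```python
import math


def rmse_field(ensemble, truth):
    """Return, for each state component j, the root mean squared error
    of the ensemble members about the truth at a single time.

    ensemble[i][j] is component j of ensemble member i.
    """
    n_e = len(ensemble)
    return [
        math.sqrt(sum((member[j] - truth[j]) ** 2 for member in ensemble) / n_e)
        for j in range(len(truth))
    ]


# Two hypothetical ensemble members with a two-dimensional state.
ensemble = [[1.0, 4.0], [3.0, 4.0]]
truth = [2.0, 0.0]
field = rmse_field(ensemble, truth)
print(field)
```

Because each component is treated separately, variables on different scales do not contaminate one another in this measure.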

Outputting the entire forecast ensemble

The entire forecast ensemble can be output by setting the logical variable output_forecast to .true. in empire.nml.

The location to which the forecast ensemble is written is controlled by the string output_path in empire.nml. The default is forecast/.