Author: PA Browne. Time-stamp: <2016-01-16 00:54:55 pbrowne>

Description of the model

\(\frac{\mathrm{d}x_k}{\mathrm{d}t} = -x_{k-2}x_{k-1} + x_{k-1}x_{x+1} - x_k + F\)

Todo:: Write some actual description of this model

Connecting the model to EMPIRE using MPI.

Todo:: Write some stuff about this. maybe a separate page.

Fortunately, there is a model already within EMPIRE that can do all this. Build it with the command

1 make lorenz96

Now you can check that a model executable was created. It will be found in the binaries folder bin/

1 ls -l bin/lorenz96

Specification of subroutines in model_specific.f90

Before any data assimilation can be done with the model, we must specify operators such as the observation operator H, its error covariance R, and a few others. Below, we list how we shall set these up in the file model_specific.f90, which is located in the top directory of EMPIRE.

The observations

Defining observation operators:

We shall have an observation network that does not change with time. We are going to observe every other grid point directly.

That is, \( y = H(x) = \begin{bmatrix} x_1 \\ x_3 \\ x_5 \\ \vdots \\ x_{N-1} \end{bmatrix} \).

Note that here we are using Fortran indexing (starting from 1). For simplicity, we will assume that \(N\) is odd. Hence the observation operator should look as follows.

\(H\) is therefore a matrix in \(\mathbb{R}^{\frac{N}{2}\times N}\) such that

\(H = \begin{bmatrix} 1 & 0 &0 &0 & 0 & 0 &0 & \cdots & 0 &0 \\ 0 & 0 &1 &0 & 0 & 0 &0 & \cdots & 0 &0 \\ 0 & 0 &0 &0 & 1 & 0 &0 & \cdots & 0 &0 \\ \vdots & \vdots &\vdots &\vdots & \vdots & \vdots &\ddots & \cdots & \vdots &\vdots \\ 0 & 0 &0 &0 & 0 & 0 &0 & \cdots & 1 &0 \end{bmatrix}\)

We never form this matrix, instead we simply implement H (and its transpose, HT) by selecting (inserting) the appropriate values from (to) the state vector. Note that we have to do this for an arbitrary number of state vectors.

subroutine h(obsDim,nrhs,x,hx,t)
  use sizes  
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(state_dim,nrhs), intent(in) :: x
  real(kind=rk), dimension(obsDim,nrhs), intent(out) :: hx
  integer, intent(in) :: t
  hx(:,:) = x(1:state_dim:2,:)
end subroutine H

Similarly, for the transpose of this observation operator, we can write this as follows:

subroutine ht(obsDim,nrhs,y,x,t)
  use sizes
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(obsDim,nrhs), intent(in) :: y
  real(kind=rk), dimension(state_dim,nrhs), intent(out) :: x
  integer, intent(in) :: t
  x = 0.0_rk
  x(1:state_dim:2,:) = y(:,:)
end subroutine HT

Defining observation error covariances:

Let us assume that we have homogeneous, uncorrelated observation errors such that \(R = \sigma^2I\). For this example, \(\sigma^2=0.1\). Then we can code multiplication by R in the following way:

subroutine r(obsDim,nrhs,y,Ry,t)
  use rdata
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(obsDim,nrhs), intent(in) :: y
  real(kind=rk), dimension(obsDim,nrhs), intent(out) :: ry
  integer, intent(in) :: t
  ry = y*0.1d0
end subroutine R

Similarly, application of \(R^\frac{1}{2}\) can be done as:

subroutine rhalf(obsDim,nrhs,y,Ry,t)
  use rdata
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(obsDim,nrhs), intent(in) :: y
  real(kind=rk), dimension(obsDim,nrhs), intent(out) :: ry
  integer, intent(in) :: t
  ry = y*sqrt(0.1d0)
end subroutine RHALF

We also need to have the application of \(R^{-1}\) and \(R^{-\frac{1}{2}}\). These can be done with the following subroutines:

subroutine solve_r(obsDim,nrhs,y,v,t)
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(obsdim,nrhs), intent(in) :: y
  real(kind=rk), dimension(obsdim,nrhs), intent(out) :: v
  integer, intent(in) :: t
  v = y/0.1d0
end subroutine solve_r

subroutine solve_rhalf(obsdim,nrhs,y,v,t)
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: obsdim
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(obsdim,nrhs), intent(in) :: y
  real(kind=rk), dimension(obsdim,nrhs), intent(out) :: v
  integer, intent(in) :: t
  v = y/sqrt(0.1d0)
end subroutine solve_rhalf

Defining background error covariance matrix:

To make an initial ensemble, we can use a background error covariance matrix, \(B\). In this example, \(B = 0.2I\). There are two functions of this matrix that we could need: \(B^{\frac{1}{2}}\) and \(B^{-1}\). These can be coded in the following ways:

subroutine bhalf(nrhs,x,Qx)
  use sizes
  use qdata
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(state_dim,nrhs), intent(in) :: x
  real(kind=rk), dimension(state_dim,nrhs), intent(out) :: qx
  qx = sqrt(0.2d0)*x
end subroutine Bhalf

subroutine solve_b(nrhs,x,Qx)
  use sizes
  use qdata
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(state_dim,nrhs), intent(in) :: x
  real(kind=rk), dimension(state_dim,nrhs), intent(out) :: qx
  qx = x/0.2d0
end subroutine solve_b

Defining model error covariance matrix:

If \(i\) and \(j\) are two different grid points, then we define the correlation, \(C_{ij}\) between the model error at grid points \(x_i\) and \(x_j\) to be

\( C_{ij} =\begin{cases} 1 & \text{if } i=j\\ \frac{2}{3} & \text{if } |i-j|=1 \\ \frac{1}{6} & \text{if } |i-j|=2 \\ 0 & \text{otherwise} \end{cases}\)

Then, we define the model error covariance matrix \(Q = \alpha^2 \frac{3}{2} C\). Hence

\(Q^{\frac{1}{2}} = \alpha\begin{bmatrix} 1 & 0.5 & 0 & 0 & \cdots & 0 & 0.5 \\ 0.5 & 1 & 0.5 & 0 & \cdots & 0 & 0 \\ 0 & 0.5 & 1 & 0.5 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 & 0.5 \\ 0.5 & 0 & 0 & 0 & \cdots & 0.5 & 1 \\ \end{bmatrix}\)

Thus this can be coded as:

subroutine qhalf(nrhs,x,Qx)
  use sizes
  use qdata
  implicit none
  integer, parameter :: rk=kind(1.0d+0)
  integer, intent(in) :: nrhs
  real(kind=rk), dimension(state_dim,nrhs), intent(in) :: x
  real(kind=rk), dimension(state_dim,nrhs), intent(out) :: qx
  real(kind=rk), parameter :: alpha=0.2
  integer :: i
  qx(1,:) = x(1,:) + 0.5d0*x(2,:) + 0.5d0*x(state_dim,:)
  qx(2:state_dim-1,:) = 0.5d0*x(1:state_dim-2,:)+x(2:state_dim-1,:)+0.5d0*x(3:state_dim,:)
  qx(state_dim,:) = 0.5*x(1,:) + 0.5*x(state_dim-1,:) + x(state_dim,:)
  qx = alpha*qx
end subroutine Qhalf

For simplicity, we shall leave the operator Q as the default one. This will simply apply Qhalf twice in order to apply the Q operator.

Specifying distance for localisation:

For the LETKF, we have to be able to do localisation. To do so, we define a distance measure between the observations and the grid points.

The model is cyclic, so we can say that all the variables lie in the interval [0,1]. To find the position of variable \(x_{xp}\) we can therefore divide \(xp\) by the total number of gridpoints. To find the position of observation \(y_{yp}\), we note that \(y\) corresponds to every other gridpoint of \(x\). Hence its position in the interval [0,1] can be calculated as \(2yp-1\) divided by the total number of gridpoints. The distance between these two positions is then either the distance directly within the interval [0,1], or the distance wrapping around this interval. The code implementing this is below:

subroutine dist_st_ob(xp,yp,dis,t)
  use sizes
  implicit none
  integer, intent(in) :: xp
  integer, intent(in) :: yp
  real(kind=kind(1.0d0)), intent(out) :: dis
  integer, intent(in) :: t 
  integer, parameter :: rk = kind(1.0d0)
  real(kind=rk) :: st,ob
  st = real(xp,rk)/real(state_dim,rk)
  ob = real((2*yp)-1,rk)/real(state_dim,rk)
  dis = min(abs(st-ob),1.0d0-abs(st-ob))
end subroutine dist_st_ob

Setting up configure model and reconfigure model for an experiment:

Here we tell EMPIRE how large the model is, how many observations we have per observation time. In this experiment we are going to have observations at a fixed frequency. The total number of observations in time, and the frequency of observations will be read in at run time to the variables pf%time_obs and pf%time_bwn_obs, respectively. Here we also call timestep_data_set_total to tell EMPIRE how long the model run will be. The "sanity check" below has helped me countless times when debugging - it serves no other purpose than to help identify errors.

subroutine configure_model
  use pf_control
  use timestep_data
  use sizes
  implicit none
  !this is for lorenz 96
  state_dim = 40
  obs_dim = 20
  call timestep_data_set_total(pf%time_bwn_obs*pf%time_obs)
  print*,'#################################'
  print*,'######### SANITY CHECK ##########'
  print*,'#################################'
  print*,'## STATE DIMENSION = ',state_dim
  print*,'##  OBS  DIMENSION = ',obs_dim
  print*,'#################################'
end subroutine configure_model

In this example, the observational network is not going to change through time. Reconfigure model is called after each analysis is performed, so that the observational network can be reconfigured for the next set of observations. As the observation is going to be at the same time interval as for the last one, and the operators H and R remain constant, we this subroutine can be left blank as below. Note it cannot be removed as this will lead to a compilation error.

subroutine reconfigure_model

end subroutine reconfigure_model

Compiling the model specific changes

We are now at a point where we can compile the code. Go to the same directory as model_specific.f90 and simply type the command

1 make

Now you can check that a new empire executable was created. It will be found in the binaries folder bin/, run the following command and ensure that it has an up-to-date timestamp:

1 ls -l bin/empire

Running the codes

Now that we have both the model and EMPIRE compiled, we are in a position to execute the codes and therefore do some experiments. We shall do a twin experiment, where we first run the model to act as a "truth" and from which we generate observations. Then afterwards we will run an ensemble from different starting conditions and attempt to use the observations we have taken to stay close to the truth.

The first step is to run the truth and generate the observations.

Running the truth

We have to define the runtime parameters that EMPIRE requires. These are found in the file empire.nml.

we want 1 observation in time: we set this in the variable time_obs
we want the observations to occur every 4th model timestep: we set this in the variable time_bwn_obs
we need to tell EMPIRE that it should be doing a truth run and generating the data: we set the logical variable gen_data to be true
we need to tell EMPIRE how to initially perturb the model. We will do this as \(x_t(0) = x_\text{ref}(0) + \eta\), where \(\eta \sim \mathcal{N}(0,B)\): we set this in the variable init

In empire.nml, the namelist should appear as follows:

&pf_params
time_obs=1,
time_bwn_obs=4,
gen_data=.true.,
init='B'
/

Now let us move to an appropriate folder to run:

1 cd /path/to/where/you/want/to/run/the/code

In this folder, you must have the empire.nml file located. Check this with an ls command.

The model also needs a driving file. It needs to be called l96.nml, which is a standard Fortran namelist file. To set the parameters for the run we shall do, the l96.nml should look as follows:

&l96
N=40,
total_timesteps=4,
F=8.0d0,
dt=0.01
/

Now we want to run a single ensemble member. For this model, each ensemble member uses a single MPI process. So we must launch a total of 1 MPI processes to run the model. To output the truth, we only need a single EMPIRE process. The mpirun syntax for doing this is as follows:

1 mpirun -np 1 /path/to/model/executable : -np 1 /path/to/empire/executable

Now after the code has finished (a matter of seconds), let us look for some output. Check to see the observation files have been created:

1 ls obs*

There should only be one: obs_000001. This is going to be the observation file that we use in later.

Running a stochastic ensemble

Before we do the assimilation, let's get something to compare with. The comparison that we can do is against a model run where we have not done any assimilation.

We want EMPIRE to run for the same number of timesteps as before, so in empire.nml we set time_obs and time_bwn_obs as we had previously. We also want to create the initial ensemble in the same way, so init remains 'B'. We are no longer generating the data, so remove gen_data to use its default value false. Lastly, we have to tell EMPIRE to propagate the ensemble members forward stochastically without assimilating the observations. We do this by setting the filter type filter to 'SE' (Stochastic Ensemble).

In empire.nml, the namelist should appear as follows:

&pf_params
time_obs=1,
time_bwn_obs=4,
filter='SE',
init='B'
/

Now we want to execute this but using more than one ensemble member. 32 ensemble members seems like a good number. As we have more than 1 ensemble member, we can use more than one EMPIRE process. Here we will use 4, and therefore each EMPIRE process will be dealing with 8 ensemble members.

We run this with the mpirun command:

1 mpirun -np 32 /path/to/model/executable : -np 4 /path/to/empire/executable

Running an assimilation

All that is necessary to do in order to run an assimilation is now to change filter to correspond to the method we want to use. Let us use the LETKF (as we spent so long ensuring the distance calculation for it earlier).

Hence modify filter in empire.nml to 'LE' so that the namelist appears as:

&pf_params
time_obs=1,
time_bwn_obs=4,
filter='LE',
init='B'
/

Run this the same way as you ran the stochastic ensemble with the mpirun command:

1 mpirun -np 32 /path/to/model/executable : -np 4 /path/to/empire/executable

Plotting the results

For this we shall use python, with numpy and matplotlib

The python script examples/lorenz96/tutorial_lorenz96_plot.py is able to produce some plots of trajectories from the output. Run this with the command

1 ../examples/lorenz96/tutorial_lorenz96_plot.py

If all has gone to plan, you should see a plot looking like the one below:

Notice how the ensemble members from the LETKF (in red, labelled "assimilation") narrows at timestep 4. This is where the observation occurred, and you should be able to see how the LETKF has brought the ensemble closer to the true state than the stochastic ensemble.

Tutorial codes

These can be found in the file examples/lorenz96/tutorial_lorenz96_script.sh

#!/bin/bash
set -o verbose
#make the lorenz96 code
make lorenz96
#backup the model_specific.f90 file
cp model_specific.f90 model_specific.f90_original
#make all the changes to model_specific.f90 that we have listed above
#happily they are in the file examples/lorenz96/model_specific_tutorial_l96.f90
cp examples/lorenz96/model_specific_tutorial_l96.f90 model_specific.f90
#build the codes
make
#check to see if the empire codes built properly
ls -l bin/empire
#make a run directory
mkdir -p rundirectory
#move to the run directory
cd rundirectory
#get the empire.nml file
cp ../examples/lorenz96/tutorial1.nml empire.nml
#look at the empire.nml file
cat empire.nml
#pause to look at this file
sleep 10
#generate the l96.nml file
echo -e "&l96\nN=40,\ntotal_timesteps=4,\nF=8.0d0,\ndt=1.0d-2\n/\n" > l96.nml
#look at the l96.nml file
cat l96.nml
#pause to look at this file
sleep 5
#generate the observations
mpirun --output-filename truth -np 1 ../bin/lorenz96 : -np 1 ../bin/empire
#look for the observation files
ls obs*
#modify the empire.nml file to run a stochastic ensemble
sed -i "s/filter.*/filter='SE',/g" empire.nml
sed -i "/gen_data/d" empire.nml
#look at the empire.nml file
cat empire.nml
#pause to look at this file
sleep 10
#now run the stochastic ensemble 
mpirun --output-filename stoch -np 32 ../bin/lorenz96 : -np 4 ../bin/empire
#modify the empire.nml file to run the LETKF
sed -i "s/filter.*/filter='LE',/g" empire.nml
#look at the empire.nml file
cat empire.nml
#pause to look at this file
sleep 5
#now run the LETKF 
mpirun --output-filename assim -np 32 ../bin/lorenz96 : -np 4 ../bin/empire
#plot the output
../examples/lorenz96/tutorial_lorenz96_plot.py