As we’ve seen from previous talks and tutorials a key goal in Biogeochemical research is to estimate ecosystem carbon exchange over a large area. Often we are faced with a dilemma when choosing how to use our resources to achieve a robust estimate of ecosystem exchange.
In this tutorial we will ask you to estimate the net
ecosystem exchange of a large area of forested land. The domain we are
interested in is a 100 km2 area of high elevation forest. We have defined a
10x10 grid. Each grid cell is 1 km across and contains forest. A ridge line
runs from the

The goal is to estimate the NEE in a way which might allow us to predict how this entire area will respond to future changes in temperature and rainfall.
We want to:
Part 0: Set up
Navigate
to the H (network) drive. Create a new folder with a unique name
(e.g. your name). Choose a name with no
spaces. Copy the SIPNET_tutorial
folder into your personal folder.
Take a minute to explore the contents of the SIPNET_tutorial folder. Here is a list
of this folder’s contents, for your reference:
You will also notice a program, xemacs, in the H drive. You should use
this program to edit text files in
this exercise.
Finally, you will use the cygwin program for much of this
exercise. This program is in the START
menu, under Programs : ASP. Open
cygwin. This program gives you a unix-like environment, which we will use
to run the various executables. If you are unfamiliar with Unix, some basic
commands are listed here: <http://mally.stanford.edu/~sr/computing/basic-unix.html>.
For now, type the following to get into your
personal copy of the tutorial folder (replace “your-directory” with the name
you gave your personal directory):
cd your-directory
cd SIPNET_tutorial
Part
1. Parameter estimation
We’ll start by getting a SIPNET parameter estimation run going, so that it can
chug away in the background while we do some other stuff.
We have at our disposal the actual net CO2 flux data (NEE) from Niwot Ridge, from Nov. 1, 1998 - Dec. 31, 2005. These data have been aggregated up to SIPNET’s half-daily time step.
But in the real world, you often don’t have the luxury of such a long data set. For the purposes of this exercise:
You should decide the start date and resolution/record length based on your knowledge of ecosystem processes and the figure below (note that negative NEE indicates uptake by the biosphere):

Now you will run a utility to subset the data, and then will perform an optimization on that subset. Open subset.in in xemacs. Change the values for START_YEAR and START_DAY (the starting year and Julian day of your desired subset), recalling that data are available from Nov. 1, 1998 - Dec. 31, 2005. Then:
Save and close subset.in.
From your cygwin window, type the following to create your personalized Niwot input subsets:
./subsetData.exe
Make sure it worked by typing ls -lart (this will display all the files in the folder with the most recently created at the bottom of the window) - you should see a bunch of files named niwot_subset.*
Finally, start the parameter estimation by typing:
./estimate.exe &
This will read the files beginning with “niwot_subset”, and output its results to files beginning with “niwot-estimated”. It should take 15-30 minutes to run. The “&” will run it in the background, so you can continue using this window.
Part 2: Optimization by hand
While the estimation is running, we’ll try to get a sense of what’s going on “under the hood”.
First, a little more about the model: SIPNET was designed to estimate fluxes that could also be measured using an Eddy Flux tower. The major pools of Carbon and Water in the SIPNET model are shown below in boxes; the arrows represent fluxes into, out of or between pools.

Parameters in SIPNET are not time dependent and a single value for each is used throughout each model run. Parameters control the initial size of the pools and also the magnitude of the fluxes. For example Amax to some extent controls the Photosynthesis flux and Leaf Turnover Rate controls the rate at which leaf litter falls.
In this section we will attempt to find optimal values of two SIPNET parameters:
We’ll change the values of these two parameters and observe how the fit of model to the observed fluxes (measured at Niwot Ridge) changes as we change each parameter.
To set up a sensitivity test…
Save and close the
file.
From your cygwin window, run the model by typing:
./sipnet.exe
Next type ls -lart - check that 4 new files have been created:
niwot.NEE niwot.NEE_cum niwot.GPP niwot.GPP_cum
These 4 files store the time series of model-predicted NEE (the net ecosystem exchange of CO2, g C m-2), GPP (gross primary productivity, g C m-2), and the running totals of these two variables. For Excel to be able to read these files, we need to switch the rows and columns. This can be done with the enclosed ‘transpose’ utility: Type:
./transpose.exe niwot.NEE
./transpose.exe niwot.NEE_cum
./transpose.exe niwot.GPP
./transpose.exe niwot.GPP_cum
Open a window, locate these four files and open them with Excel. You may have to put the data into columns using “Text to Columns” in the Data menu. The values are space delimited. Each column corresponds to one run of the sensitivity test. The first row gives the value of the Amax parameter in the run, and the remaining rows give the NEE (for example) value at each half-daily point (day or night), from Nov. 1, 1998 - Dec. 31, 2005.
Make line plots of the time series of some of these variables. Does the model seem very sensitive to the value of Amax?
Now
try to find the “best” value of Amax,
by finding the value that minimizes the error between modeled and observed NEE.
Open NEE-obs.xls. This file provides
the observed NEE time series from the Niwot Ridge flux tower. For your error function, use the
root-mean-squared (RMS) error between model and data:
(where n = number of data points, and mi and oi are the modeled and observed values in time step i). Make a plot of parameter value vs. RMS error. Does one parameter value stand out as being the best?
Now repeat the above steps for the
LeafTurnoverRate parameter. Use LOW_VAL = 0 (infinite leaf longevity), HIGH_VAL = 0.27 (leaf longevity of
about 4 years), NUM_RUNS = 10. Is the
model more or less sensitive to this parameter as compared to Amax? Now does one
parameter value stand out as being the best?
You've just seen how the model error changes if we vary some key parameters. The automated estimation routine uses this fact to estimate a suite of parameters. We change each parameter by a small random value within predefined bounds, and compare the model results to the observed fluxes. We retain parameter values that improve the model fit and discard those that increase the error (though not always... why?).
However, the automated procedure modifies all parameters simultaneously. In contrast, your manual parameter estimation just identified the best parameter value given fixed values of the rest of the parameters. The automated procedure runs the model about 500,000 times to identify the best parameter set as well as a range of plausible values for each parameter (“posterior distribution”):


Part 3: Upscale optimized
model to the region
By now the estimate program should be done. To make sure, type ls -lart - among other files, you should see the file niwot-estimated.param.
We will now run the optimized model forward across the entire 100 km2 region. We will use the optimized parameter set from Part 1, but use some (artificial) spatially-varying data for a few variables that differ at each point in space. The variables that can vary spatially are:
Your “budget” can support measuring any 3 of these spatial variables. The rest must remain fixed at their values from Niwot Ridge. To help you choose which 3 to vary, here are plots of the six variables:

Delete the file niwot.param and copy niwot-estimated.param to niwot.param. This will make SIPNET use the optimized parameter values in the forward run of the model. You may be interested in comparing the optimized parameters with the original, unoptimized parameters (in niwot.param.startfile_bgc). If you do so, however, note that the file formats differ. For both files, the parameter values are in the 2nd column; only parameters with a 1 in the 3rd column were allowed to vary in the optimization.
Decide which of these 3 variables to
“measure”. Then open the file “spatial.control” in xemacs, and change No in the second column to Yes for just these 3 variables. Save and close the file.
From your cygwin window, type:
./makeSpatialInputs.exe
This utility will take a minute or so to run. It is reading the *.anomalies files for the variables you specified, and creating new parameter and climate input files with these spatially-varying variables. It creates the files niwot_spatial.clim, niwot_spatial.param, and niwot_spatial.param-spatial (you can confirm this with ls -lart).
To set up the spatial run:
Save and close the file.
From your cygwin window, run the model by typing:
./sipnet.exe
It will take a minute to run. It will create the files niwot_spatial.NEE, niwot_spatial.NEE_cum, niwot_spatial.GPP, and niwot_spatial.GPP_cum. As before, you need to transpose these files: Type:
./transpose.exe niwot_spatial.NEE
./transpose.exe niwot_spatial.NEE_cum
./transpose.exe niwot_spatial.GPP
./transpose.exe niwot_spatial.GPP_cum
Open these 4 files with Excel. You may have to put the data into columns using “Text to Columns” in the Data menu. The values are space delimited. Each column corresponds to one spatial location. The first row gives the location, and the remaining rows give the NEE (for example) value at each half-daily point (day or night), from Nov. 1, 1998 - Dec. 31, 2005.
Also open the spatial validation data sets: validation-NEE.xls and
validation-GPP.xls. Do not be alarmed if you do not see any data - these are sparse data sets, and so the
data start further down in the files.
These files contain “observations” (artificially created) of NEE (in g C m-2, summed over the day) and GPP (also in g C m-2, summed over the day) across the spatial domain. The NEE “data” are similar to that which you might get from occasional aircraft flights across the domain: a single, area-averaged estimate of NEE about once per month. The GPP “data” are similar to that which you might get from MODIS: an estimate of GPP at each location once every 8 days. Note that there are no satellite observations in winter (when the signal is corrupted by snow), or before 2000 (before MODIS became operational).
Now the fun part! Explore the spatial
predictions of NEE and GPP produced by the model. Some things you may want to look at are:
Part 4 (Optional, if time
allows): Comparisons of different choices
If time allows, either: