In what follows, we provide a brief summary of the CLM3.0 data structures. An understanding of these data structures is essential before attempting to modify the code or add new history output fields to the model.
The subgrid hierarchy in CLM3.0 is composed of gridcells, landunits, columns, and plant functional types (pfts). Each gridcell can have a different number of landunits, each landunit can have a different number of columns, and each column can have multiple pfts. This flexible structure results in efficient memory allocation and allows for the implementation of many different types of subgrid representations.
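The variable-sized hierarchy described above can be pictured with a minimal sketch. It is written in Python purely for illustration and is not the CLM3.0 Fortran 90 implementation; the class names, the `kind` and `label` fields, and the example contents are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pft:
    label: str  # hypothetical label; real pfts carry state and flux data


@dataclass
class Column:
    pfts: List[Pft] = field(default_factory=list)


@dataclass
class Landunit:
    kind: str  # hypothetical label such as "lake" or "vegetated"
    columns: List[Column] = field(default_factory=list)


@dataclass
class Gridcell:
    landunits: List[Landunit] = field(default_factory=list)


def count_elements(gridcell: Gridcell) -> Dict[str, int]:
    """Count the subgrid elements that exist under one gridcell."""
    columns = [c for lu in gridcell.landunits for c in lu.columns]
    return {
        "landunits": len(gridcell.landunits),
        "columns": len(columns),
        "pfts": sum(len(c.pfts) for c in columns),
    }


# Two gridcells with different shapes; the contents are invented.
cell_a = Gridcell([
    Landunit("lake", [Column([Pft("pft_a")])]),
    Landunit("vegetated", [
        Column([Pft("pft_a"), Pft("pft_b"), Pft("pft_c")]),
        Column([Pft("pft_d")]),
    ]),
])
cell_b = Gridcell([
    Landunit("vegetated", [Column([Pft("pft_a"), Pft("pft_b")])]),
])

counts_a = count_elements(cell_a)
counts_b = count_elements(cell_b)
```

Here the two gridcells hold different numbers of landunits, columns, and pfts, and nothing is stored for elements that do not exist.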
The first subgrid level, the landunit, is intended to capture the broadest spatial patterns of subgrid heterogeneity. These broad patterns include physically distinct surface types (glaciers, lakes, wetlands, and vegetated areas). In terms of CLM3.0 variables, the central distinguishing characteristic of the landunit is that this is where physical soil properties are defined: texture, color, depth, pressure-volume relationships, and thermal conductivity.
The second subgrid level, the column, is intended to capture potential variability in the soil and snow state variables within a single landunit. The central characteristic of the column is that this is where the state and flux variables for water and energy in the soil and snow are defined. Regardless of the number and type of pfts occupying space on the column, the column physics operates with a single set of upper boundary fluxes, as well as a single set of transpiration fluxes from multiple soil levels. These boundary fluxes are weighted averages over all pfts.
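The pft-to-column averaging can be sketched as follows. This Python fragment is an illustration only, not the CLM3.0 code. It assumes that each pft carries a weight equal to its fractional area on the column; the text above does not specify the weights, and the function and variable names are hypothetical.

```python
from typing import Sequence


def column_average(pft_values: Sequence[float],
                   pft_weights: Sequence[float]) -> float:
    """Weighted average of a pft-level quantity over one column.

    Assumption: the weights are the pft area fractions on the column.
    They are normalized here, so they need not sum exactly to one.
    """
    if len(pft_values) != len(pft_weights):
        raise ValueError("one weight is required per pft value")
    total_weight = sum(pft_weights)
    if total_weight <= 0.0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(v * w for v, w in zip(pft_values, pft_weights))
    return weighted_sum / total_weight


# Three pfts sharing one column; the flux values are invented.
pft_flux = [100.0, 60.0, 20.0]
pft_weight = [0.5, 0.3, 0.2]
col_flux = column_average(pft_flux, pft_weight)
```

With these invented numbers the column sees a single flux of about 72 (50 + 18 + 4), regardless of how many pfts contributed to it.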
The third and final subgrid level is referred to as the plant functional type (pft), but it also includes the treatment for bare ground. It is intended to capture the biophysical and biogeochemical differences between broad categories of plants, in terms of their functional characteristics. All fluxes to and from the surface are defined at the pft level, as are the vegetation state variables (e.g. vegetation temperature, canopy water storage, and carbon for the leaf, stem, and roots).
In addition to state and flux variable data structures for conserved quantities (energy, water, carbon, etc.), each subgrid level also has a physical state data structure for handling quantities that are not involved in conservation checks (diagnostic variables). For example, soil texture is defined through physical state variables at the landunit level, the number of snow layers and the roughness lengths are defined as physical state variables at the column level, and the leaf area index and the fraction of canopy that is wet are defined as physical state variables at the pft level.
The hierarchical subgrid data structures are implemented in the code through the modules clmtype.F90, clmtypeInitMod.F90, decompMod.F90 and initGridCellsMod.F90. These modules are all in the /src/main/ subdirectory. The new code makes extensive use of Fortran 90 derived data types. A derived type permits the user to define new data types that consist of multiple standard data types (integers, double-precision reals, character strings) as well as other derived data types.
This subgrid hierarchy is implemented in CLM3.0 as a set of nested derived types. The entire definition is contained in module clmtype.F90. Extensive use is made of pointers, both for dynamic memory allocation and for simplification of the derived type referencing within subroutines. The use of pointers for dynamic memory allocation ensures that the number of subgrid elements at each level in the hierarchy is flexible and resolved at run time, thereby eliminating the need to statically declare arrays of fixed dimensions that might end up being sparsely populated. The use of pointers for referencing members of the derived data type within the subroutines provides a coherent treatment of the logical relationships between variables (e.g., the user cannot inadvertently change a pft-level variable within a subroutine that is supposed to operate on the column states and fluxes), and a more transparent representation of the core algorithms (it is easy to tell when the code is in a column or pft loop).
The module clmtype.F90 is organized so that derived types that are members of other derived types are defined first (a Fortran 90 compiler requirement). In particular, the energy and mass conservation data types are defined first, followed by the data types constituting the pft level, column level, landunit level, gridcell level and the model domain level. Finally, the hierarchical organization of these types is defined, starting with the model domain level, which consists in part of a pointer to an array of gridcells. Each gridcell consists in part of a pointer to an array of landunits, each landunit has a pointer to an array of columns, and each column has a pointer to an array of pfts.
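The definition order described above can be mirrored in a short sketch. It is written in Python for illustration and does not reproduce clmtype.F90; all type and field names are hypothetical, and Python lists stand in for the Fortran pointers to arrays. As in Fortran 90, a type used as a member is defined before the type that contains it.

```python
from dataclasses import dataclass, field
from typing import List


# 1. Conservation state types first: the level types below use them.
@dataclass
class EnergyState:
    temperature: float = 0.0  # hypothetical field


@dataclass
class WaterState:
    storage: float = 0.0  # hypothetical field


# 2. Level types, from the bottom of the hierarchy upward.
@dataclass
class Pft:
    energy: EnergyState = field(default_factory=EnergyState)
    water: WaterState = field(default_factory=WaterState)


@dataclass
class Column:
    energy: EnergyState = field(default_factory=EnergyState)
    water: WaterState = field(default_factory=WaterState)
    pfts: List[Pft] = field(default_factory=list)


@dataclass
class Landunit:
    columns: List[Column] = field(default_factory=list)


@dataclass
class Gridcell:
    landunits: List[Landunit] = field(default_factory=list)


# 3. The model domain sits at the top and holds the gridcells.
@dataclass
class Domain:
    gridcells: List[Gridcell] = field(default_factory=list)


def total_pft_water(domain: Domain) -> float:
    """Walk the hierarchy from the domain down and sum pft water storage."""
    return sum(
        p.water.storage
        for g in domain.gridcells
        for lu in g.landunits
        for c in lu.columns
        for p in c.pfts
    )


pfts = [Pft(water=WaterState(0.5)), Pft(water=WaterState(0.25))]
domain = Domain([Gridcell([Landunit([Column(pfts=pfts)])])])
canopy_total = total_pft_water(domain)
```

The traversal in `total_pft_water` follows the same chain as the text: domain, gridcells, landunits, columns, pfts.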
Model initialization occurs in module initializeMod.F90. A brief summary of the CLM3.0 initialization is provided here; for a more detailed discussion, the user is referred to the CLM3.0 Developer's Guide. The first step in CLM3.0 initialization is to determine the processor and thread decomposition (i.e., the "clump" layout). This is done via a call to subroutine initDecomp in module decompMod.F90. Subsequently, memory is allocated for the clm data structures in subroutine initClmtype in module clmtypeInitMod.F90. Once memory has been allocated, the hierarchy of the data structures (e.g., the assignment of pfts to columns) is determined in subroutine initGridCells in module initGridCellsMod.F90, using input gridded datasets that define the spatial distribution of pfts and other surface types (glacier, lake, etc.). Finally, the necessary model filters (e.g., those isolating soil points or lake points) are determined in routine initFilters in module filterMod.F90.
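The idea of a filter can be illustrated with a small sketch. It assumes that a filter is a precomputed list of the indices of the points of a given surface kind; the actual layout used in filterMod.F90 is not described here. The Python code, the function name `build_filters`, and the example data are hypothetical.

```python
from typing import Dict, List


def build_filters(column_kinds: List[str]) -> Dict[str, List[int]]:
    """Return, for each surface kind, the indices of the matching columns."""
    filters: Dict[str, List[int]] = {}
    for index, kind in enumerate(column_kinds):
        filters.setdefault(kind, []).append(index)
    return filters


# Invented surface kinds for five columns.
column_kinds = ["soil", "lake", "soil", "glacier", "soil"]
filters = build_filters(column_kinds)

# A routine that applies only to soil points loops over the soil filter
# instead of testing every column.
soil_temperature = [280.0] * len(column_kinds)
for c in filters["soil"]:
    soil_temperature[c] += 1.0
```

Building the index lists once during initialization means that later routines can visit only the relevant points without repeating the surface-type test in every loop.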