Standard for CCM4
This document defines a set of specifications, rules, and
recommendations for the coding of CCM4. The purpose is to provide a framework
that enables users to easily understand or modify the code, or to port
it to new computational environments. In addition, it is hoped that adherence
to these guidelines will facilitate the exchange and incorporation of new
packages and parameterizations into the model. Other works which
influenced the development of this document are "Report on Column Physics
and "European Standards For Writing and Documenting Exchangeable Fortran
90 Code" (http://nsipp.gsfc.nasa.gov/infra/eurorules.html).
Where the use of a language preprocessor is required, it
will be the C preprocessor (cpp). cpp is available on any UNIX platform,
and many Fortran compilers have the ability to run cpp automatically as
part of the compilation process. All cpp tokens will be uppercase to distinguish
them from Fortran code, which will be in lower case.
CCM4 will adhere to the Fortran 90 language standard. The
purpose is to enhance portability, and to allow use of the many desirable
new features of the language. If a situation arises in which there is good
reason to violate this rule and include Fortran code which is not compliant
with the f90 standard, an alternate set of f90-compliant code must be provided.
This is normally done through use of a C-preprocessor ifdef construct.
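The ifdef construct mentioned above might look like the following minimal sketch. The token VENDOR_CLOCK and the program name are hypothetical, and the non-standard branch is only a placeholder. The file is assumed to be compiled with cpp enabled.

```fortran
program clock_demo
   implicit none
   integer :: ticks             ! Processor clock count
#ifdef VENDOR_CLOCK
   ! Hypothetical branch: a vendor-specific, non-f90 timing call would go here.
   ticks = 0
#else
   ! f90-compliant alternative, always available.
   call system_clock (ticks)
#endif
   print *, 'clock count = ', ticks
end program clock_demo
```

Because the f90-compliant branch is the default, the code still builds on any platform when the token is left undefined.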
Free-form source will be used. The f90 standard allows up
to 132 characters, but a self-imposed limit of 90 should enhance readability
and make life easier for those with bad eyesight, those who wish to make overheads
of source code, and those who print source files with two columns per page. The world
will not come to an end if someone extends a line of code to column 91,
but multi-line comments that extend to column 100, for example, would be
unacceptable.
Loops should be structured with the do-end do construct as
opposed to numbered loops.
Input arguments and local variables will be declared 1 per
line, with a comment field expressed with a "!" character followed by the
comment text all on the same line as the declaration. Multiple comment
lines describing a single variable are acceptable when necessary. Variables
of a like function may be grouped together on a single line. For example:
   integer :: i,j,k ! Spatial indices
Continuation lines are acceptable on multi-dimensional
array declarations which take up many columns. For example:
   real(r8), dimension(plond,plev), intent(in) :: &
      array1, &! array1 is blah blah blah
      array2   ! array2 is blah blah blah
Note that the f90 standard defines a limit of
39 continuation lines.
Code lines which are continuation lines of assignment statements
must begin to the right of the column of the assignment operator. Similarly,
continuation lines of subroutine calls and dummy argument lists of subroutine
declarations must have the arguments aligned to the right of the "(" character.
Examples of each of these constructs are:
   a = b + c*d + blahblahblah/xxxxxxxxxx + &
       h*g + e*f
   call sub76 (x, y, z, w, a, &
               b, c, d, e)
   subroutine sub76 (x, y, z, w, a, &
                     b, c, d, e)
Code within loops and if-blocks will be indented 3 characters.
Routines with large argument lists will contain 5 variables
per line. This applies both to the calling routine and the dummy argument
list in the routine being called. The purpose is to simplify matching up
the arguments between caller and callee. In rare instances in which 5 variables
will not fit on a single line, a number smaller than 5 may be used, but
the per-line number must remain consistent between caller and callee. An
example is:
   call linemsbc (u3(i1,1,1,j,n3m1), v3(i1,1,1,j,n3m1), t3(i1,1,1,j,n3m1), &
                  q3(i1,1,1,j,n3m1), qfcst(i1,1,m,j),   xxx)
   subroutine linemsbc (u, v, t, &
                        q, qfcst, xxx)
Commenting style. Short comments may be included on the same
line as executable code using the "!" character followed by the description.
More in-depth comments should be written in the form:
!
! Describe what is going on
!
Key features of this style are 1) it starts with
a "!" in column 1; 2) the text starts in column 3; and 3) the text is offset
above and below by a blank comment line. The blank comments could just
as well be completely blank lines (i.e. no "!") if the developer prefers.
Use of the operators <, >, <=, >=, ==, /= is recommended
instead of their older counterparts .lt., .gt., .le., .ge., .eq.,
and .ne. The motivation is readability.
Code will be written in lower case. This convention cleanly
segregates code from C preprocessor tokens, since the convention has been
established that such tokens are all uppercase.
Embedding multiple routines within a single file and/or module
is allowed, encouraged in fact, if any of three conditions hold. First,
if routine B is called by routine A and only by routine A, then the two
routines may be included in the same file. This construct has the advantage
that inlining B into A is often much easier for compilers if both A and
B are in the same file. Practical experience with many compilers has shown
that inlining when A and B are in different files often is too complicated
for most people to consider worthwhile investigating.
The second condition in which it is desirable to put multiple
routines in a single file is when they are "CONTAIN"ed in a module for
the purpose of providing an implicit interface block. This type of construct
is strongly encouraged, as it allows the compiler to perform argument consistency
checking across routine boundaries. An example is:
   real :: x, y
   real :: var, var2
   public sub1, sub2
The number, type, and dimensionality of the arguments
passed to sub1 and sub2 are automatically checked by the compiler.
The final reason to store multiple routines and
their data in a single module is that the scope of the data defined in
the module can be limited to only the routines which are also in the module.
This is accomplished with the "private" clause.
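A fuller sketch of a module that both provides an implicit interface and limits the scope of its data is given here. The names factor_mod, set_factor and apply_factor are hypothetical, and the module and a small driver program are shown in one listing only for compactness.

```fortran
module factor_mod
   implicit none
   private                             ! Hide everything by default
   public :: set_factor, apply_factor  ! Only these names are visible outside

   real :: factor = 1.0                ! Module data, reachable only by module routines

contains

   subroutine set_factor (f)
      real, intent(in) :: f            ! New scaling factor
      factor = f
   end subroutine set_factor

   subroutine apply_factor (x, y)
      real, intent(in)  :: x           ! Input value
      real, intent(out) :: y           ! Scaled value
      y = factor*x
   end subroutine apply_factor

end module factor_mod

program factor_demo
   use factor_mod, only: set_factor, apply_factor
   implicit none
   real :: f, x, y                     ! Factor, input and result

   f = 2.0
   x = 3.0
   call set_factor (f)
   call apply_factor (x, y)            ! Argument number and type checked at compile time
   print *, y
end program factor_demo
```

A call such as "call apply_factor (x)" or a direct reference to "factor" from the driver would be rejected by the compiler.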
If none of the above conditions hold, it is not
acceptable to simply glue together a bunch of functions or subroutines
in a single file.
Modules MUST be named the same as the file in which they
reside. The reason to enforce this as a hard rule has to do with the fact
that dependency rules used by "make" programs are based on file names.
For example, if routine A "USE"s module B, then "make" must be told of
the dependency relation which requires B to be compiled before A. If one
can assume that module B resides in a file named B (compiled to B.o), building a
tool to generate this dependency rule (e.g. A.o: B.o) is quite simple. Put another way,
it is difficult (to say nothing of CPU-intensive) to search an entire source
tree to find the file in which module B resides for each routine or module
which "USE"s B.
Note that by implication multiple modules are not allowed
in a single file.
The use of common blocks is deprecated in Fortran 90 and
their continued use in the CCM is strongly discouraged. Modules are a better
way to declare static data. Among the advantages of modules is the ability
to freely mix data of various types, and to limit access to contained variables
through use of the ONLY and PRIVATE clauses.
The use of array syntax is not encouraged. Though compact
and concise, many compilers have trouble generating efficient code from
source written in this notation. A good general rule is that little or
no performance penalty will result from writing (or rewriting) loops containing
only a few statements in array syntax. Long, complicated loops containing
much work are best coded as explicitly indexed loops.
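As an illustration of this rule of thumb, the sketch below uses array syntax for simple initializations and an explicitly indexed loop for a body containing several statements. The program, its array names and its numeric values are purely illustrative.

```fortran
program loop_demo
   implicit none
   integer, parameter :: n = 4   ! Array length (illustrative)
   integer :: i                  ! Loop index
   real :: a(n), b(n), c(n)      ! Work arrays

   b = 1.0                       ! Short operations read well in array syntax
   c = 2.0

   do i = 1, n                   ! Longer computations: explicitly indexed loop
      a(i) = b(i) + c(i)
      if (a(i) > 2.5) then
         a(i) = a(i)*0.5
      end if
   end do

   print *, a
end program loop_demo
```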
All subroutines and functions will include an "implicit none"
statement. Thus all variables must be explicitly typed.
Each function, subroutine, or module will contain a header
immediately following the routine declaration. The purpose is to describe
what the code does, possibly referring to external documentation. The format
of the header will be:
! Purpose:
! <Say what the routine does>
!
! Method:
! <Describe the algorithm(s) used in the routine.>
! <Also include any applicable external references.>
!
! Author: <Who is primarily responsible for the code>
Inclusion of the "Method" portion is not required
when not applicable, such as a module which contains data but no subroutines.
Note also that the "Author" portion is expected to be filled in by hand,
*not* automatically by the cvs variable "$Author".
I/O statements which need to check an error condition will
use the "iostat=<integer variable>" construct instead of the outmoded
end= and err=. Note that a 0 value means success, a positive value means
an error has occurred, and a negative value means the end of record or
end of file was encountered.
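A minimal sketch of the iostat convention follows. It writes two records to a scratch file and reads them back until a negative status signals end of file. The unit number and program name are arbitrary choices made for this example.

```fortran
program iostat_demo
   implicit none
   integer, parameter :: lun = 10   ! Unit number (arbitrary choice)
   integer :: ios                   ! I/O status
   integer :: val                   ! Value read from file
   integer :: nread                 ! Number of records read

   open (unit=lun, status='scratch', form='formatted', iostat=ios)
   if (ios /= 0) stop 'iostat_demo: open failed'

   write (lun,*) 1
   write (lun,*) 2
   rewind (lun)

   nread = 0
   do
      read (lun,*,iostat=ios) val
      if (ios < 0) exit             ! End of file reached
      if (ios > 0) stop 'iostat_demo: read error'
      nread = nread + 1
   end do

   print *, 'records read: ', nread
   close (lun)
end program iostat_demo
```

The "stop" statements are acceptable here only because this is a standalone sketch rather than package code.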
All dummy arguments must include the INTENT clause in their
declaration. This is extremely valuable to someone reading the code, and
can be checked by compilers. An example is:
   subroutine sub1 (x, y, z)
      implicit none
      real(r8), intent(in)    :: x   ! Read only
      real(r8), intent(out)   :: y   ! Set by this routine
      real(r8), intent(inout) :: z   ! Read and updated
      y = x
      z = z + x
   end subroutine sub1
The term "package" in the following rules refers to a
routine or group of routines which takes a well-defined set of input and
produces a well-defined set of output. A package can be large, such as
a dynamics package, which computes large scale advection for a single timestep.
Or it could be relatively small, such as a parameterization to compute
the effects of gravity wave drag.
A package should refer only to its own modules and subprograms
and to those intrinsic functions included in the Fortran 90 standard. This
is crucial to attaining plug-compatibility. An exception to the rule might
occur when a given computation needs to be done in a consistent manner
throughout the model. Thus for example a package which requires saturation
vapor pressure would be allowed to call a generic routine used elsewhere
in the main model code to compute this quantity.
When exceptions to the above rule apply (i.e. routines are
required by a package which are not f90 intrinsics or part of the package
itself), the required routines which violate the rule must be specified
within the package.
A package shall provide separate setup and running procedures,
each with a single entry point. All initialization of time-invariant data
must be done in the setup procedure and these data must not be altered
by the running procedure. This distinction is important when the code is
being run in a multitasked environment. For example, constructs of the
following form will not work when they are multitasked:
   if (first) then
      first = .false.
      <set time-invariant values>
   end if
All communication with the package will be through the argument
list or namelist input. The point behind this rule is that packages should
not have to know details of the surrounding model data structures, or the names
of variables outside of the package. A notable exception to this rule is
model resolution parameters. The reason for the exception is to allow compile-time
array sizing inside the package. This is often important for efficiency.
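The following sketch shows one possible shape for a package with separate setup and running procedures that communicates only through its argument lists. The package name gw_drag, the routine names, and the trivial "physics" are all hypothetical. The module and driver are combined in one listing only for compactness.

```fortran
module gw_drag
   implicit none
   private
   public :: gw_init, gw_run

   real :: coef                    ! Time-invariant value, set once in gw_init

contains

   subroutine gw_init (coef_in)
      real, intent(in) :: coef_in  ! Value supplied by the calling model
      coef = coef_in
   end subroutine gw_init

   subroutine gw_run (u, tend)
      real, intent(in)  :: u       ! Input wind
      real, intent(out) :: tend    ! Output tendency
      tend = -coef*u               ! Reads coef but never modifies it
   end subroutine gw_run

end module gw_drag

program gw_demo
   use gw_drag, only: gw_init, gw_run
   implicit none
   real :: c, u, tend              ! Coefficient, wind and tendency

   c = 0.1
   u = 10.0
   call gw_init (c)                ! Called once, before any multitasked region
   call gw_run (u, tend)           ! No hidden first-call initialization
   print *, tend
end program gw_demo
```

Because gw_run never writes to module data, there is no "first call" state for concurrent tasks to race on.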
Precision. Parameterizations should not rely on vendor-supplied
flags to supply a default floating point precision or integer size. The
f90 "kind" feature should be used instead. For example, in CCM4, all routines
and modules USE a module named "precision" which defines:
   integer, parameter :: r8 = selected_real_kind(12)
   integer, parameter :: i8 = selected_int_kind(13)
Thus, any variable declared real(r8) will be
of sufficient size to maintain 12 decimal digits in its mantissa. Likewise,
integer variables declared integer(i8) will be able to represent an integer
of at least 13 decimal digits. Note that the names r8 and i8 defined above
are meant to reflect the size in bytes of variables which are subsequently
defined with that "kind" value.
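A self-contained sketch of such a module and its use is shown below. The driver program is an illustration only, and the values it prints depend on the compiler and platform.

```fortran
module precision
   implicit none
   integer, parameter :: r8 = selected_real_kind(12)   ! At least 12 decimal digits
   integer, parameter :: i8 = selected_int_kind(13)    ! At least 13 decimal digits
end module precision

program kind_demo
   use precision, only: r8, i8
   implicit none
   real(r8)    :: x     ! Real of the requested kind
   integer(i8) :: n     ! Integer of the requested kind

   x = 1.0_r8
   n = 1_i8
   print *, 'kind of real(r8):             ', kind(x)
   print *, 'spacing near 1.0 (epsilon):   ', epsilon(x)
   print *, 'decimal range of integer(i8): ', range(n)
end program kind_demo
```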
Bounds checking. All parameterizations must be able to run
when a compile-time and/or run-time array bounds checking option is enabled.
Thus, constructs of the following form are disallowed:
   real(r8) :: arr(1)
where "arr" is an input argument into which the
user wishes to index beyond 1. Use of the (*) construct in array dimensioning
to circumvent this problem is forbidden because it effectively disables
array bounds checking.
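One way to satisfy this rule is to pass the true extent through the argument list and use it in the declaration, as in the following sketch. The routine colsum and all variable names are hypothetical.

```fortran
program bounds_demo
   implicit none
   integer, parameter :: npts = 3   ! Number of points (illustrative)
   real :: vals(npts)               ! Data to be summed
   real :: vsum                     ! Result

   vals = 1.5
   call colsum (npts, vals, vsum)
   print *, 'sum = ', vsum

contains

   subroutine colsum (n, arr, total)
      integer, intent(in) :: n      ! Number of elements in arr
      real, intent(in)    :: arr(n) ! True extent, so bounds checking stays effective
      real, intent(out)   :: total  ! Sum of the elements of arr
      integer :: i                  ! Loop index

      total = 0.0
      do i = 1, n
         total = total + arr(i)
      end do
   end subroutine colsum

end program bounds_demo
```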
Error conditions. When an error condition occurs inside a
package, a message describing what went wrong will be printed. The name
of the routine in which the error occurred must be included. It is acceptable
to terminate execution within a package, but the developer may instead
wish to return an error flag through the argument list. If the user wishes
to terminate execution within the package, the generic CCM termination routine
"endrun" should be called instead of issuing a Fortran "stop". Otherwise
a message-passing version of the model could hang. Note that this is an
exception to the package coding rule that "A package should refer only
to its own modules and subprograms and to those intrinsic functions included
in the Fortran 90 standard".
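The sketch below illustrates the error-flag alternative. The routine pkg_run, its argument names and its error test are hypothetical. The "endrun" shown here is only a stand-in written for this example, since the model's own routine and its interface are not reproduced in this document.

```fortran
program error_demo
   implicit none
   real :: t                        ! Temperature passed to the package
   integer :: ierr                  ! Error flag returned by the package

   t = -1.0
   call pkg_run (t, ierr)
   if (ierr /= 0) call endrun

contains

   subroutine pkg_run (t, ierr)
      real, intent(in)     :: t     ! Temperature
      integer, intent(out) :: ierr  ! 0 on success, nonzero on error

      ierr = 0
      if (t <= 0.0) then
         write (*,*) 'pkg_run: non-positive temperature ', t
         ierr = 1
      end if
   end subroutine pkg_run

   subroutine endrun
      ! Stand-in for the model termination routine; the real one is not shown here.
      stop 'endrun called'
   end subroutine endrun

end program error_demo
```

The message names the routine in which the error occurred, and the decision to terminate is left to the caller.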
Inter-procedural code analysis. Use of a tool to diagnose
problems such as array size mismatches, type mismatches, variables which
are defined but not used, etc. is strongly encouraged. Flint is one such
tool which has proved valuable in this regard. It is not a strict rule
that all CCM4 code and packages must be "flint-free", but the developer
must be able to provide adequate explanation for why a given coding construct
should be retained even though it elicits a complaint from flint. If too
many complaints are issued, the diagnostic value of the tool diminishes.
The use of dynamic memory allocation is not discouraged because
we realize that there are many situations in which run-time array sizing
is desirable. However, this type of memory allocation can cause performance
problems on some machines, and some debuggers get confused when trying
to diagnose the contents of such variables. Therefore, dynamic memory allocation
is allowed only "when necessary". The ability to run a code at a different
spatial resolution without recompiling is not considered to be an adequate
reason to use dynamically allocated arrays.
The preferable mechanism for dynamic memory allocation
is automatic arrays, as opposed to ALLOCATABLE or POINTER arrays for which
memory must be explicitly allocated and deallocated. An example of an automatic
array declaration is:
   real :: a(n)
The same routine using an allocatable array would instead declare:
   real, allocatable :: a(:)
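For comparison, the two approaches are sketched side by side as complete routines. The routine names work_auto and work_alloc and the driver program are hypothetical.

```fortran
program alloc_demo
   implicit none
   integer :: npts               ! Requested array length

   npts = 5
   call work_auto (npts)
   call work_alloc (npts)

contains

   subroutine work_auto (n)
      integer, intent(in) :: n   ! Array length
      real :: a(n)               ! Automatic array: sized on entry, freed on return

      a = 1.0
      print *, sum(a)
   end subroutine work_auto

   subroutine work_alloc (n)
      integer, intent(in) :: n   ! Array length
      real, allocatable :: a(:)  ! Allocatable array: explicit allocate/deallocate
      integer :: ierr            ! Allocation status

      allocate (a(n), stat=ierr)
      if (ierr /= 0) stop 'work_alloc: allocate failed'
      a = 1.0
      print *, sum(a)
      deallocate (a)
   end subroutine work_alloc

end program alloc_demo
```

The automatic version needs no bookkeeping, which is one reason it is the preferred mechanism.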
Magic numbers should be avoided. Physical constants (e.g.
pi, gas constants) must NEVER be hardwired into the executable portion
of a code. Instead, a mnemonically named variable or parameter should be
set to the appropriate value, probably in the setup routine for the package.
We realize that many parameterizations rely on empirically derived constants
or fudge factors, which are not easy to name. In these cases it is not
forbidden to leave such factors coded as magic numbers buried in executable
code, but comments should be included referring to the source of the empirical
values.
Hard-coded numbers should never be passed through argument
lists. One good reason for this rule is that the availability of a compiler flag
which defines a default precision for constants cannot be guaranteed. Fortran 90 allows
specification of the precision of constants through the "_" compile-time
operator (e.g. 3.14_r8 or 365_i8). So if you insist on passing a constant
through an argument list, you must also include a precision specification.
If this is not done, a called routine which declares the resulting dummy
argument as, say, real(r8) or 8 bytes, will produce erroneous results if
the default floating point precision is 4 byte reals.
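A short sketch of these last two rules follows: a physical constant held in a named parameter, and a literal passed through an argument list with an explicit kind. The routine circle_area and the local definition of r8 are assumptions made to keep the example self-contained.

```fortran
program const_demo
   implicit none
   integer, parameter :: r8 = selected_real_kind(12)  ! As in the precision module
   real(r8), parameter :: pi = 3.14159265358979_r8    ! Named constant, set once
   real(r8) :: area                                   ! Result

   call circle_area (2.0_r8, area)    ! Literal carries an explicit kind
   print *, area

contains

   subroutine circle_area (radius, area)
      real(r8), intent(in)  :: radius ! Circle radius
      real(r8), intent(out) :: area   ! Circle area
      area = pi*radius*radius
   end subroutine circle_area

end program const_demo
```

Writing the actual argument as plain "2.0" would pass a default-kind real to a real(r8) dummy argument, which is the mismatch described above.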