
An Introduction to Atmospheric and Oceanographic Datasets


On a typical day 5,000 weather stations, 800-1100 upper-air stations (raob and pibal), 2,000 ships, 600 aircraft, several polar-orbiting satellites, five geostationary satellites and various other sources provide observations that are used by operational forecast centers to produce gridded analyses (see below for details). These gridded analyses represent a best estimate of the state of the atmosphere at a particular time. They are used as initial conditions for daily weather forecast models and are the datasets most commonly used to analyze atmospheric quantities and processes on large spatial scales.

Brief History of Operational Forecasts

The first numerical weather forecast by computer was made in 1950. This forecast was based upon a simple one-level model over a limited domain. Regular (operational) computer forecasts by the (then) U.S. Weather Bureau began in the mid-to-late 1950s. Initially, very idealized models which made 24 to 48 hour forecasts of 500 hPa geopotential heights were used. These models, called equivalent barotropic models, used finite differences over a limited domain. They allowed a quantity called geostrophic vorticity (which is proportional to the Laplacian of geopotential height) to be advected by the winds. These forecasts were useful but they could not predict the initiation or demise of this quantity. The next generation of operational models were called baroclinic models. These models were able to forecast vertical motion in addition to vorticity and, thus, were capable of forecasting cyclogenesis (i.e., the formation of cyclonic disturbances). As both computers and knowledge progressed, the operational forecast models evolved. Rather than use highly simplified models over limited areas, operational centers in the 1960s began using a simplified version of the Navier-Stokes equations, sometimes called the `primitive equations' (the main simplification being the hydrostatic approximation). Initially, due to computer and operational constraints, these equations were used only over one hemisphere, but soon they were used for global forecasts.
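
As a rough illustration of the quantity these early models advected, geostrophic vorticity can be estimated from a gridded height field with a five-point finite-difference Laplacian, zeta_g = (g/f) * Laplacian(Z). The sketch below is not taken from any operational model: the function name, the uniform Cartesian grid and the constant Coriolis parameter are simplifying assumptions made for illustration only.

```python
G = 9.80665   # standard gravity, m s^-2
F0 = 1.0e-4   # assumed constant mid-latitude Coriolis parameter, s^-1


def geostrophic_vorticity(z, dx, f=F0, g=G):
    """Return (g/f) * Laplacian(z) at the interior points of a 2-D grid.

    z  : list of rows of geopotential height in metres (hypothetical input)
    dx : grid spacing in metres, assumed equal in both directions
    Boundary points are left as None because the stencil needs 4 neighbours.
    """
    ny, nx = len(z), len(z[0])
    zeta = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Five-point stencil: sum of the four neighbours minus 4x the centre
            lap = (z[j][i + 1] + z[j][i - 1] + z[j + 1][i] + z[j - 1][i]
                   - 4.0 * z[j][i]) / dx ** 2
            zeta[j][i] = g / f * lap
    return zeta
```

A real model would work on a map projection or on the sphere and would let f vary with latitude; the constant-f Cartesian form is used here only to keep the example short.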

For a variety of reasons, many forecast centers changed from approximating the primitive equations with finite differences of data at grid points to exact differentials of spherical harmonics. They also changed from using pressure as a vertical coordinate to the `sigma' coordinate system. This quantity represents a transformed pressure coordinate and has several advantages. (It also has a few disadvantages.) The biggest advantage is that it is easier to deal with the lower boundary of the earth's surface because sigma levels approximately parallel the model's smoothed topography. Some recent model formulations use a hybrid pressure-sigma coordinate system where sigma is used in the troposphere and pressure in the stratosphere with a gradual transition in between.

Early hemispheric and global operational forecast models used horizontal resolutions of about 5 degrees. Today global operational models have horizontal grids as fine as about 0.6 degree (``T213'' Gaussian resolution in spherical harmonic jargon). The number of vertical levels used by the operational forecast models has increased significantly over time. In the early days, they included only five tropospheric levels. Currently, NCEP and ECMWF use 28 and 31 levels, respectively. These levels encompass both the troposphere and the lower stratosphere.
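
The link between a spectral truncation such as T213 and a grid spacing of about 0.6 degree can be sketched with the common "quadratic grid" rule of thumb, in which at least 3T+1 longitudes are used so that quadratic terms are not aliased. This is an illustrative calculation under that assumption, not a description of any particular center's grid. The function names are made up here, and rounding the longitude count up to an FFT-friendly size (prime factors 2, 3 and 5 only) is a convention that varies between models.

```python
def _is_fft_friendly(n):
    """True if n is even and has no prime factors other than 2, 3 and 5."""
    if n % 2:
        return False
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def quadratic_gaussian_grid(truncation):
    """Estimate (nlon, nlat) for a triangular truncation T<truncation>."""
    min_lon = 3 * truncation + 1
    nlon = min_lon
    while not _is_fft_friendly(nlon):
        nlon += 1
    nlat = (min_lon + 1) // 2      # ceil(min_lon / 2)
    if nlat % 2:
        nlat += 1                  # even count keeps the grid symmetric about the equator
    return nlon, nlat


def longitude_spacing_degrees(truncation):
    """Longitude spacing in degrees implied by the estimated grid."""
    nlon, _ = quadratic_gaussian_grid(truncation)
    return 360.0 / nlon
```

Under these assumptions T213 gives 640 longitudes, i.e. 360/640 = 0.5625 degree, which is consistent with the "about 0.6 degree" quoted above.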

Differences between Operational and Climate Models

The purpose of an operational forecast model is to predict the details of the weather out to, say, 10 days. Forecast models assume specific initial conditions (see next section) based upon the current state of the atmosphere-ocean system. Users of these models are interested in specific days and times (realizations). Climate models, which are more like boundary value problems, also simulate the details of weather but do so for much longer time periods (10-100+ years). Generally, climate models use lower horizontal and vertical resolutions than forecast models because climate simulations at high resolutions would be prohibitively expensive. Climate researchers (usually) are interested in the statistics of the climate simulations rather than specific realizations.

Historically, there were considerable differences between operational models and climate models. For instance, early primitive equations operational models did not include radiation because, over short time periods, radiation effects were thought to be small. However, radiation is of fundamental importance for climate models. Many other physical approximations also differed between the models. Currently, there is little difference in either the numerics or physics used in operational and climate models. In fact, it is now generally accepted that poor representations of physical processes are a source of systematic forecast error (so it is important to have a good model climate if you want to accurately forecast the detailed evolution of the atmosphere over several days).

The DSS has several datasets from non-NCAR climate model experiments located in ds318.0 (Table 6.3). A separate dataset (ds318.6; Table 6.4) contains three 100-year runs from the Max Planck Institute. All were part of EPA carbon dioxide studies. In addition, NCAR's Climate Modeling Section makes several simulations from the CCM2 available (see Chapter 11).

Data Assimilation and Analyzed Grids

[Most of this section has been taken from Trenberth and Solomon (1994) with permission.]

The process of establishing a dataset suitable for the initial conditions to an operational forecast model has been an integral part of the operational cycle since the first routine numerical forecasts. These datasets are the analyzed grids which have formed the basis for many atmospheric research studies. The procedure to develop these datasets has changed significantly over time to keep pace with model and computer improvements. Early analyzed grids were developed using a simple objective analysis method. Currently, they are produced using a four-dimensional (4-D) data assimilation system in which multivariate observed data are combined with a ``first guess'' using a statistically optimum scheme. The first guess is the best estimate of the current state of the atmosphere from previous analyses produced using the forecast model.
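
The idea of combining an observation with a first guess in a "statistically optimum" way can be illustrated for a single scalar. The sketch below is the textbook minimum-variance blend of two unbiased estimates with uncorrelated errors. It is not the scheme used by any operational center, and the function name and the numbers in the note that follows are hypothetical.

```python
def blend_first_guess(first_guess, obs, var_guess, var_obs):
    """Minimum-variance combination of a first guess and one observation.

    var_guess, var_obs : error variances of the two inputs (must be positive)
    Returns (analysis, analysis_variance).
    """
    if var_guess <= 0 or var_obs <= 0:
        raise ValueError("error variances must be positive")
    # Weight given to the observation: large when the first guess is uncertain
    weight = var_guess / (var_guess + var_obs)
    analysis = first_guess + weight * (obs - first_guess)
    # The analysis error variance is never larger than that of the first guess
    analysis_var = (1.0 - weight) * var_guess
    return analysis, analysis_var
```

For example, a 500 hPa height first guess of 5500 m and an observation of 5520 m with equal error variances give an analysis of 5510 m with half the error variance of either input. An operational multivariate scheme generalizes this to many variables and locations using error covariances.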

It must be emphasized that the operational analyses are performed under time constraints for weather forecasting purposes and not with the objective of providing a continuous picture of the atmosphere over time. Changes in the operational models, data handling techniques, the data available, initialization, and so on, which are implemented to improve the weather forecasts, may disrupt the continuity of the analyses. Some aspects, such as detailed analyses of the conditions at the surface of the earth, may be of less importance for weather forecasting while of great importance for diagnostic studies. Typically, the representation of orography in the operational models is greatly simplified (as it is in climate models).

The global atmospheric analyses produced as a result of four-dimensional data assimilation operationally consist of global fields of eastward and northward wind components (u, v), geopotential height (Z), virtual temperature (T), and relative humidity (RH) or, equivalently, specific humidity (q) as a function of pressure (p), latitude and longitude. In recent times, these quantities have been analyzed on the levels of the numerical weather prediction model used in the 4-D data assimilation to provide the first guess for the analyses. Generally, these are sigma levels where sigma = p/ps, and ps is the surface pressure defined on the model surface topography. Alternatively, the model levels are a hybrid between sigma and pressure coordinates, typically reverting to constant pressure above about 100 mb. Analyzed fields on standard constant pressure levels are produced by interpolation (e.g., at ECMWF by using tension splines or linearly). Actually, the changes in the analysis from one synoptic observation time to the next are interpolated to update the standard pressure level fields, although the details as to how this has been done have changed with time. Horizontal divergence and vertical motion (w = vertical p-velocity) fields are produced diagnostically from the analyses. Once the fields have been analyzed, they are typically initialized using a procedure called nonlinear normal mode initialization (NNMI) to bring the mass and temperature fields into a dynamical balance with the velocity fields consistent with the predominant low frequency motions in the atmosphere. Thus, analyzed gridded datasets may be "initialized" or "uninitialized". (In some studies it is important to know which type of analysis is being used.)
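
The relation sigma = p/ps and the interpolation from model levels to a standard pressure level can be sketched for a single column. This is a minimal illustration that assumes pure sigma levels and linear interpolation in log-pressure. It does not reproduce the tension-spline or incremental-update procedures mentioned above, and the function names are hypothetical.

```python
import math


def sigma_to_pressure(sigma_levels, surface_pressure):
    """Pressure on each sigma level, from sigma = p / ps."""
    return [s * surface_pressure for s in sigma_levels]


def interp_to_pressure(values, pressures, target_p):
    """Linearly interpolate a column in log(p) to one target pressure.

    values, pressures : same-length lists ordered from the surface upward,
                        with pressure strictly decreasing.
    Returns None if target_p lies outside the column (for example, a
    pressure level that is below the model surface).
    """
    for k in range(len(pressures) - 1):
        p_lo, p_hi = pressures[k], pressures[k + 1]   # p_lo > p_hi
        if p_hi <= target_p <= p_lo:
            w = ((math.log(p_lo) - math.log(target_p))
                 / (math.log(p_lo) - math.log(p_hi)))
            return values[k] + w * (values[k + 1] - values[k])
    return None
```

Returning None rather than extrapolating is a deliberate choice in this sketch: pressure levels that fall below the model topography have no model data, and how they are filled differs between centers.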

In addition to the standard analyzed variables, new global fields of various quantities are becoming available from satellite data and/or from the model itself. In some cases, the satellite products may be produced as a part of the four-dimensional data assimilation process such that some elements of the model and/or analyzed fields are used. Examples of possible new products include short-wave and long-wave radiation at the top of the atmosphere and at the surface, cloudiness, precipitable water, and cloud liquid water. Fields of soil moisture, snow cover, sea surface temperature, surface wind and wind stress, and fluxes of sensible and latent heat may be produced. Some of these will have much larger model components than others. All need to be validated. More comprehensive use of the model can result in estimates of precipitation, latent heating, and other diabatic heating fields throughout the atmosphere. Because of the importance of precipitation in the hydrological cycle and in agriculture, and of diabatic heating in driving the whole atmospheric circulation, there is considerable interest in these fields. It is therefore desirable to obtain as much information as possible about these fields and use physical constraints whenever possible to try to determine them more accurately.

Reanalysis Datasets

As previously discussed, there are several reasons why these analyzed grids have deficiencies that can limit their usefulness. Model numerics, physics, horizontal and vertical resolutions and other changes over time (Fig. 6.1) have introduced inhomogeneities into these analyzed grids (Fig. 6.2). To provide researchers with a relatively clean series of analyses which can be used to address a broad range of research topics, several operational and research organizations are establishing their own programs to reanalyze data for various time periods. The following organizations have established reanalysis projects: NCEP-NCAR, ECMWF, NASA GSFC, and the NRL (Monterey). Each uses its own unique model and data assimilation scheme to produce analyzed grids every 6 hours over assorted time spans. Table 6.1 provides a brief listing of the most frequently accessed Reanalysis datasets.

Unlike the model/assimilation schemes, which do not change, the observational data bases used as the basis for the analysis effort will change with time. These input data bases are similar in many respects but will differ in some way: e.g., NCEP-NCAR use a comprehensive set of quality controlled observations including the recovery of `lost' datasets; ECMWF uses direct assimilation of satellite radiances to improve the moisture analysis; NASA GSFC will include special satellite data; and NRL will use operationally available observations only. This heterogeneous mixture of models, assimilation methods and data will allow researchers to assess the degree of agreement among the final products which might be interpreted as a measure of reliability.

Initially, the reanalysis efforts focused upon a particular time period: 1979-93 for ECMWF (ERA-15); 1985-89 for NRL; 1985-90 for GSFC; and 1985-94 for NCEP. Subsequently, the time periods were expanded. For example, ECMWF's ERA-40 (http://www.ecmwf.int/research/era/) spans September 1957 to August 2002, and NCEP established an operational Climate Data Assimilation System (CDAS) that continues to produce routine reanalysis output. The CDAS uses the same model/assimilation scheme as the NCEP reanalysis effort but with a seven day delay to capture all possible operational data. All NCEP-NCAR reanalyzed data are in the DSS archive (ds090.0, ds090.1 and ds090.2). It is expected that NCEP reanalysis efforts will take place at regular intervals (perhaps, 5-10 years) with improved models and assimilation schemes.

Table 6.1
Frequently Accessed Reanalysis Datasets
Dataset  Group     Res Hor(Vert)        Contents                Period
ds090.0  NMC-NCAR  T62/2.5(28)          Basic Global            1948-pres
ds090.1  NMC-NCAR  T62/2.5(28)          8-day fcst              1948-pres
ds090.2  NMC-NCAR  T62/2.5(28)          Monthly Subset          1948-pres
ds115.0  ECMWF     2.5                  Global Sfc              1979-1993
ds115.0  ECMWF     2.5                  Global Upper Air        1979-1993
ds115.7  ECMWF     2.5                  Global Monthly          1979-1993
ds117.0  ECMWF     N80                  Sfc, Integrals          9/1957-8/2002
ds117.1  ECMWF     T159/N80(23 pres)    Upper Air               9/1957-8/2002
ds117.2  ECMWF     T159/N80(60 model)   Upper Air               9/1957-8/2002
ds118.0  ECMWF     2.5(23 pres)         Sfc, Integrals          9/1957-8/2002
ds118.1  ECMWF     2.5(23 pres)         Upper Air               9/1957-8/2002
ds119.0  ECMWF     N80                  Monthly Sfc, Integrals  9/1957-8/2002
ds119.1  ECMWF     T159(23 pres)        Monthly Upper Air       9/1957-8/2002
ds119.2  ECMWF     T159/N80(60 model)   Monthly Upper Air       9/1957-8/2002

Table 6.2
Major Gridded Analyses Available at NCAR
[* in the Variables column means many]
Dataset  Source        Grid      Region  Period     Update    Variables           Levels
ds060.0  NMC           47x51     NH      1959-72    12hrly    z,t,thick           sfc,tropo,strato
ds060.1  NMC           47x51     NH      1960-77    12hrly    z                   tropo (500mb)
ds061.0  NMC           47x51     NH      1964-80    12hrly    z,t                 strato
ds061.5  NMC           47x51     NH      1962-72    12hrly    *                   sfc,tropo,strato
ds061.6  NMC           47x51     NH      1962-63    12hrly    z,t,u,v             sfc,tropo,strato
ds062.0  NMC           47x51     NH      1967-71    12hrly    *                   sfc,tropo,strato
ds063.0  NMC           47x51     NH      1963-72    12hrly    p,t,u,v,rh          sfc,tropo,strato
ds065.0  NMC           47x51     NH      1958-72    12hrly    w,z,thick           tropo
ds066.0  NMC           65x65     NH,SH   1973-pres  12hrly    *,snow              sfc,tropo,strato
ds067.0  NMC           65x65     NH,SH   1981-pres  daily     z,t                 strato
ds069.0  NMC           LFM       NH      1971-91    12hrly    *                   sfc,tropo,strato
ds069.5  NMC           NEST      NH      1984-90    12hrly    *                   sfc,tropo
ds075.0  NMC           73x23     Trop    1968-90    12hrly    t,u,v,ff            tropo
ds080.0  NMC           144x37    NH,SH   1972-74    12hrly    z,t,u,v,rh          sfc,tropo,strato
ds082.0  NMC           145x37    NH,SH   1976-pres  12hrly    *                   sfc,tropo,strato
ds082.1  NMC           145x37    NH,SH   1976-pres  12hrly    *                   sfc,bound,1000
ds082.5  NMC           145x37    NH,SH   1991-pres  12hrly    *                   sfc,tropo,strato
ds084.0  NMC MRF       R30       Global  1990-pres  12hrly    z,v,t,u,v,rh        tropo,strato
ds084.2  NMC           T80,T126  Global  1990-pres  6hrly     div,vort,etc.       18 layers
ds084.5  NMC MRF       384x190   Global  1990-pres  6hrly     flux
ds090.0  NMC           T62       Global  1985+      6hrly     *                   sfc,tropo,strato
ds110.0  ECMWF WMO     144x73    Global  1980-89    12hrly    z,t,u,v,w,rh        tropo
ds110.1  ECMWF         144x72    Oceans  1980-86    monthly   wind stress         sfc
ds110.3  ECMWF WMO     144x73    Global  1978-89    lt means  u,v,t,z,w,q         tropo
ds111.0  ECMWF TOGA    T106      Global  1985-pres  6hrly     u,v,w,t,z,rh        tropo,strato
ds111.1  ECMWF TOGA    N80       Global  1985-pres  6hrly     pcp,tsoil,etc.      sfc
ds111.2  ECMWF TOGA    144x73    Global  1985-pres  12hrly    u,v,w,t,z,rh        sfc,tropo,strato
ds111.4  ECMWF TOGA    T106      Global  1990-91    6hrly     t,w,vort,div,q      19 (model) levels
ds111.5  ECMWF TOGA    144x73    Global  1985-92    m means   u,v,w,t,z,rh        sfc,tropo,strato
ds195.0  DSS/SCD       47x51     NH      1946-94    daily     slp,tsfc,u,v,z      sfc,tropo,strato
ds195.2  DSS/SCD       144x73    Global  1946-99    daily     z                   tropo
ds195.5  DSS/SCD       72x19     NH      1946-93    daily     slp,tsfc,sst,u,v,z  sfc,tropo
ds219.0  ECMWF         72x37     Global  1979-89    yrly      u,v,w,z,t,q,+       tropo,strato
ds277.0  NMC           various   Global  1982-93    wkly      sst                 sea level
ds277.1  NMC ODAS      112x81    Oceans  1991-94    wkly      u,v,t               ocn/atmo 27 levels
ds302.5  ECMWF         193x97    Global  1978-79    12hrly    w                   tropo,strato
ds306.0  NMC           73x37     Global  1979       12hrly    u,v,t,q             sfc, sigma
ds307.0  ECMWF FGGE    192x49    NH,SH   1978-79    12hrly    u,v,w,t,z,rh        sfc,tropo,strato
ds307.3  ECMWF FGGE    192x49    NH,SH   1979       6hrly     u,v,w,t,z,rh        sfc,tropo,strato
ds307.5  ECMWF FGGE    96x25     NH,SH   1978-79    12hrly    u,v,w,t,z,rh        sfc,tropo,strato
ds618.0  ECMWF AMEX    T106      Global  1987jan    12hrly    rh,vort,div         sfc,tropo,strato
ds673.0  NMC Nimbus-5  145x37    Global  1975                 ice,pcp             sfc
ds757.0  NMC           144x72    Global                       sfc elev            sfc
ds840.1  NOAA TDL      LFM       NH      1973-93    hourly    mdr

Table 6.3

Non-NCAR Climate Model Outputs for EPA CO2 Studies in a Common Format

Group         Resolution
UKMO          48x36       9    2.99
OSU           72x46       9    5.60
GFDL          48x40       22   7.07
GFDL Q-flux   48x40       22   7.07
GISS          36x24       25   5.24
GISS control  36x24       7    0.47
GISS Sc A     36x24       7    4.26
GISS Sc B     36x24       7    2.84
GFDL 1x CO2   48x40       22   31.65
GFDL 2x CO2   48x40       22   31.65
GFDL R30      96x80       22   27.63
CCC           96x48       6    5.73
1 Consult NCAR's Data Support Section for details.

Table 6.4

Non-NCAR Climate Model Output for EPA CO2 Studies
Three Max Planck Inst. 100 year runs

Group Resolution
MPI 64x32 7+114.7
1 Consult NCAR's Data Support Section for details.
