Hyperslab OPerator Suite: An object-oriented approach to geophysical data analysis

Definition of the "hyperslab data structure", and suite of operators
====================================================================
23 Sep 1997

One of the nice features of a large dataset stored in netCDF format is that one can choose to access only a portion of the dataset by reading a "hyperslab", i.e., a specified multi-dimensional subset of the whole dataset. Say a 1000-year 3-dimensional time series of daily atmospheric temperature values from a CSM integration is archived in a netCDF history file. One could choose to extract the hyperslab of temperature time-series only over the "warm pool" region in the Pacific, for example. One could then perform some statistical analyses on this hyperslab, and then display the results graphically.

It would be nice if this subdomain selection operation were applied not just to the data, which is fairly easy to do, but also to the meta-data describing the data, which is more difficult. If one could accomplish that, then one could avoid a lot of the "bookkeeping" required to keep track of the domain reduction operations performed on a dataset.

One way to accomplish this painlessly would be to take advantage of the "structured data types" available in high-level programming languages such as Yorick, IDL, MATLAB, C, or Fortran-90. We define a "hyperslab data structure", or simply, a "hyperslab", that has several representations. It may be a netCDF file, a Yorick structure, an IDL structure, or a C structure. However, exactly the same information would be contained in all these representations.

Using the standardized data structure, one can create a suite of "hyperslab operators" (in any of the high-level languages) that take one or more hyperslabs as arguments and return a hyperslab result, or perform some other operations such as graphically displaying the hyperslab data, or saving the data to a hyperslab netCDF file.
In particular, the last operation would allow hyperslab data structures to be exchanged between hyperslab operators written in different languages. The hyperslab operators would modify not only the data contained in the hyperslab data structure, but also the meta-data. Since the hyperslab can always be represented as a netCDF file, it could also easily be accessed from any graphics or data processing package, such as NCL (NCAR Command Language), that accepts netCDF input. If one also chooses to define the netCDF representation of a hyperslab in such a way as to conform to the "Cooperative Ocean-Atmosphere Research Data Service" (COARDS) conventions, then one can use packages such as GRADS for graphical display of hyperslab data that has been processed by IDL or Fortran-90 routines (say). Thus the hyperslab data format would allow easy interchange of data between programs written in different high-level languages, interpreted or otherwise, and with graphics packages that can read netCDF files. In principle, the netCDF format itself should be capable of allowing such flexibility. However, in practice, writing Yorick/IDL/MATLAB/C/Fortran-90 procedures to process any general netCDF file can be quite cumbersome, since one would need to deal with a variety of different ways (i.e., different dimension names, coordinate ordering, units etc.) of representing essentially the same information. To get around this difficulty, one can define a fairly restrictive data format, within the netCDF framework, and use filters to transform between this format and other formats. A hyperslab data structure is proposed below that assumes a staggered "rectangular" latitude-longitude-"height" grid. It is general enough to deal with atmospheric data on gaussian grids with pressure/sigma/hybrid level structure. It can also handle ocean data on a staggered grid with missing values and spatially variable depth. 
However, it is fairly restrictive in specifying names for the coordinate dimensions and restricting their ordering. It also incorporates area/volume elements to allow weighted spatial averaging operations, and spatial masking operations to select geographical regions.

An important difference between this data format and archival data formats, such as the Processor history tape format for CCM data, is that it does not assume a global domain for the data. The data domain may be restricted to a hyperslab portion of the original global data. Another difference is that the hyperslab data structure is primarily designed for representing data associated with a single variable. Different variables would usually need to be stored in different hyperslab data structures (except for the netCDF representation, where variables with common horizontal dimensions may be saved in one file).

The hyperslab data structure is not designed for long term archival of global domain data, but only for storing intermediate results between data processing steps. Filters should be used to translate archived history file data into the hyperslab data structure format as and when needed. These filters could transpose/reverse data dimensions to conform to the stricter coordinate orderings required by the hyperslab data structure.

The hyperslab data structure is extensible. The basic data structure only defines the "guaranteed minimum" of fields (structure members) available. There could be a number of other fields available in the hyperslab. A hyperslab operator that returns a hyperslab result should copy all the fields of its highest-dimensional (if none, then the leftmost) hyperslab operand to its result. This would ensure that any extra fields in the hyperslab data structure (i.e., besides those defined by the basic format) will be inherited by the result of the operator.
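
The inheritance rule above can be illustrated with a minimal Python sketch. It is not part of the operator suite: a hyperslab is modeled here as a plain dictionary, and the names `pick_template`, `apply_operator`, and the `dims` and `my_extra` fields are hypothetical, chosen only for illustration.

```python
from copy import deepcopy


def pick_template(*operands):
    """Return the operand whose fields the result should inherit.

    The highest-dimensional operand wins. max() returns the first
    maximal item, so a tie falls to the leftmost operand.
    """
    return max(operands, key=lambda slab: len(slab["dims"]))


def apply_operator(func, *operands):
    """Apply func to the operands' data and inherit all other fields."""
    result = deepcopy(pick_template(*operands))   # copies extra fields too
    result["data"] = func(*(slab["data"] for slab in operands))
    return result


# A 2-D field with an extra (non-standard) field, and a scalar offset.
field = {"dims": ("x", "y"), "data": [[1.0, 2.0], [3.0, 4.0]],
         "units": "K", "my_extra": "kept"}
offset = {"dims": (), "data": 0.5, "units": "K"}

# The scalar is the leftmost operand, but the 2-D field is the template.
shifted = apply_operator(
    lambda off, fld: [[value + off for value in row] for row in fld],
    offset, field)
```

Because the template is deep-copied before the data is replaced, any field the operator does not know about (here `my_extra`) is carried into the result unchanged.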
The hyperslab data structure allows up to five data dimensions: three spatial dimensions, "xyz", the time dimension "t", and an index dimension "i". The ordering of the dimensions is assumed to be "xyzti" (Fortran-style). In the netCDF representation, the "t" dimension may be unlimited, provided the "i" dimension is not present in the data. The index dimension allows for representing different cases, ensemble members etc. Any (or all) of the five dimensions may be missing from the data. If all dimensions are missing, then the data corresponds to a scalar.

In addition to the regular spatial grid, denoted as "xyz", the hyperslab data structure also allows for a staggered spatial grid, denoted as "XYZ", corresponding to grid-box edges or interfaces. Typically, there would be one extra point on the interfacial grid in each dimension, when compared to the regular grid. (There are, however, exceptions to this.)

X coordinate values are assumed to be positive in the eastward direction, although the coordinate values themselves may be in ascending or descending order. The recommended units are "degrees_east" and the recommended name is "longitude".

Y coordinate values are assumed to be positive in the northward direction, although the coordinate values themselves may be in ascending or descending order. The recommended units are "degrees_north" and the recommended name is "latitude". (Other units may be used, but they may not always be recognized by the hyperslab plotting operators, and the hyperslab may not conform to the COARDS conventions if the units represent transformed coordinates.)

Z coordinate values (depth/pressure) may be ordered either upward or downward, and a flag is used to determine which is the case.

==============================================================================

Definition of the basic hyperslab data structure
------------------------------------------------

A hyperslab may be represented as 1. a netCDF file, 2. a Yorick structure, 3.
an IDL structure, or 4. a C structure, or 5. a Fortran-90 structure.

All scalar components (real/integer values) of a hyperslab are assumed to be defined in all representations of the hyperslab. For non-scalar variables, such as arrays (or strings, which are just character arrays), it is useful to introduce the notion of a "null" value, which may be
  1. a non-existent variable in a netCDF file
  2. a void ([]) value in Yorick
  3. a null string ("") in IDL, or
  4. a NULL pointer value in C, or
  5. a NULL pointer value in Fortran-90.
If a non-scalar variable has a "null" value, it is assumed to be undefined.

Hyperslab components
--------------------

Like a netCDF file, a hyperslab has dimensions, variables, and attributes:

1. DIMENSIONS:

   a) DYNAMIC DIMENSIONS: These are the actual coordinate dimensions for the current subdomain of the hyperslab and may change in length during the life of the hyperslab. The standard dynamic dimensions are: X, Y, Z, TIME, and ILABEL, with X/Y/Z corresponding to the regular or interfacial spatial grid, depending upon the context, and ILABEL corresponding to the "index" dimension. If any of these dimensions are not defined, they will be referred to as "undefined" dynamic dimensions. (The netCDF representation of the hyperslab may include the additional dynamic dimensions XINT, YINT, and ZINT to represent the interfacial spatial grid, as distinct from the regular grid dimensions, which are always named X, Y, Z in the netCDF representation. This allows variables on both types of grids to be stored together in the same netCDF file.)

   b) STATIC DIMENSIONS: These dimensions will not change in length during the life of the hyperslab (although they may be deleted or re-introduced.)

      i) STATIC COORDINATE DIMENSIONS: These correspond to the full (original) grid of the data. The standard static coordinate dimensions are: X0, Y0, Z0, ILABEL0, XINT0, YINT0, ZINT0.
         (Since the original time dimension could be arbitrarily long, there is no static dimension named TIME0, by convention.)

      ii) STATIC EXTENDED DIMENSIONS: These are other static dimensions associated with extensions to the hyperslab data format, e.g., the number of sigma coefficients with a hybrid vertical coordinate.

2. VARIABLES:

   a) DYNAMIC VARIABLES: These variables have dynamic dimensions.

      i) PRIMARY DATA VARIABLE: This corresponds to the actual data array, and is referred to by the actual data variable name in the netCDF representation and by the name "data" in other hyperslab representations. The dimensions of the data variable would be a subset of the defined dynamic dimensions. Those defined dynamic dimensions that are not present in the data will be referred to as "reduced" dynamic dimensions, which means that they were originally present in the data, but were eliminated by a rank-reduction/slicing operation. (The reason why these reduced dimensions are not immediately deleted is that it is useful to retain the information contained in the coordinate variables associated with these dimensions.) The data variable may be of type "float", "double", or "complex". (In the netCDF representation, the data variable contains only the real part. The imaginary part is saved in another variable whose name has the suffix "_im" attached to it.)

      ii) SECONDARY DATA VARIABLES: These variables contain information like the area/volume weights associated with the data, or coordinate information like bottom Z values (e.g., surface pressure). There are two secondary variables, "area_wt" and "z_bot", corresponding to these. Their dimensions would always be a subset of the dimensions of the primary data variable.

      iii) PRIMARY COORDINATE VARIABLES: These are one-dimensional and have the same names as the dynamic dimensions and contain the coordinate values corresponding to the dimensions. They are all of type "double", except ILABEL, which contains character strings.
         They should always be defined if the corresponding dynamic dimension is defined.

      iv) SECONDARY COORDINATE VARIABLES: These are optional one-dimensional variables with dynamic dimensions, e.g., DATE contains the date values corresponding to the time values, and IPARAM contains parameters for the i-dimension.

   b) STATIC VARIABLES: These variables have static dimensions.

      i) STATIC COORDINATE VARIABLES: These are one-dimensional and have the same names as the static dimensions and contain the coordinate values corresponding to the dimensions. They are all of type "double", except ILABEL0, which contains character strings. They should always be defined if the corresponding static dimension is defined.

      ii) STATIC EXTENDED VARIABLES: These are other static variables associated with extensions to the hyperslab data format, and have static dimensions.

3. ATTRIBUTES: Although netCDF allows attributes to be arrays, the hyperslab format restricts attributes to have scalar values of type "long", "double", or "string".

   a) GLOBAL ATTRIBUTES: These are not associated with any variable.

   b) VARIABLE ATTRIBUTES: These are associated with a variable.

Hyperslab representations
-------------------------

The following defines different representations of the basic hyperslab data structure.

1. netCDF Common Data-form Language representation
--------------------------------------------------

(NOTE: The netCDF representation allows for multiple variables VAR1, VAR2, ... to be stored in the same netCDF file; all other representations deal only with one variable at a time.)
netcdf hyperslab {  // CDL notation for a hyperslab

dimensions:
        nchar = NC,     // usually 128
        x = NX, x0 = NX0, xint0 = NXINT0,
        y = NY, y0 = NY0, yint0 = NYINT0,
        z = NZ, z0 = NZ0, zint0 = NZINT0,
        time = NT,      // UNLIMITED, if the "i" dimension is absent
        ilabel = NI, ilabel0 = NI0;

variables:

// Variables describing the hyperslab
// (A dimension variable may be absent, if the corresponding dimension
// was not present even in the original history file, i.e., prior to slicing,
// rank-reduction operations.)

// Variable attributes are optional, except for the following:
// x:units, x:subdomain, x:lower_bound, x:upper_bound, x:grid,
// y:units, y:subdomain, y:lower_bound, y:upper_bound, y:grid,
// z:units, z:subdomain, z:lower_bound, z:upper_bound, z:grid,
// time:units, time:subdomain,
// ilabel:subdomain,
// data:original_dims, data:reduction_ops, data:name,
// data:case_name

// Dimension variables
        double x(x);
                x:long_name = "longitude" / ... ;
                x:units = "degrees_east" / ... ;
                x:subdomain = -1/0/1/...;
                x:lower_bound = x_min;
                x:upper_bound = x_max;
                x:grid = "regular" / "interfacial";
                x:period = 360. / ... ;
                x:rotated = 0/1/...;
        double y(y);
                y:long_name = "latitude" / ... ;
                y:units = "degrees_north" / ... ;
                y:subdomain = -1/0/1/...;
                y:lower_bound = y_min;
                y:upper_bound = y_max;
                y:grid = "regular" / "interfacial";
        double z(z);
                z:long_name = "depth" / "height" / "pressure" / "sigma" / "hybrid_sigma_pressure" / "potential_temperature" ;
                z:units = "m" / "Pa" / "sigma_level" / "K";
                z:subdomain = -1/0/1/...;
                z:lower_bound = z_min;
                z:upper_bound = z_max;
                z:grid = "regular" / "interfacial";
                z:positive = "down" / "up";
        double time(time);
                time:long_name = "time" ;
                time:units = "day" / "hour" / "minute" / "second";
                time:subdomain = -1/0;
                time:days_per_year = DAYS_PER_YEAR;
        char ilabel(ilabel, nchar);
                ilabel:long_name = "" ;
                ilabel:subdomain = -1/0/1/...;

// Auxiliary dimension variables (optional)
        double date(time);      // date values (yyyymmdd.)
                date:long_name = "Date (yyyymmdd.)" ;
        double iparam(ilabel);  // parameter values

// Full spatial domain dimension variables ("invariant dimensions", optional)
        double x0(x0);
                x0:long_name = "Full domain regular X grid" ;
        double y0(y0);
                y0:long_name = "Full domain regular Y grid" ;
        double z0(z0);
                z0:long_name = "Full domain regular Z grid" ;
        double xint0(xint0);
                xint0:long_name = "Full domain interfacial X grid" ;
        double yint0(yint0);
                yint0:long_name = "Full domain interfacial Y grid" ;
        double zint0(zint0);
                zint0:long_name = "Full domain interfacial Z grid" ;
        char ilabel0(ilabel0, nchar);
                ilabel0:long_name = "Full domain I-labels" ;
        double iparam0(ilabel0);
                iparam0:long_name = "Full domain I-parameters" ;

// Primary data variable(s)
        float VAR1(VAR1_DIMENSIONS);    // variable (may also be of double type)
                VAR1:original_dims = "x,y,z,time,ilabel" ;
                VAR1:reduction_ops = "avg,,,," ;
                VAR1:area_wt_var = "AREA_WT_VARIABLE_NAME" ;
                VAR1:z_bot_var = "Z_BOT_VARIABLE_NAME" ;
                VAR1:missing_value = VAR1_MISSING_VALUE ;
                VAR1:long_name = "VAR1_LONG_NAME" ;
                VAR1:units = "VAR1_UNITS" ;
                VAR1:time_rep = "instantaneous" / "averaged" / "minimum_val" / "maximum_val" ;
                VAR1:legend = "PLOT_LEGEND" ;
                VAR1:history = "PROCESSING_HISTORY" ;
                VAR1:C_format = "C_format" ;
                VAR1:FORTRAN_format = "FORTRAN_format" ;
        float VAR1_im(VAR1_DIMENSIONS); // imaginary part of complex data (optional)
                VAR1_im:missing_value = VAR1_MISSING_VALUE;
        float VAR2(VAR2_DIMENSIONS);    // Another variable ...

// Auxiliary data variables (optional)
        float VAR1_area_wt(SUBSET_OF_DATA_DIMENSIONS);  // must have same precision as VAR1
                VAR1_area_wt:long_name = "Area/volume/mass element in horizontal plane" ;
                VAR1_area_wt:units = "m^2" / "m^2 m" / "m^2 Pa" ;
                VAR1_area_wt:dims = "x,y,,," ;
                VAR1_area_wt:elements = "dxdy" / "dxdydz" ;
        float VAR1_z_bot(SUBSET_OF_DATA_DIMENSIONS);    // must have same precision as VAR1
                VAR1_z_bot:long_name = "bottom_depth" / "surface_pressure" ;
                VAR1_z_bot:units = "m" / "Pa" ;
                VAR1_z_bot:dims = "x,y,,t," ;
                VAR1_z_bot:ref = REFERENCE_VALUE_OF_Z_BOT ;

// global attributes:
                :hor_subdomain = "HORIZONTAL_SUBDOMAIN" ;
                :ver_subdomain = "VERTICAL_SUBDOMAIN" ;
                :case_name = "CASE_NAME" ;
                :case_title = "CASE_TITLE" ;
                :original_file = "ORIGINAL_HISTORY_FILE" ;
                :resolution = "" / "T41L18" / "D02L45" / ... ;
                :data_source = "" / "CCM3" / "NCOM" / ... ;
                :data_URL = "http://..." ;
                :structure = "HYPERSLAB_..." ;
                :hyperslab_vars = "VAR1,VAR2,..." ;
                :format_URL = "http://www.cgd.ucar.edu/gds/svn/hyperslab.html" ;
                :Conventions = "COARDS" ;       // COARDS-conformant provided x:units,
                                                // y:units are convertible to degrees
                                                // longitude/latitude
};

[NOTE: If any of the attributes ":hor_subdomain", ":ver_subdomain", ":case_name", ":case_title", ":original_file", ":resolution", ":data_source", ":data_URL" are not identically applicable to all variables, then they may be stored as attributes of individual variables. E.g., "VAR1:case_name", "VAR2:case_name", ...]

2. Yorick structure representation
----------------------------------

(The structure member DATA corresponds to one of the variables (VAR1/VAR2/...) in the netCDF representation of the hyperslab.)

struct hyperslab {      // Yorick notation for a hyperslab

  // Mandatory variables describing the hyperslab
  // (A dimension variable may be null, if the corresponding dimension
  // was not present even in the original history file, i.e., prior to slicing,
  // rank-reduction operations.)

  // SDIM=5: no.
  // of standard dimensions
  // NATT=56: total number of attributes
  // NIATT=7, NFATT=8, NSATT=41: number of integer, float, string attributes

  string structure;             // Structure name ("HYPERSLAB...")
  pointer x, y, z, time;        // pointers to 1-D double arrays
  pointer ilabel;               // pointer to a 1-D string array
  pointer data;                 // pointer to a 5-D float/double array
  pointer missing_value;        // pointer to a scalar of same type as
                                // the data
  string name, long_name, units;
  string *attlist(3,NATT);      // Attribute list("var","nam","var:nam")
  long *attcode(2,NATT);        // Attribute codes (type(1-3),index)
  long *iatt(NIATT);            // Integer attributes
  double *fatt(NFATT);          // Float attributes
  string *satt(NSATT);          // String attributes
  string type(3);               // DATA/AREA_WT/Z_BOT type string
  long dimension(SDIM,3);       // DATA/AREA_WT/Z_BOT dimensions
  long reduced(SDIM);           // DATA rank-reduction codes

  // Optional variables describing the hyperslab (may be null)
  pointer area_wt, z_bot;       // pointers to 5-D float/double arrays
  pointer date;                 // pointer to a 1-D double array
  pointer iparam;               // pointer to a 1-D double array

  // Optional variables describing the full spatial domain grid
  pointer x0, y0, z0;           // pointers to 1-D double arrays
  pointer xint0, yint0, zint0;  // pointers to 1-D double arrays
  pointer ilabel0;              // pointer to a 1-D string array
}

3. IDL structure representation
-------------------------------

(The structure member DATA corresponds to one of the variables (VAR1/VAR2/...) in the netCDF representation of the hyperslab.)

;; IDL notation for a hyperslab (anonymous structure type):
;; SDIM=5: no. of standard dimensions
;; NATT=56: total number of attributes
;; NIATT=7, NFATT=8, NSATT=41: number of integer, float, string attributes
;; NX', ...
;; denote either the corresponding dimension length, or unit-length
{ $
  structure: "HYPERSLAB...", $
  x: dblarr(NX), $
  y: dblarr(NY), $
  z: dblarr(NZ), $
  time: dblarr(NT), $
  date: dblarr(NT), $
  ilabel: strarr(NI), $
  iparam: dblarr(NI), $
  x0: dblarr(NX0), $
  y0: dblarr(NY0), $
  z0: dblarr(NZ0), $
  xint0: dblarr(NXINT0), $
  yint0: dblarr(NYINT0), $
  zint0: dblarr(NZINT0), $
  ilabel0: strarr(NI0), $
  iparam0: dblarr(NI0), $
  attlist: strarr(3,NATT), $
  attcode: lonarr(2,NATT), $
  iatt: lonarr(NIATT), $
  fatt: dblarr(NFATT), $
  satt: strarr(NSATT), $
  type: strarr(3), $
  dimension: lonarr(SDIM,3), $
  reduced: lonarr(SDIM), $
  area_wt: fltarr(NX',NY',NZ',NT',NI'), $
  z_bot: fltarr(NX',NY',NZ',NT',NI'), $
  data: fltarr(NX',NY',NZ',NT',NI'), $
  missing_value: float(MISSING_VALUE), $
  name: "DATA_NAME", $
  long_name: "DATA_LONG_NAME", $
  units: "DATA_UNITS" $
}

4. C structure representation
-----------------------------

(The structure member DATA corresponds to one of the variables (VAR1/VAR2/...) in the netCDF representation of the hyperslab.)

struct hyperslab {      /* C notation for a hyperslab */
  char *structure;
  long nx, ny, nz, ni, nt;
  double *x, *y, *z, *time, *date, *iparam;
  char **ilabel;
  long nx0, ny0, nz0, nxint0, nyint0, nzint0, ni0;
  double *x0, *y0, *z0;
  double *xint0, *yint0, *zint0;
  char **ilabel0;
  double *iparam0;
  float *area_wt, *z_bot;       /* must have same precision as data */
  float *data, *missing_value;  /* may also be of double type */
  char *name, *long_name, *units;
  char *attlist[NATT][3];
  long attcode[NATT][2];
  long iatt[NIATT];
  double fatt[NFATT];
  char *satt[NSATT];
  char *type[3];
  long dimension[3][SDIM];
  long reduced[SDIM];
};

5.
Fortran-90 structure representation
--------------------------------------

(to be defined)

EXTENDED HYPERSLAB REPRESENTATIONS
==================================

The standard hyperslab format allows for up to five "variable" dimensions, "x, y, z, t, ilabel", and several "invariant dimensions", i.e., "x0, y0, z0, xint0, yint0, zint0, ilabel0" to represent the full spatial domain grid. Also, a number of standard variable names and attributes are recognized.

Two kinds of extensions are allowed to the standard hyperslab format:

(i) "soft" extensions, where new attributes are introduced, but no new variables are added. This extension may be accomplished very easily, simply by invoking the hyperslab copy operator with a list of the added attributes (and their types). No code changes are necessary. Further copy/write/read operations with the extended hyperslab will simply pass on the added attributes.

(ii) "hard" extensions, where new full spatial domain dimension variables ("invariant dimensions") are introduced. This extension would require minor code changes to the copy operator, and the initialization/creation operators. However, almost all other operators would be unaffected.

(Introducing new dimension variables would require extensive revision of the code.)

The following is a list of "hard" extensions to the basic hyperslab data structure. The variable STRUCTURE contains the extension name.
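
Since the STRUCTURE string names the extensions in use, an operator could inspect it before touching any extension fields. The following Python sketch assumes, based only on the example names that appear in this document (such as "HYPERSLAB1.0_SPH_SIG_ATM"), that the string consists of the prefix "HYPERSLAB" and a version number, followed by underscore-separated extension tags. That layout is an inference, not a stated rule, and the function name `parse_structure_name` is hypothetical.

```python
def parse_structure_name(structure):
    """Split a STRUCTURE string into (version, extension_tags).

    Assumed layout: "HYPERSLAB<version>_<TAG>_<TAG>...".
    """
    prefix = "HYPERSLAB"
    head, *tags = structure.split("_")
    if not head.startswith(prefix):
        raise ValueError("not a hyperslab structure name: %r" % structure)
    version = head[len(prefix):]
    return version, tags


version, tags = parse_structure_name("HYPERSLAB1.0_SPH_SIG_ATM")
has_sigma = "SIG" in tags   # sigma-coordinate fields may be expected
```

An operator that needs, say, the sigma coefficients could check for the "SIG" tag first and fall back to basic-format behaviour when it is absent.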
(netCDF representation)

HARD-EXTENSIONS
---------------

a) Spherical geometry extension (structure = "HYPERSLAB1.0_SPH_...")

   //variables:
        double a0;
                a0:long_name = "planetary radius" ;
                a0:units = "m" ;
        double eqdx0(x0);
                eqdx0:long_name = "Regular X grid interval at equator" ;
                eqdx0:units = "m";
        double cosdy0(y0);
                cosdy0:long_name = "Regular Y grid interval (including cosine-latitude factor)" ;
                cosdy0:units = "m";
        double eqdxint0(x0);
                eqdxint0:long_name = "Interfacial X grid interval at equator" ;
                eqdxint0:units = "m";
        double cosdyint0(yint0);
                cosdyint0:long_name = "Interfacial Y grid interval (including cosine-latitude factor)" ;
                cosdyint0:units = "m";

   // variable attributes:
                x:long_name = "longitude" ;
                x:units = "degrees_east" ;
                y:long_name = "latitude" ;
                y:units = "degrees_north" ;

   // global attributes:
                :structure = "HYPERSLAB1.0_SPH_..." ;

b) Sigma coordinate extension (structure = "HYPERSLAB1.0_..._SIG_...")
--------------------------------------------------------------

   //dimensions:
        sigma_coefs = 2,

   //variables:
        double sigma0(z0, sigma_coefs);
                sigma0:long_name = "z (A), sigma (B) regular grid coefficients" ;
        double sigmaint0(zint0, sigma_coefs);
                sigmaint0:long_name = "z (A), sigma (B) interfacial grid coefficients" ;

   // global attributes:
                :structure = "HYPERSLAB1.0_..._SIG_..."
                ;

c) Atmospheric data extension (structure = "HYPERSLAB1.0_SPH_SIG_ATM")
--------------------------------------------------------------

   //variables:
        byte hgrid0(y0,x0);
                hgrid0:long_name = "Grid-point type code: 0 (ocean), 1 (land), 2 (sea ice)" ;

   // variable attributes:
                z:bot_var = "PS" ;

   // global attributes:
                :structure = "HYPERSLAB1.0_SPH_SIG_ATM" ;

d) Oceanic data extension (structure = "HYPERSLAB1.0_SPH_SIG_OCN...")
----------------------------------------------------------

   //variables:
        byte kmax0(y0,x0);
                kmax0:long_name = "Ocean depth index (>=0)" ;

   // variable attributes:
                z:long_name = "depth" ;
                z:units = "m" ;
                z:positive = "down" ;
                z_bot:long_name = "bottom_depth" ;
                z_bot:units = "m" ;

   // global attributes:
                :structure = "HYPERSLAB1.0_SPH_SIG_OCN" ;

e) Spectral Spherical Harmonic extension (structure = "HYPERSLAB1.0_SSH_...")

   //variables:
        double a0;
                a0:long_name = "planetary radius" ;
                a0:units = "m" ;

   // variable attributes:
                x:long_name = "m" ;
                x:units = "" ;
                y:long_name = "n-m" ;
                y:units = "" ;

   // global attributes:
                :structure = "HYPERSLAB1.0_SSH_..." ;
                :trunc_k= 42;

SOFT-EXTENSIONS
---------------

The preferred forms for the time units string are: "second", "minute", "hour", "day", "week", "month", "year"

a) Truncated triangular spectral gaussian grid extension to HYPERSLAB1.0_SPH_...
   format

   // global attributes:
                :trunc_n= 42;
                :trunc_m= 42;
                :trunc_k= 42;

b) Monthly data

   ilabel= [ "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" ]
   (or any cyclic permutation of the above)

   // variable attributes:
                ilabel:long_name = "calendar month" ;

c) Lagged regression (always with respect to a dimensionless normalized index, i.e., with zero mean and unit variance)

   // variable attributes:
                time:long_name = "Lag time" ;
                time:lag_index = "INDEX_VARIABLE_NAME" ;
                time:sample_count = TOTAL_NO_OF_TIME_SAMPLES_AT_ZERO_LAG ;

d) Time-filtered data

   // variable attributes:
                time:filter_name = "name of time filter" ;
                time:filter_points = NO_OF_POINTS_IN_FILTER ;
                time:min_frequency = MINIMUM_FREQUENCY ;
                time:max_frequency = MAXIMUM_FREQUENCY ;

e) Frequency spectra (real or complex)

   // variable attributes:
                time:long_name = "frequency" ;
                time:units = "second-1"/.../"year-1" ;
                time:sample_count = TOTAL_NO_OF_TIME_SAMPLES ;
   (NOTE: "frequency" = 1/time_interval, without factors of 2*Pi.)

   (optional)
                time:taper_window = "NAME_OF_TAPERING_WINDOW" ;
                time:bin_size = BIN_SIZE ;
                time:degrees_of_freedom = EFFECTIVE_DEGREES_OF_FREEDOM_PER_BIN ;
                time:lag_one_autocorrelation = LAG_ONE_AUTOCORRELATION ;

f) Wavenumber spectra (real or complex)

                x/y/z:long_name = "wavenumber" ;
                x/y/z:units = "m-1"/... ;
   (NOTE: "wavenumber" = 1/length_interval, without factors of 2*Pi.)

g) Empirical orthogonal functions (EOFs) for atmospheric/oceanic data:

   Total variance is the area/volume weighted sum of variance over all spatial grid-points, i.e., the trace of the covariance matrix. (Variance at each grid-point is computed by dividing by the number of time samples, so that it is "independent" of the number of samples.)

   Normalization for the EOF spatial patterns: area/volume weighted squared sum over the spatial dimensions equals the EOF variance (= fractional EOF variance times total variance).
   (The sum of all EOF variances would equal the total variance)

   // variable attributes:
                ilabel:long_name = "EOF number" ;
                iparam:long_name = "EOF variance fraction" ;
                iparam:total_variance = TOTAL_VARIANCE ;
                iparam:sample_count = TOTAL_NO_OF_T_AND_I_DIMENSION_SAMPLES ;

   Usual normalization for the principal component time series: unit variance (squared average over time equals unity).

   // variable attributes:
                ilabel:long_name = "EOF number" ;
                iparam:long_name = "PC variance fraction" ;
                iparam:total_variance = TOTAL_VARIANCE ;

   Unusual normalization for the principal component time series: variance is the ratio w.r.t. original EOF variance.

   // variable attributes:
                ilabel:long_name = "EOF number" ;
                iparam:long_name = "EOF variance fraction" ;
                iparam:total_variance = TOTAL_VARIANCE ;

NOTES
=====

(Yorick-style array notation will be used below, which is similar to the Fortran style, with indices starting at 1, and the leftmost dimension corresponding to the fastest varying index. Additionally, index 0 will denote the last element of the array, index -1 the last but one element and so on.)

A. COORDINATE ATTRIBUTES
========================

The standard dimension coordinate variables are X, Y, Z, T, and ILABEL, corresponding to the "xyzti" dimensions.

A1. X, Y, Z refer to grid-point coordinate values in the current subdomain. X, Y, Z are required to be defined if the data contains (or used to contain before slicing/rank-reduction) the corresponding dimension.

A2. dim:GRID, where dim = X/Y/Z contains one of the strings "regular" or "interfacial", corresponding to regular or interfacial grids.

A3.
dim:SUBDOMAIN, where dim = X/Y/Z/TIME/ILABEL, is an attribute of each standard dimension variable and contains one of the following integer values:

       0   -> the dimension, if present, extends over the full spatial domain
      -1   -> the dimension is/was restricted to a non-contiguous ("irregular") subdomain of the full spatial domain
      >= 1 -> the dimension is/was restricted to a contiguous subdomain of the full spatial domain. If the dimension is present, this value contains the Fortran-style starting index of the location of the subdomain within the full spatial domain regular or interfacial grid (X0 or XINT0, ...), depending on the dim:GRID attribute.

    Note:
    a) By convention, TIME:SUBDOMAIN is always set to either 0 or -1. (This is done because time-subdomains are often non-contiguous, and the full domain time-coordinate values are not stored in the hyperslab.)
    b) Coordinate transformation operators (e.g., model level to pressure level interpolation operator) should either (i) set dim:SUBDOMAIN to 1 for the transformed dimension, and re-define the full domain coordinate arrays (e.g., Z0, ZINT0), or (ii) set dim:SUBDOMAIN to -1, and set the full domain coordinate arrays to null.
    c) Domain extension operators may optionally set dim:SUBDOMAIN to -1 for the extended dimension, and set the full domain coordinate arrays to null.

A4. dim:LOWER_BOUND, dim:UPPER_BOUND contain the bounding coordinate values for the spatial dimensions, which would initially correspond to the full domain (these may extend beyond the actual range of the coordinate values themselves.) If dim:SUBDOMAIN != 0, the dimension bounds contain the user-specified coordinate range used by the subdomain selection operator to select the subdomain. Again, this may differ from the actual range of grid-point values of the coordinate within the subdomain (because grid-point locations may not always coincide with the end-points of the user-specified range).
    (dim:LOWER_BOUND may be equal to dim:UPPER_BOUND, in which case it represents a single coordinate value and not a range.) Bounds are not defined for the T/I dimensions.

A5. For some grids, the X-grid may be periodic, which is indicated by a non-zero value for the attribute X:PERIOD, which contains the period length in the same units as the X coordinate itself. The interfacial X grid usually has one more point than the X grid, except when the X grid is periodic, when both grids have the same number of points. A periodic X grid may be rotated, in which case the attribute X:ROTATED contains a non-negative integer value (which should be less than the number of X grid-points) indicating the rotation angle measured in grid-intervals. We assume, additionally, that the first interfacial grid point always lies "outward" of the first regular grid point.

A6. For global domain grids with origin at the Greenwich Meridian, it is sometimes useful to swap the Eastern/Western hemispheres, so that the origin is moved to the Dateline. This would allow subdomains that wrap around the origin to be contiguous.

A7. Usually the regular Y grid would exclude the North Pole (although the interfacial grid may include it). However, if there is a regular grid point at the North Pole, then the regular grid and the interfacial grid are assumed to have the same number of points. We assume, additionally, that the first interfacial grid point always lies "outward" of the first regular grid point, unless the first regular grid point is on a pole.

A8. For ocean data,

      if (X:PERIOD != 0)
        X0 = XT(2:IMT-1)
        XINT0 = XU(1:IMT-1)
      else
        X0 = XT(2:IMT-1)
        XINT0 = XU(2:IMT-1)

      if (YT(JMT) > 89.999)
        Y0 = YT(2:JMT)
        YINT0 = YU(1:JMT-1)
      else
        Y0 = YT(2:JMT-1)
        YINT0 = YU(1:JMT-1)

A9. For Z:UNITS == "sigma_level", the Z values correspond to either sigma or hybrid pressure-sigma levels, with SIGMA0, SIGMAINT0 containing the corresponding pressure (A) and sigma (B) coefficients on the regular/interfacial grid.
Z:POSITIVE is true if the Z coordinate is positive in the upward direction, and false otherwise. Z:BOT_VAR, if not null, is the name of the history file variable containing the time-dependent values of the bottom Z values (e.g., surface pressure for a sigma/hybrid coordinate atmospheric model). Z:BOT_VAR is used to read in values of Z_BOT, described below.

A10. TIME contains the time coordinate values in any of the standard units. TIME:DAYS_PER_YEAR contains the integral number of days per year for model years (which are assumed to be of fixed length). If it is zero, a real world calendar, with variable length years, is assumed.

A11. ILABEL contains a list of strings describing the index coordinate "i", in its current subdomain. It could be omitted from the netCDF representation of the hyperslab, but for the Yorick/IDL representation it is necessary to define an array containing null strings to represent the "i" dimension, if it is present in the data, so that the length of the "i" dimension can be determined without reference to the data array, like all other dimensions. If a numeric value needs to be associated with each index value, the auxiliary variable IPARAM may be used.

A12. DATE is an optional array of date values corresponding to time values TIME. The date is stored in the year-month-day format "yyyymmdd.fff", where ".fff" denotes a fraction of a day.

A13. X0, Y0, Z0 are optional arrays containing the coordinate values of the regular grid in the full spatial domain. XINT0, YINT0, ZINT0 are optional arrays containing coordinate values of the interfacial (staggered) grid in the full spatial domain. X0, Y0, XINT0, YINT0 are required for horizontal staggered gridding and finite-differencing operations; Z0, ZINT0 are required for vertical averaging and vertical staggered gridding/finite-differencing operations.
(If "dim:SUBDOMAIN" > 0, then "dim:SUBDOMAIN" defines the starting location of the subdomain within the full spatial domain of the dimension, on the regular or the interfacial grid, depending upon the value of abs(IS_PRESENT(m)).)

A14. ILABEL0, if defined, contains the original list of index label strings, and IPARAM0, if defined, contains the original list of index parameter values.

B. DATA DIMENSIONALITY ATTRIBUTES:
=================================

(The variable DATA corresponds to one of the variables (VAR1/VAR2/...) in the netCDF representation of the hyperslab.)

B1. The variable DATA contains the data values, with dimensions ordered "xyzti". It may be of single ("float") or double precision, but the MISSING_VALUE attribute, and the auxiliary data variables AREA_WT and Z_BOT, should be of the same precision as DATA. In the Yorick/IDL representations, DATA will always be a 5-dimensional array, with missing dimensions represented by unit-length dimensions. In the netCDF representation, any missing dimensions, as determined by the values of IS_PRESENT(m) (see below), would not be present. (In the C representation, DATA would just be a pointer.)

B2. The netCDF representation of hyperslabs allows for more than one variable (with the same horizontal dimensions) to be saved together. However, any common dimension variables are required to have the same values and attributes. All the other hyperslab representations only allow for a single variable to be represented, corresponding to the DATA field of the structure. It is possible to represent the multiple-variable netCDF file as an array of hyperslab data structures in Yorick/IDL/C.

B3. DATA:ORIGINAL_DIMS is a string containing a 5-element, comma-separated list of the original dimensions of the data, i.e., before any domain-reduction operations were performed. E.g., "x,y,,time," for data with "xyt" dimensions. The names "xint", "yint", "zint" are used for interfacial dimensions.
(In the netCDF representation of a hyperslab, the ORIGINAL_DIMS and the REDUCTION_OPS attributes may both be omitted, in which case it is assumed that the original dimensions are the same as the current dimensions.)

B4. DATA:REDUCTION_OPS is a string containing a 5-element, comma-separated list of the rank-reduction operations performed on the data. E.g., "avg,,,," for the above data after X-averaging. The recognized rank-reduction operations are "avg"/"sum"/"rms"/"min"/"max", representing area/volume weighted averaging, area/volume weighted sum, area/volume weighted root-mean-square, minimum value along the dimension, or the maximum value along the dimension, respectively. For slicing operations, the index value (>= 1) into the un-sliced dimension is used as the reduction operation code. E.g., "avg,,2,," denotes X-averaging, and slicing to the second Z-coordinate value (in the subdomain, *not* the full domain).

B5. DATA:AREA_WT_VAR, if present, contains the name of the area weights variable associated with this hyperslab. DATA:Z_BOT_VAR, if present, contains the name of the Z_BOT variable associated with this hyperslab.

B6. It is useful to define the functions IS_PRESENT and IS_REDUCED that return information about the dimensionality of a hyperslab. IS_PRESENT(m) returns an integer value corresponding to each of the five dimensions "xyzti" (m=1...5):

      0 => dimension-m was never present in the variable
      1 => dimension-m is present, on the regular grid ("x/y/z/time/ilabel")
      2 => dimension-m is present, but on the interfacial grid ("xint/yint/zint")
     -1 => dimension-m was eliminated, on the regular grid
     -2 => dimension-m was eliminated, but on the interfacial grid

If IS_PRESENT(m) < 0, the dimension was "eliminated", i.e., dimension-m was either sliced out or subject to rank-reduction, and is no longer present in the data. The coordinate variables for "eliminated" dimensions are frozen at the values prior to the elimination of the dimension.
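As an illustration of how the two attribute strings encode the dimensionality, the following minimal Python sketch derives the IS_PRESENT codes from DATA:ORIGINAL_DIMS and DATA:REDUCTION_OPS. The function names are hypothetical, and the rule that a non-empty REDUCTION_OPS entry marks the dimension as eliminated is an assumption inferred from the descriptions in B4 and B6, not a definition taken from the format.

```python
INTERFACIAL = {"xint", "yint", "zint"}
RANK_REDUCTIONS = {"avg", "sum", "rms", "min", "max"}


def split_five(attr):
    """Split a 5-element, comma-separated attribute string into its fields."""
    fields = [field.strip() for field in attr.split(",")]
    if len(fields) != 5:
        raise ValueError("expected 5 comma-separated fields: %r" % attr)
    return fields


def is_present(original_dims, reduction_ops):
    """Return the five IS_PRESENT codes for the dimensions 'xyzti'."""
    codes = []
    for name, op in zip(split_five(original_dims), split_five(reduction_ops)):
        if name == "":
            if op != "":
                raise ValueError("reduction listed for a dimension never present")
            codes.append(0)                        # never present
            continue
        grid = 2 if name in INTERFACIAL else 1     # interfacial or regular grid
        codes.append(-grid if op else grid)        # negative once eliminated
    return codes


def describe_reduction(op):
    """Describe one REDUCTION_OPS entry (assumed: op name or slice index)."""
    if op == "":
        return "not reduced"
    if op in RANK_REDUCTIONS:
        return "rank-reduced (" + op + ")"
    index = int(op)                                # slicing index, must be >= 1
    if index < 1:
        raise ValueError("slice index must be >= 1")
    return "sliced at subdomain index %d" % index


# X-averaged, then sliced to the second Z value, as in the B4 example.
print(is_present("x,y,z,time,", "avg,,2,,"))       # [-1, 1, -1, 1, 0]
print([describe_reduction(op) for op in split_five("avg,,2,,")])
```

In this example the X and Z dimensions come out with negative codes because they were averaged over and sliced out, respectively, while the "i" dimension was never present.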
For example, even if IS_PRESENT(1) < 0, X should be defined. In this case, X would represent the X coordinate value(s) before the slicing/rank-reduction operation eliminated the dimension. If abs(IS_PRESENT(1)) == 1, it means that X corresponds to the regular X grid values, i.e., a subset of X0. If abs(IS_PRESENT(1)) == 2, then X corresponds to the interfacial X grid values, i.e., a subset of XINT0.

IS_REDUCED(m) returns an integer value corresponding to each of the five dimensions "xyzti" (m=1...5):

   == 0  => dimension-m, if present, has not been subject to rank-reduction operations;
   <= -1 => dimension-m is no longer present, but was present and subject to a rank-reduction operation, as follows:
            == -1 ("avg") => area/volume/mass weighted average, as appropriate
            == -2 ("sum") => area/volume/mass weighted sum, as appropriate
            == -3 ("rms") => weighted root-mean-square of values along dimension
            == -4 ("min") => minimum value along the dimension
            == -5 ("max") => maximum value along the dimension
            == -6 ("eof") => empirical orthogonal reduction along the dimension
            (whether the rank-reduction was done over the full domain, or over a subdomain, can be determined from the "dim:SUBDOMAIN" attribute of dimension-m)
   >= 1  => dimension-m is no longer present, but was present and subject to a slicing operation, with the IS_REDUCED value denoting the coordinate index within the dimension (in the subdomain).

Rank-reduction/slicing operators should not delete the coordinates of the dimension that was reduced, although the dimension itself would be reduced. E.g., the X-averaging operator should leave the values of X as they are so there is information available on the coordinate values that were averaged over.

IS_PRESENT and IS_REDUCED allow us to access in a compact manner the information stored in the data attributes DATA:ORIGINAL_DIMS and DATA:REDUCTION_OPS.

B7.
DATA:ORIGINAL_DIMS, DATA:REDUCTION_OPS, DATA:AREA_WT_VAR, and DATA:Z_BOT_VAR are not actually saved in non-netCDF representations of the hyperslab. The associated IS_PRESENT, IS_REDUCED information is saved in the components .DIMENSION(SDIM,1), and .REDUCED(SDIM) of the hyperslab structure. The .DIMENSION(*,*) and .REDUCED(*) structure components are not explicitly saved in the netCDF representation, because the dimension information is saved as part of the netCDF variable definition, and through the data attributes DATA:ORIGINAL_DIMS and DATA:REDUCTION_OPS.

C. AUXILIARY DATA VARIABLES:
===========================

C1. AREA_WT is the horizontal area/volume/mass element array. Z_BOT is the "depth" mask which contains the lowermost allowed value of the Z coordinate at each horizontal grid-point in the current subdomain, i.e., the depth for oceanic grids, and the time-dependent surface pressure values for atmospheric grids. They both have the same precision as the data.

C2. AREA_WT and Z_BOT, if present, always have the same horizontal (XY) dimensions as the data, and the other dimensions should be a subset of the data dimensions, i.e., some of the dimensions may be missing (they would be of unit-length in non-netCDF representations). Of course, Z_BOT can never have the Z-dimension. The structure components .DIMENSIONS(*,2) and .DIMENSIONS(*,3) describe the dimensions of the two arrays, similar to .DIMENSIONS(*,1) which contains the IS_PRESENT information for the data. The non-zero values of .DIMENSIONS(*,2) and .DIMENSIONS(*,3) should be identical to the corresponding values in .DIMENSIONS(*,1), since the AREA_WT and Z_BOT dimensions are a subset of the data dimensions.

C3. If the AREA_WT array is present, AREA_WT:ELEMENTS may have the value "dxdy" or "dxdydz", denoting area or volume elements respectively.

C4.
AREA_WT may be initialized when the hyperslab is created, or it may be set to null initially, and created just when needed (i.e., prior to a horizontal masking operation, a non-contiguous subdomain selection operation, or rank-reduction operation), using the full domain spatial grid information. If AREA_WT is null, then averaging/summing/RMS operations are not allowed. Horizontal masking operators should set AREA_WT to zero over excluded points. Horizontal averaging operators should accumulate AREA_WT over the averaged dimension. Horizontal slicing operators should slice AREA_WT. Vertical averaging operators should multiply AREA_WT by the total depth (or thickness) at each horizontal grid point, converting area elements to "volume" elements, and changing AREA_WT:ELEMENTS to reflect this.

C5. The Z-dimension attribute Z:BOT_VAR points to the Z_BOT variable in the netCDF history file, if any.

C6. Z_BOT may be associated with any hyperslab data structure that had all three spatial dimensions initially. If Z_BOT is null, full "depth" is assumed everywhere. If Z_BOT is present, Z_BOT:REF should contain the reference value of Z_BOT, in the same units as Z_BOT, to allow hybrid pressure-sigma coordinates to be converted to pressure, using the SIGMA0/SIGMAINT0 variables. Horizontal slicing operators should slice Z_BOT as well. Horizontal averaging operators should pick the lowermost (maximum "depth") value of Z_BOT over the averaged dimension. Z_BOT should be set to null when the Z-dimension is eliminated through averaging/slicing.

NOTE: In the Yorick/IDL representation, AREA_WT and Z_BOT are always stored as 5-D arrays (if not null), with missing dimensions represented by unit-length dimensions.

D. OTHER DATA ATTRIBUTES:
========================

D1. DATA:NAME attribute contains the original name associated with the variable, e.g., "T", "PS", ... Hyperslab operators may change NAME. For example, the product of variables "V" and "T" may have the name "V_TIMES_T".
(This attribute is not saved in the netCDF representation.)

D2. DATA:LONG_NAME and DATA:UNITS describe the data.

D3. DATA:MISSING_VALUE denotes missing values in the data, and should have the same precision as DATA. For complex data values, the real and imaginary parts should be associated with the same numerical missing value. (For plotting purposes, all data values >= MISSING_VALUE may be assumed to be missing.)

D4. DATA:TIME_REP records whether the variable contains instantaneous values, averaged values between the previous time coordinate and the current one, or extreme values over the same time interval. (DATA:TIME_REP = "" / "instantaneous" / "averaged" / "minimum_val" / "maximum_val")

D5. DATA:HISTORY contains a string describing the history of operations performed on the hyperslab. Each hyperslab operator should append a descriptive string (terminated by a semi-colon/new-line pair) to HISTORY.

D6. DATA:LEGEND is a short-hand form of history used for plotting. Each hyperslab operator may append a short string to LEGEND to be used for plotting. For example, differencing the Atlantic and Pacific zonal means may produce the legend "ATLANTIC-PACIFIC".

D7. DATA:C_FORMAT and DATA:FORTRAN_FORMAT contain optional format strings for displaying the data variable.

Note: The data attributes HISTORY and LEGEND are meant primarily for documentation purposes and should not be used for checking conformance.

E. GLOBAL ATTRIBUTES:
====================

E1. :HOR_SUBDOMAIN, :VER_SUBDOMAIN contain strings describing named spatial subdomains of the data, horizontal and vertical. E.g., "U.S.", "ATLANTIC", "WARM_POOL" for horizontal subdomains, and "UPPER_OCEAN", "BOUNDARY_LAYER", "STRATOSPHERE" for vertical subdomains. (The meaning of these names is not defined by the format.) The subdomain names may be used for checking conformance of hyperslabs.

E2. :CASE_NAME contains a short name for the case (or model run) that uniquely identifies it. It may be used for conformance checking.

E3.
:CASE_TITLE contains a one-line description of the case. It should not be used for conformance checking.

E4. :ORIGINAL_FILE contains the name of the original history file that the hyperslab was extracted from.

E5. :RESOLUTION is a string describing the full spatial domain of the data, e.g., "T42L18" for CCM3, "D02L45" for the CSM ocean model etc.

E6. :DATA_SOURCE may be a null string, in which case data is assumed to be from generic observations. Otherwise, DATA_SOURCE should contain the name of a specific observational dataset or the name of the numerical model that was used to generate the data.

E7. :DATA_URL is a string containing the Web address (URL) of a document describing the data source, i.e., the observational data set, numerical model, or the particular model integration.

E8. :STRUCTURE is a string containing one or more "_" separated substrings. The first substring is always "HYPERSLAB". The remaining substrings denote named extensions to the basic hyperslab data structure. E.g., "_ATM", "_OCN", "_EOF" and so on. The extensions are applied sequentially, from left to right. They determine what extra fields are defined in the hyperslab data structure. STRUCTURE may be used for conformance checks.

E9. HYPERSLAB_VARS is a string containing a comma-separated list of all hyperslab variables in a netCDF file.

E10. FORMAT_URL is a string containing the Web address (URL) of a document describing the data format, i.e., the hyperslab data format.

Note: The data-related global attributes HOR_SUBDOMAIN, VER_SUBDOMAIN, and CASE_NAME may be used for conformance checking. The other global attributes are present primarily for documentation purposes and should not be used for checking conformance.

F. CONFORMABILITY
=================

Consider a set of hyperslabs A, B, C, ... For each standard dimension-m, the following levels of conformance are defined:

(i) "Strong full conformance": Dimension-m is either present in all of the hyperslabs, or is absent in all of the slabs.
If it is present, the dimension length is identical in all the hyperslabs, and the dimension attributes LONG_NAME, UNITS, and GRID are also identical. Furthermore, all the coordinate values of the dimension are also identical.

(ii) "Strong broadcast conformance": As in (i), except that dimension-m may be present in some of the hyperslabs and absent in others. Where it is present, all the constraints of (i) apply. In a binary operation, this dimension would be "broadcast" in those slabs where it is absent, as is done in Yorick, for example.

(iii) "Weak full conformance": As in (i), but relaxing the requirement that the coordinate values and attributes of dimension-m be identical. (Only the dimension lengths are required to be identical.)

(iv) "Weak broadcast conformance": As in (ii), but relaxing the requirement that the coordinate values of dimension-m be identical.

Note: The above conformance criteria differ slightly from those of Yorick in that a unit-length dimension is not conformable with a non-unit-length dimension. This is because the hyperslab format allows a dimension to be "present" or "absent" (although the actual data array would still contain a unit-length dimension corresponding to the missing dimension). One can easily avoid non-conformance arising from unit-length dimensions by explicitly eliminating that dimension through a slicing operation.

It is also useful to define the following other types of conformance for two slabs A and B:

a) binary-conformability: all of the 5 standard dimensions are conformable such that at least one of the slabs (the "high-dimensional" slab) has no dimensions that require broadcasting.
b) unit-conformance: the variable UNITS attribute is the same in both slabs.
c) variable-conformance: in addition to unit-conformance, the variable attributes NAME, LONG_NAME, and TIME_REP are the same in both slabs.
d) grid-conformance: the STRUCTURE attribute is the same in both slabs, and X:PERIOD, X:ROTATED, X0, Y0, Z0, ILABEL0, IPARAM0, and XINT0, YINT0, ZINT0 are also the same, if present.
e) domain-conformance: in addition to grid-conformance, the HOR_SUBDOMAIN and VER_SUBDOMAIN attributes are also the same.
f) case-conformance: CASE_NAME is the same in both slabs.
g) save-conformability: requires strong broadcast conformance for regular and interfacial dimensions separately, including eliminated dimensions, and the dim:SUBDOMAIN, X:PERIOD, X:ROTATED attributes are required to be identical.

Binary arithmetic operations may be performed between any two binary-conformable hyperslabs, except for addition/subtraction, which also require unit-conformance. All dimensions and attributes for the result of a binary operation will be inherited from the higher-dimensional operand, if any, or from the leftmost operand.

G. OPERATORS
============

The following types of hyperslab operators may be defined:

a) creation operators (for different history file formats)
b) domain reduction operators (for all five dimensions)
   (i) "slicing" operators
   (ii) rank-reduction operators
   (iii) subdomain/masking operators
c) domain extension/sprouting operators
d) conformance operator
e) copy operator
f) arithmetic operators (unary/binary/...)
g) staggered gridding operator
h) vertical interpolation operators
i) spatial/temporal derivative operators
j) netCDF saving/appending operators for hyperslabs
k) eddy/anomaly operators
l) time-filtering operators
m) spectral transform operators

a) A creation operator should:
   - for atmospheric data, create dummy staggered X/Y grid
   - "slice out" any extra dimensions besides the spatial/temporal dimensions
   - transpose "zy" ordering, or reverse coordinate ordering when reading hyperslab
   - optionally create AREA_WT, Z_BOT for data with two horizontal dimensions
   - set dim:LOWER_BOUND, dim:UPPER_BOUND to reflect the plot domain in each dimension (this may extend beyond the actual range of coordinate values)
   - for a hyperslab netCDF file, read the selected variable

b1) A "slicing" operator should:
   - insert the slicing offset string in DATA:REDUCTION_OPS
   - slice the coordinate array, AREA_WT and Z_BOT, if defined

b2) A rank-reduction operator should:
   - insert the rank-reduction operator string in DATA:REDUCTION_OPS
   - for horizontal-averaging, accumulate AREA_WT, and pick lowermost value of Z_BOT over dimension-m
   - for vertical-averaging, multiply AREA_WT by Z_BOT, and set Z_BOT to null

b3) A subdomain/masking operator should:
   - set dim:SUBDOMAIN to the m-index
   - set dim:LOWER_BOUND, dim:UPPER_BOUND to the user-specified bounds of dimension-m that were used to select the range (this may differ from the extreme grid-point values of the coordinate within the subdomain)
   - prior to horizontal masking or non-contiguous subdomain selection operations, create AREA_WT, if the hyperslab had two horizontal dimensions initially
   - extract subdomain of AREA_WT, Z_BOT
   - extract subdomain of coordinate array and AREA_WT, if defined
   - if horizontal region mask is specified, set masked data values to MISSING_VALUE, masked AREA_WT values to zero
   - set name of subdomain, if specified

c) Domain extension operators are concatenation operators in a particular dimension.
They would splice together different sets of coordinate values, in monotonic order, for a set of hyperslabs that have identical coordinates in all dimensions except the one being extended. (This would be useful in processing data one "snapshot" at a time, and then splicing together the resulting hyperslabs.) A "sprouting" operator could be defined to re-introduce a rank-reduced/sliced-out dimension, by using either the first, the last, or the averaged coordinate value for the re-introduced m-dimension. The sprouting operator should
   - remove any reduction operation entry for that dimension in DATA:REDUCTION_OPS
   - set dim:SUBDOMAIN to -1.

d) Conformance operator would check for the degree of conformance of hyperslabs.

e) Copy operator copies hyperslabs. (This would be trivial in interpreted languages such as Yorick or IDL, but in C one would need to handle pointers carefully.)

f) Arithmetic operators would act on binary-conformable hyperslabs, generating a DATA:LEGEND string if subdomain names are different, or if the subdomain limits are different. If variable names are different, a new variable name would also be generated. The weight and depth masks for results of binary operations will be inherited from the higher-dimensional operand, if any, or from the left operand. (Sum/difference operators would also require unit-conformance.)

g) A staggered gridding operator would allow variables to be interpolated from the regular grid to the interfacial grid, and vice versa. (This would involve a certain degree of smoothing.) This would be useful in conjunction with the spatial derivative operators.

h) A vertical interpolation operator would interpolate data between different vertical coordinates, such as between model levels and pressure levels.

i) Spatial/temporal derivative operators would carry out finite-differencing operations.

j) NetCDF saving/appending operators would allow one or more save-conformable hyperslabs to be saved or appended to a hyperslab netCDF file.
Appending would only be allowed in the t-dimension, with the i-dimension being absent.

k) Eddy/anomaly operators would compute anomalies with respect to zonal/time averages.

l) Time-filtering operators would carry out lowpass/highpass/bandpass filtering, or compute running means.

m) Spectral transform operators would compute horizontal derivatives, or switch between velocity and vorticity/divergence representations.
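A conformance operator of the kind listed under d) might classify each standard dimension according to the four levels of section F. The following is a minimal Python sketch under stated assumptions: the function name, the dictionary representation of a dimension, and the sample attribute values are hypothetical, and "weak broadcast" is taken to require only equal lengths where the dimension is present, mirroring "weak full" (the text of (iv) mentions only the coordinate values).

```python
ATTRS = ("LONG_NAME", "UNITS", "GRID")


def conformance_level(dims):
    """Return the strongest conformance level satisfied for one dimension.

    dims holds one entry per hyperslab: None if the dimension is absent,
    otherwise a dict with the keys "values", "LONG_NAME", "UNITS", "GRID".
    Returns "strong full", "strong broadcast", "weak full",
    "weak broadcast", or None if the dimension is not conformable.
    """
    present = [d for d in dims if d is not None]
    if not present:
        return "strong full"                 # absent in all of the slabs
    full = len(present) == len(dims)         # present in every slab?
    first = present[0]
    # Lengths must match wherever the dimension is present; a unit-length
    # dimension therefore does not conform with a longer one.
    if any(len(d["values"]) != len(first["values"]) for d in present):
        return None
    strong = all(
        d["values"] == first["values"]
        and all(d[key] == first[key] for key in ATTRS)
        for d in present
    )
    if strong:
        return "strong full" if full else "strong broadcast"
    return "weak full" if full else "weak broadcast"


# Hypothetical X dimensions taken from three hyperslabs.
x_a = {"values": [0.0, 2.5, 5.0], "LONG_NAME": "longitude",
       "UNITS": "degrees_east", "GRID": "regular"}
x_b = dict(x_a, values=[1.25, 3.75, 6.25])   # same length, shifted values
x_c = dict(x_a, values=[0.0, 2.5])           # different length

print(conformance_level([x_a, x_a]))    # strong full
print(conformance_level([x_a, None]))   # strong broadcast
print(conformance_level([x_a, x_b]))    # weak full
print(conformance_level([x_a, x_c]))    # None
```

Applying such a function to each of the five standard dimensions would give the information needed to decide binary-conformability, e.g., by checking that at least one operand has no dimension requiring broadcasting.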