Space Radiation Laboratory
220-47 Downs
California Institute of Technology
Pasadena, California 91125
11 March 1992
Outline
1. Introduction
1.1 Purpose of the Tennis Standard
1.2 Description
1.3 History and Context of the Standard
2. How to Use the Standard: the User Interface
2.1 Reading Data
2.2 Writing Data
2.3 Describing New Data Formats
2.4 Keeping Track of Pedigree
2.5 Time Organization
2.6 Examples
2.7 Function Lists
2.7.1 User Functions
2.7.2 Internal and System Functions
2.7.3 Multiple Unit Functions
3. Utilities
4. How the Standard Works: Implementation & Layout
4.1 User Sets
4.2 Metasets
4.2.1 Sets 0[ and 0] : Beginning and End of Tourney
4.2.2 Set 0! : Set Descriptor
4.2.3 Sets [[ and ]] : Match Markers
4.3 Pedigree Sets 0$ , @@ , and 1?
5. Why to Use the Standard: Justification and Tradeoffs
5.1 Short Tourneys
5.2 Long User Sets
5.3 Platform and Medium Independence
5.4 Alternative Possibilities
Appendix: Various Lists
1. Introduction
1.1. Purpose of the Tennis Standard
This proposed data formatting standard is intended to facilitate documentation of data formats and allow the casual programmer to do input and output of spacecraft data and other similar data by use of an existing library of functions, as opposed to having each user create his own i/o functions. Use of the library avoids duplication of effort and makes programs more transferable from user to user. The tennis standard includes this library of C functions (or FORTRAN equivalents) for getting data from and putting data on the appropriate medium (the "court") and the blocking protocol for the data. An additional benefit of the standard is its lack of dependence on hardware, operating system, and distribution medium. Thus the user not only avoids re-invention of the i/o system every time a new person starts analyzing data, the user also avoids re-writing applications whenever hardware or operating system is "improved" or changed. The only program changes needed are in the tennis library. We should be able to support both Unix and VMS platforms on SAMPEX.
Still another part of the standard will be a library of utility programs, described in section 3. These utilities will perform functions such as ascii dumps without any need for user programming.
The tennis standard originated as a blocking protocol, which allowed collection of many short logical blocks (then called chapters) into physical record blocks for mag tape. Our experience demonstrated the virtues of standards extolled above, but, in addition to these virtues, which apply to almost any standard, we found the standard to be extremely useful for that sort of task -- time sequential blocking of short logical records from spacecraft telemetry streams.
1.2. Description
The tennis library will get or put data from the user program to the court (the selected storage medium) in units of various sizes. These units are nested, thus: The smallest unit, clearly, is the bit. Next larger is a "point". A point may consist of 1, 2, 4, or 8 bytes, i.e., it is a character or integer or float or .... A collection of points is a "game" and a collection of games is a "set". A set might, for example, be the data associated with a cosmic ray pulse height event -- something like eight 16-bit matrix detector addresses, six 16-bit pulse heights, and 16 single bit discriminator indicators. In this example, each address is a point, the eight addresses collectively are one game, the six pulse heights are another game, and the 16 indicators could be a third game. How each of these units is assembled to make the next larger is detailed in the Implementation and Layout section, section 4. The novice does not really need these details -- use of the library is independent of them.
The library is described in the User Interface section, section 2. Previewing briefly: the most heavily used function will typically be get_set, which brings the next available set from the court to the user program. If the set is of interest, the user will ask for a pointer to the data with get_game. The user may then dig into the set directly using pointers, or may ask the library to get_float, get_bit, etc. Use of these "additional" library functions is encouraged as good style, but is not required if the user is sufficiently familiar with the data to know where it is within the set. A similar set of functions allows the user to put points, games, etc. from his/her program to the court.
A collection of sets is called a "match" and typically corresponds to a physical record on a tape or a fixed size group of records on a disk, but also has meaning in non-record-oriented media like a Unix pipe. A collection of matches is a "tourney" and would typically correspond to a tape or a disk file. Tourneys also have meaning in non-record-oriented media. The Justification and Tradeoffs section (5) includes a discussion of workarounds used when sets have lengths longer than convenient match sizes. The Implementation and Layout Section explains how the library deals with non-record-oriented media, in a fashion intended to be invisible to the user.
Tourneys are the data units which will normally be shipped from one user or data center to another. Each tourney begins with a description of the meaning and format of the various data units in the tourney, i.e., the description specifies what types of games and sets are in the tourney, what points are in the games, and, if interesting, what bits are in the points. These descriptions, in addition to documenting the data for the user, tell the library how to find the bits, points, etc. requested by the user. If necessary the library can use this information to transform, for example, floating-point points from VAX to Sun format, freeing the user from such hassles.
A special feature of the tennis standard is its emphasis on maintaining a creation history or pedigree of the data. The library, with some assistance from the user, records such details as what court served as input to the current court, what program created the current court, and what were the similar details for all ancestral courts. The user's responsibilities in assisting this function are detailed in sections 2.3 and 2.4. Generally pedigree is of interest only when something goes wrong.
Thus there are three types of sets: user data sets contain the data which the user normally wants to get or put; metasets include the sets containing descriptions of the other data sets, and various marker sets (such as end-of-tourney) which allow use of non-record-oriented media; and pedigree sets describe the history of the tourney and its data. Details of these sets are given in sections 2.3 and 2.4 and in section 4.
1.3. History and Context of the Standard
This section is only interesting to those who are familiar with previous versions of this document and wonder how we got here. New users should skip it.
The tennis standard is a successor to the chapter-verse (set-game) standard used first on the HEAO C3 experiment by Caltech, Washington University, and the University of Minnesota. Chapter-verse was also used at Caltech for Voyager, IMP, ISEE, and Galileo data. Experience showed that the chapter-verse standard provided some immunity to computer upgrades (PDP-11 to VAX to Sun) in addition to its benefits regarding documentation and blocking. The new standard has system and medium independence as an explicit goal.
The unusual nomenclature is intended to avoid frequently confusing associations with previous usage of words. Credit goes to the SRL ad hoc lunchtime committee on free advice, led by Dr. A. C. Cummings.
Alternatives to writing our own standard are discussed in section 5.4.
2. How to Use the Standard: the User Interface
Each set is labeled with a key, which consists of two ascii bytes. For example, the Galileo set which specifies the time of a data group is a tG set. An eG set contains a Galileo event. The convention is that the second character specifies the project; keys for special or pedigree sets are non-alphabetic -- for example, end-of-tourney is marked by a key of 0] . The user knows the format of a particular set (or game, point, etc.) from paper documentation or can get the format information from the tourney itself using a standard utility program.
These sets (games, points, etc.) are moved into and out of the user's program by use of the library routines: get_set, put_set, get_game, .... Call conventions for these routines form the interface between the user and the library. See the figure in section 2.7 for an illustration of how this interface isolates the user from the operating system.
A typical program is a filter -- for example, it might read a tourney containing, amongst other things (like time indicators and rate readouts), sets which correspond to pulse height events. It might apply a PHA-channel to energy-signal calibration to these events and write out new sets which now have floating point signals instead of integer channel numbers. Typically all the "other things" on the input tourney would be copied to the output tourney along with the new signal sets. The other typical pattern for a program is "read data; output plot".
Given a set or a game by the get_set or get_game routines, the user will normally access individual variables (points or bits) by use of pointers or indices, operate on them according to his/her needs, and plot them or output them using the put_set, put_game, ... routines.
In a little more detail, one typical program might look like:
    define the format of the new "sG" set;
    do other initialization;
    begin a loop -- get a set;
        if it is not an "eG" set {copy it to the output tourney;}
        if it is an "eG" set,
            {get the game containing the channel numbers;
             calculate the signals from the channel numbers;
             output an "sG" set with a signal game to replace
             the channel number game (other games unchanged);}
    repeat the loop until reaching end-of-tourney;
    polish things off and quit;
The calls for getting sets, etc. are documented in Reading Data (2.1) below. The calls for
putting sets, etc. are documented in Writing Data (2.2) below. After you understand those
two sections, then some cumbersome details of the "initialize" section of the program example
above are described in the following two sections. These details are needed mainly for
creating new sets, games, etc. which have not previously been described to the library.
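The filter outlined above can be sketched in C. This is only an illustration: the real library call conventions are given in sections 2.1 and 2.2, so the get_set and get_game functions below are mock stand-ins with assumed signatures, the in-memory "court" is invented test data, and the linear calibration in channel_to_signal is hypothetical. Copying and putting sets are represented by counters rather than real output.

```c
#include <stddef.h>
#include <string.h>

#define EOT "0]"   /* end-of-tourney key */
#define NCHAN 6    /* six pulse heights per event, as in the section 1.2 example */

/* Mock court: a few sets held in memory (invented data, not a real tourney). */
struct mock_set { const char *key; short chan[NCHAN]; };

static const struct mock_set court[] = {
    { "tG", {0} },
    { "eG", {10, 20, 30, 40, 50, 60} },
    { "eG", {1, 2, 3, 4, 5, 6} },
    { EOT,  {0} },
};
static size_t next_set = 0;             /* read position in the mock court */
static const struct mock_set *current;  /* set most recently returned */

/* Mock get_set (assumed signature): returns the key of the next set. */
const char *get_set(void) {
    current = &court[next_set];
    if (strcmp(current->key, EOT) != 0) next_set++;
    return current->key;
}

/* Mock get_game (assumed signature): only game 1, the channel numbers, exists. */
const void *get_game(int n) {
    return (n == 1) ? current->chan : NULL;
}

/* Hypothetical PHA-channel to signal calibration. */
float channel_to_signal(short channel) {
    return 0.5f * (float)channel + 1.0f;
}

struct filter_counts { int copied; int converted; float last_signal; };

/* The filter loop: pass non-"eG" sets through, convert "eG" sets to signals. */
struct filter_counts run_filter(void) {
    struct filter_counts c = {0, 0, 0.0f};
    const char *key;
    next_set = 0;
    while (strcmp(key = get_set(), EOT) != 0) {
        if (strcmp(key, "eG") != 0) {
            c.copied++;      /* real program: copy the set to the output tourney */
            continue;
        }
        const short *chan = get_game(1);
        float signal[NCHAN];
        for (int i = 0; i < NCHAN; i++)
            signal[i] = channel_to_signal(chan[i]);
        c.last_signal = signal[NCHAN - 1];
        c.converted++;       /* real program: put an "sG" set with the signal game */
    }
    return c;
}
```

Calling run_filter() on the mock court passes one set through untouched and converts two event sets.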
2.1. Reading Data
No initialization is needed for reading data. Assuming a Unix C environment, to read a tourney or collection of tourneys in tennis format:
1) Include the tennis header in your program with #include (the header file is tennis.h; see section 2.7) and use -ltennis.clib in the compile command.
2) Call get_set(). It returns the key string of the next set.
3) Call get_game(N). It returns a pointer to game N, where N is the sequence number of the game within the set. Note that the pointer for the first game can be used as a pointer for the set.
4) A key of 0] means end of tourney, i.e., no more data. Before telling the user program that end of tourney has been reached, the library will ask the operator for additional tourneys.
When running programs using tennis format, when input or output is required, the library prompts on the terminal for the input or output tourney. If a disk file is being used, enter the name of the file (followed by a carriage return). Otherwise, enter the device name, such as /dev/nrst0 for the 8-mm tape drive number 0. To input or output data from or to standard i/o (the keyboard), enter a backslash and carriage return in place of the device name. Terminate standard input with a ^D. Procedures for other courts are TBD.
When the end of an input tourney is reached, the library asks for the next input. Thus it is easy to add files or tapes together. When the last tourney is reached, enter "-1" for input and the program will quit. When the end of an output tape is reached (disk files should never have this problem) it prompts for the next output tape, so both input and output can be continued over many different tapes and files.
A FORTRAN interface will be much cleaner if we also have routines for getting points and bits. For consistency, we will therefore include C versions -- get_point and get_bit.
2.2. Writing Data
Assuming, again, a Unix C environment and assuming initialization is done, the following steps are used to write data:
As above, we will also need put_game, put_point, and put_bit to clean up the FORTRAN interface.
2.3. Describing New Data Formats
To create a new tennis format data set, assuming a Unix C environment, one follows the procedure below:
    char *newpnt = "/home/thor/tlg/tG.set.dscr";
    set_init(newpnt);
where the file tG.set.dscr contains ascii along the following lines:
    BEGIN_GROUP = setdscr;
    setkey = "tG  "; /* two trailing blanks */
    . . .
    END_GROUP = gamedescr;
    END_GROUP = setdscr;
Section 4.1 contains a more complete description of the format of this info.
2.4. Keeping Track of Pedigree
In addition to the information needed for the metaset, as described above, the user must provide information needed to keep track of pedigree when writing a new tourney. In particular, the user must provide the self-documentation info needed in set 0[ , i.e., provide the name of the main routine, the name of an "other" routine, and, if appropriate (see section 2.5), the variable skiptm (tpnm is requested from the operator). The user will also likely provide names of files to be preserved in set 0$ -- optional but highly recommended. These data are provided to the library in the following fashion:
1) For the two most interesting files of source language in your program (main and
one other), save the file names in a set 0[ by calling setnmsav as in the
following example:
setnmsav("/home/thor/galileo/gnsrc/FLUX/fluxth.c",
"/home/thor/galileo/gnsrc/wrcspni.c");
This statement should be in either "main" or "other". These two files will be
copied into pedigree sets 0$ by the library.
2) For all other interesting source language files, the user should save the
source in a set 0$ by passing a pointer to a string with the complete filename.
The pointer is passed with a call to function setsrcsav(pointer) . For example,
pointer = setmnpt;
setsrcsav(pointer);
pointer = "/home/thor/galileo/gensrc/flux.h";
setsrcsav(pointer);
These files should be short compared to buffer length, or broken into short pieces
with ^L characters (formfeeds); else they will be broken at arbitrary locations
by the library functions. Only the last 256 characters of the filename are saved
on the tourney.
2.5. Time Organization
Typically the sets within a tourney will be time-ordered with sets specifying time bracketing sets which contain events or rate readouts or field measurements. Individual events, etc. will not usually be individually time labeled because of the overhead cost of storing so many time specifications. They are frequently labeled with a one or two byte integer time offset from the most recent time set.
One of the services to be provided by the library is the insertion of warning messages into the output tourney when data is processed from time that is listed in the warning database. For example, if the user is processing an interval from Jan 28 thru June 11, the library might insert, near the March 11 data the message (in tennis set form) that the front detector was noisy around noon. This message would also be copied to the user's terminal.
Some users have expressed an interest in direct access to data. It is certainly conceivable that one could create a collection of pointers that allowed direct access to individual sets or time sets within a tourney. I judge that to have excessive overhead. This specification does, however, allow for access to individual time periods which are assumed to be relatively lengthy, perhaps a day or so.
If the user wishes to use this feature, the time period must be specified with a variable named skiptm in the 0[ set, and the tourney must contain recognizable time sets. In order to be recognizable, a time set must have a key starting with upper-case T, the two bytes following the key must contain two more T's, and the time must be specified in ISO standard format in the first game of the set. An example is given in section 4.1, User Sets.
The ISO standard time format is an ascii string: YYYY-MM-DDThh:mm:ss[.fff] or YYYY-DDDThh:mm:ss[.fff], where T is a delimiter and [ ] implies optional. See examples in section 4.2.1.
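As an illustration of the two forms, the following C sketch produces each string from a standard struct tm using strftime. The function names are hypothetical and are not part of the tennis library; the optional fractional seconds are omitted.

```c
#include <stddef.h>
#include <time.h>

/* Calendar form YYYY-MM-DDThh:mm:ss.
   Returns the number of characters written, or 0 if buf is too small. */
size_t iso_calendar_time(const struct tm *t, char *buf, size_t size) {
    return strftime(buf, size, "%Y-%m-%dT%H:%M:%S", t);
}

/* Day-of-year form YYYY-DDDThh:mm:ss.
   Relies on t->tm_yday (0-based day of year) being set by the caller. */
size_t iso_ordinal_time(const struct tm *t, char *buf, size_t size) {
    return strftime(buf, size, "%Y-%jT%H:%M:%S", t);
}
```

A buffer of 20 bytes is enough for either form without the fractional part.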
Alternative systems, with much better direct access capabilities, are described in section 5.4.
2.6. Examples
2.7. Function Lists
The tennis library is linked with "-ltennis.clib" when compiling, and its definitions are
brought in with an #include of tennis.h. A definition that might appear there is
#define EOT "0]" /* end-of-tourney metaset */
As the figure illustrates, normally the user calls only user functions, but the user can call
internal functions. System functions should be called only by the internal functions. All
modules have access to the definitions in tennis.h.
2.7.1. User Functions
The list of functions includes:
2.7.3. Multiple Unit Functions
There are also routines for handling multiple (two) input and output units. Generally
speaking, they are the same as the above, with the letter "m" prepended and an extra
argument for the unit number. Unit numbers can be 0 or 1. Dialog with the user/operator
will assign physical devices to the units as above (section 2.1). If there are two input
tourneys, any sets that are on both tourneys and have the same key must have the same
format if they are going to be used.
Same as above, but U is the input or output unit number.
integer mcopy_set(int Ufrom, int Uto, short key)
Assumes the last set read on unit Ufrom was of type key, and copies it onto unit Uto.
Used instead of mput_set(U,key), it writes a set of type key on unit U using the data
starting at pointer.
Utilities to be furnished along with the subroutine library would include:
Verify: Reads a tourney and prints a summary of its contents -- start time, end time, and
gaps; total length in time units, bytes, matches, etc.; numbers of sets of various sorts; any
warning sets on the tourney, etc.
Browse: Similar to verify, but interactive and capable of looking at the documentation in the
metasets and of dumping sets.
Enrecord & Derecord: Read a tourney which does not observe the convention requiring
alignment between physical records and matches and write one which does, or vice versa.
Emailsend & Emailrecv: Possibly redundant. Send a tourney out over the network or
receive from the network and write a tourney which observes record/match alignment
conventions.
Getsrc: Reads all the source language sets off a tourney into files in the current directory.
Filenames are prepended to the actual text, inside the file. The new files are named according
to which sets they came from -- 0$ , 1$ , ....
Merge: Reads two tourneys and outputs one, in time order. Assumes that each of the two
input tourneys is in time order.
Split: Reads one tourney and outputs one or two, with specified sets going to the specified
output unit.
File Database: Finds new tourneys on a disk and prepares an index of the information in
pedigree sets 0[ and 0] .
Index Database: For a particular tourney, prepares a separate file of pointers to records with
time sets separated by interval skiptm as specified in the set 0[ .
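The core of the Merge utility is an ordinary two-way merge of time-ordered inputs. The sketch below shows that step on in-memory arrays. The struct timed_set record and the merge_by_time function are hypothetical simplifications: a real utility would read sets through the multiple unit functions of section 2.7.3 and take the time from the most recent time set.

```c
#include <stddef.h>

/* Hypothetical in-memory stand-in for a set with an associated time. */
struct timed_set { long time; char key[3]; };

/* Merge a[0..na) and b[0..nb), each already in time order, into out.
   out must have room for na + nb entries. On equal times the entry
   from a is written first. Returns the number of entries written. */
size_t merge_by_time(const struct timed_set *a, size_t na,
                     const struct timed_set *b, size_t nb,
                     struct timed_set *out) {
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (b[j].time < a[i].time) ? b[j++] : a[i++];
    while (i < na) out[k++] = a[i++];   /* drain what is left of a */
    while (j < nb) out[k++] = b[j++];   /* drain what is left of b */
    return k;
}
```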
4. How the Standard Works: Implementation & Layout
As previously stated, each tourney has a metaset containing information describing the
format of the other sets. The library uses this information to perform the functions specified
in the Interface description.
We will describe the sets in three groups, user sets, for which hypothetical Galileo data
sets will serve as examples; metasets of which the 0! set which describes the format of
other sets is the most important; and pedigree sets which specify the source of the data, such
as set 0[ .
Sets with characters chosen from the NASA SFDU PVL non-alphanumeric, non-reserved
list (see appendix) in the second byte of the key (the project) are "internal" sets (metasets or
pedigree sets) and are not normally of interest to the user. They are generated automatically
by the library. Note that these "internal" sets are pure ascii and will (presumably) never need
translation due to computer type change. A sequence of 0! sets near the beginning of a
tape defines all the user sets and most of the metasets and pedigree sets. This sequence will
contain [[ and ]] sets to mark off matches and is preceded by 0[ sets (also marked off
by [[ and ]] sets). Thus, these sets -- 0[ , [[ , ]] -- must be defined a priori. The
1[ , 2[ , etc. which have the same format as the 0[ are also defined a priori. In addition
to the self-documentation provided by the sets 0! , documentation for all sets should be in a
file maintained by TLG.
The following is intended to be a complete list of internal sets with comments on where
additional description is found
Sets whose "keys" are ascii alphabetic characters are used for user data. By convention,
the second character identifies the project and the first specifies a type of data within that
project. Thus each project is limited to 52 (upper and lower case) types of data. Likewise,
we are limited to 52 projects, if we avoid duplication of labels. Numerical characters and
some non-alphanumeric characters can be used (but only after consultation with TLG, please)
in cases where that limit presents a problem. We also have maintained the option of
expansion to 4-character keys.
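The key conventions can be expressed as a small classifier. In this sketch the names are hypothetical, and the standard C ispunct test is used only as an approximation of the SFDU PVL non-alphanumeric, non-reserved list given in the appendix; the real list is narrower.

```c
#include <ctype.h>

enum key_kind { KEY_USER, KEY_INTERNAL, KEY_OTHER };

/* Classify a two-byte key: first byte is the data type, second the project.
   KEY_USER:     both bytes alphabetic (ordinary user data).
   KEY_INTERNAL: second byte is punctuation (metaset or pedigree set).
                 Assumption: ispunct approximates the appendix list.
   KEY_OTHER:    anything else, e.g. numeric project codes that need
                 consultation before use. */
enum key_kind classify_key(const char key[2]) {
    unsigned char type = (unsigned char)key[0];
    unsigned char project = (unsigned char)key[1];
    if (isalpha(type) && isalpha(project)) return KEY_USER;
    if (ispunct(project)) return KEY_INTERNAL;
    return KEY_OTHER;
}
```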
Sets are blocked together in the buffer and on the court, forming a sequential list. When
putting sets, if a set threatens to overflow the buffer, the buffer is written out to the court and
the set goes near the beginning of a new buffer. Note that some computers may require 8-
byte variables (double precision floating point) to be aligned on 8-byte boundaries. In order
to avoid alignment problems, it is conventional to make set lengths a multiple of 8 bytes.
While one can frequently disregard this convention safely, it will almost certainly cause
trouble if set lengths are not multiples of 4.
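The alignment convention amounts to rounding each set length up to the next multiple of 8 bytes. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* Round a set length in bytes up to the next multiple of 8,
   leaving lengths that are already multiples of 8 unchanged. */
size_t padded_set_length(size_t nbytes) {
    return (nbytes + 7u) & ~(size_t)7u;
}
```

For example, a set holding 283 bytes of data would be padded to 288 bytes.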
Our first example is a short set with four games. The tG set contains a time label,
which specifies the time at the beginning of the instrument subcom cycle, and the rate, status,
and spin angle data associated with that cycle. Note that we used a lower case t in the key;
this set does not have ISO format time in it and is not compatible with the skiptm "direct
access" indexing scheme. The set 0! which describes this set tG is shown in section
4.2.2.
The set 0! for the EG includes:
4.2. Metasets
The 0! metaset specifies format info for most other sets, including all user sets. The
0[ serves both metaset (specifies beginning of tourney and skiptm) and pedigree set
functions and is described here. Other marker metasets include 0] (end of tourney), and
[[ and ]] (beginning and end of match).
4.2.1. Sets 0[ and 0] : Beginning and End of Tourney
Set 0[ is the first data on the tourney. It is in SFDU Parameter Value Language after the
synch characters. Note that white space -- blanks, tabs, carriage return, line feed, vertical tab,
and form feed -- is ignored outside variable or value strings, and can be used to improve
readability. Within variable names, white space is not allowed. Within values, white space is
allowed only if quoted. In general, values must be specified with a restricted ASCII subset.
Comments are specified inside /* */ pairs as in C and are also ignored. The general format is
variable = value with semicolon separators. The variables are as listed in the example below.
The set has a fixed length of 4000 bytes. It is padded with white space, preferably blanks,
after the last semicolon. After the 4000 bytes comes a set ]]. No more data is put in this
record, since the next set (after the [[ ) will be a 0! , which is usually too long to fit in the
same match.
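The fixed-length padding rule can be sketched as follows. The function name and the SET0_LEN constant are hypothetical, and the sketch covers only the blank padding after the last semicolon; it does not model the synch characters or the key.

```c
#include <stddef.h>
#include <string.h>

#define SET0_LEN 4000   /* fixed length of set 0[ in bytes */

/* Copy the "variable = value;" text into out and fill the remainder
   with blanks. out is not NUL-terminated. Returns 0 on success,
   or -1 if the text does not fit in SET0_LEN bytes. */
int pad_begin_set(const char *text, char out[SET0_LEN]) {
    size_t n = strlen(text);
    if (n > SET0_LEN) return -1;
    memcpy(out, text, n);
    memset(out + n, ' ', SET0_LEN - n);
    return 0;
}
```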
The set 0] contains match and set counts for the tourney which it terminates and has the
following format. Provision is made for "only" 64 different sets on the tourney. Additional
sets will be ignored.
4.2.2. Set 0! : Set Descriptor
The set 0! follows the 0[ set on the tape (ignoring match markers); it is also in
Parameter Value Language. Its format is very like that of the 0$ , i.e., it is a text set
with variable length. Maximum set length is 30,000 bytes. An example of a set 0! which
describes a set tG follows. The set was introduced in section 4.1.
Note that the pointyp, cmptyp, and setyp variables are chosen from a list to ensure
consistency. Those lists are in the appendix. The point types include, for example, A for
ascii character, S for signed short integer, s for unsigned short, ....
4.2.3. Sets [[ and ]] : Match Markers
These sets are used to mark off beginning and end of matches, which will normally be
aligned with beginning and end of record.
4.3. Pedigree Sets
The pedigree sets include the 0[ described above, and the 0$ set, which contains
source language from the program that created the tourney.
When a pedigree set is encountered on an input tourney, it is copied by the library onto
the output tourney. For example, a 0[ is copied; its key is changed to 1[ to indicate
generation level. In general, an n[ set is written to the output tourney whenever an (n-1)[
is encountered in the input tourney. The number in the set key is incremented by one each
time to indicate generation level. When 9 is reached, quit incrementing. Thus, pedigree sets
also include the sets 1[ , 2[ , ... 9[ . Note that the synch string in the 0[ must be
replaced by blanks when writing a 1[ .
Similarly, sets 1] , ... 9] record input of 0] sets and 1$ , ... 9$ sets record input
of 0$ sets. A set 1? is output if there is an unrecoverable read error on the input
tourney; later generations show 2? , ... 9? . Except for the 1? set, these sets are all
identical in format to their obvious progenitors. Formats for 1? and @@ are TBD.
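The generation rule for pedigree keys can be sketched as below. The function name is hypothetical, and the sketch handles only the key itself; blanking the synch string when a 0[ becomes a 1[ is not shown.

```c
#include <string.h>

/* Advance the generation digit of a pedigree key such as "0[" or "3$".
   The digit stops incrementing once it reaches '9'.
   Returns 0 on success, or -1 if the key is not a generation key
   (first byte a digit, second byte one of [ ] $ ?). */
int next_generation(char key[2]) {
    if (key[0] < '0' || key[0] > '9') return -1;
    if (key[1] == '\0' || strchr("[]$?", key[1]) == NULL) return -1;
    if (key[0] < '9') key[0]++;
    return 0;
}
```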
5. Why to Use the Standard: Justification and Tradeoffs
5.1. Short Tourneys
5.2. Long User Sets
The tennis standard was invented for short, fixed-length sets, and clearly excels for that
type of data. However, some sets will have to contain long alphanumeric strings of text for
documentation purposes. Set 0$ contains source listing for the programs which create the
tape, and might be 50,000 to 100,000 bytes long or more. Since these sets don't fit neatly
into a single physical record or buffer, it is necessary to break them into pieces. The size of
the pieces should reflect the data properties rather than be fixed, i.e., breaks in sets 0$
might be put where page breaks (formfeeds or ^L) occur in listings.
Sets with long variable length are broken into pieces shorter than one match. Set length
(setlen) is specified as -1 in the set description in the 0! set. The game containing the actual
data has its length specified in the first game and is also terminated by an end of match
marker, a special set.
Consider, on the other hand, an 800 × 800 × 8-bit image. The logical block length is
about 640 Kbytes, much longer than any reasonable physical record length, but fixed. This logical
block would have to be spanned across about 20 to 80 records or matches. These longer data sets
could, of course, always be broken down into smaller logical blocks and this procedure is
likely the best available. Thus the image mentioned above could be represented by 800 sets,
each containing one line.
Long fixed-length sets (setyp = lfl) are broken into pieces which will normally occupy a
full physical record or match, i.e., the pieces are the size of the buffer or slightly smaller.
Information is provided so that the library function get_set can reassemble the logical record
(file). See the example sets IG in section 4.1.
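The number of pieces needed for a long fixed-length set is a ceiling division of the set length by the piece length. In the sketch below the function name is hypothetical, and the piece sizes in the usage note are assumptions chosen only to reproduce the range quoted above for the image example.

```c
#include <stddef.h>

/* Number of pieces of at most piece_bytes needed to hold set_bytes.
   Returns 0 if piece_bytes is 0. */
size_t pieces_needed(size_t set_bytes, size_t piece_bytes) {
    if (piece_bytes == 0) return 0;
    return (set_bytes + piece_bytes - 1) / piece_bytes;
}
```

An 800 × 800 image at 8 bits per pixel is 640,000 bytes. With assumed piece sizes of 32,000 and 8,000 bytes, pieces_needed gives 20 and 80 pieces respectively.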
Also, consider Galileo or Voyager cosmic-ray events -- they are only 48 bits long, so
adding a 16- or 32-bit key to each one seems inefficient. One might prefer to gather them up
in groups, with each group corresponding to some particular time interval, such as an
instrument cycle. In that group we may well find that most events are null and should be
omitted for efficient storage. In that case, we end up with groups of variable length, but still
short compared to expected match lengths. (Galileo, for instance, has up to 48 events
telemetered per instrument cycle.)
Short variable-length sets (setyp = svl) will be marked off by synch strings and will
contain a specification of their length following the opening synch string. They may contain
either one data game of variable length or a variable number of identical data games. See the
example EG.
Note the alternatives mentioned in section 5.4 for dealing with large sets.
5.3. Platform and Medium Independence
The tennis metasets are pure ascii and should be readable on any system. The data they
describe, when binary, is generally in the native format of the system which wrote it, and that
native format is known, since the computer system is identified in the set 0[ . Thus the
library can translate without user assistance.
The library can be programmed to read data from any reasonable medium of storage, in
particular, it should have the ability to read tape, disk, Unix pipes, and probably network
connections. Direct access, of course, is only possible on disk and is not generally considered
a feature relevant to tennis. The get_time function can do a fast-forward sort of operation.
Direct access of data on disk would appear to be straightforward, requiring "only" creation of
an index and addition of the appropriate library functions. Since the index could be separate
from the tennis data itself, it does not require modification of the tennis format standard.
5.4. Alternative Possibilities
Alternatives to writing our own standard include the three "popular" standards of
national standing: NASA's SFDU and CDF standards, and NCSA's HDF.
The Standard Formatted Data Unit is a very "loose" standard, basically specifying a
means for documentation of packet like data units. The typical unit is megabytes long, with
kilobytes of documentation, and dozens of bytes of overhead. It is too much overhead for a
typical tennis set, which might be as short as, for example, 16 bytes. However, it is possible
to make a tennis tourney look very much like an SFDU, so that translation to SFDU format is
trivial.
The NASA Common Data Format is intended for storage of large arrays, and is not
appropriate for lists of structured sets.
The National Center for Supercomputing Applications' Hierarchical Data Format looks
similar to tennis in that it can support structures and short data sets. It has significantly more
overhead (20 bytes per equivalent of a set) and is normally used for large sets like images.
All of their free application support is image/array oriented. It is clear that HDF could
replace tennis, but at significant cost. The two are sufficiently similar that translation is easy.
6. Appendix: Various Lists
Format specification of points in each user game is necessary to allow automatic
translation. This is done with the variable pointyp using the abbreviations listed below,
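+-----------+----------------------+
| tennis.h  | user                 |
|           +----------------------+
|           | user functions       |
|           +----------------------+
|           | internal functions   |
|           +----------------------+
|           | system functions     |
+-----------+----------------------+
|          Operating System        |
+----------------------------------+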
The tennis library consists of four pieces: the user functions, the internal functions, the
system functions, and the tennis.h include file. The user functions, listed below, are those
functions which are normally used by the user. Internal functions are available to the user,
but are not expected to be needed. The user is expected to avoid using system routines,
which may be dependent on the operating system. The names of internal functions start with
the two characters t_ and the names of system functions start with the three characters ts_ .
All O.S. dependent functions are to be isolated into the system routines of the tennis library.
The tennis.h file consists of various definitions which might make the user's program more
readable. For example, one might define:
2.7.2. Internal and System Functions
3. Utilities
0[ This metaset marks beginning of tourney. It also contains pedigree of this
tourney and specifies skiptm.
1[ Information from 0[ sets on input tourney. Pedigree set.
n[ Information from (n-1)[ sets on input. Same format as 0[ .
0! A metaset, specifying format (set, game, point, bit lengths and offsets) for a
particular set. Several will be required for a tourney.
0] A metaset which specifies end of tourney.
1] Output when a set 0] is found in the input tourney. Pedigree set.
n] Output when a set (n-1)] is found in the input.
[[ Beginning of match metaset. Marks beginning of physical record or synch flag
for electronic transmission/storage media (or other non-record oriented media).
]] End of match metaset. Specifies end-of-physical-record or end of data within a
physical record of fixed size. Also used for synch detection with electronic
transmission/storage media. On standard tape allows variable length records.
1? Output if there is an unrecoverable read error on the input tourney. Pedigree
set; format TBD.
2? Output if there is a set 1? on the input tourney. Pedigree set.
n? Output if there is a set (n-1)? on the input. Same format as 1? .
0$ Pedigree set containing source listing of programs that created this output
tourney and description of all variables in sets. User must furnish file names
to library functions. An example of a long set.
1$ Copy input 0$ onto output tourney with new key. Pedigree.
n$ Copy (n-1)$ onto output tourney with new key. Same format.
@@ This pedigree set specifies change of computer type. Computer changes are
needed if mixed, untranslated computer data formats occur on same tape. Not
recommended. Format TBD.
4.1. User Sets
Set tG | Galileo time (example only)
-------------------------------------------------------------------------
Game# Name Length Index | Comments
key 2 0 | key = tG
1 TIME 12 2 | Time at beginning of instrument cycle.
2 STAT 8 14 | Status read out during cycle.
3 RATE 256 22 | Rates read out during cycle.
4 ANGL 10 278 | Angle parameters during cycle.
288
tG: Game 1 TIME
Point Length Index Comments
timtyp 1*A 0 S for SCET OR E for ERT
errflg 1*A 1 G for good or B
msec 1*S 2 0 to 999, millisecond of second
sec 1*i 4 seconds since start of 1989
sc_clk 1*i 8 see jpl doc'n of spacecraft clock
12
tG: Game 2 STAT
Point Length Index Comments
swa 1*b 0 status word a bit pattern
swb 1*b 1 status word b bit pattern
swc 1*b 2 ditto
swd 1*b 3
swe 1*b 4
swf 1*b 5
dqfl 2*b 6 16-bits of data quality flags, see jpl doc
8
tG: Game 3 RATE
Point Length Index Comments
ratea1 1*S 0 first readout of a scaler, negative flags prob
ratea2 1*S 2 second readout of a scaler
.
.
rateh16 1*S 254 16th readout of h scaler
256
tG: Game 4 ANGL
Point Length Index Comments
aqflg 1*A 0 quality flag, G or B
spare 1*A 1
offset 1*F 2 spin angle(time) = offset + arate*time
arate 1*F 6 "
10
The next example is another time set, now a TG set which does have the ISO format
time that can be used to create an index. It has two games: ISO time is in the first and the
second is a duplicate of the game 1 of set tG .
Set TG Galileo ISO Time (example only)
Game # Name Length Index Comments
key 2 0 key = TG
spare 2 2 two blanks
1 ISOTM 24 4 Same time in ISO format.
2 TIME 12 28 Time at beginning of instrument cycle.
40
TG: Game 1 ISOTM
Point Length Index Comments
timiso 24*A 0 Same time as game 2, ISO format, blank pad.
24
TG: Game 2 TIME
Point Length Index Comments
timtyp A 0 S for SCET OR E for ERT
errflg A 1 G for good or B
msec S 2 0 to 999, millisecond of second
sec i 4 seconds since start of 1989
sc_clk i 8 see jpl doc'n of spacecraft clock
12
The next example is an eG set, a short set containing a single cosmic ray event
composed of 12 bits of "tag" (discriminator) information and three 12-bit pulse height channel
numbers. These data are labeled with spacecraft clock offset relative to the time in the tG
set. The set is made up of one game, containing the five points just mentioned. Each of
these points is stored in a 16-bit integer, an "unsigned short integer" in VAX C usage. Note
that the storage overhead of adding a key to this short set is not trivial and observing the
eight-byte boundary convention costs even more, but the alternatives (see example set EG
below) are noticeably more complex. On most computers it would be ok to omit the 4 spare
bytes. This omission is also safe if there are no 8-byte floating point numbers in any of the
sets (frequently true).
Set eG Galileo Cosmic Ray Event
Game # Name Length Index Comments
key 2 0 key = eG
spare 4 2 four ascii blanks
1 EVENT 10 6 Pulse height event.
16
eG: Game 1 EVENT
Point Length Index Comments
cntoff S 0 Number of clock counts since time in tG (0-90)
tag s 2 Tag bits -- which discriminators fired
pha3 s 4 Pulse height from pha3
pha2 s 6 Pulse height
pha1 s 8 Pulse height
10
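The eG layout above can be mirrored by a C structure. The following is a minimal sketch, not part of the tennis library: the names eg_set, eg_init and eg_layout_ok are hypothetical, and it assumes a compiler that inserts no padding between these members (typical when 16-bit fields fall on even offsets, as they do here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C view of one eG set; member names follow the tables above. */
struct eg_set {
    char     key[2];    /* "eG" */
    char     spare[4];  /* four ascii blanks */
    int16_t  cntoff;    /* S: clock counts since time in tG (0-90) */
    uint16_t tag;       /* s: tag bits, which discriminators fired */
    uint16_t pha3;      /* s: pulse height from pha3 */
    uint16_t pha2;      /* s: pulse height */
    uint16_t pha1;      /* s: pulse height */
};

/* Fill in the key and the blank spare bytes; zero the event points. */
void eg_init(struct eg_set *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
    memcpy(s->key, "eG", 2);
    memset(s->spare, ' ', 4);
}

/* Returns 1 if the compiler laid the struct out exactly as in the tables:
   EVENT game at set index 6, 16 bytes in total. */
int eg_layout_ok(void)
{
    return sizeof(struct eg_set) == 16
        && offsetof(struct eg_set, cntoff) == 6
        && offsetof(struct eg_set, tag) == 8
        && offsetof(struct eg_set, pha1) == 14;
}
```

A program that relies on such a structure would call eg_layout_ok() once at startup and fall back to byte-by-byte access if it returns 0.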
The EG set contains a variable number (n) of events, all from a single instrument
cycle. Each event is in a game, containing the same five points as above. There are up to 48
events in an instrument cycle, hence, up to 48 games in the set, plus a "control" game at the
beginning and a "control" game at the end of the set. Conglomeration of very short sets in
this fashion to avoid storage overhead adds one more level of looping to any program which
reads the data -- inside of the loop which reads all sets is another loop which reads all events
within that set.
Set EG Galileo Events (example only)
Game # Name Length Index Comments
key 2 0 key = EG
spare 2 2
1 BCNTL 12 4 Specifies beginning, length of this set
2 EVENT 10 16 First non-null event in instrument cycle
3 EVENT 10 26 Second non-null event in instrument cycle
.
.
n+1 EVENT 10 16+10*(n-1) Last (nth) non-null event in the instrument cycle
which began at the time specified in tG
n+2 ECNTL 16+padlen 16+10*n Specifies that the set is ending.
32+10*n+padlen
EG: Game 1 BCNTL
Point Length Index Comments
bsynstr 4*A 0 synch string is ]V[B
n s 4 number of events in this particular set
setlen 6*A 6 number of bytes (32+10*n+padlen) in this particular
set, ascii integer, blank pad
12
EG: Game 2 EVENT
Point Length Index Comments
cntoff S 0 Number of clock counts since time in tG (0-90)
tag s 2 Tag bits
pha3 s 4 Pulse height from pha3
pha2 s 6 Pulse height
pha1 s 8 Pulse height
10
EG: Game n+2 ECNTL
Point Length Index Comments
flag S 0 -1 where cntoff might be expected
pad padlen*A 2 padlen blanks, for the 8 byte set length convention.
See below.
setlen 10*A 2+padlen as above
esynstr 4*A 12+padlen synch string is ]V[E
16+padlen
The user should arrange a unique flag value for the ECNTL game, but this arrangement
is redundant since n is available from the BCNTL game. Note that n must be known and
output at the beginning of the set. Buffering can be done in the put_set area since sets are
not split across buffers/matches.
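The redundancy between n and the ECNTL flag can be used as a consistency check when reading an EG set. The sketch below is illustrative only: the function names are hypothetical, the whole set is assumed to be in memory, and 16-bit values are assumed to be stored low byte first (VAX order), which the standard does not fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    EG_N_OFFSET    = 8,   /* BCNTL starts at set index 4, n at game index 4 */
    EG_FIRST_EVENT = 16,  /* set index of the first EVENT game */
    EG_EVENT_LEN   = 10   /* length of one EVENT game */
};

/* Read a 16-bit value stored low byte first (assumed byte order). */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Count EVENT games by scanning for the -1 flag that opens ECNTL.
   Returns -1 if no flag is found within buflen bytes. */
int eg_count_events(const unsigned char *set, size_t buflen)
{
    assert(set != NULL);
    int count = 0;
    for (size_t off = EG_FIRST_EVENT; off + 2 <= buflen; off += EG_EVENT_LEN) {
        if (rd16(set + off) == 0xFFFFu)   /* -1 as a 16-bit pattern */
            return count;
        count++;
    }
    return -1;
}

/* The event count as recorded in point n of the BCNTL game. */
int eg_stored_count(const unsigned char *set)
{
    assert(set != NULL);
    return rd16(set + EG_N_OFFSET);
}
```

A reader would treat the set as damaged if the two counts disagree.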
The length of blank padding, padlen is given by:
padlen = 8 - ((10*n) mod 8)
where 10 should be understood as the length of the game and is found by taking the
difference between the gamepnt's of the EVENT and ECNTL games. Zero is not allowed.
Once padlen is calculated, the set length, setlen, can be gotten from:
setlen = 10*n+32+padlen
Note that bsynstr is 4 bytes after beginning of set, as is the case for all special sets that have
synch strings. Note the placement and use of ascii for setlen. These factors should be
considered part of the protocol for short, variable-length sets. Note that the user need not use
the setlen variable, hence, need not translate it to binary. Also note that the user might well
not need n; the -1 in game ECNTL serves a similar purpose.
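The padlen and setlen formulas above can be sketched in C as follows. The function names are hypothetical; the constants 10 and 32 are the EVENT game length and the fixed part of the set taken from the EG tables.

```c
#include <assert.h>

enum {
    EG_EVENT_LEN = 10,  /* length of one EVENT game */
    EG_FIXED_LEN = 32   /* key, spare, BCNTL and the unpadded part of ECNTL */
};

/* Blank padding for a set holding n events. The result is 1 to 8;
   zero never occurs, so a full 8 bytes are added when 10*n is already
   a multiple of 8. */
int eg_padlen(int n)
{
    assert(n >= 0);
    return 8 - ((EG_EVENT_LEN * n) % 8);
}

/* Total set length in bytes; always a multiple of 8. */
int eg_setlen(int n)
{
    return EG_EVENT_LEN * n + EG_FIXED_LEN + eg_padlen(n);
}
```

For example, one event gives padlen = 6 and setlen = 48, while four events give padlen = 8 and setlen = 80.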
setkey = "EG ";
setname = galileo_events;
setlen = -1; /* flag for variable */
setyp = svl;
gamecnt = -1; /* flag for variable */
.
.
.
gamename = BCNTL;
gamepnt = 4;
bsynstr = ]V[B;
.
.
.
gamename = EVENT;
gamepnt = 16;
.
.
.
gamename = ECNTL;
gamepnt = 26; /* +(n-1)*10 */
esynstr = ]V[E;
.
.
.
.
Finally, an example of a long, fixed-length set (setyp=lfl). Postulate a Galileo image file
with a length of 639,100 bytes whose internal details are unspecified (so no automatic
translation is possible). Keep in mind that this image might be better handled as a file in
SFDU format, as an HDF, or as a CDF, in addition to the possibility of logical subdivision
already mentioned.
Set IG Galileo Images
Game # Name Length Index Comments
key 2 0 key is ascii string IG
spare 6 2 six blanks
1 CNTRL 288 8 locates the data
2 DATA nbyte 296 contains the data (32000)
nbyte+296 (32296)
IG: Game 1 CNTRL
Point Length Index Comments
seqno 4*A 0 Sequence no. of this set in the ncpf sets. 1, ... ncpf
ncpf 4*A 4 Number of sets required to contain this file
nbyte 8*A 8 Number of bytes in DATA.
fillen 8*A 16 Number of bytes in the image file.
filpad 8*A 24 Number of bytes pad needed to produce integer multiple
of nbyte.
flnm 256*A 32 Full name of the file
288
IG: Game 2 DATA
Point Length Index Comments
txt nbyte*b 0 A section of length nbyte from a data file of length
640000 (setlen). In this example, the data are all
one-byte unsigned integers.
nbyte
The set 0! for this set will include statements like:
setkey = "IG ";
setname = galileo_image;
setlen = -32296; /* note the minus sign */
setyp = lfl;
nbyte = 32000;
fillen = 639100;
filpad = 900;
ncpf = 20;
gamecnt = 2;
.
.
.
gamename = CNTRL;
gamepnt = 8;
.
.
.
gamename = DATA;
gamepnt = 296;
.
.
.
pointyp = 8000*b;
.
.
.
It is assumed that the pattern specified (8000*b here) is repeated throughout the data file.
If possible, make nbyte a multiple of the length of that pattern. Automatic translation of
representations may well not be possible for complicated patterns. Note again: if the pattern
is of reasonable length, it is probably better for the user to make multiple short fixed-length
sets of that length, rather than these huge sets.
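The values of ncpf and filpad in the IG example follow from fillen and nbyte by simple arithmetic. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>

/* Number of sets needed to hold fillen bytes at nbyte bytes per DATA game. */
long lfl_ncpf(long fillen, long nbyte)
{
    assert(fillen >= 0 && nbyte > 0);
    return (fillen + nbyte - 1) / nbyte;   /* round up */
}

/* Pad bytes needed to bring the file up to a whole number of DATA games. */
long lfl_filpad(long fillen, long nbyte)
{
    return lfl_ncpf(fillen, nbyte) * nbyte - fillen;
}
```

With fillen = 639100 and nbyte = 32000 this gives ncpf = 20 and filpad = 900, matching the values in the 0! statements for the IG set.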
0[ ]S[syBOT;
BEGIN_GROUP = trnydscr; /* PVL from here down */
bfsz = 32768;
cmptyp = VAXII; /* names should be chosen from a list for consistency */
cmpos = LTRX4.1; /* ditto */
cmpnm = odin.srl.caltech.edu;
trnm = galileo/data/lib/1989-294.297; /* tourney name */
trdt = 1991-05-02T05:14:23;
lbnm = /home/odin/usr/lib/tennis.clib;
lbdt = 1991-03-29T14:32:00;
mnnm = /home/odin/galileo/gensrc/flux/main.c;
mndt = 1991-05-01T13:17:02; /* fixed oxygen limits */
othnm = /home/odin/galileo/gensrc/flux/zcalc.c;
othdt = 1991-05-01T13:33:56;
skiptm = 0000-00-00T12:00:00; /* 12 hours */
END_GROUP = trnydscr;
Note that the time is in ISO standard time format, i.e., YYYY-MM-DDThh:mm:ss[.fff] or
YYYY-DDDThh:mm:ss[.fff], where T is a delimiter and [ ] implies optional.
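Both forms of the time string can be read with the standard sscanf function. The sketch below is one possible approach, not the library's own routine; the structure iso_time and the function iso_parse are hypothetical names, and no range checking of the fields is attempted.

```c
#include <assert.h>
#include <stdio.h>

struct iso_time {
    int year;
    int month;      /* 1-12, or 0 when the day-of-year form is used */
    int day;        /* day of month, or day of year when month == 0 */
    int hour, min;
    double sec;     /* includes the optional .fff fraction */
};

/* Parse YYYY-MM-DDThh:mm:ss[.fff] or YYYY-DDDThh:mm:ss[.fff].
   Returns 1 on success, 0 if neither form matches. */
int iso_parse(const char *s, struct iso_time *t)
{
    assert(s != NULL && t != NULL);
    if (sscanf(s, "%4d-%2d-%2dT%2d:%2d:%lf",
               &t->year, &t->month, &t->day,
               &t->hour, &t->min, &t->sec) == 6)
        return 1;
    t->month = 0;   /* fall back to the day-of-year form */
    if (sscanf(s, "%4d-%3dT%2d:%2d:%lf",
               &t->year, &t->day, &t->hour, &t->min, &t->sec) == 5)
        return 1;
    return 0;
}
```

For example, the trdt value 1991-05-02T05:14:23 parses through the first form, and a string such as 1989-294T12:00:00.500 parses through the second.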
Set 0] End of Tourney
Game # Name Length Index Comments
key 2 0 key is ascii string 0]
spare 2 2 2 blanks
1 MSTCNT 916 4 Match and set counts.
920
0] Game 1 MSTCNT
Point Length Index Comments
synstr 8*A 0 synch string is ]S[syEOT
mtcnt 12*A 8 Count of matches on tourney, ascii integer, padded with white
space.
stky 2*A 20 key of set whose count follows.
kycnt 12*A 22 Number of sets with key stky on this tourney.
.
.
.
stky 2*A 902 key of set whose count follows.
kycnt 12*A 904 Number of sets with key stky on this tourney.
4.2.2. Set 0! : Set Descriptor
BEGIN_GROUP = setdscr;
setkey = "tG "; /* two trailing blanks */
setname = galileo_time;
setlen = 288;
setyp = sfl; /* choose from sfl, lfl, lvl, ... */
setext = "time, status, rates, angle for one rate/status subcom cycle";
gamecnt = 4;
BEGIN_GROUP = gamedscr;
gamename = TIME;
gamepnt = 2;
gametext = "time for beginning of rate subcom cycle";
BEGIN_GROUP = pointdscr;
pointnm = timtyp;
pointpnt = 0;
pointyp = A; /* ascii */
pointext = "ASCII encoded logical variable; S for SCET,
E for ERT.";
END_GROUP = pointdscr;
BEGIN_GROUP = pointdscr;
pointnm = errflg;
pointpnt = 1;
pointyp = A;
pointext = "ASCII encoded logical variable; G for Good,
B for Bad.";
END_GROUP = pointdscr;
BEGIN_GROUP = pointdscr;
pointnm = msec;
pointpnt = 2;
pointyp = S; /* signed short integer */
pointext = "0 to 999, millisecond of second.";
END_GROUP = pointdscr;
BEGIN_GROUP = pointdscr;
pointnm = sec;
pointpnt = 4;
.
.
.
.
END_GROUP = pointdscr;
END_GROUP = gamedscr;
BEGIN_GROUP = gamedscr;
gamename = STAT;
gamepnt = 14;
BEGIN_GROUP = pointdscr;
pointnm = swa;
.
.
.
.
END_GROUP = pointdscr;
END_GROUP = gamedscr;
BEGIN_GROUP = gamedscr;
gamename = RATE;
gamepnt = 22;
BEGIN_GROUP = pointdscr;
pointnm = rate_sclr;
pointpnt = 0;
pointyp = 128*S;
pointext = "An array of 128 short, signed integers specify rate
scaler readouts. The sequence is a1, a2, a3,
..., h16. The letter refers to which scaler,
the number tells which subcom state.";
END_GROUP = pointdscr;
END_GROUP = gamedscr;
END_GROUP = setdscr;
Sets 0! may have a length exceeding the 30,000 bytes specified above. The library
will split them across match boundaries just as for the 0$ . Split locations can be specified
by the user by inserting form-feed characters (^L).
Set [[ Beginning of match
Game # Name Length Index Comments
key 2 0 key is ascii string [[
spare 2 2 two blanks
1 RECSQ 20 4 match sequence number
24
[[ Game 1 RECSQ
Point Length Index Comments
synstr 8*A 0 synch string is ]S[syBOM
rcsq 12*A 8 Sequence number of this match on this tourney. An ascii
integer padded with blanks.
20
Set ]] End of match
Game # Name Length Index Comments
key 2 0 key is ascii string ]]
spare 2 2 two blanks
1 RECLN 20 4 Match length
24
]] Game 1 RECLN
Point Length Index Comments
synstr 8*A 0 synch string is ]S[syEOM
rcln 12*A 8 Number of bytes of data in this match, including the match
markers themselves. An ascii integer padded with blanks.
20
There is a set [[ preceding the set 0[ at the beginning of the tourney and there is a set
]] after the 0] at the end of the tourney.
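Counts such as rcsq, rcln, mtcnt and kycnt are stored as fixed-width ascii integers padded with blanks, so a reader must convert them without relying on a terminating null. A minimal sketch follows; the function name is hypothetical, and it assumes non-negative values with blanks allowed on either side of the digits.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a fixed-width, blank-padded ascii integer field to a number.
   Returns -1 if the field is all blanks, contains anything other than
   digits and blanks, or has blanks embedded between digits. */
long long ascii_field_to_ll(const char *field, size_t width)
{
    assert(field != NULL);
    long long value = 0;
    int state = 0;   /* 0: leading blanks, 1: digits, 2: trailing blanks */
    for (size_t i = 0; i < width; i++) {
        char c = field[i];
        if (c >= '0' && c <= '9') {
            if (state == 2)
                return -1;          /* digits resumed after a blank */
            value = value * 10 + (c - '0');
            state = 1;
        } else if (c == ' ') {
            if (state == 1)
                state = 2;
        } else {
            return -1;              /* not a digit or a blank */
        }
    }
    return state == 0 ? -1 : value;
}
```

For a ]] set held in a buffer, the match length would be read with a call such as ascii_field_to_ll(set + 12, 12), since RECLN starts at set index 4 and rcln at game index 8.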
Set 0$ Source Language
Game # Name Length Index Comments
key 2 0 key is ascii string 0$
spare 2 2 two blanks
1 BCNTL 20 4 specifies length of text game
2 TEXT texlng 24 contains actual text
24+texlng
0$ Game 1 BCNTL
Point Length Index Comments
bsynstr 4*A 0 synch string is ]$[B
seqno 4*A 4 sequence no. of this set in string of sets containing the
text file
ncpf 4*A 8 total no. of sets used to hold the text file
texlng 8*A 12 No. of characters in the TEXT game. Need not be a multiple
of 8.
0$ Game 2 TEXT
Point Length Index Comments
txt texlng*A 0 actual text
texlng
The 0$ set occupies a match of its own due to its uncertain, presumably large length. Thus
the 8-byte convention and padding can be ignored.
A ascii character
B one byte integer
b one byte unsigned integer or bit pattern
S short integer -- nominally 16 bits
s unsigned short integer.
I integer -- nominally 32 bits
i unsigned integer
E extended integer -- nominally 64 bits
F native floating point -- 32 bits
D native double precision floating point -- 64 bits
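The abbreviations above, combined with the count*type notation used in the examples (A, 128*S, 8000*b), are enough to compute the byte length of a point. The sketch below uses the nominal sizes from the list; the function names are hypothetical, and symbolic counts such as nbyte*b are deliberately not handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Nominal size in bytes of one item of the given type code, 0 if unknown. */
int pointyp_size(char code)
{
    switch (code) {
    case 'A': case 'B': case 'b': return 1;
    case 'S': case 's':           return 2;
    case 'I': case 'i': case 'F': return 4;
    case 'E': case 'D':           return 8;
    default:                      return 0;
    }
}

/* Byte length of a pointyp such as "A" or "128*S"; 0 if it cannot be parsed. */
long pointyp_length(const char *spec)
{
    assert(spec != NULL);
    char *end;
    long count = strtol(spec, &end, 10);
    if (end == spec)
        count = 1;          /* no repeat count given, e.g. "A" */
    else if (*end == '*')
        end++;              /* step over the '*' to the type code */
    else
        return 0;           /* digits not followed by '*' */
    if (count <= 0 || end[0] == '\0' || end[1] != '\0')
        return 0;
    return count * pointyp_size(end[0]);
}
```

For example, 128*S yields 256 bytes, which agrees with the length of the RATE game in the tG set.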
The cmptyp variable is chosen from a list that currently includes: PDP11 , VAXII ,
SUN3 , SSPARC . Obviously more format types can be added as needed. In particular, I
think there are non-proprietary "standard" data representations including IEEE and Sun XDR,
which are used on most RISC machines.
The setyp list includes:
sfl short, fixed-length
lfl long, fixed-length
lvl long, variable-length
svl short, variable-length
The SFDU PVL non-alphanumeric, non-reserved list includes all the characters below:
& * ^ : @ $ ! / % + ? [ ]
There has been talk of reserving the [ and ] characters; this does not appear to affect
our usage, since we always quote these characters.