HDS currently uses the following tuning parameters to control its
behaviour.
- INALQ - Initial File Allocation Quantity:
-
This value determines how many blocks
are to be allocated when a new container file is created. The
default value of 2 is the minimum value allowed; the first block
contains header information and the second contains the top-level
object. Note that the host operating system may impose further
restrictions on allowable file sizes, so the actual size of a file may
not match the value specified exactly.
The value of this parameter reverts to its default value (or the value
specified by the HDS_INALQ environment variable) after each file is
created, so if it is being set from within a program, it must be set
every time that it is required.
If a file is to be extended frequently (through the creation of new
objects within it), then this parameter may provide a worthwhile
efficiency gain by allowing a file of a suitable size to be created
initially. On most UNIX systems, however, the benefits are minimal.
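For example, the following Fortran fragment pre-allocates 1000
blocks for a new container file (a sketch only; the file name
"bigfile" and the top-level object name and type are invented for
the example):

      INCLUDE 'SAE_PAR'
      INCLUDE 'DAT_PAR'
      CHARACTER*(DAT__SZLOC) LOC
      INTEGER DIM( 1 ), STATUS
      STATUS = SAI__OK

*  Pre-allocate 1000 blocks for the new file. INALQ reverts to its
*  default once the file has been created, so this call must precede
*  each HDS_NEW to which it is to apply.
      CALL HDS_TUNE( 'INALQ', 1000, STATUS )
      CALL HDS_NEW( 'bigfile', 'BIG', 'STRUC', 0, DIM, LOC, STATUS )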
- MAP - Use file mapping if available?
-
This value controls the method by which HDS performs I/O operations on
the values of primitive objects and may take the following values:
- MAP=1:
-
Use "file mapping" (if supported) as the preferred method of
accessing primitive data.
- MAP=0:
-
Use read/write operations (if supported) as the preferred data access
method.
- MAP=-1:
-
Use whichever method is normally faster for sequential access to all
elements of a large array of data.
- MAP=-2:
-
Use whichever method is normally faster for sparse random access to a
large array of data.
- MAP=-3:
-
Use whichever method normally makes the smaller demand on system
memory resources (normally this means a request to minimise use of
address space or swap file space, but the precise interpretation is
operating system dependent). This is normally the appropriate option
if you intend to use HDS arrays as temporary workspace.
HDS converts all other values to one. The value may be changed at any
time.
A subsequent call to HDS_GTUNE, specifying the "MAP" tuning
parameter, will return 0 or 1 to indicate which option was actually
chosen. This may depend on the capabilities of the host operating
system and the particular implementation of HDS in use. The default
value for this tuning parameter is also system dependent (see §).
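For example, a program which intends to use HDS arrays as temporary
workspace might request the memory-conserving option and then
enquire which access method was actually selected (a sketch; the
workspace size is arbitrary):

      INCLUDE 'SAE_PAR'
      INCLUDE 'DAT_PAR'
      CHARACTER*(DAT__SZLOC) WKLOC
      INTEGER DIM( 1 ), ACTVAL, PNTR, EL, STATUS
      STATUS = SAI__OK

*  Ask for the access method which minimises demands on memory
*  resources, appropriate for temporary workspace.
      CALL HDS_TUNE( 'MAP', -3, STATUS )

*  Enquire whether file mapping (1) or read/write access (0) was
*  actually selected on this system.
      CALL HDS_GTUNE( 'MAP', ACTVAL, STATUS )

*  Create and map a temporary one-million-element _REAL array.
      DIM( 1 ) = 1000000
      CALL DAT_TEMP( '_REAL', 1, DIM, WKLOC, STATUS )
      CALL DAT_MAPV( WKLOC, '_REAL', 'WRITE', PNTR, EL, STATUS )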
Typically, file mapping has the following plus and minus points:
- +
- It allows large arrays accessed via the HDS mapping
routines to be sparsely accessed in an efficient way. In this case,
only those regions of the array actually accessed will need to be
read/written, as opposed to reading the entire array just to access a
small fraction of it. This might be useful, for instance, if a
1-dimensional profile through a large image were being generated.
- +
- It allows HDS container files to act as "backing store"
for the virtual memory associated with objects accessed via the
mapping routines. The operating system can then use HDS files, rather
than its own backing (swap) file, to implement virtual memory
management. This means that you do not need to have a large system
backing file available in order to access large datasets.
- +
- For the same reason, temporary objects created with
DAT_TEMP and mapped to provide temporary
workspace make no additional demand on the system backing file.
- ?
- On some operating systems file mapping may be less
efficient in terms of elapsed time than direct read/write
operations. Conversely, on some operating systems it may be more
efficient.
- -
- Despite the memory efficiency of file mapping, there may be
a significant efficiency penalty when large arrays are mapped to
provide workspace. This is because the scratch data will often be
written back to the container file when the array is unmapped (despite
the fact that the file is about to be deleted). This can take a
considerable time and cannot be prevented as the operating system has
control over this process.
Unfortunately, on some operating systems, this process appears to
occur even when normal system calls are used to allocate memory
because file mapping is used implicitly. In this case, HDS's file
mapping is at no particular disadvantage.
- -
- Not all operating systems support file mapping and it
generally requires system-specific programming techniques, making it
more trouble to implement on a new operating system.
Using read/write access has the following advantages and
disadvantages:
- +?
- On some operating systems it may be more efficient than
file mapping in terms of elapsed time in cases where an array of data
will be accessed in its entirety (the normal situation). This is
generally not true of UNIX systems, however.
- -
- It is an inefficient method of accessing a small subset of
a large array because it requires the entire array to be
read/written. The solution to this problem is to access the
required subset explicitly using (e.g.) DAT_SLICE,
although this complicates the software somewhat (see the sketch
after this list).
- -
- It makes demands on the operating system's backing
file which the file mapping technique avoids (see above). As a result,
there is little point in creating scratch arrays with DAT_TEMP for
use as workspace unless file mapping is available (because the system
backing file will be used anyway).
- ?
- If an object is accessed several times simultaneously using
HDS mapping routines, then modifications made via one mapping may not
be consistently reflected in the other mapping (modifications will
only be updated in the container file when the object is unmapped, so
the two mappings may get out of step in the meantime). Conversely, if
file mapping is in use and a primitive object is mapped in its
entirety without type conversion, then
this behaviour does not occur (all mappings remain consistent). It may
occur, however, if a slice is being accessed or if type conversion is
needed.
It is debatable which behaviour is preferable. The best policy is to
avoid the problem entirely by not utilising multiple access to the
same object while modifications are being made.
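The following sketch illustrates the subset technique mentioned in
the list above. It assumes that LOC and STATUS have been set up
elsewhere, LOC being a locator for a 2-dimensional _REAL array with
at least 512 rows (the dimensions are invented for the example):

      CHARACTER*(DAT__SZLOC) SLICE
      INTEGER DIML( 2 ), DIMU( 2 ), PNTR, EL

*  Select elements (1,1) to (512,1) -- the first column -- and map
*  only that slice, so that the whole array need not be transferred.
      DIML( 1 ) = 1
      DIML( 2 ) = 1
      DIMU( 1 ) = 512
      DIMU( 2 ) = 1
      CALL DAT_SLICE( LOC, 2, DIML, DIMU, SLICE, STATUS )
      CALL DAT_MAPV( SLICE, '_REAL', 'READ', PNTR, EL, STATUS )

*  ... process the EL mapped elements via the pointer PNTR ...

*  Release the slice when done.
      CALL DAT_UNMAP( SLICE, STATUS )
      CALL DAT_ANNUL( SLICE, STATUS )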
- MAXWPL - Maximum Size of the "Working Page List":
-
This value specifies how many blocks
are to be allocated to the memory cache which HDS
uses to hold information about the structure of HDS files and objects
and to buffer its I/O operations when obtaining this information. The
default value is 32 blocks; this value cannot be
decreased. Modifications to this value will only have an effect if
made before HDS becomes active (i.e. before any call is made to
another HDS routine).
There will not normally be any need to increase this value unless
excessively complex data structures are being accessed with very large
numbers of locators simultaneously active.
- NBLOCKS - Size of the internal "Transfer Buffer":
-
When HDS has to move large quantities of data from one location to
another, it often has to store an intermediate result. In such cases,
rather than allocate a large buffer to hold all the intermediate data,
it uses a smaller buffer and performs the transfer in pieces. This
parameter specifies the maximum size in blocks
which this transfer buffer may have and is
constrained to be no less than the default, which is 32 blocks.
The value should not be too small, or excessive time will be spent in
loops which repeatedly refill the buffer. Conversely, too large a
value will make excessive demands on memory. In practice there is a
wide range of acceptable values, so this tuning parameter will almost
never need to be altered.
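If the defaults ever do need changing, both cache parameters can be
enlarged at the very start of a program, before any other HDS
routine has been called (a sketch; the values are arbitrary):

      INCLUDE 'SAE_PAR'
      INTEGER STATUS
      STATUS = SAI__OK

*  Enlarge the working page list to 64 blocks. This only takes
*  effect because HDS is not yet active.
      CALL HDS_TUNE( 'MAXWPL', 64, STATUS )

*  Use a 64-block transfer buffer for bulk data transfers.
      CALL HDS_TUNE( 'NBLOCKS', 64, STATUS )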
- NCOMP - Optimum number of structure components:
-
This value may be used to specify the expected number of components
which will be stored in an HDS structure. HDS does not limit the
number of structure components, but when a structure is first created,
space is set aside for creation of components in future. If more than
the expected number of components are subsequently created, then HDS
must eventually re-organise part of the container file to obtain the
space needed. Conversely, if fewer components are created, then some
space in the file will remain unused. The value is constrained to be
at least one, the default being 6 components.
The value of this parameter is used during the creation of the first
component in every new structure. It reverts to its default value (or
the value specified by the HDS_NCOMP environment variable)
afterwards, so if it is being set from within a program, it must be
set every time it is needed.
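For instance, if a new structure is known to require 20 components,
the following sketch reserves the space before the first component
is created (it assumes that LOC locates a newly created, empty
structure and that STATUS and the dummy array DIM are set up
elsewhere; the component name is invented):

*  Reserve space for 20 components. NCOMP is used when the first
*  component is created and then reverts to its default, so it must
*  be set anew for each structure.
      CALL HDS_TUNE( 'NCOMP', 20, STATUS )
      CALL DAT_NEW( LOC, 'COMP1', '_INTEGER', 0, DIM, STATUS )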
- SHELL - Preferred shell:
-
This parameter determines which UNIX shell should be used to interpret
container file names which contain "special" characters
representing pattern-matching, environment variable substitution,
etc. Each shell typically has its own particular way of
interpreting these characters, so users of HDS may wish to select the
same shell as they normally use for entering commands. The following
values are allowed:
- SHELL=2:
-
Use the "tcsh" shell (if available). If this is not available, then
use the same shell as when SHELL=1.
- SHELL=1:
-
Use the "csh" shell (C shell on traditional UNIX systems). If this
is not available, then use the same shell as when SHELL=0.
- SHELL=0 (the default):
-
Use the "sh" shell. This normally means the Bourne Shell on
traditional UNIX systems, but on systems which support it, the similar
POSIX "sh" shell may be used instead.
- SHELL=-1:
-
Don't use any shell for interpreting single file names (all special
characters are to be interpreted literally). When performing
"wild-card" searches for multiple files (with
HDS_WILD), use the same shell as when SHELL=0.
HDS converts all other values to zero.
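For example, to open a file whose name contains a literal "$"
(here "odd$name" is an invented file name, and LOC and STATUS are
assumed to be set up elsewhere), shell interpretation can be
suppressed first:

*  Interpret all file name characters literally, then open the file.
      CALL HDS_TUNE( 'SHELL', -1, STATUS )
      CALL HDS_OPEN( 'odd$name', 'UPDATE', LOC, STATUS )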
- SYSLCK - System wide lock flag:
-
This parameter is present for historical reasons and has no effect on
UNIX systems.
- WAIT - Wait for locked files?
-
This parameter is present for historical reasons and currently has no
effect on UNIX systems, where HDS file locking is not implemented.