The
hash
routines manipulate collections of dynamic, scoped hash tables.
A hash table provides an association between a
key
and its
value.
A
key
is a sequence of
char
elements and a
value
is a user supplied pointer to the value.
Each hash table has a dynamic number of slots,
each pointing to the head of a forward linked
collision chain.
Hashing occurs as follows:
a
hash function
takes a
key
as an argument and returns a non-negative index.
The index modulo the table size produces a
slot that points to a
collision chain.
The collision chain is sequentially searched until a match is found for
key.
The hash tables are automatically resized as new entries are added.
Each hash table has one key type.
The default key type is a pointer to a null-terminated string.
The alternate key type is a pointer to a fixed length byte buffer,
declared for a given table by the
hashalloc()
function described below.
Hash table information is partitioned into two parts for efficient scoping.
The
root
part contains fixed information that is shared among a set of related
hash tables.
The remaining information is maintained on a per-table basis.
These basic types are defined in the header file
hash.h
(alternate names are listed in parenthesis):
- Hash_table_t (HASHTABLE)
The per-table information.
The readonly public elements are:
- int buckets
The number of table entries.
- char* name
The hash table name.
- root
-
The root information.
The public elements are:
- int root->accesses
The number of lookups.
- int root->collisions
The number of lookup collisions.
- Hash_table_t* scope
The table that this scope covers,
NULL
if the table is not a scope.
- int size
The current hash table size.
- Hash_bucket_t (HASHBUCKET)
A collision chain hash bucket.
The public structure elements are:
- char* hashname(Hash_bucket_t*)
Returns a pointer to the hash bucket key given the bucket pointer.
- char* value
The value associated with the key.
- Hash_header_t (HASHHEADER)
The hash bucket header that must be the first element in all user defined
buckets.
HASH_VALUE
may not be used with user defined buckets.
- Hash_position_t (HASHPOSITION)
Stores the hash table position for
hashscan.
The public elements are:
- Hash_bucket_t* bucket
The current hash bucket pointer.
The routines are:
- Hash_table_t* hashalloc(Hash_table_t* ref, int op, ...)
Creates a new hash table and returns a pointer to the table.
malloc(3)
is used to allocate space for the table.
NULL
is returned if the table cannot be created.
ref
is a pointer to a reference hash table that provides
default values for unspecified information.
The new hash table and
ref
share the same root information.
If
ref
is
NULL
then new root information is created for the new table.
The remaining arguments appear in
op-arg
pairs, followed by a final
0
argument.
The
op-arg
pairs are:
- HASH_alloc, (char(*)()) alloc
alloc
is a function that is called to process
Hash_bucket_t
value
assignments.
The single argument is
char* value
and the processed
char*
value is returned.
- HASH_clear, int flags
flags
are the
ref
flags to be cleared in the new hash table.
See
HASH_set
below.
- HASH_compare, (int(*)()) compare
Specifies an alternate
key
comparison function.
The arguments and return value for
compare
are the same as for
strncmp(3)
if
HASH_namesize
is specified and
strcmp(3)
otherwise.
The first argument is the
key
from the current hash bucket on the
collision chain
and the second argument is the user supplied
key.
- HASH_free, (int(*)()) free
free
is a function that is called when a hash bucket is freed.
If
HASH_BUCKET
was set in
hashalloc
then the hash bucket pointer is passed, otherwise the bucket
value
pointer is passed.
- HASH_hash, (int(*)()) hash
Specifies an alternate
key
hash function.
A
char*
key argument (and, if
HASH_namesize
is specified, an
int
key size argument) is passed to
hash.
The return value must be a non-negative
int.
- HASH_meanchain, int meanchain
Specifies the mean collision chain length.
The hash table is automatically resized when this value is exceeded.
The default mean chain length is 2.
- HASH_name, char* name
Associates
name
with the hash table.
Used by
hashdump).
- HASH_namesize, int namesize
The fixed size in bytes for
keys
in the table.
If
namesize
is 0 (the default) then the
keys
are interpreted as null-terminated strings.
- HASH_set, int flags
Changes the hash table flags by
oring
in
flags.
The flags, which may be
ored
together, are:
- HASH_ALLOCATE
Keys for new hash table entries are to be copied to data areas obtained from
malloc(3).
- HASH_FIXED
Fixes the hash table size, disabling any automatic table resizing.
- HASH_SCOPE
The new hash table is a scope that is to be pushed on top of
ref.
ref
must be
non-NULL.
- HASH_va_list, va_list ap
ap
is a
va_list
variable argument list pointer
(see
<stdarg.h>).
- Hash_table_t* hashfree(Hash_table_t* tab)
The hash table
tab
is freed.
The scope covered table pointer is returned,
NULL
if
tab
is not a scope.
- char* hashlook(Hash_table_t* tab, char* name, int flags, char* value)
Operates on the key
name
in the hash table
tab
according to
flags
and
value.
A
Hash_bucket_t
pointer is returned unless otherwise noted.
There are three basic lookup operations:
- HASH_CREATE
name
is entered into the top level scope if it does not already exist.
If
name
also appears in a lower scope and
HASH_ALLOC
is set for the table then the new bucket will share the
name
field value with the lower scope.
- HASH_DELETE
name
is deleted from the top level scope if it exists.
NULL
is returned.
- HASH_LOOKUP
The scopes are searched in order from top to bottom for
name .
The bucket pointer for the first occurrence is returned.
NULL
is returned if
name
is not found.
The basic operations may be qualified by the following
(the qualifiers are restricted to the basic operations in
the parenthesized list):
- HASH_BUCKET (HASH_CREATE,HASH_DELETE,HASH_LOOKUP)
name
is a pointer to a bucket that has already been entered into the table.
- HASH_FIXED (HASH_CREATE)
value
is taken to be the size of the hash bucket to be created for
name
in the top level scope.
The minimum bucket size is silently restricted to
sizeof(Hash_header_t).
- HASH_INSTALL (HASH_CREATE)
name
is a pointer to a bucket that has not been entered into the table.
- HASH_NOSCOPE (HASH_LOOKUP)
The lookup is restricted to the top level scope.
- HASH_OPAQUE (HASH_CREATE,HASH_DELETE)
Sets
(HASH_CREATE)
or clears
(HASH_DELETE)
the
opaque
property for the bucket.
An opaque bucket is not visible in lower scopes.
- HASH_SCOPE (HASH_CREATE,HASH_DELETE)
All scopes are searched for the bucket.
If the bucket is not found for
HASH_CREATE
then a new bucket is created in the lowest scope.
- HASH_VALUE (HASH_CREATE,HASH_LOOKUP)
For
HASH_CREATE
the bucket
value
field is set to
value
and the bucket
name
value is returned.
For
HASH_LOOKUP
the bucket
value
field is returned,
NULL
if the bucket is not found.
If
name
NULL
then the name from the most recent
hashlook()
is used, avoiding recomputation of some internal parameters.
- char* hashget(Hash_table_t* tab, char* name)
Returns the value
associated with the key
name
in the hash table
tab.
If
name
is
NULL
then the name from the most recent
hashget()
is used, avoiding recomputation of some internal parameters.
NULL
is returned if
name
is not in the table.
All scope covered tables are searched.
- Hash_bucket_t* hashlast(Hash_table_t* tab)
Returns a pointer to the most recent hash bucket for
tab.
The value is set by
hashlook(),
hashscan()
and
hashwalk().
- char* hashput(Hash_table_t* tab, char* name, char* value)
Set the value of the key
name
to
value
in the top level scope of the hash table
tab.
name
is entered into the top level scope if necessary.
The (possibly re-allocated) key name pointer is returned
(see
HASH_ALLOCATE).
If
name
is 0 then the most recent lookup
name
to
hashlook()
or
hashget()
is used.
This eliminates a re-hash and re-lookup of
name.
- int hashwalk(Hash_table_t* tab, int flags, (int(*)()) walker, char* handle)
The function
walker
is applied to each entry (not covered by a scope starting at
tab)
in the hash table
tab.
If
flags
is
HASH_NOSCOPE
then only the top level hash table is used, otherwise the walk includes
all scope covered tables.
walker
is called with
char*
key
as the first argument,
char*
value
as the second argument, and
char*
handle
as the third argument.
handle
may be
0.
The walk terminates after the last entry or when
walker
returns a negative value.
The return value of the last call to
walker
is returned.
Only one walk may be active within a collection of scoped tables.
- Hash_position_t* hashscan(Hash_table_t* tab, int flags)
Returns a
Hash_position_t
pointer for a sequential scan on the hash table
tab.
If
flags
is
HASH_NOSCOPE
then only the top level hash table is used, otherwise the scan includes
all scope covered tables.
Only one scan may be active within a collection of scoped tables.
hashdone()
must be called to terminate the scan.
0
is returned on error.
- Hash_bucket_t* hashnext(Hash_position_t* pos)
Returnes a pointer to the next bucket in the sequential scan set up by
hashscan()
on
pos.
If no elements remain then
0
is returned.
- void hashdone(Hash_position_t* pos)
Completes a scan initiated by
hashscan()
on
pos.
- int hashset(Hash_table_t* tab, int flags)
Sets the flags for the hash table
tab
by
oring
in
flags.
Only
HASH_ALLOCATE
and
HASH_FIXED
may be set.
- int hashclear(Hash_table_t* tab, int flags)
Clears the flags for the hash table
tab
by masking out
flags.
Only
HASH_ALLOCATE
and
HASH_FIXED
may be cleared.
- void hashdump(Hash_table_t* tab, int flags)
Dumps hash table accounting info to standard error.
If
tab
is
NULL
then all allocated hash tables are dumped, otherwise only information on
tab
is dumped.
If
flags
is
HASH_BUCKET
then the hash bucket
key-value
pairs for each collision chain are also dumped.
- void hashsize(Hash_table_t* tab, int size)
Changes the size of the hash table
tab
to
size
where
size
must be a power of 2.
Explicit calls to this routine are not necessary as hash tables
are automatically resized.
- int strhash(char* name)
Hashes the null terminated character string
name
using a linear congruent pseudo-random number generator algorithm
and returns a non-negative
int
hash value.
- int memhash(char* buf, int siz)
Hashes the buffer
buf
of
siz
bytes using a linear congruent pseudo-random number generator algorithm
and returns a non-negative
int
hash value.
- long strsum(char* name, long sum)
Returns a running 31-bit checksum of the string
name
where
sum
is
0
on the first call and
the return value from a previous
memsum
or
strsum
call otherwise.
The checksum value is consistent across all implementations.
- long memsum(char* buf, int siz, long sum)
Returns a running 31-bit checksum of buffer
buf
of
siz
bytes where
sum
is
0
on the first call and
the return value from a previous
memsum
or
strsum
call otherwise.
The checksum value is consistent across all implementations.