@SubAppendix
    @Title { Arenas, arena sets, and arrays }
    @Tag { ha }
@Begin
@LP
This section describes the Ha module, which implements arenas, arena
sets, and extensible arrays.  It is used throughout KHE in place of
@C { malloc } to obtain heap memory.  Its header file is @C { howard_a.h }.
This module was originally taken from another project of the
author's, called Howard.  For a general overview of arenas and
arena sets, see Section {@NumberOf intro.arena}.
ha. @Index { Ha }
@BeginSubSubAppendices

@SubSubAppendix
    @Title { Arenas }
    @Tag { ha.arenas }
@Begin
@LP
An arena is an object (a pointer to a private struct) of type
@C { HA_ARENA }.  It represents an unlimited amount of @I { arena memory }:
heap memory held in an arena so that it can be freed all at once later.
All heap memory allocated by KHE is arena memory.
@PP
Every arena belongs to exactly one @I { arena set }, an object of type
@C { HA_ARENA_SET } representing a set of arenas, from which the arena
obtains its memory and to which it returns its memory when it is no
longer needed.  Function
@ID @C {
HA_ARENA HaArenaMake(HA_ARENA_SET as);
}
creates an arena belonging to arena set @C { as }.  And
@ID @C {
void HaArenaDelete(HA_ARENA a);
}
deletes @C { a }, returning all its arena memory to @C { a }'s
arena set, where it becomes available for use by the other
arenas of that arena set.
# Also,
# @ID @C {
# void HaArenaClear(HA_ARENA a);
# }
# frees @C { a }'s memory without deleting @C { a }, returning
# @C { a } to its state immediately after @C { HaArenaMake }.
@PP
In practice, functions @C { KheSolnArenaBegin } and @C { KheSolnArenaEnd }
(Section {@NumberOf solutions.top.arenas}) are the best way to create
and delete arenas.  They call @C { HaArenaMake } and @C { HaArenaDelete }.
@PP
Operations
@ID @C {
void *HaAlloc(HA_ARENA a, size_t size);
void HaMake(X res, HA_ARENA a);
}
allocate memory.  @C { HaAlloc } returns a pointer to at least
@C { size } bytes of arena memory from @C { a }, aligned suitably
for any data.  Macro @C { HaMake } sets @C { res } (which may have any
pointer type @C { X }) to point to at least @C { sizeof(*res) } bytes
of memory obtained from @C { HaAlloc }.  These objects may not be
resized.  For resizable objects, see Section {@NumberOf ha.allocator}.
@PP
An arena obtains its memory from its arena set, which obtains it from
@C { malloc }.  As long as @C { malloc } can supply memory, an arena
will supply memory to the user.  If a request for memory from
@C { malloc } fails, then the arena set will make a long jump using
a jump environment passed to it by @C { HaArenaSetJmpEnvBegin }
(Section {@NumberOf ha.arena_sets}) if that is available, otherwise
it will abort.
@PP
The memory pointed to by a variable @C { a } of type @C { HA_ARENA }
is arena memory from a private arena held by @C { a }'s arena set.  A
deleted arena has its memory reclaimed, but the arena object itself
is saved in a free list in its arena set, where it is available to
later calls to @C { HaArenaMake }.  This makes the cost of calling
@C { HaArenaMake } and @C { HaArenaDelete } small enough to allow
many small arenas to come and go.  Section {@NumberOf ha.allocator}
has more detail on this.
# This memory is freed along with the rest of the arena memory when
# the arena is deleted.
@PP
For maximum efficiency, arena memory is not initialized to zero.
This can cause two problems.  First, an uninitialized object field
can cause a program to behave differently each time it runs, which
is a nightmare to debug.  The user must be very disciplined about
initializing objects, which basically means enclosing every call
to @C { HaMake } in a function something like this:
@ID @C {
THING ThingMake(FIELD1 f1, FIELD2 f2, HA_ARENA a)
{
  THING res;
  HaMake(res, a);
  res->field1 = f1;
  res->field2 = f2;
  return res;
}
}
where every field of @C { THING } is initialized within @C { ThingMake }.
The second problem is that @C { HaArrayContains }
(Section {@NumberOf ha.arrays}) compares array elements using
@C { memcmp }.  In the unlikely case where the elements are structs with
gaps in them, the user needs to call @C { memset } to zero out the
struct before its fields are initialized and it is copied into the array.
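@PP
As a minimal sketch of this second precaution, the code below zeroes a
struct before setting its fields, so that a byte-wise comparison of the
kind made by @C { memcmp } sees the same bytes in the gaps of equal
values.  @C { TOY_PAIR }, @C { ToyPairInit }, and @C { ToyPairSameBytes }
are hypothetical names invented for this illustration; they are not part
of Ha.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical element type; most ABIs leave a gap after tag. */
typedef struct {
  char tag;
  int  value;
} TOY_PAIR;

/* Initialize *p so that its gaps are zero as well as its fields set. */
void ToyPairInit(TOY_PAIR *p, char tag, int value)
{
  assert(p != NULL);
  memset(p, 0, sizeof(*p));       /* zero everything, gaps included */
  p->tag = tag;
  p->value = value;
}

/* Byte-wise equality, the kind of test a memcmp-based search performs. */
bool ToyPairSameBytes(const TOY_PAIR *p, const TOY_PAIR *q)
{
  assert(p != NULL && q != NULL);
  return memcmp(p, q, sizeof(TOY_PAIR)) == 0;
}
```

Without the call to @C { memset }, the gap after @C { tag } would hold
whatever the memory held before, and two pairs with equal fields could
compare unequal.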
# Arguably this is not Ha's problem, because structs are usually
# created in stack memory and then copied.
@PP
When the user misuses memory, obscure errors occur which often
manifest themselves as failures within the memory allocator.
When this happens, it is an easy mistake to jump to the conclusion
that the memory allocator has a bug.  In fact, memory allocators
get such a thorough workout that they are probably the least likely
places to find a bug.
@PP
Although they are not the cause of your bug, they can help you
find it.  Function
@ID @C {
void HaArenaCheck(HA_ARENA a);
}
carries out a check of arena @C { a }'s data structure, and aborts
the program if it finds something wrong.  If you call this
function twice, and the first call succeeds but the second one
fails, then your program has misused memory at some point between
the two calls.
# @PP
# There is a compiler flag at the top of source file @C { ha_all.c }:
# @ID @C {
# #define HA_ARENA_FULL_CHECK 0
# }
# By changing the value of this flag from @C { 0 } to @C { 1 } and
# recompiling, you can instruct Ha to call @C { HaArenaFullCheck }
# at the start and end of every call on arena @C { a }, including
# calls on resizable memory (from which the arena is reachable).
# This should give you lots of points that you can interpolate between.
@PP
Finally,
@ID @C {
void HaArenaDebug(HA_ARENA a, int verbosity, int indent, FILE *fp);
}
produces a debug print of @C { a } onto @C { fp } with the given
verbosity and indent.
# A call to
# @C { HaArenaMake } generates one call to @C { malloc } requesting
# 14 words of memory, a few internal function calls which will
# certainly be inlined, and about 15 initializing assignments.  A
# call to @C { HaArenaDelete } generates one call to @C { free } and
# three assignments when the arena is empty, growing logarithmically
# (i.e. negligibly slowly) as the amount of memory allocated in the
# arena increases.
@End @SubSubAppendix

@SubSubAppendix
    @Title { Arena sets }
    @Tag { ha.arena_sets }
@Begin
@LP
An @I { arena set } is a set of arenas.  It is also where the memory
allocated by deleted arenas is stored and made available to other
arenas, and where a stack of @C { longjmp } environments is kept,
allowing the arena set to perform a long jump when memory runs out.
@PP
To create a new, empty arena set, call
@ID @C {
HA_ARENA_SET HaArenaSetMake(void);
}
Memory for the arena set object is taken from a private arena kept by
the arena set for its own purposes.  If this call cannot obtain memory
from @C { malloc } for this private arena, it aborts; but that will
never happen if all arena sets are created near the start of the run,
as happens in practice.
@PP
To create and delete an arena belonging to arena set @C { as }, the
calls are @C { HaArenaMake(as) } and @C { HaArenaDelete(a) }, as
we saw previously.  Each arena belongs to exactly one arena set
and it knows which arena set it belongs to.
@PP
There should be one arena set per thread.  When a thread terminates,
its memory needs to be passed on to the arena set of the parent
thread.  To do this, the parent thread should call
@ID @C {
void HaArenaSetMerge(HA_ARENA_SET dest_as, HA_ARENA_SET src_as);
}
It moves the memory used by @C { src_as } to @C { dest_as },
destroying @C { src_as }.  Also,
@ID @C {
void HaArenaSetDelete(HA_ARENA_SET as);
}
deletes @C { as } and all its arenas, returning all their memory to
the operating system via calls to @C { free }.  In practice there is
no need to do this.
@PP
When an arena is asked for memory but has none to give, it asks its
arena set for more.  When the arena set cannot give any more (when
memory runs out), by default the arena set will cause an abort.  But
there is a more graceful alternative, which is to pass a @C { longjmp }
environment to the arena set; it will then execute a long jump instead
of aborting.  The calls for this are
@ID @C {
void HaArenaSetJmpEnvBegin(HA_ARENA_SET as, jmp_buf *env);
void HaArenaSetJmpEnvEnd(HA_ARENA_SET as);
}
These must occur in matching pairs; the second is made when
the jump environment becomes unavailable.  These pairs of calls
may be nested; the arena set holds a stack of jump environments
and makes its long jump using the environment on top of the stack.
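@PP
The nesting rule can be pictured with the following toy stack of jump
environments.  It is only a sketch under stated assumptions:
@C { TOY_ENV_STACK }, @C { ToyEnvBegin }, @C { ToyEnvEnd },
@C { ToyOutOfMemory }, and @C { ToyNestedDemo } are hypothetical names,
the stack has a fixed capacity chosen for brevity, and nothing here is
taken from Ha's implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_MAX_ENVS 8

/* Hypothetical stand-in for the jump environment stack of an arena set. */
typedef struct {
  jmp_buf *envs[TOY_MAX_ENVS];
  int      count;
} TOY_ENV_STACK;

void ToyEnvBegin(TOY_ENV_STACK *s, jmp_buf *env)
{
  assert(s->count < TOY_MAX_ENVS);
  s->envs[s->count++] = env;
}

void ToyEnvEnd(TOY_ENV_STACK *s)
{
  assert(s->count > 0);
  s->count--;
}

/* What the toy does when memory runs out: jump to the environment on
   top of the stack if there is one, otherwise abort. */
void ToyOutOfMemory(TOY_ENV_STACK *s)
{
  if( s->count > 0 )
    longjmp(*s->envs[s->count - 1], 1);
  abort();
}

/* Returns true if a simulated failure arrives at the inner environment. */
bool ToyNestedDemo(void)
{
  TOY_ENV_STACK s = { {NULL}, 0 };
  jmp_buf outer, inner;
  if( setjmp(outer) != 0 )
    return false;                 /* the wrong environment was used */
  ToyEnvBegin(&s, &outer);
  if( setjmp(inner) != 0 )
    return true;                  /* the top environment received the jump */
  ToyEnvBegin(&s, &inner);
  ToyOutOfMemory(&s);             /* simulate a failed request for memory */
  return false;                   /* not reached */
}
```

In this sketch the simulated failure lands at @C { inner }, the
environment pushed most recently, and @C { outer } is left untouched.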
@PP
In the unlikely case where there is no memory to store @C { env }
within @C { as }, the action taken depends on what is on the jump
environment stack before the call to @C { HaArenaSetJmpEnvBegin }.
That is, the new jump environment takes effect only after
@C { HaArenaSetJmpEnvBegin } returns.
@PP
An important problem with exception handling is giving up
resources---closing files and so on.  Ha does not pretend to
offer a complete solution to this problem, but it does handle
one major part of it:  just before the long jump is taken, it
deletes each arena created in @C { as } since the most recent
call to @C { HaArenaSetJmpEnvBegin } in @C { as } but not yet
deleted.  This does not mean that all memory consumed since the
most recent call to @C { HaArenaSetJmpEnvBegin } is freed, only
memory in arenas created in @C { as } since that call.  At times
the user may have a choice of creating an arena before or after a
call to @C { HaArenaSetJmpEnvBegin }; the choice will be determined
by whether that arena should continue in use after an exception.
@PP
Here is an example using these functions.  Suppose we have a solver
@C { MySolve } which might use a lot of memory, and we want to abandon
it gracefully when memory runs out.  We do this:
@ID @C {
#include <setjmp.h>
...
bool MySolve(KHE_SOLN soln, ...)
{
  jmp_buf env;  bool success;
  ...
  if( setjmp(env) == 0 )
  {
    /* get here on direct call; do the solve */
    KheSolnJmpEnvBegin(soln, &env);
    success = DoMySolve(soln);
    KheSolnJmpEnvEnd(soln);
  }
  else
  {
    /* get here by calling longjmp; abandon the solve */
    fprintf(stderr, "MySolve abandoned (out of memory)\n");
    success = false;
    KheSolnJmpEnvEnd(soln);
  }
  return success;
}
}
The call to @C { KheSolnJmpEnvBegin } just calls @C { HaArenaSetJmpEnvBegin }
on @C { soln }'s arena set, and @C { KheSolnJmpEnvEnd } just calls
@C { HaArenaSetJmpEnvEnd }.
@PP
A thread which takes a long jump will have previously consumed all
available memory, leaving none for the other threads.  To avoid this
problem, one can call
@ID @C {
void HaArenaSetLimitMemory(HA_ARENA_SET as, size_t limit);
}
This limits the total amount of memory consumed by @C { as } to
@C { limit } bytes.  If satisfying some request for memory would
exceed this limit, the long jump or abort is taken, as though memory
had run out completely.  If @C { limit } is set to the amount of
available memory divided by the number of threads, calling this
function ensures that each thread gets its fair share of available memory.
@C { KheArchiveParallelSolve } (Section {@NumberOf general_solvers.parallel})
has a @C { ps_avail_mem } option which does this.
# Finding a suitable value for this
# option can be awkward.  On Linux, file @C { /proc/meminfo } and system
# call @C { sysinfo } report the available memory, but there seems to be
# no portable way to find it out, which is why KHE offers it to the user
# in the form of an option.
# (But see the description of
# @C { ps_avail_mem } in Section {@NumberOf general_solvers.parallel}
# for a way to set the value automatically on Linux.)
# @PP
# Instead of an integer the value of @C { ps_avail_mem } may be
# @C { sysinfo }, in which case @C { sysinfo } is called to obtain the
# value.  This value is only available when the @C { USE_SYSI NFO }
# preprocessor flag (defined at the top of @C { khe_platform.h })
# has value 1.  It is the default value when it is available,
# otherwise the
@PP
Finally, function
@ID @C {
void HaArenaSetDebug(HA_ARENA_SET as, int verbosity, int indent,
  FILE *fp);
}
produces a debug print of @C { as } onto @C { fp } with the given
verbosity and indent.
@End @SubSubAppendix

@SubSubAppendix
    @Title { Arrays }
    @Tag { ha.arrays }
@Begin
@LP
Like C's native arrays, Ha's arrays are @I { generic }:  they may
have elements of any one type, of any width, and the C compiler
will report an error if there is a type mismatch.  But, unlike
C's arrays, Ha's arrays are @I { extensible }:  they may grow to
any length during use.
@PP
The type of an extensible generic array must be declared using a
@C { typedef } invoking macro @C { HA_ARRAY }.  For example, the
following declarations already appear within @C { howard_a.h }:
@ID @C {
typedef HA_ARRAY(bool)        HA_ARRAY_BOOL;
typedef HA_ARRAY(char)        HA_ARRAY_NCHAR;
typedef HA_ARRAY(wchar_t)     HA_ARRAY_CHAR;
typedef HA_ARRAY(short)       HA_ARRAY_SHORT;
typedef HA_ARRAY(int)         HA_ARRAY_INT;
typedef HA_ARRAY(int64_t)     HA_ARRAY_INT64;
typedef HA_ARRAY(void *)      HA_ARRAY_VOIDP;
typedef HA_ARRAY(char *)      HA_ARRAY_NSTRING;
typedef HA_ARRAY(wchar_t *)   HA_ARRAY_STRING;
typedef HA_ARRAY(float)       HA_ARRAY_FLOAT;
typedef HA_ARRAY(double)      HA_ARRAY_DOUBLE;
}
ha_array @Index @C { HA_ARRAY }
Create your own array type by placing any type at all between the
parentheses.
@PP
To gain access to @C { wchar_t } and @C { int64_t }, @C { howard_a.h }
includes header files @C { <wchar.h> } and @C { <stdint.h> }.  Use of
@C { long } just leads to trouble, in the author's experience, since
its width varies across platforms, so @C { int64_t }, a standard
64-bit signed integral type, is used instead.
@PP
A variable of any of these types is a struct (not a pointer to a struct)
with three fields:  a typed pointer to arena memory holding the elements,
the number of elements that that memory @I can hold, and the number of
elements that it currently @I does hold.  Structs are used rather than
pointers to structs because extensible arrays are mainly used as aids to
the implementation of other abstractions, and are thus usually private
to one class or function, not shared.  So there is no problem in having
their structs lie directly in class objects or on the call stack, rather
than in arena memory at the end of a pointer; and it is more efficient
this way.
@PP
An array may be a field of an object that lies in one arena, while the
array's arena memory lies in a different arena.  But that would be
unusual, since the array would normally have the same lifetime as
the object, and thus would naturally belong in the same arena.
@PP
When an array is initialized, it contains no elements and no arena
memory is allocated for it.  Its pointer to arena memory points to
a shared empty array in its arena.  As the array grows, arena memory
for it is allocated and reallocated, but always from the same arena.
Each reallocation approximately doubles the number of elements that
the array can hold, ensuring that another reallocation will not be
needed soon, while wasting at most as much space as is used.  Memory
freed by a reallocation becomes available to hold other resizable
objects in the same arena.
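@PP
The growth policy can be sketched with a toy generic array.  This is an
illustration only: @C { TOY_ARRAY }, @C { ToyGrow }, @C { ToyArrayInit },
@C { ToyArrayAddLast }, and @C { ToyFillDemo } are hypothetical names, the
new capacity is assumed to be @C { 2*capacity + 1 }, and @C { realloc }
stands in for arena memory, which Ha uses instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generic array: element pointer, capacity, and count. */
#define TOY_ARRAY(T) struct { T *items; int capacity; int count; }
typedef TOY_ARRAY(int) TOY_ARRAY_INT;

/* Return storage for roughly twice as many elements as before. */
void *ToyGrow(void *items, int *capacity, size_t elem_size)
{
  int new_capacity = 2 * *capacity + 1;
  void *res = realloc(items, (size_t) new_capacity * elem_size);
  if( res == NULL )
    abort();                      /* a real allocator might long jump here */
  *capacity = new_capacity;
  return res;
}

#define ToyArrayInit(a) ((a).items = NULL, (a).capacity = 0, (a).count = 0)

/* Evaluates a more than once, as macro operations often do. */
#define ToyArrayAddLast(a, x)                                              \
  ( (a).count == (a).capacity                                              \
      ? (void) ((a).items = ToyGrow((a).items, &(a).capacity,              \
                                    sizeof(*(a).items)))                   \
      : (void) 0,                                                          \
    (a).items[(a).count++] = (x) )

/* Fill an array with 0 .. n-1 and return its final capacity. */
int ToyFillDemo(int n)
{
  TOY_ARRAY_INT a;
  int i, capacity;
  ToyArrayInit(a);
  for( i = 0; i < n; i++ )
    ToyArrayAddLast(a, i);
  for( i = 0; i < n; i++ )
    assert(a.items[i] == i);
  capacity = a.capacity;
  free(a.items);
  return capacity;
}
```

Under these assumptions the capacity runs through 1, 3, 7, 15, and so on,
so adding 100 elements takes seven reallocations and ends with room for 127.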
@PP
If one array is assigned to another using the C @C { = } operator
or parameter passing, the arrays will have separate copies of their
three fields, yet share their elements.  This is only safe when the
original array is not used afterwards, or the array's length remains
constant thereafter.
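@PP
The hazard can be seen in a small sketch.  @C { TOY_INT_ARRAY } and
@C { ToyAliasDemo } are hypothetical names, and @C { malloc } stands in
for arena memory; the point is only that a struct copy duplicates the
three fields while the elements stay shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct { int *items; int capacity; int count; } TOY_INT_ARRAY;

/* Returns true if a struct copy shares elements but not its count. */
bool ToyAliasDemo(void)
{
  TOY_INT_ARRAY a, b;
  bool shared, diverged;
  a.items = malloc(4 * sizeof(int));
  assert(a.items != NULL);
  a.capacity = 4;
  a.count = 0;
  a.items[a.count++] = 10;

  b = a;                          /* copies the three fields only */
  b.items[0] = 99;                /* visible through a as well */
  shared = (a.items[0] == 99);

  b.items[b.count++] = 20;        /* b now holds two elements ... */
  diverged = (a.count == 1 && b.count == 2);   /* ... but a still says one */

  free(a.items);                  /* free once: both structs point here */
  return shared && diverged;
}
```

Had the addition through @C { b } triggered a reallocation, @C { a }
would also have been left pointing at memory it no longer owns.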
@PP
Ha's array operations are macros, necessarily so since they are
generic.  They take structs as parameters, not pointers to structs.
This encourages the user to think of arrays as opaque objects, like
file pointers and so on.  A disadvantage of macros is that their
parameters may be evaluated more than once during a call.  Unless
explicitly stated otherwise, the user should assume that all
parameters of all array operations are evaluated more than once.
In many cases they are.
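@PP
To see why this matters, consider the following sketch, in which a macro
mentions its parameter twice.  @C { ToyArrayLast }, @C { ToyNextArray },
@C { toy_calls }, and @C { ToyDoubleEvalDemo } are hypothetical names
made up for the illustration; the macro is not Ha's definition of any
operation.

```c
#include <assert.h>

typedef struct { int *items; int count; } TOY_INT_ARRAY;

/* Hypothetical accessor written the obvious way: a appears twice. */
#define ToyArrayLast(a) ((a).items[(a).count - 1])

int toy_calls = 0;                /* how many times ToyNextArray has run */

/* A parameter expression with a side effect. */
TOY_INT_ARRAY *ToyNextArray(TOY_INT_ARRAY *a)
{
  toy_calls++;
  return a;
}

/* Returns the number of times the macro argument was evaluated. */
int ToyDoubleEvalDemo(void)
{
  int elems[3] = { 1, 2, 3 };
  TOY_INT_ARRAY a = { elems, 3 };
  int last;
  toy_calls = 0;
  last = ToyArrayLast(*ToyNextArray(&a));
  assert(last == 3);
  return toy_calls;
}
```

The demonstration returns 2, not 1.  The safe habit is to pass only
simple variables and side-effect-free expressions to array operations.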
@PP
The first operation on any array must be to initialize it by a call to
@ID @C {
void HaArrayInit(ARRAY_X a, HA_ARENA arena);
}
This sets @C { a } to empty and specifies the arena which will
supply its memory when elements are added later.  Here and
throughout this section, array operations are presented as though
they are functions, even though they are really macros, and
@C { ARRAY_X } stands for the type created by
@ID @C {
typedef HA_ARRAY(X) ARRAY_X;
}
for any type @C { X }.  To find the arena that an initialized array
@C { a } lies in, call
@ID @C {
HA_ARENA HaArrayArena(ARRAY_X a);
}
In general, memory allocated by Howard's functions can only be
reclaimed by deleting the arena.  However, resizable objects
such as arrays are an exception, and function
@ID @C {
void HaArrayFree(ARRAY_X a);
}
frees the arena memory used by @C { a }, if any.  This does not free
@C { a } itself; @C { a } is not a pointer.  It frees the memory
holding the elements of @C { a }, making it available to other
resizable objects in @C { a }'s arena.
@PP
To find the number of elements currently stored in an array, call
@ID @C {
int HaArrayCount(ARRAY_X a);
}
The elements have indexes from @C { 0 } to @C { HaArrayCount(a) - 1 }
inclusive, as usual in C.  For efficiency, array bounds are not checked
by any Ha operation.  To access the element with index @C { i }, or the
first element, or the last element, call
@ID @C {
X HaArray(ARRAY_X a, int i);
X HaArrayFirst(ARRAY_X a);
X HaArrayLast(ARRAY_X a);
}
@C { HaArray } and @C { HaArrayFirst } evaluate their parameters only
once, and all three operations can be used as variables as well as
values.  So one can write, for example,
@ID @C {
HaArray(frequencies, i)++;
}
to increment the element of @C { frequencies } whose index is @C { i },
or
@ID @C {
do_something(&HaArrayFirst(a))
}
to pass a pointer to an element.
@PP
To add one element to an array, the operations are
@ID @C {
X HaArrayAdd(ARRAY_X a, int i, X x);
X HaArrayAddFirst(ARRAY_X a, X x);
X HaArrayAddLast(ARRAY_X a, X x);
}
@C { HaArrayAdd } adds @C { x } to @C { a } at index @C { i }, which
may range from @C { 0 } to @C { HaArrayCount(a) } inclusive.  It
makes room for @C { x } by shifting elements up one place, including
reallocating arena memory if necessary.  It returns @C { x }.
@C { HaArrayAddFirst(a, x) } is equivalent to @C { HaArrayAdd(a, 0, x) },
and @C { HaArrayAddLast(a, x) } is a faster version of
@C { HaArrayAdd(a, HaArrayCount(a), x) }.
@ID @C {
void HaArrayFill(ARRAY_X a, int len, X x);
}
adds @C { x } 0 or more times to the end of @C { a }, stopping
when @C { HaArrayCount(a) } is at least @C { len }.
@ID @C {
X HaArrayPut(ARRAY_X a, int i, X x);
}
replaces the value at index @C { i } with @C { x } and returns @C { x }.
It evaluates its parameters only once.  And
@ID @C {
void HaArrayMove(ARRAY_X a, int dest_i, int src_i, int len);
}
uses the C @C { memmove } function to move (that is, copy with
overlapping allowed) the @C { len } elements starting at index
@C { src_i } to index @C { dest_i }.  It assumes without checking
that @C { len >= 0 } and that @C { src_i } and @C { dest_i }
are at least @C { 0 } and at most @C { HaArrayCount(a) - len }.  It
is used by @C { HaArrayAdd } above and @C { HaArrayShiftRight },
@C { HaArrayShiftLeft }, and @C { HaArrayDeleteAndShift } below to
do their shifting.
@PP
For searching an array there is
@ID @C {
bool HaArrayContains(ARRAY_X a, X x, int *pos);
}
It returns @C { true } if @C { a } contains @C { x }, setting @C { *pos }
to the index of its first occurrence; otherwise it returns @C { false },
leaving @C { *pos } unchanged.  The individual comparisons are made by
@C { memcmp }.
@PP
Two operations shift the entire contents of an array to the right or
left:
@ID @C {
void HaArrayShiftRight(ARRAY_X a, int n, X x);
void HaArrayShiftLeft(ARRAY_X a, int n);
}
@C { HaArrayShiftRight } shifts the elements of @C { a } to the
right by @C { n } places.  Afterwards, the array has @C { n }
more elements than it did before.  The first @C { n } places,
opened up by the shift, are each initialized to @C { x }.  It
is up to the caller to ensure that @C { 0 <= n }.
@C { HaArrayShiftLeft } shifts the elements of @C { a } to the
left by @C { n } places.  Afterwards, the array has @C { n }
fewer elements than it did before.  It is up to the caller to
ensure that @C { 0 <= n } and @C { n <= HaArrayCount(a) }.
@PP
Two operations delete the @C { i }th element, offering two ways to
fill the gap it leaves behind:
@ID @C {
void HaArrayDeleteAndShift(ARRAY_X a, int i);
void HaArrayDeleteAndPlug(ARRAY_X a, int i);
}
@C { HaArrayDeleteAndShift } shifts the elements after @C { i }
down one place; @C { HaArrayDeleteAndPlug } assigns the last element
to position @C { i }, then deletes the last element.  Operations
@ID @C {
bool HaArrayFindDeleteAndShift(ARRAY_X a, X x, int *pos);
bool HaArrayFindDeleteAndPlug(ARRAY_X a, X x, int *pos);
}
call @C { HaArrayContains }, returning what it returns but also
using @C { HaArrayDeleteAndShift } or @C { HaArrayDeleteAndPlug }
to delete the element it found, if any.  There are also
@ID @C {
void HaArrayDeleteLast(ARRAY_X a);
void HaArrayDeleteLastSlice(ARRAY_X a, int n);
void HaArrayClear(ARRAY_X a);
}
for deleting the last element, deleting the last @C { n } elements
(which can be done very efficiently), and deleting the last
@C { HaArrayCount(a) } elements, leaving the array empty.  And
@ID @C {
X HaArrayLastAndDelete(ARRAY_X a);
}
returns the last element of @C { a } and also deletes it from @C { a }.
Deleting elements does not free any memory.  The vacated memory
remains available to the array, should it decide to grow again.
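@PP
The difference between the two ways of filling the gap left by a deleted
element can be sketched on a plain C array of @C { int }.
@C { ToyDeleteAndShift } and @C { ToyDeleteAndPlug } are hypothetical
stand-ins written for this illustration; they follow the descriptions
above but are not Ha's macros.

```c
#include <assert.h>
#include <string.h>

/* Delete a[i], preserving order, by moving later elements down one place. */
void ToyDeleteAndShift(int *a, int *count, int i)
{
  assert(0 <= i && i < *count);
  memmove(&a[i], &a[i + 1], (size_t) (*count - i - 1) * sizeof(int));
  (*count)--;
}

/* Delete a[i] in constant time by moving the last element into the gap. */
void ToyDeleteAndPlug(int *a, int *count, int i)
{
  assert(0 <= i && i < *count);
  a[i] = a[*count - 1];
  (*count)--;
}
```

Deleting index 1 from 1, 2, 3, 4, 5 leaves 1, 3, 4, 5 when shifting and
1, 5, 3, 4 when plugging.  Plugging is the cheaper choice whenever the
order of the elements does not matter.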
@PP
Here are some more complex operations that change the
contents of arrays.
@ID @C {
void HaArraySwap(ARRAY_X a, int i, int j, X tmp);
}
Swap the elements of @C { a } at positions @C { i } and @C { j }.
Parameter @C { tmp } is a variable used to hold an element temporarily
while swapping.
@ID @C {
void HaArrayWholeSwap(ARRAY_X a, ARRAY_X b, ARRAY_X tmp);
}
Swap two whole arrays, that is, swap the contents of their structs,
using @C { tmp } as a temporary.
@ID @C {
void HaArrayAppend(ARRAY_X dest, ARRAY_X source, int i);
}
Append the elements of @C { source } to the end of @C { dest },
leaving @C { source } unchanged.  Parameter @C { i } is a
variable used as an external cursor when scanning @C { source }.
@ID { 0.98 1.0 } @Scale @C {
void HaArraySort(ARRAY_X a, int(*compar)(const void *, const void *));
}
Sort @C { a } by means of a call to @C { qsort }, using @C { compar }
as the comparison function.
@ID @C {
void HaArraySortUnique(ARRAY_X a,
  int(*compar)(const void *, const void *));
}
Like @C { HaArraySort }, except that after sorting, elements are
deleted until no two adjacent elements return 0 when compared using
@C { compar }.  If this is done purely for uniqueifying, it is common
to implement @C { compar } as a mere subtraction of two pointers.
However, on a 64-bit architecture this yields a 64-bit integer, and
merely returning this cast to @C { int }, the return type of @C { compar },
does not work.  Use a conditional expression returning @C { -1 }, @C { 0 },
or @C { 1 } instead.
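@PP
A comparison function of the recommended kind might look like the sketch
below, which uses 64-bit integers in place of pointers so that the
example is self-contained.  @C { ToyInt64Cmp } and @C { ToySortUnique }
are hypothetical names; the second merely imitates, on a plain C array,
what is described above for @C { HaArraySortUnique }.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Compare two int64_t values without subtracting them. */
int ToyInt64Cmp(const void *p1, const void *p2)
{
  int64_t v1 = *(const int64_t *) p1;
  int64_t v2 = *(const int64_t *) p2;
  return v1 < v2 ? -1 : v1 > v2 ? 1 : 0;
}

/* Sort a[0 .. count-1], drop adjacent duplicates, return the new count. */
int ToySortUnique(int64_t *a, int count)
{
  int i, j;
  assert(count >= 0);
  if( count == 0 )
    return 0;
  qsort(a, (size_t) count, sizeof(int64_t), &ToyInt64Cmp);
  j = 0;
  for( i = 1; i < count; i++ )
    if( ToyInt64Cmp(&a[j], &a[i]) != 0 )
      a[++j] = a[i];              /* keep the first of each run of equals */
  return j + 1;
}
```

The values in a test of this function can differ by more than an
@C { int } can hold, which is exactly the situation in which returning a
truncated difference goes wrong.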
@PP
Finally, Ha offers iterator macros for traversing arrays:
@ID @C {
HaArrayForEach(ARRAY_X a, X x, int i)
HaArrayForEachReverse(ARRAY_X a, X x, int i)
}
These iterate over the elements of @C { a }, in forward or reverse
order.  Within each iteration, @C { x } is one element of @C { a }
and @C { i } is the index of @C { x } in @C { a }.  For example,
@ID @C {
HaArrayForEach(strings, str, i)
  fprintf(stdout, "string %d: %s\n", i, str);
}
prints the elements of array @C { strings }.  Like all Howard's
iterators, both macros expand to
@ID @C { for( ... ; ... ; ... ) }
and may be used syntactically in any way that this construct may be.
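@PP
One way to write an iterator that expands to a single @C { for } header
is sketched below.  @C { ToyArrayForEach }, @C { TOY_INT_ARRAY }, and
@C { ToySumEven } are hypothetical names, and the expansion shown is an
assumption made for illustration, not a copy of Ha's macro.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int *items; int count; } TOY_INT_ARRAY;

/* Hypothetical iterator: a single for header, so it nests like one. */
#define ToyArrayForEach(a, x, i)                                       \
  for( (i) = 0;                                                        \
       (i) < (a).count ? ((x) = (a).items[(i)], true) : false;         \
       (i)++ )

/* Sum the even elements of a, using the iterator with a nested if. */
int ToySumEven(TOY_INT_ARRAY a)
{
  int x, i, sum = 0;
  ToyArrayForEach(a, x, i)
    if( x % 2 == 0 )
      sum += x;
  assert(i == a.count);           /* the cursor ends one past the last index */
  return sum;
}
```

Because the macro is just a @C { for } header, the body may be a single
statement or a braced block, and @C { break } and @C { continue } behave
as they do in any loop.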
@End @SubSubAppendix

#@SubSubAppendix
#    @Title { Version string }
#@Begin
#@LP
#Macro @C { HA_HOWARD_VERSION } is a wide character string defining
#ha_howard_version @Index @C { HA_HOWARD_VERSION }
#the current version of Howard.  For example, its value was
#@ID @C { L"Howard Version 1.0 (June 2011)" }
#at the time of writing.
#@End @SubSubAppendix

#@SubSubAppendix
#    @Title { Version string and assertions }
#@Begin
#@LP
#Ha is used by every Howard library, so it has accumulated a few
#miscellaneous, generally useful features.  These are documented
#in this section.
#@PP
#Macro @C { HA_HOWARD_VERSION } is a wide character string defining
#ha_howard_version @Index @C { HA_HOWARD_VERSION }
#the current version of Howard.  For example, its value was
#@ID @C { L"Howard Version 1.0 (June 2011)" }
#at the time of writing.  Ha also offers two functions for checking
#assertions:
#@ID @C {
#void HaAbort(wchar_t *fmt, ...);
#void HaAssert(bool cond, wchar_t *fmt, ...);
#}
#ha.abort @Index @C { HaAbort }
#ha.assert @Index @C { HaAssert }
#@C { HaAbort }'s parameters are the same as @C { wprintf }'s, but it
#prints onto @C { stderr } and then calls @C { abort }.  @C { HaAssert }
#does nothing if @C { cond } is @C { true }, and it does what @C { HAbort }
#does if @C { cond } is @C { false }.  It is a function, not a macro, so
#its parameters must be well-defined whether @C { cond } is true or not.
#@End @SubSubAppendix

@SubSubAppendix
    @Title { Howard's memory allocator }
    @Tag { ha.allocator }
@Begin
@LP
# @I { The memory allocator has been substantially redone since this
# section was written.  Some parts of this section are still correct
# (especially concerning resizable memory), other parts are not. }
# @PP
This section contains more information about Howard's memory allocator
than the user is likely to need.  It explains how memory is aligned,
presents the operations for allocating resizable arena memory, and
describes how the allocator works.  What it does not do is explain
the rationale behind the various features.  They are all concerned
with trying to avoid problems that memory allocators are prone to,
as indeed is the basic design of arena sets containing arenas.
@PP
Howard's memory allocator promises to return memory aligned correctly for
any kind of data.  However, there seems to be no standard way to find out
what that alignment is.  So file @C { howard_a.h } includes a typedef of
a type @C { HA_ALIGN_TYPE }, and the allocator assumes that memory aligned
with this type aligns with all types.  By default this typedef is
@ID @C {
typedef void *HA_ALIGN_TYPE;
}
but it may be changed to any type whose size is at least the size of
a pointer.  @C { HaArenaSetMake } checks this condition and aborts if it
does not hold, since the implementation depends on it.
@PP
@I { Resizable arena memory } is arena memory that can be resized.  It
is usually accessed via resizable arrays and symbol tables, but it can
also be accessed directly, using these functions:
@ID @C {
void *HaResizableAlloc(HA_ARENA a);
void *HaResizableReAlloc(void *resizable, size_t size);
void HaResizableFree(void *resizable);
}
@C { HaResizableAlloc } returns a pointer to @C { 0 } bytes of
resizable arena memory from arena @C { a }.  This may seem useless,
but experience shows that it produces the most convenient initial
value.  All pointers to 0 bytes from @C { a } are shared, so there is
no memory cost.  @C { HaResizableReAlloc } assumes that @C { resizable }
points to resizable arena memory, and begins by finding its arena and
size.  If @C { size } is no larger than this old size, @C { resizable }
is returned.  If @C { size } is larger, a pointer to @C { size } or
more bytes of resizable arena memory from the same arena is returned.
Its contents, up to the old size, are copied from @C { resizable } using
@C { memcpy }, and @C { resizable } is reclaimed for re-use by other
calls for resizable memory from the same arena (unless its size is 0).
@C { HaResizableFree } reclaims @C { resizable } just as
@C { HaResizableReAlloc } does, but without allocating new memory.
@PP
The user can find the arena and size in bytes of a block of resizable
arena memory:
@ID @C {
HA_ARENA HaResizableArena(void *resizable);
size_t HaResizableSize(void *resizable);
}
@C { HaResizableSize } may be larger than the size requested when
@C { resizable } was allocated.  Like ordinary arena memory,
resizable arena memory is aligned suitably for any kind of data.
# Resizable arena memory is not initialized to
# zero, however.
@PP
The remainder of this section describes the implementation of the arena
memory allocator.
@PP
An arena obtains its memory from its arena set, which obtains it
from @C { malloc }.  A piece of memory given to an arena set by
@C { malloc } and passed on to an arena will be called a @I { chunk };
a piece of memory given to the user by an arena will be called
a @I { block }.
@PP
Let @M { A } be @C { sizeof(HA_ALIGN_TYPE) }.  Since the memory
returned has to align, every block might as well contain (and does
contain) a number of bytes which is a multiple of @M { A }.  If the
number requested is not a multiple of @M { A }, it is increased to
the next multiple of @M { A }.  The resulting wasted memory is
called the @I { alignment overhead }.  It will be negligible in
practice, and often zero.  One block of memory of size @M { A },
suitably aligned, will be called one @I { word }.
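@PP
The rounding just described is a one-line calculation.  The sketch below
assumes the default alignment type; @C { TOY_ALIGN_TYPE },
@C { ToyWords }, and @C { ToyRoundUp } are hypothetical names used only
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void *TOY_ALIGN_TYPE;     /* stand-in for HA_ALIGN_TYPE */

/* Number of words needed to hold size bytes. */
size_t ToyWords(size_t size)
{
  size_t A = sizeof(TOY_ALIGN_TYPE);
  return (size + A - 1) / A;
}

/* size rounded up to a multiple of A; the excess is the alignment overhead. */
size_t ToyRoundUp(size_t size)
{
  size_t res = ToyWords(size) * sizeof(TOY_ALIGN_TYPE);
  assert(res >= size && res % sizeof(TOY_ALIGN_TYPE) == 0);
  return res;
}
```

A request for one byte therefore occupies one whole word, while a request
whose size is already a multiple of @M { A } incurs no overhead at all.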
@PP
An arena cannot satisfy all the block requests it receives out of one
chunk.  So it calls on its arena set more than once, and maintains a
linked list of the chunks it receives.  The arena object contains
a pointer to the most recently obtained chunk; this chunk begins
with a pointer to the next most recently obtained chunk, and so
on.  The chunk also records the initial number of words available
for allocation to end users in the chunk, the current number of words
still available for allocation, and another integer, called @M { k }
below, which is roughly the base 2 logarithm of the number of words
in the chunk.  Most chunks are large, so this @I { chunk overhead }
is negligible.
@PP
The linked list serves two purposes.  First, when the arena is
deleted, its memory is freed by traversing the list and returning
the chunks to the arena set.  The arena object itself lies in a
separate arena (the arena set's private arena), but the block
list header objects described below lie within chunks like user
blocks do, and so are freed when the chunks are freed.  Second,
when the end user requests a new block, the first step is to try
to obtain it from the first chunk on the list.  Later chunks may
not be entirely used up, but they are never tried.
@PP
When one chunk holds many blocks, arena allocation is much better
than general allocation.  Blocks are allocated contiguously within
chunks, with no memory overhead other than the alignment overhead.
Unless a new chunk is needed, allocating a block is very fast:
just round up the requested size, test whether memory is available
in the first chunk, and make two assignments.
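@PP
The chunk list and the fast path can be sketched as a toy allocator.
Several things here are assumptions made for brevity: the names
@C { TOY_CHUNK }, @C { TOY_ARENA }, @C { ToyChunkAdd }, @C { ToyAlloc },
and @C { ToyArenaDelete } are hypothetical, the first chunk is taken to
hold 64 words, and chunks come straight from @C { malloc } and go back
to @C { free }, whereas Ha passes them through an arena set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *TOY_WORD;           /* one word, assumed suitably aligned */

typedef struct toy_chunk {
  struct toy_chunk *next;         /* next most recently obtained chunk */
  size_t init_words;              /* words available when the chunk was new */
  size_t avail_words;             /* words still available */
  TOY_WORD words[];               /* C99 flexible array member */
} TOY_CHUNK;

typedef struct { TOY_CHUNK *first; } TOY_ARENA;

/* Push a new chunk of init_words words onto the front of the list. */
void ToyChunkAdd(TOY_ARENA *a, size_t init_words)
{
  TOY_CHUNK *c = malloc(sizeof(TOY_CHUNK) + init_words * sizeof(TOY_WORD));
  if( c == NULL )
    abort();
  c->next = a->first;
  c->init_words = c->avail_words = init_words;
  a->first = c;
}

/* Allocate size bytes, rounded up to whole words, from the first chunk only. */
void *ToyAlloc(TOY_ARENA *a, size_t size)
{
  size_t words = (size + sizeof(TOY_WORD) - 1) / sizeof(TOY_WORD);
  TOY_CHUNK *c;
  void *res;
  assert(a != NULL);
  c = a->first;
  if( c == NULL || c->avail_words < words )
  {
    size_t n = (c == NULL ? 64 : 2 * c->init_words);
    while( n < words )
      n *= 2;
    ToyChunkAdd(a, n);
    c = a->first;
  }
  res = &c->words[c->init_words - c->avail_words];   /* first assignment */
  c->avail_words -= words;                           /* second assignment */
  return res;
}

/* Free every chunk by traversing the list; blocks are never freed singly. */
void ToyArenaDelete(TOY_ARENA *a)
{
  while( a->first != NULL )
  {
    TOY_CHUNK *c = a->first;
    a->first = c->next;
    free(c);
  }
}
```

Two successive small requests return addresses exactly one word apart,
which shows that blocks sit contiguously with nothing between them.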
@PP
The chunks cannot all be the same size.  If they were, for memory
efficiency one would want that size to be large; but a large
chunk would be wasteful if the arena remains small.  Also,
a request for a block whose size is larger than the chunk
size could not be satisfied.
@PP
Accordingly, the chunks obtained from @C { malloc } vary in size,
as follows.  Each has size @M { 2 sup k - c } for some @M { k } and
@M { c }.  A value for @M { c } is chosen which is Ha's memory overhead
per chunk plus the amount of @C { malloc }'s overhead, which usually
seems to be 16 bytes.  When a chunk of this size is requested, hopefully
this will cause @C { malloc } to request a block of size @M { 2 sup k }
from the operating system, which should work well.
@PP
Whenever an arena needs a new chunk, it requests one from the
arena set of size @M { 2 sup {k+1} - c } words, given that its
previous chunk has size @M { 2 sup k - c } words.  Or if this
is not enough memory to cover the current request from the user,
it keeps doubling until the request is large enough.  The arena
set uses @M { k + 1 } to index an array of free chunks to find
a list of free chunks of size @M { 2 sup {k+1} - c } words.  If
this list is empty it calls @C { malloc }.
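@PP
The doubling rule can be sketched like this.  The value given to
@C { SKETCH_OVERHEAD } (standing for @M { c }) is a made-up figure, the
function names are hypothetical, and sizes are in whatever unit the
formula @M { 2 sup k - c } is measured in.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_OVERHEAD 32   /* hypothetical value of c */

/* Size of a chunk with exponent k, that is, 2^k - c. */
size_t SketchChunkSize(int k)
{
  /* keep 2^k larger than c, and keep the shift within size_t */
  assert(k >= 6 && k < (int) (CHAR_BIT * sizeof(size_t)));
  return ((size_t) 1 << k) - SKETCH_OVERHEAD;
}

/* Exponent of the next chunk to request, given the exponent prev_k */
/* of the previous chunk and the size of the pending user request.  */
int SketchNextChunkExponent(int prev_k, size_t request)
{
  int k = prev_k + 1;                     /* start by doubling */
  while( SketchChunkSize(k) < request )   /* keep doubling if too small */
    k++;
  return k;
}
```

The result is the index the arena set would use to look up its list
of free chunks of that size.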
@PP
When an arena is deleted, its chunks are moved into the free
chunk lists of its arena set.  Each chunk records its value of @M { k },
so this is a simple array indexing operation for each chunk.
When an arena set is deleted, all its arenas except its private
arena should already have been deleted.  So it simply passes its
free chunks, and the chunks of its private arena, to @C { free }.
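@PP
A minimal sketch of the free chunk lists follows, assuming each chunk
stores its @M { k } and a link to the next chunk.  All names
(@C { SKETCH_FCHUNK }, @C { SKETCH_ARENA_SET },
@C { SketchArenaDeleteChunks }, @C { SketchChunkGet }) and the bound
@C { SKETCH_MAX_K } are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_K 48   /* hypothetical upper bound on k */

/* Hypothetical chunk header: just k and a link. */
typedef struct sketch_fchunk {
  int k;                          /* chunk has size 2^k - c */
  struct sketch_fchunk *next;     /* next chunk on the same list */
} SKETCH_FCHUNK;

typedef struct sketch_arena_set {
  SKETCH_FCHUNK *free_chunks[SKETCH_MAX_K];   /* indexed by k */
} SKETCH_ARENA_SET;

/* Move every chunk of a deleted arena onto the free list for its k. */
void SketchArenaDeleteChunks(SKETCH_ARENA_SET *as, SKETCH_FCHUNK *chunks)
{
  SKETCH_FCHUNK *c, *next;
  for( c = chunks;  c != NULL;  c = next )
  {
    next = c->next;   /* save the link before overwriting it */
    assert(c->k >= 0 && c->k < SKETCH_MAX_K);
    c->next = as->free_chunks[c->k];
    as->free_chunks[c->k] = c;
  }
}

/* Take a free chunk with exponent k, or return NULL if there is none. */
SKETCH_FCHUNK *SketchChunkGet(SKETCH_ARENA_SET *as, int k)
{
  SKETCH_FCHUNK *res;
  assert(k >= 0 && k < SKETCH_MAX_K);
  res = as->free_chunks[k];
  if( res != NULL )
  {
    as->free_chunks[k] = res->next;
    res->next = NULL;
  }
  return res;   /* on NULL the caller would fall back on malloc */
}
```

Deleting the arena set would then amount to passing every chunk left
on these lists to @C { free }.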
@PP
It remains to describe how resizable blocks are handled.  The size
of each resizable block is @M { R sub n ` A } for some @M { n >= 0 },
where @M { R sub 0 = 0 } and @M { R sub n = 3 cdot 2 sup n - 1 } for
@M { n >= 1 }.  These numbers (0, 5, 11, 23, 47, ...) make good hash
table sizes.  From 5 onwards, each is obtained from its predecessor
by doubling and adding one.
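@PP
The sequence is easy to compute.  The two functions below are
hypothetical helpers written for this illustration; they are not
part of Ha's interface.

```c
#include <assert.h>

/* R_0 = 0, and R_n = 3 * 2^n - 1 for n >= 1. */
long SketchR(int n)
{
  assert(n >= 0 && n <= 27);   /* keeps the result within 32 bits */
  return n == 0 ? 0 : 3 * (1L << n) - 1;
}

/* Successor of r in the sequence: 0 is followed by 5, and from 5 */
/* onwards each value is double its predecessor plus one.         */
long SketchRNext(long r)
{
  assert(r >= 0);
  return r == 0 ? 5 : 2 * r + 1;
}
```

For example, @C { SketchR(4) } is 47, and @C { SketchRNext(47) } is 95,
which is also @C { SketchR(5) }.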
@PP
Growing out of each arena object is a linked list of
@I { block list header } objects.  The first block list header
contains @M { R sub 0 } and a pointer to a singly linked list of all
free blocks of size @M { R sub 0 A } (this particular pointer is
always @C { NULL }); the second contains @M { R sub 1 } and a pointer
to a singly linked list of all free blocks of size @M { R sub 1 A };
and so on.  Each block list header also contains a pointer to its
arena and a pointer to the block list header for the next larger
size.  Initially, only the first block list header is present.
@PP
In addition to the @M { R sub n A } bytes passed to the user, a
resizable block has @M { A } bytes, just in front of the pointer
returned to the user, holding a pointer to the block list header
holding @M { R sub n }.  If the block is free, its second @M { A }
bytes hold a pointer to the next free resizable block of that size.
@PP
Given a user's pointer to a resizable block, one can find its
block list header by going back @M { A } bytes and following
the pointer.  The block list header gives access to the block's
arena and size, and to the free block list of blocks of that size.
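@PP
Here is a sketch of this layout, under several assumptions:
@M { A } is taken to be the size of a pointer, fresh blocks come from
@C { malloc } rather than from an arena, free lists link the pointers
that users see, and the header is cut down to two fields.  The names
@C { SKETCH_HDR }, @C { SketchBlockGet }, @C { SketchBlockHeader },
and @C { SketchBlockPut } are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_A sizeof(void *)   /* assumed value of A */

/* Cut-down block list header (hypothetical fields). */
typedef struct sketch_hdr {
  size_t r;           /* R_n: user part of each block is r * A bytes */
  void *first_free;   /* user pointer of first free block, or NULL */
} SKETCH_HDR;

/* Find a block's header by going back A bytes from the user pointer. */
SKETCH_HDR *SketchBlockHeader(void *user)
{
  return *(SKETCH_HDR **) ((char *) user - SKETCH_A);
}

/* Get a block of h's size: reuse a free one, else make a fresh one. */
void *SketchBlockGet(SKETCH_HDR *h)
{
  char *mem;
  assert(h->r >= 1);   /* blocks of size R_0 = 0 are never made */
  if( h->first_free != NULL )
  {
    void *res = h->first_free;
    h->first_free = *(void **) res;   /* unlink from the free list */
    return res;
  }
  mem = malloc((h->r + 1) * SKETCH_A);   /* A bytes extra for the header pointer */
  if( mem == NULL )
    return NULL;
  *(SKETCH_HDR **) mem = h;   /* first A bytes point to the header */
  return mem + SKETCH_A;      /* the user sees what follows them */
}

/* Free a block: its second A bytes now link it into the free list. */
void SketchBlockPut(void *user)
{
  SKETCH_HDR *h = SketchBlockHeader(user);
  *(void **) user = h->first_free;
  h->first_free = user;
}
```

Note that @C { SketchBlockPut } needs no size or arena argument:
everything it needs is reached through the header pointer.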
@PP
A resizable block of at least a given size can be obtained by searching
the block list header list for the first block list header whose block
size is sufficiently large.  New block list headers are added if
required as the search proceeds.  Once the appropriate block list
header is reached, its first free block is returned to the user; or if
it has no free blocks, a fresh block is obtained from @C { HaAlloc }, a
pointer to the block list header is placed in its first @M { A } bytes,
and a pointer to its @M { (A + 1) }th byte is returned to the user.
@C { HaResizableReAlloc } begins its search for a block list header
from @C { resizable }'s block list header.  Most calls to
@C { HaResizableReAlloc } request blocks about double the old size,
so most traversals of the list of block list headers visit only one
block list header, ensuring that the time taken to find a new
resizable block is usually a small constant.
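@PP
The search can be sketched as follows.  The struct @C { SKETCH_BLH }
and the function @C { SketchFindHeader } are hypothetical, block sizes
are counted in units of @M { A }, and new headers come from
@C { malloc } here purely to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Cut-down block list header (hypothetical fields). */
typedef struct sketch_blh {
  size_t r;                   /* R_n for this header */
  struct sketch_blh *next;    /* header for the next larger size */
} SKETCH_BLH;

/* Starting from h, return the first header whose size is at least */
/* wanted, adding new headers to the end of the list as required.  */
SKETCH_BLH *SketchFindHeader(SKETCH_BLH *h, size_t wanted)
{
  assert(h != NULL);
  while( h->r < wanted )
  {
    if( h->next == NULL )
    {
      SKETCH_BLH *n = malloc(sizeof *n);
      if( n == NULL )
        return NULL;
      n->r = (h->r == 0 ? 5 : 2 * h->r + 1);   /* next size in the sequence */
      n->next = NULL;
      h->next = n;
    }
    h = h->next;
  }
  return h;
}
```

Starting the search from the header for size 5 with a request for 11
visits just one further header, which is the common case when a block
is roughly doubled.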
@PP
The memory overhead is @M { A } bytes per allocated block (holding the
pointer to the block list header), plus the space occupied by the block
list headers (negligible once the blocks grow to even moderate size),
plus the free blocks, plus any unused space within allocated blocks.
@PP
The worst case is elicited by an arena containing a single extensible
array that grows one element at a time.  (This case can be duplicated
by growing two arrays in parallel.)  Now, resizable blocks are needed
just because the application cannot predict how much memory will be
needed.  Thus, the application might as well ask for sizes of the
form @M { R sub n ` A }, and the extensible array module does this.
As the array grows, it leaves a trail of freed blocks behind it of
sizes @M { (5 + 1)A }, @M { (11 + 1)A }, @M { (23 + 1)A }, and so on.
Their total size is less than the current block size, since each block
is twice the size of the one before it.  The current block may itself
be only half full, so at worst, four times as much memory is allocated
as is used.  But none of this memory is completely lost:  the unused
part of the current block is available for further growth of the array,
the freed blocks are available for other arrays, and all of it is freed
when the arena is deleted.
@End @SubSubAppendix

@EndSubSubAppendices
@End @SubAppendix
