@Appendix
    @Title { Interval Grouping in Detail }
    @Tag { interval_grouping }
@Begin
@LP
This Appendix describes interval grouping
(Section {@NumberOf resource_structural.task_grouping.interval_grouping})
in detail.  The source code, kept in file @C { khe_sr_interval_grouping.c },
is over 6000 lines long.
@BeginSubAppendices

@SubAppendix
    @Title { Constraint classes }
    @Tag { interval_grouping.constraint_classes }
@Begin
@LP
Function @C { KheIntervalGrouping } begins by using the constraint classes
module (Section {@NumberOf resource_structural.constraint_classes})
to group the limit active intervals constraints (plus offsets) of
the instance into classes.
@PP
The constraints of one class @M { C }
have the same time groups in the same order with the same polarities,
but they may have different minimum and maximum limits, and they may
apply to different resources.  We have to consider the effect of this
variability.  For example, suppose some of the resources can have
sequences of maximum duration 4, and others can have sequences of
maximum duration 5.  To build groups of maximum duration 4 would be
to throw away some potentially valuable groups of duration 5.  But
to build groups of maximum duration 5 could be dangerous, because
not all resources can be assigned to those groups.
@PP
In practice, however, the limits of the kinds of constraints we are
dealing with here do not seem to vary between resources.  They express
ideas about good working practice that are the same for all workers.
So we say that a class @M { C } has @I { uniform limits } if @M { C }
applies the same minimum and maximum limits to all resources of the
given resource type @C { rt }.  We let @M { C sub "min" } and
@M { C sub "max" } be these uniform limits, when they exist.
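@PP
The uniform limits test can be sketched in C (simplified types and a
hypothetical function name; the real code in @C { khe_sr_interval_grouping.c }
will differ).  Here a class is summarized by parallel arrays holding the
minimum and maximum limits of its constraints:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch only:  a class is represented by parallel arrays mins[] and
   maxs[] holding the minimum and maximum limits of its n constraints
   (n >= 1 assumed).  The class has uniform limits when all these limits
   agree; *c_min and *c_max are then set to C_min and C_max. */
static bool ClassHasUniformLimits(const int *mins, const int *maxs,
  int n, int *c_min, int *c_max)
{
  int i;
  for( i = 1; i < n; i++ )
    if( mins[i] != mins[0] || maxs[i] != maxs[0] )
      return false;
  *c_min = mins[0];
  *c_max = maxs[0];
  return true;
}
```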
@PP
After @C { KheIntervalGrouping } groups the limit active intervals
constraints, it selects those classes @M { C } which have the
following properties:
@NumberedList

@LI @OneRow {
@M { C } has the same number of time groups as there are days
in the days frame, and the @M { i }th time group of @M { C }
is a subset of the @M { i }th time group of the days frame,
for all @M { i }.
}

@LI @OneRow {
All @M { C }'s time groups are singletons.
}

@LI @OneRow {
All @M { C }'s time groups are positive.
}

@LI @OneRow {
@M { C } applies to every resource of type @C { rt }.
}

@LI @OneRow {
@M { C } has uniform limits @M { C sub "min" } and @M { C sub "max" },
as just defined.
}

@LI @OneRow {
@M { C sub "min" } and @M { C sub "max" } satisfy the conditions
involving the @C { rs_interval_grouping_min } and
@C { rs_interval_grouping_range } options given in
Section {@NumberOf resource_structural.task_grouping.interval_grouping}.
}

@EndList
It then applies the following algorithm to each selected class
@M { C } in turn.
@End @SubAppendix

@SubAppendix
    @Title { Admissible and included tasks }
    @Tag { interval_grouping.tasks }
@Begin
@LP
For a given constraint class @M { C }, the first step is to decide
which tasks are @I { admissible }:  able to be included in the solve.
Task @M { s } is admissible if it satisfies these conditions:
@NumberedList

@LI @OneRow {
@M { s } has type @C { rt }.
}

@LI @OneRow {
@M { s } is a proper root task.
}

@LI @OneRow {
@M { s } does not have an empty domain.  A task with an empty domain
cannot be assigned a resource, so should not participate in task grouping.
}

@LI @OneRow {
@M { s } does not have a fixed non-assignment, that is, a @C { NULL }
resource assignment which is fixed.  A fixed non-assignment would
also prevent @M { s } from being assigned a resource.
}

@LI @OneRow {
The busy times of @M { s }, taken chronologically, lie
on consecutive days.
}

@LI @OneRow {
@M { s } is running at one or more times.
}

@LI @OneRow {
Build a sequence of times by taking the busy times of @M { s }
chronologically.  If any of these times are monitored by @M { C },
then those times must be adjacent in the sequence, and must
appear at the start of the sequence or at the end.  This
point is discussed further below.
}

@EndList
The algorithm builds groups from a subset of the admissible tasks
called the @I { included tasks }.  We've got some work to do before
we can specify exactly which tasks these are.
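@PP
Admissibility condition (7) can be sketched as a small C predicate
(hypothetical name; the boolean array marking primary times is a
simplification of the real data structures):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of admissibility condition (7).  The array primary[] is a
   simplification:  primary[i] is true when the i'th busy time of the
   task, taken chronologically, is monitored by class C.  The primary
   times must be adjacent in the sequence and must touch its start or
   its end. */
static bool PrimaryTimesAdmissible(const bool *primary, int len)
{
  int i, first = -1, last = -1, count = 0;
  for( i = 0; i < len; i++ )
    if( primary[i] )
    {
      if( first < 0 ) first = i;
      last = i;
      count++;
    }
  if( count == 0 )
    return true;                         /* a secondary task is admissible */
  if( last - first + 1 != count )
    return false;                        /* primary times not adjacent */
  return first == 0 || last == len - 1;  /* must touch the start or the end */
}
```

For example, secondary-primary is admissible but primary-secondary-primary
and secondary-primary-secondary are not, as stated in the text.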
@PP
A @I { primary time } is a time monitored by the constraints of class
@M { C }; all other times are @I { secondary times }.  For example,
if @M { C } is concerned with consecutive night shifts, then the
primary times are the night times.
@PP
There is nothing to prevent an admissible task @M { s } from being
a multi-day task, or a previously constructed group of several
tasks, some running during primary times and others not.  We
define @M { s }'s @I { (full) duration } to be its number of busy
times, and its @I { primary duration } @M { p(s) } to be its number
of primary times.  We define a @I { primary task } to be an admissible
task whose primary duration is one or more, and a @I { secondary task }
to be an admissible task whose primary duration is 0.
# @PP
# Although we allow our algorithm to include secondary tasks,
# our actual aim is to group primary tasks.  And although we allow
# our algorithm to include non-must-assign tasks, we
# group tasks in the expectation that the group will be assigned a
# resource, so our actual aim is to group must-assign tasks.  So
# admissible tasks which are both primary tasks and must-assign tasks
# are called @I { comp ulsory tasks } and are always included in the
# solve; all other admissible tasks are called @I { opti onal tasks }.
# They are opt ional in two senses:  they do not have to be included
# in the solve at all, and if they are included, they do not have to
# be grouped.  Actually we insist that every included task must go
# into exactly one group, so we should rather say that a group
# containing only opt ional tasks can have any duration and is always
# assigned cost 0.
@PP
Primary and secondary times can appear intermingled within
one task @M { s }.  Our algorithm is limited in what it can do in
such cases, and the limitation is partly inherent.  For example, if @M { s } runs at
three times, in the order secondary-primary-secondary, the primary
time can't be part of a longer sequence of primary times:  the
secondary times prevent it.  We avoid such cases by means of
admissibility condition (7) above.  It makes primary-primary,
primary-secondary, secondary-primary, and secondary-secondary
admissible, for example, but excludes primary-secondary-primary
and secondary-primary-secondary.  The secondary-primary case can
arise in practice, when weekend grouping runs first and groups a
Saturday secondary task with a Sunday primary task.
@PP
Our algorithm is limited to building groups that contain at most
one sequence of primary times.  So a secondary-primary task can
only appear at the start of a group (it could appear later, but
only if all the preceding tasks were secondary tasks, and the
specification of the included tasks, given next, will show that
this cannot happen), and a primary-secondary
task can only appear at the end of a group.  This leads us to define
a property of each task @M { s }, called its @I { placement }
@M { q(s) }, whose type is an enumerated type with three values:
@I { first_only }, meaning that @M { s } must be chronologically
first in any group it is added to; @I { last_only }, meaning that
@M { s } must be chronologically last in any group it is added to;
and @I { any }, meaning that @M { s } may appear anywhere.
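@PP
For tasks whose placement follows from their pattern of primary and
secondary times, the derivation can be sketched like this (hypothetical
names and simplified types; the extra weekend secondary tasks described
below receive their placements separately):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch.  Given a task whose busy times satisfy
   admissibility condition (7), derive its placement from where its
   primary times lie:  a secondary-primary task must come first in any
   group it is added to, a primary-secondary task must come last, and
   otherwise the task may appear anywhere. */
typedef enum { PLACEMENT_FIRST_ONLY, PLACEMENT_LAST_ONLY, PLACEMENT_ANY }
  PLACEMENT;

static PLACEMENT TaskPlacement(const bool *primary, int len)
{
  int i, first = -1, last = -1;
  for( i = 0; i < len; i++ )
    if( primary[i] )
    {
      if( first < 0 ) first = i;
      last = i;
    }
  if( first < 0 )
    return PLACEMENT_ANY;           /* secondary task */
  if( first > 0 )
    return PLACEMENT_FIRST_ONLY;    /* secondary times precede the primaries */
  if( last < len - 1 )
    return PLACEMENT_LAST_ONLY;     /* secondary times follow the primaries */
  return PLACEMENT_ANY;
}
```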
@PP
For any admissible task @M { s } we can call @C { KheTaskNonAsstAndAsstCost }
(Section {@NumberOf resource_structural.mtask_finding.ops}).  This
returns @M { s }'s @I { non-assignment cost } @M { n(s) } (the cost
of leaving @M { s } unassigned) and its @I { assignment cost }
@M { a(s) } (the cost of assigning @M { s }).  If @M { n(s) > a(s) },
then we can expect to assign a resource to @M { s } eventually, so
we call @M { s } a @I { must-assign task }.  Otherwise we call
@M { s } a @I { non-must-assign task }, since usually it will
not matter whether we assign a resource to @M { s } or not,
although assigning a resource to @M { s } might incur a positive
net cost.  If @M { s } is assigned a resource when interval
grouping begins, then we always classify @M { s } as a
must-assign task, because interval grouping will not unassign it.
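@PP
The must-assign classification reduces to a simple test, sketched here
with a hypothetical name (costs are just integers standing in for KHE
solution costs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the must-assign classification.  n_s is the non-assignment
   cost n(s) and a_s the assignment cost a(s), as returned by
   KheTaskNonAsstAndAsstCost; "assigned" is true when s is already
   assigned a resource when interval grouping begins.  A task is a
   must-assign task when n(s) > a(s) or it is already assigned. */
static bool TaskIsMustAssign(int64_t n_s, int64_t a_s, bool assigned)
{
  return assigned || n_s > a_s;
}
```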
@PP
We can now say which admissible tasks are included:
@BulletList

@LI @OneRow {
@I { All primary must-assign tasks. }
Most included tasks are of this kind.
# @I { Comp ulsory tasks. }
# All of these are included.  Most included tasks are com pulsory.
}

@LI @OneRow {
@I { Some weekend tasks. }
A @I day is one time group of the common frame, and a @I weekend is a
sequence of two adjacent days for which every resource is subject
to a complete weekends constraint.
When the numbers of primary must-assign tasks on the two days of a
weekend differ, we include some must-assign secondary tasks from the
less busy of the two days, to allow the algorithm to group every
weekend must-assign primary task with another must-assign task, to
satisfy complete weekends constraints.  An extra Saturday secondary
task has placement @I { first_only }; an extra Sunday secondary task
has placement @I { last_only }.
}

@LI @OneRow {
@I { Some lengthener tasks. }
As explained in Section {@NumberOf interval_grouping.algorithm},
if the algorithm fails to give every group a suitable primary
duration on its first attempt, then we may choose to add a few
primary non-must-assign tasks that might solve this problem, and
run the algorithm again.
}

@EndList
# It does not matter to the running algorithm why any given task
# was included, although its placement does matter.
# , as does whether it is comp ulsory or opt ional.
# @PP
Here is the full list of attributes of a task @M { s } which affect
interval grouping:
@BulletList

@LI @OneRow {
@M { T(s) }, the set of times that @M { s } is running.
}

@LI @OneRow {
@M { T'(s) }, a sequence of integers.  It contains one element for
each time @C { t } of @M { T(s) }, holding @C { KheTimeIndex(t) },
the index of @C { t } in the enclosing instance, in decreasing order.
This is a function of @M { T(s) }.  See the corresponding task group
attribute @M { T'(g) } for how it is used.
}

@LI @OneRow {
@M { p(s) }, the @I { primary duration } of @M { s }, that is,
the number of primary times that @M { s } is busy for.  This is
also a function of @M { T(s) }, but it has its own uses.
}

@LI @OneRow {
@M { d(s) }, the @I domain of @M { s }, as returned by
@C { KheTaskDomain }.  This is a set of resources.
}

@LI @OneRow {
@M { r(s) }, the @I { (resource) assignment } of @M { s }, as
returned by @C { KheTaskAsstResource }.  This is either a
resource or the special value @C { NULL }.
}

# @LI @OneRow {
# @M { f (s) }, the @I { fixed non-assignment } of @M { s }, a
# Boolean value which is true when @M { r(s) } is @C { NULL }
# and @C { KheTaskAssignIsFixed } is true for @M { s }.  This is
# important for grouping, because a task with a fixed non-assignment
# cannot be grouped with an assigned task.
# }

# @LI @OneRow {
# @M { o(s) }, the @I opti onality of @M { s }, a Boolean value, true
# when @M { s } is opt ional.  As defined above, primary must-assign
# tasks are comp ulsory (not opt ional); all other admissible tasks are
# opt ional.
# }

@LI @OneRow {
@M { n(s) } and @M { a(s) }, the @I { non-assignment cost } and
@I { assignment cost } of @M { s }.  These are the cost of event
resource constraints when @M { s } is unassigned and when it is
assigned, as returned by @C { KheTaskNonAsstAndAsstCost }
(Section {@NumberOf resource_structural.mtask_finding.ops}).
They influence solution cost, and also whether @M { s } is
included:  @M { s } is a must-assign task when @M { n(s) > a(s) }
or @M { r(s) } is non-@C { NULL }.
# Apart from that they
# influence the algorithm only indirectly, via their contribution
# to @M { v (s) }, to be defined next.
}

# @LI @OneRow {
# @M { v (s) }, the @I { task cost }, is the cost attributable to
# @M { s } when it is added to a group which is assigned later.
# This usually includes @M { a(s) - n(s) }, which can be negative.
# Its full definition appears below under task group cost.
# }

# @LI @OneRow {
# @M { n(s) }, the @I { non-must-assign cost } of @M { s }.  When
# @M { s } is a non-must-assign task, @M { n(s) } is its assignment
# cost minus its non-assignment cost, possibly plus another cost
# that we will define later, when we come to consider the cost of
# a task group.  When @M { s } is a must-assign task, @M { n(s) } is
# 0.  So @M { n(s) } is well-defined and satisfies @M { n(s) >= 0 }
# for every admissible task @M { s }.
# }

@LI @OneRow {
@M { q(s) }, the @I placement of @M { s }, of enumerated type with
values @I { first_only }, @I { last_only }, and @I { any }.  It says
that @M { s } must appear first in its group, or last, or anywhere.
}

@EndList
It does not matter why any given task was included, except for how
that affects these attributes.
# Most of these attributes have corresponding task group attributes,
# as we will see later.
@End @SubAppendix

@SubAppendix
    @Title { Task groups }
    @Tag { interval_grouping.task_groups }
@Begin
@LP
A @I { task group }, or just @I { group }, is a set @M { g } of
tasks, plus an optional @I { history value }.  A history value
is a pair @M { (h, r) }, where @M { h } is a positive integer and
@M { r } is a resource.  It represents @M { r } being busy for
@M { h } primary times immediately before the start of the cycle.
@PP
A group @M { g } must satisfy nine conditions.  The first six
come from the task grouping module
(Section {@NumberOf resource_structural.task_grouping}),
which is used by interval grouping to build its groups.  They
ensure that each group can be assigned a single resource later,
without creating clashes or other problems.  Here are those
conditions, re-expressed to apply to @M { g } rather than to
a single incoming task:
@NumberedList

@LI @OneRow {
The interval of days that each task is running must be disjoint from
the interval of days that any other task is running.  This is subsumed
by condition (8) below.
}

@LI @OneRow {
The intersection of the domains of the tasks of @M { g } (if any)
must be non-empty.
}

@LI @OneRow {
Two tasks of @M { g } may not be assigned different resources.
If there is a history value, its resource is included in this
condition.  It is acceptable for some tasks to be assigned a
resource and others not.
}

@LI @OneRow {
If any task of @M { g } is assigned a resource @M { r }, or
if @M { g } has a history value with resource @M { r }, then
the intersection of the domains of the tasks of @M { g }
(if any) must contain @M { r }.
}

@LI @OneRow {
If any task of @M { g } is assigned a resource, or if @M { g }
has a history value, then none of the tasks of @M { g } may
have a fixed non-assignment.
}

@LI @OneRow {
If any task of @M { g } is assigned a resource @M { r }, or if
there is a history value with resource @M { r }, then there may
be no interference:  no task not in @M { g } but assigned @M { r }
may run on any of the days that @M { g }'s tasks are running.
}

@LII {
The remaining three conditions are specific to interval grouping:
}

@LI @OneRow {
Group @M { g } must contain at least one task, or a history value,
or both.
}

@LI @OneRow {
The multiset of all times that the tasks of @M { g } are running must
contain exactly one time from each element of a set of consecutive
days; and if @M { g } has a history value, this multiset (if non-empty)
must contain a time from the first day of the cycle.
}

@LI @OneRow {
The placement property of each task of @M { g } must be respected.
That is, if a task's placement is @I { first_only }, then that
task must be chronologically first in @M { g }, and @M { g } may
not have a history value; and if its placement is @I { last_only },
then it must be chronologically last in @M { g }.
}

@EndList
Only groups that satisfy all of these conditions are created.
Given the way that placements are assigned to tasks, condition
(9) ensures that within the chronological sequence of times
that a task group is running, the primary times are consecutive.
# lie at the start of the sequence or at the end (or both).
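@PP
Conditions (7) and (8) amount to a simple check on the days covered by
the group, sketched here in C (hypothetical name; the group is
summarized, as a simplification, by the sorted day indexes of its busy
times, one entry per busy time):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of conditions (7) and (8).  days[] holds the day index of each
   busy time of the group, sorted into increasing order.  Condition (7):
   the group contains at least one task or a history value.  Condition
   (8): the busy times cover a set of consecutive days with exactly one
   time per day, and a non-empty group with a history value starts on
   day 0, the first day of the cycle. */
static bool GroupIntervalOk(const int *days, int len, bool has_history)
{
  int i;
  if( len == 0 )
    return has_history;       /* condition (7) */
  for( i = 1; i < len; i++ )
    if( days[i] != days[i-1] + 1 )
      return false;           /* a repeated day or a gap between days */
  return !has_history || days[0] == 0;
}
```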
@PP
Here now is the complete list of attributes of a task group @M { g }
used by interval grouping:
@BulletList

@LI @OneRow {
@M { T(g) }, the set of times that the tasks of @M { g } are running.
}

@LI @OneRow {
@M { T'(g) }, a sequence of integers.  It contains one element for
each time @C { t } of @M { T(g) }, holding @C { KheTimeIndex(t) },
the index of @C { t } in the enclosing instance, in decreasing order,
plus one optional value at the end: the value @M { h } from the
history value, if there is one.  This sequence encodes everything
needed to work out the cost in resource constraints (not event
resource constraints) of assigning the tasks of @M { g } to some
resource @M { r }, since that cost depends only on the times that
this makes @M { r } busy, plus history if any.  For efficiency, this
cost is calculated just once for each distinct value of @M { T'(g) },
and cached in a trie data structure indexed by @M { T'(g) }.
}

@LI @OneRow {
@M { p(g) }, the number of primary times in @M { g }, including
@M { h } if there is a history value.  Ideally we would have
@M { C sub "min" <= p(g) <= C sub "max" } for every @M { g }.
}

@LI @OneRow {
@M { d(g) }, the @I domain of @M { g }, defined as the intersection
of the domains of the tasks of @M { g }, or, if there are no tasks,
the set of all resources of the resource type @C { rt } that we are
solving for.  The conditions given earlier, including (7), ensure
that @M { d(g) } is not empty.
}

@LI @OneRow {
@M { r(g) }, the @I assignment of @M { g }, a resource which is @M { r }
if some task of @M { g } is assigned @M { r }, or @M { g } has a history
value containing @M { r }, and the special value @C { NULL } otherwise.
Condition (3) ensures that there can be at most one distinct assignment.
The tasks do not have to all be assigned @M { r(g) }, but if the
grouping represented by @M { g } is actually carried out, they will be.
}

# @LI @OneRow {
# @M { f (g) }, the @I { fixed non-assignment }, a Boolean value which
# is true when @M { g } contains an unassigned task whose assignment
# is fixed.  By Condition (5) above, if @M { f (g) } is true then
# @M { r(g) } is @C { NULL }.
# }

@LI @OneRow {
@M { chi(g) }, a Boolean value.  When @M { chi(g) } is true we
say that we @I { handle @M { g } as assigned }.  This means that the
algorithm believes that a subsequent solver will
assign a resource to @M { g }, not that
@M { g } is assigned a resource now.  When
@M { chi(g) } is false, the algorithm has not decided whether a
subsequent solver will assign a resource to @M { g } or not.
# This
# is not the same as deciding that it will not assign a resource.
# @LP
@M { chi(g) } has value
@M { r(g) != @C { NULL } vee bar g bar >= 2 }, where @M { bar g bar }
is the number of tasks in @M { g }.  The reason for this is
explained at the end of Section {@NumberOf interval_grouping.cost}.
}

# @LI @OneRow {
# @M { o(g) }, the @I opti onality of @M { g }, a Boolean value which is
# true when all of the tasks of @M { g } are opti onal and @M { g } has
# no history value.  A group @M { g } is @I { opt ional } when @M { o(g) }
# is true, and @I { comp ulsory } when @M { o(g) } is false.  An opt ional
# group does not need to be assigned any resource, so it contributes no
# cost to any solution containing it.
# }

# @LI @OneRow {
# @M { v (g) }, the @I { task cost } of @M { g }, is the sum, over the
# tasks @M { s } of @M { g }, of @M { v (s) }.
# # @M { n(g) }, the @I { non-must-assign cost } of @M { g }, is one
# # part of @M { c(g) } and is also defined below.
# }

# @LI @OneRow {
# @M { c(g) }, the @I { cost } of @M { g }.  This is @M { g }'s contribution
# to the cost of a solution.  See below for more.
# }

@EndList
These attributes, and only these, determine how @M { g } influences
the interval grouping algorithm.
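@PP
The construction of the cache key @M { T'(g) } can be sketched as
follows (hypothetical name; the real code caches the resource-constraint
cost of each distinct key in a trie, while this sketch only builds the
key):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of building the key T'(g):  the time indexes of the group's
   busy times sorted into decreasing order, with the history value h
   appended at the end when there is one.  Returns the key length;
   key[] must have room for len + 1 entries. */
static int BuildGroupKey(const int *time_indexes, int len,
  bool has_history, int h, int *key)
{
  int i, j, tmp, klen = 0;
  for( i = 0; i < len; i++ )
    key[klen++] = time_indexes[i];
  for( i = 1; i < klen; i++ )         /* insertion sort, decreasing order */
    for( j = i; j > 0 && key[j] > key[j-1]; j-- )
    {
      tmp = key[j];
      key[j] = key[j-1];
      key[j-1] = tmp;
    }
  if( has_history )
    key[klen++] = h;
  return klen;
}
```

Two groups with equal keys make the same resource busy at the same
times with the same history, so they have equal resource-constraint
costs, which is what makes caching by key sound.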
@PP
@BI { Adjustments to history. }
Defining history values seems straightforward.  For each case
where a resource @M { r } has a non-zero history value @M { h }
in a constraint of @M { C }, we define a group with history value
@M { (h, r) } and no tasks, and add it to the solution that the
algorithm starts from.
@PP
However, sometimes we need to adjust @M { h }.  The problem is
that constraints on the number of consecutive busy days can
contradict constraints on the number of consecutive busy primary
times.  This is not likely in general, but it is likely when
these constraints interact with history.
@PP
For example, in instance INRC2-4-100-0-1108, busy night times
are supposed to occur in sequences of 4 or 5.  The limits on
consecutive busy days vary depending on the nurse's workload,
but they always allow sequences of length 4 or 5.  So far, then,
there is no contradiction.
@PP
But consider resource HN_8.  This resource's sequences of busy night
times are supposed to have length 4 or 5, as usual.  Its sequences
of busy days are supposed to have length 3, 4, or 5.  So far there
is no contradiction.  HN_8 has a history value saying that it
worked for 1 night time at the end of the previous cycle, so it
should work for 3 or 4 consecutive nights at the start of the
current cycle.  However, HN_8 has another history value saying
that it worked for 3 consecutive days at the end of the previous
cycle (this must be two unspecified times followed by one night
time), so it should work for 0, 1, or 2 consecutive days at the
start of the current cycle.
@PP
The two requirements conflict.  We follow the requirement
with the higher weight.  In this case this is the busy days
requirement, so we adjust the night times requirement to
reflect the busy days requirement.  We do this by adjusting
HN_8's history value, increasing @M { h } from 1 to 3.
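@PP
The arithmetic of the HN_8 example can be checked with a small sketch
(hypothetical helper; the adjustment rule in the source code may be
more general).  There is a conflict when the minimum number of further
primary times required by history exceeds the maximum number of further
busy days allowed by history:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of the HN_8 conflict test.  min_p is the minimum
   sequence length for primary times (4 for HN_8), h_p the primary-times
   history value, max_d the maximum sequence length for busy days (5),
   and h_d the busy-days history value (3).  History requires at least
   min_p - h_p further primary times at the start of the cycle, but
   allows at most max_d - h_d further busy days. */
static bool HistoryConflict(int min_p, int h_p, int max_d, int h_d)
{
  int need = min_p - h_p;     /* further primary times required */
  int allow = max_d - h_d;    /* further busy days allowed */
  if( need < 0 ) need = 0;
  if( allow < 0 ) allow = 0;
  return need > allow;
}
```

With @M { h = 1 } HN_8 needs 3 further nights but is allowed only 2
further busy days, a conflict; increasing @M { h } to 3 removes it.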
# @PP
# @BI { Task group cost. }
# We turn now to defining @M { c(g) }, the cost of a group @M { g }.
# This has to be a good estimate of the effect on final solution cost
# of including @M { g } in the final solution.
# @PP
# Group @M { g } can be made up of a mixture of various kinds of
# tasks:  primary and secondary, must-assign and non-must-assign.
# These affect the attributes of @M { g }, and through them they
# affect the cost of @M { g }, in non-trivial ways.  The author
# has not been able to find any unified expanation for these
# effects, so in the following each is simply identified and
# explained as it arises.
# @PP
# Now @M { c(g) } depends on whether @M { g } is subsequently assigned
# a resource or not.  Let @M { c sub a (g) } be the cost of @M { g }
# when it is assigned a resource, and let @M { c sub u (g) } be
# the cost of @M { g } when it remains unassigned.  The subsequent
# solve is assumed to preserve @M { g } and any initial resource
# assignments, but other than that it may, and probably will, do
# whatever produces the least cost, so
# @ID @Math {
# c(g) = min( c sub a (g) , c sub u (g) )
# }
# except that if @M { r(g) } is non-@C { NULL } then @M { g } must
# be assigned, so @M { c sub u (g) } is undefined and
# @M { c(g) = c sub a (g) }.
# @PP
# Actually we can say more.
# Take any group @M { g } containing @M { K } tasks whose costs
# indicate that it will remain unassigned.  Now @M { g } cannot
# contain history, because groups with history are always assigned.
# Break @M { g } into @M { K } unassigned groups, each containing
# one of @M { g }'s tasks.  These @M { K } groups are legal and
# their cost is the same as the cost of @M { g }, namely 0.  (Or
# less, if there is some unexpected advantage in breaking @M { g }
# up.)  So we can insist that every unassigned group contain exactly
# one task and no history, without any risk of missing an optimal
# solution.  Accordingly, we define and use @M { c sub u (g) }
# only when @M { r(g) } is @C { NULL } and @M { g } contains a
# single task and no history.
# @PP
# # The following analysis is easy to get wrong, so we need to be clear
# # about what we are doing.
# For any solution @M { S }, let @M { c(S) }
# be the cost of @M { S }.  Starting with a solution @M { S } in which
# the tasks of @M { g } are not grouped, we group them and then assign
# a resource to @M { g }, producing a solution @M { S sub a }.  Then
# the cost we want is the best estimate we can get of
# @ID @Math { c sub a (g) = c( S sub a ) - c( S) }
# Or we group the tasks but do not assign them,
# producing a solution @M { S sub u }.  Then the cost is
# @ID @Math { c sub u (g) = c( S sub u ) - c( S) }
# It is clear from this that these costs could be negative.  Indeed,
# given that @M { c sub a (g) } measures costs after assigning tasks,
# most of which need to be assigned, @M { c sub a (g) } probably will
# be negative.
# @PP
# One way to find these costs would be to carry out the grouping
# and possible assignment and just take the solution cost from
# the KHE platform.  And indeed we do do this to some extent, by
# means of calls to function @C { KheTaskGrouperEntryCost }
# (Section {@NumberOf resource_structural.task_grouping.task_grouper}),
# as we'll see shortly.  But relying entirely on this method has
# a few problems.  One is that we will be building thousands of
# groups and it could be slow.  Another is that some constraints
# are unwanted.  Avoid unavailable times constraints, for example,
# are unwanted since when one resource is unavailable, another
# usually is available.  Global constraints, on total workload for
# example, are also unwanted.  We don't want to favour large groups
# because they bring a resource closer to its minimum workload
# limit, for example.  So we interpret @M { c(S) } in the formulas
# above as referring only to the total cost of @I wanted constraints.
# We won't define this formally, although the discussion we've just
# had pretty well covers it.  @C { KheTaskGrouperEntryCost } limits
# itself to wanted constraints.
# # @PP
# # If @M { g } is opti onal, then by definition its tasks are all
# # unassigned and leaving them unassigned has no cost.  So we must
# # have @M { c sub u (g) = 0 }, and therefore @M { c(g) = 0 }.
# # It remains to find @M { c sub a (g) } and @M { c sub u (g) }
# # for comp ulsory groups.  We start with @M { c sub a (g) }.
# # @PP
# # It is easy to invent many unlikely costs.  For example, if every
# # resource is unavailable for either all of day 9 or all of day 10,
# # then any group which covers both of those days should have high
# # cost.  We don't claim to handle such unlikely cases.  The reader
# # is going to have to trust that our list of costs is suf ficien tly
# # complete to be useful in practice.  We start with costs that can
# # be attributed to individual tasks.
# # # , then move on to costs attributable to the group as a whole.
# # @PP
# # We assume here that in the initial state, before @M { g } is
# # created, there is no cost associated with each task @M { s }.
# # This is not actually true:  if @M { s } is unassigned initially,
# # its non-assignment cost will be part of the solution cost, and
# # if @M { s } is assigned initially, its assignment cost will be
# # part of the solution cost.  However the analysis is easier to
# # follow if we assume that there is no cost until @M { s } is
# # added to a group @M { g } and then that group is either assigned
# # or not assigned.  What we are really calculating is the change
# # in cost, so it's all the same in the end.
# @PP
# We start with @M { c sub a (g) }.
# @PP
# @B { (i) }
# When unassigned task @M { s } is added to a group @M { g } which is
# then assigned, @M { s } changes from unassigned to assigned.  So we
# need to add the possibly negative cost @M { a(s) - n(s) } to
# @M { c sub a (g) }, unless @M { s } is assigned initially, in
# which case there is no change.
# # @PP
# # If @M { s } is a must-assign task (primary or secondary), we can
# # ignore the non-assignment and assignment costs of @M { s },
# # because, however @M { s } is grouped, we expect to assign a
# # resource to @M { s } in every solution, producing the same
# # non-assignment and assignment costs in every solution.
# # @PP
# # If @M { s } is a non-must-assign task (primary or secondary), it
# # may lie in a comp ulsory group in some solutions and in an opt ional
# # group in others.  When it lies in a comp ulsory group @M { g }, we
# # need to add its assignment cost minus its non-assignment cost to
# # @M { c sub a (g) }, to reflect the fact that we have placed it into a
# # group which will subsequently become assigned.  @B { (i) }
# @PP
# @B { (i i) }
# There is a second cost that can be attributed to @M { s } when
# it is added to a group which is then assigned.  Suppose @M { s }
# is a non-must-assign task (primary or secondary).  By the way
# must-assign tasks are defined, @M { s } is unassigned.  We are
# assigning @M { s } even though, when @M { s } is considered
# alone, we don't have to.  If the supply of resources does not
# exceed the demand for resources made by must-assign tasks, then
# assigning @M { s } adds to the amount by which resources are
# overloaded, so for each time of @M { s }, we have to add the usual
# cost of overloading a resource by one time.
# @PP
# For each task @M { s } we add these first two costs together to get
# a single cost @M { v (s) } called the @I { task cost }, which is the
# cost of including task @M { s } in a group that is subsequently
# assigned.  We calculate @M { v (s) } for each included task @M { s }
# before solving begins, to save time.
# # This is
# # @M { a(s) - n(s) } plus, if @M { s } is an unassigned
# # non-must-assign task, and the supply of resources falls short
# # of or equals the demand, then we add the usual cost of overloading
# # a resource by one time.
# # @PP
# # One way to handle supply and demand imbalances is to simply not assign
# # some must-assign tasks.  Reflecting this in our algorithm would involve
@End @SubAppendix

@SubAppendix
    @Title { Solutions }
    @Tag { interval_grouping.solutions }
@Begin
@LP
A @I solution @M { G(S) } for a given set of included tasks @M { S }
is a set of groups such that each @M { s in S } lies in exactly one
group @M { g in G(S) }.  For any @M { S } there is always at least
one solution, because one way to make one is to place each
@M { s in S } into its own group.
@PP
Let @M { T sub 1 ,..., T sub n } be the time groups defining the
days.  For each @M { T sub i }, define two sets of tasks:
@BulletList

@LI @OneRow {
@M { X sub i }, the set of included tasks @M { s } such that
the first time @M { t } that @M { s } is running satisfies
@M { t in T sub i }.
}

@LI @OneRow {
@M { Y sub i }, the set of included tasks @M { s } such that
some time @M { t } that @M { s } is running satisfies
@M { t in T sub i }.
}

@EndList
We have @M { X sub i subseteq Y sub i }, and also the curious result
@M { X sub 1 cup cdots cup X sub i = Y sub 1 cup cdots cup Y sub i }.
The @M { X sub i } are pairwise disjoint; the @M { Y sub i } may not
be, since @M { s } appears in @M { Y sub i } for each day
@M { T sub i } during which it is running.  We will be particularly
interested in the sets
@M { S sub i = X sub 1 cup cdots cup X sub i }.  At present,
@M { Y sub i } is only used when comparing the number of Saturday and
Sunday tasks, to decide whether any weekend tasks need to be added,
as discussed above (Section {@NumberOf interval_grouping.tasks}).
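@PP
The definitions of @M { X sub i }, @M { Y sub i }, and @M { S sub i }
can be illustrated with a small C sketch.  The representation and names
here are hypothetical, not those of @C { khe_sr_interval_grouping.c }:
a task is a bitmask whose bit @M { j } is set exactly when the task is
running during @M { T sub j }.

```c
#include <assert.h>

/* smallest j with bit j set, i.e. the task's first busy day
   (returns 0 for the empty task, which callers guard against) */
static int FirstDay(unsigned long task)
{
  int j = 0;
  while (task != 0 && (task & 1) == 0) {
    task >>= 1;
    j++;
  }
  return j;
}

/* s is in X_i iff the first time that s is running lies in T_i */
static int InX(unsigned long task, int i)
{
  return task != 0 && FirstDay(task) == i;
}

/* s is in Y_i iff some time that s is running lies in T_i */
static int InY(unsigned long task, int i)
{
  return (int)((task >> i) & 1);
}

/* s is in S_i = X_1 u ... u X_i iff its first busy day is at most i */
static int InS(unsigned long task, int i)
{
  return task != 0 && FirstDay(task) <= i;
}
```

With this representation, @M { X sub i subseteq Y sub i } is immediate,
since a task's first busy day is one of its busy days.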
@PP
Let @M { G( S sub i ) } be a solution for @M { S sub i }.
For example, @M { G( S sub 7 ) } might be this:
@CD @Diag paint { lightgrey } margin { 0c } { @VContract {
1.5c @Wide @M { T sub 1 } |
1.5c @Wide @M { T sub 2 } |
1.5c @Wide @M { T sub 3 } |
1.5c @Wide @M { T sub 4 } |
1.5c @Wide @M { T sub 5 } |
1.5c @Wide @M { T sub 6 } |
1.5c @Wide @M { T sub 7 } |
1.5c @Wide @M { T sub 8 } |
1.5c @Wide @M { T sub 9 } |
//0.2f
@Box { 6c @Wide 0.5c @High } |
@Box { 4.5c @Wide 0.5c @High } |
//
@Box paint { white } outlinestyle { noline } { 3c @Wide 0.5c @High } |
@Box { 7.5c @Wide 0.5c @High }
//
@Box { 7.5c @Wide 0.5c @High } |
@Box paint { white } outlinestyle { noline } { 1.5c @Wide 0.5c @High } |
@Box { 3c @Wide 0.5c @High }
} }
where each grey rectangle represents one group.  Every included task
which begins at or before @M { T sub 7 } is present in a group of
@M { G ( S sub 7 ) }.  The columns occupied by a group indicate the
days when its tasks are running.  The row occupied by a group has
no significance.
@PP
Within a given @M { G( S sub i ) }, a @I { finished group }
is a group @M { g } that cannot be extended to the right.  If
@M { g } is not finished it is @I { unfinished }.  There are four
ways in which @M { g } can come to be finished:
@ParenAlphaList

@LI @OneRow {
If @M { g } does not include a task running at @M { T sub i },
it is finished because it is now too late to add such a task,
since every task running before or during @M { T sub i } is
already in a group, and adding a task from a later day
would create a gap in @M { g } at @M { T sub i }.
}

@LI @OneRow { 
If @M { p(g) >= C sub "max" }, then @M { g } is finished because adding
any task would make @M { g }'s primary duration too large, which is not
permitted.  (Actually we could add a task whose primary duration is
0.  So we should apply this rule only when @M { X sub {i+1} } does
not contain any tasks whose primary duration is 0 and whose placement
is not @I { first_only }.  This is not currently implemented.)
}

@LI @OneRow {
If @M { g } contains a task whose placement is @I { last_only },
then @M { g } is finished because extending @M { g } to the right
would contradict this requirement.
}

@LI @OneRow {
If @M { g } is running during @M { T sub n }, the last time
group, then @M { g } is finished because there are no tasks
to the right of @M { T sub n } to add to @M { g }.
}

@EndList
The first condition depends on the context in which the group
appears:  the same group could be not finished in @M { G( S sub {i-1} ) }
and finished in @M { G( S sub i ) }, for example.  The last
three conditions are independent of the context.  A group is said
to be @I { self-finished } when any of them applies.
@End @SubAppendix

@SubAppendix
    @Title { Solution cost }
    @Tag { interval_grouping.cost }
@Begin
@LP
The next step is to define a cost @M { c( G( S sub i )) } for each
solution @M { G( S sub i ) }.  This cost needs to be a good estimate
of the cost to a complete solution of grouping tasks in the way that
@M { G( S sub i ) } does.  Our aim is to find a solution of minimum
cost for @M { S sub n }, the whole set of included tasks.
@PP
We will use this formula for @M { c( G( S sub i )) }:
@ID @Math {
c( G( S sub i )) `` = ``
big sum from { g in G ( S sub i ) } gamma(g)
}
That is, assign a @I { combined cost } @M { gamma(g) } to each group,
and let the solution cost be the sum of the @M { gamma(g) }.
Essentially, this formula says
that each group's cost is independent of the other groups, which is true
because different groups can be assigned different resources.
@PP
Each combined cost is the
sum of one cost for the group as a whole, called the @I { group cost },
and one cost for each task of the group, called the @I { task costs },
giving something like this:
@ID @Math {
gamma(g) `` = `` c(g) + big sum from {s in g} c(s)
}
Group costs handle resource monitors and task costs handle event
resource monitors.  Actually this formula is not quite right, partly
because it does not distinguish between unfinished and finished
groups (a problem for later), and partly for another reason that
we will look into now.
@PP
As Section {@NumberOf resource_structural.task_grouping.interval_grouping}
stated, the interval grouping algorithm understands that a group
need not be assigned a resource.  A subsequent solver will either
assign a resource or not, and interval grouping assumes that that
decision is made in favour of whichever alternative costs less.
Accordingly, we define @M { c sub a (g) } to be the group cost of
@M { g } when @M { g } is assigned a resource, and @M { c sub a (s) }
to be the task cost of @M { s } when @M { s } is assigned a resource
(because it lies in an assigned group).  Similarly, we define
@M { c sub n (g) } to be the group cost of @M { g } when @M { g } is
not assigned a resource, and @M { c sub n (s) } to be the task cost
of @M { s } when @M { s } is not assigned a resource (because it lies
in an unassigned group).  We'll find formulas for these four quantities
later, but for now let's assume that we have them.  Then we define the
combined cost like this:
@IndentedList

@LI @Math {
gamma sub a (g) `` = ``  c sub a (g) + big sum from {s in g} c sub a (s)
&1c @R "and" &1c
gamma sub n (g) `` = ``  c sub n (g) + big sum from {s in g} c sub n (s)
}

@LI @Math {
gamma(g) `` = `` min( gamma sub a (g), gamma sub n (g) )
}

@EndList
This accords with what we have just said about choosing whichever
alternative costs less.
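@PP
A minimal C sketch of these formulas may make the bookkeeping clearer.
The types and names are hypothetical, and costs are plain integers:

```c
#include <assert.h>

typedef struct {
  int group_cost_assigned;      /* c_a(g) */
  int group_cost_unassigned;    /* c_n(g) */
  int num_tasks;
  int task_cost_assigned[8];    /* c_a(s) for each task s of g */
  int task_cost_unassigned[8];  /* c_n(s) for each task s of g */
} ExampleGroup;

/* gamma_a(g) = c_a(g) + sum over s in g of c_a(s) */
static int GammaAssigned(const ExampleGroup *g)
{
  int i, res = g->group_cost_assigned;
  for (i = 0; i < g->num_tasks; i++)
    res += g->task_cost_assigned[i];
  return res;
}

/* gamma_n(g) = c_n(g) + sum over s in g of c_n(s) */
static int GammaUnassigned(const ExampleGroup *g)
{
  int i, res = g->group_cost_unassigned;
  for (i = 0; i < g->num_tasks; i++)
    res += g->task_cost_unassigned[i];
  return res;
}

/* gamma(g) = min(gamma_a(g), gamma_n(g)); when g must be assigned
   (e.g. because it has history), gamma(g) = gamma_a(g) */
static int Gamma(const ExampleGroup *g, int must_be_assigned)
{
  int ga = GammaAssigned(g), gn = GammaUnassigned(g);
  if (must_be_assigned)
    return ga;
  return ga < gn ? ga : gn;
}
```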
@PP
Assigning a resource to a group @M { g } is always possible, but
not assigning one is not always possible.  When @M { g } has
history, for example, it must be assigned.  In such cases there
is no point in finding @M { c sub n (g) } and @M { c sub n (s) }.
Instead, we just let @M { gamma(g) = gamma sub a (g) }.  We will
be using a valuable optimization based on this:  at the moment we
become aware that @M { g } must be assigned, we will add its tasks'
@M { c sub a (s) } values to the total cost, even though @M { g }
may be unfinished and @M { c sub a (g) } may be unknown.
@PP
The logical next step is to define @M { c sub a (g) },
@M { c sub n (g) }, @M { c sub a (s) }, and @M { c sub n (s) },
but before we do that we need to consider the effect of
grouping on the overall supply of and demand for resources.
@PP
@BI { Demand cost. }
Suppose the supply of resources (the total number of times
that resources are available for assignment, taking their workload
limits into account) is less than the demand for resources (the
total duration of must-assign tasks).  Then for each time that
demand exceeds supply, either the minimum cost of overloading one
resource, or the minimum cost of leaving one must-assign task
unassigned, must be incurred somewhere.  We call the smaller of
these two costs the @I { demand cost } and denote it @M { D }.
When demand does not exceed supply, we let @M { D = 0 }.
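@PP
As a sketch, the demand cost @M { D } could be computed like this
(hypothetical names; the two unit costs would come from the
instance's constraints):

```c
#include <assert.h>

/* supply: total number of times that resources are available for
   assignment; demand: total duration of must-assign tasks */
static int DemandCost(int supply, int demand,
  int min_overload_cost, int min_non_asst_cost)
{
  if (demand <= supply)
    return 0;  /* supply covers demand: D = 0 */
  /* for each time of excess demand, the cheaper of overloading one
     resource and leaving one must-assign task unassigned must be
     incurred somewhere */
  return min_overload_cost < min_non_asst_cost
    ? min_overload_cost : min_non_asst_cost;
}
```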
@PP
One's first thought is that total demand cost is an unavoidable
constant cost, and thus irrelevant to task grouping.  However,
there are at least two ways in which it is relevant.
@PP
(a)
When a non-must-assign task lies in a group that is assigned a
resource, it becomes a must-assign task, effectively, increasing
the demand and thus the total demand cost.
@PP
(b)
When a decision is made to not assign some group, the cost of not
assigning its tasks, although real, was already present as a
demand cost.  The only change is that a demand cost which was
being incurred by some unknown overloaded resource or unassigned
task is now being incurred by a known unassigned task or tasks.
It is right, therefore, to subtract the demand cost from the
unassigned task cost.
@PP
If this subtraction is done too often, there might be more groups
taken to be unassigned than can be justified by the amount that
demand exceeds supply.  It is not practicable to find the true
upper limit, so instead we will use a heuristic that severely limits
the cases where it is decided that a group will remain unassigned.
The subtractions will therefore be few and so justifiable.
@PP
@BI { Cost formulas. }
The obvious starting point for calculating total cost is
the cost reported by KHE before interval grouping begins.  That
can be made to work, but it is awkward in practice, because many
tasks are must-assign tasks, and for them @M { c sub a (s) } will
be negative.  So instead we are going to start (notionally)
from a solution which has cost zero because all its monitors are
detached.  At the moment when we are ready to calculate some
@M { c sub a (g) }, @M { c sub n (g) }, @M { c sub a (s) }, or
@M { c sub n (s) }, we (notionally) assign the task or group
(or not) and attach the relevant monitors.  Their total cost
is the cost we need.  This way, no costs are negative, because
no monitor costs are negative.
@PP
We start with @M { c sub a (s) }.  This is the cost
of event resource monitors when @M { s } is assigned a
resource, as returned by @C { KheTaskNonAsstAndAsstCost },
for which we have previously used the notation @M { a(s) }.
If @M { s } is a non-must-assign task we also have to include
the demand cost, as discussed at (a) above, giving
@M { c sub a (s) =  a(s) } when @M { s } is a must-assign
task, and @M { c sub a (s) =  a(s) + d(s) D } when @M { s }
is a non-must-assign task with duration @M { d(s) }.
@PP
Now for @M { c sub n (s) }.  This is the cost of event resource
monitors when @M { s } is not assigned a resource, as returned
by @C { KheTaskNonAsstAndAsstCost }, for which we have previously
used the notation @M { n(s) }.  If @M { s } is a must-assign task
we have to subtract the demand cost, as discussed at (b) above,
giving @M { c sub n (s) =  n(s) } when @M { s } is a
non-must-assign task, and @M { c sub n (s) =  n(s) - d(s) D }
when @M { s } is a must-assign task with duration @M { d(s) }.  As
stated above, cases where must-assign tasks lie in unassigned groups
will be rare, which is important for justifying the subtraction.
@PP
Next we tackle @M { c sub n (g) }.  This is the cost of resource
monitors when @M { g } is not assigned a resource.  But if @M { g }
is not assigned a resource, no resource monitors are affected,
and so we have @M { c sub n (g) = 0 }.
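@PP
The formulas derived so far can be collected into a small C sketch
with hypothetical names, where @C { a }, @C { n }, and @C { d } stand
for @M { a(s) }, @M { n(s) }, and @M { d(s) }, and @C { D } for the
demand cost:

```c
#include <assert.h>

/* c_a(s) = a(s) for a must-assign task, a(s) + d(s) * D otherwise,
   since assigning a non-must-assign task increases demand */
static int TaskCostAssigned(int a, int d, int must_assign, int D)
{
  return must_assign ? a : a + d * D;
}

/* c_n(s) = n(s) - d(s) * D for a must-assign task, n(s) otherwise,
   since not assigning a must-assign task was already paid for as
   demand cost */
static int TaskCostUnassigned(int n, int d, int must_assign, int D)
{
  return must_assign ? n - d * D : n;
}

/* c_n(g) = 0: an unassigned group affects no resource monitors */
static int GroupCostUnassigned(void)
{
  return 0;
}
```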
@PP
Our last quantity, @M { c sub a (g) }, is the cost of resource
monitors when @M { g } is assigned a resource.  It is the most
complicated.  One way to find @M { c sub a (g) } would be to
carry out the grouping and assignment and just take the solution
cost from the KHE platform.  And indeed we do do this to some
extent, by calling @C { KheTaskGrouperEntryCost }
(Section {@NumberOf resource_structural.task_grouping.task_grouper}),
as we'll see shortly.  But relying entirely on this has
a few problems.  One is that we will be building thousands of
groups and it could be slow.  Another is that some constraints
are unwanted.  Avoid unavailable times constraints, for example,
are unwanted since when one resource is unavailable, another
usually is available.  Global constraints, on total workload for
example, are also unwanted.  We don't want to favour large groups
because they bring a resource closer to its minimum workload limit,
for example.  So we need to limit ourselves to @I wanted constraints.
We won't define these constraints formally, although the discussion
we've just had pretty well covers it.
@PP
The resource monitors affected by @M { g } can be divided into
@I { local monitors }, which are those concerned with what happens
on a few adjacent days, and @I { global monitors }, concerned with
what happens on many (usually all) days.
@PP
Local monitors include illegal pattern monitors, complete
weekends monitors, and so on.  They are evaluated for @M { g }
by function @C { KheTaskGrouperEntryCost }
(Section {@NumberOf resource_structural.task_grouping.task_grouper}),
whose result forms one part of @M { c sub a (g) }.  It limits itself
to wanted constraints.
@PP
There is a complication here.  When we call
@C { KheTaskGrouperEntryCost }, we are assuming that @M { g }
is the complete group of interest: the tasks of @M { g } are
assigned some resource @M { r } which is free on nearby days.
This is not necessarily correct, but it is a reasonable
assumption when @M { p(g) > 0 }, because @M { g }
contains primary tasks and there are no other primary tasks
nearby to add to it, because all the primary tasks we have
chosen to include have been assigned to @M { g } and to other
groups.  But when @M { p(g) = 0 } the group is free
to combine with other nearby tasks.  Take the leading example:
a Saturday day task, included so that a Sunday night task can be
grouped with it to avoid a complete weekends constraint violation.
If the Saturday day task is indeed grouped with a Sunday night task,
it is reasonable to call @C { KheTaskGrouperEntryCost }.  But if
it remains ungrouped, then @C { KheTaskGrouperEntryCost } reports
a violated complete weekends constraint.
@PP
One might say that this is acceptable, and that the better course
would be to leave the ungrouped Saturday day task unassigned.  But
what if that task is a must-assign task (as in our solver it
actually is)?  Then assigning it has a cost, and not assigning it
has a cost, but in reality what will happen is that a subsequent
solver will (in effect) group the task with a Sunday day task and
avoid all cost.  But the Sunday day task is not an included task
(nor can it be, because we have to stop somewhere), so interval
grouping cannot perceive this possibility.
@PP
We solve this problem by not calling @C { KheTaskGrouperEntryCost }
when @M { p(g) = 0 }, instead pretending that it returned its
minimum possible value, 0.  This amounts to an admission of our
ignorance about the real cost of leaving a secondary task ungrouped.
@PP
Global monitors include limits on total workload, limits on
consecutive busy shifts of a particular kind, limits on consecutive
busy days, and limits on consecutive free days.  Most of them are
irrelevant here (we assume that any limits on total or consecutive
busy days are not more restrictive than the limits on consecutive
primary times).  The only relevant ones are those derived from
constraint class @M { C }.  So the other part of @M { c sub a (g) } is
the cost resulting from comparing @M { p(g) } with @M { C sub "min" }
and @M { C sub "max" }, except that when @M { p(g) = 0 } the cost is 0.
@PP
Because calculating @M { c sub a (g) } is slow, we store it in a
cache indexed by the set of times that @M { g } is running.  This
is appropriate, because resource monitors are affected only by the
set of times that assigned tasks are running.  So once
@M { c sub a (g) } has been calculated once for some
set of times, subsequent calculations for that set of times
are just cache lookups and run very quickly.
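@PP
The cache can be sketched as follows.  This is a hypothetical
simplification, not the actual cache of
@C { khe_sr_interval_grouping.c }: the set of times is a bitmask, the
cache is a small linear-probing table, and the slow cost calculation
is a stand-in:

```c
#include <assert.h>

#define CACHE_SIZE 64

typedef struct {
  unsigned long mask;  /* set of times g is running, 0 = empty slot */
  int cost;            /* cached c_a(g) for that set of times */
} CacheEntry;

static CacheEntry cache[CACHE_SIZE];
static int cache_misses = 0;  /* for observing cache behaviour */

/* stand-in for the slow calculation of c_a(g) from the time set;
   here, one unit of cost per busy time */
static int SlowAssignedGroupCost(unsigned long mask)
{
  int cost = 0;
  cache_misses++;
  while (mask != 0) {
    cost += (int)(mask & 1);
    mask >>= 1;
  }
  return cost;
}

/* calculate c_a(g) at most once per distinct set of times (no
   overflow handling: a sketch only) */
static int CachedAssignedGroupCost(unsigned long mask)
{
  int i;
  if (mask == 0)
    return 0;  /* empty time set: nothing is running */
  i = (int)(mask % CACHE_SIZE);
  while (cache[i].mask != 0 && cache[i].mask != mask)
    i = (i + 1) % CACHE_SIZE;  /* linear probing */
  if (cache[i].mask == 0) {
    cache[i].mask = mask;
    cache[i].cost = SlowAssignedGroupCost(mask);
  }
  return cache[i].cost;
}
```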
@PP
@BI { Optimizing cost accumulation. }
If we declare that only finished groups contribute costs, we now
have a complete definition of the cost of a solution.  However,
we wish to take one more step, which is to optimize so that
task costs are accumulated as soon as possible.  The idea
is to identify cases where it is known that a group @M { g }
will be assigned.  Then the task costs @M { c sub a (s) } of
its tasks @M { s } can be added to the total cost without waiting
until the group is finished.  In all cases, however, group costs
are accumulated only for finished groups.
@PP
To @I { handle a group g as assigned } means to assume that a
subsequent solver will assign a resource to @M { g }, and to
treat @M { g } accordingly, as follows.  At the moment we decide
to do this, for each @M { s in g } we add @M { c sub a (s) }
to the total cost.  Later, as other tasks @M { s } are added to
@M { g }, we add their costs @M { c sub a (s) } to the total
cost as well.  (Once we decide to handle @M { g } as assigned,
we never take that decision back.)  Finally, when @M { g } becomes
finished, we add @M { c sub a (g) } to the solution cost.
@PP
When @M { g } contains history, we handle @M { g } as assigned
from the start.  Otherwise (when @M { g } does not contain
history), the first time (if ever) that a task is added to
@M { g } which is assigned a resource when interval grouping
begins, we handle @M { g } as assigned.
@PP
Otherwise (when @M { g } contains no history and no initially
assigned tasks), the straightforward way to proceed, which we will
optimize shortly, is to calculate no costs until @M { g }
becomes finished.  At that moment, we evaluate the expression
@ID @Math {
gamma(g) `` = `` min( gamma sub a (g), gamma sub n (g) )
}
given earlier, and add the smaller result to the solution cost,
declaring @M { g } to be assigned or unassigned depending on
whether @M { gamma sub a (g) } or @M { gamma sub n (g) } is
the smaller.
@PP
Most groups have no history and most tasks have no initial
assignment, so the last case is the usual one.  As things stand,
then, it is unusual for a group @M { g }'s task costs to be
accumulated before @M { g } is finished.  But now we introduce
an optimization that makes it much more usual.
@PP
Take any finished group @M { g } containing @M { K } tasks, where
@M { K >= 2 }, and suppose we have worked out that it is best for
@M { g } to remain unassigned.  So @M { g } has no history and all
@M { K } tasks are unassigned.  Break @M { g } into @M { K }
unassigned groups, each containing one of @M { g }'s tasks.  These
@M { K } groups are legal and their total cost (i.e. including both
task and group costs) is equal to the total cost of @M { g }, because
the tasks are unassigned in both cases, and unassigned groups have
group cost 0, as we established some time ago.  Grouping changes
nothing relevant to cost when the group is unassigned.
@PP
So we can insist that every unassigned group contain exactly one
task and no history, without risk of missing an optimal solution,
because any larger unassigned group can be broken up without any
change in cost.  Accordingly, at the moment a group @M { g } changes
from having one task to having two, if we are not already handling
@M { g } as assigned, we begin to do so.
@PP
The previously defined Boolean condition
@ID @Math { chi(g) = (r(g) != @C { NULL } vee bar g bar >= 2) }
is a concrete expression equivalent to handling @M { g }
as assigned.  To show this, first suppose that @M { chi(g) = "true" }.
Then @M { r(g) != @C { NULL } vee bar g bar >= 2 }, so
either @M { g } has history, or @M { g } contains an
initially assigned task, or @M { g } contains two or more
tasks.  In all these cases we handle @M { g } as assigned.
@PP
Conversely, suppose @M { chi(g) = "false" }.  Then
@M { r(g) = @C { NULL } wedge bar g bar < 2 }, so @M { g } has no
history, no initially assigned tasks, and fewer than 2 tasks, and
in these cases we do not handle @M { g } as assigned.  Also, we
assumed long ago that every group either has history or at least
one task.  So @M { bar g bar >= 1 } and we conclude that @M { g }
has no history and exactly one task, which is not initially assigned.
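@PP
In C, the condition @M { chi(g) } might look like this (hypothetical
types; @C { r } is @M { r(g) } and @C { size } is @M { bar g bar }):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int dummy; } Resource;

/* chi(g) = (r(g) != NULL or |g| >= 2): true iff g is handled as
   assigned, i.e. g has history, or contains an initially assigned
   task, or contains two or more tasks */
static int HandleAsAssigned(const Resource *r, int size)
{
  return r != NULL || size >= 2;
}
```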
@End @SubAppendix

@SubAppendix
    @Title { The algorithm, and dominance testing }
    @Tag { interval_grouping.algorithm }
@Begin
@LP
We now present the overall algorithm.  As already stated
(Section {@NumberOf resource_structural.task_grouping.interval_grouping}),
it builds all solutions up to the end of the first day, then up to the
end of the second day, and so on, using dominance testing to reduce
the number of solutions kept on each day before starting the next day.
@PP
A basic step, then, is to move from a solution @M { G( S sub {i-1} ) }
for the tasks @M { S sub {i-1} = X sub 1 cup cdots cup X sub {i-1} }
to a solution @M { G( S sub i ) } for
@M { S sub i = X sub 1 cup cdots cup X sub i }.  To do this,
we need to assign each task of @M { X sub i } to a group:  either
to an existing unfinished group of @M { G( S sub {i-1} ) },
or to a new group.
@PP
Any task can start a new group, but a task @M { s } whose
first busy time lies in @M { T sub i } can only be added
to a group @M { g } if the following conditions hold (we
write @M { g + s } for the new task group @M { g cup lbrace s rbrace }):
@NumberedList

@LI @OneRow {
@M { g } is running during @M { T sub {i-1} } but not during
@M { T sub i } or thereafter; or if @M { T sub i } is the
first day, @M { g } contains a history value and nothing else.
}

@LI @OneRow {
The resulting group does not have too many primary times; that is,
@M { p(g + s) <= C sub "max" }.  Such @I { oversized groups } could
be allowed with a penalty, but that could lead to vast numbers of
solutions containing oversized groups.  Indeed, excluding oversized
groups is crucial for proving that the algorithm runs in polynomial time.
An exception is made, however, when @M { r(g) } is non-@C { NULL } and
@M { s } is assigned @M { r(g) }.  Since both @M { g } and @M { s }
are going to be assigned @M { r(g) } anyway, they might as well
be united.
}

@LI @OneRow {
@M { g } does not contain a task whose placement is @I { last_only }.
}

@LI @OneRow {
The placement of @M { s } is not @I { first_only }.
}

@LI @OneRow {
Adding @M { s } to @M { g } must be legal according to the task grouper
module (Section {@NumberOf resource_structural.task_grouping.task_grouper}).
Its rules involving times will be satisfied by the way @M { g } and
@M { s } are chosen, but it has other rules, involving domains and
assignments, and those are required by including this condition.
}

@EndList
These conditions essentially say that @M { g } is an unfinished group
and @M { s } is a suitable addition to it.
@PP
After all tasks have been added, any unfinished groups from
@M { G ( S sub {i-1} ) } that are not running during @M { T sub i } are
declared to be finished, as are all groups running during @M { T sub i }
that satisfy (b), (c), or (d) above.  Any finished group which has
not yet been declared assigned or not assigned is declared one
way or the other now.
@PP
Each group @M { g in G( S sub i ) } has its group cost @M { c(g) }
added to @M { c(G( S sub i )) } at the moment it transitions from
unfinished to finished.  Each task @M { s } lying in a group
@M { g in G( S sub i ) } has its task cost @M { c(s) } added to
@M { c(G( S sub i )) } at the moment @M { g }'s assignment status
transitions from unknown to assigned or unassigned, regardless
of whether @M { g } is finished.
@PP
To find all solutions, we do this in all ways and for all @M { i },
starting from @M { G ( S sub 0 ) }, where @M { S sub 0 } is the
empty set of tasks, and @M { G ( S sub 0 ) } contains one group for
each resource with history in @M { C }, containing a corresponding
history value and no tasks.  A solution @M { G( S sub n ) } of
minimum cost is the final result.
@PP
@BI { Dominance testing. }
As described, this process is hopelessly exponential.  However,
in many cases, one solution for @M { S sub i } can be shown to
@I dominate another, meaning that for each complete solution
(i.e. each solution containing all the tasks of @M { S sub n })
derived from the second solution, there is a complete solution
derived from the first solution whose cost is equal or less.  We can
drop the dominated solutions, except that when two solutions dominate
each other, we can only drop one of the two.  The algorithm, then,
is to build all solutions for @M { S sub 0 }, then all solutions for
@M { S sub 1 }, and so on, dropping dominated solutions at each stage.
We proceed in this breadth-first manner because dominance testing is
only practicable between pairs of solutions for the same set of tasks.
@PP
For example, after the last day all groups are finished and the only
complete solution derived from @M { G( S sub n ) } is
@M { G ( S sub n ) } itself.  So @M { G sub 1 ( S sub n ) }
dominates @M { G sub 2 ( S sub n ) } when
@M { c( G sub 1 ( S sub n ) ) <= c( G sub 2 ( S sub n ) ) }.
Dropping dominated solutions on the last day leaves just one
survivor, and that is the final result.
@PP
Testing whether solution @M { G sub 1 ( S sub i ) } dominates
solution @M { G sub 2 ( S sub i ) } has two steps.  First, for
each solution we calculate a @I { signature }:  a summary of the
solution containing everything needed for testing dominance and
nothing else.  For solutions for @M { S sub n }, the signature
is just the cost, but in general it is more complicated, as we'll
see; it must include everything that determines how the solve can
proceed from here onwards.  Second, we compare signatures, giving
an outcome that indicates either that @M { G sub 1 ( S sub i ) }
dominates @M { G sub 2 ( S sub i ) } or that it does not.  If
desired, a second call on the test can be made to decide whether
@M { G sub 2 ( S sub i ) } dominates @M { G sub 1 ( S sub i ) } or
not.  So to complete the algorithm we need to define the signature
of each solution and say how signatures are compared.
@PP
Our dominance test does not have to be perfect.  If it declares that
one solution dominates another, that must be true, otherwise the
algorithm is in danger of missing the optimal solution.  But if it
declares that one solution does not dominate another, that need not
be true; if it is false, it just means that the algorithm keeps more
solutions than is strictly necessary, making its running time and
memory consumption larger than it needs to be.
@PP
The cost of a solution is an important part of its signature:  we
have seen that it is the entire signature on the last day.  It will be
part of the signature of every solution on every day.  Finished groups
have no effect on what happens on later days, so they are not in the
signature, except that their costs are included in the solution cost.
So we take the signature of a solution @M { G( S sub i ) } to be
@M { c( G ( S sub i ) ) } plus the unfinished groups of @M { G ( S sub i ) }.
@PP
For example, if @M { G sub 1 ( S sub i ) } and @M { G sub 2 ( S sub i ) }
have identical unfinished groups, clearly they can only develop in
identical ways, so only one need be kept---one whose cost is less
than or equal to the other's.  But for the dominance test to require
unfinished groups to be identical would be to give up too many cases
of dominance.  Instead, for @M { G sub 1 ( S sub i ) } to dominate
solution @M { G sub 2 ( S sub i ) }, we require:
@BulletList

@LI @OneRow {
@M { c( G sub 1 ( S sub i ) ) <= c( G sub 2 ( S sub i ) ) };
}

@LI @OneRow {
@M { G sub 1 ( S sub i ) } and @M { G sub 2 ( S sub i ) } have the same
number of unfinished groups;
}

@LI @OneRow {
The unfinished groups of @M { G sub 1 ( S sub i ) } and
@M { G sub 2 ( S sub i ) } can be permuted so
that for each pair of corresponding unfinished groups @M { g sub 1 } from
@M { G sub 1 ( S sub i ) } and @M { g sub 2 } from @M { G sub 2 ( S sub i ) },
@M { g sub 1 } dominates @M { g sub 2 }.
}

@EndList
We say what it means for one group to dominate another below.
Notice the `imperfection' in this definition:  we have given away
any prospect of declaring dominance when the number of unfinished
groups differs.  This is because anything more is impractical.
Similarly, we will try just one permutation of the unfinished
groups of @M { G sub 1 ( S sub i ) } and @M { G sub 2 ( S sub i ) },
and if we try the wrong one, we will miss a case of dominance.
# There is often a one-to-one correspondence between
# unfinished groups in @M { G ( S sub i ) } and tasks in @M { X sub i },
# however, so fewer cases of dominance will be missed for this reason
# than at first seems likely.
# This source of imperfection is more likely.
@PP
The apparently innocuous condition
@M { c( G sub 1 ( S sub i ) ) <= c( G sub 2 ( S sub i ) ) }
depends crucially on two facts.  First, whenever we add a group cost
or a task cost to a solution cost, that cost will not be removed as
the solution is extended into any larger solution.  Second, all costs
are non-negative.
@PP
The next step is to determine how two unfinished groups @M { g sub 1 }
and @M { g sub 2 } can be compared for dominance.  The idea is that
the two groups do not have to be identical, but nevertheless they
must be similar enough to satisfy these two general requirements:
@BulletList

@LI @OneRow {
@I { Legality. }  For every set of tasks @M { Q } such that extending
@M { g sub 2 } into a finished group @M { g sub 2 cup Q } is legal,
extending @M { g sub 1 } into a finished group @M { g sub 1 cup Q }
is also legal.
}

@LI @OneRow {
@I { Cost. }  For each @M { Q }, the cost contributed by
finished group @M { g sub 1 cup Q } does not exceed the cost
contributed by finished group @M { g sub 2 cup Q }, excluding
costs already contributed by @M { g sub 1 } and @M { g sub 2 }.
}

@EndList
If these requirements hold, then for every way that solution
@M { G sub 2 ( S sub i ) } can be extended into a complete
solution for @M { S sub n }, that same way applied to
@M { G sub 1 ( S sub i ) } will extend it into another complete
solution for @M { S sub n } whose cost is equal or less.  This
proves dominance.  We now look for concrete, implementable
conditions that imply these two general requirements.
@PP
To show that @M { g sub 1 cup Q } is legal, we have to show that
adding @M { Q } to @M { g sub 1 } satisfies the six conditions for
legality defined in
Section {@NumberOf resource_structural.task_grouping.task_grouper}.
The fact that @M { g sub 1 } and @M { g sub 2 } are legal
groups, plus the assumption that @M { g sub 2 cup Q } is legal,
plus the following conditions, give us what we need:
@NumberedList

@LI {
The last day that @M { g sub 1 } is running, and the last day
that @M { g sub 2 } is running, are the same.  This gives us
condition (1) from
Section {@NumberOf resource_structural.task_grouping.task_grouper}.
}

@LI @OneRow {
@M { d( g sub 1 ) supseteq d( g sub2 ) }.
This gives us condition (2) from
Section {@NumberOf resource_structural.task_grouping.task_grouper}.
Actually, we loosen this using @C { KheTaskGroupDomainDominates }, as
explained in Section {@NumberOf resource_structural.task_grouping.domains}.
}

@LI @OneRow {
@M { r( g sub 1 ) = r( g sub2 ) }.
This gives us conditions (3), (4), and (6) from
Section {@NumberOf resource_structural.task_grouping.task_grouper}.
}

# @LI @OneRow {
# @M { f ( g sub 1 ) = f ( g sub 2 ) }.
# This gives us condition (5) from
# Section {@NumberOf resource_structural.task_grouping.task_grouper}.
# }

@LII {
Any group can be declared to be finished at any time, so requiring
@M { g sub 1 cup Q } to be finished is not a problem.  To support the
cost requirement, we'll show that these additional conditions suffice:
}

@LI @OneRow {
@M { T prime ( g sub 1 ) = T prime ( g sub 2 ) }.  In other words,
the times of @M { g sub 1 } and @M { g sub 2 }, plus any history,
are the same.  This subsumes condition (1) above.
}

@LI {
@M { chi( g sub 1 ) = chi( g sub 2 ) }.
# , where
# @M { chi(g) = (r(g) != @C { NULL } vee bar g bar >= 2) } as usual.
}

@LI {
If @M { chi( g sub 1 ) = chi( g sub 2 ) = "false" }, then
@M { g sub 1 } consists of a single task @M { s sub 1 } and
@M { g sub 2 } consists of a single task @M { s sub 2 } (as
proved at the end of Section {@NumberOf interval_grouping.cost}),
and we require @M { c sub a ( s sub 1 ) <= c sub a ( s sub 2 ) } and
@M { c sub n ( s sub 1 ) <= c sub n ( s sub 2 ) }.
}

@EndList
Condition (5) may not be strictly necessary, but it is not likely
to cause many dominance misses in practice, and it allows us to
reduce the analysis to just the following two cases.
@PP
@B { Case 1 }.
Suppose @M { chi( g sub 1 ) = chi( g sub 2 ) = "true" }.
Then @M { g sub 1 } and @M { g sub 2 } are being handled as assigned,
so the task costs of their tasks have already been included in
the solution cost, and so play no role here.  The task costs of
the tasks of @M { Q } are the same in both @M { g sub 1 cup Q } and
@M { g sub 2 cup Q }, given that @M { Q } is the same in both and
both groups are being handled as assigned.
@PP
Now for group costs.  Condition (4) implies
@M { T prime ( g sub 1 cup Q ) = T prime ( g sub 2 cup Q ) },
which implies that @M { c ( g sub 1 cup Q ) = c ( g sub 2 cup Q ) },
because group costs depend only on resource monitors, which depend
only on the times that the group's tasks are running (including
any history).  Whether the groups are assigned or unassigned
also matters, but both are being handled as assigned here.
@PP
@B { Case 2 }.
Suppose @M { chi( g sub 1 ) = chi( g sub 2 ) = "false" }.  This
implies, as above, that @M { g sub 1 } consists of a single task
@M { s sub 1 }, and @M { g sub 2 } consists of a single task
@M { s sub 2 }.  Because @M { g sub 1 } and @M { g sub 2 } are
not being handled as assigned, the task costs of @M { s sub 1 }
and @M { s sub 2 } have not been included in the total cost and
must be accounted for here.
@PP
First, suppose @M { Q != emptyset }.  Then
@M { bar g sub 1 cup Q bar >= 2 } and @M { bar g sub 2 cup Q bar >= 2 },
so @M { chi( g sub 1 cup Q ) = chi( g sub 2 cup Q ) = "true" }.  Both
groups start being handled as assigned when the first task of @M { Q }
is added, at which point @M { c sub a ( s sub 1 ) } and
@M { c sub a ( s sub 2 ) } are added to the costs of their respective
solutions.  So we require @M { c sub a ( s sub 1 ) <= c sub a ( s sub 2 ) }.
After that, the same task costs (for
the tasks of @M { Q }, handled as assigned) are added.  For the group
cost, since both complete groups are handled as assigned, the argument
we used before shows that @M { c( g sub 1 cup Q ) = c( g sub 2 cup Q ) }.
@PP
Second, suppose that @M { Q = emptyset }.  In this case, for both
@M { g sub 1 } and @M { g sub 2 } we evaluate @M { gamma sub a (g) }
and @M { gamma sub u (g) }, and we add the smaller of these two
values, which we have called @M { gamma (g) }, to the solution cost.
So we need @M { gamma ( g sub 1 ) <= gamma ( g sub 2 ) }.  So our
last job is to find a concrete condition that implies this one.
@PP
Now, because @M { g sub 1 } contains a single task @M { s sub 1 },
and @M { g sub 1 } contains a single task @M { s sub 2 }, we have
@ID @Math {
gamma ( g sub 1 ) = min(
c sub a ( g sub 1 ) + c sub a ( s sub 1 ) , `
c sub n ( g sub 1 ) + c sub n ( s sub 1 ) )
}
and
@ID @Math {
gamma ( g sub 2 ) = min(
c sub a ( g sub 2 ) + c sub a ( s sub 2 ) , `
c sub n ( g sub 2 ) + c sub n ( s sub 2 ) )
}
Now @M { c sub a ( g sub 1 ) = c sub a ( g sub 2 ) = alpha }, say,
because @M { g sub 1 } and @M { g sub 2 } run at the same times,
so have the same group cost when assigned.  Also,
@M { c sub n ( g sub 1 ) = c sub n ( g sub 2 ) = 0 }, because,
as we know,
there is no group cost when the group is unassigned.
# Furthermore,
# @M { c sub a ( s sub 1 ) = a( s sub 1 ) + beta }, say, where
# @M { beta } is a demand cost, or 0 if that is not wanted; and
# @M { c sub a ( s sub 2 ) = a( s sub 2 ) + beta }, where
# we use the same @M { beta } because it is added under the
# same conditions and @M { s sub 1 } and @M { s sub 2 } have
# equal durations.  Finally,
# @M { c sub n ( s sub 1 ) = n( s sub 1 ) } and
# @M { c sub n ( s sub 2 ) = n( s sub 2 ) }.
Substituting these values gives
@ID @Math {
gamma ( g sub 1 ) = min(
alpha + c sub a ( s sub 1 ),
0 + c sub n ( s sub 1 ) )
}
and
@ID @Math {
gamma ( g sub 2 ) = min(
alpha + c sub a ( s sub 2 ),
0 + c sub n ( s sub 2 ) )
}
We could require @M { gamma ( g sub 1 ) <= gamma ( g sub 2 ) } directly
at this point, which would be maximally precise; but, since this case
already requires @M { c sub a ( s sub 1 ) <= c sub a ( s sub 2 ) }, it seems
better (simpler) to require @M { c sub n ( s sub 1 ) <= c sub n ( s sub 2 ) }
as well.  Clearly, these two conditions together imply
@M { gamma ( g sub 1 ) <= gamma ( g sub 2 ) }.  And we're done.
# @PP
# Second, suppose that @M { Q = emptyset }.  In this case, for both
# @M { g sub 1 } and @M { g sub 2 } we evaluate
# @ID @Math {
# c(g) + big sum from {s in g} c(s)
# }
# twice, once assuming that @M { g } is assigned and again assuming that
# @M { g } is not assigned, and we add the smaller of the two results
# to the solution cost.  Let this smaller value be @M { c prime (g) }.
# Then we need @M { c prime ( g sub 1 ) <= c prime ( g sub 2 ) }.  So
# our last job is to find a concrete condition that implies this one.
# @PP
# Now writing @M { c sub a (g) } and @M { c sub a ( s ) } for the
# cost of @M { g } and @M { s } when they are assigned, and
# @M { c sub n (g) } and @M { c sub n ( s ) } for the
# cost of @M { g } and @M { s } when they are not assigned, we have
# @ID @Math {
# c prime ( g sub 1 ) = min(
# c sub a ( g sub 1 ) + c sub a ( s sub 1 ) , `
# c sub n ( g sub 1 ) + c sub n ( s sub 1 ) )
# }
# and
# @ID @Math {
# c prime ( g sub 2 ) = min(
# c sub a ( g sub 2 ) + c sub a ( s sub 2 ) , `
# c sub n ( g sub 2 ) + c sub n ( s sub 2 ) )
# }
# Now @M { c sub a ( g sub 1 ) = c sub a ( g sub 2 ) = alpha }, say,
# because @M { g sub 1 } and @M { g sub 2 } run at the same times,
# so have the same group cost when assigned.  Also,
# @M { c sub n ( g sub 1 ) = c sub n ( g sub 2 ) = 0 }, because,
# as we know,
# there is no group cost when the group is unassigned.  Furthermore,
# @M { c sub a ( s sub 1 ) = a( s sub 1 ) + beta }, say, where
# @M { beta } is the usual cost of overloading a resource by
# the number of times that @M { s sub 1 } is running, or 0
# if that is not wanted; and
# @M { c sub a ( s sub 2 ) = a( s sub 2 ) + beta }, where
# we use the same @M { beta } because it is added under the
# same conditions and @M { s sub 1 } and @M { s sub 2 } have
# equal durations.  Finally,
# @M { c sub n ( s sub 1 ) = n( s sub 1 ) } and
# @M { c sub n ( s sub 2 ) = n( s sub 2 ) }.
# Substituting all this gives
# @ID @Math {
# c prime ( g sub 1 ) = min(
# alpha + a( s sub 1 ) + beta,
# 0 + n( s sub 1 ) )
# }
# and
# @ID @Math {
# c prime ( g sub 2 ) = min(
# alpha + a( s sub 2 ) + beta,
# 0 + n( s sub 2 ) )
# }
# We could require @M { c prime ( g sub 1 ) <= c prime ( g sub 2 ) }
# directly at this point, which would be maximally precise; but,
# since this case already requires @M { a( s sub 1 ) <= a( s sub 2 ) },
# it seems better (simpler) to require @M { n( s sub 1 ) <= n( s sub 2 ) }
# as well.  Clearly, these two conditions together imply
# @M { c prime ( g sub 1 ) <= c prime ( g sub 2 ) }.
# And we're done.
@PP
@BI { Concluding points. }
In practice, most groups contain only unassigned, primary, must-assign
tasks.  For such groups, only conditions (2) and (4) are likely to
fail to hold, with (5) failing only when the groups have different
durations.  This is important for the time complexity analysis.
@PP
We need to explain how the unfinished groups are permuted.  We do
this by sorting them, using the attributes that appear in the
dominance test as sort keys.  This is not perfect, but, as
explained above, we do not need perfection.
@End @SubAppendix

@SubAppendix
    @Title { Some detailed points about the algorithm }
    @Tag { interval_grouping.algorithm_detailed }
@Begin
@LP
In this section we present some points of
detail about the algorithm just described.
# Each is helpful but not essential.
@PP
@BI { Grouping tasks with similar domains. }
At first sight, the algorithm does not seem to include anything
which favours grouping tasks with similar domains.  However, the
dominance test does favour such groups, as we now show.
@PP
Suppose that there are two groups, @M { g sub 1 } and
@M { g sub 2 }, in some solution, and that their domains satisfy
@M { d( g sub 1 ) subseteq d( g sub 2 ) }.  Suppose that there
are two tasks @M { s sub 1 } and @M { s sub 2 } running on the
next day, and that their domains happen to be
@M { d( s sub 1 ) = d( g sub 1 ) } and 
@M { d( s sub 2 ) = d( g sub 2 ) }.  If we group @M { s sub 1 }
with @M { g sub 1 } and @M { s sub 2 } with @M { g sub 2 }, the
domains of the new groups will be @M { d( g sub 1 ) } and
@M { d( g sub 2 ) }.  If we group @M { s sub 1 } with @M { g sub 2 }
and @M { s sub 2 } with @M { g sub 1 }, their domains will both
be @M { d( g sub 1 ) cap d( g sub 2 ) = d( g sub 1 ) }.  But
@M { d( g sub 1 ) subseteq d( g sub 2 ) }, so this second solution
is dominated by the first.
@PP
The early dominance testing optimization
(Section {@NumberOf interval_grouping.early_dominance})
is also relevant to this issue.
# Nevertheless there are occasional points in the optimal solution where
# swapping two tasks which run at the same times, that is, moving each
# to the other's group, enlarges the domain of one of the two groups
# without shrinking the other.  The author has not fully analysed this
# situation.  However, when two solutions dominate each other (when
# they have the same cost and the same signature), he has added a
# further test.  Each solution keeps track of the number of its
# finished groups whose tasks do not all have the same domain.  The
# further test favours retaining solutions for which this number
# is smaller.  This seems to have fixed the immediate problem.
@PP
@BI { Adding randomness. }
When two solutions dominate each other, there is an opportunity
to add randomness.  We do this by building task sets in a somewhat
random order, depending on the solution's diversifier.  This should
cause solutions to be created in a somewhat random order, and since
the first solution to be created is retained when two solutions
dominate each other, this in turn should cause the set of solutions
retained on each day to be somewhat random.
# @PP
# @BI { Adding randomness. }
# When two solutions dominate each other, there is an opportunity
# to add randomness.  We do this by finding a random integer for
# each solution based on the total number of solutions that have
# been generated so far, plus the KHE solution's diversifier.  If
# all else is equal, a solution with a smaller (or equal) random
# integer is considered to dominate.  In practice this seems to
# produce fairly different solutions.
@PP
@BI { Optimizing expansion. }
Suppose we have a specific solution
@M { G( S sub {i-1} ) } and a set of tasks @M { X sub i } starting
on day @M { i } that we need to add to @M { G( S sub {i-1} ) } in all
possible ways.  We call this @I { expanding } @M { G( S sub {i-1} ) }.
Let @M { U sub {i-1} } be the unfinished groups of @M { G( S sub {i-1} ) }.
Some of the tasks of @M { X sub i } will be added to some of the
groups of @M { U sub {i-1} }; the other tasks of @M { X sub i } will
start new groups.  This matching up of the groups @M { U sub {i-1} }
and tasks @M { X sub i } can be split into four cases:
@NumberedList

@LI {
@I { Overhangs case. }
For each group @M { g in U sub {i-1} } which ends with a multi-day
task which is already running during @M { T sub i } (we call this
an @I { overhang }), we include @M { g }, as is, in all solutions
for @M { S sub i }.
}

@LI {
@I { Continuing assignments case. }
For each group @M { g in U sub {i-1} } not handled by (1)
and @M { s in X sub i } which are assigned the same resource,
we must add @M { s } to @M { g }.  This includes cases where
@M { g } has no tasks but does have a history value.
}

@LI {
@I { Undersized groups case. }
An @I undersized group is a group @M { g } such that
@M { 0 < p(g) < C sub "min" }.
For each undersized @M { g in U sub {i-1} } not handled
by cases (1) and (2), we can insist that @M { g } receive some
@M { s in X sub i } whenever a suitable @M { s } can be found.
This is quite different from the main rule, which insists
that each @M { s } be included.  It implies that avoiding
undersized groups has high priority, which is not strictly
true (its priority depends on the cost of undersized groups),
but which we take to be true in practice, as an optimization
that reduces the number of solutions generated.  This case
is omitted when the @C { rs_interval_grouping_complete }
option is set to @C { true }; this is the only effect of
that option.
}

@LI {
@I { Default case. }
This covers all @M { g } and @M { s } which are not
covered by cases (1), (2), and (3).
}

@EndList
For a given @M { G( S sub {i-1} ) } and @M { X sub i }, we first
handle cases (1) and (2), producing a single partial solution
for @M {  S sub i }.  We then recursively assign a task to each
undersized group not handled by (1) and (2) in each possible way.
For each partial solution thus obtained, we then recursively assign
each task not handled by cases (1), (2), and (3) to either an existing
unused group or to a new group.  This saves time by not generating
solutions in which undersized groups are not assigned tasks, except
when suitable tasks cannot be found.
@PP
@BI { Optimizing dominance testing. }
If one solution dominates another, then the two solutions have
the same number of unfinished task groups, and those task
groups can be sorted so that corresponding task groups have
equal primary durations.  Although it is regrettable that such
rigid rules need to be applied, they do have one advantage:
their presence means that there is no need to carry out dominance
testing at all between solutions that do not satisfy them.
@PP
Accordingly, the set of undominated solutions for each day is
partitioned into one part for each distinct value of the sequence
of integers beginning with the number of unfinished groups and
continuing with those groups' primary durations in non-decreasing
order.  When adding a new solution, a trie data structure is used
to access its part, and all dominance testing occurs within that
part.  This reduces the number of dominance tests required when
adding a solution.
@PP
@BI { Including lengthener tasks. }
The algorithm never produces a group whose primary duration exceeds
@M { C sub "max" }, except for groups that are unavoidable because
they contain just one task, or because all their tasks are assigned
the same resource.  But it does produce groups whose primary duration
falls short of @M { C sub "min" }, when it cannot avoid it.
@PP
In good solutions, these short sequences are often extended by
non-must-assign primary tasks.  But we can't include all such tasks
in the solve, because the algorithm might become swamped by these
tasks and run too slowly.
@PP
So we do this.  First, we run the algorithm without including
non-must-assign primary tasks.  Then, for each undersized group
in the solution we identify up to two non-must-assign primary tasks
that could be used to lengthen that group, one just before it
and one just after it.  We then run the algorithm a second time,
adding these @I { lengthener tasks } to the previously included tasks.
@PP
On one run, including lengthener tasks reduced the total duration
of undersized groups from 15 to 4, while introducing 10
non-must-assign primary tasks lying in must-assign groups.
@PP
At present, this lengthener tasks run is turned off.  It may
not be worth the time it takes, given that subsequent solvers
can easily add non-must-assign tasks to undersized groups.
@End @SubAppendix

@SubAppendix
    @Title { Equivalent task groups and tasks }
    @Tag { interval_grouping.equivalent }
@Begin
@LP
Consider again the expansion step, which takes
a solution @M { G( S sub {i-1} ) } with unfinished groups
@M { U sub {i-1} }, and adds the tasks @M { X sub i } to
@M { G( S sub {i-1} ) } in all possible ways.  Even one
expansion can produce thousands of new solutions and
thousands of dominance tests.  We need to generate fewer solutions.
@PP
We can do this by detecting equivalent task groups and tasks (we
define equivalence below, but informally we mean interchangeable),
and not generating solutions which just permute these interchangeable
objects among themselves.  Write @M { g + s } for the task group
@M { g cup lbrace s rbrace }.  Let @M { g sub 1 } and @M { g sub 2 }
be two distinct unfinished groups from @M { G( S sub {i-1} ) }, and
let @M { s sub 1 } and @M { s sub 2 } be two distinct tasks from
@M { X sub i }.  If @M { g sub 1 } and @M { g sub 2 } are equivalent,
or if @M { s sub 1 } and @M { s sub 2 } are equivalent (note this
is `or', not `and'), then any solution which includes
@M { g sub 1 + s sub 1 } and @M { g sub 2 + s sub 2 } is
equivalent to the same solution with these two groups replaced
by @M { g sub 1 + s sub 2 } and @M { g sub 2 + s sub 1 }.  So
only one of these two solutions needs to be generated.
@PP
We realize these savings by grouping the task groups of
@M { U sub {i-1} } into equivalence classes before expansion begins,
and grouping the tasks of @M { X sub i } into equivalence classes
before solving constraint class @M { C } begins.  Then we recurse
over task group classes and task classes, rather than over task
groups and tasks.  For a given task group class @M { g overbar }
and a given task class @M { s overbar }, we only try assigning
different @I numbers of tasks (including 0) from @M { s overbar }
to task groups from @M { g overbar }; we no longer need to try
different assignments of @I specific tasks to specific task groups.
@PP
Also, before starting on the recursive part of the expansion,
for each task group class @M { g overbar } and each task class
@M { s overbar }, we decide whether any one task @M { s in s overbar }
can be added to any one group @M { g in g overbar }, and if so we add
a @I { link object } joining @M { g overbar } with @M { s overbar }.
Then the recursion traverses the links, and so only tries additions
of tasks to groups that are known to work.
@PP
We now define what it means for two task groups @M { g sub 1 } and
@M { g sub 2 } to be equivalent.  We exclude task groups with an
overhang, since they don't really participate in expansion; they just
get copied from the old solution to the new.  So let @M { g sub 1 }
and @M { g sub 2 } be unfinished task groups with no overhang.
@PP
Clearly, @M { g sub 1 } and @M { g sub 2 } are equivalent if all
their attributes that influence the algorithm (as listed in
Section {@NumberOf interval_grouping.task_groups}) are equal.  From
this list, the attributes we actually test are @M { T prime (g) },
@M { d(g) }, @M { r(g) }, and @M { chi(g) }.  We omit @M { T(g) }
and @M { p(g) } because @M { T prime (g) } includes them.  It does
not matter whether a group has a @I { last_only } task, because
if it does, it is finished and does not participate in expansion.
@PP
Similarly, tasks @M { s sub 1 } and @M { s sub 2 } are equivalent
if all their attributes that influence the algorithm (as listed in
Section {@NumberOf interval_grouping.tasks}) are equal.  From this
list, the attributes we actually test are @M { T prime (s) },
@M { d(s) }, @M { r(s) }, and @M { q(s) }.  We omit @M { T(s) }
and @M { p(s) } because they are included in @M { T prime (s) },
and we omit @M { n(s) } and @M { a(s) } because we handle them
in another way that we will explain now.
@PP
Consider two included tasks @M { s sub 1 } and @M { s sub 2 }
whose attributes are equal except that their @M { n(s) } and
@M { a(s) } values differ.  What difference do these different
values actually make?
@PP
To begin with, @M { n(s) } and @M { a(s) } do not influence which
groups @M { s } can be added to.  A glance at the five conditions
at the start of Section {@NumberOf interval_grouping.algorithm}
proves this.  So they can influence only the cost of groups and
solutions.  Furthermore, if
every group had to be assigned, then since every task goes
into some group, @M { n(s) } would be unused and @M { a(s) } would
be included in all solutions, so it could be ignored.  So @M { n(s) }
and @M { a(s) } only matter when there are unassigned groups.
@PP
When @M { s sub 1 } and @M { s sub 2 } are both added to existing
groups, they make no difference at all, because both groups then
have more than one task, and so, as explained in
Section {@NumberOf interval_grouping.cost}, their task costs
are calculated assuming that they will be assigned.
When @M { s sub 1 } and @M { s sub 2 } are both used to start
new groups, we don't know which costs they will contribute, but it
does not matter, because the situation is symmetrical.  So these
different values make a difference only when one of @M { s sub 1 }
and @M { s sub 2 } starts a new group and the other does not.
Even then, there is a difference only when the new group ends
up containing just the one task and a decision is made to
leave it unassigned.
@PP
Now we are free to swap @M { s sub 1 } and @M { s sub 2 }, because
of our assumption that their other attributes are equal.  So which
one should go into the existing group, and which one should start
the new group?  Clearly the task with the smaller value of
@M { n(s) - a(s) } should start the new group, because more
cost will be saved if that task ends up unassigned, which
is only possible with the new group.
@PP
One way to implement this is to use the linking mechanism of
Section {@NumberOf interval_grouping.early_dominance}.  A link
would prohibit one of the tasks being added to an empty group
and the other being added to a non-empty group at the same
time.  However, we can do better, as follows.
@PP
First, we omit @M { n(s) } and @M { a(s) } from the equivalence
test.  So tasks which have most attributes equal but differ in
these ones go into the same task class.  We sort the members of
each task class by decreasing @M { n(s) - a(s) }, so that when
taking tasks from a class to add to a group, tasks with larger
values of @M { n(s) - a(s) } are taken before tasks with smaller
ones.  Then, when expanding, we make sure that using tasks to
start new groups is always done last, which (as it turns out)
is the sensible way anyway.  So the tasks with the smallest
@M { n(s) - a(s) } values go into the new groups.
@PP
Most tasks run on one day; they are unassigned rather than fixed,
they are must-assign and primary, and their placement is @I { any }.  So
most tasks differ only in their domains, and we can expect to build
one equivalence class for each domain.  Task groups are similar,
except that their durations also vary, so we can expect to build
one equivalence class for each (domain, duration) pair.
@End @SubAppendix

@SubAppendix
    @Title { Early dominance testing }
    @Tag { interval_grouping.early_dominance }
@Begin
@LP
In this section we continue to save time during expansion using
equivalence between task groups and tasks, but here we do not require
equivalence in full.  We'll explain the idea using task groups and
tasks, although the implementation uses task group classes and task
classes.
@PP
(1) Let @M { s sub 1 } and @M { s sub 2 } be two tasks from @M { X sub i }
which are equivalent except that their domains differ.  Let
@M { g sub 1 } and @M { g sub 2 } be any two unfinished task groups
from @M { G( S sub {i-1} ) }.  Suppose the @I { superset condition }
@ID @Math {
d( g sub 1 + s sub 2 ) supseteq d( g sub 1 + s sub 1 )
@B "  and  "
d( g sub 2 + s sub 1 ) supseteq d( g sub 2 + s sub 2 )
@B "  and  "
"proper"
}
holds, where @I proper signifies that at least one of the superset
relations is proper.  Then any
solution in which @M { s sub 1 } is grouped with @M { g sub 1 } and
@M { s sub 2 } is grouped with @M { g sub 2 } will be dominated by
the solution which is the same except that @M { s sub 1 } is grouped
with @M { g sub 2 } and @M { s sub 2 } is grouped with @M { g sub 1 }.
This follows from the fact that @M { s sub 1 } and @M { s sub 2 }
are equivalent in most respects, so that it makes no difference
which goes with @M { g sub 1 } and which with @M { g sub 2 },
except for their domains, and those influence only the domains of
the groups they go into, which are handled by the superset condition.
@PP
Actually, as mentioned earlier, we use @C { KheTaskGroupDomainDominates }
instead of the superset test, as explained in
Section {@NumberOf resource_structural.task_grouping.domains}.  So
the reader who wants the exact story will have to mentally substitute
@C { KheTaskGroupDomainDominates } for each superset operation.
@PP
The dominated solutions may be eliminated by dominance testing (or
not---our test is not perfect), but here we intend to avoid creating
them at all.  We add @I proper because we don't want to eliminate
two solutions just because they dominate each other.
@PP
(2) Now (1) allows @M { g sub 1 } and @M { g sub 2 } to be any
unfinished task groups.  We can extend the idea by allowing one
of @M { g sub 1 } and @M { g sub 2 } to be the empty group, or
in other words, allowing one of the almost equivalent tasks
@M { s sub 1 } and @M { s sub 2 } to begin a new group.  If
@M { g sub 1 } is the empty group, in effect its domain is
the set of all resources.  So if the superset condition holds
with the two cases of `@M { g sub 1 `` non + }' deleted, then
any solution in which @M { s sub 1 } starts a new group and
@M { s sub 2 } is grouped with @M { g sub 2 } will be dominated
by the solution which is the same except that @M { s sub 1 } is
grouped with @M { g sub 2 } and @M { s sub 2 } starts a new group.
@PP
If @M { g sub 2 } is the empty group, we get the same case with
different labels, so we forbid that.  If both groups are empty, we
would need to delete them both from the superset condition.  This
produces @M { d( s sub 2 ) = d( s sub 1 ) }, putting @M { s sub 1 }
and @M { s sub 2 } into the same task class, so giving us nothing useful.
@PP
(3) Conversely to (1), let @M { g sub 1 } and @M { g sub 2 } be
two unfinished task groups from @M { G( S sub {i-1} ) } which
are equivalent except that their domains are different.  Let
@M { s sub 1 } and @M { s sub 2 } be any two tasks from
@M { X sub i }.
Again, if the superset condition
holds, then any solution in which @M { s sub 1 } is grouped with
@M { g sub 1 } and @M { s sub 2 } is grouped with @M { g sub 2 }
will be dominated by the solution which is the same except that
@M { s sub 1 } is grouped with @M { g sub 2 } and @M { s sub 2 }
is grouped with @M { g sub 1 }.  This follows from the fact that
@M { g sub 1 } and @M { g sub 2 } are equivalent in most respects,
so that it makes no difference which receives @M { s sub 1 } and
which receives @M { s sub 2 }, except for their domains, and those
influence only the domains of the groups they go into, which are
handled by the superset condition.  Groups are less likely to be
equivalent than tasks are, because groups commonly vary in their
durations as well as their domains, so we can expect less
improvement from this kind of early dominance testing.
@PP
(4) Since (3) allows @M { s sub 1 } and @M { s sub 2 } to be any
tasks, we can extend its scope by allowing one of @M { s sub 1 }
and @M { s sub 2 } to be the empty task, that is, nothing, which
amounts to saying that the group it goes into comes to an end.
Taking @M { s sub 1 } to be the empty task, in effect its
domain is the set of all resources.  So if the superset condition
holds with `@M { {non +} `` s sub 1 }' deleted, then any solution
in which @M { g sub 1 } ends and @M { s sub 2 } is grouped with
@M { g sub 2 } will be dominated by the solution which is the same
except that @M { s sub 2 } is grouped with @M { g sub 1 } and
@M { g sub 2 } ends.  As with empty groups, taking @M { s sub 2 } to be the empty
task is not really different, and taking both tasks to be empty
gives nothing useful.
@PP
These four cases, (1), (2), (3), and (4), constitute
@I { early dominance testing }.  We uncover them before beginning the
expansion, and we avoid creating the dominated solutions by marking
the link between @M { g sub 1 } and @M { s sub 1 } while
@M { g sub 2 + s sub 2 } is being tried, and marking the link between
@M { g sub 2 } and @M { s sub 2 } while @M { g sub 1 + s sub 1 } is
being tried, and declining to build groups whose links are marked.
@PP
As mentioned earlier, all this is actually applied to task group classes
and task classes @M { g overbar sub 1 }, @M { g overbar sub 2 },
@M { s overbar sub 1 }, and @M { s overbar sub 2 } rather than to
task groups and tasks @M { g sub 1 }, @M { g sub 2 }, @M { s sub 1 },
and @M { s sub 2 }.  It works for classes:  just mark the link between
@M { g overbar sub 2 } and @M { s overbar sub 2 } while the link
between @M { g overbar sub 1 } and @M { s overbar sub 1 } is
being used to build one or more groups, and mark the link between
@M { g overbar sub 1 } and @M { s overbar sub 1 } while the link
between @M { g overbar sub 2 } and @M { s overbar sub 2 } is
being used to build one or more groups, and decline to build
any groups which would utilize marked links.  Special links
represent the cases where @M { g overbar sub 1 } is the class
of empty groups and @M { s overbar sub 1 } is the class of
empty tasks.
@End @SubAppendix

@SubAppendix
    @Title { Time complexity }
    @Tag { interval_grouping.time_complexity }
@Begin
@LP
To find the time complexity of this algorithm, we need to
estimate the number of undominated solutions @M { S sub i }
at each step @M { i }.  We ignore secondary tasks, non-must-assign
tasks, multi-day tasks, domains, and assignments, because apart
from domains, few tasks differ in these properties, and there
are likely to be only a few domains.  This means that one group
differs from another only in its primary duration.
@PP
Suppose each solution @M { S sub i } has @M { n } unfinished
groups, each of which has primary duration in the range @M { [1, K] }.
Then, apart from the cost, the signature of each undominated
solution is just a multiset
of primary durations, since of any two solutions whose groups
have equal primary durations, one always dominates the other.
So the number of undominated solutions is at most @M { p(n, K) },
the number of distinct multisets of cardinality @M { n } whose
elements are integers in the range @M { [1, K] }.  For example,
if @M { n = 4 } and @M { K = 3 } there are 15 of these multisets:
@CD @OneRow @F lines @Break {
3 3 3 3  &2c  3 3 1 1  &2c  2 2 2 2
3 3 3 2  &2c  3 2 2 2  &2c  2 2 2 1
3 3 3 1  &2c  3 2 2 1  &2c  2 2 1 1
3 3 2 2  &2c  3 2 1 1  &2c  2 1 1 1
3 3 2 1  &2c  3 1 1 1  &2c  1 1 1 1
}
In general, we can argue as follows.  Divide these multisets into
two parts.  In the first part place those multisets that contain
at least one @M { K }.  This fixes one of the values in the multiset
but places no new constraints on the others.  So the number of such
multisets is @M { p(n-1, K) }.  In the second part place those
multisets that do not contain at least one @M { K }.  There are
@M { p(n, K-1) } of those.  So
@ID @M { p(n, K) ``=`` p(n-1, K) ``+`` p(n, K-1) }
We can have @M { n = 0 }, but the smallest valid @M { K } is @M { 1 },
so the bases of this recurrence are @M { p(n, 1) = 1 }, since the
only such multiset is the one containing @M { n } ones, and
@M { p(0, K) = 1 }, since the only such multiset is the empty one.
Although the bases are not the usual ones, the recurrence
is familiar and tells us that @M { p(n, K) } is a binomial coefficient:
@ID @M {
p(n, K) `` = `` pmatrix { row col n + K-1 row col {K-1} }
}
For example, @M { p(4, 3) = "-2p" @Font pmatrix { row col 6 row col 2 } = 15 }.
For a fixed @M { K } this is polynomial in @M { n }, of order @M { n sup {K-1} }.
@PP
In instance INRC2-4-100-0-1108, sequences of night tasks are the main
concern.  The maximum primary duration of an unfinished group is
@M { K = 4 } (because the maximum number of consecutive night tasks
is 5, so sequences of length 5 are finished), and there are at most
about 25 must-assign night tasks per day.  So our value is
@M { p(25, 4) = "-2p" @Font pmatrix { row col @R "28" row col 3 } = 3276 },
a manageable number.
@PP
The algorithm could still be exponential, if a single expansion could
take exponential time.  But that is not possible under our current
assumptions, because the task groups can lie in at most @M { K }
distinct classes, one for each primary duration, and the @M { n }
tasks in each @M { X sub i } are all equivalent.  So even if each
class could take any number of tasks from @M { 0 } to @M { n }, that
would still only amount to @M { (n + 1) sup K } combinations.
@End @SubAppendix

@EndSubAppendices
@End @Appendix
