Assigned |
|
Due |
|
Correction |
26 April "or" in algorithm insert(x), first line in step 2, should be "and". |
The purpose of this project is to give you exposure to different kinds of heap data structures. In particular, you are asked to implement one known as Interval Heap, which extends binary heap so that both findMin and findMax operations can be done efficiently in O(1). All other basic operations (insert, deleteMin, deletMax) are O(log N), and a Interval Heap can be built in O(N) time. The definition of the interval heap and its operations are given in the Description Section below. To fully understand this data structure and its advantages, you will need to read the original paper by J. Van Leeuwen and D. Wood (especially the first two sections) that introduced this data structure in 1987.
Let X be a totally ordered domain of values (e.g., a set of integers that specify the priorities, with the smaller value representing a higher priority). We are interested in a data structure that supports the following five operations over this domain:
Many data structures have been developed in order to accomplish the five tasks effectively, most of which generalize the binary heap. As discussed in the class, binary heaps support only operations (1), (3) and (4), or, alternatively, the operations (2), (3) and (5) from the preceding list. Min-Max Heap was proposed by Atkinson to support all of the five operations, which was organized like a heap, except that a different ordering relation was employed (see pp. 247 -248 of the text for a brief description of Min-Max Heap). Another data structure that generalizes heaps in a more consistent way to support these operations is the Interval Heap.
An interval heap is very much like a binary heap, it can be seen as a complete binary tree (CBT), and it can be stored in an array. There are two main differences. 1) Each node in CBT, except perhaps the last one, stores two values rather than one. These two values, say a, b in X (and a <= b), represent an interval [a, b] in X. The last node, called the left-end-node, or simply L-node, may store either a single or a pair of values, depending on whether the total number of keys stored is odd or even. Therefore, an interval heap storing N values would have ceiling(N/2) nodes. 2) A different heap order is required: the interval at each node contains the intervals of its children.
In what follows, we give the definition of the interval heap and the algorithms (in pseudo code) of the five operations. Illustrative examples, in the form of CBT, are then given to help you to understand this data structure. We denote the interval specified by a node v as I(v).
Definition.
An interval heap is a heap structure that is either empty or satisfied the following three conditions (heap properties):
What we can see immediately from the definition is that, if I(root) = [a, b], then all values currently stored in the heap are within this interval, and a is thus the Min, b the Max among these values. Therefore, findMin and findMax would have O(1) time performance.
Note: as with binary heap, duplicate values are allowed for interval heaps.
Operations.
The pseudo code for the algorithm of each of the five operations is given next. If node v with I(v) = [a, b], we call a the lower endpoint and b the higher endpoint of the interval. Operations for interval heaps are similar to those for binary heaps, except 1) you need to determine which of the two endpoints is to be swapped in each step of percolateUp and percolateDown, and 2) special care must be given to the L-node since it may store either a single or a pair of values.
findMin:
if the heap is empty then throw exception;
if root is the L-node and contains a single key a, return a;
else return a where a is the lower endpoint of root.
findMax( ): Analogous to findMin.
deleteMin( ):
if the heap is empty then throw exception;
if root is the L-node {
if it contains a single key a, delete a and remove L-node; /* heap becomes empty */
else delete a where a is the lower endpoint of root; /* L-node now contains a single key */
}
else /* root contains two keys: I(root) = [a, b] */
{
remove a and replace it with a hole;
percolateDown: contunue to swap the hole with the smaller of the two lower endpoints of its two children until a leaf v is reached;
if v is not L-node, swap hole with any value in L-node and order the two values now in v ;
/* now the hole is in the L-node */
if the L-node now contains no value, remove it from the heap;
}
Note: percolateDown outlined here is slightly different from the author’s code for binary heaps. Here, the hole holds no value, but for binary heap, the hole is initialized with the value of the last node, and the percolateDown may stop before a leaf is reached. You may implement deleteMin by either following this algorithm or modifying the author's code. However, be aware the complications when directly modifying the author’s code (the value percolated down may become the lower endpoint in some nodes, and higher endpoint in others).
deleteMax( ): Analogous to deleteMin. PercolateDown now is with respect to the higher endpoint in each child node.
insert(x):
if the heap is empty
create a new node containing a single value x and insert this node into heap as its L-node;
else {
/* step 1: modify or create a new L-node */
if L-node contain a single key y
{insert x into L-node, the new L-node then has interval [x, y] if x <= y, or [y, x], other wise;}
else
{create a new node containing a single value x and insert this node into heap as its new L-node;}
/* step 2: percolateUp x */
while x is not within [a, b], the interval of its parent, and x is not at root
{if x < a, swap x with a; else swap x with b;}
}
Note: Although you are not asked to implement the method to heapify an array of random intervals, it can be done by modifying the buildHeap() method for binary heaps.
Examples illustrate how these operations work are given below.
insert: successively insert the following keys 10, 50, 70, 30, 5, 60, 20, 80 into a previously empty interval heap.
(1) [10] (2) [10, 50] (3) [10, 50] [10, 70]
/ => /
[70] [50]
(new L-node created) (swap 70 and 50)
(4) [10, 70] (5) [10, 70] [5, 70]
/ / \ ==> / \
[30, 50] [30, 50] [5] [30, 50] [10]
(L-node modified) (swap 5 and 10)
(6) [5, 70] (7) [5, 70] [5, 70]
/ \ / \ / \
[30, 50] [10, 60] [30, 50] [10, 60] ==> [20, 50] [10, 60]
/ /
[20] [30]
(swap 20 and 30)
(8) [5, 70] [5, 80]
/ \ / \
[20, 50] [10, 60] [20, 70] [10, 60]
/ /
[30, 80] [30, 50]
(swap 80 first with 50, then with 70)
findMin: returns 5.
findmax: returns 80.
printHeap: [5, 80]
[20, 70] [10, 60]
[30, 50].
deleteMin:
[__, 80] [10, 80] [10, 80]
/ \ / \ / \
[20, 70] [10, 60] [20, 70] [__, 60] ] [20, 70] [30, 60]
/ / /
[30, 50] [30, 50] [50]
a hole created at root hole is percolated down hole is filled by 30 from
after 5 is removed L-node
deleteMax:
[10, __] [10, 70] [10, 70]
/ \ / \ / \
[20, 70] [30, 60] [20, __] [30, 60] [20, 50] [30, 60]
/ /
[50] [50]
a hole created at root 50 is percolated down hole is filled by 50 from L-node,
after 80 is removed the old L-node is removed from the heap.
findMin()
findMax()
deleteMin()
deleteMax()
insert(x)
printHeap()
You may implement these functions according to the algorithms outlined earlier in this description or with your own algorithms. In any case, the heap order must be maintained.
/afs/umbc.edu/users/y/p/ypeng/pub/CMSC341/Proj5/341-Spring04-p5_questions.txt
These questions are worth 10% of the project grade.
Each line in the command file gives one command together with its argument(s). The commands you are ask to execute and their formats are listed below:
· INSERT X – Insert the integer X into the heap. All integers inserted will be between 0 and 99, inclusive.
· FINDMIN – Find and print the minimum value in the heap.
· FINDMAX – Find and print the maximum value in the heap.
· DELETEMIN – remove the minimum value from the heap.
· DELETEMAX – remove the maximum value from the heap.
· PRINT – Print the heap in level-order of its CBT. Your print must start with a new line with each level, and print 8 nodes/intervals per line if there are more than 8 nodes at a given level. (See sample output below for the format of our output.)
Commands in the command file are case sensitive (they should all be in upper case). Your output should echo each command read from the command file before printing whatever is required by the command.
The command file may look like this (following the examples given earlier):
PRINT
INSERT 10
INSERT 50
INSERT 70
INSERT 30
INSERT 5
INSERT 60
INSERT 20
INSERT 80
PRINT
FINDMIN
FINDMAX
DELETEMIN
PRINT
DELETEMAX
PRINT
The output for this example command file should look like this:
PROJECT 5:
INTERVAL HEAPS
COMMAND: PRINT
The
heap is empty
COMMAND: INSERT 10
COMMAND: INSERT 50
COMMAND: INSERT 70
COMMAND: INSERT 30
COMMAND: INSERT 5
COMMAND: INSERT 60
COMMAND: INSERT 20
COMMAND: INSERT 80
COMMAND: PRINT
The
current heap content:
[5,
80]
[20, 70] [10, 60]
[30, 50]
COMMAND: FINDMIN
The
minimum value in the heap is 5.
COMMAND: FINDMAX
The
maximum value in the heap is 80.
COMMAND: DELETEMIN
COMMAND: PRINT
The
current heap content:
[10, 80]
[20, 70] [30, 60]
[50]
COMMAND: DELETEMAX
COMMAND: PRINT
The
current heap content:
[10, 70]
[20, 50] [30, 60]
1. Your code must follow all project guidelines established for this course. See the project organization and project policies web pages. You are not required to modify the author's code to follow these guidelines.
2. You are free to name your main program file, but your executable must be named Proj5.
3. Your code must handle (try/throw/catch) any exceptions thrown by the author's code or by your own classes.
Submit any files which you have written and modified:
· file that contains your main( );
· files for your IntervalHeap class;
· any other files you write for this project;
· your makefile -- which must be named makefile or Makefile;
· the question file with your answers inserted.
Submit
the files using the procedure given to you for your section of the
course.
If your makefile is set up correctly, you should be able to execute the
command
make submit.
There are a number of tools available to you to check on your submittal. It is your responsibility to ensure that the submittal is correct and will result in a successful compilation of your project. Do not wait till the last minute to submit your files. Give yourself enough time to check that your submittal is correct.
If you don't submit a project correctly, you will not get credit for it. Why throw away all that hard work you did to write the project? Check . Make sure they work. Do this before the due date.
Documentation
for the submit program is on the web at http://www.gl.umbc.edu/submit/
. One of the tools provided by the submit program is
submitls. It lists the
names of
the files you have submitted.
Additionally,
there are two programs for use only by CMSC-341 students (not part of
the UCS
submit program). They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/
and are named submitmake
and submitrun.
You can
use these programs to make or run your submitted projects.
The syntax is similar to that for submit:
submitmake <class> <project>
Example: submitmake cs341 Proj4
This
makes the project, and shows you the report from the make utility. It
cleans up
the directory after making the project (removes .o and ii_files), but
leaves
the
executable in place.
submitrun <class> <project> [command-line args]
Example: submitrun cs341 Proj5 test.txt
This runs the project, assuming there is an executable (i.e. submitmake was run successfully).
Your
project will be tested using a variety of command lines, some of which
will be
invalid.
Your project will also be tested using a variety of command files which
will
test various conditions which your code should handle.
Project
grading is described in the Project
Policy handout.
Cheating in any form will not be tolerated. Please re-read the Project
Policy handout for further details on honesty in doing projects for
this
course.
Remember,
the due date is firm. Submittals made after