CMSC-341 Fall 2002
Project 4
Assigned |
06 Nov 2002 |
|
Due |
20 Nov 2002 by 11:59PM |
|
Updates |
11 Nov 2002 - The function prototypes for find(), increaseKey(),
and decreaseKey() had the '&' in the wrong place. They have been
fixed. |
|
09 Nov 2002 - The broken questions and sample output links now
work. |
|
07 Nov 2002 - The description says there are CPUTimer methods
named last() and total(). They are actually named getLast() and
getTotal(). |
|
07 Nov 2002 - Some students are getting errors about arguments
being re-ordered in the BinaryHeap constructor. Take a new copy of
BinaryHeap.C and the error will go away. |
Background
Priority queues are an important data structure, with applications in areas
such as job scheduling for printers and CPUs. Array-based binary heaps are
one very efficient implementation of priority queues, with O(1) time
findMin() and O(lg n) time insert() and deleteMin(). Because the branching
factor (the number of children a node can have) in a binary heap is 2, the
base of the logarithm in O(lg n) is 2 as well. In this project you will
modify the author's binary heap code to implement a d-ary heap, where d is
the branching factor of the heap.
Why might it be a good idea to use d-ary heaps rather than binary heaps?
The height of a complete binary tree containing 10,000 items is 13, meaning
that if the items are stored in a binary heap, insertions and deletions
might take up to 13 swaps. If the same items are stored in a heap with
branching factor 16, no more than 3 swaps will be required. The
disadvantage of having 16 children for one node is that, for example, when
percolating down you have to find the smallest of the children, and the
cost of this operation grows linearly with the number of children.
In addition to implementing d-ary heaps, you will measure the CPU time
required to perform sequences of various operations in an attempt to get an
empirical view into how the cost of these operations is affected by the
branching factor of the heap.
NOTE: The questions for this project require you to run your program on a
particular data file using heaps with different branching factors. Please
take this into account and allocate enough time to generate and analyze the
results.
Description
Here are your tasks:
- Read and understand the description of the BinaryHeap ADT in the text.
- Copy the binary heap files (BinaryHeap.C, BinaryHeap.H, and
dsexceptions.H) from Dr. Oates' public directory
/afs/umbc.edu/users/o/a/oates/pub/CMSC341
to your directory.
- Read through CPUTimer.C and CPUTimer.H Dr. Oates' public directory
/afs/umbc.edu/users/o/a/oates/pub/CMSC341
. Your makefile should
reference these files rather than a local copy in your directory.
- Using the BinaryHeap code as a starting point, implement a d-aryHeap
class.
- Read and understand the description of the STL version of the
vector ADT at the following URL:
http://www.msoe.edu/eecs/cese/resources/stl/vector.htm
- Modify your d-ary heap code so that it dynamically adjusts its size as
new items are added and never throws an Overflow exception.
- Design and implement Proj4.C which contains main( ) and is the driver
program for this project.
- Implement a makefile for this project.
- Answer the questions posed in 341-Fall02-proj4_questions.txt. Copy the
file
/afs/umbc.edu/users/o/a/oates/pub/CMSC341/Proj4/341-Fall02-p4_questions.txt
to your own directory and edit it to provide your answers to the questions.
Don't forget to submit the edited file; it is 10% of this project's grade.
Definition of the ADT
All classes that you design and implement must support the "Big-4", even if
the default behavior provided by the compiler would be acceptable.
- Default constructor
- Destructor
- Copy Constructor
- Assignment Operator
As always, you are free to add whatever private data and methods you
desire, and you must provide the public methods listed in this project
description. If you need to add any additional public methods, think very
carefully about it and provide ample justification for why the method must
be public in comments in your code. Points will be deducted for public
methods that are not well justified.
The d-aryHeap
Rather than having at most two children, nodes in a d-ary heap can have at
most d children, where d is a parameter to the constructor.
Therefore, you will need to implement a d-ary heap constructor whose
prototype is given below:
- explicit d-aryHeap(int d = 2, int capacity = 100) --
Note that the arity of the heap is an argument to the constructor, and the
value defaults to 2, i.e. to a binary heap. The second argument is the
initial capacity of the heap.
Clearly, each heap will need a private data member to record the value of d
passed into the constructor. You will need to modify the author's binary
heap code to allow nodes to have more than 2 children. For a d-ary heap,
the children of the node at array index I (when the root is at index 1) are
at indices:
d(I - 1) + 2
d(I - 1) + 3
...
d(I - 1) + d
d(I - 1) + d + 1
Also, the parent of the node at array index I is at:
floor((I - 2) / d) + 1
You will also need to implement the following methods:
- int find(const Comparable & x) const --
This method returns the index of x in the d-ary heap if the heap
contains x, otherwise it returns -1.
- void increaseKey(int i, const Comparable & delta) --
This method increases the key at array index i in the d-ary heap by
the amount delta. It must be the case that operator+ is
defined for objects stored in the heap so that you can modify the value
stored at index i by simply adding (using the + operator)
delta to it. After changing the value at index i, you will
need to percolate it down.
- void decreaseKey(int i, const Comparable & delta) --
This method decreases the key at array index i in the d-ary heap by
the amount delta. It must be the case that operator- is
defined for objects stored in the heap so that you can modify the value
stored at index i by simply subtracting (using the - operator)
delta from it. After changing the value at index i, you will
need to percolate it up.
Because we want to explore the asyptotic performance of d-ary heap
operations, we will need to build heaps with large numbers of elements.
The author's code requires knowledge of the maximum size of the heap when
it is created. You must alter the code so that the vector is dynamically
enlarged as needed when items are created. The key to this will be to use
the push_back() method to add items to the vector when it is full.
Be sure to use operator[] to add items when the vector is not full.
The CPUTimer
To measure the impact of d on the performance of d-ary heaps, you
will use the CPUTimer class. This class is provided for you and must be
used without modification. Objects of type CPUTimer can be used to record
the amount of CPU time used by your program. The public interface is as
follows.
- CPUTimer() -- Construct a CPUTimer object.
- void start() -- Start recording CPU usage.
- void stop() -- Stop recording CPU uage.
- int getLast() --
Return the number of micro-seconds of CPU time used by the program between
the last pair of start/stop calls for the timer. If the timer has been
started but not stopped, a call to stop() is made, the CPU usage is
computed, and the timer is restarted.
- int getTotal() --
Return the total number of micro-seconds of CPU time used by the process
between pairs of start/stop calls for the timer. If the timer has been
started but not stopped, a call to stop() is made, the CPU usage is
computed, and the timer is restarted.
- void reset() --
Reset the totals maintained by the timer to zero.
You must declare and start a CPUTimer before performing any d-aryHeap
operations. For example:
void main(int argc, char **argv) {
CPUTimer ct;
ct.start();
...
ct.stop();
exit(0);
}
The Command Line
Project 4 will be invoked with a command line that consists of two
arguments - the arity of the heap (which must be an integer greater than or
equal to 2) and the name of the data file to read.
As usual, you should check the validity of all command line arguments.
After validating the command line, you should create a single d-aryHeap
with the specified arity, and then perform all of the operations given in
the data file on that heap.
The Data File
IMPORTANT: DO NOT ECHO COMMANDS READ FROM THE DATA FILE. We will
create test files with large numbers of commands (i.e. tens of thousands),
and if you echo these commands the output of the grading scripts will be
much too large. That's also the reason that command names (see below) are
1 or 2 characters long. We don't want to waste space in these files with
long command names.
Commands in the file specify operations to be performed on the d-ary heap.
Each line in the file represents one command. Blank lines may appear
anywhere in the file and should be ignored. Otherwise, you can assume that
any line containing a command is syntactically well-formed. We make this
assumption so you don't have to spend lots of time making the command file
parser bullet proof.
The command file format follows:
- I <value> --
Insert the value into the heap. Again, your heap should never
overflow.
- DM --
Perform a deleteMin() on the heap. Do not print the value. If the heap is
empty, do not report an error.
- IK <value> <delta> --
If the value exists in the heap, increase it by delta. If
the value does not exist, do nothing.
- DK <value> <delta> --
If the value exists in the heap, decrease it by delta. If
the value does not exist, do nothing.
- PM --
Print the minimum element in the heap. Do not remove it.
- PT --
Print the total CPU time used by your program. This will require a call to
CPUTimer::total().
Sample Output
Sample output is available for your study at
/afs/umbc.edu/users/o/a/oates/pub/CMSC341/Proj4/341-Fall02-p4-sample_output.txt
Project Submission
Submit all files required to build your project.
Submit Tools
There are a number of tools available to you to check on
your submittal. It is your responsibility to ensure that the submittal is
correct and will result in a successful compilation of your project. Do not
wait till the last minute to submit your files. Give yourself enough time
to check that your submittal is correct.
If you don't submit a project correctly, you will not get credit for
it. Why throw away all that hard work you did to write the project? Check
your submittals. Make sure they work. Do this before the due date.
Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/.
One of the tools provided by the submit program is submitls. It
lists the names of the files you have submitted.
Additionally, there are two programs for use only by CMSC-341 students
(not part of the UCS submit program). They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/ and are
named submitmake and submitrun. You can use
these programs to make or run your submitted projects.
The syntax is similar to that for submit:
submitmake <class> <project>
Example: submitmake cs341 Proj4
This makes the project, and shows you the report from the make utility.
It cleans up the directory after making the project (removes .o files), but
leaves the executable in place.
submitrun <class> <project> [command-line args]
Example: submitrun cs341 Proj4 test.dat
This runs the project, assuming there is an executable (i.e. submitmake
was run successfully).
Grading and Academic Integrity
Your project will be tested using a variety of command lines, some of which
will be invalid.
Your project will also be tested using a variety of command files which
will test various conditions which your code should handle.
Project grading is described in the Project
Policy handout.
Your answers to 341-Fall02-proj4_questions.txt
are worth 10% of your project grade.
Cheating in any form will not be tolerated. Please re-read the Project
Policy handout for further details on honesty in doing projects for
this course.
Remember, the due date is firm. Submittals made after midnight of the
due date will not be accepted. Do not submit any files after that time.