CMSC 341 Spring 2007
Project 4
Assigned |
Wed 18 April 2007 |
Due |
Tues 01 May 2007 by 11:59PM |
Update
| 27 Apr
The original prototype for the Print( ) method was incorrect.
Since the ostream parameter had a default value (cout), it must
be the 2nd parameter. The prototype has been changed.
21 Apr
The CPUTimer class has been modified to report only USER cpu time
and not both USER and SYSTEM cpu time. This seems to give more consistent
results.
Bigger files are now available for testing....
p4.1000000.cmds and p4.500000.cmds
contain 1,000,000 and 500,000 insertions and a single PT command respectively.
Since the output of the CPU timer varies, we suggest running your
program "many" (e.g. ~20) times and using the average CPU time
when answering the project questions.
|
Background
Priority queues are an important data structure, with applications in areas
such as job scheduling for printers and CPUs. Array-based binary heaps are
one very efficient implementation of priority queues, with O(1) time
for findMin( ) and O(lg n) time insert( ) and deleteMin( ). Because the branching
factor (the number of children a node can have) in a binary heap is 2, the
base of the logarithm in O(lg n) is 2 as well. In this project you will
modify the author's binary heap code to implement a k-ary heap, where k is
the branching factor of the heap.
Why might it be a good idea to use k-ary heaps rather than binary heaps?
The height of a complete binary tree containing 10,000 items is 13, meaning
that if the items are stored in a binary heap, insertions and deletions
might take up to 13 swaps. If the same items are stored in a heap with
branching factor 16, no more than 3 swaps will be required. The
disadvantage of having 16 children for one node is that, for example, when
percolating down you have to find the smallest of the children, and the
cost of this operation grows linearly with the number of children.
In addition to implementing k-ary heaps, you will measure the CPU time
required to perform sequences of various operations in an attempt to get an
empirical view into how the cost of these operations is affected by the
branching factor of the heap.
NOTE: The questions for this project require you to run your program on a
particular data file using heaps with different branching factors. Please
take this into account and allocate enough time to generate and analyze the
results.
Mr. Frey's public directory for this project is
/afs/umbc.edu/users/f/r/frey/pub/CMSC341/Proj4
Description
Here are your tasks:
- Read and understand the description of the BinaryHeap ADT in the text.
- Copy the binary heap file (BinaryHeap.h) from Mr. Frey's public directory
to your directory.
- Read through CPUTimer.h and CPUTimer.cpp in Mr. Frey's public directory.
Your makefile should reference these files. DO NOT submit these files.
The grading scripts will remove these files if present.
- Using the BinaryHeap code as a starting point, implement a KaryHeap
class template.
- Design and implement Proj4.cpp which contains main( ) and is the driver
program for this project.
- Implement a makefile for this project.
- Answer the questions posed in 341-spring07-proj4_questions.txt. Copy the
file 341-spring07-p4_questions.txt
from Mr. Frey's public directory
to your own directory and edit it to provide your answers to the questions.
Don't forget to submit the edited file; it is 20% of this project's grade.
Run your project using the command file named p4.commands.huge in Mr. Frey's public directory
to gather the data necessary to answer these questions.
Submit your output from using p4.commands.huge per the directions in the
questions file.
Definition of the KaryHeap ADT
Rather than having at most two children, nodes in a k-ary heap can have at
most k children, where k is a parameter to the constructor.
Clearly, each heap will need a private data member to record the value of k
passed into the constructor. You will need to modify the author's binary
heap code to allow nodes to have more than 2 children. Also, code which calculates and uses the index of a node's parent and children will require modification.
For a given "arity" of k, the indices of the children of the node
at index I (when the root is at index 1) are
k(I - 1) + 2, k(I - 1) + 3, ...., k(I - 1) + k + 1
The index of the parent of the node at array index I (with I > 1) is
floor( (I - 2) / k ) + 1
You will also need to implement the following methods:
- explicit KaryHeap(int k = 2, int capacity = 100)
Note that the "arity" of the heap is an argument to the constructor, and the
value defaults to 2, i.e. to a binary heap. The second argument is the
initial capacity of the heap.
- int Find(const Comparable & x) const
This method returns the index of x in the k-ary heap if the heap
contains x, otherwise it returns -1.
- void IncreaseKey(int i, const Comparable & delta)
This method increases the key at array index i in the k-ary heap by
the amount delta.
It must be the case that operator+ is
defined for objects stored in the heap so that you can modify the value
stored at index i by simply adding (using the + operator)
delta to it. After changing the key at index i, you will
need to percolate it down to re-establish the heap property.
- void DecreaseKey(int i, const Comparable & delta)
This method decreases the key at array index i in the k-ary heap by
the amount delta.
It must be the case that operator- is
defined for objects stored in the heap so that you can modify the value
stored at index i by simply subtracting (using the - operator)
delta from it. After changing the value at index i, you will
need to percolate it up to re-establish the heap property.
- void Print( int nrlevels, ostream& out = cout) const
This method prints the KaryHeap in level order in the format required
by the PH command in the data file.
As always, you are free to add whatever private data and methods you
desire, and you must provide the public methods listed in this project
description. If you need to add any additional public methods, think very
carefully about it and provide ample justification for why the method must
be public in a README file that must be submitted.
Points will be deducted for public methods that are not well justified.
The CPUTimer
To measure the impact of k on the performance of k-ary heaps, you
will use the CPUTimer class. This class is provided for you and must be
used without modification. Objects of type CPUTimer can be used to record
the amount of CPU time used by your program. The public interface is defined below
and in CPUTimer.h
- CPUTimer( ) -- Construct a CPUTimer object.
- void Start( ) -- Start recording CPU usage.
- void Stop( ) -- Stop recording CPU usage.
- int GetLast( ) --
Return the number of micro-seconds of CPU time used by the program between
the last pair of start/stop calls for the timer. If the timer is running
a call to Stop() is made, the CPU usage is
computed, and the timer is restarted via a call to Start( ).
- int GetTotal( ) --
Return the total number of micro-seconds of CPU time used by the process
between all pairs of start/stop calls for the timer since the timer
was instantiated or the last Reset( ). If the timer is running
a call to Stop() is made, the CPU usage is
computed, and the timer is restarted via a call to Start( ).
- void Reset( ) --
Stops the timer and resets the totals maintained by the timer to zero.
You must use the timer to accumlate the total CPU time used
for all KaryHeap operation. That is, Start() the timer before
each KaryHeap operation and Stop( ) the timer immediately after
the operation completes. For example,
int main(int argc, char **argv)
{
KaryHeap<int> kheap( 6, 20 );
CPUTimer timer;
timer.Start();
kheap.insert( 42);
timer.Stop();
// .....
timer.Start( );
kheap.IncreaseKey( 3, 20 );
timer.Stop( );
// .....
cout << "Total CPU Time = " << CPUTimer.Total( ) << endl;
exit( 0 );
}
The Command Line
Project 4 will be invoked with a command line that consists of two
arguments - k, the "arity" of the heap (which must be an integer greater than or
equal to 2) and the name of the data file to read.
As usual, you should check the validity of all command line arguments.
After validating the command line, you should create a single KaryHeap
with the specified arity, and then perform all of the operations given in
the data file on that heap.
The Data File
IMPORTANT: DO NOT ECHO COMMANDS READ FROM THE DATA FILE.
"Small" test files will be used to test the functionality of
your KaryHeap. However, you will use a test file with very large numbers of
commands (i.e. tens of thousands) to gather data for answering the project
questions and if you echo these commands the output of the grading scripts will be
much too large. That's also the reason that command names (see below) are
1 or 2 characters long. We don't want to waste space in these files with
long command names.
Commands in the file specify operations to be performed on the k-ary heap.
Each line in the file represents one command. Blank lines may appear
anywhere in the file and should be ignored. Lines which begin with
'#' are comments and should also be ignored. Otherwise, you can assume that
any line containing a command is syntactically well-formed. We make this
assumption so you don't have to spend lots of time making the command file
parser bullet proof.
The command file format follows:
- I <value>
Insert the value into the heap. Again, your heap should never
overflow.
- DM
Perform a deleteMin() on the heap. Do not print the value. If the heap is
empty, do not report an error.
- IK <value> <delta>
If the value exists in the heap, increase it by delta. If
the value does not exist, do nothing.
- DK <value> <delta>
If the value exists in the heap, decrease it by delta. If
the value does not exist, do nothing.
- PM
Print the minimum element in the heap. Do not remove it.
- PT
Print the total CPU time (in milliseconds with 3 decimal places)
used by your program executing KaryHeap operations.
- PH <NrLevels>
Print NrLevels of your KaryHeap using a level-order traversal.
As in project 2, your output must indicate which level is being printed.
In addition the value of a node's parent must be printed for every node output.
Use -1 for the value of the "parent" of the root. For example, the first 3 levels
of a 3-ary heap might be printed as shown below.
Level 0: (2, -1)
Level 1: (4, 2), (5, 2), (6, 2)
Level 2: (7, 4), (10, 4), (12, 4)
(13,5), (17, 5), (20, 5)
(11, 6), (16, 6), (9, 6)
Sample Output
Sample output is available for your study in
341-spring07-p4-sample_output.txt
in Mr. Frey's public directory.
Project Submission
Submit all files required to build your project except for CPUTimer.h and CPUTimer.cpp
which must be referenced by your makefile. These files will be deleted from
our submittal directory.
Submit Tools
There are a number of tools available to you to check on
your submittal. It is your responsibility to ensure that the submittal is
correct and will result in a successful compilation of your project. Do not
wait till the last minute to submit your files. Give yourself enough time
to check that your submittal is correct.
If you don't submit a project correctly, you will not get credit for
it. Why throw away all that hard work you did to write the project? Check
your submittals. Make sure they work. Do this before the due date.
Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/.
One of the tools provided by the submit program is submitls. It
lists the names of the files you have submitted.
Additionally, there are two programs for use only by CMSC-341 students
(not part of the UCS submit program). They are in the directory
/afs/umbc.edu/users/f/r/frey/pub/CMSC341/ and are
named submitmake and submitrun. You can use
these programs to make or run your submitted projects.
The syntax is similar to that for submit:
submitmake <class> <project>
Example: submitmake cs341 Proj4
This makes the project, and shows you the report from the make utility.
It cleans up the directory after making the project (removes .o files), but
leaves the executable in place.
submitrun <class> <project> [command-line args]
Example: submitrun cs341 Proj4 test.dat
This runs the project, assuming there is an executable (i.e. submitmake
was run successfully) and that test.dat is in your local directory.
Grading and Academic Integrity
Your project will be tested using a variety of command lines, some of which
will be invalid.
Your project will also be tested using a variety of command files which
will test various conditions which your code should handle.
Project grading is described in the Project
Policy handout.
Your answers to 341-Spring07-proj4_questions.txt
are worth 10% of your project grade.
Cheating in any form will not be tolerated. Please re-read the
Project
Policy handout for further details on honesty in doing projects for
this course.
Remember, the due date is firm. Submittals made after midnight of the
due date will not be accepted. Do not submit any files after that time.