CMSC-341 Spring 2003
Project 3
Assigned |
Mar 9th, 2003 |
|
Due |
Sunday Mar 23, 2003 by 11:59PM |
|
Updates |
March 11, 2003, 12:45pm
You may assume the data files contain non-negative integers
March 10, 2003, 10:20am
The initial posting on Saturday had the incorrect
prototype for the K-ary tree node constructor. The
correct prototype as specified in this document is
KaryTreeNode (const Object& data, int K);
where K specifies the number of children (K) in each node.
|
In this course, we discuss the asymptotic performance of many
data structures and often give proofs or convincing arguments
that validate the claims about performance. In this project, you
will conduct an experiment to gather empirical evidence that the
actual performance of Binary Search Trees (BSTs) and K-ary Trees
(KTs) matches the theoretical performance.
Description
In this project, you will read random integers from a file and insert
them into a modfied version of the author's BST. After every M insertions,
you will display information about the tree, as described below and
seen in the sample output.
You will then insert those same values into a K-ary tree with K children
per node and print the same information about the tree.
The project questions will ask you to analyze the data.
A summary of things to do
- Copy the author's BST code (BinarySearchTree.H/C) from Mr. Frey's
public directory to your directory.
- Modify the author's BST code in the following ways
- Modify insert() to allow duplicates. Insert duplicates
on the left.
- Add a new public method: int CountLeaves ( void ) const;
that returns the number of leaves in the tree.
- Add a new public method: int CountNodes ( void ) const;
that returns the number of nodes in the tree
- Add a new public method: int Height ( void ) const;
that returns the height of the tree.
- Implement a K-ary tree node class template.
- See class lecture notes for ideas on this class.
- The only method is the private constructor
KaryTreeNode (const Object& data, int K); where K specifies the
number of children (K) in each node.
- The K-ary tree class is a friend of the K-ary tree node class.
- Implement a minimal K-ary tree class template.
(see class lecture notes for the algorithm for find( ) which will
give you an idea about how to implement the methods below.
You must implement the following public methods for the K-ary tree
- KaryTree (int K); -- a constructor that specifies
the number of children (K) in each node of the K-ary tree.
- ~KaryTree( void ) -- the destructor; do this last
- void Insert (const Object& data) -- that inserts the data
into the tree. Attach the new node to a randomly chosen child. You may
assume that insertion never fails.
- int CountLeaves ( void ) const;
that returns the number of leaves in the tree.
- int CountNodes ( void ) const;
that returns the number of nodes in the tree
- int Height ( void ) const;
that returns the height of the tree.
- Implement the application in Proj3.C
- Create a makefile for the project
- Copy the question file from Mr. Frey's public directory and edit
it with your answers.
The questions for this project are worth 30% of you grade.
- Submit your project
- Use submitmake and submitrun to verify your submittal
Project 3 will be invoked with three (3) command line arguments in this
order
- The file of integers to read and insert
- K - the number of children in each K-ary tree node
- M - the reporting interval; display one line of data after every M insertions
The Data File
The data file will contain random non-negative integers separated by whitespace.
Data files are available from Mr. Frey's public directory.
File names are of the form N.dat, where N is an integer indicating
the number of values in the file. E.g 2000.dat contains 2000 integers.
Sample Output
After every M insertions (see The Command Line above),
your program will display one line of data. Each line of data will consist
of the following information, in the order specified below. All data
must be neatly formatted in columns as shown below. Decimal output
(the last 4 columns) must be printed with 4 decimal places.
- The number of nodes (N) in the tree
- The number of leaves (L) in the tree
- The height (H) of the tree
- The number of leaves / log (number of nodes) -- L / lgN
For BSTs, use log base 2; for K-ary trees, use log base K
- The number of leaves / number of nodes -- L / N
- The height / log (number of node) -- H / lgN
For BSTs, use log base 2; for K-ary trees, use log base K
- The height / number of nodes -- H / N
The following output was produced by inserting the file 100.dat
into a BST and 3-ary tree with reporting after every 10 insertions. Use it as an
example of formatting. Many more values are required before patterns
can bee seen in the data.
linux3[82]% Proj3 100.dat 3 10
BST Data from file 100.dat
Nodes Leaves Height L / lgN L / N H / lgN H / N
----- ------ ------ ------- ------- ------- ------
10 4 4 1.2041 0.4000 1.2041 0.4000
20 6 5 1.3883 0.3000 1.1569 0.2500
30 11 7 2.2417 0.3667 1.4266 0.2333
40 16 8 3.0064 0.4000 1.5032 0.2000
50 16 9 2.8349 0.3200 1.5947 0.1800
60 21 9 3.5552 0.3500 1.5236 0.1500
70 25 9 4.0788 0.3571 1.4684 0.1286
80 28 9 4.4290 0.3500 1.4236 0.1125
90 33 10 5.0833 0.3667 1.5404 0.1111
100 34 11 5.1175 0.3400 1.6557 0.1100
K-ary Data from file 100.dat with K = 3
Nodes Leaves Height L / lgN L / N H / lgN H / N
----- ------ ------ ------- ------- ------- ------
10 5 3 2.3856 0.5000 1.4314 0.3000
20 10 4 3.6673 0.5000 1.4669 0.2000
30 16 4 5.1681 0.5333 1.2920 0.1333
40 21 5 6.2542 0.5250 1.4891 0.1250
50 26 5 7.3016 0.5200 1.4041 0.1000
60 29 5 7.7814 0.4833 1.3416 0.0833
70 34 5 8.7920 0.4857 1.2929 0.0714
80 38 5 9.5269 0.4750 1.2535 0.0625
90 43 5 10.4983 0.4778 1.2207 0.0556
100 48 6 11.4509 0.4800 1.4314 0.0600
Hints, Notes and Suggestions
These hints, notes and suggestions are simply things we thought of
as we designed and implemented the project. Depending on your
approach to the project, not all of these may apply to you.
- Mr. Frey's public directory for this project is
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/Proj3
- Input files and output files are located in Mr. Frey's public directory.
Input files are named N.dat where N is the number of integers in the file. E.g. 1000.dat
Output files are named N.K.M.dat where N is the
number of integers inserted, K is number of children in each K-ary node and
M is the reporting interval. E.g. 1000.5.50.dat
- In some cases, smaller reporting intervals (which means more data) will
help you see the pattern in the K-ary tree data better.
- In some cases, running your program on the same input file with different
values of K will help you see the pattern in the K-ary tree data better.
- Use the random number generator (RNG) random( ) when
randomly choosing one of the K children for K-ary tree insertion.
DO NOT seed the RNG using srandom().
NOTE:If you are not developing on linux.gl.umbc.edu, your K-ary data will almost
certainly NOT match the data provided since you will undoubtedly be using a different
RNG. The patterns in the data should be the same nonetheless.
- For initial testing, consider writing a Print() method for the trees.
On each line of output, display a node's value and the values stored in
each of its children. If you insert known values (rather than random ones),
you can verify that CountLeaves(), CountNodes() and Height()
are working properly.
The Print() method performs a level-order traversal
which requires using a Queue to hold tree node pointers.
Here's a print from a BST
50: 40, 60
40: null, null,
60: null, 70
70: null, 80
80: null, null,
and one from a K-ary tree with K=3
50: null, 40, null,
40: 60, 80, 70,
60: null, null, null,
80: null, null, null,
70: null, null, null,
- One way to read a file from beginning to end twice is to close() the file
and then open() it again. It is necessary to clear() the file between the
close() and the open().
- The C++ math library (#include <cmath>) provides the function
log10(double x)
that returns the log of X, base 10. To compute the log of X, base K, note that
log X base K = log X base 10 / log K base 10.
- You can redirect your program's output to a file using Unix file redirection,
E.g Proj3 100.dat 5 10 > 100.5.10.out
redirects Proj3's output to the file named 100.5.10.out
- Think recursively.
- Your program should be able to handle files of 200000 integers. The processing
takes a couple of minutes.
Project Submission
Because you are submitting your own makefile, you are free to name your
files whatever you like, with the following restrictions:
- All class definitions appear in .H files. Header files may NOT contain
any code.
- All class implementations appear in .C files.
- The name of your executable MUST BE Proj3 (upper-case 'P'). The
submitrun program and the scripts that we use to grade your files assume
that you follow this convention for naming your executable.
Submit Tools
There are a number of tools available to you to check on
your submittal. It is your responsibility to ensure that the submittal is
correct and will result in a successful compilation of your project. Do not
wait till the last minute to submit your files. Give yourself enough time
to check that your submittal is correct.
If you don't submit a project correctly, you will not get credit for
it. Why throw away all that hard work you did to write the project? Check
your submittals. Make sure they work. Do this before the due date.
Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/.
One of the tools provided by the submit program is submitls. It
lists the names of the files you have submitted.
Additionally, there are two programs which are not part of the
UCS submit program. They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/ and are
named submitmake and submitrun. You can use
these programs to make or run your submitted projects.
To access these programs, do one of the following
- Create an alias in your .cshrc file (the preferred method)
alias submitmake /afs/umbc.edu/users/d/e/dennis/pub/CMSC341/submitmake
alias submitrun /afs/umbc.edu/users/d/e/dennis/pub/CMSC341/submitrun
- Type the full path name everytime you run them
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/submitmake cs341 Proj3
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/submitrun cs341 Proj3 txfile
- Copy them to your directory
You will have to give yourself permission to execute these programs
after you have copied them. This is accomplished using the chmod command
The syntax is similar to that for submit:
submitmake <class> <project>
Example: submitmake cs341 Proj3
This makes the project, and shows you the report from the make utility.
It cleans up the directory after making the project (removes .o files), but
leaves the executable in place.
submitrun <class> <project> [command-line args]
Example: submitrun cs341 Proj3 test.dat 10 2
This runs the project, assuming there is an executable (i.e. submitmake
was run successfully) and that test.dat is in your local (not submittal)
directory.
Your project will be tested using a variety of command lines, some of which
will be invalid.
Your project will also be tested using a variety of command files which
will test various conditions which your code should handle.
Project grading is described in the
Project Policy handout.
Your answers to
341-Spring03-proj3_questions.txt
are worth 30% of your project grade.
Cheating in any form will not be tolerated. Please re-read the
Project
Policy handout for further details on honesty in doing projects for
this course.
Remember, the due date is firm. Submittals made after 11:59pm of the
due date will not be accepted. Do not submit any files after that time.