CMSC 341 - Fall 2001
Project 5
Assigned |
Monday 26 November 2001 |
Due |
Sunday, 09 December 2001 |
Review Sessions |
Wednesday/Thursday 28/29 November 2001 |
Modifications |
27 November 2001 - to make parsing the file a little easier, there will be
no leading or trailing whitespace around a name.
-- i.e. entries will be of the form
(Bob Smith,Mary) and NOT ( Bob Smith, Mary )
26 November 2001 - questions posted
|
Clarifications
- About the auxiliary data structure...
- It's okay to use more than one data structure
- If you use one of the author's data structures, it's okay to modify
his interface slightly (e.g. add a new constructor) if it makes it much easier for you to use. Please alert the grader to any such changes you make.
- It's okay to use container classes from the STL (vector, list, etc), but NOT Map,
HashMap or other similar "mapping" containers.
- It is NOT okay to change the Disjoint Set interface in any way
- About problems with HashTable.C
The author's HashTable.C
file contained both template code (for the HashTable) and non-template
code (isPrime(), nextPrime(), and two hash functions ). Such a combination often leads to
linker errors when HashTable.C is included in more than one other .C file
(like Families.C and Proj5.C because they include HashTable.H). This is
not something that guarding
the header files can prevent (Be sure you know why.)
Soooo... I've
removed isPrime() and nextPrime() from HashTable.C and created Prime.C
and Prime.H which are in my public directory. The two hashing functions were also
removed and can be found in HashFuncs.C; prototypes in HashFuncs.H.
You should know how to use
them properly.
- More about the author's HashTable.....
Like the Disjoint set class, the HashTable class did not have a default constructor,
so it was not was not possible to use a default HashTable as a private data member of
another class. To solve this problem, I added a default constructor that sets size equal
to 101 and I added a method called setItemNotFound(). There should probably be a setSize()
function for completeness, but at 101 is sufficient for this project and at this late date
it seems unnecessary.
Background
Equivalence relations have a wide variety of applications. You
have discussed examples in class and will see more applications when discussing
graphs. One "real life" application is in the field of genealogy -- the
investigation into one's ancestors and relatives to discover the depth
and breadth of a family. In our project, everyone belongs to
one and only one family. This project will give you an opportunity
to use disjoint sets to represent families. This project will also
allow you some design leeway as you are required to choose an second data
structure to implement this project. Finally, you will get even more
practice getting input from a file and interpreting it's contents.
Description
In this project you will use disjoint sets to represent families. You
will create new families as people are born and combine families as relationships
among people are determined. You will read a file that describes the
birth of people and creates their relationships. See the
data file description
below for details. You will represent each family with a disjoint set
implemented as an up-tree. As each person is born or a relationship
is established, you will create or combine sets via the up-trees using path
compression and union-by-weight heuristics discussed in class. From
time to time you will also be required to print all current members of a
person's family, or to print the members of all families. There will
be no more than 100 people in all families combined. In the event that
an attempt is made to create a relationship with a nonexistent person, or
to print the family of a nonexistent person, your program should report the
error and continue to the next command in the file. See the section
on the UnBornPerson exception class
below.
Recall that up-trees are best represented using an array of integers.
Since the data in the file are people's names and not integers, you
will need a data structure which allows you to quickly find a person's name
and "converts" it to that person's corresponding index into the array of
up-trees. You are at liberty to choose any data structure we have
used in this class. The author's code for various data structures can
be found in Mr. Frey's public directory.
The Command Line
Your program will accept one command line argument -- the name of the
data file. If the name of the data file is provided, you may assume
the file is in the specified data file format
. If the name is not provided or the file cannot be opened, your program
will terminate in user-friendly manner.
Classes
Disjoint Set
The author's (modified) disjoint set code is available in Mr. Frey's
public directory. As usual, you may not add any public data methods
or public data members to the class, but you made add whatever private data
and/or methods you deem necessary. Recall that the disjoint set class
actually supports a forest of up-trees (i.e. multiple sets).
Usage Note
Note that it is not possible to use the author's DisjSet constructor
DisjSets(int numElements) inside a class definiton without declaring
a static definition of the DisjSets and initializing the DisjSets object
outside of the class. To avoid this problem, a default constructor
and a new method makeSets (int numElements) have been added to the
author's disjoint set class. These should be called from the constructor
of the class that uses the DisjSets class.
Auxiliary Data Structure
Because disjoint sets use integers to represent the members of each
set and the data in our file are people's names, you must have an auxiliary
data structure that allows you to "translate" a name to an integer through
some look-up or conversion mechanism You must also be able to add
new names to the data structure for future look-up or conversion. The
data structure you choose will have an impact on the big-Oh performance of
your program. Several data structures may be usable, but you must choose a
data structure that provides the necessary functionality and has the least
possible big-Oh impact on your program.
You will be required to defend your choice and analyze it's impact on performance.
Families
You will implement a new class named Families. You are
free to design this class and its interface as you see fit, but you must
use good OO design principles (i.e. DON'T create an accessor and a mutator
for every private data member. The user (i.e. main( ) )should need
no knowledge about how the class is implemented. The design of this
class will be part of your grade for this project). This class should
contain the disjoint set class and your auxiliary data structure. It
must support the following operations.
- Record the birth of a person as the beginning of a new family
- Record the addition of a child to a family
- Record the discovery of siblings (i.e. brothers and sisters)
- Record the marriage of two people (names do not change)
- Print all members of any person's family (family member names may
be printed in any order)
- Determine if two people are related
- Find and return the name of the head of a family
Any attempt to create a relationship with a nonexistent person or to
print a nonexistent person's family should throw an UnBornPerson exception.
UnBornPerson exception class
When any reference to a nonexistent person is attempted, your Families
class should throw an UnBornPerson exception object. The exception
mst be caught at the appropriate level which will report the type of relationship
that was attempted and the name of the nonexistent person. Your
program should continue to execute after the exception is handled.
You are at liberty to design and implement this class as you see fit,
except that you must implement the default constructor, the destructor,
the copy constructor and the assignment operator.
Tasks
- Read and understand the Disjoint Set ADT and the author's implementation
- Choose and implement an auxiliary data structure class as described
above.
- Design and implement the Families class described above.
- Design and implement the UnBornPerson exception class.
- Design and implement any other classes you feel are necessary. All
classes must support the big-4.
- Write Proj5.C that contains main( ). Main should read
data from the file and perform the appropriate operation and provide the
appropriate output.
- Copy the project questions file from Mr. Frey's public directory and answer the
questions. The answers to these questions are worth 20% of your project grade.
Data File
The data file consists of "commands" and command parameters in the format
specified below. There is one "command" per line. The file may contain
blank lines. Commands are case insensitive (i.e. commands may be in
upper-case, lower-case or mixed cases). Command parameter(s) are peoples
names which are enclosed in parentheses and separated by commas (if more
than one name is required). Names may contain white space and punctuation
(but not commas) and will be of mixed case(e.g. Robert B. Smith). No
more than 100 different person's names will appear in the file. The
test data file proj5test.dat
is available in Mr. Frey's public directory. This file does not test
every possible combination of relationships or exceptions. Being able to generate
correct output from this file does not guarantee a perfect project grade.
Students are encouraged to create their own test files that present
every possible relationship and exception.
The data file command format is listed below (commands are case-insensitive)
- BORN (name) -- indicates the birth of a person and the beginning
of a new family
- CHILD (name1, name2) -- indicates that name2 is a child of name1
- SIBLINGS (name1, name2) -- indicates that name1 and name2 are brothers/sisters
- MARRIED (name1, name2) -- indicates that name1 and name2 were married.
Names are NOT CHANGED when two people get married.
- FAMILY (name) -- prints the name of all members of this person's
family. Family member's names may be printed in any order.
- RELATED (name1, name2) -- reports whether or not name1 and name2
are in the same family and if so, reports the name of the head of the family
to which they belong.
Sample Output
Sample output is located in Mr. Frey's public directory for this project
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/proj5 in the
file p5output. Note that this output is not complete
in that it involves only one family and no exceptions were thrown. This
output was not created by reading any file and is to be used only as a guide
to the required contents of the output and the formatting your output.
Your output must echo each command being processed.
Project Submission
DO submit these files
Since you are submitting your own makefile, you are free to name your
files whatever you like, with the following restrictions:
- Use only .C and .H extensions
- All class definitions must appear in .H files. Header files may
not contain any code unless that code was put there by the author.
- All class implementations appear in .C files.
- The name of your executable MUST BE Proj5 (upper-case 'P').
The submitrun program and the scripts that we use to grade your files assume
that you follow this convention for naming your executable.
- Your makefile. Your makefile must work properly!
This is the fifth and last project of the semester and you should have all
the bugs worked out of your makefile (i.e. with respect to using files from
Mr. Frey's public directory). If your makefile doesn't work in your
submittal directory, expect no sympathy.
- The answers to the project questions.
DO NOT submit these files
Do not submit any files from Mr. Frey's public directory that you use
without modification.. These files should be included in your project through
your makefile. If found in your submittal directory, these files will
be deleted (for example, Queue.H and Queue.C)
Submit Tools
There are a number of tools available to you to check on your submittal.
It is your responsibility to ensure that the submittal is correct and
will result in a successful compilation of your project. Do not wait till
the last minute to submit your files. Give yourself enough time to check
that your submittal is correct.
If you don't submit a project correctly, you will not get credit for
it. Why throw away all that hard work you did to write the project? Check
your submittals. Make sure they work. Do this before the due date.
Documentation for the submit program is on the web at
http://www.gl.umbc.edu/submit/
. One of the tools provided by the submit program is submitls
. It lists the names of the files you have submitted.
Additionally, there are two programs for use only by CMSC-341 students
(not part of the UCS submit program). They are in the directory
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/ and are named
submitmake and submitrun. You can use these
programs to make or run your submitted projects.
The syntax is similar to that for submit:
submitmake <class> <project>
Example: submitmake cs341 Proj5
This makes the project, and shows you the report from the make utility.
It cleans up the directory after making the project (removes .o files),
but leaves the executable in place.
submitrun <class> <project> [command-line-args]
Example: submitrun cs341 Proj5 test.dat
This runs the project, assuming there is an executable (i.e. submitmake
was run successfully).
Grading and Academic Integrity
Your project will be tested using a variety of command lines, some
of which will be invalid.
Your project will also be tested using a variety of command files which
will test various conditions which your code should handle.
Project grading is described in the
Project Policy
handout.
Your answers to project questions are worth
20% of your
project grade.
Cheating in any form will not be tolerated. Please re-read the
Project Policy
handout for further details on honesty in doing projects for this course.
Remember, the due date is firm. Submittals made after midnight of the
due date will not be accepted. Do not submit any files after that time.
Last modified on Monday, November 19, 2001 by Dennis Frey