Assigned | Wednesday, October 5, 2005 |
Due | 11:59pm Wednesday October 19, 2005 |
Updates | Oct 9
Since an MRU list is inherently NOT sorted, new elements can be inserted anywhere in the MRU list. The statement " New words are inserted at the front of the MRU List." (in the Description section) is part of the project requirements so that everyone gets exactly the same output. This may be accomplished by having the user of the MRU list always insert at the front of the list, or by overriding the LinkedList's insert( ) function. |
A spell checker is a common application used in conjunction with document preparation. Since many documents focus on one subject, many words in a document are often repeated. Hence, it makes sense that checking the spelling of repeated words should be made as fast as possible. In this project you will implement a rudimentary spell checker that provides for faster access to frequently used words. The spell checker will be implemented as a List of Lists.
We will implement two variations of the author's linked list implementation by deriving two new kinds of linked list from the author's list using C++ inheritance. One new list is called an "MRU List" (MRULL). MRU stands for "Most Recently Used". Every time an element of an MRU List is "found" it is moved to the front of the list. In this way, frequently accessed data is always close to the front of the list, so subsequent "finds" for that data are quicker. New words are inserted at the front of the MRU List. For our spell-checker this means that repeated words in a document are checked more quickly.
The other new Linked List is the "Sorted Linked List" (SLL). A SLL keeps its data sorted at all times by inserting new elements in the appropriate place in the list.
Project 2 will be invoked with multiple command line arguments. The first argument is the name of the text file which will be checked for spelling errors. The second argument is the name of the dictionary file. The third command line argument is the number of words to print for each letter on the command line. The remaining argument(s) are single lower-case characters [a-z] for which words and recently used words from the dictionary are printed.
The Dictionary ADT defines the following operations
For example, PrintMRU( cout, 'a', 10) and PrintMRU( cout, 'A', 10) BOTH
print words which begin with 'a' OR 'A'.
The CountedWord ADT defines the following operations
Be sure to submit the answers to the project questions in plain text format.
Submit the files using the procedure given to you for your section of the
course.
Submission Tools
If you don't submit a project correctly, you will not get credit for it.
Why throw away all that hard work you did to write the project? Check your
submittals. Make sure they work. Do this before the due date. Documentation for the submit program is on the web at http://www.gl.umbc.edu/submit/ . One
of the tools provided by the submit program is Additionally, there are two programs for use only by CMSC-341 students (not
part of the UCS submit program). They are in the directory The syntax is similar to that for submit: submitmake <class> <project> Example: submitmake cs341 Proj2 <parameter list> This makes the project, and shows you the report from the make utility. It cleans
up the directory after making the project (removes .o and ii_files), but leaves
the submitrun <class> <project> Example: submitrun cs341 Proj2 This runs the project, assuming there is an executable (i.e. submitmake was
run successfully). Project grading is described in the Project
Policy handout. Remember, the due date is firm. Submittals made after midnight of the due
date will not be accepted. Do not submit any files after that time. Spelling Rules
For example, in the sentence "Hi, how are you today?"., you would remove the quotes, comma, and the question mark and search
the dictionary for the words Hi, how, are, you and today.
The ADTs
The MRU List ADT
The Most Recently Used (MRU) List is derived from LinkedList. It provides no other functionality.
It differs from a LinkedList in just one way --
Elements which are successfully found are moved to the front of
the list. The list is unchanged in the event of an unsuccessful find operation.
The Sorted List
The Sorted List is derived from LinkedList. It provides no other functionality.
It differs from a LinkedList in just one way -- the elements are always in sorted order according
to operator<
The Dictionary ADT
The dictionary contains C++ strings which represent the complete set of correctly spelled words.
To facilitate fast look-up, these words are stored in an MRU list for each letter of
the alphabet. The Dictionary is essentially a SortedList of MRU word lists.
The words must be printed in a neat columnar format, 5 words per row.
The width of the columns will depend on the length of the longest word to be printed. Note that this operation is
case-INsensitive. The parameter passed to the function may be upper- or lower-case. In either event, words
that begin with the same upper- or lower-case letter must be printed.
Word List ADT
You will find it necessary (or at least easier) to associate some information information
with the MRU list for each letter of the alphabet. The WordList ADT encapsulates
this information and becomes an adapter for the MRU list. The WordList ADT
defines the following operations
Misspelled Word List ADT
Your program must maintain a list of misspelled words along with the number
of times each misspelled word appears. To do so, you must implement the
MisspelledList ADT which defines the following operations
CountedWord ADT
Since the printed list of misspelled words also includes the number of times each
misspelled word appears, good OO design dictates that these data items be stored
together in a simple object which we are calling a CountedWord. (You're free to choose
a better name if you can think of one).
Basic Project Requirements
The words "a", "an" and "the" (known as "articles") should be added to the dictionary.
They may or may not be present in the dictionary file read by your program.
The words must be printed in a neat columnar format, 5 words per row. The format is shown
in the sample output. For formatting purposes only, you may assume that no misspelled word will
appear more than 99 times.
Project Notes and Hints
Sample Output
The output below is provided for formatting purposes only. It is not the
result of executing any version of project 2. Your output may vary slightly,
but must conform to the output format found elsewhere in this description.
Your output must be column aligned for each letter. Use the length of the
longest word starting with each letter to determine your column width.
Project Questions
Copy the file 341-Fall05-proj2_questions.txt found in Mr. Frey's public directory for this project to your directory.
Edit it to answer the questions. Be sure to submit this file.
Files To Be Submitted
Submit any files which you have written or modified -- source code and makefile.
DO NOT submit any unmodified code provided by the author or your instructors.
In particular, this means LinkedList.H and LinkedList.C found in Mr. Frey's public directory.
As in project 1, your makefile should reference these files.
If your makefile is set up correctly, you should be able to execute the command
make submit.
There are a number of tools available to you to check on your submittal. It is
your responsibility to ensure that the submittal is correct and will result in
a successful compilation of your project. Do not wait till the last minute to
submit your files. Give yourself enough time to check that your submittal is
correct.
submitls. It lists the names of
the files you have submitted.
/afs/umbc.edu/users/d/e/dennis/pub/CMSC341/
and are named submitmake
and submitrun. You can
use these programs to make or run your submitted projects.
executable in place.
Grading and Academic Integrity
Cheating in any form will not be tolerated. Please re-read the Project
Policy handout for further details on honesty in doing projects for this
course.