Assigned | Friday, February 20, 2009 |
Due | 11:59pm Friday March 6, 2009 |
Updates | Please update your LinkedList.java with this one. Changes to some descriptions of functions. |
A spell checker is a common application used in conjunction with document preparation. Since many documents focus on one subject, many words in a document are often repeated. Hence, it makes sense that checking the spelling of repeated words should be made as fast as possible. In this project you will implement a rudimentary spell checker that provides for faster access to frequently used words. The spell checker will be implemented as a List of Lists.
We will implement two variations of the a linked list implementation by deriving two new kinds of linked list using inheritance. This will require you to implement your own linked list ADT and create the derived list implementations. One new list is called an "MRU List" (MRULL). MRU stands for "Most Recently Used". Every time an element of an MRU List is "found" it is moved to the front of the list. In this way, frequently accessed data is always close to the front of the list, so subsequent "finds" for that data are quicker. New words are inserted at the front of the MRU List. For our spell-checker this means that repeated words in a document are checked more quickly.
The other new Linked List is the "Sorted Linked List" (SLL). A SLL keeps its data sorted at all times by inserting new elements in the appropriate place in the list.
Project 2 will be invoked with multiple command line arguments. The first argument is the name of the text file which will be checked for spelling errors. The second argument is the name of the dictionary file. The third command line argument is the number of words to print for each letter on the command line. The remaining argument(s) are single lower-case characters [a-z] for which words and recently used words from the dictionary are printed.
The Dictionary ADT defines the following operations
For example, PrintMRU('a', 10) and PrintMRU('A', 10) BOTH
print words which begin with 'a' OR 'A'.
The CountedWord ADT defines the following operations
Project grading is described in the Project
Policy handout. Remember, the due date is firm. Submittals made after midnight of the due
date will not be accepted. Do not submit any files after that time. Spelling Rules
For example, in the sentence "Hi, you're looking nice today?"., you would remove the quotes, comma,
and the question mark and search
the dictionary for the words Hi, you're, looking, nice and today.
The ADTs
The MRU List ADT
The Most Recently Used (MRU) List is derived from LinkedList. It provides no other functionality.
It differs from a LinkedList in just one way --
Elements which are successfully found are moved to the front of
the list. The list is unchanged in the event of an unsuccessful find operation. Since an MRU
list is inherently NOT sorted, new elements can be inserted anywhere in the MRU list.
The Sorted List
The Sorted List is derived from LinkedList. It provides no other functionality.
It differs from a LinkedList in just one way -- the elements are always in ascending sorted order.
The Dictionary ADT
The dictionary contains ascii strings which represent the complete set of correctly spelled words.
To facilitate fast look-up, these words are stored in an MRU list for each letter of
the alphabet. The Dictionary is essentially a SortedList of MRU word lists.
The words must be printed in a neat columnar format, 4 words per row. Each column must be separated by a minimum of 3 whitespaces.
The width of the columns will depend on the length of the longest word to be printed. Note that this operation is
case-INsensitive. The parameter passed to the function may be upper- or lower-case. In either event, words
that begin with the same upper- or lower-case letter must be printed.
Word List ADT
You will find it necessary (or at least easier) to associate some information
with the MRU list for each letter of the alphabet. The WordList ADT encapsulates
this information and becomes an adapter for the MRU list. The WordList ADT
defines the following operations
Misspelled Word List ADT
Your program must maintain a list of misspelled words along with the number
of times each misspelled word appears. To do so, you must implement the
MisspelledList ADT which defines the following operations
CountedWord ADT
Since the printed list of misspelled words also includes the number of times each
misspelled word appears, good OO design dictates that these data items be stored
together in a simple object which we are calling a CountedWord. (You're free to choose
a better name if you can think of one).
Basic Project Requirements
The words "a", "an" and "the" (known as "articles") should be added to the dictionary.
They may or may not be present in the dictionary file read by your program.
The words must be printed in a neat columnar format, 4 words per row. The format is shown
in the sample output. For formatting purposes only, you may assume that no misspelled word will
appear more than 99 times.
Project Notes and Hints
Sample Output
The output below is provided for formatting purposes only. It is not the
result of executing any version of project 2. Your output may vary slightly,
but must conform to the output format found elsewhere in this description.
Your output must be column aligned for each letter. Use the length of the
longest word starting with each letter to determine your column width.
Project Questions (10 points)
Edit the file 341-spring09-proj2_questions.txt found in your repository for this project.
Edit it to answer the questions. Be sure to submit this file.
Files To Be Submitted
You must use CVS to check out your module in the Proj2 repository and to check in your code.
That must include an ant build.xml file with targets for creating javadoc (named "doc"), for
running your program (named "run" and that allows the command line arguments to be specified
with -Dargs, like this ant -Dargs="arg1 arg2 ..." run), and for compiling your code (this
must be the default target). The build.xml file that comes with your module in the Proj2
repository has all of this. Do not submit or put under CVS control your javadoc output.
We will build it with the "ant doc" command. See the projects page for more information on all of these topics.
If you don't submit a project correctly, you will not get credit for it. Why throw away all
that hard work you did to write the project? Check your submittals. Make sure they work.
Do this before the due date.
Grading and Academic Integrity
Cheating in any form will not be tolerated. Please re-read the Project
Policy handout for further details on honesty in doing projects for this
course.