Project 3 — You've Got Mail, Round 2!
In this second part of writing a mail client application, you will now replace your "client programmer" hat with your "library designer" hat. Your tasks for this second phase are:
EmailClient class you created in Project 2 to instantiate and test the classes you will be creating in Project 3. (If you failed miserably in Project 2, we can provide a basic EmailClient implementation for you, but you would learn a lot more if you did your own, and it would feel much more satisfying!)
In this project, you will also do most of your own file I/O, including opening files, reading from files, and being responsible for closing them. (You will not have to write to files for this project, however.)
The small set of support code we will provide is just what is necessary to overcome a couple of issues with handling IMAP-standard mail files. First, parsing these files requires "look-ahead": you don't have to know what this is, just that it makes interpreting input more complicated. Second, recognizing the boundary between messages requires some slightly complex regular-expression-based string matching. Both of these challenges are somewhat peripheral to the objectives of this project, so we are providing code that handles them. However, for extra credit, you can try to solve these tasks yourself; see the "Extra Credit" description below.
This project requires a bit more OO design, especially as it is meant to lead up to Project 4. This is compounded by the fact that we cannot reveal the details for Project 4 yet! We want to simulate a real-world design scenario, where you would be asked to come up with a design based upon requirements that you know the client will add to at the last minute. In fact, trying to anticipate what the next project will require is an essential part of this project. One clue that we can give you is that we will be asking much more of you on the network side of mail-reading.
Because good planning and design are an important part of what we want to teach in this project, we are breaking up the assignment into two separate "deliverables":
The purpose of this first part is for me to verify that you are on the right track. I will try to get any corrective feedback to you ASAP. Therefore, if you can get it done earlier and get it to me, I will be able to look over it and get back to you sooner. You can give it to me in class, or slide it under my office door (if you do the latter, make sure you keep a copy).
This part will not receive a separate grade, but it is mandatory, and there will be a 10-point deduction for not handing in something that shows at least some effort.
In Project 2, we said you had to learn to walk before you can run. Now, you are running, but this is still just a short race: kind of like a 5K Run for Some Charity Benefit, with people on the sidelines shouting encouragement and providing drinks. Think of it as training for Project 4: the Boston Marathon.
At the top of your hierarchy will be a general MailRepository class. If you recall, that was the top class I provided for Project 2. For Project 3, this will only serve as a parent class from which to derive other classes (Hint: can you say "abstract class"?).
From the MailRepository class, you will derive subclasses for the various types of specific mail repository types you think are appropriate. As a basic dichotomy, mail can exist as local files, or it can be managed by a server somewhere on the Internet. So, at the least, you will have a subclass for simple file-based mail repositories, as well as one or more subclasses for network-based repositories.
You should gather together all the instance variables that you can imagine would be important for storing state across all of the general kinds of mail repositories, and place them in the parent MailRepository class. You should also think about the kinds of behaviors that would be universal, and put declarations for methods to cover these in this parent class as well.
Next, in each of the subclasses, you should add those instance variables and methods that would only be applicable to that specific subclass, and not to the parent or other "sibling" classes.
It is important to keep in mind here the role of abstract methods and abstract classes. When you put an abstract method declaration in a parent class, you are not making the claim that there is a common way to implement that method: just that what such a method would do applies to all subclasses, but not necessarily the how.
To start you off on thinking about what additional factors might apply to network-based mail repositories, think about what fields you have had to enter when you were configuring a real mailer (e.g., Thunderbird or Outlook Express) to fetch your UMBC mail, for instance.
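To make the parent/subclass split concrete, here is a minimal sketch of the shape such a hierarchy could take. Everything beyond the MailRepository name is an assumption made for illustration: the fields, the getName() and getMessageCount() methods, the addMessage() stand-in, and the two subclass names are hypothetical and are not the public interface documented in Project 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: field and method names are illustrative only.
abstract class MailRepository {
    // State that every kind of repository plausibly shares.
    protected final String name;

    protected MailRepository(String name) {
        this.name = name;
    }

    // Behavior with one sensible implementation for all subclasses lives here.
    public String getName() {
        return name;
    }

    // The "what" without the "how": every repository can report a message
    // count, but each subclass decides how to compute it.
    public abstract int getMessageCount();
}

class FileMailRepository extends MailRepository {
    private final String fileName;                 // only meaningful for file-based mail
    private final List<String> messages = new ArrayList<>();

    FileMailRepository(String name, String fileName) {
        super(name);
        this.fileName = fileName;
    }

    String getFileName() {
        return fileName;
    }

    // Stand-in for real file parsing, so the sketch stays short.
    void addMessage(String rawText) {
        messages.add(rawText);
    }

    @Override
    public int getMessageCount() {
        return messages.size();
    }
}

class NetworkMailRepository extends MailRepository {
    private final String serverHost;               // only meaningful for network-based mail
    private final int port;

    NetworkMailRepository(String name, String serverHost, int port) {
        super(name);
        this.serverHost = serverHost;
        this.port = port;
    }

    String getServerHost() {
        return serverHost;
    }

    int getPort() {
        return port;
    }

    // Stub: syntactically complete, but does no real work yet.
    @Override
    public int getMessageCount() {
        return 0;
    }
}
```

A client can hold either subclass in a variable of type MailRepository and call getMessageCount() without knowing which kind it has; that is the point of declaring the method in the parent.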
EmailClient class) should be able to use your class implementations to fully read file-based mail. For the other branches--mainly, the network-based repository (or repositories)--you would only need stub methods; that is, methods with full headers and descriptive header comments, but mostly-empty bodies.
Note that "stub methods with empty bodies" is different from abstract method headers. The latter has the form:
    public abstract boolean myMethod(int param1);
... while the former would look like:
    public boolean myMethod(int param1) { return false; }
In other words, stub methods should be syntactically "complete", even though they don't perform any real functions.
In order for us to test your classes effectively, we would like you to use specific names for some of your classes:
EmailClient; in other words, you are free to fix/improve upon your code from Project 2, but leave the public interface alone.
MailRepository; it should have at least the public methods that were documented in the Project 2 description. Again, note that this will probably be an abstract class.
FileMailRepository; it should be derived from MailRepository, but you can add other intervening classes between the two if you see fit. This will be the class that your EmailClient can instantiate to operate on file-based mail repositories. This obviously means you will have to at least tweak your EmailClient methods to instantiate this class instead of the plain MailRepository class.
MailRepository, to work on network-based repositories. As noted above, this subclass should have method stubs, but does not need fully fleshed-out method implementations.
As always, also pay attention to good procedural programming principles! You should still write clear, modular code. This means breaking up oversized, monolithic methods into smaller, logically divided submethods, abstracting out common, repeated chunks of logic into private helper methods, giving your variables meaningful names, etc.--you know the drill.
You are free (and encouraged) to add any helper classes you think would be useful and reflective of good OOP design, but only if it really makes your code "better", by which I mean "clearer" or "easier to understand" or "more logical". Recall what I said in lecture: performance is definitely a lesser consideration than clarity for any application that we will assign in this course, and even in the real world, it is rare that sacrificing clarity for efficiency is a Good Thing. Also note that trying to show off by making your code unnecessarily clever/complex just annoys and antagonizes the graders.
You are not to use any classes other than those in the standard Java libraries, and those specifically provided by me, without first checking with me.
Important: As with the prime number project, there are probably many implementations on the web of projects similar to this one. It would be best not to refer to those: this project should be simple enough that you don't need outside help at the design level, and you will only risk inadvertently copying too much of someone else's idea.
The general format of a mail file is a concatenated set of individual mail items, each with the following form: (note that the text at the left margin is my annotation, and only the indented text is the real mail text)
An IMAP preamble header:
    From park@cs.umbc.edu Sun Sep  6 22:30:32 2010
Followed by 1 or more header lines:
    Received: from [192.168.1.153] (pool-173-79-24-32.washdc.fios.verizon.net [173.79.24.32])
        (authenticated bits=0)
        by mail.cs.umbc.edu (8.14.3/8.14.3) with ESMTP id o8R2UVV0022172
        (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
        for ; Sun, 6 Sep 2010 22:30:31 -0400 (EDT)
    Message-ID: <4CA001CE.9070505@cs.umbc.edu>
    Date: Sun, 6 Sep 2010 22:30:38 -0400
    From: John Park
    User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
    MIME-Version: 1.0
    To: park@cs.umbc.edu
    Subject: Test 1
    Content-Type: text/plain; charset=ISO-8859-1; format=flowed
    Content-Transfer-Encoding: 7bit
    X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0
        (mail.cs.umbc.edu [130.85.36.69]); Sun, 6 Sep 2010 22:30:32 -0400 (EDT)
    Content-Length: 13
    Status: RO
    X-Status:
    X-Keywords:
    X-UID: 1
Followed by a blank line, then the body:
    This is the first line of the body.
Here are the other rules you need to know:
    SomeHeaderFieldName: value as some text
In other words, a field identifier, starting at the left, doesn't contain any spaces, but can have symbols like '-' and '$'; then a ':' (colon), followed by the value for the header field, with possible leading spaces in the value (remove these with String.trim()).
You should discard the blank line that ends the header fields; it is not part of the body.
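The header-field rule above can be sketched as a small helper that splits at the first colon and trims the value. The class name HeaderFieldParser, the method name parseHeaderField(), and the choice to return null for a line that does not look like a header field are assumptions made for this illustration; the line passed in is assumed to be one complete header field.

```java
class HeaderFieldParser {
    /**
     * Splits one complete header line into a two-element array {name, value}.
     * Returns null if the line has no colon, starts with a colon, or has
     * whitespace in the part before the first colon.
     */
    static String[] parseHeaderField(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            return null;
        }
        String name = line.substring(0, colon);
        // A field identifier must not contain any spaces.
        for (int i = 0; i < name.length(); i++) {
            if (Character.isWhitespace(name.charAt(i))) {
                return null;
            }
        }
        // Only the first colon separates name from value, so values that
        // contain colons themselves (such as times) survive intact.
        String value = line.substring(colon + 1).trim();
        return new String[] { name, value };
    }
}
```

Splitting at the first colon only matters for fields such as Date, whose value contains further colons; a field such as "X-Status:" simply yields an empty value.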
The provided library has two classes you can use (any other classes/methods in the library are helpers for the internal implementation, and are not documented). The classes are all in the package proj3MailUtil. The two classes' public interfaces are:
The ReadaheadReader class:

    No public instance variables.

    Public constructors:

        /**
         * returns an instance of a readahead-enabled reader with input set
         * to file named "inFileName". This is the version you are most
         * likely to use.
         */
        ReadaheadReader(String inFileName);

        /**
         * Alternative version if you want to provide your own
         * BufferedReader instance
         */
        ReadaheadReader(BufferedReader in);

    Public methods:

        /* Reads the next line of text from the input, breaking at the next
         * newline/carriage return. The terminating NL/CR is stripped off.
         */
        String readLine();

        /* Method to return a just-read line back into the input buffer.
         * This "unread" line will then be the next item returned by readLine()
         * You can sequentially "unread" multiple items, in which case
         * they will be returned by readLine() in reverse order
         * (i.e., last-unread is first-read).
         */
        String unreadLine(String line);

The P3Util class:

    /*
     * This class only has a few STATIC public helper methods:
     */

    Public static methods:

        /*
         * This method takes a line of text, and does a regular expression
         * based match to see if it fits the pattern for an IMAP file's
         * per-message header line. These specially-formatted lines
         * mark the boundary between the mail items in a mail file.
         * (A big name, for a little method.)
         *
         * Returns true if the argument fits the pattern
         */
        boolean isImapMessageHeaderFlagLine(String line);

        /*
         * Method to help you read in email header specifications.
         * This method takes care of header fields that are continued
         * across multiple lines, returning the entire field as one
         * String. It will remove the final NL/CR, like readLine(),
         * but will leave intermediate line breaks intact.
         * Note that unlike readLine(), this method is static, so
         * a) it takes the reader as a parameter; and
         * b) it has no "unread" capabilities.
         */
        String readContinuedLine(ReadaheadReader reader);
As in Project 2, this library class will be provided as a JAR file, and so will a sample mail file. These will be available soon, to download at these links:
Once you've downloaded these two files, follow the same instructions as in Project 2 to copy the files to the correct directories and configure Eclipse to use the libraries.
sendMessage() method properly. So, the method will not actually deliver any mail, nor even store it in any file. However, it will soon, in Project 4, so your method should at least do all the error-checking outlined in the description for this method in Project 2; e.g., the header's toAddr field must be non-null and non-empty.
EmailClient class will not have changed much, the output should be much as it was in Project 2.
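As a small illustration of the toAddr check mentioned above, here is one way such a test could look. This is only a sketch: the class and method names are hypothetical, the real sendMessage() signature and the full list of checks are in the Project 2 description, and treating a whitespace-only address as empty is an extra assumption.

```java
class SendChecks {
    /**
     * Returns true only if toAddr is non-null and non-empty.
     * Assumption: an address made up of nothing but whitespace
     * is also treated as empty.
     */
    static boolean hasUsableToAddr(String toAddr) {
        return toAddr != null && !toAddr.trim().isEmpty();
    }
}
```

A sendMessage() implementation could run checks like this first and report failure before doing anything else.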
As mentioned earlier, there are three issues with parsing that file: first, you have to know how to open files, and create buffered streams so that you can get a line at a time. To do this, use:
    inReader = new BufferedReader(new FileReader(inFileName));
BufferedReader instances provide a line-at-a-time reading method analogous to a Scanner object's "nextLine()", so in the example above, you could invoke inReader.readLine() to read Strings nicely broken up at end-of-line points from the input file. You can then layer some kind of internal storage scheme on top of this to provide an "unreadLine()" method.
Note that BufferedReader objects are much more primitive than Scanner objects. You just keep reading until readLine() returns null. Also note that you will have to wrap this in the appropriate try-catch structure, since the FileReader() constructor might throw a "FileNotFoundException" -- you should know how to handle that now.
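Putting the pieces above together, a read loop might look like the following sketch. The class name MailFileLines, the method name readAllLines(), and the policy of printing a message and returning whatever was read so far are assumptions for illustration; you may prefer a different error-handling policy. The try-with-resources form is one way of meeting the responsibility for closing the file.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class MailFileLines {
    /**
     * Reads the lines of the named file into a list. If the file cannot
     * be opened, the list is empty; if reading fails part-way, the list
     * holds the lines read before the failure.
     */
    static List<String> readAllLines(String inFileName) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader inReader = new BufferedReader(new FileReader(inFileName))) {
            String line;
            // readLine() returns null at end of file.
            while ((line = inReader.readLine()) != null) {
                lines.add(line);
            }
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open mail file: " + inFileName);
        } catch (IOException e) {
            System.err.println("Error while reading " + inFileName + ": " + e.getMessage());
        }
        return lines;
    }
}
```

FileNotFoundException is a subclass of IOException, so its catch clause has to come first.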
Another problem is that the "grammar" for the file format requires that you do some look-ahead: i.e., you need to have already read in some of the next part before you know you have all of the current part. This makes it difficult to write modular code, since some method like getNextHeaderField() would have possibly already read in a few characters of the next header, and there is no easy way to "put it back."
So, you will have to devise a way to accomplish exactly this: to "unread" some text in a controlled fashion.
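One possible way to "unread" text is to keep a stack of pushed-back lines in front of a BufferedReader. The sketch below is not the provided ReadaheadReader; the class name PushbackLineReader and its details are hypothetical. Its readContinuedLine() method shows why look-ahead is needed, under the assumption (taken from the standard header-folding convention of RFC 5322) that a continuation line begins with a space or tab.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

class PushbackLineReader {
    private final BufferedReader in;
    // Stack of unread lines: the last line pushed is the first returned.
    private final Deque<String> unread = new ArrayDeque<>();

    PushbackLineReader(BufferedReader in) {
        this.in = in;
    }

    /** Returns the most recently unread line if any, else the next input line. */
    String readLine() throws IOException {
        if (!unread.isEmpty()) {
            return unread.pop();
        }
        return in.readLine();
    }

    /** Puts a line back; null (end of input) is ignored. */
    void unreadLine(String line) {
        if (line != null) {
            unread.push(line);
        }
    }

    /**
     * Reads one header field, joining continuation lines with '\n'.
     * Returns null at end of input.
     */
    String readContinuedLine() throws IOException {
        String first = readLine();
        if (first == null) {
            return null;
        }
        StringBuilder field = new StringBuilder(first);
        String next;
        while ((next = readLine()) != null) {
            boolean continuation = !next.isEmpty()
                    && (next.charAt(0) == ' ' || next.charAt(0) == '\t');
            if (!continuation) {
                // We read one line too far: it belongs to the next field.
                unreadLine(next);
                break;
            }
            field.append('\n').append(next);
        }
        return field.toString();
    }
}
```

The only way to know a field has ended is to read the line after it, which is why the reader must be able to put that line back for the next caller.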
The last problem is that detecting the inter-message boundary in mail files requires recognizing a line of text in a certain, slightly flexible format. In turn, this is best handled by calling the regular expression matcher that is part of the Java library, and giving it the appropriate regular expression, neither of which is part of the syllabus. This part, I will provide for you. I will give you 2-3 lines of code that you can embed into a method to detect the inter-message marker.
First, in the class that will be doing the inter-message marker recognition, you will need to import the right package:
    import java.util.regex.*;
Then, you should insert into the class:
    private static String MESSAGE_HEADER_PATTERN = "^From (\\S+@\\S+|MAILER-DAEMON) [SMTWF][a-z][a-z] [JFMASOND][a-z][a-z] (\\d| )\\d \\d\\d:\\d\\d:\\d\\d \\d\\d\\d\\d( [+\\-]?\\d\\d\\d\\d)?$";
    private static Pattern pattern = Pattern.compile(MESSAGE_HEADER_PATTERN);
This will generate a regular expression engine that will recognize that specific pattern. (Make sure you get the pattern string exactly right! I worked hard on that, and every double-slash counts!) Finally, to use this engine, do the following, where the variable "line" contains the String to be tested:
    Matcher matcher = pattern.matcher(line);
    if (matcher.find()) {
        // Code here to handle case where line is inter-message marker.
        // --probably just set a flag.
    }
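The three fragments above could be assembled into a single helper along these lines. The pattern string is copied unchanged from the text; the class name MessageBoundary, the method name isMessageBoundary(), the added "final" modifiers, and the null check are assumptions made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MessageBoundary {
    // Pattern string exactly as given in the project description.
    private static final String MESSAGE_HEADER_PATTERN =
            "^From (\\S+@\\S+|MAILER-DAEMON) [SMTWF][a-z][a-z] [JFMASOND][a-z][a-z] (\\d| )\\d \\d\\d:\\d\\d:\\d\\d \\d\\d\\d\\d( [+\\-]?\\d\\d\\d\\d)?$";

    // Compiled once and reused for every line tested.
    private static final Pattern PATTERN = Pattern.compile(MESSAGE_HEADER_PATTERN);

    /** Returns true if the line is an inter-message marker; null never matches. */
    static boolean isMessageBoundary(String line) {
        if (line == null) {
            return false;
        }
        Matcher matcher = PATTERN.matcher(line);
        return matcher.find();
    }
}
```

Note that the pattern expects a two-character day of the month, so a single-digit day must be padded with an extra space: "Sep  6" matches, while "Sep 6" does not. An ordinary header such as "From: John Park" does not match either, because the pattern requires a space directly after "From".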
So, if you choose to do the extra credit, other than inserting the above few lines of code into your program, you will have built a complete top-to-bottom working application using only Java standard libraries that can parse and interpret RFC-5322-standard mail files. Good job!
Important Note: When I provide any .jar files for the project, unless you are an expert with JREs and classpaths, the easiest process is to unpack the class files into your tree. To do this, from the same directory as above, just type:
    jar xf proj3UtilLib.jar
where you would replace "proj3UtilLib.jar" with whatever jar file you are using.
To submit your project, type the command
Do not submit the provided library or test input file--we have those already :-)
More complete documentation for submit and related commands can be found here.
Remember -- if you make any change to your program, no matter how insignificant it may seem, you should recompile and retest your program before submitting it. Even the smallest typo can cause errors and a reduction in your grade.