due on Sunday, 3/4 before 11:59 PM
Sue Evans & Travis Mayberry
Hit the space bar for next slide
For every remaining assignment, whether a homework or a project, you'll need to create a file that contains your code.
You should change into your hw4 directory and then use emacs to create a file called hw4.py by entering emacs hw4.py & at the linux prompt.
Again, if you are working from home the & probably won't work and you should just login twice to have two different windows, one for your text editor and the other to run your program. Just type emacs hw4.py to run emacs.
At the top of each file, we are requiring a file header comment.
The fileheader comment has six required parts:
Here's an example of a file header comment for this homework :
# Filename: hw4.py # Written by: Sue Evans # Date: 7/27/09 # Section #: All # Email: bogar@cs.umbc.edu # Description: # This file contains python code that will decode an English # message that was encoded using a Caesar cypher with a fixed # rotation length. It makes use of letter frequencies in # the English language to determine the rotation length.
You should also have some comments within your code to explain what confusing-looking code is doing. These comments should be directly above and indented evenly with the code it is explaining.
For a full explanation of style and documentation requirements, see the 201 Python standards. A good portion of each of your assignment grades depends upon how well you adhere to these standards.
For this homework you are going to write a program that tries to break the cipher that you wrote in your lab last week.
I have already run the Ceasar cypher with a fixed rotation length on the preamble of the U. S. Constitution and captured the output into a file called code.txt. You can get a copy of it from my directory:
cp /afs/umbc.edu/users/b/o/bogar/pub/code.txt .
UNIX allows us to use the contents of a file as input to our program and also capture output from our program into a file. These operations are called redirection. The < is used when you're using a file for input and the > is used when you want to capture the output into a file. You can just think of these symbols as arrows to or from your programs name. Here are some examples of their use:
python lab3.py < preamble.txt
where preamble.txt is the preamble written in English all on one line. This redirection causes the raw_input statement in lab3.py to read the contents of the preamble.txt file in instead of some message that the user enters from the keyboard. No modifications have to be made to the lab3.py file for this to work.
python lab3.py > code.txt
would run the program getting input from the user and capturing the output in a file called code.txt
Could we do both at the same time? Sure.
python lab3.py < preamble.txt > code.txt
and it would run very fast. This is what I did to produce the code.txt file you just copied.
You will be running your program using Unix redirection to provide input to your program like this:
python hw4.py < code.txt
Recall that the Caesar cipher shifts characters by a fixed amount to encrypt the message. How would you break something like this?
It turns out that English has a very specific distribution of letters in an average sample of text. As the amount of text you have increases, the closer to this ideal distribution the letter frequencies become.
Your program must calculate the correct shift amount by finding the letter used most often in the coded text and using the information in the letter frequencies chart shown below. You may not just look at the coded text to figure out the shift amount yourself and hard-code that shift amount into your program.
Here are the English frequencies for each letter :
a 8.167% b 1.492% c 2.782% d 4.253% e 12.702% f 2.228% g 2.015% h 6.094% i 6.966% j 0.153% k 0.772% l 4.025% m 2.406% n 6.749% o 7.507% p 1.929% q 0.095% r 5.987% s 6.327% t 9.056% u 2.758% v 0.978% w 2.360% x 0.150% y 1.974% z 0.074%
Hints:
linuxserver1.cs.umbc.edu[101] more code.txt BJ YMJ UJTUQJ TK YMJ ZSNYJI XYFYJX, NS TWIJW YT KTWR F RTWJ UJWKJHY ZSNTS, JXYFGQNXM OZXYNHJ, NSXZWJ ITRJXYNH YWFSVZNQNYD, UWTANIJ KTW YMJ HTRRTS IJKJSXJ, UWTRTYJ YMJ LJSJWFQ BJQKFWJ, FSI XJHZWJ YMJ GQJXXNSLX TK QNGJWYD YT TZWXJQAJX FSI TZW UTXYJWNYD, IT TWIFNS FSI JXYFGQNXM YMNX HTSXYNYZYNTS KTW YMJ ZSNYJI XYFYJX TK FRJWNHF. linuxserver1.cs.umbc.edu[102] python hw4.py < code.txt WE THE PEOPLE OF THE UNITED STATES, IN ORDER TO FORM A MORE PERFECT UNION, ESTABLISH JUSTICE, INSURE DOMESTIC TRANQUILITY, PROVIDE FOR THE COMMON DEFENSE, PROMOTE THE GENERAL WELFARE, AND SECURE THE BLESSINGS OF LIBERTY TO OURSELVES AND OUR POSTERITY, DO ORDAIN AND ESTABLISH THIS CONSTITUTION FOR THE UNITED STATES OF AMERICA. linuxserver1.cs.umbc.edu[103]
Notice that each of these files contain just one long string that needs to go off the edge of the page to print out. If you looked at these files using less, you would see a lot of wrapping. For this exercise, that's perfectly okay.
For 5 points of extra credit, break the string up so that it prints out normally in an 80-character window without wrapping. Words must go all of the way across the page (one word per line doesn't count). This must be done without hard-coding specifically for the Preamble of the Constitution.
Sample output for Extra Credit
WE THE PEOPLE OF THE UNITED STATES, IN ORDER TO FORM A MORE PERFECT UNION, ESTABLISH JUSTICE, INSURE DOMESTIC TRANQUILITY, PROVIDE FOR THE COMMON DEFENSE, PROMOTE THE GENERAL WELFARE, AND SECURE THE BLESSINGS OF LIBERTY TO OURSELVES AND OUR POSTERITY, DO ORDAIN AND ESTABLISH THIS CONSTITUTION FOR THE UNITED STATES OF AMERICA.
When you've finished your homework, use the submit command to submit the file.
submit cs201 HW4 hw4.py
Don't forget to watch for the confirmation that submit worked correctly.
Specifically, the confirmation will say:
Submitting hw4.py...OK
If not, try again.
You can check your submission by entering
submitls cs201 HW4
You should see the name of the file that you just submitted,
in this case, hw4.py.