Regular Expressions (DUE: 2/21 11:59 PM EDT)

Updates

02/14/18: Updated Problem 2 to modify order requirements

This assignment will give you practice with regular expressions, both for the purposes of finding matches and performing substitutions. The assignment will be submitted through GitHub, and starter code for each problem in the assignment will be checked out to your repository when you start your assignment. To get started with the assignment, go to https://classroom.github.com/a/qvOJYcdi and sign in to your GitHub account. This will create a repository for you containing the starter code.

1. regex1.pl (10%)

In this problem, you will write a regular expression to identify lines containing a header element in an HTML document. The valid tags you are looking for are h1, h2, h3, h4, h5, and h6, in either upper or lower case. The tags may have attributes in the opening tag only, and there may be arbitrary content between the tags. You may assume that the '<' and '>' characters will not appear in any attributes.

 
	
#!/usr/bin/perl

# Print lines that contain an opening and closing HTML heading tag.  Valid
# tags are h1, h2, h3, h4, h5, h6 regardless of case.  The opening tag may
# contain attributes, and the closing tag must have the same number.  Tags may
# also have any arbitrary content inside of them.  You may assume that a '<' or
# a '>' character will not appear inside of an attribute's value.
#
# For example:
#
#      '<h1>foo</h1>' matches
#      '<h2 class="foo">foo</h2>' matches
#      '<h3>bar</H3>' matches
#      '<h4 class="blah" id="foo">baz</h4>' matches
#      '<h5><span class="big">B</span>ig</h5>' matches
#      '<h9>foo</h9>' does not match
#      '<h1>something</h4>' does not match
#      '<h1>foo bar baz...' does not match



while(<>) {
  print if /REGEX/;
}

2. regex2.pl (10%)

Update 02/14/18

In the example below, it would be extraordinarily complex to prevent all out-of-order matches. So for the example "5 minutes 1 day ago", which is listed as not matching, it is fine if the line as a whole matches, but if we check the part of the line that matched, only "1 day ago" should have matched.

In this problem, you must identify relative time expressions in a variety of formats. These expressions will consist of one or more units of time, plus the string 'ago' for times in the past or 'from now' for times in the future. The units of time you are expected to recognize are minutes, hours, days, and years. Assume any number is valid, so someone could say 40 days ago.
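To check which part of a line actually matched, as the update note describes, one option is Perl's built-in $& variable, which holds the text matched by the last successful pattern match. The sketch below is only an illustration: the helper name matched_part and the kilogram pattern are assumptions chosen to be unrelated to the assignment's time expressions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the starter code): return the portion of
# $line that $pattern matched, or undef if there was no match.
sub matched_part {
    my ($line, $pattern) = @_;
    return unless $line =~ $pattern;
    return $&;    # $& holds the text matched by the last successful match
}

# Placeholder pattern, unrelated to the assignment: a weight in kilograms.
my $weight = qr/\d+(?:\.\d+)? kg/;

for my $line ('ship 2.5 kg today', 'nothing to weigh') {
    my $part = matched_part($line, $weight);
    print defined $part ? "matched: '$part'\n" : "no match: '$line'\n";
}
```

Printing the matched portion rather than the whole line makes it easier to see whether a pattern is picking up more (or less) of the line than intended.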

		
#!/usr/bin/perl

# Print line if it contains a relative time expression expressed in any combination 
# of minutes, hours, days, and years.
#
# Note: 1 of any unit should not use the plural form of the unit.
#       The combination of units should be in logical order.
#
# For example:
#
#     "5 years ago" matches
#     "5 minutes from now" matches
#     "1 day 5 minutes ago" matches
#     "5 minutes 1 day ago" does not match
#     "from now 5 minutes" does not match
#     "10 seconds ago" does not match
#     "a day ago" does not match
#

while (<>) {
  print if /REGEX/;
}

3. regex3.pl (10%)

In this problem, you must match lines from the output of git log that show files that have only been added to or only been deleted from, not files that have had both operations done. The git log format we will be using is a multi-line format organized as follows:

HEX COMMIT_MESSAGE

FileName | NUM_LINES CHANGE_TYPES

....

FileName | NUM_LINES CHANGE_TYPES

N file changed, X insertions(+), Y deletions(-)

An example of output from git log --stat --pretty=oneline is shown below:

008f60c4d3b904d37bdcaf49d654656cb2fab6bf Making sure airport data is there
 data/airports.tsv | 1284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1284 insertions(+)
448b476ecd22466f250137d0207c0b261d0dbda2 Intial Ulpoad of Lecture 2
 Lecture02.ipynb | 1208 ++++++++++++++++++++++---------------------------------
 1 file changed, 488 insertions(+), 720 deletions(-)
18589f64b676c74f0665098ee51063305b095ff7 Completed Lecture 1
 Lecture01.ipynb | 697 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 608 insertions(+), 89 deletions(-)
c2f185a441e4d1bf108fc4949d7fa8ce6735a572 PostBuild appears to be working, removing early exit
 binder/postBuild | 1 -
 1 file changed, 1 deletion(-)
5f29169b33f0a25aafef5ab8246c553d8c0a963f Update set_envs.py to not error on empty lines)
 binder/perl_env    | 1 -
 binder/set_envs.py | 3 ++-
 2 files changed, 2 insertions(+), 2 deletions(-)
51b87421372cc59739231cf885edc58adced5af3 Debugging variable setting perl
 binder/postBuild | 2 ++
 1 file changed, 2 insertions(+)

For this problem, you should print any line that displays information about a file and whose CHANGE_TYPES string consists entirely of +'s or entirely of -'s.

		
#!/usr/bin/perl 

# Match lines of git log --stat --pretty=oneline that indicate files with only additions or only deletions
#
# The lines you are interested in have the format: 
# 
#     FileName | NUM_LINES CHANGE_TYPES
#
# Where...
#
# CHANGE_TYPES is entirely +'s or entirely -'s
#
# For example:
# 'data/airports.tsv | 1284 +++++++++++++++++++++++++++++++++++++++++++++++++++++' matches
# 'binder/postBuild | 1 -' matches
# 'Lecture02.ipynb | 1208 ++++++++++++++++++++++---------------------------------' does not match
# '008f60c4d3b904d37bdcaf49d654656cb2fab6bf Making sure airport data is there' does not match
# '1 file changed, 2 insertions(+)' does not match

while(<>) {
  print if /REGEX/;
}

4. regex4.pl (10%)

This problem requires you to process a tab-separated values (TSV) data file of food facts, modified from data available from Open Food Facts, and print lines that meet the following criteria:

Sold in the United States
Data updated in 2017 or 2016
Contains 1 unit or more of vitamin C

	
#!/usr/bin/perl

# Match foods that are sold in the United States, contain greater than or equal to 1 unit of vitamin C, and
# whose information was updated in 2016 or 2017
#
# For example:
#  'Tomato Paste    03/09/2017      US      33 g (2 Tbsp)   6.1     6.06    21.21   12.12   363.6   14.5    0.0     0.0     45.0    61.0' matches
# 'Reese minis     12/23/2016      Canada  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0' does not match
# 'Tootie Fruities 03/09/2017      United States   1 cup (32g)     3.12    6.25    87.5    46.9    469.0   18.7    12.5    0.0     469.0   312.0' matches
# '100% pure orange juice with calcium & vitamin D 08/09/2015      United States   8 fl. oz (240 mL)       0.0     0.833   10.8    9.17    0.0     30.0    4.17    0.0     188.0   146.0' does not match 

#

while(<>) {
  print if /REGEX/;
}

A sample datafile along with the meaning of each column is shown below

		
# Data from world.openfoodfacts.org
#
# Fields separated by tabs are:
#
# 1) Product Name
# 2) Date information was last updated
# 3) The country the food is sold in (NOTE: US and United States both appear)
# 4) The serving size
# 5) The amount of fiber
# 6) The amount of protein
# 7) The amount of carbohydrates
# 8) The amount of sugar
# 9) The amount of Vitamin A
# 10) The amount of Vitamin C
# 11) The amount of Vitamin D
# 12) The amount of Vitamin E
# 13) The amount of Sodium
# 14) The amount of Calcium

Galettes de Pommes de terre Classiques  12/16/2016      Canada  60 g / 1 galette        1.67    1.67    25.0    0.0     0.0     0.0     0.0     0.0     350.0   0.0
Reese minis     12/23/2016      Canada  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
Classic Cheddar Pierogies       03/10/2017      United States   114g    0.877   4.39    27.2    0.877   0.0     5.26    0.0     0.0     404.0   17.5
Four Cheese Mashed Potatoes     03/09/2017      US      28 g (0.25 cup) 3.6     7.14    71.43   7.14    0.0     12.9    0.0     0.0     2036.0  143.0
Enriched Macaroni Product, Small Shells 03/10/2017      US      56 g (0.5 cup)  3.6     12.5    75.0    1.79    0.0     0.0     0.0     0.0     0.0     0.0
Tootie Fruities 03/09/2017      United States   1 cup (32g)     3.12    6.25    87.5    46.9    469.0   18.7    12.5    0.0     469.0   312.0
Party Rainbow Chip cake mix     11/26/2016      United States   43 g    2.33    2.33    81.4    41.9    0.0     0.0     0.0     0.0     744.0   186.0
Lay's Classic   08/09/2015      United States   1 oz (28 g)     3.57    7.14    53.6    3.57    0.0     21.4    0.0     4.29    1250.0  0.0
Chocolate Chex  08/15/2015      United States   3/4 cup (32g)   3.12    6.25    81.2    25.0    469.0   18.7    12.5    0.0     625.0   312.0
Jif Beurre De Cacahuettes Extra Crunchy 03/16/2017      France  0.0     2.0     7.0     0.0     3.0     0.0     0.0     0.0     0.0     0.0     0.0
Orangina Sparkling Citrus Beverage      03/11/2017      United States   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
Petite Cut, Diced Tomatoes With Green Chilies, Medium   03/09/2017      US      126 g (0.5 cup) 0.8     1.59    5.56    3.17    178.5   14.3    0.0     0.0     238.0   32.0
Juice Cocktail  03/09/2017      US      240 ml (8 fl oz)        0.0     0.0     12.5    12.5    0.0     25.0    0.0     0.0     33.0    0.0
Argentinian Red Shrimp  07/26/2015      United States   4 oz (113 g)    0.0     13.3    0.0     0.0     53.1    1.06    0.0     0.0     142.0   17.7
Papaya  03/10/2017      United States   1 can (340 mL)  0.0     0.0     13.8    13.2    0.0     17.6    0.0     0.0     5.88    0.0
Diced Tomatoes, Basil, Garlic & Oregano 03/09/2017      US      123 g (0.5 cup) 0.8     0.81    5.69    4.07    183.0   7.3     0.0     0.0     220.0   33.0
Tomato Ketchup  03/09/2017      United States   1 tbsp (17 g)   0.0     0.0     23.5    23.5    0.0     0.0     0.0     0.0     941.0   0.0
Extra Crispy Shoestring French Fried Potatoes   03/09/2017      US      85 g (3 oz)     2.4     2.35    27.06   0.0     0.0     2.8     0.0     0.0     365.0   0.0
Shortcake, Lemon & Cream        03/09/2017      US      80 g (1 SLICE)  0.0     3.75    35.0    25.0    187.5   3.0     0.0     0.0     200.0   50.0
Apple Pommes    03/10/2017      US      154 g (1 MEDIUM APPLE)  3.2     0.0     14.29   10.39   19.5    3.1     0.0     0.0     0.0     0.0
Tropical Gold Premium Pineapple Chunks in Juice 12/13/2015      France  0.0     0.1     0.4     15.0    12.0    0.0     32.0    0.0     0.0     100.0   0.0
Classico Four Cheese Pasta Sauce        03/09/2017      United States   125 g   1.6     1.6     8.0     4.8     72.0    2.88    0.0     0.0     392.0   48.0
Danoises à la cannelle roulées  01/15/2017      Canada  146 g / 1 danoise       2.05    4.79    54.1    28.1    205.0   6.16    0.0     0.0     363.0   54.8
Yellow Rice Mix With Saffron    03/10/2017      US      56 g (0.333 CUP RICE AND 1.5 TBSP SEASONING (1 CUP PREPARED) | ABOUT)   1.8     7.14    78.57   1.79    0.0     0.0     0.0     0.0     1304.0  36.0
Chocolate Chip Cookies  08/07/2017      Canada  0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
Oikos, Greek Yogurt, Key Lime   03/09/2017      US      150 g (150 g)   0.0     7.33    12.0    11.33   20.1    0.0     0.0     0.0     53.0    100.0
V8 original     08/09/2015      United States   8 fl. oz (240 mL)       0.833   0.833   4.17    2.92    250.0   30.0    0.0     0.0     271.0   16.7
0% Fat Greek Style Yogurt With Honey    04/08/2017      France  0.0     0.5     6.5     0.0     11.8    0.0     0.0     0.0     0.0     70.8661417323   0.0
Cottage Cheese  08/09/2015      Canada  125 g   0.0     11.2    5.6     4.0     24.0    0.0     0.0     0.0     248.0   120.0
Walnuts Dropped 03/09/2017      US      30 g (0.25 cup) 6.7     16.67   13.33   3.33    0.0     0.0     0.0     0.0     0.0     67.0
Marionberry Pie 03/09/2017      US      135 g (0.25 PIE)        3.0     2.22    38.52   18.52   44.4    0.0     0.0     0.0     348.0   15.0
Cheerios        08/20/2016      United States   1 cup (28 g)    10.7    0.0     71.4    3.57    536.0   21.4    35.7    0.0     500.0   357.0
Chimmichurri    03/09/2017      US      15 g (1 tsp)    0.0     0.0     6.67    0.0     0.0     8.0     0.0     0.0     400.0   0.0
Soursop Juice   03/19/2017      France  350 ml (1 CAN)  0.0     0.0     12.57   11.0    0.0     18.9    0.0     0.0     31.0    6.0
Cut Broccoli    03/09/2017      US      85 g (1 cup)    2.4     2.35    3.53    1.18    176.4   35.3    0.0     0.0     12.0    24.0
Olive oil, basil and garlic tomato sauce        08/14/2015      United States   1/2 cup (125g)  1.6     1.6     8.0     5.6     120.0   7.2     0.0     0.0     408.0   16.0
Mayonnaise Light        07/31/2015      United States   1 Tbsp (15g)    0.0     0.0     6.67    0.0     200.0   0.0     0.0     2.67    733.0   0.0
100% pure orange juice with calcium & vitamin D 08/09/2015      United States   8 fl. oz (240 mL)       0.0     0.833   10.8    9.17    0.0     30.0    4.17    0.0     188.0   146.0
Wheatgerm Loaf  03/08/2017      France  0.0     4.8     9.7     0.0     4.3     0.0     0.0     0.0     0.0     393.700787402   0.0
Organic Cut Leaf Spinach        03/10/2017      US      81 g (1 cup)    1.2     2.47    3.7     1.23    925.8   1.5     0.0     0.0     148.0   99.0
Totinos Cheese Pizza Rolls      07/31/2017      United States   65 g    0.0     7.69    46.2    0.0     0.0     0.0     0.0     0.0     0.0     0.0
Source 0% MG Vanille    12/13/2015      Canada  100 g   0.0     4.0     5.0     4.0     60.0    2.4     6.0     0.0     50.0    100.0
Graham Crackers 03/09/2017      US      27 g (27 g)     3.7     7.41    74.07   18.52   0.0     0.0     0.0     0.0     519.0   74.0
Lasagna 03/09/2017      US      215 g (1 cup)   0.9     7.91    13.95   2.33    174.3   9.8     0.0     0.0     372.0   93.0
Indian Diced Tomatoes   03/09/2017      US      120 g (0.5 cup) 0.8     0.83    5.83    4.17    75.0    3.0     0.0     0.0     125.0   33.0
Sliced Ripe Olives      03/09/2017      US      16 g (2 Tbsp)   0.0     0.0     6.25    0.0     0.0     0.0     0.0     0.0     781.0   0.0
2% Chocolate Reduced Fat Milk   03/10/2017      United States   1 cup (240 mL)  0.417   4.17    15.0    14.2    62.5    0.5     4.17    0.0     91.7    146.0
goldfish baked crackers  flavour blasted        08/05/2017      Canada  20 g (34 crackers)      5.0     10.0    65.0    5.0     0.0     0.0     0.0     0.0     900.0   100.0
Citrus Green Tea        08/09/2015      United States   12 fl. oz (355 mL)      0.0     0.0     20.0    19.0    0.0     12.0    0.0     0.0     110.0   0.0
Chaussons tressés aux pommes    01/15/2017      Canada  150 g / 1 chausson      2.0     3.33    38.7    24.7    0.0     1.6     0.0     0.0     255.0   13.3
Enchilada Black bean and vegetable      03/09/2017      United States   135g    2.96    3.7     16.3    1.48    66.7    1.78    0.0     0.0     289.0   59.3
Concord Grape Fruit Snacks      03/09/2017      United States   40g     0.0     1.0     31.0    18.0    375.0   60.0    0.0     5.0     15.0    0.0
Grated English Medium Cheddar   02/14/2017      France  0.0     0.0     24.9    0.0     0.1     0.0     0.0     0.0     0.0     787.401574803   0.0
Toasted Multi-Grain Cereal With Almonds & Honey Oat Clusters    03/09/2017      US      32 g (0.75 cup) 6.2     9.38    81.25   18.75   1171.8  18.8    7.8     0.0     375.0   0.0
Ananas au jus   11/05/2016      France  0.0     1.0     0.5     12.1    12.0    0.0     28.5    0.0     0.0     39.3700787402   0.0
Quiche Lorraine 04/11/2017      Canada  280 g / 1/5 de la Quiche        0.357   5.36    7.86    0.714   10.7    0.429   0.0     0.0     196.0   28.6
Jus De Mangue Foco 350ML 0      03/18/2017      France  350 ml (1 CAN)  0.0     0.0     14.0    14.0    0.0     28.5    0.0     0.0     34.0    9.0
Kettle Cooked Potato Chips, Sweet Mesquite Barbeque     03/10/2017      US      28 g (18 CHIPS | ABOUT) 3.6     7.14    60.71   3.57    214.2   17.1    0.0     0.0     679.0   0.0
Organic Apple Raspberry Fruit Wrap      08/11/2015      United States   1 bar (14g)     7.14    0.0     85.7    78.6    0.0     17.1    0.0     0.0     0.0     0.0
QT Bruschetta Mix       07/26/2015      United States   1 container (207 g)     1.45    7.25    10.6    3.38    217.0   8.7     0.0     0.0     309.0   169.0
Cheese Twists   06/19/2017      France  0.0     3.3     12.1    50.5    2.6     0.0     0.0     0.0     0.0     728.346456693   0.0
Cranberry Classic       04/20/2017      France  0.0     0.0     0.0     12.0    12.0    0.0     24.0    0.0     0.0     0.0     0.0
Stone ground garbanzo bean flour        08/09/2015      United States   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0

5. regex5.pl (15%)

Markdown is a simplified markup language commonly used to write small bits of formatted text. Markdown is used to display readme files on GitHub, as well as to create the slides for this course. Before being displayed in a browser, markdown needs to be converted to HTML.

For this problem, you will focus on converting links written in markdown to the HTML <a> tag. You should assume that multiple links may appear on a given line.

	
#!/usr/bin/perl

# Convert any lines containing markdown for a link to HTML for a link.
# markdown links are formatted as follows:
#
# '['
#     All links start out with a single opening square bracket. Make sure that
#     the '[' is not preceded by a '!' character, as that indicates an image.
#
# 'link_text'
#     Next there is the text of the link. This is what will appear in the
#     rendered HTML page
#
# ']('
#     The link text is closed with a square bracket. Immediately following
#     this is an opening parenthesis, which indicates the start of the hyperlink
# 
# 'hyperlink'
#     This is the URL of the link. For the purposes of this assignment
#     all links will start with either 'http://' or 'https://'.
#     The actual rules regarding URLs are much more complex
# 
#  ')' 
#     Finally, the hyperlink is closed using a right parenthesis
#
# For example, the following markdown links...
#
#     [Schedule](https://www.csee.umbc.edu/~bwilk1/433/index.html#schedule)
#     [UMBC's Homepage](http://umbc.edu)
#
# ...would get converted to the following HTML links
#
#     <a href="https://www.csee.umbc.edu/~bwilk1/433/index.html#schedule">Schedule</a>
#     <a href="http://umbc.edu">UMBC's Homepage</a>
#     
#     The following should not be converted
#     ![A picture of UMBC](https://umbc.edu/picture.png)
#

while(<>) {
  s/REGEX/REPLACE/;
  print;
}

6. regex6.pl (15%)

Another common use of markdown is to indicate which text should be bold or italic. Bold text is denoted by placing two asterisks before and after the text to be bolded. Italic text is denoted by placing a single asterisk before and after the text to be italicized. You do not have to worry about text that is both bold and italic for this problem. You will be converting the markdown into the corresponding HTML tags, <strong> and <em>, respectively. This will take two regex substitutions to complete; choose the order you do them in carefully.

		
#!/usr/bin/perl

# Convert markdown formatting into the corresponding HTML tag.
# The markdown to be converted are ** and *.
# Print the line regardless of substitution.
#
# For example, the following lines...
#
#     "**this is bolded**"
#     "*this is italic*"
#     "*this has both italic* and **bold text**"
#
# ...would get converted to...
#
#     "<strong>this is bolded</strong>"
#     "<em>this is italic</em>"
#     "<em>this has both italic</em> and <strong>bold text</strong>"
#
# The following should be left unconverted
# 	* A string with no closing asterisk
# 	A string with no opening asterisk**
#

while (<>) {
   s/SEARCH/REPLACE/;
   s/SEARCH2/REPLACE2/;
   print;
}

7. regex7.pl (15%)

This problem involves finding and sanitizing credit card numbers from the 4 major credit card companies. For this problem you will follow the simple rules laid out in the code below. In the real world you would continue validation to make sure that it is a valid card number by applying the Luhn formula to compute the checksum of the card number and comparing it against the check digit (the last digit on the card).
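For reference only (the assignment does not require it), the Luhn check can be sketched in Perl as follows. Working from the rightmost digit, every second digit is doubled, 9 is subtracted from any doubled value above 9, and the number is accepted when the total is a multiple of 10. The subroutine name luhn_ok and the sample numbers are illustrative assumptions and are not part of the starter code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return true if a string of digits passes the Luhn check.
sub luhn_ok {
    my ($number) = @_;
    return 0 unless $number =~ /^\d+$/;

    # Walk the digits from right to left; index 0 is the check digit.
    my @digits = reverse split //, $number;
    my $sum = 0;
    for my $i (0 .. $#digits) {
        my $d = $digits[$i];
        if ($i % 2 == 1) {          # every second digit, counting from the right
            $d *= 2;
            $d -= 9 if $d > 9;      # same as adding the two digits of the product
        }
        $sum += $d;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

# A widely used test number, and the same number with its last digit changed.
for my $candidate ('4111111111111111', '4111111111111112') {
    printf "%s %s\n", $candidate, luhn_ok($candidate) ? 'passes' : 'fails';
}
```

A check like this would run after a pattern has already found a candidate number; it is separate from the matching rules in the table below.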

		
#!/usr/bin/perl

# Find and replace all instances of valid Visa, MasterCard, Discover or
# American Express card numbers with '...' followed by the last 4 numbers.
# The following table details the digits each card must begin with as well as
# the number of digits allowed for each card.  In order to be a valid card
# number it must not be surrounded by a "word" character (in the regex sense)
#
# Card Type               Starts With       Number Digits
# -------------------------------------------------------
# American Express        34 or 37          15
# Discover                65 or 6011        16
# MasterCard              51 through 55     16
# Visa                    4                 13 or 16
#
# For example, the following lines...
#
#     'not my amex: 341234567890123'
#     '412345678901234'
#     '6011012345678901 is a discover card'
#
# ...would get converted to...
#
#     'not my amex: ...0123'
#     '412345678901234'
#     '...8901 is a discover card'

while(<>) {
  s/REGEX/REPLACE/;
  print;
}

8. regex8.pl (15%)

Another very common use of substitution is to convert from one data format to another. In this problem you will be converting a file written in comma-separated values (CSV) into a JSON-like format. Do not worry about what JSON is, or whether this output is valid JSON; we will discuss this when we get to JavaScript and Web technologies later in the course. Each line in the file represents a state and some statistics about that state.

		
#!/usr/bin/perl

# Convert each line in the CSV into a JSON-like structure as shown below:
#
# {
#         name: "STATE_NAME",
#         year_joined: "YEAR",
#         area: "AREA",
#         governor: "GOVERNOR_NAME"
# }
#
# For example, the following line...
#
#     'Maryland,04/28/1788,MD,Annpolis,-76.7,39.0,Larry Hogan, 32133, 2--752,America/New York,https://www.maryland.gov,Henrietta Maria of France'
#
# ...would be reported as...
#
#     '{[NEWLINE][TAB]name: "Maryland",[NEWLINE][TAB]year_joined: "1788",[NEWLINE][TAB]area: "32133",[NEWLINE][TAB]governor: "Larry Hogan"[NEWLINE]}'
#
# ...where [NEWLINE] is literally a newline character and [TAB] is literally a
# tab character.

while(<>) {
  print if s/REGEX/REPLACE/;
}

The datafields in the file are as follows

The name of the state
The date the state joined the Union
The abbreviation for the state
The capital of the state
The longitude of the state
The latitude of the state
The governor of the state
The area of the state
The Dewey Decimal section for the state
The time zone of the state
The website for the state
The namesake of the state (Optional)

Running your code

To run your code, you should call the perl interpreter with your file name as the argument, and redirect an input file that you want to test with. An example call is shown below

		
		 perl regex1.pl < my_test_file

Submitting your code

Git on GL is outdated and requires a slightly different mechanism than you may be used to. On most systems, running a git command will prompt for your GitHub username and password. On GL, it expects the username as part of the command. One way to achieve this is to modify the clone command so that it reads:

git clone https://YOUR_GITHUB_USERNAME@github.com/YOUR_REPO_INFORMATION

Prior to doing this, you may need to run the command unset SSH_ASKPASS. This is so GL knows not to try and prompt you for your password using a GUI, and instead asks for it on the command line.

If you are using tcsh or another c-shell, you need to type unsetenv SSH_ASKPASS to prevent the GUI error.

Your code should be committed and pushed back to GitHub before the due date. DO NOT rename the files. You do not need to commit anything other than the .pl files.

How you will be graded

Each script will be run with a test file containing both negative and positive examples. The output of your script will be checked automatically to see if it is correct. You will lose between 0.5% and 1% per mistake, depending on the number of tests run.
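The actual grading scripts are not shown here. As a rough sketch of the idea, you could check a pattern of your own against a list of positive and negative examples before submitting. Everything in the example below is an assumption made for illustration: the helper name count_mistakes, the test lines, and the placeholder pattern, which is deliberately unrelated to any of the problems.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the test cases where the pattern's behaviour
# differs from the expected result. Each case is [line, should_match].
sub count_mistakes {
    my ($pattern, @cases) = @_;
    my $mistakes = 0;
    for my $case (@cases) {
        my ($line, $should_match) = @$case;
        my $did_match = ($line =~ $pattern) ? 1 : 0;
        if ($did_match != $should_match) {
            print "MISTAKE: '$line'\n";
            $mistakes++;
        }
    }
    return $mistakes;
}

# Placeholder pattern: the whole word "cat", in any case.
my $pattern = qr/\bcat\b/i;

my @cases = (
    ['The cat sat',  1],    # positive example
    ['CAT nap',      1],    # positive example, upper case
    ['concatenate',  0],    # negative example: "cat" inside a longer word
    ['no felines',   0],    # negative example
);

printf "%d mistake(s)\n", count_mistakes($pattern, @cases);
```

Including negative examples matters as much as positive ones, since a pattern that matches too much counts as a mistake just like one that matches too little.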