Regular Expressions (DUE: 2/21 11:59 PM EDT)
Updates
02/14/18: Updated Problem 2 to modify order requirements
This assignment will give you practice with regular expressions, both for the purposes of finding matches and performing substitutions. The assignment will be submitted through GitHub, and starter code for each problem in the assignment will be checked out to your repository when you start your assignment. To get started with the assignment, go to https://classroom.github.com/a/qvOJYcdi and sign in to your GitHub account. This will create a repository for you containing the starter code.
1. regex1.pl (10%)
In this problem, you will write a regular expression to identify lines containing a header element in an HTML document. The valid tags you are looking for are h1, h2, h3, h4, h5, and h6, in either upper or lower case. The tags may have attributes in the opening tag only, and there may be arbitrary content between the tags. You may assume that the '<' and '>' characters will not appear in any attributes.
#!/usr/bin/perl
# Print lines that contain an opening and closing HTML heading tag. Valid
# tags are h1, h2, h3, h4, h5, h6 regardless of case, and the closing tag
# must use the same number as the opening tag. Tags may contain attributes
# inside of the opening portion. Tags may also have any arbitrary content
# inside of them. You may assume that a '<' or a '>' character will not
# appear inside of an attribute's value portion.
#
# For example:
#
# '<h1>foo</h1>' matches
# '<h2 class="foo">foo</h2>' matches
# '<h3>bar</H3>' matches
# '<h4 class="blah" id="foo">baz</h4>' matches
# '<h5><span class="big">B</span>ig</h5>' matches
# '<h9>foo</h9>' does not match
# '<h1>something</h4>' does not match
# '<h1>foo bar baz...' does not match
while(<>) {
print if /REGEX/;
}
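One technique that may help when an opening and closing tag have to agree is a backreference. The sketch below is not a solution to this problem: it uses list tags (ul and ol) as a stand-in, and the helper name has_list_pair is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the line has an opening <ul> or <ol> tag and a
# closing tag of the same kind, in any mix of upper and lower case.
sub has_list_pair {
    my ($line) = @_;
    # (ul|ol) is capture group 1; \1 must repeat whatever group 1 captured.
    # [^>]* allows optional attributes; /i ignores case, including for \1.
    return $line =~ /<(ul|ol)\b[^>]*>.*<\/\1>/i ? 1 : 0;
}

for my $line ('<ul class="x">a</ul>', '<ol>a</OL>', '<ul>a</ol>') {
    printf "%-22s %s\n", $line, has_list_pair($line) ? 'matches' : 'does not match';
}
```

The first two sample lines are reported as matching and the third is not, because \1 insists on the same tag name that opened the element.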
2. regex2.pl (10%)
In the example below, it would be extraordinarily complex to prevent all matches from being out of order. So for the example of "5 minutes 1 day ago" not matching, it is fine if the line as a whole matches, but if we check the part of the line that matches, only "1 day ago" should have matched.
In this problem, you must identify relative time expressions in a variety of formats. These expressions will consist of one or more units of time, plus the string 'ago' for times in the past, and 'from now' for times in the future. The units of time you are expected to recognize are minutes, hours, days, and years. Assume any number is valid, so someone could say 40 days ago.
#!/usr/bin/perl
# Print line if it contains a relative time expression expressed in any combination
# of minutes, hours, days, and years.
#
# Note: 1 of any unit should not use the plural form of the unit.
# The combination of units should be in logical order
#
# For example:
#
# "5 years ago" matches
# "5 minutes from now" matches
# "1 day 5 minutes ago" matches
# "5 minutes 1 day ago" does not match
# "from now 5 minutes" does not match
# "10 seconds ago" does not match
# "a day ago" does not match
#
while (<>) {
print if /REGEX/;
}
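Since the note above talks about checking which part of a line matched, it can be useful to print that part while testing. A minimal sketch, assuming a made-up helper named matched_part and a toy pattern about heights that has nothing to do with the time expressions you need to write:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the exact substring a pattern matched,
# or undef when the line does not match at all.
sub matched_part {
    my ($line, $regex) = @_;
    # $& holds the text matched by the last successful pattern match.
    return $line =~ $regex ? $& : undef;
}

# Toy pattern, not a solution: a number, then "cm" or "m", then "tall".
my $toy = qr/\d+ (?:cm|m) tall/;

for my $line ('about 180 cm tall', '2 m 180 cm tall', 'very tall') {
    my $part = matched_part($line, $toy);
    print "$line => ", defined $part ? "[$part]" : 'no match', "\n";
}
```

For the second sample line, the line as a whole matches, but the matched part is only "180 cm tall". This is the same situation as "5 minutes 1 day ago" described above.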
3. regex3.pl (10%)
In this problem, you must match lines from the output of git log that show files that have only been added to or only been deleted from, not files that have had both operations done. The git log format we will be using is a multi-line format organized as follows:
HEX COMMIT_MESSAGE
FileName | NUM_LINES CHANGE_TYPES
FileName | NUM_LINES CHANGE_TYPES
....
FileName | NUM_LINES CHANGE_TYPES
N file changed, X insertions(+), Y deletions(-)
An example of the output from this command is shown below:
008f60c4d3b904d37bdcaf49d654656cb2fab6bf Making sure airport data is there
data/airports.tsv | 1284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 1284 insertions(+)
448b476ecd22466f250137d0207c0b261d0dbda2 Intial Ulpoad of Lecture 2
Lecture02.ipynb | 1208 ++++++++++++++++++++++---------------------------------
1 file changed, 488 insertions(+), 720 deletions(-)
18589f64b676c74f0665098ee51063305b095ff7 Completed Lecture 1
Lecture01.ipynb | 697 ++++++++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 608 insertions(+), 89 deletions(-)
c2f185a441e4d1bf108fc4949d7fa8ce6735a572 PostBuild appears to be working, removing early exit
binder/postBuild | 1 -
1 file changed, 1 deletion(-)
5f29169b33f0a25aafef5ab8246c553d8c0a963f Update set_envs.py to not error on empty lines)
binder/perl_env | 1 -
binder/set_envs.py | 3 ++-
2 files changed, 2 insertions(+), 2 deletions(-)
51b87421372cc59739231cf885edc58adced5af3 Debugging variable setting perl
binder/postBuild | 2 ++
1 file changed, 2 insertions(+)
For this problem, you should return any line that displays information about a file and whose CHANGE_TYPES string consists entirely of +'s or entirely of -'s.
#!/usr/bin/perl
# Match lines of git log --stat --pretty=oneline that indicate files with only additions or only deletions
#
# The lines you are interested in have the format:
#
# FileName | NUM_LINES CHANGE_TYPES
#
# Where...
#
# CHANGE_TYPES is entirely +'s or entirely -'s
#
# For example:
# 'data/airports.tsv | 1284 +++++++++++++++++++++++++++++++++++++++++++++++++++++' matches
# 'binder/postBuild | 1 -' matches
# 'Lecture02.ipynb | 1208 ++++++++++++++++++++++---------------------------------' does not match
# '008f60c4d3b904d37bdcaf49d654656cb2fab6bf Making sure airport data is there' does not match
# '1 file changed, 2 insertions(+)' does not match
while(<>) {
print if /REGEX/;
}
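When a run of characters has to be all of one kind, an alternation of two repeated characters behaves differently from a character class. The sketch below illustrates this on a made-up "bar: ..." line format; the helper name is_uniform_bar and the format are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the text after "bar: " is made of a single
# repeated symbol, either all '=' or all '#', and nothing else.
sub is_uniform_bar {
    my ($line) = @_;
    # The alternation picks one symbol for the whole run; a character class
    # such as [=#]+ would also accept mixed runs like "==##".
    return $line =~ /^bar: (?:=+|#+)$/ ? 1 : 0;
}

for my $line ('bar: =====', 'bar: ###', 'bar: ==##') {
    print "$line => ", is_uniform_bar($line) ? 'uniform' : 'mixed or invalid', "\n";
}
```

The ^ and $ anchors matter here: without them, the pattern could match just a uniform piece of a mixed run.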
4. regex4.pl (10%)
This problem requires you to process a tab separated values (TSV) data file of food
facts, modified from data available from Open Food Facts (world.openfoodfacts.org),
and print lines that meet the following criteria:
- Sold in the United States
- Data updated in 2017 or 2016
- Contains 1 unit or more of vitamin C
#!/usr/bin/perl
# Match foods that are sold in the United States, contain greater than or equal to 1 unit of vitamin C, and
# whose information was updated in 2016 or 2017
#
# For example:
# 'Tomato Paste 03/09/2017 US 33 g (2 Tbsp) 6.1 6.06 21.21 12.12 363.6 14.5 0.0 0.0 45.0 61.0' matches
# 'Reese minis 12/23/2016 Canada 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0' does not match
# 'Tootie Fruities 03/09/2017 United States 1 cup (32g) 3.12 6.25 87.5 46.9 469.0 18.7 12.5 0.0 469.0 312.0' matches
# '100% pure orange juice with calcium & vitamin D 08/09/2015 United States 8 fl. oz (240 mL) 0.0 0.833 10.8 9.17 0.0 30.0 4.17 0.0 188.0 146.0' does not match
#
while(<>) {
print if /REGEX/;
}
A sample data file, along with the meaning of each column, is shown below:
# Data from world.openfoodfacts.org
#
# Fields separated by tabs are:
#
# 1) Product Name
# 2) Date information was last updated
# 3) The country the food is sold in (NOTE: US and United States both appear)
# 4) The serving size
# 5) The amount of fiber
# 6) The amount of protein
# 7) The amount of carbohydrates
# 8) The amount of sugar
# 9) The amount of Vitamin A
# 10) The amount of Vitamin C
# 11) The amount of Vitamin D
# 12) The amount of Vitamin E
# 13) The amount of Sodium
# 14) The amount of Calcium
Galettes de Pommes de terre Classiques 12/16/2016 Canada 60 g / 1 galette 1.67 1.67 25.0 0.0 0.0 0.0 0.0 0.0 350.0 0.0
Reese minis 12/23/2016 Canada 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Classic Cheddar Pierogies 03/10/2017 United States 114g 0.877 4.39 27.2 0.877 0.0 5.26 0.0 0.0 404.0 17.5
Four Cheese Mashed Potatoes 03/09/2017 US 28 g (0.25 cup) 3.6 7.14 71.43 7.14 0.0 12.9 0.0 0.0 2036.0 143.0
Enriched Macaroni Product, Small Shells 03/10/2017 US 56 g (0.5 cup) 3.6 12.5 75.0 1.79 0.0 0.0 0.0 0.0 0.0 0.0
Tootie Fruities 03/09/2017 United States 1 cup (32g) 3.12 6.25 87.5 46.9 469.0 18.7 12.5 0.0 469.0 312.0
Party Rainbow Chip cake mix 11/26/2016 United States 43 g 2.33 2.33 81.4 41.9 0.0 0.0 0.0 0.0 744.0 186.0
Lay's Classic 08/09/2015 United States 1 oz (28 g) 3.57 7.14 53.6 3.57 0.0 21.4 0.0 4.29 1250.0 0.0
Chocolate Chex 08/15/2015 United States 3/4 cup (32g) 3.12 6.25 81.2 25.0 469.0 18.7 12.5 0.0 625.0 312.0
Jif Beurre De Cacahuettes Extra Crunchy 03/16/2017 France 0.0 2.0 7.0 0.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0
Orangina Sparkling Citrus Beverage 03/11/2017 United States 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Petite Cut, Diced Tomatoes With Green Chilies, Medium 03/09/2017 US 126 g (0.5 cup) 0.8 1.59 5.56 3.17 178.5 14.3 0.0 0.0 238.0 32.0
Juice Cocktail 03/09/2017 US 240 ml (8 fl oz) 0.0 0.0 12.5 12.5 0.0 25.0 0.0 0.0 33.0 0.0
Argentinian Red Shrimp 07/26/2015 United States 4 oz (113 g) 0.0 13.3 0.0 0.0 53.1 1.06 0.0 0.0 142.0 17.7
Papaya 03/10/2017 United States 1 can (340 mL) 0.0 0.0 13.8 13.2 0.0 17.6 0.0 0.0 5.88 0.0
Diced Tomatoes, Basil, Garlic & Oregano 03/09/2017 US 123 g (0.5 cup) 0.8 0.81 5.69 4.07 183.0 7.3 0.0 0.0 220.0 33.0
Tomato Ketchup 03/09/2017 United States 1 tbsp (17 g) 0.0 0.0 23.5 23.5 0.0 0.0 0.0 0.0 941.0 0.0
Extra Crispy Shoestring French Fried Potatoes 03/09/2017 US 85 g (3 oz) 2.4 2.35 27.06 0.0 0.0 2.8 0.0 0.0 365.0 0.0
Shortcake, Lemon & Cream 03/09/2017 US 80 g (1 SLICE) 0.0 3.75 35.0 25.0 187.5 3.0 0.0 0.0 200.0 50.0
Apple Pommes 03/10/2017 US 154 g (1 MEDIUM APPLE) 3.2 0.0 14.29 10.39 19.5 3.1 0.0 0.0 0.0 0.0
Tropical Gold Premium Pineapple Chunks in Juice 12/13/2015 France 0.0 0.1 0.4 15.0 12.0 0.0 32.0 0.0 0.0 100.0 0.0
Classico Four Cheese Pasta Sauce 03/09/2017 United States 125 g 1.6 1.6 8.0 4.8 72.0 2.88 0.0 0.0 392.0 48.0
Danoises à la cannelle roulées 01/15/2017 Canada 146 g / 1 danoise 2.05 4.79 54.1 28.1 205.0 6.16 0.0 0.0 363.0 54.8
Yellow Rice Mix With Saffron 03/10/2017 US 56 g (0.333 CUP RICE AND 1.5 TBSP SEASONING (1 CUP PREPARED) | ABOUT) 1.8 7.14 78.57 1.79 0.0 0.0 0.0 0.0 1304.0 36.0
Chocolate Chip Cookies 08/07/2017 Canada 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Oikos, Greek Yogurt, Key Lime 03/09/2017 US 150 g (150 g) 0.0 7.33 12.0 11.33 20.1 0.0 0.0 0.0 53.0 100.0
V8 original 08/09/2015 United States 8 fl. oz (240 mL) 0.833 0.833 4.17 2.92 250.0 30.0 0.0 0.0 271.0 16.7
0% Fat Greek Style Yogurt With Honey 04/08/2017 France 0.0 0.5 6.5 0.0 11.8 0.0 0.0 0.0 0.0 70.8661417323 0.0
Cottage Cheese 08/09/2015 Canada 125 g 0.0 11.2 5.6 4.0 24.0 0.0 0.0 0.0 248.0 120.0
Walnuts Dropped 03/09/2017 US 30 g (0.25 cup) 6.7 16.67 13.33 3.33 0.0 0.0 0.0 0.0 0.0 67.0
Marionberry Pie 03/09/2017 US 135 g (0.25 PIE) 3.0 2.22 38.52 18.52 44.4 0.0 0.0 0.0 348.0 15.0
Cheerios 08/20/2016 United States 1 cup (28 g) 10.7 0.0 71.4 3.57 536.0 21.4 35.7 0.0 500.0 357.0
Chimmichurri 03/09/2017 US 15 g (1 tsp) 0.0 0.0 6.67 0.0 0.0 8.0 0.0 0.0 400.0 0.0
Soursop Juice 03/19/2017 France 350 ml (1 CAN) 0.0 0.0 12.57 11.0 0.0 18.9 0.0 0.0 31.0 6.0
Cut Broccoli 03/09/2017 US 85 g (1 cup) 2.4 2.35 3.53 1.18 176.4 35.3 0.0 0.0 12.0 24.0
Olive oil, basil and garlic tomato sauce 08/14/2015 United States 1/2 cup (125g) 1.6 1.6 8.0 5.6 120.0 7.2 0.0 0.0 408.0 16.0
Mayonnaise Light 07/31/2015 United States 1 Tbsp (15g) 0.0 0.0 6.67 0.0 200.0 0.0 0.0 2.67 733.0 0.0
100% pure orange juice with calcium & vitamin D 08/09/2015 United States 8 fl. oz (240 mL) 0.0 0.833 10.8 9.17 0.0 30.0 4.17 0.0 188.0 146.0
Wheatgerm Loaf 03/08/2017 France 0.0 4.8 9.7 0.0 4.3 0.0 0.0 0.0 0.0 393.700787402 0.0
Organic Cut Leaf Spinach 03/10/2017 US 81 g (1 cup) 1.2 2.47 3.7 1.23 925.8 1.5 0.0 0.0 148.0 99.0
Totinos Cheese Pizza Rolls 07/31/2017 United States 65 g 0.0 7.69 46.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Source 0% MG Vanille 12/13/2015 Canada 100 g 0.0 4.0 5.0 4.0 60.0 2.4 6.0 0.0 50.0 100.0
Graham Crackers 03/09/2017 US 27 g (27 g) 3.7 7.41 74.07 18.52 0.0 0.0 0.0 0.0 519.0 74.0
Lasagna 03/09/2017 US 215 g (1 cup) 0.9 7.91 13.95 2.33 174.3 9.8 0.0 0.0 372.0 93.0
Indian Diced Tomatoes 03/09/2017 US 120 g (0.5 cup) 0.8 0.83 5.83 4.17 75.0 3.0 0.0 0.0 125.0 33.0
Sliced Ripe Olives 03/09/2017 US 16 g (2 Tbsp) 0.0 0.0 6.25 0.0 0.0 0.0 0.0 0.0 781.0 0.0
2% Chocolate Reduced Fat Milk 03/10/2017 United States 1 cup (240 mL) 0.417 4.17 15.0 14.2 62.5 0.5 4.17 0.0 91.7 146.0
goldfish baked crackers flavour blasted 08/05/2017 Canada 20 g (34 crackers) 5.0 10.0 65.0 5.0 0.0 0.0 0.0 0.0 900.0 100.0
Citrus Green Tea 08/09/2015 United States 12 fl. oz (355 mL) 0.0 0.0 20.0 19.0 0.0 12.0 0.0 0.0 110.0 0.0
Chaussons tressés aux pommes 01/15/2017 Canada 150 g / 1 chausson 2.0 3.33 38.7 24.7 0.0 1.6 0.0 0.0 255.0 13.3
Enchilada Black bean and vegetable 03/09/2017 United States 135g 2.96 3.7 16.3 1.48 66.7 1.78 0.0 0.0 289.0 59.3
Concord Grape Fruit Snacks 03/09/2017 United States 40g 0.0 1.0 31.0 18.0 375.0 60.0 0.0 5.0 15.0 0.0
Grated English Medium Cheddar 02/14/2017 France 0.0 0.0 24.9 0.0 0.1 0.0 0.0 0.0 0.0 787.401574803 0.0
Toasted Multi-Grain Cereal With Almonds & Honey Oat Clusters 03/09/2017 US 32 g (0.75 cup) 6.2 9.38 81.25 18.75 1171.8 18.8 7.8 0.0 375.0 0.0
Ananas au jus 11/05/2016 France 0.0 1.0 0.5 12.1 12.0 0.0 28.5 0.0 0.0 39.3700787402 0.0
Quiche Lorraine 04/11/2017 Canada 280 g / 1/5 de la Quiche 0.357 5.36 7.86 0.714 10.7 0.429 0.0 0.0 196.0 28.6
Jus De Mangue Foco 350ML 0 03/18/2017 France 350 ml (1 CAN) 0.0 0.0 14.0 14.0 0.0 28.5 0.0 0.0 34.0 9.0
Kettle Cooked Potato Chips, Sweet Mesquite Barbeque 03/10/2017 US 28 g (18 CHIPS | ABOUT) 3.6 7.14 60.71 3.57 214.2 17.1 0.0 0.0 679.0 0.0
Organic Apple Raspberry Fruit Wrap 08/11/2015 United States 1 bar (14g) 7.14 0.0 85.7 78.6 0.0 17.1 0.0 0.0 0.0 0.0
QT Bruschetta Mix 07/26/2015 United States 1 container (207 g) 1.45 7.25 10.6 3.38 217.0 8.7 0.0 0.0 309.0 169.0
Cheese Twists 06/19/2017 France 0.0 3.3 12.1 50.5 2.6 0.0 0.0 0.0 0.0 728.346456693 0.0
Cranberry Classic 04/20/2017 France 0.0 0.0 0.0 12.0 12.0 0.0 24.0 0.0 0.0 0.0 0.0
Stone ground garbanzo bean flour 08/09/2015 United States 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
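While developing your regex, it can help to look at a record column by column. The sketch below is a debugging aid, not a solution (the assignment asks for a single regex): it splits one record on tabs and labels the fields using the column list above. The helper name parse_food_line and the short key names are my own, and the sample record assumes the fields of the Tomato Paste example are separated by single tabs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one TSV record into a hash keyed by column name.
sub parse_food_line {
    my ($line) = @_;
    # Short keys invented for this sketch, in the order of the field list.
    my @columns = qw(name updated country serving fiber protein carbs sugar
                     vit_a vit_c vit_d vit_e sodium calcium);
    chomp $line;
    my @values = split /\t/, $line, -1;   # -1 keeps trailing empty fields
    my %record;
    @record{@columns} = @values;          # hash slice: pair keys with values
    return \%record;
}

my $line = join "\t", 'Tomato Paste', '03/09/2017', 'US', '33 g (2 Tbsp)',
    qw(6.1 6.06 21.21 12.12 363.6 14.5 0.0 0.0 45.0 61.0);
my $food = parse_food_line($line);
print "$food->{name}: country=$food->{country}, vitamin C=$food->{vit_c}\n";
```

Printing the labeled fields for a handful of records makes it easier to decide by hand which lines your regex should and should not match.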
5. regex5.pl (15%)
Markdown is a simplified markup language commonly used to write small bits of
formatted text. Markdown is used to display readme files on GitHub, as well
as to create the slides for this course. Before being displayed in a browser, markdown
needs to be converted to HTML.
For this problem, you will focus on converting links written in markdown to the HTML <a> tag.
You should assume that multiple links may appear on a given line.
#!/usr/bin/perl
# Convert any lines containing markdown for a link to HTML for a link
# markdown links are formatted as follows:
#
# '['
# All links start out with a single opening square bracket. Make sure that
# the '[' is not preceded by a '!' character, as that indicates an image.
#
# 'link_text'
# Next there is the text of the link. This is what will appear in the
# rendered HTML page
#
# ']('
# The link text is closed with a square bracket. Immediately following
# this is a parenthesis, which indicates the start of the hyperlink
#
# 'hyperlink'
# This is the URL of the link. For the purposes of this assignment
# all links will start with either 'http://' or 'https://'.
# The actual rules regarding URLs are much more complex
#
# ')'
# Finally, the hyperlink is closed using a right parenthesis
#
# For example, the following markdown links...
#
# [Schedule](https://www.csee.umbc.edu/~bwilk1/433/index.html#schedule)
# [UMBC's Homepage](http://umbc.edu)
#
# ...would get converted to the following HTML links
#
# <a href="https://www.csee.umbc.edu/~bwilk1/433/index.html#schedule">Schedule</a>
# <a href="http://umbc.edu">UMBC's Homepage</a>
#
# The following should not be converted
# 
#
while(<>) {
s/REGEX/REPLACE/;
print;
}
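Two regex features are often useful for substitutions like this one: the /g modifier, which replaces every occurrence on a line rather than only the first, and a negative lookbehind, which rejects a match based on the character before it. The sketch below shows both on an unrelated toy task (turning "#12" style issue references into "issue-12" unless the '#' is part of an HTML entity such as "&#169;"). The helper name link_issues is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rewrite "#<number>" as "issue-<number>" everywhere,
# except when the '#' directly follows '&' (as in an HTML entity).
sub link_issues {
    my ($text) = @_;
    # (?<!&) is a negative lookbehind: the '#' must not be preceded by '&'.
    # (\d+) captures the number for reuse as $1; /g replaces every occurrence.
    $text =~ s/(?<!&)#(\d+)/issue-$1/g;
    return $text;
}

print link_issues('fixes #12 and #7, see &#169;'), "\n";
```

Both references on the sample line are rewritten, and the entity is left alone because its '#' follows '&'.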
6. regex6.pl
Another common use of markdown is to indicate what text should be bold or italic.
Bold text is denoted by placing two asterisks before and after the text to be bolded.
Italic text is denoted by placing a single asterisk before and after the text to be italicized.
You do not have to worry about text that is both bold and italic for this problem. For this problem
you will be converting the markdown into the corresponding HTML tags of <strong> and <em>,
respectively. This will take two regex substitutions to complete; choose the order you do them
in carefully.
#!/usr/bin/perl
# Convert markdown formatting into the corresponding HTML tag.
# The markdown to be converted are ** and *.
# Print the line regardless of substitution.
#
# For example, the following lines...
#
# "**this is bolded**"
# "*this is italic*"
# "*this has both italic* and **bold text**"
#
# ...would get converted to...
#
# "<strong>this is bolded</strong>"
# "<em>this is italic</em>"
# "<em>this has both italic</em> and <strong>bold text</strong>"
#
# The following should be left unconverted
# * A string with no closing asterisk
# A string with no opening asterisk**
#
while (<>) {
s/SEARCH/REPLACE/;
s/SEARCH2/REPLACE2/;
print;
}
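To see why the order of two substitutions can matter, consider a toy notation (not the one in this problem) where "~~text~~" becomes <del> and "~text~" becomes <sub>. The sketch below applies the two rules in either order; the helper name convert and its second argument are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the double-tilde and single-tilde rules to $text.
# When $double_first is true the "~~" rule runs first, otherwise second.
sub convert {
    my ($text, $double_first) = @_;
    my @steps = (
        sub { $_[0] =~ s{~~(.+?)~~}{<del>$1</del>}g },   # .+? is non-greedy
        sub { $_[0] =~ s{~(.+?)~}{<sub>$1</sub>}g },
    );
    @steps = reverse @steps unless $double_first;
    $_->($text) for @steps;   # each step edits $text in place through $_[0]
    return $text;
}

my $sample = '~~gone~~ H~2~O';
print "double first: ", convert($sample, 1), "\n";
print "single first: ", convert($sample, 0), "\n";
```

Running the double-delimiter rule first produces the intended tags. Running the single-delimiter rule first lets it consume pieces of the double delimiters, so the result is mangled.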
7. regex7.pl (15%)
This problem involves finding and sanitizing credit card numbers from the 4 major credit card companies. For this problem you will follow the simple rules laid out in the code below. In the real world, you would go on to validate the card number by using the Luhn formula to compute its checksum and comparing the result against the check digit (the last digit on the card).
#!/usr/bin/perl
# Find and replace all instances of valid Visa, MasterCard, Discover or
# American Express card numbers with '...' followed by the last 4 numbers.
# The following table details the digits each card must begin with as well as
# the number of digits allowed for each card. In order to be a valid card
# number it must not be surrounded by a "word" character (in the regex sense)
#
# Card Type Starts With Number Digits
# -------------------------------------------------------
# American Express 34 or 37 15
# Discover 65 or 6011 16
# MasterCard 51 through 55 16
# Visa 4 13 or 16
#
# For example, the following lines...
#
# 'not my amex: 341234567890123'
# '412345678901234'
# '6011012345678901 is a discover card'
#
# ...would get converted to...
#
# 'not my amex: ...0123'
# '412345678901234'
# '...8901 is a discover card'
while(<>) {
s/REGEX/REPLACE/;
print;
}
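The Luhn check mentioned above is not required for this problem, but for reference it can be sketched in a few lines. The helper name luhn_valid is made up, and the sample numbers are arbitrary digit strings chosen for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if a string of digits passes the Luhn checksum.
sub luhn_valid {
    my ($number) = @_;
    return 0 unless $number =~ /^\d+$/;
    my $sum    = 0;
    my $double = 0;   # the rightmost digit (the check digit) is not doubled
    for my $digit (reverse split //, $number) {
        my $value = $digit;
        if ($double) {
            $value *= 2;
            $value -= 9 if $value > 9;   # same as adding the two digits
        }
        $sum += $value;
        $double = !$double;              # double every second digit
    }
    return $sum % 10 == 0 ? 1 : 0;
}

for my $number ('4111111111111111', '4111111111111112') {
    print "$number => ", luhn_valid($number) ? 'passes' : 'fails', "\n";
}
```

The first sample number passes and the second, which differs only in its last digit, fails.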
8. regex8.pl (15%)
Another very common use of substitution is to convert from one data format to another. In this problem you will be
converting a file written in comma separated values (CSV) into a JSON-like format. Do not worry about what JSON is,
or whether this output is valid JSON; we will discuss this when we get to JavaScript and Web technologies later in the course.
Each line in a file represents a state and some statistics about that state.
#!/usr/bin/perl
# Convert each line in the CSV into a JSON-like structure as shown below:
#
# {
# name: "STATE_NAME",
# year_joined: "YEAR",
# area: "AREA",
# governor: "GOVERNOR_NAME"
# }
#
# For example, the following line...
#
# 'Maryland,04/28/1788,MD,Annpolis,-76.7,39.0,Larry Hogan, 32133, 2--752,America/New York,https://www.maryland.gov,Henrietta Maria of France'
#
# ...would be reported as...
#
# '{[NEWLINE][TAB]name: "Maryland",[NEWLINE][TAB]year_joined: "1788",[NEWLINE][TAB]area: "32133",[NEWLINE][TAB]governor: "Larry Hogan"[NEWLINE]}'
#
# ...where [NEWLINE] is literally a newline character and [TAB] is literally a
# tab character.
while(<>) {
print if s/REGEX/REPLACE/;
}
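Two details of this kind of substitution are worth seeing in isolation: [^,]* matches one comma-separated field whether or not you capture it, and \n and \t in the replacement produce a real newline and tab. The sketch below uses a made-up four-field product record, not the state data, and the helper name to_record is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "name,sku,price,color" into a two-field,
# JSON-like block, or return undef if the line does not have four fields.
sub to_record {
    my ($line) = @_;
    chomp $line;
    # [^,]* matches one field; only the fields wrapped in (...) are captured.
    # \n and \t in the replacement become a literal newline and tab.
    $line =~ s/^([^,]*),[^,]*,[^,]*,([^,]*)$/{\n\tname: "$1",\n\tcolor: "$2"\n}/
        or return;
    return $line;
}

print to_record('widget,SKU-1,4.50,blue'), "\n";
```

Fields you do not need are still matched so the pattern stays aligned with the columns; they are simply left out of the replacement.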
The data fields in the file are as follows:
- The name of the state
- The date the state joined the Union
- The abbreviation for the state
- The capital of the state
- The longitude of the state
- The latitude of the state
- The governor of the state
- The area of the state
- The Dewey Decimal section for the state
- The time zone of the state
- The website for the state
- The namesake of the state (Optional)
Running your code
To run your code, call the perl interpreter with your file name as the argument, and redirect an input file that you want to test with. An example call is shown below:
perl regex1.pl < my_test_file
Submitting your code
Git on GL is outdated and requires a slightly different mechanism than you may be used to. On most systems, running a git command will prompt for your GitHub username and password. On GL, it expects the username as part of the command. One way to achieve this is to modify the clone command so that it reads
git clone https://YOUR_GITHUB_USERNAME@github.com/YOUR_REPO_INFORMATION
Prior to doing this, you may need to run the command
unset SSH_ASKPASS
This is so GL knows not to try to prompt you for your password using a GUI, and instead asks for it on the command line.
If you are using tcsh or another C shell, you need to type
unsetenv SSH_ASKPASS
to prevent the GUI error.
Your code should be committed and pushed back to GitHub before the due date. DO NOT rename the files. You do not need to commit anything other than the .pl files.
How you will be graded
Each script will be run with a test file containing both negative and positive examples. The output of your script will be checked automatically to see if it is correct. You will lose between 0.5% and 1% per mistake, depending on the number of tests run.