Regex Exercises (POSTPONED)

Due Tue, Jun 06, 2017 by midnight

Work in Pairs for this one. Only one pull request needed per pair. Reflections always important.

Submit a well-formatted pull request to our class blog with embedded Trinket programs for the below three exercises from Chapter X (use these instead of the ones in the book - I added a few explanations). Complete these on your own, using only the materials in this Chapter. Do not look at other students’ submissions until after you’ve completed your work.


Exercise 1: Write a simple program to (kind-of) simulate the operation of the grep command on Unix. Ask the user to enter a regular expression and count the number of lines that matched the regular expression. Unlike the real grep, your progam shuld always search mbox-short.txt.

Enter a regular expression: ^Author
mbox-short.txt had 1798 lines that matched ^Author

Enter a regular expression: ^X-
mbox-short.txt had 14368 lines that matched ^X-


Enter a regular expression: java$
mbox-short.txt had 4218 lines that matched java$

Exercise 2: Write a program to look for lines of the form

New Revision: 39772 and extract the number from each of the lines using a regular expression and the findall() method. Compute the average of the numbers and print out the average.

Enter file:mbox.txt 38549.7949721

Enter file:mbox-short.txt 39756.9259259

Exercise 3: Write a program that prompts the user for a regular expression and a column number n and returns all rows which nth column has a match with the regex. This program should have the .csv data we previuously used loaded into a list of lists. You can re-use code from a previous assignment, but duplicate it in Trinket so you can save the original embeds. Make sure your table is printed as tab separated values. The "\t".join() thingie will be helpful here.

This program will let users quickly select all rows in a table that match a given regex in a given column. So I could find all Johns or all transactions that took place in a range of states, etc.

Elliott Hauser is a PhD Student in information science at UNC Chapel Hill. He's hacking education as one of the cofounders of Trinket.io. Find Elliott Hauser on Twitter, Github, and on the web.