Homework 4: Scanners and Patterns

## Introduction

In this homework, you'll learn some things that we won't talk about in lecture: classes and methods for searching strings for selected patterns and for reading formatted input.

Warning: There's a lot of jargon for this homework. Sorry! Focus on the experiments and writing code and it should all come together.

As usual, you can obtain the skeleton with

$ git fetch shared
$ git merge shared/hw4
• The sequence (?m) always matches the empty string, but has a side effect of causing ^ and $ to match the beginnings and ends of lines as well as of entire strings.
• The two-character escape sequences \?, \*, \., \+, etc., match the character after the backslash, ignoring their special significance. Thus, the pattern who\? matches the string "who?", and would be written in a program as the string literal "who\\?".

### Experiment #2: Matching

Compile and run the Matching class. This class allows you to type in strings and patterns and see if the entire string matches the pattern. If you include any groups (read ahead if you're curious), it will also print those. For example:

$ java Matching
Alternately type strings to match and patterns to match against
them. Use \ at the end of line to enter multi-line strings or
patterns (\s are removed, leaving newlines).  The program
will indicate whether each pattern matches the ENTIRE
preceding string.  Enter QUIT to end the program.
String: 123456
Pattern: [0-9]{6}
Matches.
String: 123456
Pattern: [0-9]{5}
No match.
String: 12345
Pattern: [0-9]{6}
No match.
String: abdeffff
Pattern: ab(c|de)f+
Matches.
Group 1: 'de'
String: abbbbdefefgg*h
Pattern: a(b+)d(ef)+gg\*h
Matches.
Group 1: 'bbbb'
Group 2: 'ef'
String: QUIT
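
The session above can be approximated in code. The following is a minimal sketch, not the skeleton's actual Matching class: the class name MatchingSketch and the helper wholeMatchGroups are hypothetical names chosen for illustration. It assumes only the standard java.util.regex classes, and shows whole-string matching, group extraction, the doubled backslash needed in string literals, and the effect of (?m).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the core of Matching; not the skeleton's code. */
class MatchingSketch {

    /** Returns null if PATTERN does not match all of TEXT; otherwise the
     *  text captured by groups 1..n, in order (empty if no groups). */
    static List<String> wholeMatchGroups(String text, String pattern) {
        Matcher m = Pattern.compile(pattern).matcher(text);
        if (!m.matches()) {   // matches() requires the ENTIRE string to match
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i += 1) {
            groups.add(m.group(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(wholeMatchGroups("123456", "[0-9]{6}"));     // []
        System.out.println(wholeMatchGroups("123456", "[0-9]{5}"));     // null
        System.out.println(wholeMatchGroups("abdeffff", "ab(c|de)f+")); // [de]
        // In a string literal, each backslash of the pattern is doubled.
        System.out.println(
            wholeMatchGroups("abbbbdefefgg*h", "a(b+)d(ef)+gg\\*h"));   // [bbbb, ef]
        System.out.println(wholeMatchGroups("who?", "who\\?"));         // []
        // With (?m), ^ and $ also match at line boundaries inside the string.
        System.out.println(wholeMatchGroups("a\nb", "(?m)^a$\\n^b$"));  // []
        System.out.println(wholeMatchGroups("a\nb", "^a$\\n^b$"));      // null
    }
}
```

Note that a repeated group such as (ef)+ reports only its last repetition, which is why Group 2 is 'ef' rather than 'efef'.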

Use this class to experiment with how patterns work. Try writing patterns that match the following. A sample answer is given for each problem.

• A single digit between 5 and 8. Answer: [5-8]
• Sequences of lower case letters. Answer: [a-z]+
• Sequences of lower case letters except the letter j. Answer: [a-ik-z]+
• Sequences of characters that start with the uppercase letter A and end with the letter f. Answer: A.*f
• Sequences of three words separated by spaces, where a word is defined as a sequence of lower case letters. Answer: [a-z]+ +[a-z]+ +[a-z]+
• Sequences of three words separated by spaces, and where group 1 corresponds to the second word. Answer: [a-z]+ +([a-z]+) +[a-z]+
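
The sample answers above can also be checked outside of the Matching program. This is a small sketch under stated assumptions: the class name SampleAnswers and its helpers whole and secondWord are hypothetical, and the test strings are made up for illustration. It relies only on Pattern.matches and Matcher.group from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical checker for the sample answers; names are illustrative. */
class SampleAnswers {

    /** True iff PATTERN matches the entire string TEXT. */
    static boolean whole(String pattern, String text) {
        return Pattern.matches(pattern, text);
    }

    /** Group 1 (the second word) of a three-word string TEXT, or null
     *  if TEXT is not three lower-case words separated by spaces. */
    static String secondWord(String text) {
        Matcher m = Pattern.compile("[a-z]+ +([a-z]+) +[a-z]+").matcher(text);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(whole("[5-8]", "7"));           // true
        System.out.println(whole("[5-8]", "9"));           // false
        System.out.println(whole("[a-z]+", "hello"));      // true
        System.out.println(whole("[a-ik-z]+", "jello"));   // false: contains j
        System.out.println(whole("A.*f", "Axyzf"));        // true
        System.out.println(whole("[a-z]+ +[a-z]+ +[a-z]+", "one two three")); // true
        System.out.println(secondWord("one  two three"));  // two
    }
}
```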

To get more practice with writing regular expressions, check out RegExr or regular expressions 101. These sites use plain regular expressions rather than Java patterns, which differ slightly, as we have discussed above. They are still a great way to build familiarity with regular expressions, which, as we have mentioned, have many applications involving string matching across many different programming languages.

In P2Pattern.java, fill in the string with a pattern that matches lists of non-negative numerals in the notation we used in homework 2 (e.g. (1, 2, 33, 1, 63)). Each numeral but the last should be followed by a comma and one or more spaces.
Run TestP2Pattern to verify that your pattern is correct.