Lecture#10 Outline
Chapter 9 - Using Regular Expressions


m// operator

We've been matching with / /, but that is shorthand for the full pattern match operator, m//. If you write the leading m, you can choose other delimiters, just as we did with qw, ex: m%%.
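A minimal sketch of why the alternate delimiters are handy, assuming a made-up path string as sample data: when the pattern itself contains slashes, picking another delimiter avoids escaping each one. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical sample text: a path containing several slashes.
$_ = "/usr/local/bin/perl";

# Default delimiters: every slash inside the pattern has to be escaped.
my $with_slashes = (/\/usr\/local\//) ? 1 : 0;

# With the leading m, other delimiters are allowed and no escaping is needed.
my $with_percent = (m%/usr/local/%) ? 1 : 0;
my $with_braces  = (m{/usr/local/}) ? 1 : 0;

print "slashes=$with_slashes percent=$with_percent braces=$with_braces\n";
```

All three forms are the same match; only the delimiter differs.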

Option Modifiers (Flags)
/i - Case Insensitivity

    $_="HeLlO ThEre";
    if (m%hello%i) {print "THIS MATCHES";}

/s - Lets . match a newline as well (normally . matches any character except newline)

    $_="Hello\nThere";
    if (/Hello.There/) { print "This will never get printed.";}
    if (/Hello.There/s) { print "HERE'S A MATCH";}

You can combine these modifiers as well:
    if (/hello.there/is) { print "HERE'S A MATCH TOO";}


The Binding Operator =~
We've been using $_ as our variable, but if you have a string in another variable, you do the comparison like this:

   if ($whatever =~ /hello/) { print "\n"; }

Pretty much anything that has a value can be put on the left hand side of the operator. You can also interpolate a string into a pattern:

   $thepattern="hello";
   if ($whatever =~ /$thepattern/) { print "\n"; }
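As a rough illustration of both points, the sketch below binds a pattern to a loop variable and to the result of a function call, with the pattern text held in a variable. The list contents and variable names are assumptions made up for this example.

```perl
use strict;
use warnings;

my $thepattern = "hello";                               # pattern text kept in a variable
my @lines = ("hello world", "goodbye", "say hello");    # hypothetical sample data

# Any scalar can sit on the left of =~, here the loop variable.
my $hits = 0;
foreach my $line (@lines) {
    $hits++ if $line =~ /$thepattern/;
}

# An expression works too: the result of uc() is matched, not $lines[0] itself.
my $upper_ok = (uc($lines[0]) =~ /HELLO/) ? 1 : 0;

print "lines matching '$thepattern': $hits\n";
print "uppercased first line matched: $upper_ok\n";
```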


Match Variables
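After a pattern has matched successfully, whatever each pair of ()s matched is stored in variables automatically. The numbering is the same as for backreferences: what \1 refers to inside the pattern is in $1 after the match, \2 is in $2, etc.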

   $_ = "Hello there";
   if (/\s(\w+)$/) {print "We matched $1";} #prints We matched there

Since every successful match overwrites these variables, you should use them (or copy them into variables of your own) and be done with them as soon as possible in your program.
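A small sketch of why it pays to save $1 right away, under the assumption that two matches run back to back in the same scope: a match that fails does not clear $1, so the old value is still sitting there. The strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# First match succeeds, so $1 is set; copy it immediately.
my $first_ok = ("Hello there" =~ /\s(\w+)$/) ? 1 : 0;
my $word = $1;

# Second match fails (no whitespace in the string), and $1 is NOT reset.
my $second_ok = ("NoSpacesHere" =~ /\s(\w+)$/) ? 1 : 0;
my $leftover = $1;

print "first match ok=$first_ok, saved word=$word\n";
print "second match ok=$second_ok, but \$1 still holds '$leftover'\n";
```

This is why $1 is normally read only inside the if block of the match that set it.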

3 Automatch Variables
$_ = "Hello there";
if (/\s(\w+)$/) {print "We matched $1";} #prints We matched there

After the example above has run, there are actually 3 more variables you get:

$`      Text before the match (skipped over): "Hello"
$&      The match itself: " there"
$'      Text after the match (never checked): empty in this case

Compare this to what is in $1, which is only what's in the 1st ()s.
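To see all three variables non-empty at once, here is a short sketch that uses a slightly longer string and drops the $ anchor from the pattern; both changes are assumptions made for this illustration and are not part of the example above. The values are copied into named variables straight away.

```perl
use strict;
use warnings;

# Hypothetical sample text with something left over after the match.
$_ = "Hello there, neighbor";

my ($before, $match, $after, $first);
if (/\s(\w+)/) {
    # Copy the automatic variables right away, before another match changes them.
    ($before, $match, $after, $first) = ($`, $&, $', $1);
}

print "before=[$before] match=[$match] after=[$after] \$1=[$first]\n";
```

The brackets in the output make the leading space in the match easy to spot.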

See page 103 for a nice program to check your matches.


Substitutions
s/// - This is like a search and replace operation; by default only the first match is replaced

    $_ = "09dash09dash1999";
    s/dash/-/;
    print $_; #prints 09-09dash1999

    $_ = "09dash09dash1999";
    s/dash/-/g; #/g is the global option
    print $_; #prints 09-09-1999

Substitution returns a true/false value indicating success, so you can use it in an if statement.
You can also use other variables:

    $thedate = "09dash09dash1999";
    $thedate =~ s/dash/-/g; #/i would work here too (to match Dash, DASH, etc.), as would /s
    print $thedate; #prints 09-09-1999
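A minimal sketch of testing the return value of a substitution: with /g, s/// returns the number of replacements it made, which is true when at least one was made and false otherwise. The variable names $count, $clean and $status are made up for this example.

```perl
use strict;
use warnings;

my $thedate = "09dash09dash1999";

# The result of s///g is the number of replacements made.
my $count = ($thedate =~ s/dash/-/g);
if ($count) {
    print "Replaced $count time(s): $thedate\n";
}

# Nothing to replace here, so the substitution returns false.
my $clean  = "09-09-1999";
my $status = ($clean =~ s/dash/-/g) ? "changed" : "unchanged";
print "Second string was $status\n";
```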

You can use different delimiters for s///, as with qw// and m//.
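As a sketch, assuming a date written with slashes as sample input, the same substitution can be written three ways. With paired delimiters such as braces, the pattern and the replacement each get their own pair.

```perl
use strict;
use warnings;

my $date = "09/09/1999";    # hypothetical sample data

# Default delimiters: the slash in the pattern must be escaped.
my $escaped = $date;
$escaped =~ s/\//-/g;

# Paired delimiters: one pair for the pattern, one for the replacement.
my $braces = $date;
$braces =~ s{/}{-}g;

# A single non-paired delimiter is used three times, just like the slash.
my $hashes = $date;
$hashes =~ s#/#-#g;

print "$escaped $braces $hashes\n";
```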


Case shifting
Sometimes you may want to control the case of the substitution. You can accomplish this with a few escapes in the replacement string:

\U - Uppercase everything that follows
\L - Lowercase everything that follows
\E - Ends \U or \L, which otherwise affect the rest of the string
\l - Lowercase the next character only
\u - Uppercase the next character only

Examples (each substitution works on the result of the one before it):

$_ = "My name is Joe and yours is John";
s/(joe|john)/\U$1/gi;
#Globally replace joe or john (case insensitive) with the uppercase of what was matched
#$_ now contains "My name is JOE and yours is JOHN"

s/(joe|john)/\L$1/gi; #Same as before, but change to lowercase
#$_ now contains "My name is joe and yours is john"

s/(joe).*(john)/\U$2\E$1/i; #\E ends the uppercasing, so $1 is inserted as it was matched
#$_ now contains "My name is JOHNjoe"

s/(joe|john)/\u$1/ig; #Uppercase only the first character of each match
#$_ now contains "My name is JOHNJoe"

s/(joe|john)/\u\L$1/ig; #Lowercase each match, then uppercase its first character
#$_ now contains "My name is JohnJoe"

One other useful thing is that you can do this in regular double-quoted strings as well:

$msg="ALAN";
print "My name is: \L\u$msg"; #prints My name is: Alan
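A short sketch of the escapes inside an ordinary double-quoted string, using made-up mixed-case names. It also cross-checks one result against the built-in functions ucfirst and lc, which do the same job outside a string.

```perl
use strict;
use warnings;

my $first = "aLAN";       # hypothetical input with messy case
my $last  = "wATKINS";

# \u\L capitalizes the first name; \E ends it so the space and what follows are unaffected.
# \U then uppercases the last name until its own \E.
my $formal = "\u\L$first\E \U$last\E";

# The same capitalization done with built-in functions instead of escapes.
my $with_functions = ucfirst(lc($first));

print "$formal\n";
print "$with_functions\n";
```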


Split and Join
split breaks a string into a list of pieces wherever a separator matches
join glues a list of pieces back together into one string

Example:
$msg="::Alan:John::Joe:";
@names = split /:/, $msg; #@names has ("", "", "Alan", "John", "", "Joe")

Leading empty strings are kept, along with the ones in the middle, but trailing ones are dropped.

The separator can be any regular expression. To split on whitespace: split /\s+/
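A minimal sketch of splitting on whitespace, with made-up sample strings. Any run of spaces, tabs, or newlines counts as one separator, and the rule about leading empty strings applies here too: a string that starts with whitespace gives an empty first field.

```perl
use strict;
use warnings;

# Hypothetical sample line with mixed whitespace between words.
my $line   = "alpha   beta\tgamma\ndelta";
my @fields = split /\s+/, $line;

# Leading whitespace counts as a separator, so the first field is empty.
my $indented  = "   alpha beta";
my @with_lead = split /\s+/, $indented;

print scalar(@fields), " fields: @fields\n";
print scalar(@with_lead), " fields, first is '", $with_lead[0], "'\n";
```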

$stuff = join $glue, @pieces; #general form: the glue goes between the pieces
$stuff = join "-", 9,9,1999; #$stuff = 9-9-1999
$stuff = join "-", 9; #$stuff = 9 (only one piece, so no glue is used)
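To tie the two together, here is a sketch that splits the earlier $msg string and joins the pieces back with the same separator. Because split drops trailing empty strings, the round trip loses the final colon. The variable names $count and $rejoined are illustrative.

```perl
use strict;
use warnings;

my $msg   = "::Alan:John::Joe:";
my @names = split /:/, $msg;        # trailing empty field is dropped
my $count = scalar @names;

# Glue the pieces back together with the same separator.
my $rejoined = join ":", @names;

print "pieces: $count\n";
print "rejoined: $rejoined\n";
```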


CSC255 - Alan Watkins - North Carolina State University