-
Perl Regular Expressions - Extracting substrings between two characters
Suppose we need to extract a program name from a URL.
E.g. '//www.domain.com/program_name'
Simple enough with,
my $url = '//www.domain.com/program_name';
my ($domain, $program) = $url =~ m/(.*\/)(.*)/;
(.*\/) tells Perl to grab everything up to and including the rightmost '/'. The '/' is escaped with a backslash because '/' is also our delimiter.
(.*) grabs everything after that.
The '/' on either side of the regular expression defines the delimiter.
Perl puts whatever each set of () matches into variables for you: $1 for the first set of (), $2 for the second, and so on. In the example above, the match is used in list context, so it returns the captured values directly and we assign them to our own variables, $domain and $program.
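To make the difference between the two styles concrete, here is a minimal sketch that runs the same match both ways. The variable names ending in _a and _b are mine, added only for illustration.

```perl
use strict;
use warnings;

my $url = '//www.domain.com/program_name';

# Style 1: test the match, then read the numbered capture variables.
my ($domain_a, $program_a);
if ( $url =~ m/(.*\/)(.*)/ ) {
    ($domain_a, $program_a) = ($1, $2);
}

# Style 2: a match in list context returns the captures directly.
my ($domain_b, $program_b) = $url =~ m/(.*\/)(.*)/;

print "$domain_a | $program_a\n";
print "$domain_b | $program_b\n";
```

Both print statements should show the same pair of values. The second style saves a few lines, but only the first lets you tell a failed match apart from an empty capture before you touch the results.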
Taking It Further
Now what if we want to cater for parameters or other variables in our URL? We want to extract everything from the rightmost '/' to the end of the string or the first '?' encountered.
E.g. '//www.domain.com/program_name' or '//www.domain.com/program_name?extra=junk'
This does the trick,
my $url = '//www.domain.com/program_name?extra=junk';
my ($domain, $program) = $url =~ m{(.*/)([^?]*)};
([^?]*) tells Perl to grab as many characters as it can that are not a '?'.
The [] represents a character class, which matches a single character out of the set listed inside it.
The ^ at the start of the class is a negation (any character that isn't a '?').
As the '?' is inside a character class, it does not need to be escaped.
Pleasing The Critics
Perl Critic complains about the regular expressions above for a few reasons.
Prohibited Escaped Characters - It doesn't like the escaped '\/' in the first example, which is only needed because our delimiter is a '/'. The fix is to use a different delimiter like '{}'.
Missing /x (Extended Format) Flag - Adding this flag allows for comments and extra whitespace in the regular expression, to make it easier to read.
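Missing /m (Line Boundary Matching) Flag - Adding this flag makes '^' and '$' match at the start and end of each line within the string, not just at the start and end of the whole string, which is what most people expect.
Missing /s (Dot Anything Matching) Flag - Adding this flag makes '.' match any character, instead of any character except a newline ('\n').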
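Putting those suggestions together, a critic-friendlier version of the second pattern might look like the sketch below. I'm assuming that brace delimiters plus the /xms flags are enough to address the complaints listed above. The helper name split_url is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (path prefix, program name),
# or an empty list when the URL contains no slash at all.
sub split_url {
    my ($url) = @_;
    return $url =~ m{
        ( .* / )     # capture 1: everything up to and including the last slash
        ( [^?]* )    # capture 2: what follows, stopping at a question mark
    }xms;
}

my @plain = split_url('//www.domain.com/program_name');
my @query = split_url('//www.domain.com/program_name?extra=junk');
my @none  = split_url('no-slash-here');

print "$plain[1]\n";
print "$query[1]\n";
```

With /x, the whitespace and comments inside the pattern are ignored, so each capture group can sit on its own line. The /m flag changes nothing here because the pattern has no '^' or '$', and /s only matters if the URL could contain a newline.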
Perl Regular Expression Resources
-
Gmail Subject Word Count User Script
I post to Twitter mostly via Gmail, using TwitterMail. It's great for fast posts without having to visit the Twitter website: you just put the post in the subject line. However, I soon found I needed to know the word count of the subject in order to stay under Twitter's 140 character limit.
A script was born...
I've written a Greasemonkey script to get the job done. On the Gmail Compose screen, it displays the word count in brackets next to the subject header. It's compatible with Firefox and Google Chrome.
The script can be found here.
-
Syncing Android contacts with social networks via Google Contacts
I've had a Gmail account for a long time now, but had never paid much attention to my 'Google Contacts'. It was just a collection of email addresses built up from email conversations over the past few years.
However, with a new Android phone in hand, Google Contacts becomes a whole lot more useful. You can now have your phone contacts stored in the cloud for easy backup and editing.
Cleaning Up The Mess
I followed this guide to clean up my Google Contacts. It didn't take too long to merge email addresses with phone contacts and delete any garbage. Google Contacts makes it fairly easy to merge duplicate contacts and keep the data you want.
Completing Contacts Social Information
Now that I had a nice clean contact list, it was time to spruce up the data with my contacts' social network information.
Facebook For Android provides integration between Google Contacts and Facebook, but it won't update the data to your Google Contacts. Luckily, there are other ways to go about it. Rainmaker is one such way. Rainmaker makes it incredibly easy to sync Google Contacts to Facebook, Twitter and LinkedIn.
Where are my Contacts' Photos?
One thing I did find lacking in Rainmaker was the uploading of Facebook profile pictures to my Google Contacts. Thankfully, there's a handy Android app to take care of that.
SyncMyPix does exactly what its title suggests. It syncs your Facebook contacts' profile pictures with your Google Contacts, allowing them to be uploaded.