Slightly more Regular Expressions

For me, regular expressions are like magic. I can appreciate the wonderful things they can do, but I’ve never really understood how the trick works. As I develop primarily in PHP, there are plenty of PHP string functions which, when combined, can get me the desired result. However, yesterday I had to extract two values from a single string, so I thought I’d give regular expressions another go. Over the years of sniffing at them I realised that I’d picked up a few things.

  • Delimiters – These look different depending on what flavour of regex you’re using. I’m using PHP’s PCRE functions, so my patterns are wrapped in forward slashes (not to be confused with word boundaries, which are written \b).
  • Characters – I knew about character classes and I know what some of them do, though not all. For example, I know that [A-Za-z] will match a single alphabetical character and \d will match a single digit.
  • Repetition – I knew about the + quantifier, which is greedy. It matches the preceding token one or more times, so once it has found the first occurrence it keeps matching for as long as it can.
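Here is a quick sketch of those three points in action. The sample string 'abc 2011' and the variable names are made up purely for illustration:

```php
<?php
// Made-up sample string, for illustration only.
$subject = 'abc 2011';

// The forward slashes are the delimiters; [A-Za-z] matches one letter.
preg_match('/[A-Za-z]/', $subject, $oneLetter);

// \d on its own matches exactly one digit.
preg_match('/\d/', $subject, $oneDigit);

// Adding + repeats the token greedily: as many digits as it can take.
preg_match('/\d+/', $subject, $allDigits);

echo $oneLetter[0], "\n"; // a
echo $oneDigit[0], "\n";  // 2
echo $allDigits[0], "\n"; // 2011
```

Without the + you only ever get a single character back, which is why it matters so much for pulling out whole words or numbers.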

So when I was confronted with the following string: item_newsArticle1138, I decided it was worth extending my knowledge just enough to use regex to extract “newsArticle” and “1138”.

Getting the number off the end of the string was easy enough. I’m using the preg_match() function in PHP to pass the found string into a variable:
preg_match('/\d+/', 'item_newsArticle1138', $modelid);

The forward slashes act as regex delimiters, so all we need is \d+. The \d finds the first digit in the string and the plus sign (+) carries on matching the digits that follow: 1138.

Then came the tough one. If I use /[A-Za-z]+/ as my regex:
preg_match('/[A-Za-z]+/', 'item_newsArticle1138', $model);

The regex engine reports finding “item” in the string and stops at the underscore, ignoring the part I needed. What I needed the regex to do was find the underscore and then get all the alphabetic characters after it. To do this I needed to learn how to use positive and negative lookahead and lookbehind.

So what I needed to add to my regex was a positive lookbehind: (?<=_). The brackets and the question mark provide the code for the lookahead/lookbehind. The inclusion of a less than symbol makes it a lookbehind (N.B. omit the less than symbol for a lookahead; don't swap it for a greater than symbol). The equals sign makes it positive, telling the regex "to look for" what follows (an exclamation mark in its place makes it negative), and then I pass in the underscore character, as that's what I'm looking for in the string. That gives us the following regex:
preg_match('/(?<=_)[A-Za-z]+/', 'item_newsArticle1138', $model);

Be careful: the PHP function preg_match() itself only returns 1 or 0 to say whether a match was found. The results go into the variable passed as the last argument ($model) as an array, so to use the value you need $model[0] to get “newsArticle”.
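Putting the two calls together gives something like the sketch below. The first half is just the patterns from above; the second half is one possible alternative using capture groups, and the $subject and $parts names are my own for illustration:

```php
<?php
$subject = 'item_newsArticle1138';

// Positive lookbehind: letters immediately preceded by an underscore.
preg_match('/(?<=_)[A-Za-z]+/', $subject, $model);

// First run of one or more digits.
preg_match('/\d+/', $subject, $modelid);

echo $model[0], "\n";   // newsArticle
echo $modelid[0], "\n"; // 1138

// Alternative sketch: one pattern with two capture groups.
// $parts[0] is the whole match; $parts[1] and $parts[2] are the groups.
if (preg_match('/_([A-Za-z]+)(\d+)$/', $subject, $parts) === 1) {
    echo $parts[1], ' / ', $parts[2], "\n"; // newsArticle / 1138
}
```

The capture group version avoids the lookbehind altogether, at the cost of assuming the string always ends in letters followed by digits.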

Hey presto, the magic is revealed. I’m no expert, so somebody’s bound to tell me a better way in the comments, hello?

Arrays in dBs and YAML Config

I’m working on some new mega-menus for GO. I’ll post about that when we go live.

Following a pre-Christmas code review of the user-defined menu choices, we identified that we needed a way to provide a default set of menu items for everybody before they make and save their personal choices. Mike remembered that we had an Option table capable of storing an array of values per person.

As this table existed in the GO symfony framework, it already had methods for getting and setting values. Reading those methods introduced me to the PHP serialize() function. This function takes an array and generates a storable representation of its value. For example, this
   [0] => 1
   [1] => 2
   [2] => 3
   [3] => 4
   [4] => 9

is turned into this: a:5:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;i:4;i:9;}

When the value is queried, we can simply unserialize() it to return it to its PHP-readable form.
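As a minimal sketch of that round trip, using the same five values (the variable names here are mine, not taken from the GO code):

```php
<?php
// The same five values as in the example above.
$choices = array(1, 2, 3, 4, 9);

// serialize() turns the array into a string that fits in a single column.
$stored = serialize($choices);
echo $stored, "\n"; // a:5:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;i:4;i:9;}

// unserialize() rebuilds the original array from that string.
$restored = unserialize($stored);
var_dump($restored === $choices); // bool(true)
```

Reading the stored string: a:5 means an array of five elements, and each i:key;i:value; pair is an integer key followed by an integer value.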

Default Values in YAML Config Files

In the Option table, if a record isn’t returned for a user (no option has been set) a default value (or array) can be returned when passed into the querying method. As I needed to query the data in more than one place in the application, I needed to set the default value in one place. Symfony uses YAML files for configuration so I set it in the application config file app.yml.
    default_dashbar:      [1, 2, 3, 4, 9]

Spaces are all-important in YAML. Indentation defines the structure, so indent with spaces rather than tabs, keep a space after the colon, and include spaces between array elements.

To pull YAML config variables into your code, use sfConfig::get('param_name', $default_value). The ‘param_name’ is built from the file name and the parent-child structure of the YAML file, separated by underscores; the details can be found in the Configuring Symfony chapter of the manual.
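The fall-back-to-default idea can be sketched without symfony at all. Everything below is hypothetical: $optionRows stands in for the Option table, getMenuChoices() is not the real GO method, and the $default array is hard-coded where the real application would read it from app.yml via sfConfig::get():

```php
<?php
// Hypothetical stand-in for the Option table: user id => serialized choices.
$optionRows = array(
    42 => serialize(array(2, 5, 7)),
);

// Hypothetical helper (not the GO/symfony method): return the user's saved
// choices, or the supplied default when no record exists for that user.
function getMenuChoices(array $rows, $userId, array $default)
{
    if (!isset($rows[$userId])) {
        return $default;
    }
    return unserialize($rows[$userId]);
}

// Hard-coded here; assumed to come from the config file in the real app.
$default = array(1, 2, 3, 4, 9);

$saved    = getMenuChoices($optionRows, 42, $default); // user with a record
$fallback = getMenuChoices($optionRows, 7, $default);  // user without one

echo implode(',', $saved), "\n";    // 2,5,7
echo implode(',', $fallback), "\n"; // 1,2,3,4,9
```

Keeping the default in one place means every caller passes the same array, which is the point of moving it into the config file.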

Heavy duty stuff for the first post of 2011.