Regular Expressions are about as elegant as a pig on a bicycle. Using a regular expression feels like resorting to machine code when all those patterns we're taught to love just aren't up to the job. Which, I suppose, is also a reason to like them. They have a brute force directness, free from pattern politics and endless analysis.

And they work. Eventually.

If the JavaScript Regular Expressions API makes your head spin then this might be for you. I'll document the basics and demonstrate how you might use them to full effect.

For the sake of brevity (not to mention my own lack of regex proficiency) I won't discuss the syntax of the expressions themselves. Suffice to say, JavaScript regEx syntax is Perl based. There are many excellent online resources for this, as well as some nice online RegEx testers.

The RegExp object

RegExp is a global object which serves three purposes:-

1) It's a constructor function for creating new instances of Regular Expressions...

It takes a regEx literal expression (or a pattern string) as its argument. As with strings, you can drop the constructor syntax and just specify the literal on its own. RegEx literals are delimited by the / symbol instead of quotes.

var a = new RegExp(/\b\w{4}\b/g); //match all four letter words
//same as...
a = /\b\w{4}\b/g;
a.constructor //RegExp()
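The constructor earns its keep when the pattern has to be built at run time, because it also accepts a pattern string plus a flags string. Here's a minimal sketch; wordOfLength is a hypothetical helper of mine, not part of the API.

```javascript
//hypothetical helper: build a "words of exactly n letters" pattern at run time
function wordOfLength(n) {
    //inside a string each backslash must itself be escaped, so \b becomes "\\b"
    return new RegExp("\\b\\w{" + n + "}\\b", "g");
}

var fourLetters = wordOfLength(4);
var literal = /\b\w{4}\b/g;

var sameSource = fourLetters.source === literal.source; //true, both describe the same pattern
var words = "They work, but only just".match(fourLetters); //["They", "work", "only", "just"]
```

The doubled backslashes are the usual trap here: forget one and "\b" quietly becomes a backspace character.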

2) It aggregates a set of global (static) properties reflecting the most recent regex match...

leftContext, the text to the left of the most recent match
rightContext, text to the right of the most recent match
lastMatch, the most recently matched text
lastParen, the text matched by the last parenthesized subexpression
$n, the text matched by the nth parenthesized group (up to n==9)

"(penalty)Lampard, Frank(1-0)".match(/\b([\w]+),\s?([\w]+)/g);
RegExp.leftContext //"(penalty)"
RegExp.rightContext //"(1-0)"
RegExp.$1 //"Lampard"
RegExp.$2 //"Frank"
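For completeness, here's a small sketch that reads lastMatch and lastParen as well. It assumes an engine that still exposes these static properties (current browsers and Node do, as far as I know); the sample string is made up.

```javascript
//assumption: the engine supports the static RegExp properties
/(\d+)-(\d+)/.exec("range: 10-20 units");

var last = {
    lastMatch: RegExp.lastMatch,      //"10-20"
    lastParen: RegExp.lastParen,      //"20"
    leftContext: RegExp.leftContext,  //"range: "
    rightContext: RegExp.rightContext //" units"
};
```

The values are copied into an object straight away because the next successful match, anywhere in your code, will overwrite them.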

...and 2 variables that will be applied to the next regex match...

input, the string that exec and test use when no argument is passed to them
multiline, a boolean specifying whether the string used for the next match should be treated as single or multiline (equivalent to the m attribute)

var a = /\b[a-z]{10,}\b/i; //match long alpha-only word
a.test(); //true (on

3) Each instance stores additional properties

source, the full source text of the expression
global, search for all matches (the expression's g attribute is present)
ignoreCase, search ignores case (the expression's i attribute is present)
lastIndex, the index at which to begin the next search

(lastIndex is writeable, the other three properties are not)
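A quick sketch of those four properties in action (the pattern and test string are just for illustration):

```javascript
var re = /ab+c/gi;

var src = re.source;        //"ab+c"
var isGlobal = re.global;   //true
var noCase = re.ignoreCase; //true
var start = re.lastIndex;   //0, no search has run yet

re.test("xxABBC abc");
var afterFirst = re.lastIndex; //6, just past "ABBC"

re.lastIndex = 0; //lastIndex is writeable, so the search can be rewound
```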

The RegExp prototype also defines 3 methods:-


test

Was the match successful? (see example above)
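In everyday use you'd pass the string explicitly. A minimal sketch, reusing the long-word pattern from earlier with sample strings of my own:

```javascript
var longWord = /\b[a-z]{10,}\b/i; //match long alpha-only word

var found = longWord.test("a pig on a velocipede"); //true, "velocipede" has ten letters
var missing = longWord.test("a pig on a bicycle");  //false, no word reaches ten letters
```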


exec

When a match is found it returns an array of results where element 0 is the matched text and elements 1 to n represent the matched groups in sequence (equivalent to the RegExp.$n values). When there is no match it returns null. If the expression includes the global (g) attribute, the lastIndex property is updated after each call so that repeated calls to exec will loop through each match in the string.

Here's a method to return the first n cards from the "pack", such that their total value does not exceed 21. Notice we define an optional group 2 to match the numeric value of cards with non-numeric names (e.g. King).

var expr = /\b([^@\(]+)\(?(\d*)\)?@([^\s]+)\s?/g;
var theString = '3@Clubs King(10)@Hearts 3@Spades 5@Diamonds 7@Clubs 2@Hearts 9@Spades Jack(10)@Clubs 4@Diamonds 9@Hearts';
var result = [], total = 0, matching = true;

while (true) {
    matching = expr.exec(theString);
    //use the explicit numeric value (group 2) when present, otherwise the card name
    var value = parseInt(RegExp.$2 ? RegExp.$2 : RegExp.$1, 10);
    if (!matching || (total += value) > 21) {
        break;
    }
    result.push(RegExp.$1 + " of " + RegExp.$3);
}

result; //["3 of Clubs", "King of Hearts", "3 of Spades", "5 of Diamonds"]
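Since the array returned by exec carries the same groups as RegExp.$n, the loop can also be written without touching the static properties at all. A sketch along those lines, with a shortened pack and variable names of my own:

```javascript
var cardExpr = /\b([^@\(]+)\(?(\d*)\)?@([^\s]+)\s?/g;
var pack = '3@Clubs King(10)@Hearts 3@Spades 5@Diamonds 7@Clubs';
var hand = [], sum = 0, m;

//exec returns null once there are no more matches, which ends the loop
while ((m = cardExpr.exec(pack)) !== null) {
    //m[1] is the card name, m[2] the optional numeric value, m[3] the suit
    var points = parseInt(m[2] || m[1], 10);
    if (sum + points > 21) {
        break;
    }
    sum += points;
    hand.push(m[1] + " of " + m[3]);
}

hand; //["3 of Clubs", "King of Hearts", "3 of Spades", "5 of Diamonds"]
```

Reading from the returned array keeps the loop independent of any other regex code that happens to run in between.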


compile

Edit this RegExp instance. If you're neurotic about the overhead of creating a new RegExp instance every time then this is for you. Enough said.

The String methods

Three string methods accept regular expressions as arguments. They differ from the RegExp methods in that they ignore the RegExp's lastIndex property (more accurately they set it to zero) and if the pattern is global they return all matches in one pass, rather than one match for each call. RegExp static properties (e.g. RegExp.$1) are set with each call.
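To make the contrast concrete, here's a small sketch comparing exec with the string method match on the same global pattern (the pattern and string are illustrative):

```javascript
var digits = /\d+/g;
var text = "10 20 30";

var first = digits.exec(text); //["10"], one match per call
var mid = digits.lastIndex;    //2, exec remembers where it stopped

var all = text.match(digits);  //["10", "20", "30"], every match in one pass
var after = digits.lastIndex;  //0, match started from the beginning and left lastIndex reset
```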


match

Returns an array of the pattern matches in a string, or null if there are none. Unless the pattern is global, the array holds only the first match, followed by any matched groups.
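The shape of the result depends on the g attribute, which is easy to trip over. A short sketch of the three cases, using a sample string of my own:

```javascript
var str = "Lampard, Frank";

//non-global: the first match followed by its groups
var single = str.match(/(\w+), (\w+)/);
//single[0] is "Lampard, Frank", single[1] is "Lampard", single[2] is "Frank"

//global: every match, but no groups
var every = str.match(/\w+/g); //["Lampard", "Frank"]

//no match at all: null rather than an empty array
var none = str.match(/\d+/); //null
```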

var a = /(-[\d*\.\d*]{2,})|(-\d+)/g //all negative numbers
"74 -5.6 9 -.5 -2 49".match(a); //["-5.6", "-.5", "-2"]
RegExp.$2; //"-2"
RegExp.leftContext; //"74 -5.6 9 -.5 "

var queryExpr = new RegExp(/\?/);
var getQueryString = function(url) {
    url.match(queryExpr);
    return RegExp.rightContext;
}


split

Converts a string to an array according to the supplied delimiter. Optionally takes a regular expression as the delimiter.

var names = "Smith%20O'Shea%20Cameron%44Brown".split(/[^a-z\']+/gi); //names = ["Smith", "O'Shea", "Cameron", "Brown"];
RegExp.lastMatch; //"%44"

Nick Fitzgerald points out that IE is out on a limb when it comes to splitting on grouped expressions

var time = "Two o'clock PM".split(/(o'clock)/);
//time = ['Two ', ' PM'] (IE)
//time = ['Two ', "o'clock", ' PM'] (FF, webkit)
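If you don't actually want the delimiter in the result, a non-capturing group should sidestep this particular difference. A sketch, assuming a spec-following engine for the outputs shown (I've added \s* to swallow the surrounding spaces):

```javascript
//capturing group: the matched delimiter is included in the result
var withGroup = "Two o'clock PM".split(/\s*(o'clock)\s*/);      //["Two", "o'clock", "PM"]

//non-capturing group (?:...): the delimiter is dropped
var withoutGroup = "Two o'clock PM".split(/\s*(?:o'clock)\s*/); //["Two", "PM"]
```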


replace

Replaces argument 1 with argument 2. Argument 1 can be a regular expression, and if it's a global pattern all matches will be replaced.

Additionally, replace comes with two little-used but very nice features.

First, you can use $1...$n in the second argument (representing 1...n matched groups)

var a = "Smith, Bob; Raman, Ravi; Jones, Mary";
a.replace(/([\w]+), ([\w]+)/g,"$2 $1"); //"Bob Smith; Ravi Raman; Mary Jones"

var a = "California, San Francisco, O'Rourke, Gerry";
a.replace(/([\w'\s]+), ([\w'\s]+), ([\w'\s]+), ([\w'\s]+)/,"$4 $3 lives in $2, $1"); //"Gerry O'Rourke lives in San Francisco, California"

Second, you can also use a function as the second argument. This function will get passed the entire match followed by each matched group ($1...$n) as arguments.

var chars = "72 101 108 108 111  87 111 114 108 100 33";
chars.replace(/(\d+)(\s?)/gi,function(all,$1){return String.fromCharCode($1)}); //"Hello World!"
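The function form is handy whenever the replacement has to be computed from the match. One more sketch, using a hypothetical toCamelCase helper of my own:

```javascript
//hypothetical helper: convert a hyphenated CSS property name to camelCase
function toCamelCase(name) {
    //the callback receives the whole match first, then each matched group
    return name.replace(/-([a-z])/g, function (all, letter) {
        return letter.toUpperCase();
    });
}

var camel = toCamelCase("border-top-width"); //"borderTopWidth"
```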

