Monday, February 7, 2011

What is wrong with this PHP regular expression?

$output = preg_replace("|(/D)(/s+)(/d+)(;)|", "//1,//3;", $output);

I'm trying to replace every alphabetical character that is followed by one or more whitespace characters (tabs and/or spaces), then one or more numerical characters, then a semicolon. The replacement should be the alphabetical character, a comma, the numerical digits, and then the semicolon.

I'll provide an example:

Start:

hello world      1007;

End:

hello world,1007;
  • Should those forward-slashes be backslashes? You'll need to escape them for PHP too unless you change your double-quotes to single-quotes.

  • The two '|' at the start and end probably are incorrect - and should both be forward-slashes.

    All other forward slashes should be backward slashes (and need escaping).

    And since PHP 4.0.4, $n is the preferred way of referring to a capture group in the replacement string.

    $output = preg_replace("/(\\D)\\s+(\\d+;)/", "$1,$2", $output);
    

    If you use single quotes you don't need to escape your backslashes:

    $output = preg_replace('/(\D)\s+(\d+;)/', '$1,$2', $output);
    
    Ted Percival : If I recall correctly, the start- and end-of-pattern symbols (the '|'s) are required in the PHP preg_* functions.
    samjudson : Good point - I also forgot to escape my back slashes.
    From samjudson
  • I'm not in a programming mindset today...that was the fifth stupid, idiotic thing I've done this morning, and I've only been at work for 2 hours.

  • You want backslashes in the regular expression, not forward slashes. The starting and ending pipes are needed (or another delimiter for the regex).

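    $x = "hello world      1007;";
    // $4 puts the captured semicolon back; without it the output loses the ';'.
    echo preg_replace('|(\D)(\s+)(\d+)(;)|','$1,$3$4',$x), "\n";
    echo preg_replace('/(\D)(\s+)(\d+)(;)/','$1,$3$4',$x), "\n";
    echo preg_replace('{(\D)(\s+)(\d+)(;)}','$1,$3$4',$x), "\n";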
    
    samjudson : I didn't know you could use anything for the start and end delimiters, thanks.
    From Alan Storm
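
To pull the answers above together, here is a minimal sketch that runs the question's sample input through a single-quoted pattern, a double-quoted pattern, and the original forward-slash pattern. The variable names ($input, $single, $double, $unchanged) are only illustrative and do not come from the thread.

```php
<?php
// Sample input taken from the question's "Start" example.
$input = "hello world      1007;";

// Single-quoted pattern: the backslashes reach the regex engine as written.
$single = preg_replace('/(\D)\s+(\d+;)/', '$1,$2', $input);

// Double-quoted pattern: each backslash is doubled for PHP's string parser.
$double = preg_replace("/(\\D)\\s+(\\d+;)/", "$1,$2", $input);

// The question's original pattern: with '|' as the delimiter, "/D", "/s+"
// and "/d+" are literal slashes followed by letters, so nothing matches
// and preg_replace returns the subject unchanged.
$unchanged = preg_replace("|(/D)(/s+)(/d+)(;)|", "//1,//3;", $input);

echo $single, PHP_EOL;
echo $double, PHP_EOL;
echo $unchanged, PHP_EOL;
```

The first two calls should print the question's "End" string, while the third prints the input untouched, which matches the behaviour described in the question.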
