0

I am a PHP beginner (my PHP version is 5.2.X) and saw this PHP expression on a forum:

$regex = <<<'END'
/
  ( [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
  | [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
  | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
  | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3 
  )
| ( [\x80-\xBF] )               # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )               # invalid byte in range 11000000 - 11111111
/x
END;

Is this code correct? What do these strange (for me) constructions like <<<, 'END', /, /x, and END; mean?

My PHP version does not support nowdoc. How should I rewrite this expression? Without the quotes around 'END', $regex becomes NULL.

I receive:

Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X

Thanks

Yahel
serhio
  • While this is certainly valid PHP, the vast majority of that code is actually a regular expression, and understanding regexes is somewhat independent from understanding PHP. Most languages in common use today have a regex engine built in that accepts similar expressions. – Austin Fitzpatrick Apr 08 '10 at 22:33

4 Answers

6

Parse error: syntax error, unexpected T_SL in /home/vhosts/mysite.com/public_html/mypage.php on line X

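This comes from the single quotes around END. That quoted form is called nowdoc, which was added in PHP 5.3. PHP 5.2 does not recognize it, so the parser reads "<<" as the shift-left operator (T_SL) and stops. Dropping the quotes turns it into a heredoc, which parses on 5.2 but processes escapes such as '\x' the way a double-quoted string does. Since this regex uses '\x', you'll need either a single-quoted string or a heredoc with each '\' escaped as '\\'.

An example of the regex as a single-quoted string, used in this answer: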

$regex = '/
  ( [\x00-\x7F]                 # single-byte sequences   0xxxxxxx
  | [\xC0-\xDF][\x80-\xBF]      # double-byte sequences   110xxxxx 10xxxxxx
  | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences   1110xxxx 10xxxxxx * 2
  | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequence 11110xxx 10xxxxxx * 3
  )
| ( [\x80-\xBF] )               # invalid byte in range 10000000 - 10111111
| ( [\xC0-\xFF] )               # invalid byte in range 11000000 - 11111111
/x
';

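The "/" and "/x" portions belong to the regex, not to PHP syntax. The "/"s are delimiters that mark the beginning and end of the pattern, and the x after the closing delimiter is a modifier (PCRE_EXTENDED) that makes the engine ignore whitespace and "#" comments inside the pattern. The modifiers are documented at: http://us.php.net/manual/en/reference.pcre.pattern.modifiers.php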
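As a rough sketch of how this pattern might be used: the function name strip_invalid_utf8 and the '$1' replacement below are assumptions for illustration, not something taken from the question. The idea is to keep whatever the first group matched and drop any byte matched by the second or third group. Because the pattern is a single-quoted string, it should also parse on PHP 5.2.

```php
<?php
// Hypothetical helper (name is an assumption): drops bytes that do not fit
// the byte-range shapes listed in the pattern from the question.
function strip_invalid_utf8($text)
{
    // Single-quoted string: PHP leaves \x.. untouched, so PCRE sees the escapes.
    $regex = '/
      ( [\x00-\x7F]                 # single-byte sequences
      | [\xC0-\xDF][\x80-\xBF]      # double-byte sequences
      | [\xE0-\xEF][\x80-\xBF]{2}   # triple-byte sequences
      | [\xF0-\xF7][\x80-\xBF]{3}   # quadruple-byte sequences
      )
    | ( [\x80-\xBF] )               # stray continuation byte
    | ( [\xC0-\xFF] )               # lead byte without its continuation bytes
    /x';

    // $1 is empty whenever group 2 or 3 matched, so those bytes disappear.
    return preg_replace($regex, '$1', $text);
}

// Double quotes here so that PHP turns \xFF into a real byte in the input.
var_dump(strip_invalid_utf8("abc\xFFdef") === 'abcdef');
```

Note that the pattern only checks byte ranges, so this is a coarse filter rather than a full UTF-8 validator.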

ddrown
5

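<<< followed by an identifier such as END is called heredoc syntax - a way of assigning a large, multi-line string to a variable. (With the identifier in single quotes, as in your example, it is the nowdoc variant, which needs PHP 5.3.)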

$mytext = <<<TXT

this is my text and it
can be many lines
etc
etc

TXT;

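The identifier (TXT here, END in your example) can be whatever you like, as long as it contains only letters, digits, and underscores and does not start with a digit. Before PHP 7.3, the closing identifier must also sit at the very start of its own line, followed only by the semicolon.

Read more in the manual.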
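To illustrate why switching from 'END' to plain END changes the regex, here is a small sketch comparing a heredoc with a single-quoted string. The variable names and sample text are made up for the example. A heredoc handles $variables and \x escapes like a double-quoted string, while a single-quoted string leaves both alone.

```php
<?php
$name = 'world';

// Heredoc behaves like a double-quoted string:
// $name is interpolated and \x41 becomes the byte "A".
$heredoc = <<<TXT
Hello $name \x41
TXT;

// A single-quoted string keeps the dollar sign and the backslash as written.
$single = 'Hello $name \x41';

// In a heredoc, escaping the $ and doubling the backslash gives the same text.
$escaped = <<<TXT
Hello \$name \\x41
TXT;

var_dump($heredoc === 'Hello world A');
var_dump($single === $escaped);
```

So on PHP 5.2 a regex containing \x escapes can live either in a single-quoted string or in a heredoc with every backslash doubled.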

NullUserException
Adam Hopkinson
3

It's heredoc syntax (strictly speaking, the quoted 'END' makes it the nowdoc variant).

The <<<'END' says that it's the start of a string and that everything up to the next line that begins with END will be part of the string (even newlines).

The / and /x are actually part of the regex.

Michael Myers
2

In addition to what other users have said about it being heredoc syntax (typically used for large strings that would otherwise require a lot of escaping), the code defines a regular expression using "/" as the delimiter.

The "/x" at the end closes the regular expression and then tells the regex engine to run it in "free-spacing mode", where whitespace and comments inside the pattern are ignored. Other possible options would have been /i for case-insensitive or /m for multi-line mode.
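A short sketch of the three modifiers mentioned above, using preg_match. The patterns and sample strings are invented for illustration only.

```php
<?php
// x: whitespace and # comments inside the pattern are ignored.
$spaced = '/ \d{4}   # year
             -
             \d{2}   # month
           /x';
$xResult = preg_match($spaced, '2010-04');

// i: letters match regardless of case.
$iResult = preg_match('/php/i', 'PHP 5.2');

// m: ^ and $ also match at line breaks inside the subject string.
$mResult = preg_match('/^second$/m', "first\nsecond");

// Without m, ^ only matches at the very start of the subject.
$noM = preg_match('/^second$/', "first\nsecond");

var_dump($xResult, $iResult, $mResult, $noM);
```

Modifiers can be combined after the closing delimiter, for example /xi.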

You can read more about PHP's regex engine here:

Using Regular Expressions in PHP

Austin Fitzpatrick
  • heredoc, nowdoc... What is the difference between them: `'END'` or just `END`? – serhio Apr 08 '10 at 22:38
  • $vars within heredocs are expanded as if the string were in "double quotes". $ in nowdocs is treated like $ in 'single quotes'. (Or vice versa, I haven't looked at 5.3 recently.) – jmucchiello Apr 08 '10 at 22:48
  • My PHP version does not support nowdoc. How should I replace this expression? Without the quotes '', $regex becomes NULL. – serhio Apr 08 '10 at 22:50