32

I have this data in a LONGTEXT column (so the line breaks are retained):

Paragraph one
Paragraph two
Paragraph three
Paragraph four

I'm trying to match paragraph 1 through 3. I'm using this code:

preg_match('/Para(.*)three/', $row['file'], $m);

This returns nothing. If I try to work just within the first line of the paragraph, by matching:

preg_match('/Para(.*)one/', $row['file'], $m);

Then the code works and I get the proper string returned. What am I doing wrong here?

Norse

4 Answers

70

Use the s modifier.

preg_match('/Para(.*)three/s', $row['file'], $m);

See Pattern Modifiers in the PHP manual.
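To see the effect in isolation, here is a minimal sketch. The sample string is hypothetical (it stands in for `$row['file']` and assumes plain "\n" line breaks), with "Para" and "three" deliberately placed on different lines:

```php
<?php
// Hypothetical stand-in for $row['file']; "Para" and "three" sit on different lines.
$text = "Para begins here\nsecond line\nline three ends here";

// Without /s, "." never matches "\n", so ".*" cannot reach "three".
$without = preg_match('/Para(.*)three/', $text, $m1);

// With /s (PCRE_DOTALL), "." matches newlines too, so the match spans lines.
$with = preg_match('/Para(.*)three/s', $text, $m2);

var_dump($without); // int(0)
var_dump($with);    // int(1)
var_dump($m2[1]);   // the text captured between "Para" and "three"
```

If "Para" and "three" also occur together on a single line, the pattern without `/s` can still match within that line, so an empty result depends on the actual data.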

Peter Mortensen
Tasso Evangelista
  • 3
    Warning: the `/s` modifier is greedy. If there is more than one 'three' in the text, the match will include all text until the last occurrence. Use `/sU` to prevent this (note the upper case U). – Frank Forte Feb 05 '18 at 19:40
  • 3
    @FrankForte That is not really a modifier issue: the `*` repetition is greedy per se. A better approach is to add a question mark to make it lazy: `/Para(.*?)three/s`. Also, the `/U` modifier does not cancel greediness but inverts it: `*` becomes lazy and `*?` becomes greedy. It is not a problem in the OP's code, but it can trigger weird errors in a more complex regular expression. – Tasso Evangelista Feb 06 '18 at 20:17
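The greedy/lazy difference described in these comments can be checked with a short sketch. The sample text is hypothetical and contains "three" twice on purpose:

```php
<?php
// Hypothetical text in which the end word "three" appears twice.
$text = "Para one\nline three\nline four\nline three again";

// Greedy "*": the capture runs up to the LAST "three".
preg_match('/Para(.*)three/s', $text, $greedy);

// Lazy "*?": the capture stops before the FIRST "three".
preg_match('/Para(.*?)three/s', $text, $lazy);

// /U (PCRE_UNGREEDY) inverts both: "*" turns lazy and "*?" turns greedy.
preg_match('/Para(.*)three/sU', $text, $ungreedy);
preg_match('/Para(.*?)three/sU', $text, $flipped);

var_dump($greedy[1], $lazy[1], $ungreedy[1], $flipped[1]);
```

Here `$lazy[1]` and `$ungreedy[1]` end at the first "three", while `$greedy[1]` and `$flipped[1]` extend to the second one, which is why `(.*?)` is usually the clearer choice than `/U`.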
14

Add the multi-line modifier.

E.g.:

preg_match('/Para(.*)three/m', $row['file'], $m);
temporalslide
    For anybody wondering about the difference between this and the accepted answer (`s` modifier): the `s` modifier makes `.` match newlines as well as all other characters (by default it excludes them), whereas `m` controls how `^` and `$` match. With `m`, they match at the start and end of each line; without it, they match only at the start and end of the whole string (almost, see also `D`). – Dave Mar 08 '16 at 15:42
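A small sketch of that distinction, using a hypothetical sample string and patterns chosen only for illustration:

```php
<?php
// Hypothetical three-line sample.
$text = "Paragraph one\nParagraph two\nParagraph three";

// /m changes the anchors only: "^" and "$" match at every line boundary.
$anchoredWithM = preg_match_all('/^Paragraph \w+$/m', $text, $lines);

// Without /m, "^" and "$" refer to the whole string, so no single line matches.
$anchoredWithoutM = preg_match_all('/^Paragraph \w+$/', $text, $none);

// /m does not let "." cross a newline, so this still fails...
$dotWithM = preg_match('/one(.*)three/m', $text, $a);

// ...whereas /s does, so this succeeds.
$dotWithS = preg_match('/one(.*)three/s', $text, $b);

var_dump($anchoredWithM, $anchoredWithoutM, $dotWithM, $dotWithS);
```

So for a match that has to span several lines, `s` is the modifier that matters; `m` is only relevant when the pattern uses `^` or `$`.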
4

Try setting the regex to dot-all (PCRE_DOTALL) by appending the `s` modifier after the closing delimiter, so that `.` also matches line breaks:

preg_match('/Para(.*)three/s', $row['file'], $m);
Peter Mortensen
Jen
0

If you don't like the `/` delimiters at the start and end, use T-Regx:

$m = Pattern::of('Para(.*)three')->match($row['file'])->first();
Danon
  • It appears as if it would work, but still an explanation would be in order. E.g., what is the principle of operation? What is going on? What features of the library does it take advantage of? How does it handle end-of-line/line breaks? Please respond by [editing (changing) your answer](https://stackoverflow.com/posts/54202870/edit), not here in comments (***without*** "Edit:", "Update:", or similar - the answer should appear as if it was written today). – Peter Mortensen Nov 18 '21 at 22:45
  • @PeterMortensen I don't know how to explain it better. The interface of the library simply doesn't take delimiters such as `/`, as opposed to `preg_match()`. – Danon Nov 28 '21 at 13:01