
I'm developing a WordPress plugin and I've run into a serious problem getting the correct information out of a regular expression.

Code for the regular expression is below:

preg_match_all("|grass1(.*)grass2|i", $split[1], $out);

The output is an empty array.

$split[1] is the body of the email (as a string).

I want to extract everything between grass1 and grass2. The body of the email (as a string) contains "grass1 Something Something grass2", and I need "Something Something" as the output. This is where the problem lies, and it's weird, because the same pattern works just fine when the string is the subject, the header or the entire email. So I'm confused about what is going on. Any help understanding the problem would be appreciated.

Sample of the body of the email:

"_NextPart_000_0098_01CE2BC1.BFAF9200
 Content-Type: text/plain charset="us-ascii" Content-Transfer-Encoding: 7bit
 grass1 I'm testing the before or after function. If successful you should see all of           this message in a future email. grass2" 
  • Can you include a sample of $split[1] where this returns no matches? – Adrian Apr 01 '13 at 17:52
  • Is the e-mail body HTML perhaps? – Decent Dabbler Apr 01 '13 at 17:52
  • Is the mail body in HTML? If it is, you cannot use regexes to parse it. – Bart Friederichs Apr 01 '13 at 17:53
  • I will edit and provide a sample of the email for everyone. –  Apr 01 '13 at 17:55
  • @BartFriederichs you can use regex to parse any text, even HTML. RegEx doesn't care about content as long as it's plain text, which HTML is. – Adrian Apr 01 '13 at 17:57
  • @BartFriederichs Are you sure? It works when I send the whole email from Outlook, which has all the CSS at the bottom. –  Apr 01 '13 at 17:57
  • @Adrian read this: http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not – Bart Friederichs Apr 01 '13 at 17:58
  • @BartFriederichs yes, you can't parse HTML into a DOM using regex, but you can most certainly do normal regex match and replace operations on HTML. I promise. I do it all day long using regex find & replace in jEdit and using grep :) – Adrian Apr 01 '13 at 18:00

1 Answer


There may be some newlines ("\n") between grass1 and grass2. By default "." does not match a newline, so "(.*)" cannot span several lines and the pattern finds nothing. The "s" modifier ("." matches "\n" too) fixes that:

preg_match_all("|grass1(.*)grass2|is", $split[1], $out);

See the PHP manual page reference.pcre.pattern.modifiers.php for the list of modifiers.

The "m" (multiline) modifier is easy to confuse with "s", but on its own it does not help with this pattern:

preg_match_all("|grass1(.*)grass2|im", $split[1], $out);

ADD Just to be clear: "m" and "s" sound similar but are not the same:

  • "s" lets the "." (dot) match the newline too, so ".*" can match multi-line text
  • "m" makes "^" and "$" match at the start and end of every line instead of only at the start and end of the whole string; it does not change what "." matches
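To see the difference side by side, here is a minimal sketch. The $body string is a made-up stand-in for $split[1] and assumes the real email body has a line break between the two markers; the lazy ".*?" in the last call is my addition, so that several grass1 ... grass2 pairs in one body would be captured separately.

```php
<?php
// Hypothetical sample body: the text between the markers spans two lines.
$body = "Content-Transfer-Encoding: 7bit\n"
      . "grass1 I'm testing the before or after function.\n"
      . "If successful you should see this. grass2";

// No "s": "." stops at the newline, so grass2 is never reached.
$plain = preg_match_all("|grass1(.*)grass2|i", $body, $outPlain);

// "m" only changes how "^" and "$" behave; "." still stops at "\n".
$multi = preg_match_all("|grass1(.*)grass2|im", $body, $outMulti);

// "s": "." also matches "\n". The lazy ".*?" stops at the first grass2.
$dotall = preg_match_all("|grass1(.*?)grass2|is", $body, $outDotall);

// Group 1 of the first match, without the surrounding whitespace.
$text = ($dotall > 0) ? trim($outDotall[1][0]) : null;

var_dump($plain, $multi, $dotall);
echo $text, "\n";
```

If the "s" version still returns nothing on the real body, it may be worth dumping $split[1] with var_dump() to check that the markers are actually in that part of the message.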
Ivan Buttinoni