I have a preg_replace pattern that works quite well on phpliveregex.com:

(\>*\s?)_______________________________________________\n(\>*\s?)(talk|tagging|talk-us|talk-gb|talk-de|osm-talk) mailing list\n(\>*\s?)(talk|tagging|talk-us|talk-gb|talk-de|osm-talk)@openstreetmap.org\n(\>*\s?)https://lists.openstreetmap.org/listinfo/(talk|tagging|talk-us|talk-gb|talk-de|osm-talk)

For example, here it deletes all the mailing-list signatures:

>> Text, blablabla
>>
>> _______________________________________________
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>
>
>
>------------------------------------------------------------------------
>
>_______________________________________________
>talk mailing list
>talk@openstreetmap.org
>https://lists.openstreetmap.org/listinfo/talk

-- 
personal signature, blabla._______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk

But when I try exactly the same in PHP with preg_replace, only the last of the three mailing-list signatures is deleted. And that happens only with the given variable. When I echo the variable's content to the browser and copy that into a new variable, like $text = 'long echoed text', it works.

$slugs = 'talk|tagging|talk-us|talk-gb|talk-de|osm-talk';            
$pattern = '!(\>*\s?)_______________________________________________\n(\>*\s*)('.$slugs.') mailing list\n(\>*\s*)('.$slugs.')@openstreetmap.org\n(\>*\s*)https://lists.openstreetmap.org/listinfo/('.$slugs.')!mi';            
return preg_replace($pattern,'',$text);

So I guess there must be some hidden encoding or hidden characters in the original variable. But how can I find out what the problem is?

Edit: it now looks to me like there is a problem with line breaks and the > that follows them, but I still don't know how to check this exactly or how to solve it.

Edit 2: when I try $text==$text2 (where $text is the original and $text2 is the result of echo $text), I get FALSE.

TL;DR: when I use the given variable, it does not work. But when I echo the variable to the browser and copy the text into a new variable, it works. What is hidden there?

Asara
  • Try using the u modifier if you have a problem with encoding. – phpio.net Nov 29 '16 at 15:57
  • It works, see http://ideone.com/BdG43Y – Wiktor Stribiżew Nov 29 '16 at 16:13
  • Yes, there it works, like it does on phpliveregex and like it does when I put the text in a variable (sorry, but did you even read the whole question?). I updated my question; it looks like it has something to do with line breaks, because I have a similar problem with another regex. – Asara Nov 29 '16 at 16:17

1 Answer

Right now the expression above only matches line breaks encoded as "\n". However, depending on the environment, a line break can be encoded as "\n", "\r", or "\r\n". So instead of \n, you should use:

[\n\r]+

See also this question and the corresponding article on Wikipedia.
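
Applied to the pattern from the question, the idea could be sketched as follows. This is only a sketch under a few assumptions: the function name stripListFooter is hypothetical, PCRE's \R (which matches "\r\n", "\n", or "\r" as a single line break) is used in place of [\n\r]+ so that blank lines are not swallowed, the quote prefix is matched with [ \t]* rather than \s*, and the underscore line is matched as any run of ten or more underscores instead of a fixed length.

```php
<?php
// Hypothetical helper: removes the mailing-list footer regardless of
// whether the lines end in "\n", "\r" or "\r\n".
function stripListFooter(string $text): string
{
    $slugs = 'talk|tagging|talk-us|talk-gb|talk-de|osm-talk';

    // Optional quote markers plus spaces/tabs. \s is avoided here on purpose,
    // because it would also match line-break characters.
    $prefix = '>*[ \t]*';

    // \R matches any single line break ("\r\n", "\n" or "\r").
    $pattern = '!' . $prefix . '_{10,}\R'
        . $prefix . '(?:' . $slugs . ') mailing list\R'
        . $prefix . '(?:' . $slugs . ')@openstreetmap\.org\R'
        . $prefix . 'https://lists\.openstreetmap\.org/listinfo/(?:' . $slugs . ')!i';

    return preg_replace($pattern, '', $text);
}

// Example input with Windows-style line endings, as a mail body might contain.
$mail = "> Text\r\n"
    . "> _______________________________________________\r\n"
    . "> talk mailing list\r\n"
    . "> talk@openstreetmap.org\r\n"
    . "> https://lists.openstreetmap.org/listinfo/talk\r\n";

$cleaned = stripListFooter($mail);
```

With [\n\r]+ the same replacement works as well; the difference is that it also consumes any empty lines between the footer lines, which may or may not be wanted.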

friedemann_bach
  • Thanks for your answer, I found it out myself just a few seconds earlier by using json_encode() :) But anyway, your answer is good and correct. And again, for other people with problems like this: using json_encode() on a string shows all hidden characters. – Asara Nov 29 '16 at 16:47