
I've been struggling with this problem for quite some time now and I just can't seem to find a solution. I have the following regular expression for matching URLs, which appears to work flawlessly until I post several links on separate lines with no spaces between them.

(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+

I tried this in a couple of regex testers and it picks out the URLs correctly, unlike the code in my application. That made me think there must be something wrong with the code, so I started debugging. This is what I found when I echoed the string I'm applying the regular expression to:

http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/

I have never seen new lines \r\n appear as text in the browser. This makes me think that there's something else getting its hands on this string. I followed my logic and it turned out that this string comes right from a textarea element into $_POST and is not being manipulated anywhere.

What may be causing those \r\ns to appear as text and how would I go about matching those URLs that users may input separated by new lines?

I'm really desperate over here, and I would really appreciate your help, guys.

php_nub_qq
  • I can't reproduce. Using your regexp pattern and the text, `preg_match_all` grabs all three just fine. See here: http://3v4l.org/1KX69 – jszobody Nov 16 '13 at 18:34
  • @Thrustmaster sorry I thought it was clear enough, I edited the question – php_nub_qq Nov 16 '13 at 18:35
  • The \r\n is the Windows newline sequence (carriage return plus line feed). So Windows users typing into a textarea and adding line breaks will cause that – jszobody Nov 16 '13 at 18:36
  • How are you applying the regex? You're using `preg_match_all` right? – jszobody Nov 16 '13 at 18:38
  • @jszobody as I said, I also tried this in a couple of online regex testers and in a PHP sandbox and it did work as expected, but it just doesn't on my hosting server. This is such a pain – php_nub_qq Nov 16 '13 at 18:38
  • Then show your full PHP code. My example isn't just an online tester, it's actual PHP code running, successfully. – jszobody Nov 16 '13 at 18:38
  • @jszobody I'm using preg_replace but it doesn't really matter since regular expressions always apply the same – php_nub_qq Nov 16 '13 at 18:39
  • @jszobody As I also said, this string goes from a `textarea` object right into `$_POST` and into this `preg_replace`, it's not being manipulated or used anywhere before the `preg_replace`. – php_nub_qq Nov 16 '13 at 18:43
  • The regex is solid. You need to show more of your PHP code, how you're doing the preg_replace. This isn't a regex issue. – jszobody Nov 16 '13 at 18:44
  • See http://3v4l.org/AaLvW. Same text, same regex, working with preg_replace. You have a PHP issue somewhere that you aren't showing us. – jszobody Nov 16 '13 at 18:45
  • @jszobody Turned out the string was being escaped without me knowing it. That's what I hate about OOP.. Thank you for your time jszobody!! – php_nub_qq Nov 16 '13 at 18:48

1 Answer


If you are seeing

http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/

when you echo the string, that means the actual string you are echoing is (written as it would appear inside a double-quoted PHP literal):

http://www.google.com/\\r\\nhttp://www.google.com/\\r\\nhttp://www.google.com/

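i.e. the backslashes have been escaped, so each \r\n is a literal backslash followed by a letter rather than a carriage return and line feed. None of those characters is whitespace, so the trailing (\S)+ keeps consuming right across them. This means that you are only getting a single match in your regex.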
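To see the difference in isolation, here is a minimal sketch that runs the pattern from the question against both forms of the string. The `~` delimiters and all variable names are my own assumptions, since the question doesn't show how the pattern is wrapped or called.

```php
<?php
// The pattern from the question. The ~ delimiters are an assumption;
// the original post does not show which delimiters were used.
$pattern = '~(http|ftp)+(s)?:(\/\/)((\w|\.|\-)+)(\/)?(\S)+~';

// Double quotes: PHP turns \r\n into a real carriage return + line feed.
$real = "http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/";

// Single quotes: \r\n stays as the four literal characters \ r \ n,
// which is what an escaped string looks like to the regex engine.
$escaped = 'http://www.google.com/\r\nhttp://www.google.com/\r\nhttp://www.google.com/';

$realCount = preg_match_all($pattern, $real, $realMatches);
$escapedCount = preg_match_all($pattern, $escaped, $escapedMatches);

echo $realCount, "\n";    // prints 3
echo $escapedCount, "\n"; // prints 1
```

In the second case the single match is the entire string, because a backslash, an `r` and an `n` all count as non-whitespace for `\S`.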

Check out this question, "Why are $_POST variables getting escaped in PHP?", for reasons why your requests may be getting escaped.
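If it isn't obvious where the escaping happens, one option is to check the value at several points between `$_POST` and the regex call. The following is only a rough sketch: `describeLineBreaks` is a hypothetical helper, and the `str_replace` call merely stands in for whatever code is escaping the string in your application, which we haven't seen.

```php
<?php
// Hypothetical helper: reports whether a string holds real line breaks
// or the literal two-character sequences \r and \n.
function describeLineBreaks(string $value): string
{
    // Double quotes: actual CR / LF bytes.
    $hasReal = strpos($value, "\r") !== false || strpos($value, "\n") !== false;
    // Single quotes: a backslash followed by the letter r or n.
    $hasLiteral = strpos($value, '\r') !== false || strpos($value, '\n') !== false;

    if ($hasReal && $hasLiteral) {
        return 'mixed';
    }
    if ($hasReal) {
        return 'real';
    }
    return $hasLiteral ? 'escaped' : 'none';
}

// Stand-in for the textarea value as it arrives in $_POST.
$fromTextarea = "http://www.google.com/\r\nhttp://www.google.com/";

// Stand-in for the unknown escaping step (an assumption for illustration).
$afterEscaping = str_replace(["\r", "\n"], ['\r', '\n'], $fromTextarea);

echo describeLineBreaks($fromTextarea), "\n";  // prints real
echo describeLineBreaks($afterEscaping), "\n"; // prints escaped
```

Once you know which step flips the result from `real` to `escaped`, the simplest fix is probably to run the regex on the value before that step.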

DaveJohnston
  • That is exactly right, I am only getting one match and that is the whole string with all the URLs as one. The problem is that I'm not escaping that string, how can those extra backslashes appear on their own? – php_nub_qq Nov 16 '13 at 18:42
  • Yeah, so the problem is nothing to do with the regex, but somewhere else in code that we haven't seen there must be escaping of characters occurring. – DaveJohnston Nov 16 '13 at 18:44
  • Ahhhhh god, how could I have not seen this coming. The string is actually being escaped in another method. But I would have never found that out if it weren't for your advice. Thank you!!! – php_nub_qq Nov 16 '13 at 18:46