
I am new to PHP and trying to replace a URL pattern with google.com in the code below.

    $textStr = "Test string contains http://foo.com/more_(than)_one_(parens)
http://foo.com/blah_(wikipedia)#cite-1
http://foo.com/blah_(wikipedia)_blah#cite-1
http://foo.com/unicode_(?)_in_parens
http://foo.com/(something)?after=parens
more urls foo.ca/me some other text";

$pattern = '(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)((?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))*)';

$textStr = preg_replace($pattern, "google.com", $textStr); 

echo $textStr;

I found the regular expression pattern at http://daringfireball.net/2010/07/improved_regex_for_matching_urls but I have not been able to escape the single and double quotes in the pattern successfully.

Currently I get the message: Warning: preg_replace() Unknown modifier '\'. However, I did use a backslash to escape the single quote in {};:\'"

Can someone please help me with the code above?

James
  • possible duplicate of [Converting ereg expressions to preg (missing delimiters)](http://stackoverflow.com/questions/6270004/converting-ereg-expressions-to-preg) – mario Jan 09 '12 at 05:56

2 Answers

// Matches an optional "www." followed by a run of letters/digits and a dot, e.g. "foo." in "foo.com"
$pattern = '/([wW]{3}\.|)[A-Za-z0-9]+?\./';
$text = "Test string contains http://foo.com/more_(than)_one_(parens)
http://foo.com/blah_(wikipedia)#cite-1
http://foo.com/blah_(wikipedia)_blah#cite-1
http://foo.com/unicode_(?)_in_parens
http://foo.com/(something)?after=parens
more urls foo.ca/me some other text";
// Each match is replaced by "abc.", so the rest of the URL is kept
$output = preg_replace($pattern, "abc.", $text);
print_r($output);

The output will be:

Test string contains http://abc.com/more_(than)_one_(parens)
http://abc.com/blah_(wikipedia)#cite-1
http://abc.com/blah_(wikipedia)_blah#cite-1
http://abc.com/unicode_(?)_in_parens
http://abc.com/(something)?after=parens
more urls abc.ca/me some other text
Manigandan Arjunan
  • Thank you for your help. While I can't use this for my current needs, this will definitely be handy in other situations. – James Jan 11 '12 at 04:56

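First, preg_replace requires the pattern to be wrapped in delimiters. Your pattern starts with (, so PHP treats ( and its matching ) as the delimiters, reads (?i) as the whole pattern, and then tries to interpret the following \b as a modifier, hence the Unknown modifier '\' warning. A common choice of delimiter is /, as in: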

/\b((?:https: ... etc etc)/

Second, because / is now the delimiter, every literal / inside the pattern must be escaped with a backslash, so https:// becomes https:\/\/.

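Third, the case-insensitive modifier can go after the closing delimiter as a trailing i (PHP also accepts the inline (?i) form, as long as it sits inside the delimiters):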

`/\b((?:https: .. etc etc)/i`

Try (changes made: escaped /, moved regex from (?i)regex to /regex/i):

$pattern = '/\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)((?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))*)/i';
$textStr = preg_replace($pattern, "google.com", $textStr); 

echo $textStr;

Now, since $pattern matches the entire URL you will just get out:

"Test string contains google.com
google.com
google.com
google.com
google.com
more urls google.com some other text"

So, all in all, I recommend either @Ampere's answer (although it uses a looser regex than your original) or capturing groups and backreferences, for example preg_replace($pattern,'google.com/\2',$textStr). You would need to adjust the capturing groups first, as this will not work with your current arrangement.
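
To illustrate the backreference idea, here is a minimal sketch. It assumes a much simpler, hypothetical pattern than the one in the question: the optional scheme and the host are matched without capturing, and only the path lands in group 1, so the replacement can put it back with $1 (equivalent to \1).

```php
<?php
// Hypothetical, simplified pattern (NOT the pattern from the question):
//   (?:https?://)?               optional scheme, not captured
//   [a-z0-9.\-]+[.][a-z]{2,4}    host name ending in a 2-4 letter TLD
//   (/[^\s<>]*)?                 optional path, captured as group 1
// "~" is used as the delimiter so the slashes need no escaping.
$pattern = '~\b(?:https?://)?[a-z0-9.\-]+[.][a-z]{2,4}(/[^\s<>]*)?~i';

$text = "Visit http://foo.com/blah_(wikipedia)#cite-1 or foo.ca/me today";

// $1 re-inserts the captured path after the new host name.
$rewritten = preg_replace($pattern, 'google.com$1', $text);

echo $rewritten, PHP_EOL;
// Visit google.com/blah_(wikipedia)#cite-1 or google.com/me today
```

Because the host is matched but not captured, only the path survives the replacement. A URL without a path leaves group 1 empty and becomes plain google.com.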

This site is useful for testing things out.

mathematical.coffee
  • The regex delimiter doesn't have to be `/`, it can be almost any punctuation character. For example, if you use `~` you won't have to escape anything because that character never appears in the regex. Also, PHP does support the `(?i)` (inline modifier) syntax, so you don't have to change that (but trailing modifiers work, too). – Alan Moore Jan 09 '12 at 18:46
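
A small sketch of what that comment describes, using a simplified, hypothetical URL pattern rather than the full one from the question. With ~ as the delimiter, the slashes in :// need no escaping, and the inline (?i) modifier stays where it is.

```php
<?php
// Hypothetical, simplified URL pattern, for illustration only.
// "~" delimits the pattern; (?i) makes the match case-insensitive.
$pattern = '~(?i)\bhttps?://[^\s<>]+~';

$text = "See HTTP://foo.com/a and https://bar.org/b?x=1 for details";

$result = preg_replace($pattern, 'google.com', $text);

echo $result, PHP_EOL;
// See google.com and google.com for details
```

The same pattern written as '~\bhttps?://[^\s<>]+~i', with a trailing i instead of (?i), behaves identically here.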