
I have some text that contains hyperlinks. Some of the URLs contain spaces, and I want to convert those spaces to %20.

For example:

To make hyperlinks <a href="http://www.link-to-my-page.com/page 1.html">Page 1</a>

If I convert the above text using the rawurlencode function, it encodes the whole string instead of just the URL and returns

To%20make%20hyperlinks%20%3Ca%20href%3D%22http%3A%2F%2Fwww.link-to-my-page.com%2Fpage%201.html%22%3EPage%201%3C%2Fa%3E

I wrote the following regex to convert spaces to %20 in links only, but I am not sure how to apply the space match (\s)* with preg_replace.

/(http|https|ftp|ftps)(\:\/\/[a-zA-Z0-9\-\.]+)(\s)*\.[a-zA-Z]{2,4}(\/\S*)?/

Any help would be greatly appreciated.

Thanks

Maximus
  • If you just want to convert spaces to %20, you can use `$transformed = str_replace(' ', '%20', $input);` Is this what you're looking for? – Andrew Jun 21 '12 at 17:49
  • @Andrew This will give the same result I described in my question. – Maximus Jun 21 '12 at 17:54
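
To make the exchange above concrete, here is a minimal sketch of the blanket `str_replace` call applied to the sample text from the question. The variable names follow Andrew's comment. It shows that every space is replaced, including those outside the URL, which is the problem the question describes.

```php
<?php
// Sample text from the question.
$input = 'To make hyperlinks <a href="http://www.link-to-my-page.com/page 1.html">Page 1</a>';

// Replaces every space in the string, not only the ones inside the href.
$transformed = str_replace(' ', '%20', $input);

echo $transformed, "\n";
// To%20make%20hyperlinks%20<a%20href="http://www.link-to-my-page.com/page%201.html">Page%201</a>
```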

2 Answers


The easiest thing to do is to use DOMDocument and let it fix this for you:

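$html = 'To make hyperlinks <a href="http://www.link-to-my-page.com/page 1.html">Page 1</a>';

// Parse the fragment; the href is escaped when the node is serialised again
$doc = new DOMDocument();
$doc->loadHTML($html);

// loadHTML() wraps incomplete HTML in a container, so collect only the
// children of the generated <p> to get the original fragment back
$innerHTML = '';
foreach ($doc->getElementsByTagName('p')->item(0)->childNodes as $child) {
    $innerHTML .= $doc->saveHTML($child);
}

echo $innerHTML;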

Output, thanks to this SO question:

To make hyperlinks <a href="http://www.link-to-my-page.com/page%201.html">Page 1</a>
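
If relying on the generated `<p>` wrapper feels fragile, a variation on the same idea is sketched below. It is not part of the original answer. The helper name `encodeHrefSpaces()` is hypothetical, the fragment is wrapped in a `<div>` added purely so there is a known node to read back, and the spaces in each `href` are replaced explicitly instead of leaving that to the serializer. It assumes the fragment is plain ASCII HTML.

```php
<?php
// Hypothetical helper, not from the original answer.
function encodeHrefSpaces(string $fragment): string
{
    $doc = new DOMDocument();

    // Wrap the fragment in our own <div> so we know which node to read back,
    // instead of relying on the <p> the parser adds around bare text.
    $previous = libxml_use_internal_errors(true); // collect parser warnings quietly
    $doc->loadHTML('<div>' . $fragment . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Encode spaces in every href explicitly.
    foreach ($doc->getElementsByTagName('a') as $link) {
        if ($link->hasAttribute('href')) {
            $href = $link->getAttribute('href');
            $link->setAttribute('href', str_replace(' ', '%20', $href));
        }
    }

    // The first <div> in document order is the wrapper added above.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }

    return $result;
}

$html = 'To make hyperlinks <a href="http://www.link-to-my-page.com/page 1.html">Page 1</a>';
echo encodeHrefSpaces($html), "\n";
```

The text outside the link is left alone because only the `href` attribute values are touched.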
nickb
  • It's not letting me edit your post. But I think you're right. This should actually be the getElementsByTagName('a') not p. – hsanders Jun 21 '12 at 18:00
  • No it's not, see the example that I linked as well as the other SO question. It needs to be `p` because `loadHTML()` inserts an HTML container around incomplete HTML. The entire loop is there to get rid of that excess HTML. – nickb Jun 21 '12 at 18:01

The right answer here isn't a regexp. It's urlencode() http://php.net/manual/en/function.urlencode.php
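
As a quick illustration of what these functions do to a space, the sketch below compares `urlencode()` with `rawurlencode()` on the path segment from the question. The variable names are chosen for the example only.

```php
<?php
// Only the path segment, not the whole URL or the surrounding text.
$path = 'page 1.html';

$plus = urlencode($path);    // form-style encoding: space becomes "+"
$raw  = rawurlencode($path); // RFC 3986 style: space becomes "%20"

echo $plus, "\n"; // page+1.html
echo $raw, "\n";  // page%201.html
```

Either function applied to a complete URL would also escape the `:` and `/` characters, so they are best applied to individual path or query components.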

hsanders
  • That adds a `+` in the text. I think @jason only wants to urlencode the links within a block of text. – sachleen Jun 21 '12 at 17:50
  • Then he would want to detect URLs using a regular expression, then feed all matches to urlencode(). – hsanders Jun 21 '12 at 17:51
  • I reread this question, and I think he should actually use DOMDocument as @nickb suggests. I thought he was talking about echoing URLs into an "a" tag rather than parsing an HTML document and fixing it. It's still slightly unclear, but I'm leaning towards the latter. – hsanders Jun 21 '12 at 18:01
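
For completeness, the regex route discussed in the comments above could look roughly like the sketch below. It is an illustration only: the helper name `encodeSpacesInHrefs()` is hypothetical, the pattern assumes quoted `href` attributes, and it replaces spaces with `str_replace` rather than calling `urlencode()`, because `urlencode()` on a whole URL would also escape `:` and `/`. A regex is less robust than a real HTML parser on irregular markup.

```php
<?php
// Hypothetical helper, not from the original thread.
function encodeSpacesInHrefs(string $html): string
{
    $result = preg_replace_callback(
        // 1: 'href=' with optional whitespace, 2: opening quote, 3: URL up to the matching quote
        '/(href\s*=\s*)(["\'])(.*?)\2/i',
        function (array $m): string {
            // Rebuild the attribute with spaces in the URL replaced by %20.
            return $m[1] . $m[2] . str_replace(' ', '%20', $m[3]) . $m[2];
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $html;
}

$html = 'To make hyperlinks <a href="http://www.link-to-my-page.com/page 1.html">Page 1</a>';
echo encodeSpacesInHrefs($html), "\n";
// To make hyperlinks <a href="http://www.link-to-my-page.com/page%201.html">Page 1</a>
```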