-1

I have an HTML string and need to remove the double quotes inside the href value of an anchor tag.

$content = '<p style="abc" rel="blah blah"> Hello I am p </p> <a href="https://example.com/abc?name="xyz&123""></a>';

should return

$content = '<p style="abc" rel="blah blah"> Hello I am p </p> <a href="https://example.com/abc?name=\'xyz&123\'"></a>';

I have tried

preg_replace('/<a\s+[^>]*href\s*=\s*"([^"]+)"[^>]*>/', '<a href="\1">', $content)

but this removes every attribute from the anchor tag except href. I have been unable to find something that works inside the href value itself. I am looking for PHP code that does this.

RiggsFolly
user3350093

2 Answers

1

You may try:

(<a href=".*?)"(.*?)"(.*)

Explanation of the above regex:

  • (<a href=".*?) - The first capturing group, which captures <a href=" plus everything up to the next ". The lazy quantifier makes the match stop at that first inner quote.
  • " - Matches " literally.
  • (.*?) - The second capturing group, which captures the text between the inner double quotes (xyz&123 here).
  • (.*) - The third capturing group, which captures everything after the second inner ".
  • $1\'$2\'$3 - The replacement; it re-emits the captured groups with single quotes in place of the inner double quotes.

You can find the demo of the above regex in here.

Sample implementation in PHP:

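<?php
// Groups 1 and 3 keep the text around the inner quotes; group 2 is the quoted value.
$re = '/(<a href=".*?)"(.*?)"(.*)/m';
$str = '<p style="abc" rel="blah blah"> Hello I am p </p> <a href="https://example.com/abc?name="xyz&123""></a>';
// Re-emit the three groups with single quotes in place of the inner double quotes.
$subst = '$1\'$2\'$3';

$result = preg_replace($re, $subst, $str);

echo $result;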

You can find the sample run of the above code in here.

  • Hey thanks @Mandy8055 for the explanation. Let me just try it out. – user3350093 Jul 02 '20 at 09:32
  • Hey thanks @Mandy8055 . That works fantastic. Just one more doubt what if i had more than one group of double quote. Something like New bbb – user3350093 Jul 02 '20 at 09:38
  • In that case @user3350093; does [**this**](https://stackoverflow.com/questions/15926142/regular-expression-for-finding-href-value-of-a-a-link/15926317) help? –  Jul 02 '20 at 09:59
-1

I have tried preg_replace('/<a\s+[^>]*href\s*=\s*"([^"]+)"[^>]*>/', '<a href="\1">', $content) regex. but this removes all attributes from anchor tag except for href.

Maybe be a bit more generic - and leave all that <a ...> stuff out of the equation to begin with?

Not too many HTML elements have a href attribute to begin with - and even if you encountered a different one with such a href value, it would not make sense there either, so it would need replacing as well anyway.

Use #href="(\S+)"# as a greedy pattern that looks for and captures the longest possible non-whitespace string between href=" and ". That gives href="https://example.com/abc?name="xyz&123"" as the full match, and just https://example.com/abc?name="xyz&123" as the captured group.
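To see what the greedy pattern picks up, here is a minimal sketch that runs it with preg_match against the input string from the question. The variable name $m is only illustrative.

```php
<?php
// Input string taken from the question.
$content = '<p style="abc" rel="blah blah"> Hello I am p </p> <a href="https://example.com/abc?name="xyz&123""></a>';

// \S+ is greedy, so it first runs to the end of the whitespace-free text
// after href=" and then backtracks to the last double quote in that run.
preg_match('#href="(\S+)"#', $content, $m);

echo $m[0], PHP_EOL; // full match:     href="https://example.com/abc?name="xyz&123""
echo $m[1], PHP_EOL; // captured group: https://example.com/abc?name="xyz&123"
```

The captured group in $m[1] still contains the two inner double quotes; those are what is left to strip.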

Let’s feed the latter into str_replace to get rid of the ", using preg_replace_callback:

$content = preg_replace_callback('#href="(\S+)"#', function($m) {
  return 'href="'.str_replace('"', '', $m[1]).'"';
}, $content);
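
As a usage sketch, the same callback can be wrapped in a small helper so that the inner quotes are either removed or swapped for single quotes. The function name clean_href_quotes and its $replacement parameter are hypothetical, introduced only for illustration; the pattern and the str_replace call are the ones shown above.

```php
<?php
// Hypothetical helper (not part of the original answer): applies the
// callback above and lets the caller choose what each inner " becomes.
function clean_href_quotes(string $html, string $replacement = ''): string
{
    $result = preg_replace_callback('#href="(\S+)"#', function ($m) use ($replacement) {
        // $m[1] is the href value without its outer double quotes.
        return 'href="' . str_replace('"', $replacement, $m[1]) . '"';
    }, $html);

    // preg_replace_callback returns null on a regex engine error;
    // fall back to the unchanged input in that case.
    return $result ?? $html;
}

$content = '<p style="abc" rel="blah blah"> Hello I am p </p> <a href="https://example.com/abc?name="xyz&123""></a>';

$removed = clean_href_quotes($content);      // inner quotes dropped
$single  = clean_href_quotes($content, "'"); // inner quotes become single quotes

echo $removed, PHP_EOL;
echo $single, PHP_EOL;
```

Because the pattern relies on \S+, an href value that contains whitespace is not captured in full, so this sketch only covers whitespace-free values like the one in the question.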
CBroe
  • If you really need single quotes around the `xyz123` value here (can’t really see why that would be necessary, your explanation of that does not make sense to me), then make the second parameter of the `str_replace` call `"'"` instead. – CBroe Jul 02 '20 at 08:53
  • your explanation seems convincing but unfortunately did not work out for me. still getting the same href. Am fine with removing double quotes & not replacing the same. – user3350093 Jul 02 '20 at 08:57
  • It definitively works for the input data you had shown in your question. What you posted in comments later - unclear what _code_ exactly that is actually supposed to be, due to StackOverflow doing its own parsing, and making part of the URL into a clickable link. Please post code in comments in backticks, or add the example to your question. – CBroe Jul 02 '20 at 09:00
  • Sorry I got some white space as well New bbb in 1978 . let me just try some more variations to your callback function. Got the basic idea of how this works. – user3350093 Jul 02 '20 at 09:29