5

For this PHP script:

    $dom = new DOMDocument();
    $dom->substituteEntities = false;
    $dom->loadHTML('<a href="$a?">$a</a>');
    // print_r($dom->getElementsByTagName("a")->item(0)->getAttribute("href"));

The commented-out statement above prints `$a?` correctly.

    echo $dom->saveHTML();

However, when saveHTML() is called, the browser receives <a href="%24a">$a</a>. The $ in the href attribute is turned into %24, whereas the $ in the content of the a tag remains unchanged.

I expect the output to be <a href="$a">$a</a>. Is there any way to achieve this other than a string replace?

By the way, with

    echo $dom->saveXML();

I get the href I want, but together with an unexpected XML prolog (<!--xml......) at the top of the output. Thanks.

user3204729
  • 1
    Well, [`$` is not valid in a url](http://stackoverflow.com/questions/7109143/what-characters-are-valid-in-a-url). Any special reason you want to keep an invalid href? – Wrikken Jan 25 '14 at 12:07
  • Thanks for your reply, it is due to the requirement of a web service. they use $XXX as a variable to inject their script on it. – user3204729 Jan 25 '14 at 12:24
  • 2
    Hm, yeah, so 'not-quite-html-but-almost' it is. Tricky indeed. If the HTML snippet doesn't contain content that would make it deviate from `XML` standards, an alternative to the accepted answer is to save a node instead of the whole document, which prevents the XML prologue: `$dom->saveXML($dom->documentElement);`. That has its own quirks & drawbacks, though. The current one has the drawback that if you _need_ something urlencoded, you'd have to double encode it. So, take your pick as to what suits you better ;) – Wrikken Jan 25 '14 at 12:42
  • Same problem here. The loadHTML method automatically and unwantedly decodes URLs in meta tags, for instance a canonical. After using this method, the canonical on this URL http://mathsgenius.co.za/qa/961/solve-%24x-2-2x-1-0%24 was decoded into http://mathsgenius.co.za/qa/961/solve-$x-2-2x-1-0$ even though the source of the HTML contains the correct canonical URL. – Patrick Savalle Jun 17 '15 at 11:12
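
The node-saving idea from Wrikken's comment can be sketched as below. This is a minimal illustration, not a definitive fix: it reuses the markup from the question, the variable names `$link` and `$out` are made up for the example, and it assumes the snippet is well-formed enough to survive XML serialization.

```php
<?php
$dom = new DOMDocument();
// Same markup as in the question; single quotes keep "$a" literal in PHP.
$dom->loadHTML('<a href="$a?">$a</a>');

// loadHTML() wraps the fragment in <html><body>, so pick the node to serialize.
// $link is a hypothetical name used only for this sketch.
$link = $dom->getElementsByTagName('a')->item(0);

// Passing a node to saveXML() serializes only that node,
// so the XML declaration of the full document is not emitted.
$out = $dom->saveXML($link);

echo $out, "\n";
```

One quirk to keep in mind: XML serialization writes empty elements in self-closing form, which browsers do not always treat the same way as an explicit closing tag, so this approach is best kept to snippets where that does not matter.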

2 Answers

1

A safer approach in my case was to use:

$dom->saveXML();
Ivan Chaer
-2

You can wrap the saveHTML() output in urldecode() to solve this issue:

echo urldecode($dom->saveHTML());
voodoo417
  • 1
    What if `urldecode` decodes characters that should not be decoded? – Epoc May 30 '16 at 12:44
  • 1
    urldecode should only be done at specific parts like https://stackoverflow.com/questions/29448119/how-can-i-prevent-html-entities-with-php-a-domdocumentsavehtml – Philipp Dahse Nov 06 '17 at 11:48
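
The concern raised in the two comments above can be illustrated with plain string functions. The markup in `$html` is a made-up example of serialized output (it is not taken from the question), and the variable names are hypothetical; the sketch only contrasts decoding everything with restoring the single `%24` sequence.

```php
<?php
// Hypothetical serialized markup: one escaped "$" plus characters that must stay as they are.
$html = '<a href="%24a?q=1%2B1">1+1</a>';

// Decoding the whole document also rewrites "%2B" and turns the literal "+" in the text into a space.
$decodedAll = urldecode($html);

// Restoring only "%24" leaves every other escape sequence untouched.
$decodedDollarOnly = str_replace('%24', '$', $html);

echo $decodedAll, "\n";
echo $decodedDollarOnly, "\n";
```

Even the narrower replacement is applied to the whole string, so it would also change a `%24` that was meant to stay encoded; restricting the decode to the specific attributes involved, as the linked question suggests, avoids that.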