27

I have a strange problem:

In the database, I have a literal ampersand lt semicolon:

&lt;div

Whenever it's printed into an HTML textarea tag, the source code of the page shows the &lt; as <.

How do I stop this decoding?

Eric Leschinski
Rami Dabain

7 Answers

41

You can't stop entities being decoded in a textarea since the content of a textarea is not (unlike a script or style element) intrinsic CDATA, even though error recovery may sometimes give the impression that it is.

The definition of the textarea element is:

<!ELEMENT TEXTAREA - - (#PCDATA)       -- multi-line text field -->

i.e. it contains PCDATA which is described as:

Document text (indicated by the SGML construct "#PCDATA"). Text may contain character references. Recall that these begin with & and end with a semicolon (e.g., Herg&eacute;'s adventures of Tintin contains the character entity reference for the e acute character).

This means that when you type (the invalid HTML of) "start of tag" (<) the browser corrects it to "less than sign" (&lt;) but when you type "start of entity" (&), which is allowed, no error correction takes place.

You need to write what you mean. If you want to include some HTML as data then you must convert any character with special meaning to its respective character reference.

If the data is:

&lt;div

Then the HTML must be:

<textarea>&amp;lt;div</textarea>

You can use the standard functions for converting this (e.g. PHP's htmlspecialchars or Perl's HTML::Entities module).
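Where no built-in helper is available, the conversion can be sketched by hand. The following is a minimal JavaScript sketch, not a substitute for a well-tested library function: `escapeHtml` is a hypothetical name, and it only covers the five characters with special meaning in HTML text and attribute values.

```javascript
// Hypothetical helper: convert characters with special meaning in HTML
// to character references, so the browser displays them literally.
function escapeHtml(text) {
  const references = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  // A single pass over the input, so an "&" produced by one replacement
  // is never escaped a second time.
  return String(text).replace(/[&<>"']/g, (ch) => references[ch]);
}

// The data from the question, as stored in the database.
const data = "&lt;div";
const html = "<textarea>" + escapeHtml(data) + "</textarea>";
console.log(html); // <textarea>&amp;lt;div</textarea>
```

The browser decodes `&amp;lt;div` exactly once when it parses the page, so the textarea shows `&lt;div`, which is the stored data.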

NB 1: If you were using XHTML[2] (and really using it, it doesn't count if you serve it as text/html) then you could use an explicit CDATA block:

<textarea><![CDATA[&lt;div]]></textarea>

NB 2: Or if browsers implemented HTML 4 correctly


OK, but the question is: why does it decode them anyway? Assuming I've added &lt; and saved the textarea, it will be saved as &lt; but displayed as <. Saving it again will convert it to < (but it will remain &lt; in the database); saving again will save it as < in the database. Why does the textarea decode it?

  • The server sends (to the browser) data encoded as HTML.
  • The browser sends (to the server) data encoded as application/x-www-form-urlencoded (or multipart/form-data).

Since the browser is not sending the data as HTML, the characters are not represented as HTML entities.

If you take the data received from the client and then put it into an HTML document, then you must encode it as HTML first.
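The round trip can be sketched in JavaScript as follows. This is an illustration under assumptions, not the asker's actual stack: the field name `content` and the helper `escapeHtml` are hypothetical, and `URLSearchParams` stands in for whatever form parser the server uses.

```javascript
// Hypothetical helper: encode text for use inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// 1. The page source contains &amp;lt;div, so the textarea's value is &lt;div.
// 2. On submit, the browser sends that value URL-encoded, not HTML-encoded.
const requestBody = "content=%26lt%3Bdiv";

// 3. The server decodes the form body and gets the plain text back.
const submitted = new URLSearchParams(requestBody).get("content");
console.log(submitted); // &lt;div

// 4. Before the text goes into an HTML document again, encode it as HTML.
const html = '<textarea name="content">' + escapeHtml(submitted) + "</textarea>";
console.log(html); // <textarea name="content">&amp;lt;div</textarea>
```

If step 4 is skipped, each save-and-display cycle strips one level of encoding, which matches the behaviour described in the comment quoted above.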

TylerH
Quentin
25

In PHP, this can be done using htmlentities(). Example below.

<?php
  $content = "This string contains the TM symbol: &trade;";
  print "<textarea>". htmlentities($content) ."</textarea>";
?>

Without htmlentities(), the textarea would interpret and display the TM symbol (™) instead of "&trade;".

http://php.net/manual/en/function.htmlentities.php

Eric Leschinski
Chris Hubbard
  • The good thing is that it doesn't even require converting back after form submission, since the browser decodes the entities. – Moradnejad Nov 30 '19 at 07:07
1

You can serve your DB-content from a separate page and then place it in the textarea using a Javascript (jQuery) Ajax-call:

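var request = $.ajax({
    type: "GET",
    url: "url-with-the-troubled-content.php",
    dataType: "text", // treat the response as plain text, not HTML
    success: function (data) {
        // Setting .value assigns the text directly; no entity decoding happens
        document.getElementById('id-of-text-area').value = data;
    }
});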

Explained at

http://www.endtask.net/how-to-prevent-a-textarea-element-from-decoding-html-entities/
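The same idea without jQuery might look like the sketch below. It assumes the endpoint returns the raw text as the response body; `fillTextarea` and its `fetchImpl` parameter are hypothetical names, the latter added only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: load raw text from `url` and put it into a textarea.
// `fetchImpl` defaults to the standard fetch() and can be swapped for testing.
async function fillTextarea(textarea, url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  // .value is a plain string property: the text is not parsed as HTML,
  // so "&lt;div" stays "&lt;div".
  textarea.value = await response.text();
  return textarea.value;
}

// In a page (element id and URL taken from the example above):
// fillTextarea(
//   document.getElementById("id-of-text-area"),
//   "url-with-the-troubled-content.php"
// );
```

The trade-off is an extra request and a dependency on JavaScript; encoding the content on the server avoids both.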

Victor
1

You have to be sure that this is rendered to the browser:

<textarea name="somename">&amp;lt;div</textarea>

Essentially, this means that the & in &lt; has to be html encoded to &amp;. How to do it will depend on the technologies you're using.

UPDATE: Think about it like this. If you want to display <div> inside a textarea, you'll have to encode <> because otherwise, <div> would be a normal HTML element to the browser:

<textarea name="somename">&lt;div&gt;</textarea>

Having said this, if you want to display &lt;div&gt; inside a textarea, you'll have to encode & again, because the browser decodes HTML entities when rendering HTML. It has nothing to do with your database.
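One way to picture this is to model the single decoding pass the browser performs. The sketch below is a simplified JavaScript model that only knows a handful of named references (a real HTML parser handles many more); `decodeOnce` is a hypothetical name.

```javascript
// Simplified model of the browser's decoding step (hypothetical helper).
// Each character reference is decoded exactly once, in a single pass.
function decodeOnce(html) {
  const characters = { amp: "&", lt: "<", gt: ">", quot: '"' };
  return html.replace(/&(amp|lt|gt|quot);/g, (match, name) => characters[name]);
}

// Encoded once: the user sees the tag as text.
console.log(decodeOnce("&lt;div&gt;")); // <div>

// Encoded twice: the user sees the entities themselves.
console.log(decodeOnce("&amp;lt;div&amp;gt;")); // &lt;div&gt;
```

Whatever should appear on screen has to be encoded one level deeper in the page source.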

Lukas Eder
  • 1
    OK, but the question is: why does it decode them anyway? Assuming I've added `&lt;` and saved the textarea, it will be saved as `&lt;` but displayed as `<`. Saving it again will convert it to `<` (but it will remain `&lt;` in the database); saving again will save it as `<` in the database. Why does the textarea decode it? – Rami Dabain Dec 15 '11 at 11:28
  • 1
    The browser does the decoding, according to HTML standards. It would be decoded all the same if displayed outside of a `<textarea>`. So you must encode `&` to the browser, not to the database. Think about it the other way round. If you wanted to display a `<div>` to the browser (textarea or not), how would you do it without encoding `<>` to `&lt;&gt;`? You couldn't, because `<div>` would be interpreted as an HTML element. Now, recurse this thought. How would you display `&lt;div&gt;` in the browser without encoding `&`...?
    – Lukas Eder Dec 15 '11 at 11:35
  • If you view the source of the page, it's displayed as `&lt;`, but in the browser it IS decoded to `<`!!! I know it's impossible, but it's happening; I'm positive about that. – Rami Dabain Dec 15 '11 at 13:22
  • 1
    @RonanDejhero: There are now two interesting explanations to your question, the rest of the thinking, you'll have to do yourself, I'm afraid... Except if you refuse to understand how things work :-) – Lukas Eder Dec 15 '11 at 13:35
  • 1
    @Ronan Dejhero — It is not impossible, it is what is *required* to happen by the HTML specification. – Quentin Dec 15 '11 at 13:54
  • @Quentin it is required in the HTML , but it is also required in the HTML that text areas ` – Rami Dabain Mar 17 '12 at 10:15
  • @RonanDejhero: I've no idea what you are reading that suggests that, but the specification says that it contains `#PCDATA`, which means it can contain character references, but not tags. So when the HTML includes `&lt;`, the text node's value will be `<` and that will be submitted. When the HTML includes `<`, that is an error and the browser will error recover so that the text node also includes `<` (this is defined in the HTML 5 drafts). (NB: Consistent error recovery is not a license to make errors.) – Quentin Mar 17 '12 at 15:44
0

I had the same problem and I just made two replacements on the text to show from the database before letting it into the text area:

myString = Replace(myString, "&", "&amp;")
myString = Replace(myString, "<", "&lt;")

Replacement no. 1 tricks the textarea into showing the codes. Replacement no. 2: without it, you cannot show the text "</textarea>" inside the textarea (it would end the textarea tag).

(ASP / VBScript code above; translate it to the replace method of your language of choice.)
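The order of the two replacements matters: `&` has to be handled first. A short JavaScript sketch of the same two steps (both function names are hypothetical) shows what goes wrong when they are swapped.

```javascript
// Same two replacements as above, in the correct order.
function encodeForTextarea(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;");
}

// Swapped order: the "&" of each freshly inserted "&lt;" gets encoded again.
function encodeWrongOrder(text) {
  return text.replace(/</g, "&lt;").replace(/&/g, "&amp;");
}

const stored = "&lt;div </textarea>";
console.log(encodeForTextarea(stored)); // &amp;lt;div &lt;/textarea>
console.log(encodeWrongOrder(stored));  // &amp;lt;div &amp;lt;/textarea>
```

With the wrong order, the textarea would display `&lt;/textarea>` instead of `</textarea>`.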

Andreas Jansson
0

I found an alternative for reading and working with the content in the browser: simply read the element's text() using jQuery. It returns the characters as they are displayed, which lets me write from a textarea to a div's innerHTML via html()...

-1

With only JS and HTML...

...to answer the actual question, with a bare-minimal example:

<textarea id=myta></textarea>

<script id=mytext type=text/plain>
  &trade;
</script>

<script>  myta.value = mytext.innerText;  </script>

Explanation:

Script tags do not render HTML or entities. Text stored in a script tag remains unadulterated; the problem is that the browser will try to execute it as JavaScript. So we use an empty textarea and store the text in a script tag (here, the first one).

To prevent execution, we change the MIME type to text/plain instead of its default, which is text/javascript. This stops the browser from running it.

Then to populate the textarea, we copy the script tag's content to it (here done in the second script tag).

The only caveats I have found with this are you have to use JavaScript and you cannot include script tags directly in it.

jdmayfield
  • Whoever thumbed down my answer, please explain yourself. It is the purest answer possible, as I explicated, since you cannot store html entities directly inside the textarea in an html file without them being pre-rendered on page load, and this answer is the only one which has no dependencies. Therefore it is the simplest resolution. – jdmayfield Oct 29 '22 at 16:41