
Given the following HTML fragment:

<p id="para">hello &lt;world&gt;</p>

The jQuery .text() and .html() methods will return different values...

var text = $('#para').text(); // gives 'hello <world>'
var html = $('#para').html(); // gives 'hello &lt;world&gt;'

The jQuery documentation states:

The result of the .text() method is a string containing the combined text of all matched elements.

and...

In an HTML document, we can use .html() to get the contents of any element. If the selector expression matches more than one element, only the first one's HTML content is returned.

But this specific difference with &lt; and &gt; doesn't seem to be documented anywhere.

Can anyone comment on the rationale for this difference?

Edit

A little more investigation suggests that the values returned by .text() and .html() match those of the native innerText and innerHTML properties respectively (at least when the jQuery selector matches a single element). This is not reflected in the jQuery documentation either, so I am not 100% sure the observation holds in all scenarios. Reading through the jQuery source reveals that this isn't what's actually going on under the hood.

Richard Ev
  • From instinct, I'd say this is as designed, but I don't have a rationale ready... (checking manual) – Pekka Dec 16 '10 at 10:57
  • Nope, can't find anything. However, the right place to look for this is in the manuals on the `innerText`/`textContent` properties rather than jQuery's `text()` which is only a wrapper to one of those functions. It would also be interesting to see whether this is consistent cross-browser behaviour? – Pekka Dec 16 '10 at 11:00
  • @Pekka: Are you sure `text()` wraps these methods? I can't see that in the jQuery source. – Richard Ev Dec 16 '10 at 11:33

4 Answers


I think it happens so that round-tripping can work correctly. You should be able to get a perfect clone of the original content by calling $() on the result of html():

var clonedContent = $($("#para").html());

If HTML entities were not escaped by html(), the above would create a <world> element that doesn't exist in the original content.
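The round-trip argument can be modelled in plain JavaScript, without jQuery or a DOM. This is only a rough sketch: escapeHtml, decodeEntities and containsTag are hypothetical helpers made up for illustration, they handle just the three characters &, < and >, and they are not how jQuery or a browser actually serializes or parses markup.

```javascript
// Hypothetical helpers (not jQuery or browser APIs): a tiny model of how
// text content is serialized to markup and read back again.
function escapeHtml(text) {
  // Escape & first so the entities produced below are not escaped twice.
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function decodeEntities(markup) {
  // Decode &amp; last so that "&amp;lt;" becomes "&lt;" and not "<".
  return markup
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

// Very rough stand-in for a parser: does the string contain anything tag-like?
function containsTag(markup) {
  return /<[a-zA-Z][^>]*>/.test(markup);
}

var text = 'hello <world>';
var html = escapeHtml(text);             // the escaped, markup-safe form
var roundTripped = decodeEntities(html); // back to the original text

console.log(html);
console.log(roundTripped === text);
console.log(containsTag(html), containsTag(text));
```

In this model the escaped form contains nothing tag-like, so feeding it back to a parser cannot create new elements, while the unescaped text would be read as containing a <world> tag.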

Frédéric Hamidi

This is in accordance with the corresponding DOM properties textContent and innerHTML.

>> console.log(document.getElementsByTagName('p')[0].textContent);
hello <world>

>> console.log(document.getElementsByTagName('p')[0].innerHTML);
hello &lt;world&gt;
dheerosaur
  • @Richard, I was really intrigued by the implications of this decision by the creators of the DOM. I searched through many documents but haven't found the rationale behind it. I will update when I find something. – dheerosaur Dec 16 '10 at 11:31

.html(): Get the HTML contents of the first element in the set of matched elements.

.text(): Get the combined text contents of each element in the set of matched elements, including their descendants.

With .text() you get the text as it is actually displayed. With .html() you get the HTML of all child elements.
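The "first element" versus "each element" distinction in the two descriptions above can be sketched without jQuery. Here plain objects stand in for the matched elements, and combinedText and firstHtml are hypothetical names chosen for this illustration; the sketch models the documented behaviour and is not jQuery's implementation.

```javascript
// Plain objects standing in for matched DOM elements (an assumption for
// illustration; real elements expose the same two property names).
var matched = [
  { textContent: 'hello <world>', innerHTML: 'hello &lt;world&gt;' },
  { textContent: 'a & b',         innerHTML: 'a &amp; b' }
];

// Model of the documented .text() getter: combined text of every match.
function combinedText(elements) {
  return elements.map(function (el) { return el.textContent; }).join('');
}

// Model of the documented .html() getter: HTML of the first match only.
// Returning undefined for an empty set is a choice made for this sketch.
function firstHtml(elements) {
  return elements.length ? elements[0].innerHTML : undefined;
}

var allText = combinedText(matched);
var firstMarkup = firstHtml(matched);

console.log(allText);
console.log(firstMarkup);
```

Note that the second stand-in element contributes to the combined text but never appears in the HTML result.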

Floyd
  • I think he got that. The question is why HTML entities are de-entitied into characters and not passed through as is (i.e. `&lt;`) – Pekka Dec 16 '10 at 11:00

.html() gives you HTML content. If it were to transform &lt; and &gt; into literal < and > characters, it could easily mess everything up by creating meaningless elements, such as the <world> element in your example.

.text() gives you text content that does not contain any markup. Therefore it is safe to transform &lt; and &gt; into "less than" and "greater than" signs.

Saeb Amini