How to construct a document from a string

I have a string containing HTML-like markup, and I want to extract an element from it. I know I can use htmlparser in Java, but how can I do the same thing in JavaScript?

How can I construct a document from the string? Does createHTMLDocument work?

Or is there another way to extract an element from the HTML text?

For example, I have the following HTML text:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"               "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml"> 
<head> 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />   
<title>titleValue</title> 
<meta name="description" content="It is a good way to learn science." /> 
<meta name="keywords" content="Symfony2,Redis,PHP" /> 
<meta name="author" content="CSDN.NET" /> 
<meta name="Copyright" content="CSDN.NET" /> 
</head> 
<body> 
.......................... 
</body> 
</html>

How do I get the value of "description"?

Here is my code, but the output is 0. What's wrong?

var el = document.createElement("div");
el.innerHTML = ' <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>titleValue</title> <meta name="description" content="It is a good way to learn science." /> <meta name="keywords" content="Symfony2,Redis,PHP" /> <meta name="author" content="CSDN.NET" /> <meta name="Copyright" content="CSDN.NET" /> </head> <body> hello</body> </html>';
var descElements = el.getElementsByTagName("head");
document.getElementById("news_content").innerHTML = descElements.length;
jinhong_lu
  • Possible duplicate of http://stackoverflow.com/questions/10585029/parse-a-html-string-with-js. That said, if your HTML happens to be valid XML, there is cross-browser support for parsing XML from a string: http://stackoverflow.com/questions/649614/xml-parsing-of-a-variable-string-in-javascript – hugomg Aug 05 '14 at 01:34
  • Another possible duplicate: http://stackoverflow.com/q/12808770/174774 – Trav L Aug 05 '14 at 01:36

1 Answer

The simplest way to do this kind of manipulation is with a library like jQuery. Here is one way to accomplish the task with jQuery:

var markup = '<!DOCTYPE ...';

var parsed = $(markup);

var description = parsed.filter("meta[name='description']").attr('content');

alert(description);

Note that you will not have access to all elements (the <head/> element is not represented, for example) because not all elements are legal in the context of another document. The <meta/> elements should be fine, though.
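
For a library-free alternative, the browser's DOMParser can parse the string into a separate document, which keeps the <html>, <head> and <body> elements that are lost when the markup is assigned to a div's innerHTML. The following is only a minimal sketch: getMetaContent is a hypothetical helper name that does not appear in the question or the answer above, and it assumes a browser whose DOMParser accepts the "text/html" type (older browsers may not).

```javascript
// Hypothetical helper (not from the original answer): returns the content
// attribute of the first <meta> whose name matches, or null if none does.
// Assumes a browser environment where DOMParser supports "text/html".
function getMetaContent(markup, name) {
  // Parse into a separate document; <html>, <head> and <body> are preserved.
  var doc = new DOMParser().parseFromString(markup, "text/html");
  var metas = Array.from(doc.getElementsByTagName("meta"));
  for (var i = 0; i < metas.length; i++) {
    if (metas[i].getAttribute("name") === name) {
      return metas[i].getAttribute("content");
    }
  }
  return null;
}

// Usage, guarded so the snippet does nothing outside a browser.
if (typeof DOMParser !== "undefined") {
  var markup = '<html><head><title>titleValue</title>' +
    '<meta name="description" content="It is a good way to learn science." />' +
    '</head><body>hello</body></html>';
  console.log(getMetaContent(markup, "description"));
}
```

Comparing the name attribute in a loop, rather than building a CSS selector from the name, avoids having to escape quotes in the name. If DOMParser is not an option, document.implementation.createHTMLDocument("") followed by setting doc.documentElement.innerHTML to the markup is another route to a detached document.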

cdhowie