173

I am running into issues when trying to use DOMParser in my JavaScript code. I retrieve an XML document from a SOAP response via xmlhttp.responseText, and I want to access its elements in JSON format, so my code looks like:

var xml = new DOMParser();
xml = xml.parseFromString(xmlhttp.responseText, 'text/xml');
var result = xmlToJson(xml);

I get this error message: ReferenceError: DOMParser is not defined

Edit: The linked question, "JavaScript DOMParser access innerHTML and other properties", hasn't worked for me because my JavaScript doesn't run in an HTML page; it is a Node.js file.

Community
Stephen D

10 Answers

205

A lot of browser functionality, such as DOM manipulation or XHR, is not available natively in Node.js, because accessing the DOM is not a typical server task. You'll have to use an external library to do that.
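To see this on your own runtime, here is a minimal sketch that reports which browser-style globals are defined. The function name `detectBrowserGlobals` and the default list of names are illustrative assumptions, not part of any library.

```javascript
// Hypothetical helper: reports which browser-style globals exist in the
// current runtime. The default list is only an illustration.
function detectBrowserGlobals(names = ['DOMParser', 'XMLHttpRequest', 'document', 'window']) {
  const report = {};
  for (const name of names) {
    // typeof on a property lookup never throws, even when the global is missing.
    report[name] = typeof globalThis[name] !== 'undefined';
  }
  return report;
}

console.log(detectBrowserGlobals());
```

In a browser every entry should be true; in plain Node.js a false entry for DOMParser corresponds to the ReferenceError from the question.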

DOM capabilities depend a lot on the library. Here's a quick comparison of the main tools you can use:

  • jsdom: implements much of the current DOM and HTML standards, so most of what you can do in a modern browser you can also do in jsdom. It is the de facto industry standard for doing browser stuff on Node, used by Mocha, Vue Test Utils, Webpack Prerender SPA Plugin, and many others:

    const jsdom = require("jsdom");
    const dom = new jsdom.JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
    dom.window.document.querySelector("p").textContent; // 'Hello world'
    
  • deno_dom: if using Deno instead of Node is an option, this library provides DOM parsing capabilities:

    import { DOMParser } from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";
    const parser = new DOMParser();
    const document = parser.parseFromString('<p>Hello world</p>', 'text/html');
    document.querySelector('p').textContent; // 'Hello world';
    
  • htmlparser2: covers similar ground to jsdom for parsing, with better performance and flexibility at the price of a more complex, event-based API:

    const htmlparser = require("htmlparser2");
    const parser = new htmlparser.Parser({
      onopentag: (name, attrib) => {
        if (name === 'p') console.log('a paragraph element is opening');
      }
    }, {decodeEntities: true});
    parser.write(`<!DOCTYPE html><p>Hello world</p>`);
    parser.end();
    // console output: 'a paragraph element is opening'
    
  • cheerio: implementation of jQuery based on HTML DOM parsing by htmlparser2:

    const cheerio = require('cheerio');
    const $ = cheerio.load(`<!DOCTYPE html><p>Hello world</p>`);
    $('p').text('Bye moon');
    $.html(); // '<!DOCTYPE html><p>Bye moon</p>'
    
  • xmldom: fully implements DOM level 2 and partially implements DOM level 3. Works with both HTML and XML.

  • dom-parser: regex-based DOM parser that implements a few DOM methods like getElementById. Since parsing HTML with regular expressions is a very bad idea, I wouldn't recommend this one for production.

Nino Filiu
26

I used jsdom because it's got a ton of usage and is written by a prominent web hero. No promises that its behavior perfectly matches your browser (or even that every browser's behavior is the same), but it worked for me:

const jsdom = require("jsdom")
const { JSDOM } = jsdom
global.DOMParser = new JSDOM().window.DOMParser
jrz
21

You can use a Node implementation of DOMParser, such as xmldom. This will allow you to access DOMParser outside of the browser. For example:

var DOMParser = require('xmldom').DOMParser;
var parser = new DOMParser();
var document = parser.parseFromString('Your XML String', 'text/xml');
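Since this thread suggests more than one package, a small loader can try each in turn. The following is a sketch under assumptions: the function name `loadDOMParser` is hypothetical, the package names (`@xmldom/xmldom`, the scoped name xmldom is now published under, plus `xmldom` and `jsdom`) are guesses about what your project has installed, and it assumes a CommonJS module where `require` is available.

```javascript
// Hypothetical helper: returns a DOMParser constructor, or null if none is found.
// None of the packages below ships with Node.js; each must be installed separately.
function loadDOMParser() {
  // 1. A runtime that already provides DOMParser (for example, a browser).
  if (typeof globalThis.DOMParser === 'function') {
    return globalThis.DOMParser;
  }
  // 2. xmldom, under its scoped name or the older unscoped one.
  for (const name of ['@xmldom/xmldom', 'xmldom']) {
    try {
      const mod = require(name);
      if (typeof mod.DOMParser === 'function') return mod.DOMParser;
    } catch (err) {
      // Package not installed; fall through to the next candidate.
    }
  }
  // 3. jsdom, which exposes DOMParser on its window object.
  try {
    const { JSDOM } = require('jsdom');
    return new JSDOM('').window.DOMParser;
  } catch (err) {
    return null; // Nothing usable was found.
  }
}

const DOMParserImpl = loadDOMParser();
if (DOMParserImpl) {
  const doc = new DOMParserImpl().parseFromString('<root><item>1</item></root>', 'text/xml');
  console.log(doc.documentElement.nodeName);
} else {
  console.log('No DOMParser implementation found; install xmldom or jsdom.');
}
```

Returning null instead of throwing lets the caller decide whether a missing parser is fatal.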
Chris Alley
17

There is no DOMParser in Node.js; that's a browser thing. You can try any of these modules, though:

https://github.com/joyent/node/wiki/modules#wiki-parsers-xml

Esailija
  • 1
    I know this thread is old, but here it goes. What about jquery for node? An ajax call with dataType xml should receive an xml dom response. – ffflabs Jul 24 '14 at 02:45
  • 1
    The link provided appears to have expired. – vhs May 20 '22 at 11:24
5

I really like htmlparser2. It's a fantastic, fast and lightweight library. I've created a small demo on how to use it on RunKit: https://runkit.com/jfahrenkrug/htmlparser2-demo/1.0.0

Johannes Fahrenkrug
  • it's good to know that additional packages that make it actually useful (css-select, domutils) are not free – okram Jul 17 '23 at 08:30
3
var DOMParser = require('xmldom').DOMParser;
var doc = new DOMParser().parseFromString(
    '<xml xmlns="a" xmlns:c="./lite">\n'+
        '\t<child>test</child>\n'+
        '\t<child></child>\n'+
        '\t<child/>\n'+
    '</xml>'
    ,'text/xml');
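Once a document is parsed, it can be walked with standard DOM properties (nodeType, nodeName, attributes, childNodes). The question's xmlToJson is not shown, so the version below is a hypothetical sketch rather than the asker's function; its "@name" and "#text" key conventions are assumptions. It is demonstrated on a stand-in tree built from plain objects so that it runs without any package, and it should also accept a document returned by parseFromString.

```javascript
// DOM nodeType constants (standard numeric values).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: converts a DOM node into a plain object.
// Attributes become "@name" keys, text becomes "#text", and repeated
// child element names are collected into arrays.
function xmlToJson(node) {
  const result = {};
  const has = (key) => Object.prototype.hasOwnProperty.call(result, key);

  const attrs = node.attributes || [];
  for (let i = 0; i < attrs.length; i++) {
    result['@' + attrs[i].name] = attrs[i].value;
  }

  const children = node.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    const child = children[i];
    if (child.nodeType === TEXT_NODE) {
      // Skip whitespace-only text such as indentation between elements.
      const text = (child.nodeValue || '').trim();
      if (text) result['#text'] = (has('#text') ? result['#text'] : '') + text;
    } else if (child.nodeType === ELEMENT_NODE) {
      const value = xmlToJson(child);
      const name = child.nodeName;
      if (!has(name)) {
        result[name] = value;
      } else if (Array.isArray(result[name])) {
        result[name].push(value);
      } else {
        result[name] = [result[name], value];
      }
    }
  }
  return result;
}

// Stand-in tree made of plain objects (not a real DOM), for demonstration only.
const text = (value) => ({ nodeType: TEXT_NODE, nodeValue: value });
const element = (name, attributes, childNodes) =>
  ({ nodeType: ELEMENT_NODE, nodeName: name, attributes, childNodes });

const fakeDocument = {
  nodeType: 9, // DOCUMENT_NODE
  childNodes: [
    element('xml', [{ name: 'xmlns', value: 'a' }], [
      element('child', [], [text('test')]),
      element('child', [], []),
    ]),
  ],
};

const sampleJson = xmlToJson(fakeDocument);
console.log(JSON.stringify(sampleJson));
// {"xml":{"@xmlns":"a","child":[{"#text":"test"},{}]}}
```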
Anjali
0

I use yet another DOM parser, Himalaya (also available on npmjs.com), which converts an HTML string to a DOM and back:

import { parse, stringify } from 'himalaya';

const dom = parse(htmlString)

// Do something here

const htmlStringNext = stringify(dom)
Roman
0

RSS parser makes parsing Atom feeds easy. If you are using Next.js, for example, you can simply create an API like so:

import Parser from 'rss-parser'

export default async function API(req, res) {
    let parser = new Parser();
    try {
        const feed = await parser.parseURL(`https://www.nasa.gov/rss/dyn/lg_image_of_the_day.rss`);
        if (feed) return res.json({ "message": `Here is your data feed title`, status: 200, data: feed.title })
    } catch (error) {
        return res.json({ "message": "You made an invalid request", status: 401 })
    }
}
w. Patrick Gale
0

I needed DOMParser in Node. This is what I used:

const jsdom = require('jsdom');
const {JSDOM} = jsdom;

class DOMParser {
  parseFromString(s, contentType = 'text/html') {
    return new JSDOM(s, {contentType}).window.document;
  }
}

Now the example from this answer works for me:

function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(htmlDecode("&lt;img src='myimage.jpg'&gt;"));
// "<img src='myimage.jpg'>"
gman
0

You can also use html-to-text if you need HTML-to-text conversion, which is what a few answers to this question are doing:

const { convert } = require('html-to-text');
// There is also an alias to `convert` called `htmlToText`.

const options = {
  wordwrap: 130,
  // ...
};
const html = '<div>Hello World</div>';
const text = convert(html, options);
console.log(text); // Hello World
Abhay Phougat