I've got an XML string, like this:

'<ALEXA VER="0.9" URL="davidwalsh.name/" HOME="0" AID="="><SD TITLE="A" FLAGS="" HOST="davidwalsh.name"><TITLE TEXT="David Walsh Blog :: PHP, MySQL, CSS, Javascript, MooTools, and Everything Else"/><LINKSIN NUM="1102"/><SPEED TEXT="1421" PCT="51"/></SD><SD><POPULARITY URL="davidwalsh.name/" TEXT="7131"/><REACH RANK="5952"/><RANK DELTA="-1648"/></SD></ALEXA>'

I'd like to convert it into JSON format:

  {
    "ALEXA":{
       "@attributes":{
          "VER":"0.9",
          "URL":"davidwalsh.name/",
          "HOME":"0",
          "AID":"="
       },
       "SD":[
          {
             "@attributes":{
                "TITLE":"A",
                "FLAGS":"",
                "HOST":"davidwalsh.name"
             },
             "TITLE":{
                "@attributes":{
                   "TEXT":"David Walsh Blog :: PHP, MySQL, CSS, Javascript, MooTools, and Everything Else"
                }
...

I've found lots of solutions for JavaScript, but none of them worked in Google Apps Script. I've also seen this question: Parsing XML on a Google Apps script, but it doesn't exactly cover my case: I'd like to parse any XML into JSON, not just the provided sample. I've found my own solution (in the answer), but I'm not sure it handles all cases.

Max Makhrov

2 Answers

I thought the solution should be a recursive function. After some research, I found this great code by David Walsh and was able to adapt it. Here's what I came up with:

// Changes XML to JSON
// Original code: https://davidwalsh.name/convert-xml-json
function xmlToJson_(xml) {

  // Create the return object
  var obj = {};

  // get type
  var type = '';
  try { type = xml.getType(); } catch(e){}

  if (type == 'ELEMENT') {
    // do attributes
    var attributes = xml.getAttributes();
    if (attributes.length > 0) {
      obj["@attributes"] = {};
      for (var j = 0; j < attributes.length; j++) {
        var attribute = attributes[j];
        obj["@attributes"][attribute.getName()] = attribute.getValue();
      }
    }
  } else if (type == 'TEXT') {
    obj = xml.getValue();
  }

  // get children
  var elements = [];
  try { elements = xml.getAllContent(); } catch(e){}


  // do children
  if (elements.length > 0) {
    for(var i = 0; i < elements.length; i++) {
      var item = elements[i];

      var nodeName = false;
      try { nodeName = item.getName(); } catch(e){}

      if (nodeName) {
        if (typeof(obj[nodeName]) == "undefined") {
          obj[nodeName] = xmlToJson_(item);
        } else {
          if (typeof(obj[nodeName].push) == "undefined") {
            var old = obj[nodeName];
            obj[nodeName] = [];
            obj[nodeName].push(old);
          }
          obj[nodeName].push(xmlToJson_(item));
        }
      }
    }
  }
  return obj;
};

I've posted the sample on GitHub.

Usage:

  var xml = XmlService.parse(xmltext); 
  Logger.log(JSON.stringify(xmlToJson_(xml)));
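
The repeated-name handling inside xmlToJson_ can be looked at in isolation. The following is only a minimal sketch of that idea: `addChild` is a hypothetical helper name, and it uses `Array.isArray` where the function above checks for a `push` method. It runs on plain objects, so it does not need XmlService.

```javascript
// Hypothetical helper: stores value under obj[name]. A second value with
// the same name turns the entry into an array, as in xmlToJson_ above.
function addChild(obj, name, value) {
  if (typeof obj[name] === 'undefined') {
    // First occurrence: keep the value as-is
    obj[name] = value;
  } else {
    // Later occurrence: wrap the existing value once, then append
    if (!Array.isArray(obj[name])) {
      obj[name] = [obj[name]];
    }
    obj[name].push(value);
  }
  return obj;
}

const node = {};
addChild(node, 'TITLE', { TEXT: 'a' });
addChild(node, 'SD', { n: 1 });
addChild(node, 'SD', { n: 2 });

console.log(JSON.stringify(node));
// {"TITLE":{"TEXT":"a"},"SD":[{"n":1},{"n":2}]}
```

One consequence of this design: a name that occurs once stays a plain object and only becomes an array on the second occurrence, so code reading the result has to be ready for both shapes.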

Max Makhrov

The original answer didn't work for me. There may have been a change in the Apps Script XML API, but it wouldn't include the text content of a node without children. Here is the code I wrote, which seems to work well.
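
As a minimal sketch of the missing piece (not the full code further down): a childless element's text can be read with getText(), so the base case of the recursion can return it directly. `fakeElement` and `leafTextOrNull` are hypothetical names; `fakeElement` is only a stand-in that mimics the three Element methods used here so the snippet runs outside Apps Script. In a real script you would pass an XmlService Element instead.

```javascript
// Hypothetical stand-in for an XmlService Element (assumption: only
// getName, getText and getChildren are mimicked).
function fakeElement(name, text, children) {
  return {
    getName: () => name,
    getText: () => text,
    getChildren: () => children || []
  };
}

// Base case of the recursion: an element without child elements
// is represented by its text; anything else is left to the caller.
function leafTextOrNull(element) {
  return element.getChildren().length === 0 ? element.getText() : null;
}

const title = fakeElement('title', 'Hello');
const book = fakeElement('book', '', [title]);

console.log(leafTextOrNull(title)); // Hello
console.log(leafTextOrNull(book));  // null
```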

Note, it outputs in a slightly different fashion than the example you provided. I found that this might be a more consistent format for a broader range of use cases. I also found that including the attributes wasn't necessary for everything I was doing and created clutter, so I've included a version that doesn't parse attributes.

If you include attributes, the output follows this pattern:

{foo:{attributes:{...},content:{...}}}
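
To make the pattern concrete, here is a small sketch of reading values back out of such an object. The `parsed` literal is hand-written for a hypothetical input, `<foo id="1"><bar>one</bar><baz>two</baz></foo>`, and assumes that childless elements carry their own `content` and `attributes` keys, as the base case of the function below does.

```javascript
// Hand-written sample (assumption, not real parser output) for
// <foo id="1"><bar>one</bar><baz>two</baz></foo>
const parsed = {
  foo: {
    attributes: { id: '1' },
    content: {
      bar: { content: 'one', attributes: {} },
      baz: { content: 'two', attributes: {} }
    }
  }
};

// Attributes sit next to the content, never mixed into it
const fooId = parsed.foo.attributes.id;
// Text of a childless element is found under its own "content" key
const barText = parsed.foo.content.bar.content;

console.log(fooId, barText); // 1 one
```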

To Include Attributes:

function xmlParse(element) {
  /*
   * Takes an XML element and returns an object containing its children or text
   * If children are present, recursively calls xmlParse() on them
   * 
   * If multiple children share a name, they are added as objects in an array
   * If children have unique names, they are simply added as keys
   * i.e. 
   * <foo><bar>one</bar><baz>two</baz></foo> === {foo: {bar: 'one', baz: 'two'}}
   * <foo><bar>one</bar><bar>two</bar></foo> === {foo: [{bar: 'one'},{bar: 'two'}]}
   */

  let obj = {}
  const rootName = element.getName();
  
  // Parse attributes
  const attributes = element.getAttributes();
  const attributesObj = {};

  for(const attribute of attributes) {
    attributesObj[attribute.getName()] = attribute.getValue();
  }

  obj[rootName] = {
    attributes: attributesObj,
    content: {}
  }

  const children = element.getChildren();
  const childNames = children.map(child => child.getName());

  if (children.length === 0) {
    // Base case - get text content if no children
    obj = {
      content: element.getText(),
      attributes: attributesObj
    }
  } else if (new Set(childNames).size !== childNames.length) {
    // If nonunique child names, add children as an array
    obj[rootName].content = [];
    for (const child of children) {
      if (child.getChildren().length === 0) {
        const childObj = {};
        childObj[child.getName()] = xmlParse(child);
        obj[rootName].content.push(childObj)
      } else {
        const childObj = xmlParse(child);
        obj[rootName].content.push(childObj)
      }
    }
  } else {
    // If unique child names, add children as keys
    obj[rootName].content = {};
    for (const child of children) {
      if (child.getChildren().length === 0) {
        obj[rootName].content[child.getName()] = xmlParse(child);
      } else {
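        // xmlParse(child) returns {childName: {...}}; merge it in instead of
        // assigning, so siblings that were already added are not overwritten
        Object.assign(obj[rootName].content, xmlParse(child));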
      }
    }
  }

  return obj;
}

Without Attributes:

function xmlParse(element) {
  /*
   * Takes an XML element and returns an object containing its children or text
   * If children are present, recursively calls xmlParse() on them
   * 
   * If multiple children share a name, they are added as objects in an array
   * If children have unique names, they are simply added as keys
   * i.e. 
   * <foo><bar>one</bar><baz>two</baz></foo> === {foo: {bar: 'one', baz: 'two'}}
   * <foo><bar>one</bar><bar>two</bar></foo> === {foo: [{bar: 'one'},{bar: 'two'}]}
   */

  let obj = {}
  const rootName = element.getName();

  const children = element.getChildren();
  const childNames = children.map(child => child.getName());

  if (children.length === 0) {
    // Base case - get text content if no children
    obj = element.getText();
  } else if (new Set(childNames).size !== childNames.length) {
    // If nonunique child names, add children as an array
    obj[rootName] = [];
    for (const child of children) {
      if (child.getChildren().length === 0) {
        const childObj = {};
        childObj[child.getName()] = xmlParse(child);
        obj[rootName].push(childObj)
      } else {
        const childObj = xmlParse(child);
        obj[rootName].push(childObj)
      }
    }
  } else {
    // If unique child names, add children as keys
    obj[rootName] = {};
    for (const child of children) {
      if (child.getChildren().length === 0) {
        obj[rootName][child.getName()] = xmlParse(child);
      } else {
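        // xmlParse(child) returns {childName: ...}; merge it in instead of
        // assigning, so siblings that were already added are not overwritten
        Object.assign(obj[rootName], xmlParse(child));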
      }
    }
  }

  return obj;
}
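
Both versions choose between array output and keyed output by comparing the number of distinct child names with the total number of children. A minimal sketch of that check on its own, with `hasRepeatedNames` as a hypothetical helper name:

```javascript
// True when at least one name occurs more than once: a Set drops
// duplicates, so it ends up smaller than the original list.
function hasRepeatedNames(names) {
  return new Set(names).size !== names.length;
}

console.log(hasRepeatedNames(['bar', 'baz']));        // false
console.log(hasRepeatedNames(['bar', 'bar']));        // true
console.log(hasRepeatedNames(['bar', 'bar', 'baz'])); // true
```

Note the last case: a single repeated name is enough to put all children of that element into the array, including those whose names are unique.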

Usage for both of these:

  const xml = XmlService.parse(xmlText); 
  const rootElement = xml.getRootElement();
  const obj = xmlParse(rootElement);
  const asJson = JSON.stringify(obj);

Reference:

XMLService