When trying to parse HTML as XML in Google Apps Script, this code:
var yahoo = 'http://finance.yahoo.com/q?s=aapl';
var xml = UrlFetchApp.fetch(yahoo).getContentText();
var document = XmlService.parse(xml);
will return an error like this:
Error on line 20: The entity name must immediately follow the '&' in the entity reference. (line 13, file "")
Presumably this is because the HTML is not XML-compliant in some way on line 20. What surprises me is that when you do the same thing in Google Sheets and also supply an XPath, the HTML is parsed as XML without problems:
=IMPORTXML("http://finance.yahoo.com/q?s=aapl","//div[@class='title']")
will return "Apple Inc. (AAPL)". I assume that the Sheets function has some way of cleaning the HTML to make it XML-compliant.
- Do you think that could be the case?
- If yes, do you have an idea how I could adapt the XML parser in Apps Script so that I can fetch HTML from Yahoo Finance and treat it as XML?
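To illustrate the kind of clean-up I have in mind, here is a rough sketch that escapes bare ampersands before parsing. The helper name escapeBareAmpersands is made up for this example, and I have no idea whether IMPORTXML does anything like it. It only targets the specific error above; other non-XML constructs (unclosed tags, or named entities such as &nbsp; that XML does not predefine) would presumably still break the parser.

```javascript
// Hypothetical helper (not part of Apps Script): escape every '&' that does
// not already start something shaped like an entity reference
// (&name;, &#38; or &#x26;), so the parser no longer sees a bare '&'.
function escapeBareAmpersands(html) {
  return html.replace(
    /&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)/g,
    '&amp;'
  );
}

// Assumed usage in Apps Script, applied to the fetched text before parsing:
// var document = XmlService.parse(escapeBareAmpersands(xml));
```

With this, a fragment like "q?s=aapl&t=1" would become "q?s=aapl&amp;t=1", while an existing "&amp;" is left alone.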
Thanks in advance!