I wrote a Python script that scrapes my email for data. I used the following code to find items by class:

HRDataUnClean = str(soup.findAll("h2", {"class": "numbers"}))

This worked incredibly well. However, since I am using Gmail and some people complained about having to install Python and use the Google API, I wanted to write a script in Google Sheets that does a similar task.

I know how to grab the body of the email with:

var html = messages[0].getBody()

However, this returns a string, not an HTML object like the one I had with Python's BeautifulSoup. I have found Google Apps Script code that searches by element class (SearchByClass).

However, XmlService.parse(html) appears to require an HTML object. Is there any way I can convert the email body from a string to an HTML object?

Marc Henning
  • Possible duplicate of [What is the best way to parse html in google apps script](https://stackoverflow.com/questions/19455158/what-is-the-best-way-to-parse-html-in-google-apps-script) – Liora Haydont Jun 28 '18 at 18:19

1 Answer

As of 2019, Google Apps Script (GAS) has no object that represents an HTML document the way the browser DOM or a jQuery object does.

The old Xml service is deprecated, but it still works, and both Xml.parse and XmlService.parse take a string as input.

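// fetch() returns an HTTPResponse, so read its body as a string
var pageHtmlString = UrlFetchApp.fetch(webAddressUrl).getContentText();
// Lenient parse with the deprecated Xml service to tolerate malformed HTML
var doc = Xml.parse(pageHtmlString, true);
var bodyHtml = doc.html.body.toXmlString();
// Re-parse the cleaned-up body with XmlService
doc = XmlService.parse(bodyHtml);
var root = doc.getRootElement();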

Note: This solution may stop working if the old Xml.parse is completely removed from Google Apps Script.
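
Once you have a root element, the class lookup from the question (the "h2" with class "numbers") can be done with a small recursive walk. The following is only a sketch: findByClass is a hypothetical helper name, not part of Apps Script, and it assumes the elements expose getAttribute, getChildren and (on attributes) getValue, as XmlService elements do. Unlike the BeautifulSoup call, it matches on class only and does not filter by tag name.

```javascript
// Hypothetical helper (not a built-in): collect every element in the tree
// whose class attribute contains className as one of its space-separated values.
function findByClass(element, className) {
  var matches = [];

  // getAttribute returns null when the element has no such attribute
  var attr = element.getAttribute('class');
  if (attr !== null) {
    var classes = attr.getValue().split(/\s+/);
    if (classes.indexOf(className) !== -1) {
      matches.push(element);
    }
  }

  // Recurse into child elements, keeping document order
  var children = element.getChildren();
  for (var i = 0; i < children.length; i++) {
    matches = matches.concat(findByClass(children[i], className));
  }
  return matches;
}
```

With the root from the snippet above, findByClass(root, 'numbers') would return the matching elements, and calling getText() on each one should give its text content.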