- You want to retrieve a table from the HTML data of the URL.
- From
I am trying to get the following data onto a Google Sheet
, I thought like this.
If my understanding is correct, how about this answer?
Issue and workaround:
Unfortunately, it seems that the file size of HTML is large. So when =IMPORTXML("https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yieldAll","//title")
is used, an error of Resource at url contents exceeded maximum size.
occurs. When I retrieve HTML data from the URL, the size of HTML data was about 9 MB. It is considered that the reason of error is due to this. So as one of workaround, how about using Google Apps Script? In this workaround, the following flow is used.
- Retrieve HTML data using
UrlFetchApp
- Parse the HTML data using
Parser
which is a GAS library.
- Put the parsed data to the active sheet on the Spreadsheet using
PasteDataRequest
of Sheets API.
Usage:
Preparation:
Please install Parser
. About the install of library, you can see it at here.
- The project key of the library is
M1lugvAXKKtUxn_vdAG9JZleS6DrsjUUV
.
Please enable Sheets API at Advanced Google services.
Sample script:
Please copy and paste the following script to the script editor of the container-bound script of the Spreadsheet. After above settings were done, please run the function of myFunction()
. When the script is run, the table of HTML is put to the active sheet on the Spreadsheet.
function myFunction() {
// Retrieve HTML data from URL.
var url = "https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yieldAll";
var html = UrlFetchApp.fetch(url).getContentText();
// Parse HTML data.
var table = "<table" + Parser.data(html).from("<table class=\"t-chart\"").to("</table>").build() + "</table>";
// Put the values to the Spreadsheet.
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet = ss.getActiveSheet();
var resource = {requests: [{pasteData: {html: true, data: table, coordinate: {sheetId: sheet.getSheetId()}}}]};
Sheets.Spreadsheets.batchUpdate(resource, ss.getId());
}
References:
If I misunderstood your question and this was not the direction you want, I apologize.
Updated at April, 23, 2021:
New IDE for Google Apps Script has finally been released at December 7, 2020. Ref By this, in the current stage, in order to install Google Apps Script library, it is required to use the script ID of Google Apps Script project.
In this case, when the Google Apps Script library of Parser is installed, unfortunately, this ID M1lugvAXKKtUxn_vdAG9JZleS6DrsjUUV
cannot be used.
So when you use new IDE, please use the following script ID.
1Mc8BthYthXx6CoIz90-JiSzSafVnT6U3t0z_W3hLTAX5ek4w0G_EIrNw
This script ID is the ID of Google Apps Script project of M1lugvAXKKtUxn_vdAG9JZleS6DrsjUUV
. By this, the library of Parser can be installed to the new IDE.
About the method for installing the library, you can see the official document.
Reference: