I'm building a script that uses a looping ImportHTML command to web scrape weather data based on zip code, and am currently running into an issue with the execution timing out every time the script is run.
The current way I have the script set up produces a correct result when run, but given that the script is pulling data from several hundred sources, it is taking a while and will not complete within the current time limit of Google scripts.
The sheet running the script utilizes 3 tabs:
- ZIPS, which contains a list of zip code link values pulled from the site that weather data is to be pulled from
- Blank, which is simply an intermediary sheet used in the execution of the script
- Result, where the final output is to be placed
In order to try and reduce the amount of read/write as much as possible, I changed the code from writing each result of the ImportHTML command as it was executed to appending to an array and writing this array at the end of the script. The code in it's current form is as follows:
function getTemps() {
var googleSheet = SpreadsheetApp.getActive();
// Read in Zip code link values
var sheet = googleSheet.getSheetByName('ZIPS');
var zipArray = sheet.getDataRange().getValues();
var arrayLength = zipArray.length;
//Set up sheet values
var blankSyntaxA = 'ImportHtml("https://www.wunderground.com/cgi-bin/findweather/getForecast?query=pz:';
var blankSyntaxB = '&zip=1", "table", 1)';
var tempResult = [];
// Writing Section
var sheet = googleSheet.getSheetByName('Blank');
for (var i = 0; i < arrayLength; i++)
{
var liveSyntax = blankSyntaxA+zipArray[i][0]+blankSyntaxB;
sheet.getRange('A1').setFormula(liveSyntax);
var importedData = sheet.getDataRange().getValues();
tempResult = tempResult.concat(importedData);
}
var sheet = googleSheet.getSheetByName('Result');
sheet.getRange(1,1,tempResult.length,8).setValues(tempResult);
}
I know the run time of the script could be reduced by eliminating the read/write contained within the For loop, but I'm not sure how to obtain the necessary HTML table without running the ImportHTML command within the 'Blank' sheet. Is there a way to run that command to fill the 'importedData' array without writing to a sheet?
Alternatively, I had considered utilizing a check on the runtime of the function and implementing a break as it neared the ~5 minute runtime limit, followed by a recursive call back to the original function, but I wasn't sure if this would actually mitigate the runtime issue, or even be possible given the nature of the recursive call.
Any advice on how this script could be modified to run within the script timeout parameter or modified to produce the complete desired outcome with all the necessary imported data would be appreciated. Thanks!