
I have been working on importing a large amount of API data into a Google Sheet to build up a database. I have a list of over 25,000 links in cells inside a sheet whose data I am looking to import. I have a working set of code in Apps Script that has been able to quickly and efficiently import ten links, but when I have tried the full set, I have gotten errors like:

Exception: Service invoked too many times in a short time: urlfetch. Try Utilities.sleep(1000) between calls.

This error shows up about eleven seconds after running the code. I have also tried adding Utilities.sleep(1000) to the code in many places, to no avail. Here is the code that I am working with:

function dataImport() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName("Import1");
  var countRow = 1;
  var exportSheet = ss.getSheetByName("Data1");
  // Build one request object per non-empty URL in column A.
  var reqs = sheet.getRange("A2:A" + sheet.getLastRow()).getDisplayValues().reduce((ar, [url]) => {
    if (url) {
      ar.push({ url, muteHttpExceptions: true });
    }
    return ar;
  }, []);

  // Fetch every URL in a single fetchAll call.
  var responses = UrlFetchApp.fetchAll(reqs);
  Utilities.sleep(1000);
  // Parse each successful (HTTP 200) response into a row of values.
  var temp = responses.reduce((ar, r) => {
    if (r.getResponseCode() == 200) {
      var { id, firstName, lastName, fullName, displayName, shortName, weight, height, position: { abbreviation }, team: { $ref }, birthPlace: { city, state, country }, experience: { years, displayValue }, jersey, active } = JSON.parse(r.getContentText());
      ar.push([id, firstName, lastName, fullName, displayName, shortName, weight, height, abbreviation, $ref, city, state, country, years, displayValue, jersey, active]);
    }
    return ar;
  }, []);
  var res = [['IDs', 'First Name', 'Last Name', 'Full Name', 'Display Name', 'Short Name', 'Weight', 'Height', 'Position', 'Team URL', 'City', 'State', 'Country', 'Years', 'Class', 'jersey', 'Active'], ...temp];
  // Write the header row plus all data rows to the export sheet in one call.
  exportSheet.getRange(countRow, 1, res.length, res[0].length).setValues(res);
}

I previously posted a question about this database when I was running into a different issue (Apps Script API Import via Links in Google Sheet Cells but Map Properties Undefined). When the quantity became an issue, it was suggested that I post a new question to help others who may run into the same problem. Once again, this code has worked with a limited dataset; it seems that the very large quantity is what is causing the issues.

As I mentioned in the comments of the previous question, I'll mention it here as well: when I added the Utilities.sleep(1000) command, most of the time I ran into the same error nevertheless, except in one location, which I have not been able to replicate, where I instead got this error:

Exception: Service Spreadsheets failed while accessing document with id 18tRBBcVwChuSJ0G5_A09I7EoQnji6nmRvOWcX5ndALE

Upon researching the original error, I found a few questions relating to it, but none of them worked for what I was looking to do:

How do I use Google app script UrlFetchApp.fetchAll on >1000 requests without getting "Service invoked too many times in a short time" error?: This one featured a looping set of code, but it was being run as a Sheets extension and doesn't match how my code imports the data. The question, while it got comments, never received an answer that worked for the asker, who also didn't try the command the error message suggested.

How do I fetch a column of URLs in Google Apps Script/Javascript?: This deals with a far smaller dataset, more similar to the test run in my previous question, where it was suggested to use a fetchAll command, which is how my code is currently set up and which isn't working with the larger set of data.

At the end of this question, I'll include a link to a copy of the sheet with all of the link data in case that is helpful for anyone. I have also put the code I have been using in the Apps Script of that sheet. If I could get help figuring out the best way of doing this, it would be greatly appreciated!

https://docs.google.com/spreadsheets/d/1Nmjw7wCv5FzBFIE8QfqHOYGPioIkv79InBsJzpmEnyw/edit#gid=299593293

DJ Luke
  • The question doesn't include enough details. How are you running the script? Were you able to run the script successfully before the error occurred? Please tell us the most relevant questions that you found on this site and why they don't meet your needs. – Rubén Aug 14 '23 at 20:44
  • Hi @Rubén , thanks for the reply; I went in and added more details on the fact that this is AppScript code, and explained the fact that my reasoning for posting the question has to do with this issue popping up in another question, and it being suggested to me that I make a new post with it to better help others. As I had said in the question too, the code does work with limited datasets, but is running into issues when trying to import the full dataset of links. – DJ Luke Aug 14 '23 at 20:54
  • The spelling is Apps Script, not AppScript. – Rubén Aug 14 '23 at 20:55
  • What is the question that you previously posted? P.S. It's OK that you include links to a previous question if this is a follow-up question, but keep meta-commentary to what is strictly necessary to allow readers to understand what is being asked. – Rubén Aug 14 '23 at 20:56
  • @Rubén "Apps Script API Import via Links in Google Sheet Cells but Map Properties Undefined" was the question I previously posted, also spelling issue is noted – DJ Luke Aug 14 '23 at 20:59
  • A hyperlink is much better than the question title – Rubén Aug 14 '23 at 21:00
  • There are a lot of questions about the same error. Add a brief description of the most relevant and why they don't meet your needs. – Rubén Aug 14 '23 at 21:01
  • @Rubén okay, I have now updated the question both to feature a hyperlink to my previous question, and links to other relevant questions mentioning this error and why they do not meet the needs of this question. – DJ Luke Aug 14 '23 at 21:47
  • I saw that [Tanaike](https://stackoverflow.com/users/7108653/tanaike) [answered](https://stackoverflow.com/a/76896063/1595451) your previous question. First, please don't take them as the reference of how most people participate here. They are unique, so if they tell you that they are willing to help you with the follow-up question, you should have the courtesy to reply to them, including a link to this question. – Rubén Aug 14 '23 at 22:17
  • @Rubén okay, that is done, thank you for pointing that out! Do you by any chance have any ideas towards answering this question? – DJ Luke Aug 14 '23 at 23:02
  • I have to apologize for my poor skill. Now, I am testing a sample script for achieving your expected result. But, unfortunately, all values still cannot be retrieved from the 27,866 URLs because of the continuous requests. But, I would like to support you. When I can create a script for retrieving all values, I would like to propose it. I deeply apologize that I cannot resolve your question soon. I would like to study more. – Tanaike Aug 15 '23 at 01:17
  • @Tanaike I appreciate you taking a look! All good too, I am hopeful to learn more about this as well! – DJ Luke Aug 15 '23 at 13:59
  • For clarification purposes, are you using a personal Google account or a Google Workspace domain account in your Spreadsheet? – SputnikDrunk2 Aug 15 '23 at 20:53
  • @SputnikDrunk2 at this moment, it's being run through a personal Google account – DJ Luke Aug 16 '23 at 05:42
  • Two choices: 1) fetch urls **one at a time** with `UrlFetchApp.fetch` [doc ref](https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app#fetchurl). In this context, `Utilities.sleep(1000)` introduces a time delay between each fetch. **OR** 2) fetch **all the urls in a single pass** with `UrlFetchApp.fetchAll` [Doc ref](https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app#fetchallrequests). In this context, `Utilities.sleep(1000)` is meaningless since the request is made once only. You chose option#2: `var responses = UrlFetchApp.fetchAll(reqs);`. – Tedinoz Aug 16 '23 at 11:31
  • @Tedinoz okay, that makes sense; however, why would I be getting that error if the command itself is obsolete? And is there something I can do to actually get through all of the data I need to, as I think `UrlFetchApp.fetch` would likely take too long to get through all of the rows – DJ Luke Aug 16 '23 at 18:10
  • `if the command itself is obsolete` The command is not obsolete, it's just that you are fetching too many urls in one go. @SputnikDrunk2 has the right idea, batch your work; find the sweet spot. – Tedinoz Aug 17 '23 at 03:31
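
For reference, here is a minimal sketch of option 1 from Tedinoz's comment above (one UrlFetchApp.fetch per URL, with Utilities.sleep between calls), assuming the question's sheet layout. It is illustrative only: with 25,000+ URLs, the sleeps alone would take roughly 7 hours, far beyond Apps Script's per-execution time limit, so it would have to be split across many runs.

function dataImportSequential() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName("Import1");
  // Collect the non-empty URLs from column A.
  var urls = sheet.getRange("A2:A" + sheet.getLastRow())
    .getDisplayValues()
    .map(([url]) => url)
    .filter(Boolean);
  var rows = [];
  urls.forEach(url => {
    // One fetch at a time, pausing between calls as the error message suggests.
    var r = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
    if (r.getResponseCode() == 200) {
      rows.push([r.getContentText()]); // parse fields here as in the question's script
    }
    Utilities.sleep(1000);
  });
  // Write rows to the Data1 sheet as in the question's script.
}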

1 Answer


SUGGESTION

While this approach might not be the most efficient, you can handle the exception by batching the UrlFetch calls.

From my testing, the UrlFetchApp.fetchAll() method can only handle a maximum of 200 URLs in a single operation; if the number exceeds 200, the exception is triggered.

Note: This behaviour was tested on a consumer Google account. A paid Google Workspace domain account does not produce the urlfetch exception.

The tweaked sample script below batches the URL requests by 200 (e.g. if you have 27,000 requests, there will be 135 batches, each containing 200 requests). Basically, it handles potential exceptions during URL fetching and keeps track of processing progress using script properties. This can serve as your starting point for formulating the batching strategy.

Tweaked Script

var scriptProperties = PropertiesService.getScriptProperties();

function dataImport() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName("Import1");
  var exportSheet = ss.getSheetByName("Data1");
  var reqs = sheet.getRange("A2:A" + sheet.getLastRow()).getDisplayValues().reduce((ar, [url]) => {
    if (url) {
      ar.push({ url, muteHttpExceptions: true });
    }
    return ar;
  }, []);

  //Storage of current data
  var bucket = [];
  var batchSize = 200;
  var batches = batchArray(reqs, batchSize);
  var startingBatch = scriptProperties.getProperty("batchNumber") == null ? 0 : parseInt(scriptProperties.getProperty("batchNumber"));
  var processedBatches = scriptProperties.getProperty("processedBatches") == null ? 0 : parseInt(scriptProperties.getProperty("processedBatches"));

  console.log(`Total: ${reqs.length}.\n${batches.length} batches.`)

  if (processedBatches >= (batches.length - 1)) {
    console.log('All data has been processed already.');
  } else {
    //Start from the last batch where processing stopped.
    for (let i = startingBatch; i < batches.length; i++) {
      console.log(`Processing batch index #${parseInt(i)}`);
      try {
        var responses = UrlFetchApp.fetchAll(batches[i]);
        bucket.push(responses);
        //Remove the previously stored processed-batch index
        scriptProperties.deleteProperty("processedBatches");
        //Store the latest successful batch index number
        scriptProperties.setProperty("processedBatches", parseInt(i));
      }
      //Catch the last batch index number where it stopped due to URL fetch exception
      catch (e) {
        //Replace the old batch number with the new one.
        scriptProperties.deleteProperty("batchNumber");
        //Remember the last batch that encountered an error so it can be processed again on the next call.
        scriptProperties.setProperty("batchNumber", parseInt(i));
        console.log(`Batch index #${parseInt(i)} stopped`);
        break;
      }
    }

    const initialRes = [].concat.apply([], bucket);

    var temp = initialRes.reduce((ar, r) => {
      if (r.getResponseCode() == 200) {
        var { id, firstName, lastName, fullName, displayName, shortName, weight, height, position: { abbreviation }, team: { $ref }, birthPlace: { city, state, country }, experience: { years, displayValue }, jersey, active } = JSON.parse(r.getContentText());
        ar.push([id, firstName, lastName, fullName, displayName, shortName, weight, height, abbreviation, $ref, city, state, country, years, displayValue, jersey, active]);
      }
      return ar;
    }, []);
    var res = [...temp];

    //Add table headers
    exportSheet.getLastRow() == 0 && exportSheet.appendRow(['IDs', 'First Name', 'Last Name', 'Full Name', 'Display Name', 'Short Name', 'Weight', 'Height', 'Position', 'Team URL', 'City', 'State', 'Country', 'Years', 'Class', 'jersey', 'Active']);

    //Add table data
    var result = () => {
      return temp.length != 0 && exportSheet.getRange(exportSheet.getLastRow() + 1, 1, res.length, res[0].length).setValues(res);
    }
    result() && console.log(`Processed: ${res.length}.`);
  }
}

//Function to chunk the request data based on batch sizes
function batchArray(arr, batchSize) {
  var batches = [];

  for (var i = 0; i < arr.length; i += batchSize) {
    batches.push(arr.slice(i, i + batchSize));
  }

  return batches;
}

//Optional function to check the latest processed batch
function checkBatch() {
  console.log(`${parseInt(scriptProperties.getProperty("processedBatches"))+1} batches have been processed.`);
}
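
Note that the script above only tracks progress forward. If you need to restart the import from scratch, a small helper along these lines (hypothetical, not part of the script above) would clear the two tracking properties:

//Hypothetical helper: clears the progress-tracking properties so the
//next dataImport() run starts again from batch index 0.
function resetProgress() {
  var props = PropertiesService.getScriptProperties();
  props.deleteProperty("batchNumber");      //last failed batch index
  props.deleteProperty("processedBatches"); //last successful batch index
}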

Here's a sample step flow of how the script works:

  1. Run the dataImport function.

  2. Process each batch (starting either from batchNumber index 0 or from the last tracked index number) with the UrlFetchApp.fetchAll() method.

  3. If a batch does not invoke an exception, it will be appended to the Data1 sheet.

  4. If a batch invokes an exception, the batchNumber property will be updated with that batch's index number and the script stops.

  5. Repeat Steps 1 and 2.

  6. Repeat Step 3 or Step 4.

You can either put the dataImport function on a time-based trigger (so it continues running in the background until it finishes all remaining batches) or run it manually from the Apps Script editor.
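
If you go the trigger route, a one-time setup function along these lines would create the trigger (the 5-minute interval is an arbitrary choice, not something this answer prescribes):

//Hypothetical one-time setup: re-runs dataImport every 5 minutes so the
//remaining batches are processed in the background. Remove the trigger
//once checkBatch() reports that all batches have been processed.
function createImportTrigger() {
  ScriptApp.newTrigger("dataImport")
    .timeBased()
    .everyMinutes(5)
    .create();
}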

Quick Demo

  • E.g. you have 400 rows of data batched by 200 (2 batches). The first function run finishes batch index 0 but encounters an exception on batch index 1. The next function run then finishes batch index 1.



SputnikDrunk2
  • Dude, thank you! This is working like a charm, and the time-based trigger was a good call too! Good call on using batches to get this done! – DJ Luke Aug 17 '23 at 04:41