I created a script in Google Sheets which works well, but after a while I get the following error: `Exception: Service invoked too many times for one day: urlfetch`

I think I called the function about 200-300 times in the day; from what I checked, that should be below the limit.

I read that we can use a cache to avoid this issue, but I'm not sure how to use it in my code.

function scrapercache(url) {
  var result = [];
  var description;
  var options = {
    'muteHttpExceptions': true,
    'followRedirects': false,
  };

  var cache = CacheService.getScriptCache();
  var properties = PropertiesService.getScriptProperties();

  try {
    let res = cache.get(url);

    if (!res) {
      // trim url to prevent (rare) errors
      url = url.toString().trim();
      var r = UrlFetchApp.fetch(url, options);
      var c = r.getResponseCode();

      // check for meta refresh if 200 ok
      if (c == 200) {
        var html = r.getContentText();
        cache.put(url, "cached", 21600);
        properties.setProperty(url, html);

        var $ = Cheerio.load(html); // make sure this lib is added to your project!

        // meta description
        if ($('meta[name=description]').attr("content")) {
          description = $('meta[name=description]').attr("content").trim();
        }
      }

      result.push([description]);
    }
  }
  catch (error) {
    result.push(error.toString());
  }
  finally {
    return result;
  }
}

How can I use a cache like this to enhance my script, please?

var cache = CacheService.getScriptCache();
var result = cache.get(url);
if (!result) {
  var response = UrlFetchApp.fetch(url);
  result = response.getContentText();
  cache.put(url, result, 21600);
}
return result;
  • `I read we can use cache to avoid this issue` Where did you read this? Caching information isn't related to number of fetch requests you make so this in itself can't mitigate hitting limits. It may be true however, that you do not need to make the fetch each time if you have the information stored in the script cache. How are you calling `scrapercache(url)` and what is `url`? – Rafa Guillermo Jul 09 '20 at 08:53
  • Hi Rafa, thanks for your message. I read about the cache in this post: https://stackoverflow.com/questions/46426792/service-invoked-too-many-times-for-one-day-urlfetch I'm calling the scraper function like this: ``` =value(left(REGEXEXTRACT(scraper("https://www.gurufocus.com/term/fscore/"&C6&":"&B6&"/Piotroski-F-Score"),". as."),1)) ``` I have probably 200 cells like that, so maybe they are calling UrlFetch too many times; it would be good to store the data in a cache once it has been fetched. Would you know how to include the cache element in my function, please? Thanks – Gabriel Jul 09 '20 at 10:21

1 Answer

Answer:

You can implement CacheService and PropertiesService together and only retrieve the URL again after a specified amount of time.

Code Change:

Be aware that the additional calls to retrieve the cache and properties will slow your function down, especially if you are doing this a few hundred times.

As the values of the cache can be a maximum of 100 KB, we will use CacheService to keep track of which URLs are to be retrieved, but PropertiesService to store the data.
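Outside of Apps Script, the same two-tier idea can be sketched in plain JavaScript, with ordinary objects standing in for `CacheService` (an expiring "fetched" marker) and `PropertiesService` (the stored page data). The names `fakeFetch`, `getHtml` and the TTL handling here are illustrative stand-ins, not part of the Apps Script API:

```javascript
const cache = {};      // url -> expiry timestamp (stands in for CacheService)
const properties = {}; // url -> stored HTML (stands in for PropertiesService)

let fetchCount = 0;    // counts how often we would hit UrlFetchApp

function fakeFetch(url) {
  // stand-in for UrlFetchApp.fetch(url).getContentText()
  fetchCount++;
  return '<meta name=description content="Example page">';
}

function getHtml(url, ttlSeconds) {
  const now = Date.now();
  if (cache[url] && cache[url] > now) {
    // marker still valid: reuse the stored copy, no network call
    return properties[url];
  }
  const html = fakeFetch(url);          // would be UrlFetchApp.fetch(url)
  cache[url] = now + ttlSeconds * 1000; // would be cache.put(url, "cached", ttl)
  properties[url] = html;               // would be properties.setProperty(url, html)
  return html;
}

// Repeated calls within the TTL only fetch once:
getHtml("https://example.com", 21600);
getHtml("https://example.com", 21600);
console.log(fetchCount); // 1
```

The key point the real code follows is that the cache entry only *marks* a URL as recently fetched, while the bulky HTML lives in the properties store.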

You can edit your try block as so:

var cache = CacheService.getScriptCache();
var properties = PropertiesService.getScriptProperties();

try {
  let res = cache.get(url);
  var html;

  if (res) {
    // cache marker still valid: reuse the copy stored in script properties
    html = properties.getProperty(url);
  }
  else {
    // trim url to prevent (rare) errors
    url = url.toString().trim();
    var r = UrlFetchApp.fetch(url, options);
    var c = r.getResponseCode();

    // check for meta refresh if 200 ok
    if (c == 200) {
      html = r.getContentText();
      cache.put(url, "cached", 21600);
      properties.setProperty(url, html);
    }
  }

  if (html) {
    var $ = Cheerio.load(html); // make sure this lib is added to your project!

    // meta description
    if ($('meta[name=description]').attr("content")) {
      description = $('meta[name=description]').attr("content").trim();
    }
  }

  result.push([description]);
}
catch (error) {
  result.push(error.toString());
}
finally {
  return result;
}

  • Hi Rafa, I modified the ```try``` code of the formula like you said. When I tried to call the function like this: ```=scrapercache("https://www.gurufocus.com/term/fscore/"&B41&":"&A41&"/Piotroski-F-Score")``` with B41 = nyse & A41 = ABBV I get the following error: ```Exception: Argument too large: value``` Not sure what I have done wrong :/ Thank you – Gabriel Jul 09 '20 at 11:42
  • The cache has a limit of 100 KB size for data. It might be better to use [`PropertiesService`](https://developers.google.com/apps-script/reference/properties/properties-service) instead – Rafa Guillermo Jul 09 '20 at 11:46
  • @Gabriel I have updated my code with the use of PropertiesService as well. The idea is to use `CacheService` to mark which urls have been recently retrieved, but to overcome the value limits, store the data using `PropertiesService`. This will then get updated when the timeout on the cache occurs. – Rafa Guillermo Jul 09 '20 at 13:00
  • Thanks, I updated my script with your recommendations, and saved & ran it without error. But when I call it in my Google Sheets like this =scrapercache("https://www.gurufocus.com/term/fscore/nyse:ABBV/Piotroski-F-Score") I get an error message #REF! Reference does not exist. Did I modify the script correctly based on your information? Thank you – Gabriel Jul 10 '20 at 10:27