I am writing a Google Apps Script bound to a spreadsheet. My use case includes iterating over a large set of folders that are children of a given folder. The problem is that the processing takes longer than the maximum execution time Google allows (6 minutes), so I had to write my script so that it can resume later. I am creating a trigger to resume the task, but that is not part of my problem (at least, not the most important one at this moment).

My code looks like this (reduced to the minimum to illustrate my problem):

function launchProcess() {
    var scriptProperties = PropertiesService.getScriptProperties();
    scriptProperties.setProperty(SOURCE_PARENT_FOLDER_KEY, SOURCE_PARENT_FOLDER_ID);
    scriptProperties.deleteProperty(CONTINUATION_TOKEN_KEY);
    continueProcess();
}

function continueProcess() {
    try {
        var startTime = (new Date()).getTime();
        var scriptProperties = PropertiesService.getScriptProperties();
        var srcParentFolderId = scriptProperties.getProperty(SOURCE_PARENT_FOLDER_KEY);
        var continuationToken = scriptProperties.getProperty(CONTINUATION_TOKEN_KEY);
        var iterator = continuationToken == null ? DriveApp.getFolderById(srcParentFolderId).getFolders() : DriveApp.continueFolderIterator(continuationToken);

        var timeLimitIsNear = false;
        var currTime;
        while (iterator.hasNext() && !timeLimitIsNear) {
            var folder = iterator.next();

            processFolder_(folder);

            currTime = (new Date()).getTime();
            timeLimitIsNear = (currTime - startTime >= MAX_RUNNING_TIME);
        }

        if (!iterator.hasNext()) {
            scriptProperties.deleteProperty(CONTINUATION_TOKEN_KEY);
        } else {
            var contToken = iterator.getContinuationToken();
            scriptProperties.setProperty(CONTINUATION_TOKEN_KEY, contToken);
        } 

    } catch (e) {
        //sends a mail with the error
    }    
}

When launchProcess is invoked, it only prepares the program for the other method, continueProcess, that iterates over the set of folders. The iterator is obtained by using the continuation token, when it is present (it will not be there in the first invocation). When the time limit is near, continueProcess obtains the continuation token, saves it in a property and waits for the next invocation.

The problem I have is that the iterator is always returning the same set of folders although it has been built from different tokens (I have printed them, so I know they are different).

Any idea what I am doing wrong?

Thank you in advance.

fdediego
  • It looks like a bug to me. It's reported here: https://code.google.com/p/google-apps-script-issues/issues/detail?id=4116 – fdediego Jun 29 '14 at 18:54

4 Answers

It appears that your loop was not built correctly. (Edit: there is probably also another issue with how the while loop is exited; see my thoughts on that in the comments.)

Note also that there is no particular reason to use a try/catch in this context, since I see no reason why the hasNext() method would throw an error (but if you think it might, you can always add it).

Here is an example that works. I added the trigger creation and deletion lines to run my test.

EDIT: code updated with logs and a counter

var SOURCE_PARENT_FOLDER_ID = '0B3qSFd3iikE3MS0yMzU4YjQ4NC04NjQxLTQyYmEtYTExNC1lMWVhNTZiMjlhMmI';
// Time budget in milliseconds: 5*35*6 = 1050 ms.
var MAX_RUNNING_TIME = 5*35*6;

function launchProcessFolder() {
  var scriptProperties = PropertiesService.getScriptProperties();
  scriptProperties.setProperty('SOURCE_PARENT_FOLDER_KEY', SOURCE_PARENT_FOLDER_ID);
  scriptProperties.setProperty('counter', 0);
  scriptProperties.deleteProperty('CONTINUATION_TOKEN_KEY');
  // The trigger handler must match the name of the function defined below.
  ScriptApp.newTrigger('continueProcessFolder').timeBased().everyMinutes(10).create();
  continueProcessFolder();
}

function continueProcessFolder() {
  var startTime = (new Date()).getTime();
  var scriptProperties = PropertiesService.getScriptProperties();
  var srcParentFolderId = scriptProperties.getProperty('SOURCE_PARENT_FOLDER_KEY');
  var continuationToken = scriptProperties.getProperty('CONTINUATION_TOKEN_KEY');
  var iterator = continuationToken == null ? DriveApp.getFolderById(srcParentFolderId).getFolders() : DriveApp.continueFolderIterator(continuationToken);
  var timeLimitIsNear = false;
  var currTime;
  var counter = Number(scriptProperties.getProperty('counter'));
  while (iterator.hasNext() && !timeLimitIsNear) {
    var folder = iterator.next();
    counter++;
    Logger.log(counter+'  -  '+folder.getName());

    currTime = (new Date()).getTime();
    timeLimitIsNear = (currTime - startTime >= MAX_RUNNING_TIME);

    if (!iterator.hasNext()) {
      scriptProperties.deleteProperty('CONTINUATION_TOKEN_KEY');
      ScriptApp.deleteTrigger(ScriptApp.getProjectTriggers()[0]);
      Logger.log('******************no more folders**************');
      break;
    }
  }
  if(timeLimitIsNear){
    var contToken = iterator.getContinuationToken();
    scriptProperties.setProperty('CONTINUATION_TOKEN_KEY', contToken);
    scriptProperties.setProperty('counter', counter);
    Logger.log('write to scriptProperties');
  }
}

EDIT 2 :

(see also last comment)

Here is a test with the script modified to get the files in a folder. My tests show that the operation is very fast, so I had to set a very short time limit for the timeout to trigger before the end of the list was reached.

I added a couple of Logger.log() and a counter to see exactly what was happening and to know for sure what was interrupting the while loop.

With the current values I can see that it works as expected: the first (and second) interruptions are caused by the time limit, and the logger confirms that the token is written. On the third run I can see that all files have been listed.

var SOURCE_PARENT_FOLDER_ID = '0B3qSFd3iikE3MS0yMzU4YjQ4NC04NjQxLTQyYmEtYTExNC1lMWVhNTZiMjlhMmI';
// Time budget in milliseconds: 5*35*6 = 1050 ms, deliberately short for this test.
var MAX_RUNNING_TIME = 5*35*6;

function launchProcess() {
  var scriptProperties = PropertiesService.getScriptProperties();
  scriptProperties.setProperty('SOURCE_PARENT_FOLDER_KEY', SOURCE_PARENT_FOLDER_ID);
  scriptProperties.setProperty('counter', 0);
  scriptProperties.deleteProperty('CONTINUATION_TOKEN_KEY');
  ScriptApp.newTrigger('continueProcess').timeBased().everyMinutes(10).create();
  continueProcess();
}

function continueProcess() {
  var startTime = (new Date()).getTime();
  var scriptProperties = PropertiesService.getScriptProperties();
  var srcParentFolderId = scriptProperties.getProperty('SOURCE_PARENT_FOLDER_KEY');
  var continuationToken = scriptProperties.getProperty('CONTINUATION_TOKEN_KEY');
  var iterator = continuationToken == null ? DriveApp.getFolderById(srcParentFolderId).getFiles() : DriveApp.continueFileIterator(continuationToken);
  var timeLimitIsNear = false;
  var currTime;
  var counter = Number(scriptProperties.getProperty('counter'));
  while (iterator.hasNext() && !timeLimitIsNear) {
    var file = iterator.next();
    counter++;
    Logger.log(counter+'  -  '+file.getName());

    currTime = (new Date()).getTime();
    timeLimitIsNear = (currTime - startTime >= MAX_RUNNING_TIME);

    if (!iterator.hasNext()) {
      scriptProperties.deleteProperty('CONTINUATION_TOKEN_KEY');
      ScriptApp.deleteTrigger(ScriptApp.getProjectTriggers()[0]);
      Logger.log('******************no more files**************');
      break;
    }
  }
  if(timeLimitIsNear){
    var contToken = iterator.getContinuationToken();
    scriptProperties.setProperty('CONTINUATION_TOKEN_KEY', contToken);
    scriptProperties.setProperty('counter', counter);
    Logger.log('write to scriptProperties');
  }
}
Serge insas
  • Thank you for your response, but I see that the difference is that you put the if within the loop and I had it out, as I only wanted it to be executed once, when the limit was near or there wasn't more elements. Do you really think I need to store the token in the property in every iteration? I don't see a reason for that, but I will try doing it, just in case. – fdediego Jun 24 '14 at 20:01
  • You're right... I made a mistake! Code updated ;-) Writing repeatedly to the scriptProperties would indeed be a potential source of errors. Sorry about that (I was at work and probably too busy with something else...) – Serge insas Jun 24 '14 at 20:09
  • I added a second 'if' to know what made it interrupt (last item or time limit) because the while loop has 2 conditions. – Serge insas Jun 24 '14 at 21:34
  • I don't understand you. Your code IS writing repeatedly to the scriptProperties. In fact I have tried it and it started raising this exception "The script has found some problem :(, Exception: Service invoked too many times in a short time: properties rateMax. Try Utilities.sleep(1000) between calls.]" – fdediego Jun 24 '14 at 22:07
  • It does not. I added a log in the code so you can check it. Are you sure you copied my latest version (the one that is here)? – Serge insas Jun 24 '14 at 22:18
  • Oh, no, for some reason I was seeing your comment but not your last version of the code. I will try now. – fdediego Jun 24 '14 at 22:28
  • I fear the problem is still there, even when copying and pasting your code. It's very strange. There are about 350 folders in the parent folder. In the first iteration it returns a number of them before the time limit but, in the second invocation it repeats the data from the 100th folder, in the third invocation it repeats from the same exact record, and so on. – fdediego Jun 24 '14 at 22:45
  • I don't have enough folders to test exactly as you do... I'll change the script to look for files and come back later. – Serge insas Jun 25 '14 at 06:06
  • I didn't have the time to modify my code yet but I wanted to ask if you actually get exactly 100 folders or was it a rough evaluation? If it's exactly 100 then I guess you are not exiting the loop because of a timeout but rather because of some limitation in the results that I ignored... could you check the execution transcript to confirm the time it takes to complete one run? Thx.- edit: you can also test it with a shorter timeout, a minute or so, to see how it behaves... – Serge insas Jun 26 '14 at 15:24
  • It is always iterating exactly from the 101st folder. – fdediego Jun 26 '14 at 20:18
  • Great news...then try the new version with logs and please tell me what happens :-) – Serge insas Jun 26 '14 at 20:21
  • I have tested with your latest code and a 30-second limit. I added a delay to slow down the process: - First execution: it processed 50 folders - Second execution: it processed 51 more (no repetitions this time) - Third execution: it processed 43 (the first seems to be the last of the previous execution, folder number 101) - Fourth execution: it processed 50 folders, repeating from the same folder as the previous execution (folder number 101) - Fifth execution: it processed 50 folders, repeating from the same point. - ... – fdediego Jun 26 '14 at 20:45
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/56380/discussion-between-fdediego-and-serge-insas). – fdediego Jun 26 '14 at 20:48

As of January 1, 2016 this is still a problem. The bug report lists a solution using the Advanced Drive API, which is documented here, under "Listing folders".
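
The Advanced Drive approach can be sketched roughly as follows. This is a minimal sketch, not the code from the bug report. It assumes the Advanced Drive service (Drive API v2) is enabled in the script under the identifier Drive, where Drive.Files.list takes q, maxResults and pageToken and returns items and nextPageToken (a v3 setup uses different field names). The function name listChildFolderIds and the injectable listPage parameter are made up for illustration.

```javascript
// Sketch only: collects the IDs of the folders directly under parentId by
// paging through Drive.Files.list (Advanced Drive service, API v2 assumed).
// `listPage` is a hypothetical hook so the paging logic can be exercised
// without Drive; when omitted it falls back to Drive.Files.list.
function listChildFolderIds(parentId, listPage) {
  listPage = listPage || function (params) { return Drive.Files.list(params); };
  var query = "'" + parentId + "' in parents and " +
      "mimeType = 'application/vnd.google-apps.folder' and trashed = false";
  var ids = [];
  var pageToken = null;
  do {
    var params = { q: query, maxResults: 100 };
    if (pageToken) {
      params.pageToken = pageToken;   // ask for the page after the last one
    }
    var page = listPage(params);
    (page.items || []).forEach(function (item) {
      ids.push(item.id);
    });
    pageToken = page.nextPageToken || null;
  } while (pageToken);
  return ids;
}
```

To resume across executions, the page token could be saved in the script properties in place of the DriveApp continuation token.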

If you don't want to use Advanced services, an alternative solution would be to use the Folder Iterator to build an array of folder IDs.

It appears to me that the Folder Iterator misbehaves only when created using DriveApp.continueFolderIterator(). When using this method, only 100 Folders are included in the returned Folder Iterator.

Using DriveApp.getFolders() and only getting folder IDs, I am able to iterate through 694 folders in 2.734 seconds, according to the Execution transcript.

function allFolderIds() {
  var folders = DriveApp.getFolders(),
      ids = [];
  while (folders.hasNext()) {
    var id = folders.next().getId();
    ids.push(id);
  }
  Logger.log('Total folders: %s', ids.length);
  return ids;
}

I used the returned array to work my way through all the folders, using a trigger. The Id array is too big to save in the cache, so I created a temp file and used the cache to save the temp file Id.
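
Working through the ID array in time-limited batches could look something like this. It is only a sketch of the idea, not my actual script: processBatch, processOne and the injectable now clock are hypothetical names, and the caller is assumed to persist the returned index (for example in the script properties) between trigger runs.

```javascript
// Sketch only: processes ids[startIndex..] until the time budget is used up
// and returns the index to resume from (ids.length once everything is done).
// `processOne` is a hypothetical per-folder handler; `now` is an optional
// clock returning milliseconds, injectable for testing.
function processBatch(ids, startIndex, maxMillis, processOne, now) {
  now = now || function () { return new Date().getTime(); };
  var startTime = now();
  var i = startIndex;
  while (i < ids.length && now() - startTime < maxMillis) {
    processOne(ids[i]);
    i++;
  }
  return i;
}
```

In a trigger handler, processOne would typically call DriveApp.getFolderById(id) and do the real work, and the handler would delete the trigger once the returned index equals ids.length.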

Jack Steam

This is caused by a bug in GAS: https://code.google.com/p/google-apps-script-issues/issues/detail?id=4116

fdediego

It appears you're only storing a single continuation token. If you want to recursively iterate over a set of folders and allow the script to pause at any point (e.g. to avoid the timeout) and resume later, you'll need to store a bunch more continuation tokens (e.g. in an array of objects).

I've outlined a template that you can use here to get it working properly. This worked with thousands of nested files over the course of 30+ runs perfectly.
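
To illustrate the idea of keeping one saved position per folder level (this is not the linked template, just a minimal sketch), the state can be a stack of frames that is serialized between runs. In Apps Script each frame would hold a continuation token. In this sketch a plain index into the child list stands in for the token, and walk, getChildren, visit and shouldStop are hypothetical names.

```javascript
// Sketch only: resumable depth-first walk over a folder tree.
// `state` is a JSON-serializable stack with one frame per folder level that
// is currently being iterated. In Apps Script each frame would store a
// continuation token; here `next` (an index into the child list) plays
// that role. `getChildren`, `visit` and `shouldStop` are hypothetical hooks.
function walk(state, getChildren, visit, shouldStop) {
  while (state.length > 0 && !shouldStop()) {
    var frame = state[state.length - 1];
    var children = getChildren(frame.folderId);
    if (frame.next >= children.length) {
      state.pop();                 // this level is exhausted, go back up
      continue;
    }
    var childId = children[frame.next];
    frame.next++;                  // record progress before descending
    visit(childId);
    state.push({ folderId: childId, next: 0 });
  }
  return state;                    // empty once the whole tree is done
}
```

Between runs the returned stack would be saved with JSON.stringify (for example in a script property) and parsed again on the next trigger, so every level resumes where it left off.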

Senseful