
What I am trying to do is to create a local file tree from Google Drive files (works like a cache), so I don't have to make an HTTP request every time I need a file info (quota is limited on Google API).

My best approach so far is to request all the files at once, so that I get one giant file list without too many HTTP requests, and can then link the entries together to build the tree. It mostly works with some tricks, but the problem is that I receive far more files than I need (9729 received, only 1764 useful), because I haven't been able to come up with an efficient filter for the list service. I'm downloading all the files (including sharedWithMe; trash is excluded, since there is an easy filter for it, trashed = false, and I only request the necessary fields), and it takes much longer than necessary because of the unnecessary data.

I just want the files that are under the MyDrive (root) folder. Some files are sharedWithMe and are also under MyDrive, so I can't simply exclude them with q = 'me' in owners. In other words, I only want the files at the root level, the children of those files, and so on. Is there any query that does this, or at least makes it more efficient?

Note: this is not a shared drive application.
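To illustrate the linking step, here is a minimal sketch of turning a flat files.list result into a tree. The field names (`id`, `name`, `parents`) match what files.list returns when you request `fields="files(id, name, parents)"`; the `"ROOT"` id and the sample entries are made up for illustration — in a real run the root id comes from `files.get(fileId="root")`.

```python
# Sketch: link a flat files.list result into a tree keyed by parent id.
# Assumes each entry carries "id", "name", and "parents", as returned by
# files.list with fields="files(id, name, parents)". "ROOT" is a placeholder
# for the real root id (files.get(fileId="root") in a live run).

from collections import defaultdict

def build_tree(files, root_id):
    """Return {parent_id: [child_file, ...]} keeping only files reachable from root_id."""
    children = defaultdict(list)
    for f in files:
        for parent in f.get("parents", []):
            children[parent].append(f)

    # Walk down from the root: sharedWithMe files that are NOT under
    # My Drive get dropped, while shared files placed under My Drive survive.
    tree = {}
    stack = [root_id]
    while stack:
        parent = stack.pop()
        kids = children.get(parent, [])
        tree[parent] = kids
        stack.extend(k["id"] for k in kids)
    return tree

sample = [
    {"id": "a", "name": "Docs", "parents": ["ROOT"]},
    {"id": "b", "name": "notes.txt", "parents": ["a"]},
    {"id": "c", "name": "shared-elsewhere.txt", "parents": ["SOMEONE_ELSES_FOLDER"]},
]
tree = build_tree(sample, "ROOT")
```

The walk from the root is what filters out shared files that don't live under MyDrive, without needing any owner-based query.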

  • Although I'm not sure whether I could correctly understand about your goal, for example, when you want to retrieve the file list just under the root folder, how about using the search query of `'root' in parents` for the method of files.list in Drive API? And also, when you want to retrieve the file list just under the specific folder, how about using the search query of `'### folder Id ###' in parents`? [Ref](https://developers.google.com/drive/api/v3/reference/query-ref) If I misunderstood your situation and goal, I apologize. – Tanaike Dec 09 '19 at 23:10
  • This would work as well; the only problem is that I would have to make an HTTP request for each folder. The goal is to make as few requests as possible, both to make it run faster and to avoid using too much quota from the Google API (HTTP requests are limited). For example, I have 651 folders in my Google Drive, so it would take 651 HTTP requests to get all the files inside these folders. – Lucasjrt Dec 09 '19 at 23:17
  • Thank you for replying. I apologize for my incomplete comment. In your case, how about using the query parameter like `'folderId1' in parents or 'folderId2' in parents or ...`? Or how about using the batch request? [Ref](https://developers.google.com/drive/api/v3/batch) For both methods, you can retrieve the file list from the specific folders. If this was not the direction you want, I apologize. – Tanaike Dec 09 '19 at 23:24
  • Thank you for your reply as well; your suggestion is interesting. It would take as many requests as the height of the tree, because on the first request the only id I have is root, on the second the children of the root, and so on. In some cases this might be faster than downloading all of the files, but because of request latency it might be slower. Batch also has this problem, because I need the requests in sequence to get the next IDs: first root, then the root's children, and so on. – Lucasjrt Dec 09 '19 at 23:36
  • At the former method, the file list in each folder is retrieved as one value. In this case, after the file list was retrieved, it is required to separate each folder using the parent ID. At the latter method, 100 API calls can be run by one API call. And this works with the asynchronous process. The returned values can be retrieved for each call. But I'm not sure whether this is the result you want. I apologize for this. – Tanaike Dec 09 '19 at 23:42
  • I just thought of a solution: I could first make a request with `'me' in owners`; then I will have a bunch of files that are in the MyDrive folder or under it. With their IDs, I can batch files.list calls that cover all of the parents, looking for children with `not 'me' in owners`. I can't see any solution more efficient than this one. – Lucasjrt Dec 09 '19 at 23:42
  • Thank you for replying. I'm glad your issue was resolved. – Tanaike Dec 09 '19 at 23:42
  • Well, thank you so much for your suggestions, it really helped me. – Lucasjrt Dec 09 '19 at 23:45
  • have a look at Alternative 3 of https://stackoverflow.com/questions/41741520/how-do-i-search-sub-folders-and-sub-sub-folders-in-google-drive – pinoyyid Dec 10 '19 at 11:56
  • 1
    Hi @Lucasjrt, please if you can post the answer on how you managed to solve your issue. This is important for [documentation purposes](https://stackoverflow.com/help/self-answer). – Andres Duarte Dec 11 '19 at 14:58

1 Answer


The best way I found to create the tree is:

  1. Request a files.list for all the folders (q: mimeType = 'application/vnd.google-apps.folder'), since there are usually fewer folders than files. Adding trashed = false can also be useful.
  2. Create the tree structure, discarding all of the useless folders.
  3. For each folder in the tree, get its id and then call files.list with q = mimeType != 'application/vnd.google-apps.folder' and ('folderId1' in parents or 'folderId2' in parents or ... or 'folderIdN' in parents) (watch out for a "query too complex" error when the query gets too big [maybe over 25,000 characters, not sure]).
  4. Complete the linking of the tree.
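Step 3 can be sketched as a small query builder that packs folder ids into as few q strings as possible while staying under a length budget. The exact limit that triggers Drive's "query too complex" error isn't documented; the 25,000-character budget below is the guess from the answer, not a confirmed value.

```python
# Sketch of step 3: pack folder ids into as few files.list q strings as
# possible, each under a length budget. The 25,000-char default is an
# assumption (the real limit for the "query too complex" error is unknown).

def build_queries(folder_ids, max_len=25000):
    """Return q strings of the form:
    mimeType != 'application/vnd.google-apps.folder' and ('id1' in parents or ...)"""
    prefix = "mimeType != 'application/vnd.google-apps.folder' and ("
    queries, terms = [], []
    length = len(prefix) + 1  # +1 for the closing ")"
    for fid in folder_ids:
        term = f"'{fid}' in parents"
        extra = len(term) + (4 if terms else 0)  # 4 = len(" or ")
        if terms and length + extra > max_len:
            queries.append(prefix + " or ".join(terms) + ")")
            terms, length = [], len(prefix) + 1
            extra = len(term)
        terms.append(term)
        length += extra
    if terms:
        queries.append(prefix + " or ".join(terms) + ")")
    return queries

queries = build_queries(["folderId1", "folderId2"])
```

Each returned string can then be passed as the `q` parameter of a files.list call (or one call per string inside a batch request).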

By doing so, there will still be some useless data and a few extra requests, but the amount of downloaded data is much smaller than listing files without any q restriction.

The idea came from Alternative 3 in the answer to this question.
