I'm new to using the Google Drive API for Python (v3) and I've been trying to access and update the sub-folders in a particular parent folder for which I have the fileId. Here is my build for the API driver:

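from googleapiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

# Load cached credentials, or run the OAuth flow if they are missing or invalid.
store = file.Storage('token.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('credentials.json',
           scope='https://www.googleapis.com/auth/drive')
    creds = tools.run_flow(flow, store)
service = build('drive', 'v3', http=creds.authorize(Http()))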

I am able to successfully access most of the sub-folders by using files().list() but at least one was missing from the list of results returned:

results = service.files().list(
    q="parents in '1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw'", 
    fields="files(id, name), incompleteSearch, nextPageToken").execute()
items = results['files']

I double-checked: there was no nextPageToken key in the results and the value of incompleteSearch was False, which I assume means the full list of results was returned. In addition, when I accessed the list of parents for the missing file using the files().get() method, the only parent listed was the one in the query above:

service.files().get(
    fileId='1WHP02DtXfJHfkdr47xSeeRIj0sCrihPA',
    fields='parents, name').execute()

and returns this:

{'name': 'Sara Gaul -Baltimore Corps docs and schedules',
 'parents': ['1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw']}

Other details that may be relevant:

  • This particular folder that is not appearing in the list was renamed by a collaborator.
  • I'm running this code in a Jupyter notebook instead of from a Python file.
  • I'm a named collaborator with write access on all of the sub-folders, including the one that's not showing up.

UPDATES

  • The files().list() query used to return 40 records of the 41 in the folder. Now it is only returning 39.
  • Both of the folders that are no longer being returned were renamed by someone who accessed the folder using the link that grants write-level permissions.
  • When their folder details are queried directly using files().get() both of the non-returned folders still have the parent folder as their only parent, and their permissions have not changed.

Main questions:

  1. Why isn't this file, which clearly has the parent id listed in my files().list() query, showing up in the results of that query? And is there any way to adjust the query or the file to ensure that it does?
  2. Is there an easier way to list all of the files contained within a folder in the Google Drive API v3? I know that v2 had a children() method for folders, but to my knowledge it is no longer available in v3.
William Daly
  • How did you double check that there was no nextpagetoken when you are excluding it from your fields? – Linda Lawton - DaImTo Sep 13 '18 at 16:23
  • I ran `results.keys()` and the only key returned was `'files'`. According to this documentation: https://developers.google.com/drive/api/v3/reference/files/list the `nextPageToken` key is absent from the returned object if there aren't multiple pages – William Daly Sep 13 '18 at 16:27
  • @DaImTo I just updated the original post to include some further info around `nextPageToken` and `incompleteSearch` – William Daly Sep 13 '18 at 16:38
  • fields='parents, name' requests a partial response. You have not included nextPageToken there, so you will never see it. Try fields='*' – Linda Lawton - DaImTo Sep 13 '18 at 19:27
  • @DaImTo The issue is not with the `files().get()` method it's with `files().list()` and I included nextPageToken in the fields list and it's still not available in the results of that query – William Daly Sep 13 '18 at 19:55
  • I think you need to show me an example: "I have x files in this folder and only y are returning." Is there a permission issue with the missing files? – Linda Lawton - DaImTo Sep 14 '18 at 06:53
  • Who owns the missing folder? Try setting "pageSize=1" on your query to force multiple pages with nextPageTokens to test your program logic. – pinoyyid Sep 14 '18 at 11:17
  • @DaImTo The example is listed in the original post. There are 41 files in the parent folder with id `1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw` and now only 39 are returned by the `files().list()` query when I filter for files that have parents with that id. – William Daly Sep 14 '18 at 16:17
  • @pinoyyid the issue isn't the multiple pages, because yesterday the `files().list()` returned 40 records, today it's only returning 39, and the record that is no longer showing up in the list today is one that was also renamed by a collaborator. I'll add this new pattern to the original post – William Daly Sep 14 '18 at 16:21

1 Answer

I figured out the error with my code:

My previous query parameter in the files().list() method was:

results = service.files().list(
    q="parents in '1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw'", 
    fields="files(id, name), incompleteSearch, nextPageToken").execute()
items = results['files']

After looking at another bug someone had posted in Google's issue tracker for the API, I saw the preferred syntax for that query was:

results = service.files().list(
    q="'1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw' in parents", 
    fields="files(id, name), incompleteSearch, nextPageToken").execute()
items = results['files']

In other words, I switched the order from parents in fileId to fileId in parents. With that change in syntax, all 41 files were returned.
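For anyone who also wants to rule out pagination as the cause, here is a minimal sketch of the corrected query wrapped in a loop that follows nextPageToken. The function name list_children and its page_size argument are my own illustrative choices, not part of the API. It assumes service is the object built in the question, and that the folder id contains no single quotes.

```python
def list_children(service, folder_id, page_size=100):
    """Return every file whose parents include folder_id.

    `service` is assumed to be a Drive v3 service object; `list_children`
    and `page_size` are hypothetical names used for illustration.
    """
    # Corrected clause order: the id comes first, then `in parents`.
    query = "'{}' in parents".format(folder_id)
    items = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            pageSize=page_size,
            fields="nextPageToken, incompleteSearch, files(id, name)",
            pageToken=page_token,
        ).execute()
        items.extend(response.get("files", []))
        # nextPageToken is absent from the last page of results.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return items
```

Calling list_children(service, '1QXPl6z04GsYAO0GKHBk2oBjEweaAbczw') should then return the same records as the single call above when everything fits on one page.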

I have two follow-up questions that hopefully someone can clarify:

  1. Why would the first syntax return any records at all if it is incorrect? And why would changing the name of a file prevent it from being returned using the first syntax?
  2. If you wanted to return a list of files that were stored in one of a few folders, is there any way to pass multiple parent ids to the query, as the parents in ... syntax would suggest? Or do they have to be evaluated as separate conditions, i.e. fileId1 in parents or fileId2 in parents?

If someone could comment on this answer with those explanations or post a more complete answer, I would gladly select it as the best response.

William Daly
  • to answer your first question it might be worth GETting the file and looking at its parents collection. A guess is that your initial syntax was somehow matching on only the first element of the parents. On 2, as you now know, it's "id in parents", so you'll just need to string the clauses together with `or` – pinoyyid Sep 14 '18 at 22:19
  • @pinoyyid Thanks! I've checked a folder's parents before and after a name change, and it doesn't appear to have changed how the list of parents is stored, but at least I know how to avoid this issue in the future. And gotcha, that makes sense, but it doesn't seem to allow for parameterized queries very easily. – William Daly Sep 18 '18 at 17:18
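On the second follow-up question and the point about parameterized queries: a minimal sketch of one way to build the or-joined clause from a list of folder ids. The helper name parents_query is hypothetical and not part of the client library. The escaping of backslashes and single quotes follows my reading of the Drive search-query rules, so treat it as an assumption.

```python
def parents_query(folder_ids):
    """Build a Drive `q` string matching files in any of the given folders.

    `parents_query` is a hypothetical helper, not a client-library function.
    """
    if not folder_ids:
        raise ValueError("at least one folder id is required")
    clauses = []
    for folder_id in folder_ids:
        # Assumed escaping rule: backslash-escape backslashes and single quotes.
        escaped = folder_id.replace("\\", "\\\\").replace("'", "\\'")
        clauses.append("'{}' in parents".format(escaped))
    # One `id in parents` clause per folder, strung together with `or`.
    return " or ".join(clauses)
```

The resulting string can be passed as the q argument to files().list(), for example q=parents_query([folder_a, folder_b]), where folder_a and folder_b are placeholder ids.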