I'm building a Python application that uses the Google Drive API. So far development is going well, but I'm having trouble retrieving the entire Google Drive file tree. I need it for two purposes:
- Check whether a path exists: if I want to upload test.txt under root/folder1/folder2, I want to check whether the file already exists and, if it does, update it.
- Build a visual file explorer. I know that Google provides its own (I can't remember the name right now, but I know it exists), but I want to restrict the file explorer to specific folders.
For now I have a function that fetches the root of Google Drive, and I can build the tree by recursively calling a function that lists the contents of a single folder. However, this is extremely slow and can potentially make thousands of requests to Google, which is unacceptable.
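To illustrate what I am hoping is possible: fetch one flat listing of files, then build the tree in memory instead of issuing one request per folder. The sketch below is only that, a sketch. It assumes every item is a Drive v2 File resource dictionary with `id`, `title`, and a `parents` list whose entries carry `id` and `isRoot`; the helper name `build_children_index`, the `ROOT` key, and the sample data are hypothetical.

```python
from collections import defaultdict

ROOT = "root"  # hypothetical key used here for the top-level folder


def build_children_index(items):
    """Group file dictionaries by parent ID, using a single flat listing.

    Assumes each item looks like a Drive v2 File resource with 'id',
    'title' and 'parents' (a list of {'id': ..., 'isRoot': ...}).
    """
    children = defaultdict(list)
    for item in items:
        for parent in item.get("parents", []):
            # Files directly under the root are grouped under one fixed key.
            key = ROOT if parent.get("isRoot") else parent["id"]
            children[key].append(item)
    return children


# Hypothetical sample data standing in for the result of one paginated listing.
sample = [
    {"id": "f1", "title": "folder1", "parents": [{"id": "r0", "isRoot": True}]},
    {"id": "f2", "title": "folder2", "parents": [{"id": "f1", "isRoot": False}]},
    {"id": "t1", "title": "test.txt", "parents": [{"id": "f2", "isRoot": False}]},
]

index = build_children_index(sample)
root_titles = [f["title"] for f in index[ROOT]]
```

With an index like this, listing a folder is a dictionary lookup, so restricting the explorer to specific folders would only mean starting the walk from their IDs.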
Here is the function to get the root:
from apiclient import errors
from apiclient.discovery import build
import driveHelper  # my own module for authentication and credential storage

def drive_get_root():
    """Retrieve a root list of File resources.

    Returns:
        List of dictionaries.
    """
    # build the service; the driveHelper module takes care of authentication and credential storage
    drive_service = build('drive', 'v2', http=driveHelper.buildHttp())
    # the result will be a list
    result = []
    page_token = None
    while True:
        try:
            param = {}
            if page_token:
                param['pageToken'] = page_token
            files = drive_service.files().list(**param).execute()
            # add the files to the list
            result.extend(files['items'])
            page_token = files.get('nextPageToken')
            if not page_token:
                break
        except errors.HttpError as _error:
            print('An error occurred: %s' % _error)
            break
    return result
and here is the one to get the files in a folder:
def drive_files_in_folder(folder_id):
    """Return the files belonging to a folder.

    Args:
        folder_id: ID of the folder to get files from.
    """
    # build the service; the driveHelper module takes care of authentication and credential storage
    drive_service = build('drive', 'v2', http=driveHelper.buildHttp())
    # the result will be a list
    result = []
    # pagination code from Google; it works, so I didn't touch it
    page_token = None
    while True:
        try:
            param = {}
            if page_token:
                param['pageToken'] = page_token
            children = drive_service.children().list(folderId=folder_id, **param).execute()
            for child in children.get('items', []):
                # one extra request per child to fetch its full metadata
                result.append(drive_get_file(child['id']))
            page_token = children.get('nextPageToken')
            if not page_token:
                break
        except errors.HttpError as _error:
            print('An error occurred: %s' % _error)
            break
    return result
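Part of the cost here is that every child triggers its own `drive_get_file` request. A possible alternative, sketched below under assumptions, is to ask `files().list` for the folder's contents with a `q` filter on `parents`, so full metadata arrives with each page. The function name `list_folder` is hypothetical, and `service` is assumed to be a Drive v2 service object like the one built above; I have not measured how much this actually saves.

```python
def list_folder(service, folder_id):
    """Return the File resources directly inside folder_id.

    Sketch only: relies on files().list accepting a 'q' search string and a
    'pageToken', and on each page holding 'items' and an optional
    'nextPageToken', as in the Drive v2 responses used above.
    """
    query = "'%s' in parents and trashed = false" % folder_id
    result = []
    page_token = None
    while True:
        params = {"q": query}
        if page_token:
            params["pageToken"] = page_token
        # One request per page of results, not one request per file.
        page = service.files().list(**params).execute()
        result.extend(page.get("items", []))
        page_token = page.get("nextPageToken")
        if not page_token:
            return result
```

Error handling is left out to keep the sketch short; the `try`/`except errors.HttpError` pattern from the functions above would wrap the `execute()` call.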
For example, to check whether a file exists I'm currently using this:
def drive_path_exist(file_path, file_list=None):
    """
    Recursively check whether the given path exists.
    Returns the file's dictionary if found, otherwise False.
    """
    # if no list is given, start from the root of Google Drive
    if file_list is None:
        file_list = drive_get_root()
    # split the path so the first item can be looked up in the current list
    file_path = file_path.split("/")
    # if there is only one element in the path we are at the actual file name,
    # so if it is in this folder we can return it
    if len(file_path) == 1:
        exist = False
        for elem in file_list:
            if elem["title"] == file_path[0]:
                # set exist to elem, because elem is a dictionary with all the file info
                exist = elem
        return exist
    # if we are not at the last element we have to keep searching
    else:
        exist = False
        for elem in file_list:
            # check whether the current path component is in this folder
            if elem["title"] == file_path[0]:
                exist = True
                folder_id = elem["id"]
        # delete the first element and keep searching
        file_path.pop(0)
        if exist:
            # recursive call: rejoin the remaining path as a string and pass
            # the list returned by drive_files_in_folder
            return drive_path_exist("/".join(file_path), drive_files_in_folder(folder_id))
        # an intermediate folder was not found
        return False
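Ideally the path check would not touch the network at all once the tree is known. A minimal sketch of that lookup follows, assuming an in-memory index that maps a parent ID to the list of file dictionaries inside it. The name `resolve_path`, the `"root"` key, and the sample index are hypothetical, and when a folder contains duplicate titles the sketch simply takes the first match.

```python
def resolve_path(path, children, root_key="root"):
    """Walk an in-memory index (parent ID -> list of file dicts) along path.

    Returns the matching file dictionary, or None if any component is missing.
    """
    current_id = root_key
    found = None
    for name in path.strip("/").split("/"):
        found = None
        for elem in children.get(current_id, []):
            if elem["title"] == name:
                found = elem
                break
        if found is None:
            return None
        # Descend into the matched entry for the next path component.
        current_id = found["id"]
    return found


# Hypothetical index, e.g. built once from a single flat listing.
children = {
    "root": [{"id": "f1", "title": "folder1"}],
    "f1": [{"id": "f2", "title": "folder2"}],
    "f2": [{"id": "t1", "title": "test.txt"}],
}

hit = resolve_path("folder1/folder2/test.txt", children)
miss = resolve_path("folder1/missing/test.txt", children)
```

Because the loop is iterative, no extra listing is requested per path component, which is the part of my current recursive version that worries me.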
Any idea how to solve my problem? I saw a few discussions here on Stack Overflow, and in some answers people wrote that this is possible, but of course they didn't say how!
Thanks