3

When I use Google Drive API to download my Excel files, it will send and outdated version. The way to reproduce this is to open the Excel file in Google Sheets Editor, than add new tabs, then download the File through the API. Most of the time it will not send the file with the latest added Tab.

I'm having this issue using the Google build('drive', 'v3', credentials=creds) in my project, but could also reproduce this with a simpler script that I got here. It uses requests and it's way simpler:

import os
import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download&confirm=1"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)    

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

if __name__ == "__main__":
    file_id = '1JI17N-NFAIOxX_2Y88gmuKMlsuGhBPcB'
    home = os.getenv('HOME')
    destination = f'{home}/Downloads/test-drive-download/output.xlsx'
    download_file_from_google_drive(file_id, destination)

The file id shown in the script is public: https://docs.google.com/spreadsheets/d/1JI17N-NFAIOxX_2Y88gmuKMlsuGhBPcB/edit#gid=1677858863

I've also recorded a video for better clarification: https://drive.google.com/file/d/1-L_SnWp1zQNWJ34Z2JtVbgnPVIONuBgr/view?usp=sharing

Mauricio
  • 2,552
  • 2
  • 29
  • 43
  • I have experienced the same situation with you. In that case, I used a workaround. 1. Manually edit the XLSX file on the browser and confirm the save. 2. Using a script, it copies the XLSX file. And, download the copied XLSX file. 3. After the download, delete the copied XLSX file. By this workaround, the latest XLSX file can be downloaded. In this case, it is required to use the access token for copying the XLSX file. – Tanaike May 24 '23 at 00:31
  • When I saw your script, it seems that the XLSX file is downloaded without using the access token. So, in this case, how about copying the XLSX file using Web Apps created by Google Apps Script? By this, you can achieve your goal by adding a script requesting Web Apps without the access token. I'm not sure whether this is your expected direction. So, I posted it as a comment. If this was not your expected direction, I apologize. – Tanaike May 24 '23 at 00:31
  • I intentionally used a script that doesn't require Access Token so it's easier to reproduce. But it happens regardless of the API client used. Regarding your suggestion, I can't use this workaround becaue I provide a system where its users authorize their Google Drive to have their files managed by the service. So I can't ask my users to do this workaround. – Mauricio May 24 '23 at 00:44
  • 1
    Thank you for replying. I apologize that my proposal was not useful for your situation. I would like to study more. – Tanaike May 24 '23 at 00:44
  • 1
    I suggest you post your issue/concern here: https://issuetracker.google.com/issues – Twilight May 25 '23 at 03:52

1 Answers1

0

I can confirm the issue. It seems that it is related to how google internally works with non-native file formats.

The workaround I've found is to upload a file with conversion to Google Sheets format (specify mimetype "application/vnd.google-apps.spreadsheet" in your Create api call).

    f, err = c.service.Files.Create(&drive.File{
        Name:     name,
        Parents:  []string{parentID},
        MimeType: "application/vnd.google-apps.spreadsheet",
    }).Media(file).Do()

Then, instead of using Download API, use Export API with the mimetype "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet".

resp, err := c.service.Files.Export(fileID, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet").Download()

This will download file in the xslx format and, in my experiments, I always got the latest version of the file.

PS: That's not ideal solution, since converting some sophisticated Excels to GoogleSheets might cause corruption/feature loss, but better than nothing.

iVariable
  • 130
  • 2
  • 5