60

I'm sorry if this is an obvious question, I'm still pretty new to the API. I'm using the python drive api library, and trying to download a google spreadsheet as a csv.
When I used files.get, it spat out a file with no downloadUrl, and with no 'text/csv' key in the export links field.
If it's not possible, I can find a workaround, but I'm hoping it is, since it is possible to do manually (file->download_as->csv)

Do I need to use the google document list api?

thanks, Matt

user1546842
  • 601
  • 1
  • 6
  • 3

7 Answers7

50

Update: I have posted another answer that works with the Spreadsheets v4 API.

Old Answer:

The answer from Alain is correct, but you also need to set the gid=parameter to specify which worksheet to export.

For example, if your 'application/pdf' export link is like this:

docs.google.com/feeds/download/spreadsheets/Export?key=<FILE_ID>&exportFormat=pdf

You can just change it to this to download the first worksheet:

docs.google.com/feeds/download/spreadsheets/Export?key<FILE_ID>&exportFormat=csv&gid=0

There is a bit of a problem, though as there is no reliable way to get the gid for a given worksheet through the API and they are not zero based indexes. If you delete a worksheet, that gid does not get reused. You can see the gid in the URL in your browser though, so if your worksheet information is constant you can just get that from there. See http://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=1813 and http://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=3240 for more info on that problem.

Peter Haight
  • 1,906
  • 14
  • 19
  • I'm still getting an HTML export of the spreadsheet. The gid is correct, there's only ever been one sheet; I am bewildered. – Andrew Wyld Feb 26 '13 at 11:16
  • 1
    This is awesome! thnx guys! used this in PHP. – Paschalis Jun 13 '13 at 23:01
  • 3
    Any news? Plus, is there a typo? There should be an '=' after key, surely. – fatuhoku May 16 '14 at 13:59
  • 7
    This works for me with the current google docs: https://docs.google.com/spreadsheets/d/SECRET/export?format=csv&gid=0 – Dr. Jan-Philip Gehrcke Jun 03 '14 at 13:08
  • @Jan-PhilipGehrcke is right with the new URL. Also make sure you make the document shared. – ElDog Jun 24 '14 at 10:37
  • Any chance anyone can tell me where does the access_token parameter come in all of this? also is "key" the spreadsheet's ID? Because that's the API KEY according to the documentation... – JustAGuy Jul 08 '18 at 00:14
  • This post is from 2012 with an earlier version of the API. Back then the key field used to refer to the document ID, but now things are pretty different. – Peter Haight Jul 08 '18 at 18:16
  • 1
    @JustAGuy I made a new answer that works with the latest API. Also, the access_token needs to go in the Authorization header for the HTTP request. If you are using a library that's usually taken care of for you, but there's an example in the python script I just posted that sets it directly. – Peter Haight Jul 08 '18 at 21:13
18

As a lot of other people have pointed out, my original answer is somewhat outdated. So here is my answer updated for v4 of the Google Spreadsheets API. Now there's a way to get the gids, but we can't use the the drive files.export API because it only exports first worksheet in the spreadsheet (even if you specify the gid).

To export all of the worksheets as CSV files, you need to get the gids for the worksheets you want to export using the spreadsheets.get API. That API call returns a bunch of information about the spreadsheet including each of the worksheets. You can get the gid from the properties.sheetId property for each worksheet.

Once you have that, you can just build the same URL that the Sheets uses when you select File->Download As->CSV. You can take the data.spreadsheetUrl value from spreadsheets.get and replace /edit with /export and then add the gid as the parameter. You will also need to include Authorization Bearer <auth token> in the HTTP header in the request.

Here's a python script based on their quickstart example that downloads all of the sheets for the spreadsheet with a specified ID. You need to replace <spreadsheet id> with the ID for a spreadsheet you have access to:

import apiclient.discovery
import httplib2
import oauth2client.file
import oauth2client.tools
import re
import requests
import shutil
import urllib.parse

SCOPES = 'https://www.googleapis.com/auth/drive.readonly'
SPREADSHEET_ID = '<spreadsheet id>'

store = oauth2client.file.Storage('credentials.json')
creds = store.get()
if not creds or creds.invalid:
  flow = oauth2client.client.flow_from_clientsecrets('client_secret.json', SCOPES)
  creds = oauth2client.tools.run_flow(flow, store)

service = apiclient.discovery.build('sheets', 'v4', http=creds.authorize(httplib2.Http()))

result = service.spreadsheets().get(spreadsheetId = SPREADSHEET_ID).execute()
urlParts = urllib.parse.urlparse(result['spreadsheetUrl'])
path = re.sub("\/edit$", '/export', urlParts.path)
urlParts = urlParts._replace(path=path)
headers = {
  'Authorization': 'Bearer ' + creds.access_token,
}
for sheet in result['sheets']:
  params = {
    'id': SPREADSHEET_ID,
    'format': 'csv',
    'gid': sheet['properties']['sheetId'],
  }
  queryParams = urllib.parse.urlencode(params)
  urlParts = urlParts._replace(query=queryParams)
  url = urllib.parse.urlunparse(urlParts)
  response = requests.get(url, headers = headers)
  filePath = '/tmp/foo-%s.csv' % (+ params['gid'])
  with open(filePath, 'wb') as csvFile:
    csvFile.write(response.content)

Peter Haight
  • 1,906
  • 14
  • 19
  • Could you please add sample data for credentials.json and client_secret.json files. That will be helpful for someone like me. – Jagath Aug 29 '18 at 12:28
  • 1
    @Jagath I know some time has passed but maybe this will help people in the long run: you can find some guidance [in here](https://developers.google.com/identity/protocols/OAuth2ServiceAccount) – Natan Streppel Jun 19 '19 at 16:32
  • "gid" query parameter doesn't work - it redirects to 404 page – Ivan Gusev Jun 04 '21 at 09:58
  • It looks like Google changed the export URL a little bit. I've updated the example above to work with the change. The above example now works for me with Python 3.9.5. – Peter Haight Jun 05 '21 at 19:27
10

The exportLinks collection doesn't expose the CSV format as this will only export the first worksheet of a spreadsheet. If retrieving the first worksheet as a CSV is the behavior you are looking for, you can build the link manually and set the ?exportFormat= query parameter to ?exportFormat=csv.

Alain
  • 6,044
  • 21
  • 27
7

Here's an implementation of Alain's suggestion that works for me:

downloadUrl = entry.get('exportLinks')['application/pdf']
# Strip "=pdf" and replace with "=csv"
downloadUrl = downloadUrl[:-4] + "=csv"

resp, content = drive_service._http.request(downloadUrl)
ryanarn
  • 79
  • 1
5

Not sure if it's what the OP needed, but in the new Google Sheets version it seems that it became a little hard to hot link a csv version of your spreadsheet.

In case you are interested in a Google apps script that will export all sheets in a spreadsheet to individual csv files (instead of downloading each one individually), Here you go:

function onOpen() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var csvMenuEntries = [{name: "export as csv files", functionName: "saveAsCSV"}];
  ss.addMenu("csv", csvMenuEntries);
};

function saveAsCSV() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheets = ss.getSheets();
  // create a folder from the name of the spreadsheet
  var folder = DocsList.createFolder(ss.getName().toLowerCase().replace(/ /g,'_') + '_csv_' + new Date().getTime());
  for (var i = 0 ; i < sheets.length ; i++) {
    var sheet = sheets[i];
    // append ".csv" extension to the sheet name
    fileName = sheet.getName() + ".csv";
    // convert all available sheet data to csv format
    var csvFile = convertRangeToCsvFile_(fileName, sheet);
    // create a file in the Docs List with the given name and the csv data
    folder.createFile(fileName, csvFile);
  }
  Browser.msgBox('Files are waiting in a folder named ' + folder.getName());
}

function convertRangeToCsvFile_(csvFileName, sheet) {
  // get available data range in the spreadsheet
  var activeRange = sheet.getDataRange();
  try {
    var data = activeRange.getValues();
    var csvFile = undefined;

    // loop through the data in the range and build a string with the csv data
    if (data.length > 1) {
      var csv = "";
      for (var row = 0; row < data.length; row++) {
        for (var col = 0; col < data[row].length; col++) {
          if (data[row][col].toString().indexOf(",") != -1) {
            data[row][col] = "\"" + data[row][col] + "\"";
          }
        }

        // join each row's columns
        // add a carriage return to end of each row, except for the last one
        if (row < data.length-1) {
          csv += data[row].join(",") + "\r\n";
        }
        else {
          csv += data[row];
        }
      }
      csvFile = csv;
    }
    return csvFile;
  }
  catch(err) {
    Logger.log(err);
    Browser.msgBox(err);
  }
}

Note: This script uses the DocsList.createFile() method, which is only available for Google Apps accounts.

If you need further explanation, go here: http://drzon.net/export-all-google-sheets-to-csv/

m0meni
  • 16,006
  • 16
  • 82
  • 141
Michael
  • 22,196
  • 33
  • 132
  • 187
4

(Jul 2016) This question is phrased correctly, but in essence is a duplicate of another thread (Download a spreadsheet from Google Docs using Python). While some of the previous answers to this question below may still work (although answers are in JS/Apps Script not Python), a new Drive API version (v3) and new Sheets API version (v4) make them slightly outdated although the previous versions of both have not been deprecated (yet). Modern Google API access occurs using API keys or OAuth2 authorization, primarily with the Google APIs Client Libraries, including the one for Python.

To perform the task requested in/by the OP, you would perhaps query for specific Sheets to download, then perform the actual export(s) with the Drive API. Since this is likely a common operation, I wrote a blogpost sharing a code snippet that does this for you. If you wish to pursue exporting further, I've got another pair of posts along with a video that outlines how to upload files to and download files from Google Drive.

Note that there is also a Google Sheets API, but it's primarily for spreadsheet-oriented operations, i.e., inserting data, reading spreadsheet rows, cell formatting, creating charts, adding pivot tables, etc., not file-based requests like exporting where the Drive API is the correct one to use.

Community
  • 1
  • 1
wescpy
  • 10,689
  • 3
  • 54
  • 53
3

Note, as of April 2015 DocsList was depreciated, and has been replaced by DriveApp. Many of the DriveApp methods are identical to DocsList. So, in many cases, you can simply replace DocsList with DriveApp. So replace DocsList.createFile() with DriveApp.createFile()

How to update DocsList to DriveApp in my code

Community
  • 1
  • 1