Download/Export Public Google Spreadsheet as TSV from Command Line?

Question

I have a public (published) Google spreadsheet that I’m trying to download programmatically in TSV form.

In my browser, with a Google login active, for some actual key $key, https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=$key&exportFormat=tsv works and produces a TSV file.

In my shell, however:

curl -L "https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=$key&exportFormat=tsv" produces a bunch of javascript.
curl -L "https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=$key&exportFormat=csv" also produces a bunch of javascript.
curl -L "https://docs.google.com/spreadsheet/pub?key=$key&single=true&gid=0&output=csv" works and produces a CSV file.
curl -L "https://docs.google.com/spreadsheet/pub?key=$key&single=true&gid=0&output=tsv" produces an error message.

(Attempts to use wget produced similar results.)

How do I make this work? All the Google documentation I’ve been able to find so far is geared towards much more complicated problems than a simple download and format change, and if the solution to my problem is in there somewhere, I haven’t been able to find it yet.

Looks like an Auth issue. Try logging out of your google account? (Strange that the CSV works and TSV fails.) — eddyparkinson, Jun 18 '14 at 08:14

mdahlman · Answer 1 · 2022-01-21T19:09:11.950

35

I found this to be frustratingly undocumented. I'm sure it's documented somewhere... but I never found it.

The premise is that your Google Sheet is published publicly. This is not intuitive for many folks. (Choose File -> Publish to Web...)

When you publish a sheet, you are given a url like this to copy: https://docs.google.com/spreadsheets/d/1XsfK2TN418FuEstNGG2eI9FmEV-4eY-FnndigHWIhk4/pubhtml

That url is nicely browsable... but it's not the downloadable CSV I wanted. Through a lengthy combination of search and trial-and-error I came up with this:

curl -L "https://docs.google.com/spreadsheets/d/1XsfK2TN418FuEstNGG2eI9FmEV-4eY-FnndigHWIhk4/export?gid=0&format=csv" > ./my_local.csv

Note that the gid=0 is often correct, since it seems to be the default ID for the initial tab. But it's not necessarily correct. See the answer below for more details about the gid.

I find this example to be tremendously helpful. I hope somebody comments with a link to the official docs explaining this in more detail.

edited Jan 21 '22 at 19:09

answered Feb 13 '15 at 07:23

mdahlman

9,204
4
44
72

2

Somehow, if I include `gid=0` then I will get a page not found error, but if I do not include it then I get the result as expected – Sorawee Porncharoenwase Sep 04 '17 at 05:12
By the way, the `gid=0` still works well for me. But the upvotes on that comment imply that it's important to remove it sometimes. I would update the answer if I knew when it was optional/mandatory/forbidden. – mdahlman Sep 25 '19 at 15:01
For GIDs, please see [my answer](https://stackoverflow.com/a/59785528/1667018) below. – Vedran Šego Jan 17 '20 at 10:35
2

If you get HTML back, you need to add the `-L` flag to curl (`curl -L "https://......"`) – Michael Oct 25 '20 at 21:35
Thanks a lot @Michael that was the missing bit :) – Karussell Jan 20 '22 at 09:20

score 8 · Answer 2 · answered Jan 25 '17 at 09:43

I can download through the shell in this way:

File => Publish to Web
Choose a Sheet and the format do you want to download.
Click on Publish
Copy the link

and then use it:

wget -O ./filename.csv "LINK"

or

curl -L "LINK" > ./filename.csv

in my case it worked as expected.

Plus I think that it publish all the formats so you can choose what to download changing the last part of the URL without un-publish and re-publish it:

output=tsv
output=csv

this is by far the easiest! – Maarten-vd-Sande Oct 27 '20 at 12:28 — Maarten-vd-Sande, Oct 27 '20 at 12:28

Vedran Šego · Answer 3 · 2020-01-17T10:34:56.087

To add to the answer written by @mdahlman: there is a gid=<value> argument that lets you chose the sheet to view (as CSV and TSV support the viewing of just one sheet). This is a sheet ID and you can pick it up from the URL of each sheet.

So, to get a CSV/TSV publish link, do this:

Publish the document to get a URL like https://docs.google.com/spreadsheets/d/e/{key}/pub?output=tsv.
Then for each spreadsheet:
1. Click on it.
2. View its URL in your browser's address bar. It'll end with edit#gid={gid}. That's what you want.
3. Make your URL from the one in step 1. and gid in 2.2.: https://docs.google.com/spreadsheets/d/e/{key}/pub?output=tsv&gid={gid}.

GIDs don't go in sequence (0, 1, 2,...). They are long numbers (9 digits for me), seemingly in no straight order or anything, so they're really more like sheet keys than what one would expect as an "id".

In my document, one of the GIDs was zero. I am assuming it's some sort of a default or a first created sheet. That explains why gid=0 worked for some people above, yet produced an error for others (those who don't have a sheet with such GID... they have possibly deleted it or something).

I get a "Temporary Redirect" when doing "curl {publushedURL+correctGID} Also, I am able to download the file using the link provided after publishing the file to the web as a CSV — Manthan_Admane, Jul 09 '20 at 21:19
@Manthan_Admane, I just tried it with `&gid=...` and it with works as expected for me. As for downloading "using the link provided after publishing the file to the web as a CSV", that works if you have a single sheet. If you have multiple, you need `gid` to pick the right sheet, as CSV supports only one. — Vedran Šego, Jul 10 '20 at 22:27
@Manthan_Admane, make sure you allow redirects. In case of curl it's: `curl -L` — bas080, Feb 22 '22 at 22:10

score 4 · Answer 4 · answered Sep 11 '22 at 12:24

This is the parameters list I have founded along the years:

&format=pdf                   //export format
&size=a4                      //A3/A4/A5/B4/B5/letter/tabloid/legal/statement/executive/folio
&portrait=false               //true= Potrait / false= Landscape
&scale=1                      //1= Normal 100% / 2= Fit to width / 3= Fit to height / 4= Fit to Page
&top_margin=0.00              //All four margins must be set!
&bottom_margin=0.00           //All four margins must be set!
&left_margin=0.00             //All four margins must be set!
&right_margin=0.00            //All four margins must be set!
&gridlines=false              //true/false
&printnotes=false             //true/false
&pageorder=2                  //1= Down, then over / 2= Over, then down
&horizontal_alignment=CENTER  //LEFT/CENTER/RIGHT
&vertical_alignment=TOP       //TOP/MIDDLE/BOTTOM
&printtitle=false             //true/false
&sheetnames=false             //true/false
&fzr=false                    //true/false
&fzc=false                    //true/false
&attachment=false             //true/false

//FORMATS WITH NO ADDITIONAL OPTIONS
  //format=xlsx       //excel
  //format=ods        //Open Document Spreadsheet
  //format=zip        //html zipped          

  //CSV,TSV OPTIONS***********
  //format=csv        // comma seperated values
  //             tsv        // tab seperated values
  //gid=sheetId             // the sheetID you want to export, The first sheet will be 0. others will have a uniqe ID

  // PDF OPTIONS****************
  //format=pdf     
  //size=0,1,2..10             paper size. 0=letter, 1=tabloid, 2=Legal, 3=statement, 4=executive, 5=folio, 6=A3, 7=A4, 8=A5, 9=B4, 10=B5  
  //fzr=true/false             repeat row headers
  //portrait=true/false        false =  landscape
  //fitw=true/false            fit window or actual size
  //gridlines=true/false
  //printtitle=true/false
  //pagenum=CENTER/UNDEFINED      CENTER = show page numbers / UNDEFINED = do not show
  //attachment = true/false      dunno? Leave this as true
  //gid=sheetId                 Sheet Id if you want a specific sheet. The first sheet will be 0. others will have a uniqe ID. 
                               // Leave this off for all sheets. 
  // EXPORT RANGE OPTIONS FOR PDF
  //need all the below to export a range
  //gid=sheetId                must be included. The first sheet will be 0. others will have a uniqe ID
  //ir=false                   seems to be always false
  //ic=false                   same as ir
  //r1=Start Row number - 1        row 1 would be 0 , row 15 wold be 14
  //c1=Start Column number - 1     column 1 would be 0, column 8 would be 7   
  //r2=End Row number
  //c2=End Column number

score 1 · Answer 5 · edited Aug 12 '21 at 19:29

1

My answer is about how to find the answer.

In Chrome browser, navigate to your google document.

In the upper right corner of the browser, go to the three dots->more tools-> developer tools

This will bring up the html... debugger.

At the top of the debugger window, select network.

Now in your document, initiate the download as that you're trying to automate.

In the debugger, it'll show you any web requests that are made. The first new one is probably what you want.

You should be able to right click->copy-> copy link address

The url includes an ID. I don't know what it's for, but curl was able to download the doc without.

Hope it's helpful.

edited Aug 12 '21 at 19:29

shim

9,289
12
69
108

answered Jan 20 '17 at 09:12

mmccoo

8,386
5
41
60

Nice approach. There are too many links in the network tab. I seem to have found the one I want- because I tried that on the browser and initiated the download. but I can't make a "curl {LINK}" to download. It says that the file has been moved. – Manthan_Admane Jul 09 '20 at 21:06

score 0 · Answer 6 · answered Jan 29 '23 at 18:18

0

here's a tiny command line tool that can do this (CSV, not TSV; maybe close enough): https://www.npmjs.com/package/gsheet-cmd

answered Jan 29 '23 at 18:18

Sebastian Rothbucher

1,423
10
8

Download/Export Public Google Spreadsheet as TSV from Command Line?

6 Answers6

Linked