1

I'm parsing a publicly published Google Spreadsheet CSV in Python. The main spreadsheet URL is like https://docs.google.com/spreadsheets/d/MyL0NgSpr4dsheetK3y3x4mpl3 , with MyL0NgSpr4dsheetK3y3x4mpl3 being the public spreadsheet key.

The main spreadsheet has several sub sheets, or tabs, if you prefer.

Before, to get all the sheets URLs (along with their titles), without using any Google API I used to extract all the info from the XML export format URL, so I was doing that in Python:

xml_url: str = f"https://spreadsheets.google.com/feeds/worksheets/{spreadsheet_key}/public/full"
file = urllib.request.urlopen(xml_url)
raw_data: str = file.read()
file.close()
data: dict = xmltodict.parse(raw_data)
sheets: dict = dict((e["title"]["#text"], e["id"].partition("/full/")[2]) for e in list(data.items())[0][1]["entry"])

It seems that since recently, Google doesn't provide any XML export URL anymore, so this method now returns a 404 when hitting on the XML URL.

Any idea about how to get the sheets URLs (and optionally their titles)? If it's possible without going through the pain of parsing the XML it's even better.

bolino
  • 867
  • 1
  • 10
  • 27
  • 1
    Although I cannot test the script because I cannot see the file ID of the Spreadsheet and the output values you expect, is this thread useful for your situation? https://stackoverflow.com/q/69454931 – Tanaike Oct 06 '21 at 04:20
  • Thanks. It seems that this implies the Google API. My question is about a publicly published spreadsheet, I want to avoid the pain of authenticating to the Google API. Will update my question – bolino Oct 06 '21 at 04:25
  • This response is useful: https://stackoverflow.com/questions/24255472/download-export-public-google-spreadsheet-as-tsv-from-command-line/59785528#59785528 , although it doesn't say how to get all the gid (sheets ids) from the spreadsheet if you don't know them in advance – bolino Oct 06 '21 at 04:34
  • 1
    Thank you for replying. I apologize for my poor English skill. [My proposed answer](https://stackoverflow.com/q/69454931) doesn't use Sheets API v4. Because when Sheets API v4 is used, an API key is required to be used even when the Spreadsheet is publicly shared. So I proposed to use the endpoint of `https://docs.google.com/spreadsheets/d/{spreadsheetId}/pub?output=csv` or `https://docs.google.com/spreadsheets/d/{spreadsheetId}/gviz/tq?tqx=out:csv`. – Tanaike Oct 06 '21 at 04:44
  • 1
    When you want to retrieve the sheet IDs of each sheet, it is required to use Sheets API v4. If you don't want to use Sheets API, how about preparing the wrapped API created by Web Apps of Google Apps Script? If this was not the direction you expect, I apologize. – Tanaike Oct 06 '21 at 04:44
  • About the method without using Sheets API, is this thread useful? https://stackoverflow.com/q/62732791/7108653 – Tanaike Oct 06 '21 at 04:50

0 Answers0