1

I want to read google sheet with multiple sheets into a (or several) pandas dataframe.

I don't know the sheet names, or the number of sheets in advance.


The trivial attempt fails:


def main():
    path = r"https://docs.google.com/spreadsheets/d/1-MlSisrAxhOyKhrz6S08PG68j667Ym7jGExOyytpCSM/edit?usp=sharing"

    pd.read_excel(path)

fails with

ValueError: Excel file format cannot be determined, you must specify an engine manually.

Trying any format doesn't work.


All answers to this question refer to .csv, meaning a single sheet, or knowing the sheet name in advance.

Same goes for the 1st Google hit for "read google sheet python pandas".

Is there a standard way of doing this?

Gulzar
  • 23,452
  • 27
  • 113
  • 201
  • Your error says `Excel file format cannot be determined` - are you sure you are reading speradsheets and not excel files? – ziganotschka Feb 28 '22 at 08:47

1 Answers1

0

When your Spreadsheet is publicly shared, in your situation, how about the following sample script?

Sample script:

import openpyxl
import pandas as pd
import requests
from io import BytesIO

spreadsheetId = "###" # Please set your Spreadsheet ID.
url = "https://docs.google.com/spreadsheets/export?exportFormat=xlsx&id=" + spreadsheetId
res = requests.get(url)
data = BytesIO(res.content)
xlsx = openpyxl.load_workbook(filename=data)
for name in xlsx.sheetnames:
    values = pd.read_excel(data, sheet_name=name)

    # do something

  • In this sample script, the publicly shared Spreadsheet is exported as a XLSX data. And, the exported XLSX data is opened, the sheet names are retrieved. And then, each sheet is put into the dataframe.

  • If you want to retrieve the specific sheets, please filter the sheet names from xlsx.sheetnames.

Note:

  • If your Spreadsheet is not publicly shared, this thread might be useful. Ref
Tanaike
  • 181,128
  • 11
  • 97
  • 165