25

I searched similar questions about reading csv from URL but I could not find a way to read csv file from google drive csv file.

My attempt:

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
dfs = pd.read_html(url)

How can we read this file in pandas?

Related links:

Nazim Kerimbekov
  • 4,712
  • 8
  • 34
  • 58
BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
  • See how to read a public shared file and save it to local disk. https://stackoverflow.com/questions/38511444/python-download-files-from-google-drive-using-url. Once the file is saved you can use `read_csv` – balderman Jun 15 '19 at 15:31

10 Answers10

45

This worked for me

import pandas as pd
url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
url='https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(url)
BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
  • 2
    The example of how to pull out the download URL was very useful. Brilliant answer. – David Parks Jan 27 '20 at 17:54
  • 4
    I think this probably a great example, but when the file is large (like the 565MB file I'm trying to download) it downloads the HTML of the page telling you that the file is too large to virus scan and asking you if you want to download it (rather than just downloading the file) – Richard D May 07 '20 at 22:37
20

To read CSV file from google drive you can do that.

import pandas as pd

url = 'https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)

I think this is the easiest way to read CSV files from google drive. hope your "Anyone with the link" option enables in google drive.

Samir Mughal
  • 255
  • 2
  • 7
9

Using pandas

import pandas as pd

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=url.split('/')[-2]
dwn_url='https://drive.google.com/uc?id=' + file_id
df = pd.read_csv(dwn_url)
print(df.head())

Using pandas and requests

import pandas as pd
import requests
from io import StringIO

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'

file_id = url.split('/')[-2]
dwn_url='https://drive.google.com/uc?export=download&id=' + file_id
url2 = requests.get(dwn_url).text
csv_raw = StringIO(url2)
df = pd.read_csv(csv_raw)
print(df.head())

output

      sex   age state  cheq_balance  savings_balance  credit_score  special_offer
0  Female  10.0    FL       7342.26          5482.87           774           True
1  Female  14.0    CA        870.39         11823.74           770           True
2    Male   0.0    TX       3282.34          8564.79           605           True
3  Female  37.0    TX       4645.99         12826.76           608           True
4    Male   NaN    FL           NaN          3493.08           551          False
BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
5

I would recommend you using the following code:

import pandas as pd
import requests
from io import StringIO

url = requests.get('https://doc-0g-78-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/5otus4mg51j69f99n47jgs0t374r46u3/1560607200000/09837260612050622056/*/0B6GhBwm5vaB2ekdlZW5WZnppb28?e=download')
csv_raw = StringIO(url.text)
dfs = pd.read_csv(csv_raw)

hope this helps

Nazim Kerimbekov
  • 4,712
  • 8
  • 34
  • 58
  • @astro123 Hey, I got the correct URL using a package intercept tool (in my case [Burp Suite](https://portswigger.net/burp)). I've just clicked on the download button (on the google drive website) and found the request which directly sent the csv file. – Nazim Kerimbekov Jun 15 '19 at 15:42
4

Simply change de URL from Google Drive using uc?id=, and then pass it to the read_csv function. In this example:

url = 'https://drive.google.com/uc?id=0B6GhBwm5vaB2ekdlZW5WZnppb28'
dfs = pd.read_csv(url)
rusiano
  • 364
  • 2
  • 11
2

The other answers are great for reading a publicly accessible file but, if trying to read a private file that has been shared with an email account, you may want to consider using PyDrive.

There are many ways to authenticate (OAuth, using a GCP service account, etc). Once authenticated, reading a CSV can be as simple as getting the file ID and fetching its contents:

from io import StringIO

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

# Assuming authentication has been performed and stored in a variable called gauth
drive = GoogleDrive(gauth)
params = {
    'q': f"id='{file_id}' = id and mimeType='text/csv'"
}
# List all files that satisfy the query
file_list = drive.ListFile(params).GetList()

gdrive_csv_file = file_list[0]
input_csv = StringIO(gdrive_csv_file.GetContentString())
    
df = pd.read_csv(input_csv)
arredond
  • 569
  • 6
  • 8
1

Google has updated the part of the query string URL now (usp=share_link). The following code works now:

import pandas as pd
url="https://drive.google.com/file/d/1a7qwzU2mbaJPkFQZMJCkdE37Ne2DbgHA/view?usp=share_link"
reconstructed_url='https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(reconstructed_url)
df
BhaskarT
  • 27
  • 2
0

Here is similar implementation using R

library(tidyverse)

url='https://drive.google.com/file/d/0B6GhBwm5vaB2ekdlZW5WZnppb28/view?usp=sharing'
file_id=nth(strsplit(url, split = "/")[[1]], -2)
dwn_url=paste('https://drive.google.com/uc?id=',file_id,sep = "")
df = read_csv(dwn_url)

head(df)
Ben Allen
  • 77
  • 6
0

In case you're using Google Colab you can add file to your Drive and type (default folder names):

df = pd.read_csv('/content/drive/MyDrive/.../your_file.csv')
toribicks
  • 31
  • 4
0

If you are using google colab as notebook you can directly mount the drive and then copy the path of file:

    df = pd.read_csv('/content/drive/MyDrive/Dataset/dataset.csv')
    df.head()
ouflak
  • 2,458
  • 10
  • 44
  • 49