I have a CSV file (110 MB) in Google Drive that I want to read in Python.
I got its link with the Share - Get Link command.
Then I tried this in Python 3:
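import pandas as pd
import requests
from io import StringIO

# Sharing link copied from Google Drive
orig_url = 'https://drive.google.com/file/d/1jfLj0k_0BaNuYYZk0Q1IlcgWVvCxb7Yf/view?usp=sharing'
# The file ID is the second-to-last path segment of the sharing link
file_id = orig_url.split('/')[-2]
dwn_url = 'https://drive.google.com/uc?export=download&id=' + file_id
# Fetch the response body as text and parse it as CSV
text = requests.get(dwn_url).text
csv_raw = StringIO(text)
df = pd.read_csv(csv_raw)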
But the resulting DataFrame contains HTML instead of the CSV data:
df.head()
<!DOCTYPE html><html><head><title>Google Drive - Virus scan warning</title><meta http-equiv="content-type" content="text/html; charset=utf-8"/><link href=/static/doclist/client/css/148676949-untrustedcontent.css rel="stylesheet"><link rel="icon" href="https://ssl.gstatic.com/docs/doclist/images/infinite_arrow_favicon_4.ico"/><style nonce="OQ9CeC6eq/6HZpt+pcTCOw">#gbar #guser{font-size:13px;padding-top:0px !important;}#gbar{height:22px}#guser{padding-bottom:7px !important;text-align:right}.gbh .gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh{height:0;position:absolute;top:24px;width:100%}@media all{.gb1{height:22px;margin-right:.5em;vertical-align:top}#gbar{float:left}}a.gb1 a.gb4{text-decoration:underline !important}a.gb1 a.gb4{color:#00c !important}.gbi .gb4{color:#dd8e27 !important}.gbf .gb4{color:#900 !important}
0 </style><script nonce="OQ9CeC6eq/6HZpt+pcTCOw"... NaN NaN NaN NaN
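The column header in that output is an HTML page titled "Google Drive - Virus scan warning", so the request seems to have returned a warning page rather than the file itself. As a minimal sketch of how I could at least catch this before parsing, assuming that a body starting with an HTML doctype or `<html>` tag is never valid CSV for my data (the helper names `looks_like_html` and `ensure_csv_text` are my own, not part of pandas or requests):

```python
def looks_like_html(text: str) -> bool:
    """Return True if the text starts like an HTML page rather than CSV data."""
    # Only inspect the beginning; a 110 MB body does not need a full scan.
    head = text[:200].lstrip().lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def ensure_csv_text(text: str) -> str:
    """Return the text unchanged, or raise if it is evidently an HTML page."""
    if looks_like_html(text):
        raise ValueError("Got an HTML page instead of CSV data")
    return text
```

In my code this would wrap the response body, for example `StringIO(ensure_csv_text(requests.get(dwn_url).text))`. It only detects the problem; it does not download the real file.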
Please, does anyone know a better strategy for reading the CSV?