
I have a CSV file (110 MB) in Google Drive that I want to read in Python.

I got its link with the Share - Get Link command.

I tried this in Python 3:

import pandas as pd
import requests
from io import StringIO

orig_url='https://drive.google.com/file/d/1jfLj0k_0BaNuYYZk0Q1IlcgWVvCxb7Yf/view?usp=sharing'

file_id = orig_url.split('/')[-2]
dwn_url='https://drive.google.com/uc?export=download&id=' + file_id
url = requests.get(dwn_url).text
csv_raw = StringIO(url)
df = pd.read_csv(csv_raw)

But the DataFrame contained an HTML page instead of the CSV data:

df.head()
    <!DOCTYPE html><html><head><title>Google Drive - Virus scan warning</title><meta http-equiv="content-type" content="text/html; charset=utf-8"/><link href=&#47;static&#47;doclist&#47;client&#47;css&#47;148676949&#45;untrustedcontent.css rel="stylesheet"><link rel="icon" href="https://ssl.gstatic.com/docs/doclist/images/infinite_arrow_favicon_4.ico"/><style nonce="OQ9CeC6eq/6HZpt+pcTCOw">#gbar  #guser{font-size:13px;padding-top:0px !important;}#gbar{height:22px}#guser{padding-bottom:7px !important;text-align:right}.gbh  .gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh{height:0;position:absolute;top:24px;width:100%}@media all{.gb1{height:22px;margin-right:.5em;vertical-align:top}#gbar{float:left}}a.gb1    a.gb4{text-decoration:underline !important}a.gb1    a.gb4{color:#00c !important}.gbi .gb4{color:#dd8e27 !important}.gbf .gb4{color:#900 !important}
0   </style><script nonce="OQ9CeC6eq/6HZpt+pcTCOw"...   NaN     NaN     NaN     NaN

Please, does anyone know a better strategy for reading the CSV?

Reinaldo Chaves
  • does this answer your question https://stackoverflow.com/questions/48596521/how-to-read-data-from-google-drive-using-colaboratory-google – badger Nov 07 '20 at 20:23
  • Thanks @hatefAlipoor . But I don't want to use Colab. Is there an alternative way? – Reinaldo Chaves Nov 07 '20 at 20:27
  • maybe this https://stackoverflow.com/questions/25010369/wget-curl-large-file-from-google-drive/39225039#39225039 – badger Nov 07 '20 at 20:28
  • For example, if the file is publicly shared and an API key can be used, the script becomes simple. How about this? If this is not the direction you expect, I apologize. – Tanaike Nov 08 '20 at 00:04
  • Is it possible to change the location from which you want to load the data? – squeezer44 Nov 08 '20 at 08:37
  • Thanks @Tanaike But I can't even open the file in Sheets to publish it. It's a big file, I can only upload it – Reinaldo Chaves Nov 08 '20 at 12:41
  • Yes @squeezer44 , I can change the location. But I know the commands for doing this in Google Drive. I haven't done yet on other platforms – Reinaldo Chaves Nov 08 '20 at 12:42
  • Thank you for replying, and I apologize for my poor English. I wanted to ask whether the file `1jfLj0k_0BaNuYYZk0Q1IlcgWVvCxb7Yf` has already been publicly shared. If so, you can download it using the API, and in that case the script will be simple. – Tanaike Nov 08 '20 at 22:19

1 Answer


I found two options that solve this. My requirement was that they must work without moving the file out of Google Drive:

  1. Python Module gdown
  2. How to Use Google Drive API in Python: Download files
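
Both options can be sketched roughly as follows. The attempt in the question fails because, for a file this large, the download URL returns the "Virus scan warning" HTML page rather than the CSV. As far as I know, gdown deals with that confirmation step, and the Drive API v3 `alt=media` endpoint returns the file content of a publicly shared file when given an API key. The function names, the `data.csv` output path and the `api_key` argument are my own placeholders, not anything from the question, and the file must be shared with "anyone with the link".

```python
import re


def extract_file_id(share_url):
    """Pull the file ID out of a '.../file/d/<id>/view' share link."""
    match = re.search(r"/file/d/([\w-]+)", share_url)
    if match is None:
        raise ValueError(f"no Drive file ID found in {share_url!r}")
    return match.group(1)


def read_csv_with_gdown(share_url, output="data.csv"):
    """Option 1: download with gdown, then read the local copy."""
    # Imported lazily so extract_file_id works without these packages.
    import gdown  # third-party: pip install gdown
    import pandas as pd

    file_id = extract_file_id(share_url)
    path = gdown.download(
        url=f"https://drive.google.com/uc?id={file_id}",
        output=output,
        quiet=False,
    )
    if path is None:  # gdown returns None when the download fails
        raise RuntimeError("gdown could not download the file")
    return pd.read_csv(path)


def api_media_url(file_id, api_key):
    """Option 2: Drive API v3 URL that returns the file content."""
    return (
        f"https://www.googleapis.com/drive/v3/files/{file_id}"
        f"?alt=media&key={api_key}"
    )


def read_csv_with_api(share_url, api_key, output="data.csv"):
    """Stream the file to disk through the Drive API, then read it."""
    import pandas as pd
    import requests

    url = api_media_url(extract_file_id(share_url), api_key)
    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        with open(output, "wb") as fh:
            # Write in 1 MiB chunks so the 110 MB file never sits in memory.
            for chunk in resp.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
    return pd.read_csv(output)


# Example call (needs network access and the packages above):
# df = read_csv_with_gdown(
#     "https://drive.google.com/file/d/1jfLj0k_0BaNuYYZk0Q1IlcgWVvCxb7Yf/view?usp=sharing"
# )
```

Saving to disk first, rather than wrapping the response text in `StringIO`, also makes it easy to open the downloaded file and check that it really is CSV before pandas parses it.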
squeezer44