As an absolute novice to Python, I relied on this thread to build a script that reads CSV data. The file content looked like this:
1,2,3
The file was created in MS Excel and edited in Notepad++.
The code used to read it came in two variants:
Variant 1:
import pandas as pd
url='https://drive.google.com/file/d/1WgaA_dIHYm3ogCUE4WDu1ocwgQSZ1Wc7/view?usp=sharing'
file_id = url.split('/')[-2]
dwn_url='https://drive.google.com/uc?id=' + file_id
df = pd.read_csv(dwn_url)
print(df.head())
Variant 2:
import pandas as pd
import requests
from io import StringIO
url='https://drive.google.com/file/d/1WgaA_dIHYm3ogCUE4WDu1ocwgQSZ1Wc7/view?usp=sharing'
file_id = url.split('/')[-2]
dwn_url='https://drive.google.com/uc?export=download&id=' + file_id
url2 = requests.get(dwn_url).text
csv_raw = StringIO(url2)
df = pd.read_csv(csv_raw)
print(df.head())
Both variants raise the same well-known error:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 90 fields in line 3, saw 217
Using read_csv(dwn_url, on_bad_lines='skip') instead, I got the following output:
<!doctype html><html lang="en-US" dir="ltr"><head><base href="https://accounts.google.com/v3/signin/"><meta name="referrer" content="origin"><link rel="canonical" href="https://accounts.google.com/v3/signin/identifier"><meta name="viewport" content="width=device-width ... 0.149);border-radius:2px;bottom:0;content:"";left:0;position:absolute;right:0;top:0;z-index:-1}.JVMrYb{display:block}.hJIRO{display:none}.sQecwc{display:hidden}sentinel{}
0 /*# sourceURL=/_/mss/boq-identity/_/ss/k=boq-i... ... NaN
1 Copyright The Closure Library Authors. ... NaN
2 SPDX-License-Identifier: Apache-2.0 ... NaN
3 */ ... NaN
4 'use strict';var d=function(a){var b=0;return ... ... NaN
[5 rows x 90 columns]
Most importantly, I get this same output when I import a completely empty CSV, which makes me believe that pandas is reading some technical data in the CSV (and so gets the wrong column count from the very beginning).
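One way to test that belief would be to look at the downloaded text before pandas parses it. The output above starts with `<!doctype html>` and mentions accounts.google.com, so it may be that an HTML sign-in page is being returned rather than the file itself. Below is a minimal sketch of such a check. The helper names `looks_like_html` and `read_csv_text` are hypothetical, and `header=None` is my assumption because the file has no header row.

```python
from io import StringIO

import pandas as pd


def looks_like_html(text: str) -> bool:
    """Return True if the payload starts like an HTML page rather than CSV."""
    head = text.lstrip().lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def read_csv_text(text: str) -> pd.DataFrame:
    """Parse CSV text, refusing payloads that are clearly HTML.

    header=None is an assumption: the sample file (1,2,3) has no header row.
    """
    if looks_like_html(text):
        # Show the first characters so it is obvious what was downloaded.
        raise ValueError("Got an HTML page instead of CSV data: " + text[:60])
    return pd.read_csv(StringIO(text), header=None)
```

In Variant 2 this would be called as `read_csv_text(requests.get(dwn_url).text)`. If it raises, the problem would be in what the URL returns and not in the CSV file.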
Does anyone in the community have an idea how to resolve this issue?