I have been trying to download a zipped csv using the requests library from a server host URL.
When I download a smaller, uncompressed file from the same server, the CSV reads in without problems, but with this one I get encoding errors.
I have tried multiple encodings, reading the response directly with pandas read_csv, and opening it as a zip file (at which point I get an error that the file is not a zip file).
I have additionally tried using the zipfile library as suggested here: Reading csv zipped files in python
and have also tried setting both encoding and compression in read_csv.
The code which works for the non-zipped server file is below:
response = requests.get(url, auth=HTTPBasicAuth(un, pw), stream=True, verify = False)
dfs = pd.read_csv(response.raw)
but it raises 'utf-8' codec can't decode byte 0xfd in position 0: invalid start byte
when used for this file.
I have also tried:
request = get(url, auth=HTTPBasicAuth(un, pw), stream=True, verify=False)
zip_file = ZipFile(BytesIO(request.content))
files = zip_file.namelist()
with gzip.open(files[0], 'rb') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(row)
which returns a seek attribute error.
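Since the errors above suggest the payload may not be in the format the file name implies, one way to narrow it down is to inspect the leading "magic" bytes of the downloaded content before choosing a decompressor. The following is a minimal sketch, not a confirmed fix: the helper names sniff_compression and read_csv_rows are hypothetical, it assumes the whole body fits in memory, and it assumes the archive (if it is a zip) holds the CSV as its first member.

```python
import bz2
import csv
import gzip
import io
import lzma
import zipfile

# Leading "magic" bytes of some common compressed formats.
MAGIC = {
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"BZh": "bz2",
    b"\xfd7zXZ\x00": "xz",
}


def sniff_compression(data: bytes) -> str:
    """Guess the compression format from the first bytes (hypothetical helper)."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"


def read_csv_rows(data: bytes, encoding: str = "utf-8") -> list:
    """Decompress in memory according to the sniffed format, then parse as CSV."""
    kind = sniff_compression(data)
    if kind == "zip":
        # Assumption: the CSV is the first member of the archive.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            raw = zf.read(zf.namelist()[0])
    elif kind == "gzip":
        raw = gzip.decompress(data)
    elif kind == "bz2":
        raw = bz2.decompress(data)
    elif kind == "xz":
        raw = lzma.decompress(data)
    else:
        # Unknown signature: treat the bytes as an uncompressed CSV.
        raw = data
    return list(csv.reader(io.StringIO(raw.decode(encoding), newline="")))
```

With requests, the bytes to pass in would be response.content, for example print(sniff_compression(response.content)) followed by read_csv_rows(response.content). A first byte of 0xfd matches the start of the xz signature, so xz is one candidate worth checking, but that is only a guess until the full signature is inspected.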