I am trying to write a dataframe to a gzipped csv in python pandas, using the following:
import pandas as pd
import datetime
import csv
import gzip
# Get data (with previous connection and script variables)
df = pd.read_sql_query(script, conn)
# Create today's date, to append to file
todaysdatestring = str(datetime.datetime.today().strftime('%Y%m%d'))
print todaysdatestring
# Create csv with gzip compression
df.to_csv('foo-%s.csv.gz' % todaysdatestring,
sep='|',
header=True,
index=False,
quoting=csv.QUOTE_ALL,
compression='gzip',
quotechar='"',
doublequote=True,
line_terminator='\n')
This just creates a csv called 'foo-YYYYMMDD.csv.gz', not an actual gzip archive.
I've also tried adding this:
#Turn to_csv statement into a variable
d = df.to_csv('foo-%s.csv.gz' % todaysdatestring,
sep='|',
header=True,
index=False,
quoting=csv.QUOTE_ALL,
compression='gzip',
quotechar='"',
doublequote=True,
line_terminator='\n')
# Write above variable to gzip
with gzip.open('foo-%s.csv.gz' % todaysdatestring, 'wb') as output:
output.write(d)
Which fails as well. Any ideas?