I have a 27GB CSV file and I simply want to rename the columns in its header row. Can I do this without reading the entire file into a DataFrame and then writing the entire file out again?

This is essentially what I want to do, but without re-writing the whole 27GB file.

data = pd.read_csv(filename,sep="|",nrows=2)
data.head()

   LOC_ID           UPC      FW  BOP_U  BOP_$
0      17  438531560821  201712      1   40.0
1     239  438550152328  201719      2   28.8


data.columns = ['WHSE','SKU','PERIOD','QUANTITYONHAND','DOLLARSONHAND']
data.head()


   WHSE           SKU  PERIOD  QUANTITYONHAND  DOLLARSONHAND
0    17  438531560821  201712               1           40.0
1   239  438550152328  201719               2           28.8
John Rider

1 Answer

To read just the header, limit the read to a single row with nrows.

import pandas as pd
header_df = pd.read_csv('my_file.csv', sep='|', nrows=1)

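As for rewriting the file, I don't think you can avoid processing the entire file. The new header line has a different byte length than the old one, so everything after it has to shift, and that means writing the data out again.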
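That said, rewriting does not have to mean parsing. Here is a minimal sketch, assuming the header is exactly the first line of the file and the data is pipe-delimited as in the question. The function name `replace_header` and the file names in the usage note are hypothetical. It writes the new header to a second file and then streams the remaining bytes across in chunks, so memory use stays small and pandas is never involved.

```python
import shutil


def replace_header(src_path, dst_path, new_columns, sep="|"):
    """Copy src_path to dst_path, replacing the first line with new_columns.

    Hypothetical helper: assumes the header occupies exactly one line.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        old_header = src.readline()  # consume the old header line
        # Keep the file's existing line-ending style.
        newline = b"\r\n" if old_header.endswith(b"\r\n") else b"\n"
        dst.write(sep.join(new_columns).encode("utf-8") + newline)
        # Stream the rest of the file in 16 MiB chunks without parsing it.
        shutil.copyfileobj(src, dst, length=16 * 1024 * 1024)
```

You would call it as `replace_header('my_file.csv', 'my_file_renamed.csv', ['WHSE', 'SKU', 'PERIOD', 'QUANTITYONHAND', 'DOLLARSONHAND'])`. This still reads and writes all 27GB once, but it is a plain byte copy, so it should be limited by disk speed rather than by CSV parsing.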

miradulo