additional column when saving pandas data frame to csv file

Question

Below are the raw input CSV file, the code I use to process and save it with pandas on Python 2.7, and the output CSV file. Why is there an additional column at the beginning of the saved file? Thanks.

c_a,c_b,c_c,c_d
hello,python,pandas,0.0
hi,java,pandas,1.0
ho,c++,numpy,0.0

import pandas as pd

sample = pd.read_csv('123.csv', header=None, skiprows=1,
    dtype={0:str, 1:str, 2:str, 3:float})
sample.columns = pd.Index(data=['c_a', 'c_b', 'c_c', 'c_d'])
sample['c_d'] = sample['c_d'].astype('int64')
sample.to_csv('saved.csv')

Here is the saved file. It has an additional column at the beginning, whose values are 0, 1, 2.

cat saved.csv
,c_a,c_b,c_c,c_d
0,hello,python,pandas,0
1,hi,java,pandas,1
2,ho,c++,numpy,0

Set `index=False` as parameter in `to_csv` if you don't want the column. http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html — khammel, Aug 27 '16 at 19:42
It is the index of the DataFrame. See [here](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html) how to disable it. — Ami Tavory, Aug 27 '16 at 19:43
Possible duplicate of [How to avoid Python/Pandas creating an index in a saved csv?](http://stackoverflow.com/questions/20845213/how-to-avoid-python-pandas-creating-an-index-in-a-saved-csv) — Merlin, Aug 27 '16 at 19:51

Juan David · Accepted Answer · 2016-08-27T19:58:32.727

The additional column is the index of the DataFrame, which pandas creates when you read the CSV file. You can use this index to slice, select, or sort your DataFrame efficiently.

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.html

http://pandas.pydata.org/pandas-docs/stable/indexing.html
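As a rough illustration of what the index is for, here is a minimal sketch. It assumes Python 3 and reads the question's sample data from an in-memory string (via `io.StringIO`) instead of `123.csv`; the variable names `csv_text`, `subset` and `reversed_rows` are only illustrative.

```python
import io

import pandas as pd

# The question's input data, held in memory so the sketch is self-contained.
csv_text = """c_a,c_b,c_c,c_d
hello,python,pandas,0.0
hi,java,pandas,1.0
ho,c++,numpy,0.0
"""

sample = pd.read_csv(io.StringIO(csv_text))

# No index column was requested, so pandas builds a default integer index.
index_labels = list(sample.index)

# Label-based selection with .loc; both endpoints of the slice are included.
subset = sample.loc[1:2]

# Sort the rows by index label, in descending order.
reversed_rows = sample.sort_index(ascending=False)

print(index_labels)
print(subset)
print(reversed_rows)
```

The default index is just the row position, which is why the saved file shows 0, 1, 2 in its first column.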

If you don't want this index in the output, pass `index=False` to the DataFrame's `to_csv` method when you save it. Also, you are skipping the header row and adding the column names back later; you can let `read_csv` use the header of the CSV file and avoid this step.

import pandas as pd

sample = pd.read_csv('123.csv', dtype={'c_a': str, 'c_b': str, 'c_c': str, 'c_d': float})
sample.to_csv('output.csv', index=False)
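To see the difference without touching the disk, here is a small sketch, assuming Python 3. It builds a reduced two-column version of the question's data by hand and relies on `to_csv` returning a string when no path is given. The names `with_index`, `without_index` and `restored` are only illustrative.

```python
import io

import pandas as pd

# A reduced, hand-built version of the question's data.
sample = pd.DataFrame({
    'c_a': ['hello', 'hi', 'ho'],
    'c_d': [0, 1, 0],
})

# With no path argument, to_csv returns the CSV text as a string.
with_index = sample.to_csv()                # default: index=True
without_index = sample.to_csv(index=False)  # index column omitted

print(with_index)
print(without_index)

# If a file was already saved with the index, index_col=0 tells read_csv
# to treat the first column as the index instead of as data.
restored = pd.read_csv(io.StringIO(with_index), index_col=0)
print(list(restored.columns))
```

The first printed block starts with an empty header cell followed by the index values, which matches the extra column in the question; the second has only the data columns.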

Hope it helps :)

It was a pleasure. Have a nice day @LinMa — Juan David, Aug 27 '16 at 19:59
