I am getting an extra column while converting a .csv file to an .xlsx file in Python

Question

The code below converts a .csv file in C:/Path/ into an .xlsx file. However, the resulting .xlsx file contains an extra column that was not in the .csv. How can I get rid of that extra column? Thank you very much.

import os
import pandas as pd

for root, dirs, files in os.walk("C:/Processed_Report/", topdown=False):
    for name in files:
        base_name, ext = os.path.splitext(name)  # Split name, extension
        if ext == ".csv":  # exact match; `ext in ".csv"` is a substring test
            df = pd.read_csv(os.path.join(root, name))
            df.to_excel(os.path.join(root, 'Test.xlsx'))

Try with `index=False` – Mitchell Olislagers Dec 09 '22 at 19:12

score 3 · Accepted Answer · answered Dec 09 '22 at 19:13

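You need to pass index=False as a keyword argument to pandas.DataFrame.to_excel. By default, the DataFrame's index is written to the sheet as the first column, which is the extra column you are seeing.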

Replace this:

df.to_excel(os.path.join(root, 'Test.xlsx'))

With this:

df.to_excel(os.path.join(root, 'Test.xlsx'), index=False)
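
For context, here is a minimal sketch of how the whole loop from the question might look with this fix applied. The function name `convert_csvs_to_xlsx` and the choice to name each output after its source CSV (instead of always writing `Test.xlsx`) are assumptions for illustration, not part of the original code. Writing .xlsx files also assumes an Excel engine such as openpyxl is installed.

```python
import os

import pandas as pd


def convert_csvs_to_xlsx(folder):
    """Convert every .csv under `folder` to an .xlsx file next to it.

    Hypothetical helper: the name and the per-file output naming are
    assumptions, not taken from the question.
    """
    written = []
    for root, dirs, files in os.walk(folder, topdown=False):
        for name in files:
            base_name, ext = os.path.splitext(name)
            if ext.lower() != ".csv":
                continue
            df = pd.read_csv(os.path.join(root, name))
            out_path = os.path.join(root, base_name + ".xlsx")
            # index=False keeps the row labels (0, 1, 2, ...) out of the sheet
            df.to_excel(out_path, index=False)
            written.append(out_path)
    return written
```

Calling `convert_csvs_to_xlsx("C:/Processed_Report/")` would then return the list of .xlsx paths it wrote.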

Timeless


I wonder why it defaults to `index=True`? – Mark Ransom Dec 09 '22 at 19:18
No need to answer dupes – BigBen Dec 09 '22 at 19:21
@MarkRansom, there have been many discussions about that, see https://github.com/pandas-dev/pandas/issues/46583#issuecomment-1176850085 – Timeless Dec 09 '22 at 19:22

Omid Afzali · Answer 2 · 2022-12-09T19:17:39.387

0

That is just how DataFrames work in pandas: when you create a DataFrame by any means (for example from a CSV file), it gets a row index, and to_excel writes that index as an extra first column by default. However, to_excel has an argument named index that you can set to False to prevent that extra column from being written:

df = pd.read_csv(os.path.join(root, name))
df.to_excel(os.path.join(root, 'Test.xlsx'), index=False)
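
To see where the extra column comes from, a small sketch like the one below may help. It uses `to_csv` rather than `to_excel` only so that it runs without an Excel engine; `to_csv` takes the same `index` parameter with the same default of `True`. The sample data is made up for illustration.

```python
import io

import pandas as pd

# Made-up two-column CSV, read from memory instead of from disk
df = pd.read_csv(io.StringIO("name,score\nalice,90\nbob,85\n"))

# The DataFrame has two columns; the row labels 0 and 1 are the index
print(list(df.columns))  # ['name', 'score']
print(list(df.index))    # [0, 1]

with_index = df.to_csv()                # index=True is the default
without_index = df.to_csv(index=False)  # row labels are left out

# With the index, the header gains an unnamed leading column
print(with_index.splitlines()[0])     # ,name,score
print(without_index.splitlines()[0])  # name,score
```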

edited Dec 09 '22 at 19:17

answered Dec 09 '22 at 19:16

No need to answer dupes. – BigBen Dec 09 '22 at 19:21
