python hdfs writer adds an extra index column to output csv

Question

I need to write a CSV file to HDFS. Currently I am using the hdfs module to do that.

import pandas as pd
from hdfs import InsecureClient

df = pd.DataFrame(data, columns=['FirstName', 'LastName', 'City'])
d = InsecureClient('http://localhost:50070')
with d.write(path, encoding='utf-8', overwrite=True) as writer:
    df.to_csv(writer)

The file is generated successfully, but it has an extra index column at the start of the CSV. I need the file to contain only the columns I have specified. How can I remove this index? I could not find any parameter for that.

#current output:
,"FirstName","LastName","City"
0,"John","Doo","New York" 
1,"Jane","Doo","San Francisco"

#expected:
"FirstName","LastName","City"
"John","Doo","New York"
"Jane","Doo","San Francisco"

thanks in advance, clairvoyant

Update: I am using the pandas package to create my CSV.

Change `df.to_csv(writer)` to `df.to_csv(writer, index=False)`. — Abdou, Nov 26 '17 at 20:42
Possible duplicate of [How to avoid Python/Pandas creating an index in a saved csv?](https://stackoverflow.com/questions/20845213/how-to-avoid-python-pandas-creating-an-index-in-a-saved-csv) — Abdou, Nov 26 '17 at 20:43

score 0 · Accepted Answer · answered Nov 26 '17 at 20:43

Pass `index=False` when calling `to_csv`. The `index` parameter defaults to `True`, which is why the row labels (0, 1, ...) are written as the first column.
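
As a rough illustration, here is a minimal sketch that can be run without an HDFS cluster. It assumes an in-memory `io.StringIO` buffer as a stand-in for the writer returned by `InsecureClient.write`, and the sample rows are hypothetical. The `quoting=csv.QUOTE_NONNUMERIC` argument is also an assumption, added only because the expected output in the question shows quoted fields.

```python
import csv
import io

import pandas as pd

# Hypothetical sample rows matching the columns in the question.
data = [["John", "Doo", "New York"], ["Jane", "Doo", "San Francisco"]]
df = pd.DataFrame(data, columns=["FirstName", "LastName", "City"])

# io.StringIO stands in for the HDFS writer here (assumption for this sketch);
# to_csv accepts any file-like object that has a write() method.
buffer = io.StringIO()

# index=False drops the row-label column; QUOTE_NONNUMERIC quotes string fields.
df.to_csv(buffer, index=False, quoting=csv.QUOTE_NONNUMERIC)

csv_text = buffer.getvalue()
print(csv_text)
```

With the code from the question, the same change should apply directly: keep the `with d.write(...) as writer:` block and call `df.to_csv(writer, index=False)` inside it.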

Dima Spivak
