How to stop pandas from creating new column?

Question

I have a csv file of fish occurrences and need to trim out any fish that show up only once, and then output this as a 'trimmed' csv. However, the function I am using adds a headerless column to the trimmed csv, which messes up further calculations I need to do with the trimmed file.

The column contains the row numbers of the rows that were kept, and I believe it is created by this line: return df[df[colname].isin(to_keep)]. I would like the script simply not to create this column; otherwise I have to manually delete it from every single csv file I trim!

import pandas as pd

def trim_single_entries(fn, colname):
    # remove all entries where colname's entry is unique to one row across the whole file
    df = pd.read_csv(fn)
    if colname in df.columns:
        counts = df[colname].value_counts()
        to_keep = [counts.index[i] for i in range(0, len(counts)) if counts.values[i] > 1]
        return df[df[colname].isin(to_keep)]
    else:
        return False

x = trim_single_entries('fish_data.csv', 'catalognumber')

x.to_csv('trimmed_fish_data.csv')

Question already answered? http://stackoverflow.com/questions/20845213/how-to-avoid-python-pandas-creating-an-index-in-a-saved-csv — tmthyjames, Sep 09 '15 at 15:35
Unfortunately I would not have known to search that, I'm very new to python/pandas and did not even know what an index was — spops, Sep 10 '15 at 17:31

score 3 · Accepted Answer · answered Sep 09 '15 at 14:24


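Add index=False to the to_csv call. The extra headerless column is the DataFrame's index, which to_csv writes by default; the filtering line only determines which index values remain.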
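As a minimal sketch of the difference, assuming a small made-up dataset in place of fish_data.csv (the `species` column and the sample values are hypothetical), the following compares the CSV text produced with and without the index. Calling to_csv without a path returns the CSV as a string, which makes the header line easy to inspect:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for fish_data.csv
csv_text = "catalognumber,species\nA1,trout\nA1,trout\nB2,pike\n"
df = pd.read_csv(io.StringIO(csv_text))

# Keep only rows whose catalognumber occurs more than once
counts = df["catalognumber"].value_counts()
trimmed = df[df["catalognumber"].isin(counts[counts > 1].index)]

# Default behaviour: the index is written as a leading, headerless column
with_index = trimmed.to_csv()

# index=False: only the real columns are written
without_index = trimmed.to_csv(index=False)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```

In the question's script, the equivalent change would be x.to_csv('trimmed_fish_data.csv', index=False).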


Brian Pendleton

