1

I have a tab-delimited CSV file with 1000 entries. I've only listed the first few.

    Unique ID   Name
 0  60ff3ads    Keith
 1  C6LSI545    Shawn
 2  O87SI523    Baoru
 3  OM022SSI    Naomi
 4  3LLS34SI    Alex
 5  Z7423dSI    blahblah

I want to remove some of these entries from the CSV file by their index number and save the result to another CSV file.

I haven't started writing any code for this yet because I'm not sure how I should go about it. Please advise.

jake wong
  • Have you had a look at module [Python CSV](https://docs.python.org/2/library/csv.html) ? – Vikas Ojha Jun 14 '15 at 08:43
  • You have to start with the `csv reader` module: read line by line, check your condition, and use `csv writer` to output. https://docs.python.org/2/library/csv.html – itzMEonTV Jun 14 '15 at 08:44

3 Answers

1

A one-liner to solve your problem:

import pandas as pd

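# Row labels to remove; the ... stands for the rest of your own list
indexes_to_drop = [1, 7, ...]
# Read the tab-delimited file, drop the listed rows, and write the result
pd.read_csv('original_file.csv', sep='\t').drop(indexes_to_drop, axis=0).to_csv('new_file.csv')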

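Check the `read_csv` documentation to accommodate your particular CSV flavor if needed. Note that `to_csv` writes comma-separated output by default, so pass `sep='\t'` there as well if the new file should stay tab-delimited.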

oDDsKooL
0

The sample data suggests a tab-delimited file. You could open the input file with a `csv.reader` and the output file with a `csv.writer`. It is slightly simpler, however, to use `split()` to grab the first field (the index) and compare it with the indices that you want to filter out.

indices_to_delete = ['0', '3', '5']

with open('input.csv') as infile, open('output.csv', 'w') as outfile:
    for line in infile:
        if line.split()[0] not in indices_to_delete:
            outfile.write(line)

This could be reduced to this:

with open('input.csv') as infile, open('output.csv', 'w') as outfile:
    outfile.writelines(line for line in infile
                           if line.split()[0] not in indices_to_delete)

That should do the trick for the sort of data that you posted. Note that `split()` breaks each line on any whitespace, so if you need to compare values in other fields that may contain whitespace, you should consider the csv module.
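If you do end up needing the csv module, here is a minimal sketch of what that could look like. It assumes Python 3, that the file really is tab-separated, and that the index is the first field of each row; the function name `drop_rows` and the choice of `'\n'` as the line terminator are mine, not anything from the question.

```python
import csv


def drop_rows(in_path, out_path, indices_to_delete):
    """Copy a tab-delimited file, skipping rows whose first field is listed.

    Hypothetical helper: assumes the index is the first tab-separated field.
    """
    # Fields are read as strings, so compare against strings; a set keeps lookups fast
    to_delete = {str(i) for i in indices_to_delete}
    # newline='' lets the csv module handle line endings itself
    with open(in_path, newline='') as infile, \
            open(out_path, 'w', newline='') as outfile:
        reader = csv.reader(infile, delimiter='\t')
        writer = csv.writer(outfile, delimiter='\t', lineterminator='\n')
        for row in reader:
            # Skip only non-empty rows whose index is in the delete set
            if row and row[0] in to_delete:
                continue
            writer.writerow(row)
```

Calling `drop_rows('input.csv', 'output.csv', [0, 3, 5])` would then do the same job as the loop above, but fields containing spaces are left intact because only tabs separate fields.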

mhawke
-1

I don't think it is possible to remove lines from the file in place. However, you could write two new files: go over each row of the original CSV and save it to either csv-A or csv-B. That way you end up with two separate CSV files.

More info here: How to Delete Rows CSV in python
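A rough sketch of the two-file idea, assuming Python 3 and whitespace-separated data like the sample in the question, with the index as the first token on each line. The function name `split_rows` and its parameters are made up for illustration.

```python
def split_rows(in_path, kept_path, removed_path, indices_to_remove):
    """Send each line to one of two files depending on its index.

    Hypothetical helper: assumes the index is the first
    whitespace-separated token on every data line.
    """
    # The tokens read from the file are strings, so compare as strings
    to_remove = {str(i) for i in indices_to_remove}
    with open(in_path) as infile, \
            open(kept_path, 'w') as kept, \
            open(removed_path, 'w') as removed:
        for line in infile:
            fields = line.split()
            # The header and blank lines never match, so they land in the kept file
            if fields and fields[0] in to_remove:
                removed.write(line)
            else:
                kept.write(line)
```

The kept file plays the role of csv-A and the removed file the role of csv-B; the original file is never modified.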

Community
  • Hmm. I was searching through Stack Overflow as well for ideas on how to remove rows from a CSV file, but with no luck. This might be the only way, huh? – jake wong Jun 14 '15 at 09:04