31

All I would like to do is delete a row if it has a value of '0' in the third column. An example of the data would be something like:

6.5, 5.4, 0, 320
6.5, 5.4, 1, 320

So the first row would need to be deleted whereas the second would stay.

What I have so far is as follows:

import csv
input = open('first.csv', 'rb')
output = open('first_edit.csv', 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    if row[2]!=0:
        writer.writerow(row)
input.close()
output.close()

Any help would be great

Anshul Goyal
Will B

3 Answers

41

You are very close. You currently compare row[2] with the integer 0; compare it with the string "0" instead. Values read from a file are strings, not integers, which is why the integer check currently fails:

if row[2] != "0":

You can also use the with statement to make the code slightly more Pythonic: it shortens the code and lets you omit the .close() calls:

import csv
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != "0":
            writer.writerow(row)

Note that input is a Python builtin, so I've used another variable name instead.
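
The snippet above opens the files in binary mode, which suits Python 2. On Python 3 the csv module expects text-mode files opened with newline=''. A minimal sketch of the same loop for Python 3, assuming the fields carry no extra padding; the function name filter_rows and its parameters are illustrative and not part of the original answer:

```python
import csv


def filter_rows(src_path, dst_path, column=2, unwanted="0"):
    """Copy src_path to dst_path, skipping rows whose `column` equals `unwanted`.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Python 3: open in text mode with newline='' as the csv docs recommend.
    with open(src_path, newline='') as inp, \
            open(dst_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for row in csv.reader(inp):
            # Fields are strings, so compare against the string "0".
            if row[column] != unwanted:
                writer.writerow(row)
```

It could be called as filter_rows('first.csv', 'first_edit.csv'). Rows shorter than three fields would raise an IndexError in this sketch, so a length check may be worth adding for messy input.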


Edit: The values in your csv file's rows are separated by a comma and a space. In a normal csv they would be separated by commas only, and a check against "0" would work. With your format, you can either use row[2].strip() != "0" or check against " 0".

The better solution would be to correct the csv format, but if you want to keep the current one, the following works with your given csv file format:

$ cat test.py 
import csv
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != " 0":
            writer.writerow(row)
$ cat first.csv 
6.5, 5.4, 0, 320
6.5, 5.4, 1, 320
$ python test.py 
$ cat first_edit.csv 
6.5, 5.4, 1, 320
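
As an alternative to matching the literal " 0", the field can be stripped before comparing. A small sketch using in-memory data in the question's format; the helper name third_column_is_zero is made up for illustration:

```python
import csv
import io

# Sample data in the question's format: a space follows every comma.
sample = "6.5, 5.4, 0, 320\n6.5, 5.4, 1, 320\n"


def third_column_is_zero(row):
    # strip() is a string method; it removes the space left over from ", ".
    return row[2].strip() == "0"


kept = [row for row in csv.reader(io.StringIO(sample))
        if not third_column_is_zero(row)]
print(kept)  # [['6.5', ' 5.4', ' 1', ' 320']]
```

The kept rows retain their original leading spaces, so writing them back out preserves the file's formatting.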
Anshul Goyal
  • 1
    I have tried that as well, but it does not seem to be working regardless if it is set as a string or integer – Will B Apr 19 '15 at 04:58
  • I tried the way you edited it, and I have also tried to do a strip(), but the output file still has the rows with '0' values! – Will B Apr 19 '15 at 05:09
  • I have ran it a few times, and it comes back the same. – Will B Apr 19 '15 at 05:17
  • @WillB I'm not sure what you are running then, I've already posted the inputs and code I'm using. Maybe, you should use a `pdb.set_trace()` statement within your for loop to identify why it is not working. – Anshul Goyal Apr 19 '15 at 05:20
  • i converted my csv into a .txt to see that it was reading it as " 00.0000", which was the only way to make it work. Thanks for helping me solve the problem! – Will B Apr 19 '15 at 05:22
  • 1
    is there any way to do this without having to create and write to an additional file?? – oldboy Aug 19 '19 at 00:44
  • @BugWhisperer yes, there should be. But OP wants to do it with an additional file. – Anshul Goyal Aug 20 '19 at 00:21
  • 1
    ok, ive had trouble locating anything on doing this within a single file. would u be able to possibly provide me with a link to any post/info regarding this? – oldboy Aug 21 '19 at 20:13
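
The comment thread above ends with the asker finding that the field was actually read as " 00.0000", which no string comparison against "0" or " 0" will match. One way to cover such cases is to compare numerically instead. A sketch, assuming non-numeric cells should simply be kept; is_zero is a hypothetical helper, not something from the thread:

```python
def is_zero(field):
    """Return True if a CSV field represents the number zero."""
    try:
        # float() tolerates surrounding whitespace and leading zeros,
        # so " 00.0000" parses as 0.0.
        return float(field) == 0
    except ValueError:
        # Non-numeric fields (for example a header cell) count as not zero.
        return False


rows = [["6.5", " 5.4", " 00.0000", " 320"],
        ["6.5", " 5.4", " 1", " 320"]]
kept = [row for row in rows if not is_zero(row[2])]
```

Inside the csv loop, the condition would then read if not is_zero(row[2]).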
3

Use the pandas library:

A solution for this kind of question, filtering on a named column:

import pandas as pd

# file is the path to your csv file
df = pd.read_csv(file)

# Keep only the rows whose "name" column is not exactly "dog";
# the comparison is against the whole cell value.
df = df[df["name"] != "dog"]

# Write the remaining rows back to the same file, without the index column
df.to_csv(file, index=False)

A more general solution:

Use this function:

import pandas as pd


def remove_specific_row_from_csv(file, column_name, *args):
    '''
    :param file: path of the csv file to remove the rows from
    :param column_name: the column that determines which rows are deleted
           (e.g. if column_name is "Name" and args contains "Gavri",
           every row whose Name cell equals "Gavri" is deleted)
    :param args: the cell values to match in that column
    '''
    df = pd.read_csv(file)
    # Index by column name instead of using eval(), so any column name
    # works and no arbitrary code is executed.
    df = df[~df[column_name].isin(list(args))]
    # Errors from pandas propagate unchanged rather than being replaced
    # by a generic message.
    df.to_csv(file, index=False)

Example call:

remove_specific_row_from_csv(file_name, "column_name", "dog_for_example", "cat_for_example")

Note: You can pass any number of string values to this function; every row whose value in the given column matches one of them is deleted.

Gavriel Cohen
  • 1
    One-line summary: use a Numpy-style filtering: `df = df[df.my_column != value]` – Basj Jul 25 '21 at 17:52
  • 1
    @Basj, you're right but pls do not forget the added value when things are arranged as a clear function and the possibility of sending unlimited cells of strings – Gavriel Cohen Jul 26 '21 at 10:09
1

You should have if row[2] != "0". Otherwise you are comparing a string with the integer 0, and the two are never equal.

The Obscure Question