I am a beginning Python user.
How do I write rows from one CSV file to another while excluding every row whose value in a given column matches an entry in a separate list of text patterns?
Here is a specific example:
listfile: spam, eggs, bacon,
csvfile:
row  col 1  col 2      col 3
1    zzz    not eggs   zzz
2    xxx    bacon      qqq
3    eee    not bacon  ttt
4    ttt    eggs       hhh
5    ggg    not spam   ppp
6    yyy    eggs       www
The CSV file I write should contain only rows 1, 3 and 5, because the col 2 value in those rows does not match any value in the list.
Assuming the setup below, how would I write this?
import csv
mycsv = csv.reader(open('spameggsbacon.csv'))
listfile = 'listfile.txt'
for row in mycsv:
    text = row[1]
writecsvfile = open('write.csv', 'a')
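For reference, here is a minimal sketch of how that skeleton might be completed using only the csv module. It assumes that listfile.txt holds the patterns separated by commas (as in "spam, eggs, bacon,"), that the CSV has a header row, and that the value to compare is the second field of each row. The function name filter_rows and its parameters are made up for illustration.

```python
import csv


def filter_rows(csv_path, list_path, out_path, column=1):
    """Copy csv_path to out_path, skipping rows whose `column` value is in the list file."""
    # Assumption: patterns are comma-separated, possibly with a trailing comma.
    with open(list_path) as f:
        patterns = {item.strip() for line in f for item in line.split(",") if item.strip()}

    with open(csv_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)  # assumption: the first row is a header
        if header is not None:
            writer.writerow(header)
        for row in reader:
            # Exact match on the whole cell: "not eggs" is kept, "eggs" is dropped.
            if len(row) > column and row[column].strip() in patterns:
                continue
            writer.writerow(row)


# Hypothetical usage with the file names from the question:
# filter_rows("spameggsbacon.csv", "listfile.txt", "write.csv")
```

Opening the output with "w" rather than "a" means a rerun starts from an empty file instead of appending duplicate rows.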
EDIT: based on Md Johirul Islam's answer, I tried:
import csv
import pandas as pd
data = pd.read_csv('spameggsbacon.csv')
listfiledata = 'listfile.txt'
with open(listfiledata) as f:
    listfiledata = f.readlines()
listfiledata = [x.strip() for x in listfiledata]
data = data[~data['col2'].isin(listfiledata)]
data.to_csv('spameggsbacon.csv', sep=',')
print(listfiledata)
print(data.head)
The code runs, but does not remove the rows that have matching values. It appears the reason has to do with how this line is written:
data = data[~data['col2'].isin(listfiledata)]
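To see how that line behaves in isolation, here is a small sketch on an in-memory DataFrame that mirrors the example. The frame and both lists are typed in by hand rather than read from the files, so they are assumptions about the data. isin compares whole cell values against the elements of the list, so a row is dropped only when its col2 value equals one list element exactly.

```python
import pandas as pd

# Hand-typed copy of the example table (assumed column names col1, col2, col3).
data = pd.DataFrame({
    "col1": ["zzz", "xxx", "eee", "ttt", "ggg", "yyy"],
    "col2": ["not eggs", "bacon", "not bacon", "eggs", "not spam", "eggs"],
    "col3": ["zzz", "qqq", "ttt", "hhh", "ppp", "www"],
})

patterns = ["spam", "eggs", "bacon"]  # one pattern per list element
one_string = ["spam,eggs,bacon"]      # a single element holding the whole line

kept = data[~data["col2"].isin(patterns)]         # drops the bacon and eggs rows
unchanged = data[~data["col2"].isin(one_string)]  # nothing equals the long string

print(kept["col2"].tolist())
print(len(unchanged))
```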
Edit 2: Not sure if it matters, but I revised the original example to clarify that the values in col2 may repeat; for example, 'eggs' appears in both row 4 and row 6.
Edit 3:
Here is what you see if you run
print(listfiledata)
print(data.head)
Output is:
['spam,eggs,bacon']
<bound method NDFrame.head of   col1       col2 col3
0  zzz   not eggs  zzz
1  zzz      bacon  zzz
2  zzz  not bacon  zzz
3  zzz       eggs  zzz
4  zzz   not spam  zzz
5  zzz       eggs  zzz>
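The printed list holds a single element, 'spam,eggs,bacon', so no cell in col2 can ever equal it. Below is a minimal sketch of splitting the file contents into separate patterns, assuming the patterns are separated by commas and/or newlines. parse_patterns is a hypothetical helper name, not part of pandas or the csv module.

```python
def parse_patterns(text):
    """Split comma- and newline-separated text into stripped, non-empty patterns."""
    patterns = []
    for line in text.splitlines():
        for item in line.split(","):
            item = item.strip()
            if item:  # skip blanks left by trailing commas or empty lines
                patterns.append(item)
    return patterns


print(parse_patterns("spam,eggs,bacon"))        # ['spam', 'eggs', 'bacon']
print(parse_patterns("spam, eggs, bacon,\n"))   # ['spam', 'eggs', 'bacon']
```

If the file is read with f.read() instead of f.readlines(), the returned list could presumably be passed to isin in place of listfiledata.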