I am trying to write a script that reads the CSV files under a specific directory, excludes only the rows that are also present in another CSV file, and writes the result to a new CSV. It works like an exception rule.
For example, given the input and the exception file below:
inDirectory/input.csv:
Id Name Location Data Services Action
10 John IN 1234 mail active
12 Samy GR 5678 phone disable
28 Doug UK 9123 phone active
excDirectory/exception.csv:
12 Samy GR 5678 phone disable
The output I want is:
outDirectory/output.csv:
Id Name Location Data Services Action
10 John IN 1234 mail active
28 Doug UK 9123 phone active
This is all I have been able to write so far. It is incomplete, and I am looking for a solution that does the above. Any ideas? I am very new to Python scripting.
    import os
    import pandas as pd

    inDir = os.listdir('csv_out_tmp')
    excFile = pd.read_csv('exclude/exception.csv', sep=',', index_col=0)
    for csv in inDir:
        inFile = pd.read_csv('csv_out_tmp/' + csv)
        diff = set(inFile)^set(excFile)
        df[diff].to_csv('csv_out/' + csv, index=False)
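For reference, one way the pandas attempt could be completed is a left merge with `indicator=True`, keeping only the rows that are not found in the exception file. This is a minimal sketch, assuming the files are comma-separated, that the exception file has no header row (as in the sample above), and that a row counts as a match only when every column is equal. The helper name `drop_exception_rows` and the inline sample data are hypothetical, used only for illustration.

```python
import io

import pandas as pd


def drop_exception_rows(in_df, exc_df):
    """Return the rows of in_df that do not appear in exc_df (all columns compared)."""
    # Duplicate exception rows would duplicate matches in the merge, so drop them first.
    exc_unique = exc_df.drop_duplicates()
    merged = in_df.merge(
        exc_unique, how="left", on=list(in_df.columns), indicator=True
    )
    # "left_only" marks rows that exist in the input but not in the exception file.
    kept = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
    return kept.reset_index(drop=True)


# Inline stand-ins for inDirectory/input.csv and excDirectory/exception.csv.
input_csv = io.StringIO(
    "Id,Name,Location,Data,Services,Action\n"
    "10,John,IN,1234,mail,active\n"
    "12,Samy,GR,5678,phone,disable\n"
    "28,Doug,UK,9123,phone,active\n"
)
exception_csv = io.StringIO("12,Samy,GR,5678,phone,disable\n")

# Reading everything as strings avoids int/str mismatches between the two files.
in_df = pd.read_csv(input_csv, dtype=str)
exc_df = pd.read_csv(
    exception_csv, dtype=str, header=None, names=list(in_df.columns)
)

result = drop_exception_rows(in_df, exc_df)
print(result.to_csv(index=False))
```

In the real script, the two `io.StringIO` objects would be replaced by the file paths, and `result.to_csv('csv_out/' + name, index=False)` would be called inside the loop over `os.listdir('csv_out_tmp')`.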
Here is another version that I am writing, based on @neotrinity's suggestion:
    inDir = os.listdir('csv_out_tmp')
    excFile = 'exclude/exception.csv'
    for csv in inDir:
        inFile = open('csv_out_tmp/' + csv)
        excRow = set(open(excFile))
        with open('csv_out/' + csv, 'w') as f:
            for row in open(inFile):
                if row not in excRow:
                    f.write(row)
With the above code I get the following error:
for row in open(inFile):
TypeError: coercing to Unicode: need string or buffer, file found
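The error appears to come from calling `open()` a second time: `inFile` is already a file object, while `open()` expects a path string. Below is a minimal sketch of the line-based approach with that fixed, assuming a row should be dropped only when its whole line matches a line in the exception file (ignoring the trailing newline). The helper names `load_exception_rows` and `filter_file` are hypothetical, and the demo builds the directory layout in a temporary folder so it can run on its own.

```python
import os
import tempfile


def load_exception_rows(exc_path):
    """Read the exception file once and return its non-empty lines as a set."""
    with open(exc_path) as f:
        return set(line.rstrip("\r\n") for line in f if line.strip())


def filter_file(in_path, out_path, exc_rows):
    """Copy in_path to out_path, skipping lines listed in exc_rows."""
    with open(in_path) as src, open(out_path, "w") as dst:
        # Iterate over the already opened file object; do not call open() on it again.
        for row in src:
            if row.rstrip("\r\n") not in exc_rows:
                dst.write(row)


# Demo: recreate the csv_out_tmp / exclude / csv_out layout in a temp folder.
base = tempfile.mkdtemp()
in_dir = os.path.join(base, "csv_out_tmp")
out_dir = os.path.join(base, "csv_out")
exc_dir = os.path.join(base, "exclude")
for d in (in_dir, out_dir, exc_dir):
    os.mkdir(d)

with open(os.path.join(in_dir, "input.csv"), "w") as f:
    f.write(
        "Id,Name,Location,Data,Services,Action\n"
        "10,John,IN,1234,mail,active\n"
        "12,Samy,GR,5678,phone,disable\n"
        "28,Doug,UK,9123,phone,active\n"
    )
with open(os.path.join(exc_dir, "exception.csv"), "w") as f:
    f.write("12,Samy,GR,5678,phone,disable\n")

# Load the exception rows once, outside the loop over input files.
exc_rows = load_exception_rows(os.path.join(exc_dir, "exception.csv"))
for name in os.listdir(in_dir):
    filter_file(os.path.join(in_dir, name), os.path.join(out_dir, name), exc_rows)

with open(os.path.join(out_dir, "input.csv")) as f:
    output_text = f.read()
print(output_text)
```

Stripping the line ending before the comparison means the last line of a file still matches even if it has no trailing newline. Exact text matching is sensitive to spacing and quoting differences between the two files, so the `csv` module or pandas may be a better fit if the files are not formatted identically.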