I have two CSV files of different lengths. I need to find and remove the rows in one file that do not exist in the other file. Is there an easy way to do this, other than looping through the second file n times?
- Possible duplicate of [set difference for pandas](https://stackoverflow.com/questions/18180763/set-difference-for-pandas) – Andrey Portnoy Sep 18 '18 at 19:03
- Thank you, Andrey, that looks like a good resource. I'm not looking for duplicate rows though. So I wouldn't call this post a duplicate of that one. – Sep 18 '18 at 19:15
1 Answer
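Assuming you have loaded your CSV files into `df1` and `df2`, the following keeps only the rows of `df1` that also appear in `df2`, by comparing each row as a tuple:

    df1[df1.apply(tuple, axis=1).isin(df2.apply(tuple, axis=1))]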
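
For illustration, here is a minimal sketch of that one-liner on small in-memory frames, next to an inner `merge` that should give the same rows when both files share the same columns. The sample data and column names (`id`, `name`) are made up for the example; in practice you would load the frames with `pd.read_csv` from your own files. The `merge` variant is an alternative I am suggesting, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data standing in for the two CSV files
# (in practice: df1 = pd.read_csv(...), df2 = pd.read_csv(...))
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"id": [2, 4, 5], "name": ["b", "d", "e"]})

# Approach from the answer: turn each row into a tuple and test membership
mask = df1.apply(tuple, axis=1).isin(df2.apply(tuple, axis=1))
kept_by_tuple = df1[mask]

# Alternative (an assumption, not the answerer's method): inner merge on all
# shared columns; drop_duplicates avoids multiplying rows repeated in df2
kept_by_merge = df1.merge(df2.drop_duplicates(), how="inner")

print(kept_by_tuple)
print(kept_by_merge)
```

Note that the tuple approach keeps the original index of `df1`, while `merge` returns a fresh index. The tuple approach also compares rows by position, so it expects both frames to have their columns in the same order.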

BENY