I want to use one CSV as a reference and search for its values in another CSV. If a value is not found, I need to remove that row. For example:

import pandas as pd

df1

Column  A     B    C    
        1     5    10
        10    5     5

df2

Column  A     B    C    
        3     5    10
        10    5     5

Given these two DataFrames, I want to use column A of df1 as the reference, search column A of df2, and remove the first row of df2 because its value (3) does not appear in df1. The result should be a new DataFrame containing only the rows of interest.

df1 = pd.read_csv('DataIWantToReference.csv') 
df2 = pd.read_csv('DataToRemove.csv')

# df3 = only the rows of df2 whose column A value also appears in df1 (this is the part I don't know how to write)

I'm not sure whether I should build a list of values from df1 and iterate over it while searching df2, or whether there is a better way to do this.

Mayank Porwal

1 Answer

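This returns a DataFrame with the rows of df2 whose A value is not in df1['A'], i.e. the rows you want to remove. The ~ inverts the mask, so without it you get the rows that do match df1.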

df2[~df2['A'].isin(df1['A'])]

Is that what you want?
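For reference, here is a minimal sketch of both directions using the sample values from the question. The frames are built inline rather than read from the CSV files, and the names mask and removed are only illustrative; df3 is the name used in the question.

```python
import pandas as pd

# Sample frames copied from the question (built inline instead of read_csv)
df1 = pd.DataFrame({"A": [1, 10], "B": [5, 5], "C": [10, 5]})
df2 = pd.DataFrame({"A": [3, 10], "B": [5, 5], "C": [10, 5]})

# True for each row of df2 whose A value appears anywhere in df1['A']
mask = df2["A"].isin(df1["A"])

# Rows to keep: A is present in the reference frame
df3 = df2[mask].reset_index(drop=True)

# Rows that get dropped: A is absent from the reference frame
removed = df2[~mask]

print(df3)
print(removed)
```

With real data, df1 and df2 would come from the two pd.read_csv calls in the question. isin compares values, not row positions, so the row order in the two files does not matter.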

LeMorse