I have a dataframe (G) whose columns are “Client” and “TIV”.

I have another dataframe (B) whose columns are “Client”, “TIV”, “A”, “B”, “C”.

I want to select all rows from B whose clients are not in G. In other words, if a row in B has a Client that also exists in G, I want to delete it.

I did this:

x = B[B['Client'] != G['Client']]

But it raised an error saying “Can only compare identically-labeled Series objects”.

I appreciate your help.

navid
  • Does this answer your question? [How to filter Pandas dataframe using 'in' and 'not in' like in SQL](https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql) – Chris Sep 06 '22 at 13:33
  • please share samples of dataframe so people can try it on their own machines – grymlin Sep 06 '22 at 13:33
  • I think what you are looking for is an anti join. Check this post: https://stackoverflow.com/questions/38516664/anti-join-pandas – Jannik Sep 06 '22 at 13:35
  • @grymlin Thank you for your feedback. Anything would work so that's why I didn't put anything there. As long as Column of G is not selected in B, I am happy :) – navid Sep 06 '22 at 13:39
  • @Chris I am trying my best to understand it actually. I am not sure – navid Sep 06 '22 at 13:40

2 Answers

You can use Series.isin combined with the ~ operator:

B[~B.Client.isin(G.Client)]
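
As a minimal sketch of how this behaves, assuming some made-up sample data (only the column names come from the question; the values are hypothetical): the != comparison in the question fails because pandas compares two Series element by element and requires identical labels, whereas isin checks each value of B's Client column for membership in G's Client column, so the two frames can have different lengths.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
G = pd.DataFrame({"Client": ["a", "b"], "TIV": [10, 20]})
B = pd.DataFrame({
    "Client": ["a", "c", "b", "d"],
    "TIV": [10, 30, 20, 40],
    "A": [1, 2, 3, 4],
    "B": [5, 6, 7, 8],
    "C": [9, 10, 11, 12],
})

# True for every row of B whose Client also appears in G
mask = B["Client"].isin(G["Client"])

# ~ inverts the boolean mask, keeping only clients that are NOT in G
x = B[~mask]
print(x)
```

With this sample data, x should contain only the rows for clients "c" and "d", and it keeps their original index labels from B.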
Nuri Taş

Maybe the following code snippet helps:

import pandas as pd

df1 = pd.DataFrame(data={'Client': [1,2,3,4,5]})
df2 = pd.DataFrame(data={'Client': [1,2,3,6,7]})
# Identify which Clients are in df1 and not in df2
clients_diff = set(df1.Client).difference(df2.Client)
df1.loc[df1.Client.isin(clients_diff)]

The idea is to filter df1 down to the clients that are not in df2.
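
The anti join mentioned in the comments could also be sketched with merge and its indicator parameter. This is only one possible variant on the same sample frames; the names merged and only_in_df1 are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"Client": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Client": [1, 2, 3, 6, 7]})

# Left-join df1 against the distinct clients of df2; indicator=True adds
# a "_merge" column saying whether each row matched ("both") or not ("left_only").
merged = df1.merge(
    df2[["Client"]].drop_duplicates(),
    on="Client",
    how="left",
    indicator=True,
)

# Keep only the rows of df1 that found no match in df2 (the anti join)
only_in_df1 = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_df1)
```

Dropping duplicates from df2 before merging keeps the join from multiplying rows of df1 when a client appears more than once in df2. Note that merge returns a new index, so the original row labels of df1 are not preserved, unlike with the isin approach.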

Jannik