0

I need to merge 2 data frames with Pandas. I'm using Jupyter Notebook. I can merge but I cannot filter the data (like a WHERE statement on SQL). The data frames have equal messages that were labeled with 0 or 1 by the labelers. The data frames have 3 columns with equal values - Id, timestamp, message and 2 columns high, low with different values of labeling (1 or 0) I did the merge on column 'Id'

I am trying to see for example:

Is there any time that Df1 has the column 'high' with value == 1 and the Df2 has the column 'low' with value == 1 for the same statement? or vice-versa?

df1.merge(df2, on = ['Id'], how = 'inner')

now I need to filter, but I can't find the right syntax...

(df1[(df1["high_df1"] == 1)]) & (df2[(df2["low_df2"] == 1)])

I could visualize the answer with separate statements:

df1.loc[(df1['high_df1'] == 1) & (df2['low_df2'] == 1)]
df2.loc[(df2['low_df2'] == 1) & (df1['high_df1'] == 1)]

But I would like to see the results in one table.

Can anyone help me?

Artem Vovsia
  • 1,520
  • 9
  • 15
pluccap
  • 1
  • 1
  • 1
    Please read about [mcve](/help/mcve) and [how-to-ask](/help/how-to-ask) – rpanai Oct 11 '19 at 11:18
  • 2
    hey Puccap, 1. please format your code, 2. please post your expected output and input data (5 rows of each) and please read [How to ask a good pandas question](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Umar.H Oct 11 '19 at 11:21

2 Answers2

0

As I understood:

df1 = df1.merge(df2, on = ['Id'], how='left')

df1.where((df1['high_df1'] == 1) & (df1['low_df2'] == 1))
Cambala
  • 26
  • 3
0

Is there any time that Df1 has the column 'high' with value == 1 and the Df2 has the column 'low' with value == 1 for the same statement? or vice-versa?

An alternative solution is to apply an appropriate query():

df1.merge(df2, on=['Id'], how='inner').query('high_df1==1&low_df2==1|low_df1==1&high_df2==1')
Armali
  • 18,255
  • 14
  • 57
  • 171