-1

I have a dataframe that looks like this (10k~ rows). I'll call it Maindf

+---+---------+----------+-------+--------------+
|   | Product | Discount | Store | OtherColumns |
+---+---------+----------+-------+--------------+
| 0 | A       | 0.5      | Red   |              |
| 1 | A       | 1        | Red   |              |
| 2 | C       | 3        | Green |              |
| 3 | Z       | 1.5      | Blue  |              |
| 4 | I       | 0        | Red   |              |
| 5 | D       | 0        | Green |              |
+---+---------+----------+-------+--------------+

Through code I generate this other dataframe (changes depending on the input data). I'll call it Filterdf

+---+---------+----------+---------+
|   | Product | Discount | Counter |
+---+---------+----------+---------+
| 0 | A       | 0.5      |       1 |
| 1 | B       | 2.0      |       2 |
| 2 | C       | 1        |       9 |
| 3 | D       | 0        |       7 |
+---+---------+----------+---------+

I am trying to return all values from Maindf that match on columns Product and Discount with Filterdf. So the expected output would be this

+---+---------+----------+-------+--------------+
|   | Product | Discount | Store | OtherColumns |
+---+---------+----------+-------+--------------+
| 0 | A       | 0.5      | Red   |              |
| 1 | D       | 0        | Green |              |
+---+---------+----------+-------+--------------+

And here is my code line to do it, which is not working out properly.

NewMaindf = Maindf[(Maindf['Product'].isin(Filterdf['Product']) & Maindf['Discount'].isin(Filterdf['Discount']))]

print(NewMaindf)

The output is this. I am interested only in the data from Maindf that matches both columns of Filterdf, in this case A with discount 1 is coming through because A isin Filterdf['Product'] and also 1 isin Filterdf['Discount'] but with Product C

+---+---------+----------+-------+--------------+
|   | Product | Discount | Store | OtherColumns |
+---+---------+----------+-------+--------------+
| 0 | A       | 0.5      | Red   |              |
| 1 | A       | 1        | Red   |              |
| 2 | D       | 0        | Green |              |
+---+---------+----------+-------+--------------+

How could this be achieved? Thank you and sorry for the poor formatting, first time posting here

Pamdca
  • 1
  • 1
    Please refer to [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – Ben Pap Feb 11 '20 at 23:24
  • Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – AMC Feb 12 '20 at 02:32

1 Answers1

0
import pandas as pd
maindf = {'Product': ['A', 'A','C','Z','I','D'], 'Discount': [0.5,1,3,1.5,0,0],'Store' :['Red','Red','Red','Red','Red','Red']}
Maindf = pd.DataFrame(data=maindf)
print(Maindf)
filterdf = {'Product': ['A', 'B','C','D' ], 'Discount': [0.5, 2.0,1,0]}
Filterdf = pd.DataFrame(data=filterdf)
print(Filterdf)
NewMaindf= Maindf[Maindf[['Product','Discount']].astype(str).sum(axis = 1).isin(
                Filterdf[['Product','Discount']].astype(str).sum(axis = 1))]
print(NewMaindf)

Output:

 Product  Discount Store
0       A       0.5   Red
1       A       1.0   Red
2       C       3.0   Red
3       Z       1.5   Red
4       I       0.0   Red
5       D       0.0   Red
  Product  Discount
0       A       0.5
1       B       2.0
2       C       1.0
3       D       0.0
  Product  Discount Store
0       A       0.5   Red
5       D       0.0   Red
Hassan
  • 118
  • 6