Repeating rows in dataframe need to choose minimum value

Question

I have a data frame as below. If the same combination of Product1 and Product2 appears in more than one row, I need to keep only the row with the smallest price.

expected output:

score 1 · Accepted Answer · answered Sep 06 '22 at 14:04


Does this work for you?

import pandas as pd

data = {"Product1": ["A", "A", "B", "B", "D", "D"],
        "Product2": ["B", "B", "C", "C", "E", "F"],
        "location": ["GOA", "Banlore", "GOA", "Banlore", "Delhi", "Delhi"],
        "Price": [30, 40, 20, 30, 10, 15]}

df = pd.DataFrame(data)
# For each (Product1, Product2) pair, keep the row whose Price is lowest
df = df.loc[df.groupby(["Product1", "Product2"])["Price"].idxmin()]

answered Sep 06 '22 at 14:04

grymlin


If the data frame contains missing values, how should we handle them? For example: data = {"Product1": ["A", "A", "B", "B", "D","D","C"], "Product2": ["B", "B", "C", "C", "E","F",'H'], "location": ["GOA","Banlore","GOA","Banlore","Delhi","Delhi",Nan], "Price":[30,40,20,30,10,Nan]} – Mathew John Sep 06 '22 at 14:36
I need the same output: if Product1 and Product2 are repeated, choose the row with the minimum price; otherwise keep the row as it is in the data frame. Can you help me? – Mathew John Sep 06 '22 at 14:37
I get this error: KeyError: '[nan] not in index' – Mathew John Sep 06 '22 at 14:39
data = {"Product1": ["A", "A", "B", "B", "D","D","E"], "Product2": ["B", "B", "C", "C", "E","F","D"], "location": ["GOA","Banlore","GOA","Banlore","Delhi","Delhi",None], "Price":[30,40,20,30,10,15,None]} df = pd.DataFrame(data) df – Mathew John Sep 06 '22 at 14:43
In this case, Price seems to be the column of interest, so you can either set it to 0 where it is NaN, for example df['Price'] = df['Price'].fillna(0), or drop the rows where Price is NaN with df = df[df['Price'].notna()]. It depends on what you want to do :) – grymlin Sep 06 '22 at 14:44
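For the follow-up in the comments (keep the cheapest row per pair, but leave a pair whose only price is missing as it is), one possible sketch avoids idxmin altogether: sort by Price with missing values last, then keep the first row for each (Product1, Product2) pair. This is not the method from the answer, just an alternative, and it assumes that a row with no price should be kept rather than dropped or filled with 0. The name result is only illustrative.

```python
import pandas as pd

# Data from the comment above; the last row has no location and no price
data = {"Product1": ["A", "A", "B", "B", "D", "D", "E"],
        "Product2": ["B", "B", "C", "C", "E", "F", "D"],
        "location": ["GOA", "Banlore", "GOA", "Banlore", "Delhi", "Delhi", None],
        "Price": [30, 40, 20, 30, 10, 15, None]}
df = pd.DataFrame(data)

result = (
    df.sort_values("Price", na_position="last")       # cheapest first, NaN at the end
      .drop_duplicates(subset=["Product1", "Product2"], keep="first")  # one row per pair
      .sort_index()                                    # restore the original row order
)
print(result)
```

Because NaN prices are sorted last, a pair only keeps a NaN row when it has no priced row at all, which is what happens for the (E, D) pair here.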
