
I have to compare each row against a different threshold, where the threshold depends on the values of other columns in the same row. How can I do this?

For example, I have 5 columns as below.

I have to compare the price column with a threshold derived from the data distribution within each combination of market, product group and price type.

Say I calculate the median price for each such group, for example GBR, Toys, Low ASP. I would then compare the price in row 1, i.e. $10, with the median derived for its group, i.e. GBR, Toys and Low ASP.

Accordingly, my threshold will differ for each row based on that row's values of Market, Product Group and Price Type. How can I do this? I am stuck on finding the logic for it in Python.

The data snippet:

[image: data snippet]

Red
vicky

1 Answer


I don't know what your DataFrame looks like, so I'm going to assume you're using pandas. In the future, please include some code so we can give you a better answer. What you want is likely something like this:

filtered_df = df[(df['Price'] > 100) & (df['Product Group']=='Toys')]

After this, you can compare the filtered DataFrame against another one. To add a column holding the median, you could do something like this:

filtered_df['median'] = filtered_df['Price'].median()

For more on the pandas median function, look here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.median.html Using groupby may make all of this easier; however, as I said earlier, I would need more code to go on.
Sorry if this answer isn't exactly what you need; I'm not sure what context you're using the comparison in. It looks like an Excel replacement. Look at this post for more context: how do you filter pandas dataframes by multiple columns
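
As a rough sketch of the groupby idea, something like the following might work. The column names (Market, Product Group, Price Type, Price) and all the sample values are assumptions based on the question's description, since the actual data was only posted as an image. It uses groupby with transform("median") so that every row receives the median of its own group, and then compares each price against that per-row threshold.

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumed,
# not taken from the asker's actual DataFrame.
df = pd.DataFrame({
    "Market": ["GBR", "GBR", "GBR", "USA", "USA"],
    "Product Group": ["Toys", "Toys", "Toys", "Books", "Books"],
    "Price Type": ["Low ASP", "Low ASP", "Low ASP", "High ASP", "High ASP"],
    "Price": [10.0, 20.0, 60.0, 100.0, 300.0],
})

group_cols = ["Market", "Product Group", "Price Type"]

# transform("median") returns a Series aligned with df's index, where each
# row holds the median Price of the group that row belongs to.
df["Group Median"] = df.groupby(group_cols)["Price"].transform("median")

# Row-wise comparison of each price against its own group's threshold.
df["Above Median"] = df["Price"] > df["Group Median"]

print(df)
```

If you need a different threshold, such as a quantile, you could pass a function to transform instead, for example `lambda s: s.quantile(0.9)`.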

Joe Wolf