
I have the following dataframe:

     coupon_type     dish_id  dish_name   dish_price  dish_quantity
0     Rs 20 off       012      Sandwich     65            2
1     Rs 20 off       013       Chicken     125           3
2     Rs 20 off       013       Chicken     125           3
3     Rs 20 off       013       Chicken     125           3

    ratings    reviews      coupon_type  user_id order_id  meals order_area
    4     blah blah blah   Rs 20 off      9       9         5     London
    4     blah blah blah   Rs 20 off      9       9         5     London
    3     blah blah blah   Rs 20 off      9       9         5     London
    4     blah blah blah   Rs 20 off      9       9         5     London  

I am doing a groupby on the dish_name column:

df_dish_name = df_final.groupby('dish_name')

Then I perform some ratio operations on the groups, which gives me the following pandas Series, stored in dish_specific_perf:

dish_name
Chicken       45.000000
Sandwich      61.111111

Then I check a condition in an if statement:

if((dish_specific_perf < 50).any() == True):

If the condition is true, I want to add the string "NP" for the corresponding dish name in the dataframe, so the dataframe should look like this:

     coupon_type     dish_id  dish_name   dish_price  dish_quantity
0     Rs 20 off       012      Sandwich     65            2
1     Rs 20 off       013       Chicken     125           3
2     Rs 20 off       013       Chicken     125           3
3     Rs 20 off       013       Chicken     125           3

    ratings    reviews      coupon_type  user_id order_id  meals order_area
    4     blah blah blah   Rs 20 off      9       9         5     London
    4     blah blah blah   Rs 20 off      9       9         5     London
    3     blah blah blah   Rs 20 off      9       9         5     London
    4     blah blah blah   Rs 20 off      9       9         5     London  

  Flag
  Null
  NP
  NP
  NP

The problem is: how do I compare the series elements with the dataframe's dish_name column to check whether Chicken exists or not?

When I do

dish_specific_perf[0]  

it just gives me the number 45.

Please help..

Neil
  • IIUC, you can do `df_final['Flag'] = df_final['dish_name'].map(dish_specific_perf < 50)`; this sets the flag to `True` or `False`. You can then set these to NP/Null as desired: `df_final['Flag'] = np.where(df_final['Flag'], 'NP', 'Null')` – EdChum Dec 11 '15 at 10:05
  • @EdChum Exactly what I am looking for. Thanks a lot. ;) – Neil Dec 11 '15 at 10:05

2 Answers


Essentially you are looking to do a lookup. For that we can use map on the boolean series, so the following will add a boolean flag:

df_final['Flag'] = df_final['dish_name'].map(dish_specific_perf < 50)

This works by looking up each value of df_final['dish_name'] in the index of the boolean series and returning the corresponding series value.

You can then convert the boolean values to your desired flag:

df_final['Flag'] = np.where(df_final['Flag'], 'NP', 'Null')
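
As a minimal sketch of the two steps above, assuming a cut-down df_final that keeps only the dish_name and dish_price columns from the question, plus the dish_specific_perf values shown there (the name is_low is introduced here just for illustration):

```python
import numpy as np
import pandas as pd

# Cut-down version of the question's dataframe (only two columns kept).
df_final = pd.DataFrame({
    "dish_name": ["Sandwich", "Chicken", "Chicken", "Chicken"],
    "dish_price": [65, 125, 125, 125],
})

# The per-dish ratios from the question, indexed by dish name.
dish_specific_perf = pd.Series({"Chicken": 45.0, "Sandwich": 61.111111})

# Boolean series indexed by dish name: True where the ratio is below 50.
is_low = dish_specific_perf < 50

# map() looks each dish_name up in is_low's index; np.where turns the
# resulting booleans into the desired labels.
df_final["Flag"] = np.where(df_final["dish_name"].map(is_low), "NP", "Null")

print(df_final)
```

Note that this assumes every dish_name in df_final appears in the series index; a dish missing from it would map to NaN and need separate handling.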
EdChum
  • Can we have two conditions in map? `df_final['dish_name'].map(dish_specific_perf > 50 & dish_specific_perf < 70)` – Neil Dec 11 '15 at 10:14
  • Yes, that'll work, but you need to wrap the conditions in parentheses because of operator precedence: `df_final['dish_name'].map((dish_specific_perf > 50) & (dish_specific_perf < 70))` – EdChum Dec 11 '15 at 10:15
  • I am trying with conditions on 2 different series but it's not giving me results: `df_final['dish_name'].map((dish_relative_perf < 10) & (dish_specific_perf < 50))`. dish_relative_perf is one series and dish_specific_perf is another series. It only applies the first condition, not both. – Neil Dec 11 '15 at 15:37
  • That won't work unless the 2 series have matching indices – EdChum Dec 11 '15 at 15:56
  • You'll need to post a new question with your input data, code to reproduce your dfs, and your desired output – EdChum Dec 11 '15 at 16:09
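
To illustrate the point about matching indices, here is a sketch that reindexes both boolean series onto a shared index before combining them. The values in dish_relative_perf are made up, since the thread never shows them, and treating a dish that is missing from either series as not meeting that condition is an assumption you may want to change.

```python
import numpy as np
import pandas as pd

df_final = pd.DataFrame(
    {"dish_name": ["Sandwich", "Chicken", "Chicken", "Chicken"]}
)

# Ratios from the question.
dish_specific_perf = pd.Series({"Chicken": 45.0, "Sandwich": 61.111111})
# Hypothetical second series with a different index (no Sandwich, extra Pizza).
dish_relative_perf = pd.Series({"Chicken": 5.0, "Pizza": 8.0})

# Build one shared index, then put both conditions on it.
# Dishes absent from a series are filled with False (an assumption).
common = dish_specific_perf.index.union(dish_relative_perf.index)
spec_low = (dish_specific_perf < 50).reindex(common, fill_value=False)
rel_low = (dish_relative_perf < 10).reindex(common, fill_value=False)

# Both series now share the same index, so & combines them label by label.
both_low = spec_low & rel_low

df_final["Flag"] = np.where(df_final["dish_name"].map(both_low), "NP", "Null")
print(df_final)
```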

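To begin with, your if statement does not do what you need: `.any()` is true as soon as at least one dish is below 50, so it cannot tell you which dishes to flag. I would do the whole thing in a loop over the groups, like so: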

for name, group in df_dish_name:
    # ... compute `ratio` for this group here (your existing calculation) ...

    if ratio < 50:
        df_final.loc[df_final['dish_name']==name, 'Flag'] = 'NP'

This will avoid indexing and selecting multiple times, and is easier to maintain.
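
A runnable sketch of this loop, under stated assumptions: the question never shows how the ratio is computed, so the helper hypothetical_ratio below (percentage of ratings of 4 or more) is only a stand-in, and the ratings are toy values chosen so that one dish falls below 50.

```python
import pandas as pd

# Toy data: dish names as in the question, ratings made up for illustration.
df_final = pd.DataFrame({
    "dish_name": ["Sandwich", "Chicken", "Chicken", "Chicken"],
    "ratings": [4, 4, 3, 2],
})

def hypothetical_ratio(group):
    """Stand-in for the real ratio: percentage of ratings >= 4 in the group."""
    return (group["ratings"] >= 4).mean() * 100

# Start with every row unflagged, then flag whole groups that fall below 50.
df_final["Flag"] = "Null"
for name, group in df_final.groupby("dish_name"):
    ratio = hypothetical_ratio(group)
    if ratio < 50:
        df_final.loc[df_final["dish_name"] == name, "Flag"] = "NP"

print(df_final)
```

Because the condition is evaluated once per group, every row of a flagged dish gets "NP", which matches the desired output in the question.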

Kartik
  • It's setting all the corresponding rows to NP. I want only those rows set to NP for which the if condition is true. – Neil Dec 11 '15 at 15:52
  • From your question I guessed you do want to set all rows where the group aggregate ratio is less than 50. – Kartik Dec 11 '15 at 18:43