0

I'm new to Python. I have a dataframe and a datatable to show it in a GUI. enter image description here

Step 1: For every couple of unique values in column A and B, I need to search for the maximum value in column 'C' I managed to do it with groupby (see code) In this picture, you can see the result of Groupby, with the max values in column C.

enter image description here

Step2 For every couple of unique values in column A and B, I need to highlight the corresponding max value in column C. The result should be like this: enter image description here

Can somebody help me, please?

import tkinter as tk
import pandas as pd
from pandastable import Table

df = pd.DataFrame({
    'A': ['alfa','beta','alfa','alfa','beta','beta'],
    'B': ['a','a','a','b','a','a'],
    'C': [1,2,3,4,7,6],
})

root = tk.Tk()

frame1 = tk.Frame(root)
frame1.pack()

pt = Table(frame1, dataframe=df) 
pt.show()

print(df.loc[df.groupby(["A", "B"])["C"].idxmax()] )

pt.setColorByMask( 'C', pt.model.df.iloc[:, 2] == pt.model.df.iloc[:, 2].max(),  'lightgreen'  )

root.mainloop()
Chakot
  • 39
  • 4
  • for step 2: do you mean you want to highlight the entire row that max value appears? if so, you might want to look at https://stackoverflow.com/questions/43596579/how-to-use-pandas-stylers-for-coloring-an-entire-row-based-on-a-given-column – Paul Wang Sep 20 '22 at 15:14
  • For step 2, I want to highlight, for every couple of unique values in column A and B, the corresponding max value in column C. I'll edit my question because maybe is not clear. Thank you Paul – Chakot Sep 20 '22 at 15:17

1 Answers1

1

You can create a new column in df, indicating if it contains the max value. This column can then be used as a color mask in setColorByMask For example like this:

import tkinter as tk
import pandas as pd
from pandastable import Table

df = pd.DataFrame({
    'A': ['alfa','beta','alfa','gamma','beta','delta'],
    'B': ['a','b','a','b','a','b'],
    'C': [1,2,3,4,7,6],
})

root = tk.Tk()

frame1 = tk.Frame(root)
frame1.pack()

pt = Table(frame1, dataframe=df)
pt.show()

max_value_rows = df.loc[df.groupby(["A", "B"])["C"].idxmax()]

print(max_value_rows)

df['color_mask'] = False
df['color_mask'][max_value_rows.index] = True
color_mask = df['color_mask']

pt.setColorByMask( 'C', color_mask,  'lightgreen'  )

root.mainloop()

Edit: You can also create the color mask like this:

max_value_rows = df.loc[df.groupby(["A", "B"])["C"].idxmax()]

color_mask = pd.Series(len(df) * [False])
color_mask[max_value_rows.index] = True

pt.setColorByMask( 'C', color_mask,  'lightgreen'  )
AndrzejO
  • 1,502
  • 1
  • 9
  • 12
  • Andrzej, thank you for your answer: this is a first working solution. In the reality my dataframe will contain 18 columns to check and more than 20000 rows. So I would prefer to avoid to add 18 new columns. – Chakot Sep 20 '22 at 15:50
  • 1
    @Chakot I have updated my answer, you can create the color mask without changing the dataframe – AndrzejO Sep 20 '22 at 16:25