0

I have a problem where I need to find the maximum value related to a string. The data is stored in excel. I understand I need to use openpyxl and possibly pandas. Example below:

Name Value
x 10
x 5
x 1
y 40
y 11
y 101
z 30
z 31

I am looking for the following output:

Name Value
x 10
y 101
z 31

Here is the beginning of my code:

import openpyxl
wb = openpyxl.load_workbook("Book1.xlsx")
sh1=wb['Sheet1']

import pandas as pd
excel_file= pd.ExcelFile(r"C:\Users\PycharmProjects\pythonProject\Book1.xlsx")
def = excel_file.parse(excel_file.sheet_nams[0])

Any help would be much appreciated! :)

user11427018
  • 129
  • 1
  • 6

2 Answers2

0

Try with

out = df.groupby('Name').max().reset_index()

Or

out = df.sort_values('Value').drop_duplicates('Name',keep='last').sort_index()
  Name  Value
0    x     10
5    y    101
7    z     31
BENY
  • 317,841
  • 20
  • 164
  • 234
0

Pandas can do all of this natively. By adding in @BENY's answer, we can do it easily.

import pandas as pd
grp = pd.read_excel("C:\Users\PycharmProjects\pythonProject\Book1.xlsx").groupby('Name').max().reset_index()

for i in grp:
    print(i)
jameshgrn
  • 26
  • 3