
I need to find all rows with duplicate string values in the "Name" column, and then check whether two numerical values in the "Amount" column sum to a third value that is also in the "Amount" column. How can I do this for an Excel tab in Pandas (Python)? The workbook has two tabs; I'm referring to the second tab, called "Table2".

For example, in the table below, there are several duplicates in the "Name" column. For the "Richard Madden" duplicates, the corresponding values in the "Amount" column, (-4000) + (-6000), equal (-10000). I need to delete the entire rows for -4000 and -6000 and keep the row for -10000.

[Image: example table with "Name" and "Amount" columns]

Here's my code so far:

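import pandas as pd

# Read the second tab, "Table2", from the workbook
df1 = pd.read_excel('/Users/user/Downloads/DSR-Table.xlsx', sheet_name='Table2')

# Keep every row whose "Name" occurs more than once.
# keep=False flags all copies, not only the second and later ones.
duplicateRowsDF = df1[df1.duplicated(subset='Name', keep=False)]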
  • Does this answer your question? [Pandas group-by and sum](https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum) – Nick ODell Jan 13 '22 at 23:30
  • No. It showed how to apply the filter then sum up all the numbers. I’m trying to find out if two numbers sum up to a third number within the same column after sorting by duplicate “Names.” If yes, then I need to delete those two numbers and keep the third number that happens to be their sum. Thanks. – itsmenick212 Jan 15 '22 at 01:36
  • Got it. My bad. I'll retract the close vote. – Nick ODell Jan 15 '22 at 01:37
  • No problem, Nick! Thank you! – itsmenick212 Jan 15 '22 at 01:40
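
One possible way to approach the check described in the question is sketched below. It is not a confirmed solution, only a minimal illustration under a few assumptions: the DataFrame has a unique index (as `pd.read_excel` produces by default), the amounts can be compared exactly (for floats, `numpy.isclose` may be a better choice), and the column names are "Name" and "Amount". The function name `drop_summand_rows` and the "Jane Doe" rows are hypothetical and only serve as sample data.

```python
from itertools import combinations

import pandas as pd


def drop_summand_rows(df, name_col="Name", amount_col="Amount"):
    """Within each name, drop pairs of rows whose amounts add up to a third row's amount."""
    to_drop = set()
    for _, group in df.groupby(name_col):
        if len(group) < 3:
            continue  # two addends plus a separate total row are needed
        for i, j in combinations(group.index, 2):
            if i in to_drop or j in to_drop:
                continue  # each row is removed at most once
            total = group.at[i, amount_col] + group.at[j, amount_col]
            # Candidate total rows: same name, not the pair itself, not already dropped
            others = group.drop(index=[i, j])
            others = others[~others.index.isin(list(to_drop))]
            if (others[amount_col] == total).any():
                to_drop.update([i, j])
    return df.drop(index=list(to_drop))


# Hypothetical sample data mirroring the example in the question
sample = pd.DataFrame({
    "Name": ["Richard Madden", "Richard Madden", "Richard Madden", "Jane Doe", "Jane Doe"],
    "Amount": [-4000, -6000, -10000, 250, 250],
})
result = drop_summand_rows(sample)
print(result)
```

With the real workbook, the same function could be called on the DataFrame read from "Table2". Checking every pair is quadratic in the number of rows per name, which is usually fine for a handful of duplicates but may be slow for very large groups.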
