I want to find duplicate items across 2 columns in Excel, i.e. the rows where both columns hold the same value. For example, my Excel sheet consists of:
list_A list_B
0 ideal ideal
1 brown colour
2 blue blew
3 red red
I checked the pandas documentation and tried the duplicated method, but I don't understand why it keeps printing "Empty DataFrame". It finds both columns, and I assume it iterates over them, so why doesn't it find the values and compare them?
I also tried using iterrows, but honestly I don't know how to implement it.
When I run the code below, I get this output:
Empty DataFrame
Columns: [list_A, list_B]
Index: []
import pandas as pd
pt = pd.read_excel(r"C:\Users\S531\Desktop\pt.xlsx")
dfObj = pd.DataFrame(pt)
doubles = dfObj[dfObj.duplicated()]
print(doubles)
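A minimal sketch of why the result is empty, assuming the sheet loads into a DataFrame with exactly the values shown in the table above (the in-memory `df` here stands in for the Excel file). `duplicated()` compares each whole row against earlier rows; it does not compare the columns within a row, and no row in this data repeats another one.

```python
import pandas as pd

# Stand-in for the Excel sheet: same values as the example table.
df = pd.DataFrame({
    "list_A": ["ideal", "brown", "blue", "red"],
    "list_B": ["ideal", "colour", "blew", "red"],
})

# duplicated() marks a row only if an identical row appeared earlier.
row_repeats = df.duplicated()
print(row_repeats.tolist())   # [False, False, False, False]

# No row repeats another row, so the boolean filter keeps nothing.
print(df[row_repeats].empty)  # True
```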
The output I'm looking for is:
list_A list_B
0 ideal ideal
3 red red
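Solved: duplicated() only flags rows that repeat an earlier row, it does not compare the columns within a row, so it found nothing here. Comparing the two columns directly works. The final code looks like this: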
import pandas as pd
pt = pd.read_excel(r"C:\Users\S531\Desktop\pt.xlsx")
doubles = pt[pt['list_A'] == pt['list_B']]
print(doubles)
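For anyone who wants to try this without the Excel file, here is a self-contained sketch of the same comparison on an in-memory DataFrame (an assumption standing in for `pt.xlsx`, using the values from the example table). It also shows one possible `iterrows` version, since that was attempted in the question; the names `same`, `matches` and `same_loop` are only illustrative.

```python
import pandas as pd

# Stand-in for the Excel sheet: same values as the example table.
df = pd.DataFrame({
    "list_A": ["ideal", "brown", "blue", "red"],
    "list_B": ["ideal", "colour", "blew", "red"],
})

# Vectorised: element-wise comparison gives a boolean mask, one entry per row.
same = df[df["list_A"] == df["list_B"]]
print(same.index.tolist())       # [0, 3]

# iterrows equivalent: collect the index labels of rows whose columns match.
matches = [idx for idx, row in df.iterrows() if row["list_A"] == row["list_B"]]
same_loop = df.loc[matches]
print(same_loop.index.tolist())  # [0, 3]
```

The vectorised comparison is usually the better choice, since `iterrows` walks the frame row by row in Python and gets slow on large sheets.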