I'm working on an app that finds and highlights duplicates in Excel files. I have the following code, which currently just adds --> where there are duplicates.

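import pandas as pd
import numpy as np
from openpyxl import Workbook

# Load the two workbooks to compare (assumed to have the same shape).
dftest1 = pd.read_excel('files/test1.xlsx')
dftest2 = pd.read_excel('files/test2.xlsx')

# Boolean mask of the cells that hold the same value in both files.
comparevalues = dftest1.values == dftest2.values
rows, cols = np.where(comparevalues)
for row, col in zip(rows, cols):
    # Mark each duplicate as "value --> value".
    dftest1.iloc[row, col] = '{} --> {}'.format(dftest1.iloc[row, col], dftest2.iloc[row, col])

# Write the result once, after all duplicates have been marked.
dftest1.to_excel('./files/output.xlsx', index=True, header=False)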

I tried using fill but got the error that fill is read-only. How can I use styles or fill to highlight duplicates?

Eperanza
  • You could apply a color for a selection of cells based on conditions: See https://stackoverflow.com/a/73209396/13604396 or [Styling](https://pandas.pydata.org/pandas-docs/version/1.1.5/user_guide/style.html) from the `pandas` docs. – Confused Learner Aug 03 '22 at 12:06
  • Question: Excel is known to be capable of highlighting color by itself, so why are you using pandas for such a task? Now, if you want to make a visualization in pandas while displaying it in a Jupyter notebook and so on, that is another question. – INGl0R1AM0R1 Aug 03 '22 at 14:33
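
For reference, here is a rough sketch of how the fill approach hinted at in the first comment might look with openpyxl. It is not a confirmed answer. It assumes both frames have the same shape, that the "read-only" error came from trying to modify an existing fill in place rather than assigning a new `PatternFill`, and that the sheet is written without index or header so DataFrame positions map directly onto cells. The helper name `write_with_highlights` and the yellow colour are made up for illustration.

```python
import numpy as np
import pandas as pd
from openpyxl.styles import PatternFill

# Arbitrary highlight colour (solid yellow), chosen only for this sketch.
HIGHLIGHT = PatternFill(start_color="FFFF00", end_color="FFFF00", fill_type="solid")


def write_with_highlights(df1, df2, path):
    """Write df1 to `path` and fill every cell whose value equals df2's.

    Hypothetical helper; assumes df1 and df2 have identical shapes.
    """
    matches = df1.values == df2.values
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        # No index or header, so DataFrame position (0, 0) lands in cell A1.
        df1.to_excel(writer, index=False, header=False)
        sheet = writer.sheets["Sheet1"]  # default sheet name used by to_excel
        for row, col in zip(*np.where(matches)):
            # openpyxl rows and columns are 1-based.
            # Assign a whole fill object instead of editing the existing one.
            sheet.cell(row=int(row) + 1, column=int(col) + 1).fill = HIGHLIGHT
    return matches
```

With the files from the question, this could be called as `write_with_highlights(dftest1, dftest2, './files/output.xlsx')`. If the index or header is written as well, the row and column offsets would need to be adjusted accordingly.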

0 Answers