Simple Example
I have created a sample.xlsx file to try mimic your case below. "Sheet1" has a column product and so does "Sheet2".
The only difference between the two columns is record 2 and 5 (starting on a zero based index system) which you want to identify (in terms of there index).
Sheet1
Product|
WM001 |
WM00X |
WM001 |
WM001 |
WM00X |
WM001 |
Sheet2
Product|
WM001 |
WM00X |
WM00X |
WM001 |
WM00X |
WM00X |
Pandas allows you to compare columns directly and create a mask (of truth values) which we can use to filter the rows you want.
>>> import pandas as pd
>>> sht1 = pd.read_excel('sample.xlsx', sheet_name='Sheet1')
>>> sht2 = pd.read_excel('sample.xlsx', sheet_name='Sheet2')
# Get the index's of those records that don't match in each sheet
>>> mask = sht1[sht1 != sht2].dropna().index
# Identify those discrepencies in both sheets.
>>> sht1.loc[mask,:]
>>> sht2.loc[mask,:]
Note
The last lines will print the index of those records and the values in the Product column. From there you should be able to identify the records.
If you want to do any inplace changes to the sheet with the index value I recommend using .iat[]
in pandas https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iat.html.
EDIT:
To deal with records with different lengths, assuming the indexes match up to the point of the sheet with the least number of rows. Obviously if sheet1 has 50 rows and sheet2 has 70 rows, you can only compare the first 50 rows between the both.
>>> import pandas as pd
>>> sht1 = pd.read_excel('sample.xlsx', sheet_name='Sheet1')
>>> sht2 = pd.read_excel('sample.xlsx', sheet_name='Sheet2')
>>> min_length = min(len(sht1), len(sht2))
>>> sht1_subset = sht1.loc[:min_length,'Product']
>>> sht2_subset = sht2.loc[:min_length,'Product']
# Get the Truth values of those records that don't match on product column in each sheet
>>> mask = (sht1_subset != sht2_subset)
# Identify those discrepencies in both sheets using `mask`
>>> sht1.where(mask).dropna()
>>> sht2.where(mask).dropna()
Print out mask
to see what it returns, should be a series of True False values representing the expression of those records in Product
field that are not equal.