I have a pandas DataFrame as shown below:
Company,year
T123 Inc Ltd,1990
T124 PVT ltd,1991
ABC Limited,1992
ABCDE Ltd,1994
import pandas as pd
tf = pd.read_clipboard(sep=',')
tf['Company_copy'] = tf['Company']
I would like to compare each value of tf['Company'] against each value of tf['Company_copy'], but exclude pairs with the same row number, index, or string.
For example, I want T123 Inc Ltd to be compared with the remaining 3 items. Similarly, I want ABCDE Ltd to be compared only with the remaining 3 items.
So I tried the following, with the help of this post:
compare = pd.MultiIndex.from_product([tf['Company'].astype(str),tf['Company_copy'].astype(str)]).to_series()
but it produces some incorrect comparisons: each company is also paired with itself. I want to avoid these duplicate comparisons.
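To make the problem concrete, here is a minimal sketch that counts the unwanted pairs. It assumes the four rows above are loaded from an inline string instead of the clipboard; `csv_text` and `self_pairs` are names introduced only for this illustration.

```python
import io
import pandas as pd

# Same four rows as above, read from a string instead of the clipboard
csv_text = """Company,year
T123 Inc Ltd,1990
T124 PVT ltd,1991
ABC Limited,1992
ABCDE Ltd,1994
"""
tf = pd.read_csv(io.StringIO(csv_text))
tf['Company_copy'] = tf['Company']

# Full Cartesian product: every company paired with every company
compare = pd.MultiIndex.from_product(
    [tf['Company'].astype(str), tf['Company_copy'].astype(str)]
).to_series()

# Pairs whose two members are identical, i.e. a row compared with itself
self_pairs = [pair for pair in compare.index if pair[0] == pair[1]]
print(len(compare), len(self_pairs))
```

With 4 rows the product has 4 x 4 pairs, and one pair per row is the row matched with itself.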
I expect my output to look like the one below. You can see that it has no duplicate/same-row comparisons.
Company       Company_copy
T123 Inc Ltd  T124 PVT ltd    (T123 Inc Ltd, T124 PVT ltd)
              ABC Limited     (T123 Inc Ltd, ABC Limited)
              ABCDE Ltd       (T123 Inc Ltd, ABCDE Ltd)
T124 PVT ltd  T123 Inc Ltd    (T124 PVT ltd, T123 Inc Ltd)
              ABC Limited     (T124 PVT ltd, ABC Limited)
              ABCDE Ltd       (T124 PVT ltd, ABCDE Ltd)
ABC Limited   T123 Inc Ltd    (ABC Limited, T123 Inc Ltd)
              T124 PVT ltd    (ABC Limited, T124 PVT ltd)
              ABCDE Ltd       (ABC Limited, ABCDE Ltd)
ABCDE Ltd     T123 Inc Ltd    (ABCDE Ltd, T123 Inc Ltd)
              T124 PVT ltd    (ABCDE Ltd, T124 PVT ltd)
              ABC Limited     (ABCDE Ltd, ABC Limited)
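For reference, one possible way to get output of this shape is sketched below. It is not taken from the linked post: it swaps the Cartesian product for `itertools.permutations`, which pairs elements by position and therefore never pairs a row with itself. The inline `csv_text` and the `names` and `pairs` variables are assumptions made for the illustration.

```python
import io
from itertools import permutations

import pandas as pd

# Same four rows as above, read from a string instead of the clipboard
csv_text = """Company,year
T123 Inc Ltd,1990
T124 PVT ltd,1991
ABC Limited,1992
ABCDE Ltd,1994
"""
tf = pd.read_csv(io.StringIO(csv_text))

names = tf['Company'].astype(str).tolist()

# Ordered pairs of two different row positions; same-row pairs never occur
pairs = list(permutations(names, 2))

# Build the same MultiIndex-backed Series of tuples as the original attempt
compare = pd.MultiIndex.from_tuples(
    pairs, names=['Company', 'Company_copy']
).to_series()
print(compare)
```

Because `permutations` works on positions rather than values, two different rows that happen to hold the same company name would still be compared with each other; filtering on string equality instead would drop those pairs as well.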