I have the following dataframe in pandas:
     id              name  categoryids  shops
5   239         Boulanger          [5]    152
3   196  Bouygues Telecom          [5]    500
4   122             Darty        [5,3]    363
1   311     Electro Dépôt          [5]     81
0  2336            Orange         [15]    578
2   194            Orange          [5]    577
I would like to drop the 5th row (index 0, Orange with categoryids [15]) because it has a duplicated name but a different value in the categoryids column. Because the values in that column are arrays (a row can have more than one category), I am having trouble comparing them.
My idea was to compute the mode of this column and discard every row whose array does not contain that value. In this case the mode would be 5, so the 5th row should be discarded because 5 is not present in its array. However, I have problems calculating the mode because each entry in the column is an array, not a single value.
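The idea above could be sketched roughly as follows. This is only a minimal sketch, assuming that categoryids holds Python lists of integers and that pandas 0.25 or later is available (for Series.explode). The names most_common, mask and filtered are made up for illustration, and the frame is rebuilt by hand from the table above.

```python
import pandas as pd

# Rebuild the example frame (assumption: categoryids holds Python lists).
df = pd.DataFrame(
    {
        "id": [239, 196, 122, 311, 2336, 194],
        "name": ["Boulanger", "Bouygues Telecom", "Darty",
                 "Electro Dépôt", "Orange", "Orange"],
        "categoryids": [[5], [5], [5, 3], [5], [15], [5]],
        "shops": [152, 500, 363, 81, 578, 577],
    },
    index=[5, 3, 4, 1, 0, 2],
)

# Flatten the lists so every category id sits on its own row,
# then take the most frequent value across the whole column.
most_common = df["categoryids"].explode().mode().iloc[0]

# Keep only the rows whose list contains that most frequent id.
mask = df["categoryids"].apply(lambda ids: most_common in ids)
filtered = df[mask]

print(filtered)
```

Note that this filters on the global mode only. It does not look at the name column at all, so a non-duplicated row whose array lacks the mode would be dropped as well.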
Any ideas or suggestions on how to do this?
I'm using Python 3.7 and the latest version of pandas.
Thank you.