I have a dataframe whose first 5 rows look like this:
userID CategoryID sectorID
agunii2035 [16, 17, 3, 12, 1] [2, 33, 29, 18, 23]
agunii3007 [2, 4, 6, 3, 16] [4, 15, 29, 10, 18]
agunii2006 [8, 16, 2, 5, 12] [38, 18, 7, 36, 33]
agunii2003 [6, 4, 2, 5, 17] [37, 12, 3, 32, 34]
agunii3000 [12, 11, 7, 3, 1] [38, 1, 13, 25, 3]
Now, for any userID (let's say userID = 'agunii2035'), I want to get the userIDs whose CategoryID or sectorID lists share at least one value with that user's lists. For example, since agunii2035 and agunii3007 have at least one common CategoryID (e.g. 16) and at least one common sectorID (e.g. 29), agunii3007 counts as a match for agunii2035.
The output can be a dataframe that looks like this:
userID user_with_common_cat/sectorID
agunii2035 {agunii3007, agunii2006, agunii2003, agunii3000}
agunii3007 {agunii2035, agunii2006, agunii2003, agunii3000}
and so on,
or it can also use lists:
userID user_with_common_cat/sectorID
agunii2035 [agunii3007, agunii2006, agunii2003, agunii3000]
agunii3007 [agunii2035, agunii2006, agunii2003, agunii3000]
and so on.
Any help on this please?
Edit
What I have done so far:
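userID = 'agunii2035'
# Assumption: uniqueUsers holds every distinct userID in the frame
uniqueUsers = df_interest['userID'].unique()
# Categories of the reference user, looked up once instead of on every iteration
user_cats = set(df_interest.loc[df_interest['userID'] == userID, 'categoryID'].iloc[0])
common_users = []
for user in uniqueUsers:
    if user == userID:
        continue  # do not match the user against itself
    common = user_cats.intersection(df_interest.loc[df_interest['userID'] == user, 'categoryID'].iloc[0])
    if len(common) > 0:
        common_users.append(user)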
I want to do the same for sectors, so that a user is appended to the common_users list if either the category intersection or the sector intersection contains at least one element.
I also want to do this for all the users, not just one.
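To make the target concrete, here is a minimal sketch of the all-users version, not a definitive solution. It assumes the list columns are named 'categoryID' and 'sectorID' as in my loop (the table header above spells the first one 'CategoryID'), and it rebuilds the five sample rows so it runs on its own. The helper name users_with_common is made up for illustration.

```python
import pandas as pd

# Sample rows from the question; the column names are an assumption and
# should be adjusted to match the real df_interest.
df_interest = pd.DataFrame({
    'userID': ['agunii2035', 'agunii3007', 'agunii2006', 'agunii2003', 'agunii3000'],
    'categoryID': [[16, 17, 3, 12, 1], [2, 4, 6, 3, 16], [8, 16, 2, 5, 12],
                   [6, 4, 2, 5, 17], [12, 11, 7, 3, 1]],
    'sectorID': [[2, 33, 29, 18, 23], [4, 15, 29, 10, 18], [38, 18, 7, 36, 33],
                 [37, 12, 3, 32, 34], [38, 1, 13, 25, 3]],
})


def users_with_common(df):
    """For each user, list the other users sharing a category or a sector."""
    # Convert each list to a set once so every pairwise check is cheap.
    cats = dict(zip(df['userID'], df['categoryID'].apply(set)))
    secs = dict(zip(df['userID'], df['sectorID'].apply(set)))
    rows = []
    for user in df['userID']:
        matches = [
            other for other in df['userID']
            # Skip the user itself; a non-empty intersection on either column counts.
            if other != user
            and (cats[user] & cats[other] or secs[user] & secs[other])
        ]
        rows.append({'userID': user, 'user_with_common_cat/sectorID': matches})
    return pd.DataFrame(rows)


result = users_with_common(df_interest)
print(result)
```

On these five sample rows every user shares at least one category or sector with each of the other four. The loop compares every pair of users, so the work grows quadratically with the number of users; that is fine for a small frame, but it may be too slow for a large one.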