Extract all rows of a specific column with respect to another column in a pandas DataFrame

Question

Apologies in advance if this question seems very easy!

Given the following small dataset as a pd.DataFrame:

    userId  movieId     rating
0   1       169         2.5
1   1       2471        3.0
2   1       48516       5.0
3   2       2571        3.5
4   2       109487      4.0
5   2       112552      5.0
6   2       112556      4.0
7   3       356         4.0
8   3       2394        4.0
9   3       2431        5.0

I would like to extract all the movieId values that each user (identified by userId) has watched. For the dataset above, I expect output something like this:

[[169, 2471, 48516], [2571, 109487, 112552, 112556], [356, 2394, 2431]]

I have written a for loop, but it gives a result different from what I expected (a single flat list instead of one list per user), and it seems extremely inefficient as the size of the dataset increases:

mv_lst = []
usrID = np.unique(test_df['userId'])
for i,v in enumerate( test_df['userId'] ):
    if v in usrID:
        mv_lst.append(test_df['movieId'][i])
print(mv_lst)
# result: [169, 2471, 48516, 2571, 109487, 112552, 112556, 356, 2394, 2431]

Is there a smarter and cleaner alternative in pandas to do this? Cheers,

Use `L = test_df.groupby('userId')['movieId'].apply(list).tolist()` - jezrael, Jun 14 '22 at 13:02
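
For illustration, here is a minimal sketch of the groupby approach suggested in the comment above. It assumes the frame is called `test_df`, as in the question, and rebuilds it from the sample rows shown there; the names `movies_per_user` and `result` are made up for this example. Note that the column is spelled `userId` in the sample data, so the key passed to `groupby` has to match that spelling exactly.

```python
import pandas as pd

# Sample data copied from the question.
test_df = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "movieId": [169, 2471, 48516, 2571, 109487, 112552, 112556, 356, 2394, 2431],
    "rating": [2.5, 3.0, 5.0, 3.5, 4.0, 5.0, 4.0, 4.0, 4.0, 5.0],
})

# Group the rows by user and collect each group's movieId values into a list.
# The result is a Series indexed by userId whose values are Python lists.
movies_per_user = test_df.groupby("userId")["movieId"].apply(list)

# Drop the index to get a plain list of lists, one inner list per user.
result = movies_per_user.tolist()
print(result)
# [[169, 2471, 48516], [2571, 109487, 112552, 112556], [356, 2394, 2431]]
```

If you also need to know which list belongs to which user, `movies_per_user.to_dict()` keeps the userId as the key instead of discarding it.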
