I would like to count how many users have rated a specific movieId. I have tried using pandas `iloc`, but the result is not what I expected. The expected output is as follows:

For example, using the MovieLens data set, suppose movieId 302 has been rated by 10 userIds in total. I want to get that count of 10 for this movie.

The data is in a DataFrame. Which method should I try to get the expected result? I would truly appreciate the chance to learn from you. Thank you.

!wget "http://files.grouplens.org/datasets/movielens/ml-100k.zip"
!unzip ml-100k.zip
!ls

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("ml-100k/u.data", sep='\t',names="userId,movieId,rating,timestamp".split(",")) 
data
Yeo Keat
  • [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question). Instead, post a sample 5-line DataFrame that reproduces your question so users can try it and work toward a solution, and also post the expected DataFrame result. – anky Mar 08 '20 at 17:31
  • Does this answer cover what you're looking for? https://stackoverflow.com/questions/41415017/count-unique-values-using-pandas-groupby/45091077 – LTheriault Mar 08 '20 at 17:40

1 Answer

Assuming that a single user can't rate the same movie twice, you could start with the following (where `df` is your DataFrame, called `data` in the question):

df.groupby('movieId')['userId'].count().reset_index(name='userIdCount')

(The `reset_index()` call turns the result back into a DataFrame.)

You would then have:

    movieId userIdCount
0   1       5
1   2       1
2   3       2
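To get the number for one particular movie, as the question asks, the grouped result can be looked up by movieId, or the frame can be filtered directly. Here is a minimal sketch, assuming a small made-up frame (the values below are illustrative stand-ins, not real MovieLens rows):

```python
import pandas as pd

# Hypothetical toy ratings standing in for the MovieLens frame
df = pd.DataFrame({
    "userId":  [1, 2, 3, 4, 5, 1, 2, 3],
    "movieId": [1, 1, 1, 1, 1, 2, 3, 3],
    "rating":  [5, 4, 3, 4, 2, 3, 5, 1],
})

# Series indexed by movieId: number of rating rows per movie
counts = df.groupby("movieId")["userId"].count()

# Option 1: look up a single movie in the grouped result
n_for_movie_3 = counts.loc[3]

# Option 2: skip the groupby and count the matching rows directly
n_direct = (df["movieId"] == 3).sum()
```

Both options give the same number as long as each user rates a movie at most once. The direct filter is enough if you only ever need one movie.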

If a userId might appear more than once for the same movie and you want each user counted only once, you can use `nunique()` instead:

df.groupby('movieId')['userId'].nunique().reset_index(name='userIdCount')
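The difference between `count()` and `nunique()` only shows up when a user appears more than once for the same movie. A small sketch, assuming hypothetical toy data in which user 1 has two rows for movie 10:

```python
import pandas as pd

# Hypothetical toy data: user 1 appears twice for movie 10
df = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [10, 10, 10, 20],
})

# count() counts rating rows, so the duplicate is counted twice
row_counts = df.groupby("movieId")["userId"].count()

# nunique() counts distinct users, so the duplicate is counted once
unique_counts = df.groupby("movieId")["userId"].nunique()
```

Here `row_counts` gives 3 for movie 10 while `unique_counts` gives 2. For movie 20 both give 1.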
Gorlomi
  • Thank you Gorlomi, your suggestion is great! How did you come up with that? I spent a few hours looking into it and still had no idea how to solve it, but you got it right within an hour. That is something I should learn. Thank you so much for your help. I will study `groupby` more and work through the solutions. – Yeo Keat Mar 09 '20 at 05:10
  • Hi, this is how you learn! From now on you'll be aware of this tool, and as you face new situations you'll find other tools, and after a while it will come naturally. – Gorlomi Mar 09 '20 at 09:51