
In GraphLab, I am working with a small subset of movies from a larger list.

  movieIds_5K_np = LL_features_SCD_min.to_numpy()[:,0]
  ratings_33K_np = ratings_33K.to_numpy()

`movieIds_5K_np` is an array containing my movie ids. `ratings_33K_np` is an array with four columns whose second column contains the movie ids for all movies.

I need to select only the rows in `ratings_33K_np` whose movie id exists in `movieIds_5K_np`.

I tried this approach, but it doesn't seem to work:

 ratings_5K_np = ratings_33K_np[ratings_33K_np[:,2]==movieIds_5K_np] 

How can I do this in GraphLab or with a Python library? I should add that ratings_33K and movieIds_5K were originally imported as SFrames.

Thanks

iulian
Yas

1 Answer

Given that you have two SFrames, you can do a join, like so:

ratings_5K = LL_features_SCD_min[['id_column_name']].join(ratings_33K, on='id_column_name', how='left')

As far as I understand from your code, LL_features_SCD_min is the SFrame corresponding to your subset (the 5K data). So you take just the id column from it and left join it with the entire dataset, which gives you a new SFrame containing only the ids you wanted. Substitute your own id column name for 'id_column_name'; note that the column must have the same name in both SFrames when passed as a single string to `on`.

For more information on how joins work in GraphLab, see the SFrame documentation.
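If you would rather stay with the NumPy arrays from the question, a boolean mask built with `np.isin` is one possible way to do the same filtering without a join. This is only a minimal sketch on made-up data: the column layout is an assumption, and it takes the movie ids to be in the second column (index 1), as described in the question's text.

```python
import numpy as np

# Hypothetical stand-ins for the arrays in the question.
# Assumed column layout: userId, movieId, rating, timestamp.
ratings_33K_np = np.array([
    [1, 10, 4.0, 1000],
    [1, 20, 3.5, 1001],
    [2, 30, 5.0, 1002],
    [3, 10, 2.0, 1003],
])
movieIds_5K_np = np.array([10, 30])

# `==` compares element by element, so it cannot express
# "is this id one of the ids in the subset". np.isin tests membership.
mask = np.isin(ratings_33K_np[:, 1], movieIds_5K_np)

# Keep only the rows whose movie id appears in the subset.
ratings_5K_np = ratings_33K_np[mask]
```

The mask has one boolean per row of `ratings_33K_np`, so indexing with it keeps whole rows and preserves their original order.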

Good luck!

iulian
  • Thanks, it worked pretty well. Could you explain the difference between 'left' and 'inner'? I would have used an 'inner' join in this case; is there a reason you suggested 'left'? – Yas Mar 27 '16 at 18:02
  • A `left` join ensures you will have all rows from the `LL_features` sframe in your result (with null values where there is no matching row in the `ratings_33K` sframe), while an `inner` join returns only the rows present in both `sframe`s, dropping rows from the `LL_features` sframe that have no match in the `ratings_33K` sframe. Depending on what you need, you might use one or the other. For a visual explanation of the types of `join`, see [this](http://stackoverflow.com/questions/5706437/whats-the-difference-between-inner-join-left-join-right-join-and-full-join) answer – iulian Mar 27 '16 at 18:09
  • Amazing explanation and link. Thanks a lot – Yas Mar 27 '16 at 18:11
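
The difference between `left` and `inner` raised in the comments can be illustrated without GraphLab. The following is a plain-Python sketch on made-up rows; `inner_join` and `left_join` are hypothetical helpers written for illustration, not part of the SFrame API.

```python
# Hypothetical data: ids in the subset, and (movie_id, rating) rows.
subset_ids = [10, 30, 40]  # id 40 has no ratings at all
ratings = [(10, 4.0), (20, 3.5), (30, 5.0), (10, 2.0)]

def inner_join(ids, rows):
    """Keep only ids that have at least one matching row."""
    return [(i, r) for i in ids for (m, r) in rows if m == i]

def left_join(ids, rows):
    """Keep every id; ids without a match get a None rating."""
    out = []
    for i in ids:
        matches = [(i, r) for (m, r) in rows if m == i]
        out.extend(matches if matches else [(i, None)])
    return out

inner = inner_join(subset_ids, ratings)
left = left_join(subset_ids, ratings)
```

Here `left` contains everything in `inner` plus one extra row for id 40 with a missing rating, which is the case where the two join types differ.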