
I have found a few questions on SO about comparing two arrays, such as "In Python, how can I calculate correlation and statistical significance between two arrays of data?". There are also posts about scipy and k-nearest neighbors, but they too are about arrays.

Say each user has some attributes, such as age, salary, and answers to some yes/no questions, and we have 100 users. These array comparisons work for that. However, if the users have already liked certain movies, we also know which users like similar things.

Suppose that data looks like this, an array of dictionaries:

{'Movie': 'A', 'Users Like': [1,4,6,3]}
{'Movie': 'B', 'Users Like': [2,4,5]}
{'Movie': 'B', 'Users Dislike': [1,6]}

Each user also has an array of answers to a variety of questions, where the outer array index is the user number and the items in each inner array are that user's answers to the different questions:

[ [0,1,0,1,183],
  [1,1,0,1,178], ... ]

Is there a function or algorithm in Python that takes the known likes/dislikes as input and finds which questions (based on the users' answers) correlate most strongly with those likes/dislikes?
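To make the question concrete, the computation could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing library function: the helper names `pearson` and `question_correlations` are hypothetical, every answer row after the first two is made up, and users with no recorded opinion on a movie are skipped rather than coded as 0. With a binary like/dislike label, the Pearson coefficient used here is the point-biserial correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def question_correlations(answers, liked, disliked):
    """Correlate each question column with like (1) / dislike (0) labels.

    Hypothetical helper: only users who expressed an opinion are used,
    so "no opinion" is never mistaken for a dislike.
    """
    rows = [answers[u] for u in liked] + [answers[u] for u in disliked]
    labels = [1] * len(liked) + [0] * len(disliked)
    # zip(*rows) turns the per-user rows into per-question columns.
    return [pearson(list(column), labels) for column in zip(*rows)]


# Index = user number. The first two rows come from the question;
# the remaining rows are invented purely for illustration.
answers = [
    [0, 1, 0, 1, 183],
    [1, 1, 0, 1, 178],
    [0, 0, 1, 1, 165],
    [1, 0, 1, 0, 190],
    [0, 1, 1, 0, 172],
    [1, 1, 0, 0, 169],
    [1, 0, 0, 1, 181],
]

# Movie 'B' from the question: users 2, 4, 5 like it; users 1, 6 dislike it.
scores = question_correlations(answers, liked=[2, 4, 5], disliked=[1, 6])

# Question indices ordered from strongest to weakest association.
ranked = sorted(range(len(scores)), key=lambda q: abs(scores[q]), reverse=True)
print(ranked, scores)
```

Ranking by absolute value treats a strong negative association as just as informative as a strong positive one. With only a handful of opinions per movie, the coefficients are very noisy, so they are at best a rough indication.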

user-2147482637
  • Do you want to find an existing function or write your own? – Stephen Lin Jul 24 '14 at 06:05
  • @m170897017 Ideally, a function that lets me use the first set of data to go through the second would be best. One of the problems with what I have found so far is that the first set of data is just like, dislike, or nothing. The collaborative filters I see all use a ranking system, so the yes/no data has given very wrong results when converted to integers like 0/1. – user-2147482637 Jul 24 '14 at 06:31
  • Wouldn't this be a case for a ML regression model to figure out the statistics for you? – PJ_ May 17 '22 at 18:26
