I have found a few questions on SO about comparing two arrays, such as "In Python, how can I calculate correlation and statistical significance between two arrays of data?". There are also posts about SciPy and k-nearest neighbors, but they too are about arrays.
Say each user has some attributes, such as age, salary, and answers to some yes/no questions, and we have 100 users. For that kind of data, these array comparisons work. However, if the users have already liked certain movies, we also know which users like similar things.
If that data looks like this, a list of dictionaries:
{'Movie': 'A', 'Users Like': [1,4,6,3]}
{'Movie': 'B', 'Users Like': [2,4,5]}
{'Movie': 'B', 'Users Dislike': [1,6]}
and each user has an array of answers to a variety of questions, where the outer array index is the user number and the items in each inner array are that user's answers to the different questions:
[ [0,1,0,1,183],
  [1,1,0,1,178], ... ]
Is there a function or algorithm in Python that takes the known likes/dislikes as input and finds which questions (based on the users' answers) correlate most strongly with those likes and dislikes?
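
To make the question concrete, here is a minimal plain-Python sketch of the kind of computation I have in mind. It is not a known library routine. It turns one movie's likes/dislikes into a 0/1 label per user, then computes the Pearson correlation between each question's answers and that label. The function names (`pearson`, `preference_labels`, `rank_questions`) are my own, and the answer rows for users 2 to 6 are invented for illustration; only the first two rows come from the data above.

```python
from math import sqrt

# Likes/dislikes in the shape shown above.
movies = [
    {"Movie": "A", "Users Like": [1, 4, 6, 3]},
    {"Movie": "B", "Users Like": [2, 4, 5]},
    {"Movie": "B", "Users Dislike": [1, 6]},
]

# answers[user_number] = that user's answers.
# Rows 0 and 1 are from the question; rows 2-6 are made up for this sketch.
answers = [
    [0, 1, 0, 1, 183],
    [1, 1, 0, 1, 178],
    [0, 1, 1, 0, 165],
    [1, 0, 0, 1, 190],
    [0, 0, 1, 1, 172],
    [0, 1, 1, 0, 169],
    [1, 0, 0, 0, 181],
]


def pearson(xs, ys):
    """Pearson correlation; returns None if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def preference_labels(movies, title):
    """Map user number -> 1 (like) or 0 (dislike) for one movie."""
    labels = {}
    for record in movies:
        if record["Movie"] != title:
            continue
        for user in record.get("Users Like", []):
            labels[user] = 1
        for user in record.get("Users Dislike", []):
            labels[user] = 0
    return labels


def rank_questions(movies, answers, title):
    """Return (question_index, r) pairs, strongest |r| first."""
    labels = preference_labels(movies, title)
    users = sorted(labels)
    y = [labels[u] for u in users]
    scores = []
    for q in range(len(answers[0])):
        x = [answers[u][q] for u in users]
        r = pearson(x, y)
        if r is not None:  # skip questions with no variation
            scores.append((q, r))
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)


for question, r in rank_questions(movies, answers, "B"):
    print(f"question {question}: r = {r:+.3f}")
```

A movie with only likes and no dislikes (movie A here) gives a constant label, so this approach returns nothing for it. With only a handful of users the correlations are also very noisy. If SciPy is available, `scipy.stats.pearsonr` would give a p-value as well as the coefficient.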