Alright guys. My professor says there is a way to write this function without any loops in Python 3, but I'm not seeing it at the moment. She recommends using zip, enumerate, readlines, and split(";"). (Every rating is followed by a ';'; two in a row mean that that reviewer did not review the movie.) What I'm doing is reading in a movie number, then comparing that movie against a comparison movie in the movMat list of lists to find their common reviewers. After that I have to do the Pearson calculation, which involves getting the ratings that the common reviewers gave to the target movie and to the compare movie, the mean and the standard deviation of each set of ratings, and finally the Pearson r correlation.
import statistics

def pCalc(movMat, movNumber, n):
    # Reviewer indexes that hold an actual rating (1-5) for each movie
    indexes1 = [i for i, x in enumerate(movMat[movNumber][1].split(';')) if x in ('1', '2', '3', '4', '5')]
    indexes2 = [i for i, x in enumerate(movMat[n][1].split(';')) if x in ('1', '2', '3', '4', '5')]
    # Reviewers who rated both movies
    compare = list(set(indexes1).intersection(indexes2))
    # Target movie: ratings from the common reviewers, then mean and sample std dev
    xi = []
    for index, val in enumerate(movMat[movNumber][1].split(';')):
        if index in compare:
            xi.append(int(val))
    average1 = sum(xi)/len(compare)
    stdDev1 = statistics.stdev(xi)
    # Compare movie: same steps
    yi = []
    for index, val in enumerate(movMat[n][1].split(';')):
        if index in compare:
            yi.append(int(val))
    average2 = sum(yi)/len(compare)
    stdDev2 = statistics.stdev(yi)
    # Pearson r: sum of products of z-scores, divided by (number of pairs - 1)
    r = 0
    newSum = 0
    for i in range(0, len(compare)):
        newSum += ((xi[i]-average1)/stdDev1) * ((yi[i]-average2)/stdDev2)
    r = (1/(len(compare)-1)) * newSum
    return r  # returned here so the calling code can print it
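For comparison, here is a rough sketch of how the same calculation might look without explicit for statements, using zip and split(';') as hinted. It assumes that comprehensions and generator expressions count as "no loops" for the assignment, and that movMat[i][1] holds the ';'-separated ratings string as in the function above. The name pearson_no_loops and the choice to return None when r is undefined (fewer than two common reviewers, or a standard deviation of zero) are my own assumptions, not part of the assignment.

```python
import statistics

def pearson_no_loops(movMat, movNumber, n):
    """Sketch of pCalc without explicit for statements (hypothetical name)."""
    valid = {'1', '2', '3', '4', '5'}
    target = movMat[movNumber][1].split(';')
    other = movMat[n][1].split(';')
    # zip lines the two movies up reviewer by reviewer;
    # keep only the reviewers who rated both movies
    pairs = [(int(a), int(b)) for a, b in zip(target, other)
             if a in valid and b in valid]
    if len(pairs) < 2:
        return None  # assumption: stdev needs at least two data points
    xi, yi = zip(*pairs)  # unzip into target ratings and compare ratings
    mean_x, mean_y = statistics.mean(xi), statistics.mean(yi)
    sd_x, sd_y = statistics.stdev(xi), statistics.stdev(yi)
    if sd_x == 0 or sd_y == 0:
        return None  # assumption: r is undefined when one side never varies
    # Sum of products of z-scores, divided by (number of pairs - 1)
    total = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xi, yi))
    return total / (len(pairs) - 1)
```

Because zip pairs the ratings position by position, the separate index lists and the set intersection are no longer needed: a reviewer is "common" exactly when both halves of the pair are valid ratings.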
An example input would be:
The main part of this program handles the command-line arguments, the lines in the file and whatnot. As a sample, running it with the command-line argument '1' calls up Toy Story and compares it to the other movies in the database, with output like this:
Movie number: Movie 1|Toy Story (1995)
*** No. of rows (movies) in matrix = 1682
*** No. of columns (reviewers) = 943
Output shows r-value, movie no.|name, no. of ratings
compare movie is 1|Toy Story (1995)
no. of common reviewers 452
target avg 3.8783185840707963
compare avg 3.8783185840707963
target std 0.9278967014291252
compare std 0.9278967014291252
r 0.999999999999991
compare movie is 2|GoldenEye (1995)
no. of common reviewers 104
target avg 3.8653846153846154
compare avg 3.201923076923077
target std 0.9456871165874381
compare std 0.9177833965361495
r 0.22178411018797187
compare movie is 3|Four Rooms (1995)
no. of common reviewers 78
target avg 3.717948717948718
compare avg 2.9358974358974357
target std 0.9520645495064435
compare std 1.2096982943568881
r 0.1757942980351483
compare movie is 4|Get Shorty (1995)
no. of common reviewers 149
target avg 3.87248322147651
compare avg 3.530201342281879
target std 0.9247979370536794
compare std 0.9970025819307402
r 0.10313529410109303