Hello all out there at Stack Overflow! My issue is: I want to read in CSV files from IMDb, merge them, add computed results, and write them out. I can add new columns with calculations, e.g. dividing averageRating by 10 or something like that, and this works fine.
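For example, a plain column calculation like the following behaves as expected (a minimal sketch with made-up sample data; scaledRating is just an illustrative name):

import pandas as pd

df = pd.DataFrame({'averageRating': [7.2, 5.9, 6.6]})
# vectorized calculation: pandas applies this to every row at once
df['scaledRating'] = df['averageRating'] / 10
print(df)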
But the goal is to classify the data according to the number of votes. The code looks like this:
import numpy
import pandas as pd
import time

df1 = pd.read_csv('imdb_title.csv', sep='\t')
df2 = pd.read_csv('imdb_ratings.csv', sep='\t')
output_csv = 'imdb_result.csv'

df = df1.merge(df2, how='outer')
df = df[df.titleType == 'movie']

for i in df.numVotes:
    if i <= 5000:
        j = 5.9
    elif i <= 25000:
        j = 6.6
    # ... more vote brackets ...
    elif i <= 1000000:
        j = 8.2
    else:
        j = 8.4
    df['estRate'] = j  # the line I later tried to change (see below)
    print(i, j)

df.to_csv(output_csv, sep=';')
"print(i, j)" will give the correct answer, but output file won't.
Example, wanted vs. actual result:

| numVotes | wanted result | actual result |
|----------|---------------|---------------|
| 30670.0  | 7.2           | 6.6           |
| 04774.0  | 5.9           | 6.6           |
| 20876.0  | 6.6           | 6.6           |
After searching and reading numerous articles, I tried to change the marked line in the code above, df['estRate'] = j, to "df['estRate'] = j.copy()", but I received the error message "AttributeError: 'float' object has no attribute 'copy'".
Then I tried the copy module: "df['estRate'] = copy.copy(j)" (with import copy). This runs, but has no effect: the last computed value (6.6) is still written into every row of the output CSV. My understanding was that assignment behaves differently with DataFrames, and that this is why I would need the copy method, to make sure the value of j at that moment is what gets stored.
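For completeness, that attempt looked roughly like this (a sketch reconstructed with shortened vote brackets and made-up sample data, just to show the shape of the code):

import copy
import pandas as pd

df = pd.DataFrame({'numVotes': [30670.0, 4774.0, 20876.0]})

for i in df.numVotes:
    j = 5.9 if i <= 5000 else 6.6  # shortened version of the vote brackets
    df['estRate'] = copy.copy(j)   # runs without error, but the column still ends up all 6.6

print(df)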
Another attempt was to append the data to an open file:

"df.to_csv(output_csv, sep=';', mode='a', header=False)"

but this leads either to n times as many rows (when the call is part of the for loop) or to just the last value again, as before. What I need is to write only the first, second, third, ... row of the df.
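Inside the loop, that attempt looked roughly like this (again a sketch with shortened brackets and sample data):

import pandas as pd

df = pd.DataFrame({'numVotes': [30670.0, 4774.0, 20876.0]})
output_csv = 'imdb_result.csv'

for i in df.numVotes:
    j = 5.9 if i <= 5000 else 6.6  # shortened version of the vote brackets
    df['estRate'] = j
    # appends the complete DataFrame on every pass, so with n rows
    # the file ends up with n times as many lines
    df.to_csv(output_csv, sep=';', mode='a', header=False)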
I also tried "for index, i in enumerate(...)" and then "index.to_csv(...)", but this leads to the error "'int' object has no attribute 'to_csv'". Variations such as df[index] also cause hard errors.
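Reconstructed from memory, that attempt was something like this (a sketch; it reproduces the error message I got):

import pandas as pd

df = pd.DataFrame({'numVotes': [30670.0, 4774.0, 20876.0]})
output_csv = 'imdb_result.csv'

for index, i in enumerate(df.numVotes):
    j = 5.9 if i <= 5000 else 6.6  # shortened version of the vote brackets
    # index is a plain int here, so the next line raises
    # AttributeError: 'int' object has no attribute 'to_csv'
    index.to_csv(output_csv, sep=';', mode='a', header=False)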
Maybe someone has a suggestion for me? I have tried for a long time and followed many different suggestions, but nothing seems to work in my case.