I'm trying to build a product review analyzer in Python. I created a dataset in Excel with two columns, one containing positive feedback adjectives and one containing negative ones. The program should split a review into words and use a for loop to count how many of them appear in the positive column and how many in the negative column.
import numpy as np
import pandas as pd

# Word lists: one "Positive" column and one "Negative" column
data = pd.read_csv("data.csv")
text = "some string"  # the review to analyze

numbers = []
positives = []
negatives = []

def wordCount(text):
    words = text.split()
    print("There are", len(words), "words in this string")
    for word in words:
        # How many times this word occurs in the review
        numbers.append(words.count(word))
        # .values compares against the cell contents; `in` on a Series checks the index
        if word in data["Positive"].values:
            positives.append(word)
        elif word in data["Negative"].values:
            negatives.append(word)
    print(positives, negatives)
    print(numbers)
    # Most frequent word and its count
    print(words[numbers.index(np.max(numbers))], np.max(numbers))

wordCount(text)
But unfortunately, when I try to get each column of the dataset, an error occurs:
'utf-8' codec can't decode byte 0xfe in position 0: invalid start byte
I tried encoding and decoding the dataset, and I also tried converting it into a list. Neither worked, and the program kept giving me the same error.
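In case it helps with diagnosing this: as far as I know, the bytes FE FF are the UTF-16 big-endian byte-order mark, so a leading 0xfe might mean the file is not UTF-8 at all. Below is a minimal sketch of a check on a file's first bytes. The name guess_encoding and the in-memory sample are made up for illustration, and I am only assuming that my file is UTF-16.

```python
import codecs

def guess_encoding(raw: bytes) -> str:
    """Guess a text encoding from a leading byte-order mark (BOM), if any."""
    # Check the 4-byte UTF-32 marks first: the UTF-32 LE mark begins
    # with the same two bytes as the UTF-16 LE mark.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return "utf-8"  # no BOM found; this default is only a guess

# Hypothetical stand-in for the CSV: UTF-16 big-endian text with a BOM,
# so its first byte is 0xfe, like the byte named in the error message.
sample = codecs.BOM_UTF16_BE + "Positive,Negative\ngood,bad\n".encode("utf-16-be")

encoding = guess_encoding(sample)
print(hex(sample[0]), encoding)
print(sample.decode(encoding))
```

On the real file I would pass open("data.csv", "rb").read(4) to guess_encoding and hand the result to pd.read_csv through its encoding argument, but I have not confirmed that this is the cause.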
Is it because I import the dataset the wrong way? Is something wrong with my code?
Can someone please help me solve this?