I want to aggregate my data so that I get one number per day, namely the sum of all the per-minute values for that day.
My data looks like this:
Date negative_sentiment positive_sentiment neutral_sentiment compount_sentiment
2015.03.22.13.00 1.407692 3.655128 54.937179 3.698333
2015.03.22.13.01 1.839572 3.457345 54.702827 2.742424
2015.03.22.13.02 1.852847 3.187877 54.959512 2.649846
2015.03.22.13.03 1.758206 3.444771 54.762926 3.495089
2015.03.22.13.04 1.611731 3.274262 55.114041 2.847284
2015.03.22.13.05 1.833436 3.241374 54.907794 2.881480
and the column dtypes are:
Date datetime64[ns]
negative_sentiment float64
positive_sentiment float64
neutral_sentiment float64
compount_sentiment float64
dtype: object
I tried many options, but nothing works:
import pandas as pd
pd.set_option('display.width', 1000)
path_name = "C:/Users/Alex/Desktop/03_2015.csv"
data_sentimental = pd.read_csv(path_name, sep=';', header=None, names = ['Date', 'negative_sentiment', 'positive_sentiment','neutral_sentiment','compount_sentiment'])
# converting column 1 to datetime and assigning it back to column 1
data_sentimental['Date'] = pd.to_datetime(data_sentimental['Date'], format='%Y.%m.%d.%H.%M')
print(data_sentimental.dtypes)  # check that each column has the expected dtype
data_sentimental = pd.DatetimeIndex(data_sentimental['Date']).normalize()
data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())
but that gives me this error:
Traceback (most recent call last):
File "C:/Users/Alex/PycharmProjects/master_thesis/result.py", line 19, in <module>
data_sentimental = data_sentimental.groupby(data_sentimental['Date'].dt.normalize())
File "C:\Users\Alex\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\datetimelike.py", line 267, in __getitem__
raise ValueError
Thank you for your help.
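
A possible explanation, offered as a guess from the traceback rather than a confirmed diagnosis: the line that assigns pd.DatetimeIndex(...).normalize() back to data_sentimental replaces the DataFrame with a DatetimeIndex, so the following data_sentimental['Date'] indexes a DatetimeIndex with a string, which would match the ValueError raised in datetimelike.py. Below is a minimal sketch of the daily sum that leaves the DataFrame intact. The inline sample (raw), the names value_columns and daily_sum, and the changed date on the third row are assumptions made for illustration; the real CSV is assumed to use the same ';' layout.

```python
import io
import pandas as pd

# Hypothetical inline sample standing in for the CSV file.
# Values come from the rows shown above; the third row's date was
# moved to a second day only so that the grouping is visible.
raw = io.StringIO(
    "2015.03.22.13.00;1.407692;3.655128;54.937179;3.698333\n"
    "2015.03.22.13.01;1.839572;3.457345;54.702827;2.742424\n"
    "2015.03.23.13.02;1.852847;3.187877;54.959512;2.649846\n"
)

value_columns = ['negative_sentiment', 'positive_sentiment',
                 'neutral_sentiment', 'compount_sentiment']
data_sentimental = pd.read_csv(raw, sep=';', header=None,
                               names=['Date'] + value_columns)
data_sentimental['Date'] = pd.to_datetime(data_sentimental['Date'],
                                          format='%Y.%m.%d.%H.%M')

# Group on the calendar day (timestamps normalized to midnight) and sum
# the numeric columns. The result goes into a new variable, so
# data_sentimental is still a DataFrame afterwards.
day = data_sentimental['Date'].dt.normalize()
daily_sum = data_sentimental.groupby(day)[value_columns].sum()
print(daily_sum)
```

If this matches the intent, data_sentimental.set_index('Date').resample('D').sum() should give similar daily totals; note that resample also emits rows for days with no data, filled with zeros.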