I have two datasets.
The first one, in the market variable, contains a generic market trend with the following structure:
Date High Close Volume Open Low
The second, in the moods variable, contains a few tweets for each day, each with an associated sentiment, in this structure:
body date datetime id sentiment time
I want to count, for each day, how many "Bearish" and "Bullish" sentiments there are. This works, and here is my code with comments:
import pandas as pd
# Read the datasets
market = pd.read_csv("Datasets/SP500/aggregates.txt")
moods = pd.read_json("Datasets/DatasetStockTwits-Aggregato.json")
# Remove all null sentiments
moods = moods[moods.sentiment != "null"]
# Take a subset of the data to speed up computation
market_tail = market.tail(100)
# For each day present in market_tail, get the tweets from the same day
moods_tail = moods.loc[moods['date'].isin(market_tail.Date)]
# Count, for each day, how many "Bearish" and "Bullish" tweets there are
sentiments_count = pd.crosstab(moods_tail['date'], moods_tail['sentiment'])
print(sentiments_count)
This is the result:
sentiment   Bearish  Bullish
date
2017-11-03        9       12
2017-11-05        3        6
2017-11-06       20        9
2017-11-07       16       35
So it works fine, but I don't understand why I cannot access the date index with sentiments_count.date or sentiments_count['date'].
In fact, if I try something like this:
print(sentiments_count['date'])
I obtain: KeyError: 'date'
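For anyone who wants to reproduce this without my files, here is a minimal sketch. The rows are made up and only stand in for my real moods data (an assumption, not the actual dataset). It runs the same crosstab call and then checks what the result reports as its columns and as its index name.

```python
import pandas as pd

# Made-up rows standing in for the real moods data (hypothetical values)
moods_tail = pd.DataFrame({
    "date": ["2017-11-03", "2017-11-03", "2017-11-05", "2017-11-05", "2017-11-05"],
    "sentiment": ["Bearish", "Bullish", "Bullish", "Bullish", "Bearish"],
})

# Same call as in my code above
sentiments_count = pd.crosstab(moods_tail["date"], moods_tail["sentiment"])

# Only the sentiment labels are listed as columns
print(list(sentiments_count.columns))

# "date" is reported as the name of the index
print(sentiments_count.index.name)

# Looking up "date" as a column raises the same KeyError
try:
    sentiments_count["date"]
    key_error_raised = False
except KeyError:
    key_error_raised = True
print(key_error_raised)
```

Judging by this, date seems to end up as the name of the index rather than as a regular column, which would match the separate "date" line in the printed table, but I am not sure whether that is the whole story.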
Am I missing something? Thanks.