The data I'm using is a conversation message log. I have a pandas DataFrame with timestamps as the index and two columns: one for "sender" and one for "message".
I'm trying to draw a stackplot of message counts over time. I don't actually need the contents of "message", so after building the data I reduce it to per-sender daily counts.
Dummy data:
import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({'date': [Timestamp('2019-07-29 19:58:00'), Timestamp('2019-07-29 20:03:00'), Timestamp('2019-08-01 19:22:00'), Timestamp('2019-08-01 19:23:00'), Timestamp('2019-08-01 19:25:00'), Timestamp('2019-08-04 11:28:00'), Timestamp('2019-08-04 11:29:00'), Timestamp('2019-08-04 11:29:00'), Timestamp('2019-08-04 12:43:00'), Timestamp('2019-08-04 12:49:00'), Timestamp('2019-08-04 12:51:00'), Timestamp('2019-08-04 12:51:00'), Timestamp('2019-08-25 22:33:00'), Timestamp('2019-08-27 11:55:00'), Timestamp('2019-08-27 18:35:00'), Timestamp('2019-11-06 18:53:00'), Timestamp('2019-11-06 18:54:00'), Timestamp('2019-11-06 20:42:00'), Timestamp('2019-11-07 00:16:00'), Timestamp('2019-11-07 15:24:00'), Timestamp('2019-11-07 16:06:00'), Timestamp('2019-11-08 11:48:00'), Timestamp('2019-11-08 11:53:00'), Timestamp('2019-11-08 11:55:00'), Timestamp('2019-11-08 11:55:00'), Timestamp('2019-11-08 11:59:00'), Timestamp('2019-11-08 12:03:00'), Timestamp('2019-12-24 13:40:00'), Timestamp('2019-12-24 13:42:00'), Timestamp('2019-12-24 13:43:00'), Timestamp('2019-12-24 13:44:00'), Timestamp('2019-12-24 13:44:00')], 'sender': ['Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 1', 'Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 2', 'Person 2', 'Person 2', 'Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 2', 'Person 2', 'Person 1', 'Person 2', 'Person 2', 'Person 1', 'Person 2', 'Person 2', 'Person 1', 'Person 2', 'Person 1', 'Person 2'], 'message': ['Hello', 'Hi there', "How's things", 'good', 'I am glad', 'Me too.', 'Then we are both glad', 'Indeed we are.', 'I sure hope this is enough fake conversation for stackoverflow.', 'Better write a few more messages just in case', "But the message content isn't relevant", 'Oh yeah.', "I'm going to stop now.", 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted', 'redacted']})
# resample() below needs the timestamps as the index, as in my real data
df = df.set_index('date')
dfgrouped = df.groupby(["sender"])
dfgrouped[["sender"]].resample("D").count()
This gives a DataFrame indexed by sender and date, holding the number of messages each sender sent on each day.
dfgrouped[["sender"]].get_group("Person 1").resample("D").count()
... would give a dataframe with just one user and their message counts per day.
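To make the shape of those results concrete, here is a minimal sketch on a tiny made-up sample (the `sample` frame and its timestamps are hypothetical, not my real log). It swaps in `pd.Grouper(freq="D")` for the `groupby(...).resample(...)` chain, which is my own substitution; unlike `resample`, it leaves out days on which a sender wrote nothing.

```python
import pandas as pd

# Hypothetical stand-in for the message log: timestamps as the index,
# one "sender" column (message text is not needed for counting).
sample = pd.DataFrame(
    {"sender": ["Person 1", "Person 2", "Person 1", "Person 1"]},
    index=pd.to_datetime(
        [
            "2019-07-29 19:58",
            "2019-07-29 20:03",
            "2019-07-29 21:00",
            "2019-07-31 09:00",
        ]
    ),
)

# Group by sender and by calendar day of the index, then count rows.
# The result is a Series with a two-level (sender, day) index.
daily = sample.groupby(["sender", pd.Grouper(freq="D")]).size()

# Selecting one sender drops the first level and leaves a per-day Series.
person1 = daily.loc["Person 1"]
print(daily)
print(person1)
```

This long (sender, day) layout is what I end up holding, whereas a stacked plot seems to want one series per sender on a shared date axis.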
I'd like to know how to use matplotlib to draw a stackplot where each "sender" is its own stacked layer. I haven't been able to achieve this through either
ax.stackplot(dfgrouped[["sender"]].resample("D").count())
or through looping:
for i, sender in enumerate(df["sender"].unique()):
    axs[i].stackplot(dfgrouped[["sender"]].get_group(sender).resample("D").count())
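For reference, this is a minimal sketch of the kind of plot I'm after, again on a tiny hypothetical `sample` rather than my real data. It assumes the counts first have to be reshaped to one row per day and one column per sender, since `stackplot` takes the x values first and then one row of y values per layer. I'm not sure this is the idiomatic route, and the `Agg` backend line is only there so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the message log.
sample = pd.DataFrame(
    {"sender": ["Person 2", "Person 1", "Person 1", "Person 2", "Person 2"]},
    index=pd.to_datetime(
        [
            "2019-07-29 19:58",
            "2019-07-29 20:03",
            "2019-07-31 09:00",
            "2019-08-01 19:22",
            "2019-08-01 19:23",
        ]
    ),
)

# Count messages per (day, sender), pivot senders into columns, and
# fill in missing days with zeros so every sender shares one date axis.
counts = (
    sample.groupby([pd.Grouper(freq="D"), "sender"])
    .size()
    .unstack("sender", fill_value=0)
    .asfreq("D", fill_value=0)
)

fig, ax = plt.subplots()
# x = dates, y = 2-D array with one row per sender (hence the transpose).
layers = ax.stackplot(
    counts.index, counts.to_numpy().T, labels=list(counts.columns)
)
ax.legend(loc="upper left")
ax.set_ylabel("messages per day")
fig.autofmt_xdate()
```

If something like this is the right direction, what I'm missing is how to get from my grouped and resampled result to that wide table.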