data is a pandas DataFrame with a DatetimeIndex, and each entry has multiple attributes. One of these attributes is called STATUS. I tried to create a plot of the number of entries per day, broken down by the STATUS attribute.

My first attempt using pandas.plot:

for status in data["STATUS"].unique():
    entries = data[data["STATUS"] == status]
    entries.groupby(pandas.TimeGrouper("D")).size().plot(figsize=(16,4), legend=True)

The result:

[image: the resulting plot]

How should I modify the code above so that the legend shows which status the curve belongs to?

Also, feel free to suggest a different approach to realizing such a visualization (group time series by time interval, count entries, and break down by attributes of the entries).

clstaudt

1 Answer

I believe the following change to your code will give you what you want:

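import matplotlib.pyplot as plt
import pandas

fig, ax = plt.subplots(figsize=(16, 4))  # one shared axes for all curves
for status in data["STATUS"].unique():
    entries = data[data["STATUS"] == status]
    # pandas.Grouper(freq="D") replaces pandas.TimeGrouper("D"), which was removed in pandas 1.0
    dfPlot = pandas.DataFrame(entries.groupby(pandas.Grouper(freq="D")).size())
    dfPlot.columns = [status]  # the column name becomes the legend label
    dfPlot.plot(ax=ax, legend=True)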

What happened is that size() returns a Series with no name, so the legend has nothing meaningful to show for each curve. Creating a DataFrame from the Series and setting its column name to the status does the trick.
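To make the point about the missing name concrete, here is a minimal sketch. The sample timestamps and the "open"/"closed" status values are made up for illustration and are not from the question. It shows that the Series returned by size() has no name, and that renaming the Series is a lighter alternative to wrapping it in a DataFrame.

```python
import pandas as pd

# Hypothetical sample data: a DatetimeIndex and a STATUS column.
idx = pd.to_datetime(
    ["2017-10-01 08:00", "2017-10-01 12:00", "2017-10-02 09:00", "2017-10-03 10:00"]
)
data = pd.DataFrame({"STATUS": ["open", "open", "closed", "open"]}, index=idx)

entries = data[data["STATUS"] == "open"]
counts = entries.groupby(pd.Grouper(freq="D")).size()

print(counts.name)  # the Series carries no name

# Renaming the Series gives plot() a name to use in the legend.
named = counts.rename("open")
print(named.name)
```

Calling named.plot(ax=ax, legend=True) inside the loop should then label the curve with the status, without the intermediate DataFrame.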

Cedric Zoppolo
  • Only that gives me one plot per iteration - need to figure out how to combine them into a single plot. – clstaudt Oct 27 '17 at 09:09
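
One possible way to get all statuses onto a single plot without a loop is to group by day and STATUS together and pivot the statuses into columns. This is only a sketch under assumptions: the sample data below is invented, and the asfreq step is there so that days without any entries appear as zeros rather than being skipped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data: a DatetimeIndex and a STATUS column.
idx = pd.to_datetime(
    ["2017-10-01 08:00", "2017-10-01 12:00", "2017-10-01 15:00", "2017-10-03 09:00"]
)
data = pd.DataFrame({"STATUS": ["open", "open", "closed", "closed"]}, index=idx)

# Count entries per (day, status), then turn each status into its own column.
counts = (
    data.groupby([pd.Grouper(freq="D"), "STATUS"])
    .size()
    .unstack("STATUS", fill_value=0)
    .asfreq("D", fill_value=0)  # insert days that had no entries at all
)

# One call draws one line per column, labelled with the column (status) name.
fig, ax = plt.subplots(figsize=(16, 4))
counts.plot(ax=ax)
ax.set_ylabel("entries per day")
```

Because each status is a column of the same DataFrame, the legend entries come from the column names and all curves share one axes.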