
I am trying to make a scatter plot of the number of items in the DataFrame for each date/time combination. I've grouped the data like this:

dff = pd.DataFrame(df.groupby(['date', 'time']).size().rename('count'))

and it looks like this:

                           count
date         time       
2017-05-19   15:00         1
             15:30         1
             16:00         1
             16:30         1
             17:00         1
2017-05-23   10:00         2
             10:30         2
             11:00         2
...

Now, how can I make a scatter plot of the counts with dates on the X axis and times on the Y axis? The signature is plt.scatter(x, y, s=area, c=colors), but however I try to select x and y from dff, it fails to find the keys. Also, scatter expects floats on the axes, while I have strings.

linkyndy

1 Answer


This requires accessing the MultiIndex values like so:

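import matplotlib.pyplot as plt

# replicating sample data (foo is just a dummy column used for the count)
grouped = df.groupby(['date', 'time'])['foo'].count()
# grouped now looks like this:
# date        time
# 2015-01-01  15:00:00    1
#             15:30:00    1
# 2015-01-02  16:00:00    2
# Name: foo, dtype: int64

# x and y come from the two levels of the MultiIndex; marker size scales with the count
plt.scatter(x=grouped.index.get_level_values(0),
            y=grouped.index.get_level_values(1),
            s=[20 * 4**n for n in grouped.values])
plt.show()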

You'll need to play with the s argument in scatter() to get sensible marker sizes. The doc I was using for this is "pyplot scatter plot marker size".
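If the dates and times are stored as strings, as in the question, one option is to convert them to numeric-friendly values before plotting. The following is only a sketch under assumptions: the sample data is made up, the times are assumed to be in "HH:MM" format, and the size factor of 40 and the output file name are arbitrary choices, not anything from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default backend
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data shaped like the question's: one row per item.
df = pd.DataFrame({
    "date": ["2017-05-19", "2017-05-19", "2017-05-23", "2017-05-23", "2017-05-23"],
    "time": ["15:00", "15:30", "10:00", "10:00", "10:30"],
})

# Count items per (date, time) and flatten the MultiIndex into plain columns.
counts = df.groupby(["date", "time"]).size().rename("count").reset_index()

# Convert the strings to values matplotlib can place on continuous axes.
x = pd.to_datetime(counts["date"])
t = pd.to_datetime(counts["time"], format="%H:%M")  # assumes "HH:MM" strings
y = t.dt.hour + t.dt.minute / 60  # time of day as fractional hours

# Marker area grows linearly with the count; the factor 40 is arbitrary.
sizes = 40 * counts["count"]

fig, ax = plt.subplots()
ax.scatter(x, y, s=sizes)
ax.set_xlabel("date")
ax.set_ylabel("time of day (hours)")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("counts_scatter.png")  # hypothetical output file name
```

Using reset_index() instead of get_level_values() is just a design choice here: it turns both index levels into ordinary columns, which makes the conversions a little easier to read.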


Andrew L