
I have a data frame with perfectly organised timestamps, like below:

[screenshot of the data frame; image not available]

It's a web log, and the timestamps span the whole year. I want to split them by day, count the visits within each hour, and plot every day's hourly curve in the same figure, stacked together, like the picture below:

[example of the desired figure; image not available]

I have managed to split the data into days and to plot the visits of a single day, but I am having trouble plotting all the days together in one figure. The main tools I am using are Pandas and Matplotlib.

Any advice or suggestions? Much appreciated!


Edit:

My code is below.

The timestamps are: https://gist.github.com/adamleo/04e4147cc6614820466f7bc05e088ac5

And the dataframe looks like this: [screenshot of the data frame; image not available]

I plotted the timestamp density over the whole period using the code below:

timestamps_series_all = pd.DatetimeIndex(pd.Series(unique_visitors_df.time_stamp))
timestamps_series_all_toBePlotted = pd.Series(1, index=timestamps_series_all)
timestamps_series_all_toBePlotted.resample('D').sum().plot()

and got the result:

[plot of daily visit counts over the whole period; image not available]

I plotted timestamps within one day using the code:

timestamps_series_oneDay = pd.DatetimeIndex(pd.Series(unique_visitors_df.time_stamp.loc[unique_visitors_df["date"] == "2014-08-01"]))
timestamps_series_oneDay_toBePlotted = pd.Series(1, index=timestamps_series_oneDay)
timestamps_series_oneDay_toBePlotted.resample('H').sum().plot()

and the result: [plot of hourly visit counts for 2014-08-01; image not available]

And now I am stuck.

I'd really appreciate all of your help!

Adam Liu

1 Answer

I think you need pivot:

import pandas as pd

# L is the list of timestamps from https://gist.github.com/adamleo/04e4147cc6614820466f7bc05e088ac5
df = pd.DataFrame({'date': L})
print(df.head())
                  date
0  2014-08-01 00:05:46
1  2014-08-01 00:14:47
2  2014-08-01 00:16:05
3  2014-08-01 00:20:46
4  2014-08-01 00:23:22

# convert to datetime if necessary
df['date'] = pd.to_datetime(df['date'])
# resample by hour, count the rows in each bin and turn the result into a DataFrame
# (the 'H' alias is deprecated in pandas 2.2+; use 'h' there)
df = df.resample('H', on='date').size().to_frame('count')
# extract date and hour from the hourly index
df['days'] = df.index.date
df['hours'] = df.index.hour
# pivot and plot
# maybe check parameter kind='density' from http://stackoverflow.com/a/33474410/2901002
# original last line, which put days on the x-axis and drew one line per hour:
# df.pivot(index='days', columns='hours', values='count').plot(rot=90)
# edit: hours on the index and days on the columns, so each day is drawn as one line
df.pivot(index='hours', columns='days', values='count').plot(rot=90)
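
To try the resample-and-pivot idea without downloading the gist, the steps can be sketched on made-up data. This is only an illustration under stated assumptions: the three days of synthetic timestamps and the helper name `plot_hourly` are hypothetical and not part of the original question or answer, and the lowercase `'h'` alias is used because newer pandas versions prefer it.

```python
import pandas as pd

# Hypothetical stand-in for the web log: three days of timestamps where
# hour h of day d receives (h % 5 + d + 1) visits, one per minute.
start = pd.Timestamp("2014-08-01")
stamps = [
    start + pd.Timedelta(days=d, hours=h, minutes=m)
    for d in range(3)
    for h in range(24)
    for m in range(h % 5 + d + 1)
]
df = pd.DataFrame({"date": stamps})

# Count the visits in each hourly bin.
hourly = df.resample("h", on="date").size().to_frame("count")

# Split the hourly index into a calendar day and an hour of day.
hourly["days"] = hourly.index.date
hourly["hours"] = hourly.index.hour

# One row per hour of day, one column per calendar day.
table = hourly.pivot(index="hours", columns="days", values="count")


def plot_hourly(table):
    """Draw one line per day (per column) against the hour of day."""
    ax = table.plot(rot=90)
    ax.set_xlabel("hour of day")
    ax.set_ylabel("visits")
    return ax
```

Calling `plot_hourly(table)` followed by `matplotlib.pyplot.show()` should draw all days in a single figure. `DataFrame.plot` draws one line per column, which is why the days go on the columns and the hours on the index.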
jezrael
    Thank you so much buddy, this answer really helped. However the last line was a bit twisted, so I changed to `df.pivot(index='hours', columns='days', values='count').plot(rot='90')` and it worked perfectly. Appreciated! – Adam Liu Apr 21 '17 at 15:40