I have a data frame that looks like this:

                     counts month
login_time
1970-03-14 17:45:52       3   Mar
1970-01-09 01:31:25       3   Jan
1970-04-12 04:03:15       3   Apr
1970-02-24 23:09:57       3   Feb
1970-04-04 01:17:40       3   Apr
1970-02-12 11:16:53       3   Feb
1970-03-17 01:01:39       3   Mar
1970-01-06 21:45:52       3   Jan
1970-03-29 03:24:57       3   Mar
1970-04-03 14:42:38       2   Apr

I would like to aggregate these login counts into 15-minute intervals and then plot the results.

I tried the following:

df.groupby('login_time').resample('15min').count()

but the result doesn't look right: each row ends up in its own group instead of being combined with the other rows in the same 15-minute bin:

                                         counts  month
login_time          login_time
1970-01-01 20:12:16 1970-01-01 20:00:00       1      1
1970-01-01 20:13:18 1970-01-01 20:00:00       1      1
1970-01-01 20:16:10 1970-01-01 20:15:00       1      1
1970-01-01 20:16:36 1970-01-01 20:15:00       1      1
1970-01-01 20:16:37 1970-01-01 20:15:00       1      1
1970-01-01 20:21:41 1970-01-01 20:15:00       1      1
1970-01-01 20:26:05 1970-01-01 20:15:00       1      1
1970-01-01 20:26:21 1970-01-01 20:15:00       1      1
1970-01-01 20:31:03 1970-01-01 20:30:00       1      1
1970-01-01 20:34:46 1970-01-01 20:30:00       1      1

Thank you!

user

1 Answer

I'm not sure this is exactly what you meant, since you didn't specify whether you want 15-minute bins measured from midnight or from the start of the dataset, but I think the approach below would work.
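The two binning choices can be sketched roughly like this. It assumes pandas 1.1 or later, where resample accepts an origin argument; the three timestamps are borrowed from the output in the question, and the variable names are only illustrative.

```python
import pandas as pd

# Three login times taken from the question's output (illustrative only)
times = pd.to_datetime([
    "1970-01-01 20:12:16",
    "1970-01-01 20:13:18",
    "1970-01-01 20:31:03",
])
df = pd.DataFrame({"counts": [1, 1, 1]}, index=times)

# Default origin ("start_day"): bin edges are aligned to midnight,
# so the first bin starts at 20:00:00
from_midnight = df.resample("15min").sum()

# origin="start": bin edges are aligned to the first timestamp in the data,
# so the first bin starts at 20:12:16
from_first = df.resample("15min", origin="start").sum()

print(from_midnight)
print(from_first)
```

With the default, the bins line up with clock quarter-hours, which is probably what you want for plotting.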

I generated random dates in some range (to have something to bin) using that answer.

import pandas as pd
import numpy as np

# Make some fake data
def random_date_generator(start_date, range_in_seconds):
    # start_date includes a time of day, so np.datetime64 parses it with
    # second resolution; the random offset is therefore added in seconds
    seconds_to_add = int(np.random.randint(range_in_seconds))
    return np.datetime64(start_date) + np.timedelta64(seconds_to_add, 's')

data_length = 1000
# 100000 seconds is a little under 28 hours of login times
date_col = [random_date_generator('1970-01-01 00:00:00', 100000) for _ in range(data_length)]
count_col = np.random.randint(5, size=data_length)

# Sample:
df = pd.DataFrame({'login_time':date_col, 'counts': count_col})
df = df.set_index(['login_time'])

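# Count the rows that fall in each 15-minute bin
# ('15min' is used because the '15T' alias is deprecated in recent pandas)
df.resample('15min').count()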
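One thing to watch: count() returns the number of rows in each bin, while sum() adds up the values in the counts column. If counts already holds a number of logins per row, sum() may be closer to what you're after. A small sketch of the difference, using made-up data along the same lines as above (the seed and variable names are my own choices):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the fake data is repeatable

# Hypothetical sample: 1000 logins within the first 100000 seconds of 1970
offsets = rng.integers(0, 100_000, size=1000)
login_time = pd.Timestamp("1970-01-01") + pd.to_timedelta(offsets, unit="s")
df = pd.DataFrame({"counts": rng.integers(0, 5, size=1000)}, index=login_time)
df.index.name = "login_time"
df = df.sort_index()

# Number of rows that fall in each 15-minute bin
rows_per_bin = df["counts"].resample("15min").count()

# Total of the counts column in each 15-minute bin
logins_per_bin = df["counts"].resample("15min").sum()

print(rows_per_bin.head())
print(logins_per_bin.head())
```

Either result is a Series indexed by the bin start time, so calling .plot() on it (with matplotlib installed) should give you the plot you asked about.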
liorr