I have a 31-day time series dataset, which I extract using the query below.
import pymysql
import pymysql.cursors
import pandas as pd
import sys
import csv
conn = pymysql.connect(host='',user='',password='',db='',port='',charset='utf8mb4',cursorclass=pymysql.cursors.DictCursor)
# Identifiers are quoted with backticks; single quotes would make them string literals in MySQL
dbquery = """SELECT * FROM `table_name` WHERE `date` BETWEEN '2012-03-01 00:00:00' AND '2012-03-31 23:59:59'"""
df = pd.read_sql_query(dbquery, conn)
df.to_csv('one_month_timeseries_data.csv', sep=',', encoding='utf-8')
The extracted dataframe is structured as follows.
id date value
1 2012-01-01 00:00:00 33
5 2012-01-02 00:00:01 15
.
.
.
.
2 2012-01-30 23:59:58 10
5 2012-01-31 23:59:59 5
I want to loop through each day (00:00:00 - 23:59:59) of the dataset and split that day's rows into 3 separate dataframes, one per 8-hour interval: sleep (00:00:00 - 07:59:59), work (08:00:00 - 15:59:59), and home (16:00:00 - 23:59:59).
Expected output:
e.g., for day 1:
day1_df_sleep:
id date value
1 2012-01-01 00:00:00 33
.
.
.
.
3 2012-01-01 07:59:59 10
day1_df_work:
id date value
1 2012-01-01 08:00:00 12
.
.
.
.
4 2012-01-01 15:59:59 50
day1_df_home:
id date value
1 2012-01-01 16:00:00 12
.
.
.
.
4 2012-01-01 23:59:59 50
and so on, until day 31.
I have read about pandas' date_range() function, but I am not sure how best to apply it when all I have is the date column.
How best can I implement this using pandas?
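To make the goal concrete, here is a minimal sketch of one possible approach. It assumes the date column can be parsed with pd.to_datetime, and that a dictionary keyed by (day, period) is an acceptable stand-in for separately named dataframes. The sample rows and the names period_names and frames are made up for illustration. date_range() may not be needed at all here, since the hour of each timestamp already determines its 8-hour block.

```python
from datetime import date

import pandas as pd

# Hypothetical sample rows standing in for the extracted dataframe.
df = pd.DataFrame({
    "id": [1, 3, 1, 4, 1, 4, 2],
    "date": [
        "2012-01-01 00:00:00",
        "2012-01-01 07:59:59",
        "2012-01-01 08:00:00",
        "2012-01-01 15:59:59",
        "2012-01-01 16:00:00",
        "2012-01-01 23:59:59",
        "2012-01-02 03:30:00",
    ],
    "value": [33, 10, 12, 50, 12, 50, 7],
})

# Make sure the column is a real datetime (it is a string after a CSV round trip).
df["date"] = pd.to_datetime(df["date"])

# Integer-dividing the hour by 8 gives 0, 1 or 2, i.e. the 8-hour block of the day.
period_names = {0: "sleep", 1: "work", 2: "home"}
period = (df["date"].dt.hour // 8).map(period_names).rename("period")
day = df["date"].dt.date.rename("day")

# One dataframe per (day, period) pair, stored in a dict instead of named variables.
frames = {key: group for key, group in df.groupby([day, period])}

# Example lookup: the "sleep" rows of 2012-01-01.
day1_df_sleep = frames[(date(2012, 1, 1), "sleep")]
print(day1_df_sleep)
```

A dictionary avoids creating 93 separately named variables, and a (day, period) pair with no rows is simply absent from it.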