
I have a dataframe with data every 15 minutes for 2 years, and I need to group the data into custom time intervals.

I have to apply a function once per day, but each "day" covers more than 24 hours: the range of data for a given day starts some hours before that day and ends some hours after it. For example, for 08/15/2016 the lower limit of the data range is 08/14/2016 20:00:00 and the upper limit is 08/16/2016 06:00:00.
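To make the window concrete, here is a small sketch of the limits for that example day. The offsets of 4 hours before midnight and 30 hours after midnight are inferred from the example above; the names `day`, `lim_inf` and `lim_sup` are only illustrative.

```python
import pandas as pd

# Midnight of the example day
day = pd.Timestamp("2016-08-15")

# Offsets inferred from the example: 4 h before midnight, 30 h after midnight
lim_inf = day - pd.Timedelta(hours=4)
lim_sup = day + pd.Timedelta(hours=30)

print(lim_inf, lim_sup)
```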

I have one dataframe with the 15-minute data and another dataframe with the time interval for each day.

I have already tried several approaches, but I cannot work out how to write a function that makes groupby use these ranges.

My CSV has the following format:

Date_time,wt_f
2015-09-15 11:00:00,1.2982869908295105
2015-09-15 11:15:00,1.302517743219596
2015-09-15 11:30:00,1.3067403484132343
2015-09-15 11:45:00,1.310906931975276
2015-09-15 12:00:00,1.3149784663752402
2015-09-15 12:15:00,1.3191250773036653
2015-09-15 12:30:00,1.3226057283408936
2015-09-15 12:45:00,1.3238564048371992
2015-09-15 13:00:00,1.3247697516940984

And my code is:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model

sy = 0.139
hp = 2.12
dados = pd.read_csv('wt_f.csv' , header = 0)
dados['Date_time'] = pd.to_datetime(dados['Date_time'],format='%Y-%m-%d %H:%M:%S')
dados = dados.set_index(['Date_time'])
dados['wt_l'] = dados['wt_f'] - hp
dados['dwt_l/dt'] = dados.wt_l.diff(+1)
dados['dt_j'] = dados.index.to_julian_date()
#dados['date'] = dados.index.date
reg = linear_model.LinearRegression()

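# x and y must be lists of column names (e.g. ['dt_j'] and ['wt_l']) so that
# both are 2-D; otherwise coef_[0][0] and intercept_[0] raise an error.
def regr_m(dados, x, y):
    # slope of the linear fit of y against x
    reg_m = reg.fit(dados[x], dados[y])
    return reg_m.coef_[0][0]

def regr_b(dados, x, y):
    # intercept of the linear fit of y against x
    reg_b = reg.fit(dados[x], dados[y])
    return reg_b.intercept_[0]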

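# one row per day (the midnight timestamps), used to build the limits table
a = dados.at_time('0:00:00')
a = a.drop(['wt_f','wt_l','dwt_l/dt','dt_j'] , axis = 1)
# lower limit: 4 hours before midnight (20:00 of the previous day)
a['lim_inf'] = a.index - pd.to_timedelta('4:00:00')
# upper limit: 32 hours after midnight gives 08:00 of the next day;
# the 06:00 limit from the example above would correspond to '30:00:00'
a['lim_sup'] = a.index + pd.to_timedelta('32:00:00')
a = a.reset_index()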
jcdc1331
  • It sounds like the groups overlap. Do you mean that for 08/16/2016 the range would go from 08/15/2016 20:00:00 to 08/17/2016 06:00:00? If so, groupby won't work because the groupby subsets are not distinct. You could perhaps make a list or 2D array of indices corresponding to "days" and apply your function to that. – Dave X Oct 31 '17 at 03:51
  • Dave, do you have any idea how to do that? I'm really new to Python and I don't know how to do it. Thank you very much for your help. – jcdc1331 Oct 31 '17 at 23:43
  • With just 2 years, I'd loop over days, and write your function to work across a range like https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates – Dave X Nov 01 '17 at 02:26
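The loop suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is synthetic rather than read from wt_f.csv, the window offsets (4 h before and 30 h after midnight) come from the example in the question, the helper names `apply_per_window` and `slope` are hypothetical, and `np.polyfit` stands in for the scikit-learn regression used in the question.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute series standing in for wt_f.csv (assumption)
idx = pd.date_range("2016-08-13", "2016-08-18", freq="15min")
dados = pd.DataFrame({"wt_f": np.linspace(1.0, 2.0, len(idx))}, index=idx)
dados.index.name = "Date_time"

def apply_per_window(df, func, before="4h", after="30h"):
    """Apply func to the rows in [day - before, day + after] for each day.

    The windows overlap, which is why a plain groupby does not fit here.
    """
    before = pd.Timedelta(before)
    after = pd.Timedelta(after)
    days = df.index.normalize().unique()  # one midnight timestamp per day
    results = {}
    for day in days:
        # Label slicing on a sorted DatetimeIndex includes both end points
        window = df.loc[day - before : day + after]
        results[day] = func(window)
    return pd.Series(results, name="result")

def slope(window):
    # Hypothetical per-day function: least-squares slope of wt_f per day of time
    x = ((window.index - window.index[0]) / pd.Timedelta("1D")).to_numpy()
    y = window["wt_f"].to_numpy()
    return np.polyfit(x, y, 1)[0]

daily = apply_per_window(dados, slope)   # one slope per day
counts = apply_per_window(dados, len)    # number of rows in each window
print(daily)
print(counts)
```

Windows at the start and end of the data are simply shorter, because the slice only returns rows that exist. With about 730 days, a plain Python loop like this should be fast enough.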

0 Answers