
I'm new to Python and am trying to convert this code from another language. I don't know whether there is a simple way to solve my problem and avoid the long processing time.

About the problem

I have a data frame with 2 columns (a time, at 30-minute intervals, and a value), and I'm trying to find the maximum aggregate value over a specific time step for each day.

Each time stamp already represents an accumulation. For example, '2019-03-28 04:00:00' represents the accumulation from 03:31:00 to 04:00:00.

So, for a time step equal to 2 hours, for example, the maximum on 2019-03-28 is found over the rows from 04:00:00 to 05:30:00 (= 80.0), but it could fall in a different window in another set of data.

      Time                Value
2019-03-28 00:30:00        10.0 
2019-03-28 01:00:00         5.0
2019-03-28 01:30:00         0.0
2019-03-28 02:00:00        15.0
2019-03-28 02:30:00         2.0
2019-03-28 03:00:00         0.0
2019-03-28 03:30:00         0.0
2019-03-28 04:00:00        10.0  *
2019-03-28 04:30:00         0.0  *
2019-03-28 05:00:00        10.0  *
2019-03-28 05:30:00        60.0  *
2019-03-28 06:00:00         0.0
........
........
2019-03-28 23:30:00         0.0
........
........

EDIT: Is there a simple way to automatically find the maximum value aggregated over 2 hours for each day?
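To make the target concrete, here is a minimal sketch that reproduces the 80.0 figure from the sample rows above. It assumes the rows are evenly spaced at 30 minutes, so a 2-hour step is simply 4 consecutive rows; the names `sample`, `window_sum`, `best_end` and `best_total` are made up for illustration.

```python
import pandas as pd

# Rows copied from the sample table in the question (2019-03-28, 00:30 to 06:00).
sample = pd.DataFrame({
    "Time": pd.date_range("2019-03-28 00:30", periods=12, freq="30min"),
    "Value": [10.0, 5.0, 0.0, 15.0, 2.0, 0.0, 0.0, 10.0, 0.0, 10.0, 60.0, 0.0],
})

# Assumption: rows are evenly spaced, so a 2-hour step covers 4 consecutive rows.
window_sum = sample.set_index("Time")["Value"].rolling(4).sum()

best_end = window_sum.idxmax()  # time stamp of the last row in the best window
best_total = window_sum.max()   # total accumulated over that window
print(best_end, best_total)     # 2019-03-28 05:30:00 80.0
```

The first three entries of `window_sum` are NaN because fewer than 4 rows are available there; `max()` and `idxmax()` skip them.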

Geo0511

2 Answers


Please try the following. If it doesn't work, let us know and we will help further.

import pandas as pd

df['Time'] = pd.to_datetime(df['Time'])  # Coerce Time to datetime
df.set_index(df['Time'], inplace=True)  # Set Time as the index
df.groupby(df.index.date)['Value'].max().to_frame()  # Group by date and take each day's maximum. Can also substitute date for day
wwnde
  • It's close to what I want. This code gets the maximum 30-minute value, which returns 60. What I'm looking for is the maximum value observed over 2 hours. So, for this specific day, the values should be accumulated from 04:00:00 to 05:30:00, returning a value equal to 80. – Geo0511 May 17 '20 at 12:53
  • This is not what you asked for initially. So you have changed the question? Try df.rolling('2H')['Value'].sum().max(). Even so, I am not clear on what you really need. Basically, it is just a rolling window. My advice: when you come across new terminology, read up on it and work through tutorials. You'll learn faster. – wwnde May 17 '20 at 12:56
  • Thank you for your help. I used the idea from another question to accumulate the values: [https://stackoverflow.com/questions/42308197/python-pandas-find-consecutive-group-with-max-aggregate-values](https://stackoverflow.com/questions/42308197/python-pandas-find-consecutive-group-with-max-aggregate-values). Then I applied the function using groupby, as you suggested. – Geo0511 May 17 '20 at 19:32
  • The way to go. Keep coding – wwnde May 17 '20 at 19:42
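The combination described in the comments (accumulate over a window, then take the maximum per day with groupby) could look roughly like the sketch below. It is not the asker's actual code: `daily_max_window_sum` and `demo` are hypothetical names, the data is made up apart from the 10/0/10/60 run from the question, and days are assumed to be split by the calendar date of each time stamp, so a window never spans midnight.

```python
import pandas as pd

def daily_max_window_sum(df, periods=4):
    """For each calendar date, return the largest sum over `periods` consecutive rows."""
    series = df.set_index("Time")["Value"].sort_index()
    # Rolling inside each date group keeps every window within a single day.
    return series.groupby(series.index.date).apply(
        lambda day: day.rolling(periods).sum().max()
    )

# Made-up two-day series at 30-minute steps (hypothetical data).
demo = pd.DataFrame({
    "Time": pd.date_range("2019-03-28 00:30", periods=95, freq="30min"),
    "Value": 0.0,
})
demo.loc[7:10, "Value"] = [10.0, 0.0, 10.0, 60.0]  # 04:00 to 05:30 on day 1
demo.loc[46:47, "Value"] = [30.0, 30.0]            # 23:30 on day 1, 00:00 on day 2
demo.loc[60:63, "Value"] = [5.0, 5.0, 5.0, 5.0]    # 06:30 to 08:00 on day 2

daily = daily_max_window_sum(demo)
print(daily)  # 80.0 for 2019-03-28, 30.0 for 2019-03-29
```

The two 30.0 values sit on either side of midnight and are never added together, which is the effect of grouping before rolling. If the 00:00:00 stamp should count toward the previous day (it covers 23:31 to 00:00), the grouping key would need to be shifted accordingly.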

Using .resample():

# Import and initialise packages in session: 
import pandas as pd

# Coerce Time to datetime: Time => Date Vector
df['Time'] = pd.to_datetime(df['Time'])

# Replace index with date vec: index => Time
df.set_index(df['Time'], inplace=True)

# Resample to get the daily max: stdout => aggregated Series
df.resample('D').max()
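For the 2-hour requirement clarified in the comments, one possible variation is to compute a time-based rolling sum first and then resample that to a daily maximum. This is a sketch only, using the sample rows from the question; `sample`, `rolling_2h` and `daily_max` are illustrative names, and `pd.Timedelta(hours=2)` is used in place of an alias string such as '2H' to sidestep spelling differences between pandas versions.

```python
import pandas as pd

# Rows copied from the sample table in the question.
sample = pd.DataFrame({
    "Time": pd.date_range("2019-03-28 00:30", periods=12, freq="30min"),
    "Value": [10.0, 5.0, 0.0, 15.0, 2.0, 0.0, 0.0, 10.0, 0.0, 10.0, 60.0, 0.0],
})
series = sample.set_index("Time")["Value"]

# Time-based window: each row sums the values in the 2 hours ending at its stamp
# (the window start is excluded, so 05:30 covers 04:00, 04:30, 05:00 and 05:30).
rolling_2h = series.rolling(pd.Timedelta(hours=2)).sum()

# Daily maximum of the 2-hour totals, labelled by the date of each window's end.
daily_max = rolling_2h.resample("D").max()
print(daily_max)  # 80.0 for 2019-03-28
```

Unlike grouping by date before rolling, a window here can reach back across midnight, and the first rows of the series are summed over fewer than four values.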
hello_friend