I have some data in chronological order. The index is a datetime with minute-level resolution. I store the hour in a column called hour and the minute in a column called minute. I want to trim the start of the data so that it always begins at 00:00. The incoming dataset may begin at an arbitrary minute of the day. The data consists of minute-level rows for many days (1000s), so losing part of the first day is not an issue. I just need the data to start at midnight.
I am trying to use the following code to trim my data frame so that it always begins with 00:00.
def clean_start_data(df):
    for index, row in df.iterrows():
        if row['hour'] > 0 or row['minute'] > 0:
            df.drop(index, inplace=True)
        else:
            break
    return df
But the loop gets stuck and my kernel becomes unresponsive.
What am I doing wrong?
EDIT
My data looks like this
h = 9 m = 0 data = blah
h = 9 m = 1 data = blahhbadf
h = 9 m = 2 data = somethning_else
....
h = 0 m = 0 data = something. // new day...I want to start here and remove all rows above
The data covers around 400 days. At h=23 m=59, the h goes back to 0 and minute goes back to 0.
I want to remove the time entries that occur before the first new day starts, i.e. I want my data to start at h = 0 m = 0.
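To make the question reproducible, here is a minimal sketch of data in the shape described above, together with the result I am after. The dates, the "blah" values and the helper name trim_to_first_midnight are made up for illustration. The helper uses a boolean mask rather than dropping rows one at a time; it is only one possible way to express the intent, not necessarily the best one.

```python
import pandas as pd

# Hypothetical sample: minute-level rows starting at 09:00 (dates are made up).
idx = pd.date_range("2020-01-01 09:00", "2020-01-03 23:59", freq="min")
df = pd.DataFrame({"data": "blah"}, index=idx)
df["hour"] = df.index.hour
df["minute"] = df.index.minute

def trim_to_first_midnight(df):
    """Return df starting at the first row where hour == 0 and minute == 0."""
    # True on rows where a new day starts.
    is_midnight = (df["hour"] == 0) & (df["minute"] == 0)
    if not is_midnight.any():
        # Assumption: with no midnight row at all, there is nothing to keep.
        return df.iloc[0:0]
    # Position of the first midnight row; keep everything from there on.
    first = is_midnight.to_numpy().argmax()
    return df.iloc[first:]

trimmed = trim_to_first_midnight(df)
print(trimmed.head())
```

With this sample, the partial first day (09:00 to 23:59) should be gone and the first remaining row should be the midnight row of the second day.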