import pandas as pd
d= {'start_time': ["06:30:00", "07:30:00", "24:30:00","17:30:00","22:30:00","16:30:00"], 'final_time': ["09:30:00", "23:30:00", "27:30:00","26:30:00","23:30:00","23:00:00"], 'time_squence': [22, 33, 223,333,3321,76]}
data=pd.DataFrame(d)

In this DataFrame I want to remove every row where either start_time or final_time is 24 hours or later, and then update the DataFrame, because Python cannot convert those time strings to datetime objects.

d= {'start_time': ["06:30:00", "07:30:00","22:30:00","16:30:00"], 'final_time': ["09:30:00", "23:30:00","23:30:00","23:00:00"], 'time_squence': [22, 33,3321,76]}
data=pd.DataFrame(d)

That is the expected output, and the remaining values should then be converted to datetime objects.

Ali Mozard

2 Answers

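You can first keep only the rows where both values are at most 24 hours, using boolean indexing: convert the values to timedeltas with to_timedelta, get the seconds with Series.dt.total_seconds, and compare them with the number of seconds in a day. The sample data here uses "24:00:00" as the first start_time, which the <= comparison keeps: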

d= {'start_time': ["24:00:00", "07:30:00", "24:30:00","17:30:00","22:30:00","16:30:00"], 
    'final_time': ["09:30:00", "23:30:00", "27:30:00","26:30:00","23:30:00","23:00:00"], 
    'time_squence': [22, 33, 223,333,3321,76]}
data=pd.DataFrame(d)

day_sec = 86400
m1 = pd.to_timedelta(data['start_time']).dt.total_seconds() <= day_sec
m2 = pd.to_timedelta(data['final_time']).dt.total_seconds() <= day_sec

data = data[m1 & m2].copy()
print(data)
  start_time final_time  time_squence
0   24:00:00   09:30:00            22
1   07:30:00   23:30:00            33
4   22:30:00   23:30:00          3321
5   16:30:00   23:00:00            76
    
jezrael

One way is to use .dropna(). First use .apply() to read the hour from each string and replace any time with an hour of 24 or more by None, then drop the rows that contain a missing value. Code below:

# Keep the string if its hour field is below 24, otherwise mark it as missing
data["final"] = data["final_time"].apply(lambda x: x if int(x[0:2]) < 24 else None)
data["start"] = data["start_time"].apply(lambda x: x if int(x[0:2]) < 24 else None)
# Drop every row where either helper column is missing
data.dropna(axis=0, inplace=True)
# Remove the two helper columns again
data = data.iloc[:, :-2]
data

output:

    start_time  final_time  time_squence
0   06:30:00    09:30:00    22
1   07:30:00    23:30:00    33
4   22:30:00    23:30:00    3321
5   16:30:00    23:00:00    76
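
The same dropna idea can also be combined with the conversion the question asks for. The following is a sketch rather than part of the answer above: it assumes the question's original data and relies on pd.to_datetime with errors='coerce', which turns strings that do not match the format (such as "24:30:00") into NaT.

```python
import pandas as pd

d = {'start_time': ["06:30:00", "07:30:00", "24:30:00", "17:30:00", "22:30:00", "16:30:00"],
     'final_time': ["09:30:00", "23:30:00", "27:30:00", "26:30:00", "23:30:00", "23:00:00"],
     'time_squence': [22, 33, 223, 333, 3321, 76]}
data = pd.DataFrame(d)

cols = ['start_time', 'final_time']

# Unparseable times (hour 24 or more) become NaT instead of raising an error.
for col in cols:
    data[col] = pd.to_datetime(data[col], format='%H:%M:%S', errors='coerce')

# Drop every row where either time could not be parsed.
data = data.dropna(subset=cols)

print(data)
```

This filters and converts in one pass, at the cost of also silently dropping rows that are malformed for any other reason.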
The.B