
Sadly, the answers to this question for datetime.date do not work for datetime.time.

So I implemented a function and ran it through df.apply(), which does what I expect:

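import datetime

import pandas as pd

def get_ts_timeonly_float(timeonly):
    # Seconds since midnight for a datetime.time or a pd.Timedelta.
    if isinstance(timeonly, datetime.time):
        return timeonly.hour * 3600 + timeonly.minute * 60 + timeonly.second
    elif isinstance(timeonly, pd.Timedelta):
        # .seconds is the seconds component only; whole days are ignored.
        return timeonly.seconds

# Row-wise apply: the lambda is called once per row, in Python.
fn_get_ts_timeonly_pd_timestamp = lambda row: get_ts_timeonly_float(row.ts_timeonly)
col = df.apply(fn_get_ts_timeonly_pd_timestamp, axis=1)
df = df.assign(ts_timeonly_as_ts=col.values)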

Problem:

However, this is not yet “blazingly fast.” One reason is that .apply() will try internally to loop over Cython iterators. But in this case, the lambda that you passed isn’t something that can be handled in Cython, so it’s called in Python, which is consequently not all that fast.

The quote above is from a great blog post.
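To make the point of the quote concrete, here is a minimal sketch, assuming a small hypothetical frame whose ts_timeonly column holds only datetime.time values. It compares the row-wise apply with a plain list comprehension over the column. Both still call Python code once per element, so neither is vectorized, but the comprehension at least avoids building a row object for every record. The helper name time_to_seconds and the sample data are illustrative and not part of the original code.

```python
import datetime

import pandas as pd

# Hypothetical sample data; only the column name follows the question.
df = pd.DataFrame(
    {
        "ts_timeonly": [
            datetime.time(0, 0, 1),
            datetime.time(1, 2, 3),
            datetime.time(23, 59, 59),
        ]
    }
)


def time_to_seconds(t):
    # Whole seconds since midnight for a datetime.time value.
    return t.hour * 3600 + t.minute * 60 + t.second


# Row-wise: pandas builds a Series per row and calls the lambda in Python.
row_wise = df.apply(lambda row: time_to_seconds(row.ts_timeonly), axis=1)

# Column-wise: still a Python-level loop, but without the per-row Series.
column_wise = pd.Series(
    [time_to_seconds(t) for t in df["ts_timeonly"]], index=df.index
)
```

Both variants should produce the same values; any speed difference would have to be measured on the real data.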

So is there a faster way to convert datetime.time into an int representation (such as the total seconds since the start of the day)? Thanks!

gies0r
  • The use of branching (`if`) pretty much excludes your code from the "blazing fast" (i.e. vectorized) operations offered by pandas. How did you get the `time` values in the first place? Try keeping the column as `pd.Timedelta` so you can use `col.dt.total_seconds()`. Anyhow, some sample input would be helpful here – Code Different Apr 25 '20 at 13:49
  • You are right about the `if` statement, @CodeDifferent. I have now converted the `datetime.time()` values to `pd.Timedelta` using `pd.to_timedelta(df.ts_timeonly.astype(str))` and removed the `if` statement. Thanks. – gies0r Apr 25 '20 at 14:37
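
The approach from the comments can be sketched roughly as follows. This is a minimal example under stated assumptions: the frame and its values are hypothetical, the column contains only datetime.time objects, and the new column name ts_timeonly_as_ts is taken from the question. The strings produced by astype(str) have the form "HH:MM:SS", which pd.to_timedelta parses, and .dt.total_seconds() then works on the whole column at once.

```python
import datetime

import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame(
    {
        "ts_timeonly": [
            datetime.time(0, 0, 1),
            datetime.time(1, 2, 3),
            datetime.time(23, 59, 59),
        ]
    }
)

# Convert the time objects to strings, then parse them as timedeltas.
as_timedelta = pd.to_timedelta(df["ts_timeonly"].astype(str))

# total_seconds() returns floats; cast to int64 for whole seconds since midnight.
seconds = as_timedelta.dt.total_seconds().astype("int64")

df = df.assign(ts_timeonly_as_ts=seconds)
```

The string round trip is itself not free, so if the data can be loaded as timedeltas in the first place, that conversion step can probably be skipped.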
