
I am trying to convert MM:SS.ss into a timedelta. So far, I have tried:

df=pd.to_timedelta(df) 

This gave me TypeError: arg must be a string, timedelta, list, tuple, 1-d array, or Series. I also tried:

df = pd.to_datetime(df, format = "%MM:%SS:%ss")

Example of the data: 1:24.3154

CBegin
  • Information regarding the data-frame structure missing. PS: the `pd.to_datetime` is applied on a column. So, maybe you can try: `df['Date'] = pd.to_datetime(df['Date'], format = "%MM:%SS:%ss")` – Serial Lazer Nov 05 '20 at 02:44
  • Are you reading the datafile from a csv file? – Carlos Nov 05 '20 at 02:48
  • @Carlos I'm reading the data from a csv file I created – CBegin Nov 05 '20 at 02:49
  • @SerialLazer The whole data contains lap times, so I'm trying to convert all the values into timedelta – CBegin Nov 05 '20 at 02:50
  • Did you try the `parse_dates` parameter? https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html Also, this is related: https://stackoverflow.com/questions/21906715/how-to-get-pandas-read-csv-to-infer-datetime-and-timedelta-types-from-csv-file ;) – Carlos Nov 05 '20 at 02:56
  • This is not programming at all – NVRM Nov 05 '20 at 02:57

1 Answer



import pandas as pd
df = pd.DataFrame({    # Read-in input DataFrame any way you need
    'TimeCol0': ['1:24.3154', '2:36.789'],
    'TimeCol1': ['3:57.98',   '10:15.32'],
    'OtherCol': ['a', 'b'],
})
for col in ['TimeCol0', 'TimeCol1']:  # Put here all time columns to be converted
    df[col] = pd.to_datetime(df[col], format = '%M:%S.%f') - pd.to_datetime('1900-01-01')
print(df)

Output:

                TimeCol0               TimeCol1 OtherCol
0 0 days 00:01:24.315400 0 days 00:03:57.980000        a
1 0 days 00:02:36.789000 0 days 00:10:15.320000        b
Arty
  • the datetime detour seems painful ^^ also, the explicit format makes it sort of inflexible; what if some entries also have an hour? Nice solution anyway! – FObersteiner Nov 05 '20 at 06:51
  • @MrFuppes If several different formats are intermixed in the data and they differ clearly, such as having or not having hours, we can distinguish them by the number of `:` occurrences in the string. Then just apply the same code as above, but only to the filtered cells: a first run for strings where `:` appears once, a second run for strings where it appears twice. As for processing speed, all that matters is that you don't spend too much time in Python-only code, i.e. don't do too many runs; a fixed number of 5-10 runs is totally fast! If speed doesn't matter, you can apply any Python-only function to all cells! – Arty Nov 06 '20 at 08:48
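
The colon-counting idea from the comments can be sketched as follows. This is only an illustration, not the answer's method: instead of going through `pd.to_datetime`, it pads entries that have a single `:` with a zero hour field and hands everything to `pd.to_timedelta`, which, as far as I know, expects an hours field in colon-separated strings. The helper name `lap_to_timedelta` and the third sample value (with hours) are made up for this example.

```python
import pandas as pd

def lap_to_timedelta(col: pd.Series) -> pd.Series:
    """Convert 'M:SS.ff' or 'H:MM:SS.ff' strings to timedeltas (hypothetical helper)."""
    col = col.astype(str).str.strip()
    # Entries with exactly one ':' have no hour field; prepend '0:' so that
    # every string has the H:MM:SS shape before parsing.
    needs_hours = col.str.count(':') == 1
    padded = col.where(~needs_hours, '0:' + col)
    return pd.to_timedelta(padded)

# Two lap times from the question's format plus one assumed entry with hours
laps = pd.Series(['1:24.3154', '2:36.789', '1:02:03.5'])
result = lap_to_timedelta(laps)
print(result)
```

For a DataFrame, the same helper can be applied per column, e.g. `df[col] = lap_to_timedelta(df[col])` inside the loop from the answer. The padding is vectorized, so no Python-level loop over individual cells is involved.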