I have a CSV file whose first column looks like this:

2018-12-10 20:00:25.855
2018-12-10 20:09:26
2018-12-10 20:13:27.31
2018-12-10 20:23:28

These are currently strings that I ultimately want to convert into time objects (without the date). My first step was to remove the milliseconds, but I can't figure out how to do that when some of the strings don't contain milliseconds.

I tried using this line to drop the milliseconds, but I end up with an "unconverted data remains" error:

 strdate = datetime.strptime(column[0], '%Y-%m-%d %H:%M:%S').replace(microsecond=0)
 ValueError: unconverted data remains: .855

I have also tried stripping everything after the ".", but the string comes back unchanged:

column[0].strip('.')
klex52s
  • "data remains" error? If you are getting an error, please add it to your question. – amanb Dec 14 '18 at 20:16
  • Possible duplicate of [How to format date string via multiple formats in python](https://stackoverflow.com/questions/23581128/how-to-format-date-string-via-multiple-formats-in-python) – pault Dec 14 '18 at 20:18
  • Check the post suggested by @pault. For your information, `column[0].strip('.')` does not work because it only removes `'.'` from the beginning and end of the string. What you want is `column[0].split('.')[0]`. – buran Dec 14 '18 at 20:25

4 Answers

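string = '20:00:25.855'
# str.find returns -1 when there is no '.', which would cut off the last
# character, so this slice is only safe when a fractional part is present
newstr = string[:string.find('.')]
print(newstr)
# 20:00:25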

Applying the same logic to a DataFrame column, with a check for strings that have no ".":

import pandas as pd
datadict = {
        'Time':['2018-12-10 20:00:25.855',
                '2018-12-10 20:09:26',
                '2018-12-10 20:13:27.31',
                '2018-12-10 20:23:28'],
        }
df = pd.DataFrame(datadict)

df['Time'] = [row[11:row.find('.')] if '.' in row else row[11:] for row in df['Time']]
print (df)
       Time
0  20:00:25
1  20:09:26
2  20:13:27
3  20:23:28
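
If pandas is not needed, the same idea can be sketched with the standard library only. This is a minimal sketch, not the only way to do it: `strip_fraction` and `to_time` are hypothetical helper names, and the format string assumes every row looks like the samples in the question. `str.partition` is used instead of `find` because it returns the whole string unchanged when there is no ".", so no conditional is required.

```python
from datetime import datetime

def strip_fraction(stamp):
    # partition splits at the first '.'; without a '.', element 0 is the whole string
    return stamp.partition('.')[0]

def to_time(stamp):
    # Parse the timestamp without its fractional part, then keep only the time
    return datetime.strptime(strip_fraction(stamp), '%Y-%m-%d %H:%M:%S').time()

stamps = ['2018-12-10 20:00:25.855',
          '2018-12-10 20:09:26',
          '2018-12-10 20:13:27.31',
          '2018-12-10 20:23:28']

for stamp in stamps:
    print(to_time(stamp))
```

The result is a list of `datetime.time` objects rather than strings, which is what the question ultimately asks for.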
ycx

This returns the time portion of the datetime object, which you can then use for whatever calculations you need:

from datetime import datetime

def get_times():
    times = ['2018-12-10 20:00:25.855','2018-12-10 20:09:26']
    return [datetime.strptime(x[11:19],'%H:%M:%S').time() for x in times]

Output is: [datetime.time(20, 0, 25), datetime.time(20, 9, 26)]

To return a 'readable' form:

def get_times():
    times = ['2018-12-10 20:00:25.855','2018-12-10 20:09:26']
    dt_objects =  [datetime.strptime(x[11:19],'%H:%M:%S').time() for x in times]
    return [dt.strftime('%H:%M:%S') for dt in dt_objects]

Output is: ['20:00:25', '20:09:26']

Nick

In case you want to parse the time including microseconds, you can conditionally extend the format string:

from datetime import datetime as DT

times =['2018-12-10 20:00:25.855',
'2018-12-10 20:09:26',
'2018-12-10 20:13:27.31',
'2018-12-10 20:23:28']

for t in times:
    hasdot = '.' in t
    print(DT.strptime(t[11:], '%H:%M:%S' + ('.%f' if hasdot else '' )).time())

#20:00:25.855000
#20:09:26
#20:13:27.310000
#20:23:28
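
A variation on the same idea, in the spirit of trying multiple formats, is to attempt each candidate format in turn and keep the first one that parses. This is only a sketch: `FORMATS` and `parse_time` are hypothetical names, and the two formats listed are assumptions based on the sample data in the question.

```python
from datetime import datetime

# Candidate layouts, tried in order (assumed from the sample data)
FORMATS = ('%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S')

def parse_time(stamp):
    """Return a datetime.time using the first format that matches stamp."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(stamp, fmt).time()
        except ValueError:
            continue  # this layout did not match, try the next one
    raise ValueError('no known format matches %r' % stamp)

for t in ('2018-12-10 20:00:25.855', '2018-12-10 20:09:26'):
    print(parse_time(t))
```

Unlike the dot check, this fails loudly on a row that matches neither layout, which can be useful when the CSV is not fully trusted.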
SpghttCd
  • This solution didn't work for me, but I understand the logic behind it and can see how it is applicable. – klex52s Dec 18 '18 at 14:38
  • :) well, that sounds interesting, so that I'm curious: could you please explain, why it didn't work but still is worth being accepted or how you had to modify it because of what...? – SpghttCd Dec 18 '18 at 17:42

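datetime.fromisoformat() (available since Python 3.7) parses both formats, with and without milliseconds. Before Python 3.11 it only accepts a fractional part of exactly 3 or 6 digits, so a value such as 20:13:27.31 needs 3.11 or later.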
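
A minimal sketch of how this could look for the sample data. `parse_stamp` is a hypothetical helper; padding the fractional part to six digits is an assumption added here so that the same call also works on interpreters where `fromisoformat` is strict about the number of fractional digits.

```python
from datetime import datetime

def parse_stamp(stamp):
    """Parse 'YYYY-MM-DD HH:MM:SS[.f...]' with datetime.fromisoformat."""
    head, dot, frac = stamp.partition('.')
    if dot:
        # Normalise the fraction to exactly 6 digits ('31' -> '310000')
        stamp = head + '.' + frac.ljust(6, '0')[:6]
    return datetime.fromisoformat(stamp)

for s in ('2018-12-10 20:00:25.855', '2018-12-10 20:13:27.31', '2018-12-10 20:23:28'):
    parsed = parse_stamp(s)
    # Drop the fractional seconds, then keep only the time of day
    print(parsed.replace(microsecond=0).time())
```

Calling `.time()` without the `replace` step keeps the microseconds if they are needed later.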

zisha