I am trying to fix wrongly formatted datetime strings by inserting a space between their 'day' and 'hours' sections (as seen in the data below).

The rows below are the incorrectly formatted ones, which I filtered out with a regex. I can't figure out how to insert the missing space. Can this be done with regex? I'm new to regex and would appreciate any pointers!

df[df['time'].str.contains(r'\.[0-9][0-9][0-9][0-9]\:')]['time']
20192    2010.12.2512:01:45
20193    2010.12.2512:02:52
20196    2010.12.2512:07:43
20197    2010.12.2512:25:33
42811    2013.01.0103:04:10
43527    2013.05.1605:49:49
Laz

1 Answer

Your regex isn't far off; you just need to add capturing groups, between which you can place the missing space. For example:

import pandas as pd

df = pd.DataFrame({'time': ['2010.12.2512:01:45', '2010.12.2512:02:52', '2010.12.25 12:07:43']})

# raw strings keep Python from treating \. as a string escape
df['time'].replace(r'(\.[0-9]{2})([0-9]{2}:)', r'\1 \2', regex=True)
# 0    2010.12.25 12:01:45
# 1    2010.12.25 12:02:52
# 2    2010.12.25 12:07:43
# Name: time, dtype: object

Note that I've added one "valid" string at the end of the list. The regex replace leaves it unmodified, because the pattern only matches four consecutive digits between the dot and the colon.
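If the end goal is a proper datetime column, the repaired strings can then be parsed. The following is a minimal sketch, not part of the original answer: the names `pattern`, `fixed` and `time_parsed` are made up for illustration, and the format string is an assumption based on the sample values shown in the question.

```python
import pandas as pd

# Sample data modelled on the question: two malformed strings, one valid.
df = pd.DataFrame({'time': ['2010.12.2512:01:45',
                            '2013.01.0103:04:10',
                            '2010.12.25 12:07:43']})

# Group 1: a dot plus the two-digit day.
# Group 2: the two-digit hour plus the colon that follows it.
pattern = r'(\.\d{2})(\d{2}:)'

# Re-emit both groups with a space in between; valid strings do not match.
fixed = df['time'].str.replace(pattern, r'\1 \2', regex=True)

# Assumed format: year.month.day hour:minute:second
df['time_parsed'] = pd.to_datetime(fixed, format='%Y.%m.%d %H:%M:%S')

print(df)
```

Passing an explicit `format` makes `pd.to_datetime` raise an error on any string that still does not fit, which is a cheap way to check that no malformed rows slipped through.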

FObersteiner