1

I'm trying to convert one of the Column in CSV which is for Date/timestamp in the format 12/30/2023 22:13 to a required format 2023-12-30T22:13:39Z.

So far got this:

import pandas as pd
df = pd.read_csv('input.csv', na_filter=False, skiprows=0)
print(df)
df['datetime'] = pd.to_datetime(df['datetime'])
print(df)

This gets me Date/time in format of 2023-12-30 22:13:00.

There is a 'format' in to_datetime() documentation (https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html), Tried this but this does not work.

df['datetime'] = pd.to_datetime(df['datetime'], format='%Y-%m-%dT%H:%M:%sZ', errors='coerce')

This Python - Pandas - Convert YYYYMM to datetime closet I found to what I needed but cannot make it work.

Please someone help.

input.csv enter image description here

Tried: enter image description here

AliasSyed
  • 67
  • 8
  • 3
    `pd.to_datetime(df['datetime'], dayfirst=False).dt.strftime('%Y-%m-%dT%H:%M:%sZ')` the format to pass to `to_datetime` is the input format, not the desired one. – mozway Dec 20 '22 at 22:37
  • Does this answer your question? [Convert Pandas Column to DateTime](https://stackoverflow.com/questions/26763344/convert-pandas-column-to-datetime) – pylos Dec 20 '22 at 22:51
  • @mozway This works great, on test csv file. But when I used on production input file which few thousand rows long, I found it failing with this error because it didn't like the format: ``` raise ParserError("Unknown string format: %s", timestr) dateutil.parser._parser.ParserError: Unknown string format: =abc xyz on mm/dd/yyyy, NUM31 days ``` Is there way to ignore if something doesn't follow format? Thank you. – AliasSyed Dec 21 '22 at 00:11
  • 1
    Add `errors='coerce'` – mozway Dec 21 '22 at 06:17

1 Answers1

1

Using Regular expression re with csv

Input.csv

datetime
12/30/2023 22:13
9/17/2023 6:55
9/1/2023 23:59
7/5/2023 23:59
3/5/2023 14:56
5/9/2023 8:32

Code

import csv
import re

with open('input.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)
    for row in csv_reader:
        print(row[0])
        date_pattern = "(\d{1,2})\/(\d{1,2})\/(\d{4})\s(\d{1,2})\:(\d{2})"
        r = re.match(date_pattern, row[0])
        new_date = '{0}-{1}-{2}T{3}:{4}:00Z'.format(r[3], r[1], r[2], r[4], r[5])
        print(new_date)
        print('--------------------')

Result

$ python convert-csv.py
12/30/2023 22:13
2023-12-30T22:13:00Z
--------------------
9/17/2023 6:55
2023-9-17T6:55:00Z
--------------------
9/1/2023 23:59
2023-9-1T23:59:00Z
--------------------
7/5/2023 23:59
2023-7-5T23:59:00Z
--------------------
3/5/2023 14:56
2023-3-5T14:56:00Z
--------------------
5/9/2023 8:32
2023-5-9T8:32:00Z
--------------------

Regexper Visualization

enter image description here

regex101 test matched strings

enter image description here

Bench Vue
  • 5,257
  • 2
  • 10
  • 14