1

The input data comes from a CSV file in which dates are formatted in AM/PM and then the Pandas library is used to export the csv to a sqlite database table.

Are there any elegant solutions or a builtin function for either python 2.7 or 3 to convert these 12 hour dates to 24 hour format, preferably before they are written to the database? I need it to compare dates which will be imported to the database.

The following code is used:

import pandas as pd
import datetime as dt

from sqlite3 import connect
from glob import iglob
from os.path import splitext, basename


dbName = 'logs'
logsDir = 'event_logs'

def csvToDb(db, filename):
    conn = connect(db)
    df = pd.read_csv(filename,encoding='utf-8')
    df.to_sql(splitext(basename(filename))[0], conn)
    conn.close()


def csvLogsToDb(directory):
    for filename in iglob(directory + '/*.csv'):
        print('WRITING TO DB | ' + filename)
        csvToDb(dbName, filename)

    print("DONE")


if __name__ == "__main__":
    csvLogsToDb(logsDir)

I removed most data and kept the most relevant, but in a nutshell, this is my current input and needed output

INPUT CSV DATA:

Level,DateTime    
Information,2017/01/06 9:06:02 AM
Information,2017/01/06 9:02:44 AM
Information,2017/01/06 8:56:48 PM

CURRENT OUTPUT

COL1    
    Information    
    Information
    Information

COL2
    2017/01/06 9:06:02 AM
    2017/01/06 9:02:44 AM
    2017/01/06 8:56:48 PM

EXPECTED OUTPUT

COL1
    Information
    Information
    Information

COL2
    2017-01-06 09:06:02
    2017-01-06 09:02:44
    2017-01-06 20:56:48
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Elmar
  • 55
  • 3
  • 15
  • 1
    Possible duplicate of [How can I account for period (AM/PM) with datetime.strptime?](http://stackoverflow.com/questions/1759455/how-can-i-account-for-period-am-pm-with-datetime-strptime) – Alex Hall Jan 06 '17 at 13:16
  • So how is AM or PM indicated in those dates? It's just a question of configuring the parser for the dates to read that information. – Martijn Pieters Jan 06 '17 at 13:16
  • @MartijnPieters : The CSV dates are in the following format : '2017/01/05 7:43:25 PM' . Sorry if this question is silly. I have used Python for some time but don't do so frequently as I mainly use C# . – Elmar Jan 06 '17 at 13:22
  • So how do you read that data currently? What happens, what should happen instead? – Martijn Pieters Jan 06 '17 at 13:24
  • I note that the Pandas CSV loader has [specific Datetime handling options](http://pandas.pydata.org/pandas-docs/stable/io.html#datetime-handling), and the default should be able to handle AM/PM times out of the box. – Martijn Pieters Jan 06 '17 at 13:25
  • I edited my answer to include the code. I changed the line with read_csv to: df = pd.read_csv(filename,encoding='utf-8',parse_dates=True) but still doesn't work. I am also reading up more on the date formating for pandas btw. – Elmar Jan 06 '17 at 13:34
  • @ElmarleGrange: "doesn't work" is not a helpful problem description. What input data do you have, what output do you see, what did you expect to happen instead? Without that information, all we can do is give generic advice. – Martijn Pieters Jan 06 '17 at 13:38
  • @MartijnPieters : I added more information to my question which includes input, current output and expected output. – Elmar Jan 06 '17 at 14:02

2 Answers2

2

Just tell pandas.read_csv() to parse the dates, see the datetime parsing options available.

The dateutil.parser default parser can handle times with AM/PM out of the box. Set the parse_dates option to a list of column indices you want parsed. For your sample input, that'd be column 1:

>>> import pandas
>>> from io import BytesIO
>>> data = BytesIO(b'''\
... Level,DateTime
... Information,2017/01/06 9:06:02 AM
... Information,2017/01/06 9:02:44 AM
... Information,2017/01/06 8:56:48 PM
... ''')
>>> pandas.read_csv(data, parse_dates=[1])
         Level        DateTime
0  Information 2017-01-06 09:06:02
1  Information 2017-01-06 09:02:44
2  Information 2017-01-06 20:56:48
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Works perfectly now! Changing parse_dates=true TO parse_dates=[1] did the trick. Thank you for your time and patience. I really appreciate it. – Elmar Jan 06 '17 at 14:19
1

You can use datetime package to get this done

from datetime import datetime
data = '2:35 PM'
standard_time_format = datetime.strptime(data, "%I:%M %p")
only_in_24_hour_format = datetime.strftime(standard_time_format, "%H:%M")
print only_in_24_hour_format
print standard_time_format

This would give the following as output

14:35

1900-01-01 14:35:00

Ben
  • 51,770
  • 36
  • 127
  • 149
Suraj
  • 38
  • 7