I am reading several log files that use different date formats. I am using Python to read each file line by line and parse each line. I want to extract the date from each line and turn it into a date object that I can compare against other dates.

For example, say I have two log files, each with a different date format. How do I read the dates into objects so that I can compare them to a known date? Suppose, for example, that I want to discard all entries dated before a certain time.

Let's assume the first log file just has one line:

invalid access 2015-01-04 14:23:15 on IP 5.5.5.5

How do I read 2015-01-04 14:23:15 into a date object (so I can do comparisons)?

What if the date format were different? How would I read that in?

Matthew
  • If the time string represents local time, you need to convert it to UTC or POSIX time before doing comparisons, because local time is not monotonic. See [Find if 24 hrs have passed between datetimes - Python](http://stackoverflow.com/a/26313848/4279) – jfs Feb 10 '15 at 23:29
  • related: [How to parse ISO formatted date in python?](http://stackoverflow.com/q/127803/4279) – jfs Feb 10 '15 at 23:36

3 Answers

You can use datetime.datetime.strptime:

In [1]: from datetime import datetime
In [2]: d = '2015-01-04 14:23:15'
In [3]: datetime.strptime(d, '%Y-%m-%d %H:%M:%S')
Out[3]: datetime.datetime(2015, 1, 4, 14, 23, 15)

For other formats, see the format codes listed in the documentation for strptime.
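When the log files do not share one format, one option is to try a short list of format strings in turn. The following is a minimal sketch under that assumption, not a definitive implementation: `KNOWN_FORMATS`, `parse_timestamp`, the second format, and the cutoff date are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical formats seen across the log files; extend as needed.
KNOWN_FORMATS = (
    "%Y-%m-%d %H:%M:%S",   # 2015-01-04 14:23:15
    "%d/%m/%Y %H:%M:%S",   # 28/12/2014 09:00:00
)

def parse_timestamp(text):
    """Return a datetime for the first format that matches, or None."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    return None

# Discard everything before an (assumed) cutoff date.
cutoff = datetime(2015, 1, 1)
stamps = ["2015-01-04 14:23:15", "28/12/2014 09:00:00"]
parsed = [parse_timestamp(s) for s in stamps]
recent = [d for d in parsed if d is not None and d >= cutoff]
print(recent)
```

Because `datetime` objects support the usual comparison operators, the cutoff check is a plain `>=`.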

xnx

dateutil can usually parse whatever format you give it:

import dateutil.parser as p
print(p.parse("2015-01-04 14:23:15"))

This assumes you can isolate the date string from the rest of the log line.
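One way to isolate the date string is a regular expression. The sketch below assumes the timestamp is shaped like the one in the question; `TIMESTAMP_RE` and `extract_timestamp` are hypothetical names. It uses `datetime.strptime` so that it runs with the standard library alone, but the matched string could equally be handed to a dateutil parser.

```python
import re
from datetime import datetime

# Assumed pattern: matches timestamps shaped like 2015-01-04 14:23:15.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def extract_timestamp(line):
    """Return the first timestamp found in the line as a datetime, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")

line = "invalid access 2015-01-04 14:23:15 on IP 5.5.5.5"
print(extract_timestamp(line))  # prints 2015-01-04 14:23:15
```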

To install it:

$ pip install python-dateutil

The source is also available at https://pypi.python.org/pypi/python-dateutil/2.4.0 if you need it.

mgilson
Joran Beasley
  • A link to [`dateutil`](https://pypi.python.org/pypi/python-dateutil/2.4.0) might be helpful as it isn't in the stdlib. – mgilson Feb 09 '15 at 22:01

Using the re module directly might be more efficient than using it implicitly via datetime.strptime() (measure to find out whether it matters in your case):

>>> import datetime, re
>>> s = '2015-01-04 14:23:15'
>>> datetime.datetime(*map(int, re.findall(r'\d+', s)))
datetime.datetime(2015, 1, 4, 14, 23, 15)
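Applied to a whole log line, this approach also picks up other digits, such as the IP address. A rough sketch of one workaround is to keep only the first six numbers and then filter by a cutoff. It assumes the timestamp appears before any other digits on the line and is ordered year, month, day, hour, minute, second; `first_six_numbers_as_datetime` and the cutoff value are hypothetical.

```python
import re
from datetime import datetime

def first_six_numbers_as_datetime(line):
    """Build a datetime from the first six digit runs in the line.

    Assumes the timestamp comes before any other digits (such as an IP
    address) and is ordered year, month, day, hour, minute, second.
    """
    fields = re.findall(r"\d+", line)[:6]
    return datetime(*map(int, fields))

# Keep only the lines at or after an (assumed) cutoff date.
cutoff = datetime(2015, 1, 1)
lines = [
    "invalid access 2014-12-28 09:00:00 on IP 5.5.5.5",
    "invalid access 2015-01-04 14:23:15 on IP 5.5.5.5",
]
recent = [ln for ln in lines if first_six_numbers_as_datetime(ln) >= cutoff]
print(recent)
```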
jfs