
I have a list of date strings, potentially very large. The format is unknown, but will be the same in all elements. Assume dateutil.parser will succeed in parsing it.

The problem is that the list is very large and dateutil.parser is slow. I could parse one element and then use a faster method (strptime, perhaps) if I could infer the format string. However, as far as I can tell, the parser does not give me the format string it succeeded with.

Is there an elegant solution?

  • This is not currently possible with dateutil, but pandas does something very similar if you pass the `infer_datetime_format` option to `to_datetime()` or `read_csv()`. – Paul Mar 30 '17 at 22:19
  • See issue [#125](https://github.com/dateutil/dateutil/issues/125) on the dateutil issue tracker. – Paul Mar 30 '17 at 22:22
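For readers who want the "parse one element, then reuse the format" idea from the question without pandas, here is a minimal sketch using only the standard library. It assumes the format is one of a short list you supply yourself; `CANDIDATE_FORMATS`, `guess_format` and `parse_all` are hypothetical names made up for this illustration, not part of dateutil. A single sample can match more than one format (day-first versus month-first, for example), so the order of the candidates matters.

```python
from datetime import datetime

# Hypothetical candidate list (an assumption): extend it with the formats
# you actually expect. The first candidate that matches wins.
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%d/%m/%Y %H:%M",
    "%a %b %d %H:%M:%S %Z %Y",
]


def guess_format(sample, candidates=CANDIDATE_FORMATS):
    """Return the first candidate format that parses `sample`, or None."""
    for fmt in candidates:
        try:
            datetime.strptime(sample, fmt)
        except ValueError:
            continue
        return fmt
    return None


def parse_all(date_strings):
    """Guess the format from the first element, then reuse it for the rest."""
    if not date_strings:
        return []
    fmt = guess_format(date_strings[0])
    if fmt is None:
        raise ValueError("no candidate format matched %r" % date_strings[0])
    return [datetime.strptime(s, fmt) for s in date_strings]
```

Calling `parse_all(my_dates)` pays the guessing cost once and runs plain `strptime` for every element after that. If no candidate matches, falling back to `dateutil.parser.parse` for the whole list would be a reasonable choice.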

1 Answer


If the parsing is a heavy calculation, I think you can spread it across threads with the threading module.

import threading

thread = threading.Thread(target=some_def, args=(), kwargs={})
thread.start() # will run some_def
thread.join() # will wait until some_def is done

Or, in your case:

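import threading
import dateutil.parser

times = ["Sat Oct 11 17:13:46 UTC 2003", "Sat Oct 11 17:13:46 UTC 2004"]
results = [None] * len(times)

def parse_into(index, text):
    # Store the parsed datetime so it can be read after the thread finishes
    results[index] = dateutil.parser.parse(text)

threads = []
for index, text in enumerate(times):
    # args must be a tuple; the target calls parse(), not the parser class
    thread = threading.Thread(target=parse_into, args=(index, text))
    threads.append(thread)

for thread in threads:
    thread.start()

# Wait for every thread; a plain loop, because map() is lazy in Python 3
for thread in threads:
    thread.join()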
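Starting one thread per string does not scale to a very large list. A bounded pool is one way around that, and it also hands the parsed values back in input order. The sketch below is only an illustration: `FMT`, `parse_one` and `parse_in_threads` are hypothetical names, and `datetime.strptime` with an assumed format stands in for the parser so the example runs with the standard library alone. `dateutil.parser.parse` could be passed to `pool.map` in the same way.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime

# Assumed format, for illustration only.
FMT = "%Y-%m-%d %H:%M:%S"


def parse_one(text):
    """Parse a single date string with the assumed format."""
    return datetime.strptime(text, FMT)


def parse_in_threads(date_strings, max_workers=4):
    """Parse every string on a fixed-size thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs
        return list(pool.map(parse_one, date_strings))
```

In CPython, threads running pure-Python parsing share the interpreter lock, so it is worth timing this against a plain loop before relying on it.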

For more information, see "Asynchronous method call in Python?".

Yuval Pruss