Get the format in dateutil.parse

Question

Is there a way to get the "format" after parsing a date in dateutil? For example, something like:

>>> x = parse("2014-01-01 00:12:12")
datetime.datetime(2014, 1, 1, 0, 12, 12)

x.get_original_string_format()
YYYY-MM-DD HH:MM:SS # %Y-%m-%d %H:%M:%S

# Or, passing the date-string directly
get_original_string_format("2014-01-01 00:12:12")
YYYY-MM-DD HH:MM:SS # %Y-%m-%d %H:%M:%S

Update: I'd like to add a bounty to this question to see if someone could add an answer that does the equivalent of getting the string format of a common date string passed in. It can use dateutil if you want, but it doesn't have to. Hopefully we'll get some creative solutions here.

So you need to recover either a Python date format string like `%Y-%m-%d %H:%M:%S` or an ISO-style pattern like `YYYY-MM-DD HH:MM:SS`? — Enix, Dec 29 '18 at 02:47
I would have been extremely surprised if the date object still contained anything like its string representation. That would be a massive waste of resources and not exactly a shining example of good programming either. Could you elaborate a bit about the use case you’re trying to solve? — Sebastiaan van den Broek, Dec 30 '18 at 05:19
@SebastiaanvandenBroek this is for a csv/excel parser where we have no control over the user's input and want to be able to detect somewhat common patterns that may be used. — David542, Dec 30 '18 at 19:11
I love the question. Have you seen the unit tests for the parser? If not, you can find them [here](https://github.com/dateutil/dateutil/blob/master/dateutil/test/test_parser.py#L31-L92). `dateutil` supports a whole bunch of formats; however, when you look at the parser code at this [link](https://github.com/dateutil/dateutil/blob/2.7.x/dateutil/parser/_parser.py#L668-L875), you'll find that there is no way to do what you ask directly with that package. You could extract that code and implement the output you need, which I think would be the optimal way. Otherwise @benvc's answer should be sufficient. — silgon, Dec 30 '18 at 21:45
perhaps [this](https://stackoverflow.com/questions/46842793/datetime-conversion-how-to-extract-the-inferred-format) is what you were looking for? — cs95, Jan 04 '19 at 04:07
Can you limit the input format types to some combinations? In your example, how could we know the difference between `YYYY-MM-DD ...` and `YYYY-DD-MM ...`? In theory everything is possible, but in reality you probably have only a few formats. — doom, Jan 04 '19 at 15:38

alecxe · Answer 1 · score 13 · 2018-12-22T02:09:21.840

Is there a way to get the "format" after parsing a date in dateutil?

Not possible with dateutil. The problem is that dateutil never has the format as an intermediate result at any point during parsing, because it detects the individual components of the datetime separately; take a look at its (not quite easy to read) source code.

edited Dec 22 '18 at 02:09

answered Dec 22 '18 at 02:07


@alecxe wow, ok. How would you suggest getting the string format then with the help of dateutil? – David542 Dec 22 '18 at 02:08
@David542 well, this needs some time to research and it depends on the actual problem you are trying to solve. Depending on the scope of the problem and the possible inputs, you may be able to just iterate over the common datetime formats and try them one by one. You could, for example, use [`dateparser` specifying the `date_formats`](https://dateparser.readthedocs.io/en/latest/#popular-formats) and try popular formats until one works; the format that works is the format of the datetime string. – alecxe Dec 22 '18 at 02:17
I see, I like that approach and I'll try that. Is there a place that contains perhaps the ten or so "most popular" date formats? – David542 Dec 22 '18 at 02:25
@David542 that's a good question, `dateparser` has some samples of popular formats: https://dateparser.readthedocs.io/en/latest/index.html#popular-formats, though not sure the best place for that. Thanks, keep the good work and the good questions coming David. – alecxe Dec 22 '18 at 02:28
thanks, added a bounty for it to encourage some answers. – David542 Dec 29 '18 at 02:31
Can't we pass the format up front? For example, with datetime you have to call datetime.datetime.strptime(datestring, "format-goes here"). You could define a class that keeps that format as a property and uses it to parse that information. – Gaurav Dec 29 '18 at 02:48
@Gaurav yeah, it is possible to specify the format, but for that you would have to know possible expected formats..we talked about it above in comments.. – alecxe Dec 29 '18 at 03:30
@alecxe Yes, I read the comments, but you may not have gotten my point. There are the common format codes that datetime allows (https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior); by defining, say, DD for %d and MM for %M, you could replace the codes in place within the format string. Say you pass `%Y %m, %d` as the format string: you could replace the codes in place to get `YYYY mm, dd`. – Gaurav Dec 29 '18 at 06:10

benvc · Answer 2 · 2022-05-11T19:48:41.230

I don't know of a way that you can return the parsed format from dateutil (or any other python timestamp parser that I know of).

Implementing your own timestamp parsing function that returns a list of possible formats and related datetime objects is fairly trivial using datetime.strptime() but doing it efficiently against a broadly useful list of possible timestamp formats is not.

The following example utilizes a list of just over 100 formats. It does not even scratch the surface of the wide variety of formats parsed by dateutil. It tests each format in sequence until it exhausts all formats in the list (likely much less efficient than the dateutil approach of locating the various datetime parts independently as noted in the answer from @alecxe).

In addition, I have included some example timestamp formats that include time zone names (instead of offsets). If you run the example function below against those particular datetime strings, you may find that it does not return the expected matches even though I have included matching formats using the %Z directive. Some explanation for the challenges with using %Z to handle time zone names can be found in issue 22377 at bugs.python.org (just to highlight another non-trivial aspect of implementing your own datetime parsing function).
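To make the %Z caveat concrete, here is a minimal sketch (the helper name `try_parse` is mine, not part of any library). According to the Python documentation, `strptime` only accepts `UTC`, `GMT`, and the names in `time.tzname` for `%Z`, so whether an abbreviation like `PDT` parses depends on the machine running the code. The sketch also assumes an English locale for `%b`.

```python
from datetime import datetime

FMT = '%Y %b %d %H:%M:%S.%f %Z'


def try_parse(datestring, fmt=FMT):
    """Return the parsed datetime, or None if strptime rejects the string."""
    try:
        return datetime.strptime(datestring, fmt)
    except ValueError:
        return None


# 'UTC' is one of the hard-coded names that %Z accepts.
print(try_parse('2017 Mar 03 05:12:41.211 UTC'))

# 'PDT' is only accepted when it appears in time.tzname on this machine,
# so this may print None even though the format looks like a match.
print(try_parse('2017 Mar 03 05:12:41.211 PDT'))
```

If time zone names matter for your input, you would probably need to handle them outside of `strptime`, for example by stripping the name and mapping it to an offset yourself.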

With all of those caveats, if you are dealing with a manageable set of potential formats, implementing something simple like the below may get you what you need.

Example function that attempts to match a datetime string against a list of formats and return a dict that includes the original datestring and a list of matches, each a dict that includes a datetime object along with the matched format:

from datetime import datetime

def parse_timestamp(datestring, formats):
    results = {'datestring': datestring, 'matches': []}
    for f in formats:
        try:
            d = datetime.strptime(datestring, f)
        except ValueError:  # strptime raises ValueError when the format does not fit
            continue
        results['matches'].append({'datetime': d, 'format': f})
    return results

Example formats and datetime strings:

formats = ['%A, %B %d, %Y', '%A, %B %d, %Y %I:%M:%S %p %Z', '%A, %d %B %Y', '%B %d %Y', '%B %d, %Y', '%H:%M:%S', '%H:%M:%S,%f', '%H:%M:%S.%f', '%Y %b %d %H:%M:%S.%f', '%Y %b %d %H:%M:%S.%f %Z', '%Y %b %d %H:%M:%S.%f*%Z', '%Y%m%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S %z', '%Y-%m-%d %H:%M:%S%z', '%Y-%m-%d %H:%M:%S,%f', '%Y-%m-%d %H:%M:%S,%f%z', '%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S.%f%z', '%Y-%m-%d %I:%M %p', '%Y-%m-%d %I:%M:%S %p', '%Y-%m-%d*%H:%M:%S', '%Y-%m-%d*%H:%M:%S:%f', '%Y-%m-%dT%H:%M:%S', '%Y-%m-%dT%H:%M:%S%Z', '%Y-%m-%dT%H:%M:%S%z', '%Y-%m-%dT%H:%M:%S*%f%z', '%Y-%m-%dT%H:%M:%S.%f', '%Y-%m-%dT%H:%M:%S.%f%z', '%Y/%m/%d', '%Y/%m/%d*%H:%M:%S', '%a %b %d %H:%M:%S %Z %Y', '%a, %d %b %Y %H:%M:%S %z', '%b %d %H:%M:%S', '%b %d %H:%M:%S %Y', '%b %d %H:%M:%S %z', '%b %d %H:%M:%S %z %Y', '%b %d %Y', '%b %d %Y %H:%M:%S', '%b %d, %Y', '%b %d, %Y %I:%M:%S %p', '%b.%d.%Y', '%d %B %Y', '%d %B %Y %H:%M:%S %Z', '%d %b %Y %H:%M:%S', '%d %b %Y %H:%M:%S %z', '%d %b %Y %H:%M:%S*%f', '%d%m_%H:%M:%S', '%d%m_%H:%M:%S.%f', '%d-%b-%Y', '%d-%b-%Y %H:%M:%S', '%d-%b-%Y %H:%M:%S.%f', '%d-%b-%Y %I:%M:%S %p', '%d-%m-%Y', '%d-%m-%Y %I:%M %p', '%d-%m-%Y %I:%M:%S %p', '%d-%m-%y', '%d-%m-%y %I:%M %p', '%d-%m-%y %I:%M:%S %p', '%d/%b %H:%M:%S,%f', '%d/%b/%Y %H:%M:%S', '%d/%b/%Y %I:%M %p', '%d/%b/%Y:%H:%M:%S', '%d/%b/%Y:%H:%M:%S %z', '%d/%m/%Y', '%d/%m/%Y %H:%M:%S %z', '%d/%m/%Y %I:%M %p', '%d/%m/%Y %I:%M:%S %p', '%d/%m/%Y %I:%M:%S %p:%f', '%d/%m/%Y*%H:%M:%S', '%d/%m/%Y*%H:%M:%S*%f', '%d/%m/%y', '%d/%m/%y %H:%M:%S', '%d/%m/%y %H:%M:%S %z', '%d/%m/%y %I:%M %p', '%d/%m/%y %I:%M:%S %p', '%d/%m/%y*%H:%M:%S', '%m%d_%H:%M:%S', '%m%d_%H:%M:%S.%f', '%m-%d-%Y', '%m-%d-%Y %I:%M %p', '%m-%d-%Y %I:%M:%S %p', '%m-%d-%y', '%m-%d-%y %I:%M %p', '%m-%d-%y %I:%M:%S %p', '%m/%d/%Y', '%m/%d/%Y %H:%M:%S %z', '%m/%d/%Y %I:%M %p', '%m/%d/%Y %I:%M:%S %p', '%m/%d/%Y %I:%M:%S %p:%f', '%m/%d/%Y*%H:%M:%S', '%m/%d/%Y*%H:%M:%S*%f', '%m/%d/%y', '%m/%d/%y %H:%M:%S', '%m/%d/%y %H:%M:%S %z', '%m/%d/%y %I:%M %p', '%m/%d/%y %I:%M:%S %p', '%m/%d/%y*%H:%M:%S', '%y%m%d %H:%M:%S', '%y-%m-%d %H:%M:%S', '%y-%m-%d %H:%M:%S,%f', '%y-%m-%d %H:%M:%S,%f %z', '%y/%m/%d %H:%M:%S']

datestrings = ['03-11-1999', '03-12-1999 5:06 AM', '03-12-1999 5:06:07 AM', '03-12-99 5:06 AM', '03-12-99 5:06:07 AM', '03/12/1999', '03/12/1999 5:06 AM', '03/12/1999 5:06:07 AM', '03/12/99 5:06 AM', '03/12/99 5:06:07', '03/12/99 5:06:07 AM', '04/23/17 04:34:22 +0000', '0423_11:42:35', '0423_11:42:35.883', '05/09/2017*08:22:14*612', '06/01/22 04:11:05', '08/10/11*13:33:56', '10-04-19 12:00:17', '10-06-26 02:31:29,573', '10/03/2017 07:29:46 -0700', '11-02-11 16:47:35,985 +0000', '11/22/2017*05:13:11', '11:42:35', '11:42:35,173', '11:42:35.173', '12/03/1999', '12/03/1999 5:06 AM', '12/03/99 5:06 AM', '12/3/1999', '12/3/1999 5:06 AM', '12/3/1999 5:06:07 AM', '150423 11:42:35', '19/Apr/2017:06:36:15 -0700', '1999-03-12 05:06:07.0', '1999-03-12 5:06 AM', '1999-03-12 5:06:07 AM', '1999-03-12+01:00', '1999-3-12 5:06 AM', '1999-3-12 5:06:07 AM', '1999/3/12', '20150423 11:42:35.173', '2017 Mar 03 05:12:41.211 PDT', '2017 Mar 10 01:44:20.392', '2017-02-11T18:31:44', '2017-03-10 14:30:12,655+0000', '2017-03-12 13:11:34.222-0700', '2017-03-12T17:56:22-0700', '2017-06-26 02:31:29,573', '2017-07-01T14:59:55.711+0000', '2017-07-04*13:23:55', '2017-07-22T16:28:55.444', '2017-08-19 12:17:55 -0400', '2017-08-19 12:17:55-0400', '2017-09-08T03:13:10', '2017-10-14T22:11:20+0000', '2017-10-30*02:47:33:899', '2017-11-22T10:10:15.455', '2017/04/12*19:37:50', '2018 Apr 13 22:08:13.211*PDT', '2018-02-27 15:35:20.311', '2018-08-20T13:20:10*633+0000', '22 Mar 1999 05:06:07 +0100', '22 March 1999', '22 March 1999 05:06:07 CET', '22-Mar-1999', '22-Mar-1999 05:06:07', '22-Mar-1999 5:06:07 AM', '22/03/1999 5:06:07 AM', '22/Mar/1999 5:06:07 +0100', '22/Mar/99 5:06 AM', '23 Apr 2017 10:32:35*311', '23 Apr 2017 11:42:35', '23-Apr-2017 11:42:35', '23-Apr-2017 11:42:35.883', '23/Apr 11:42:35,173', '23/Apr/2017 11:42:35', '23/Apr/2017:11:42:35', '3-11-1999', '3-12-1999 5:06 AM', '3-12-99 5:06 AM', '3-12-99 5:06:07 AM', '3-22-1999 5:06:07 AM', '3/12/1999', '3/12/1999 5:06 AM', '3/12/1999 5:06:07 AM', 
'3/12/99 5:06 AM', '3/12/99 5:06:07', '8/5/2011 3:31:18 AM:234', '9/28/2011 2:23:15 PM', 'Apr 20 00:00:35 2010', 'Dec 2, 2017 2:39:58 AM', 'Jan 21 18:20:11 +0000 2017', 'Jun 09 2018 15:28:14', 'Mar 16 08:12:04', 'Mar 22 1999', 'Mar 22, 1999', 'Mar 22, 1999 5:06:07 AM', 'Mar.22.1999', 'March 22 1999', 'March 22, 1999', 'Mon Mar 22 05:06:07 CET 1999', 'Mon, 22 Mar 1999 05:06:07 +0100', 'Monday, 22 March 1999', 'Monday, March 22, 1999', 'Monday, March 22, 1999 5:06:07 AM CET', 'Sep 28 19:00:00 +0000']

Example usage:

print(parse_timestamp('2018-08-20T13:20:10*633+0000', formats))
# OUTPUT
# {'datestring': '2018-08-20T13:20:10*633+0000', 'matches': [{'datetime': datetime.datetime(2018, 8, 20, 13, 20, 10, 633000, tzinfo=datetime.timezone.utc), 'format': '%Y-%m-%dT%H:%M:%S*%f%z'}]}
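
One thing worth keeping in mind with this approach is that a string can legitimately match more than one format. The following is a small self-contained sketch of that (the helper `matching_formats` and the three-format list are illustrative only, not part of the code above):

```python
from datetime import datetime


def matching_formats(datestring, formats):
    """Return every format in `formats` that strptime accepts for `datestring`."""
    matches = []
    for fmt in formats:
        try:
            datetime.strptime(datestring, fmt)
        except ValueError:
            continue
        matches.append(fmt)
    return matches


slash_formats = ['%d/%m/%Y', '%m/%d/%Y', '%Y/%m/%d']

# Both numbers are <= 12, so day-first and month-first are both valid readings.
print(matching_formats('03/12/1999', slash_formats))
# ['%d/%m/%Y', '%m/%d/%Y']

# 22 cannot be a month, so only the day-first format survives.
print(matching_formats('22/03/1999', slash_formats))
# ['%d/%m/%Y']
```

For a CSV column, one possible way to narrow things down would be to intersect the matches across all rows, since a single row with a day above 12 rules out the month-first reading.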

Doddie · Accepted Answer · 2019-01-01T13:13:31.233

My idea was to:

1. Create an object that has a list of candidate specifiers you think might be in the date pattern (the more you add, the more possibilities you will get out the other end).
2. Parse the date string.
3. Create a list of possible specifiers for each element in the string, based on the date and the list of candidates you supplied.
4. Recombine them to produce a list of 'possibles'.

If you get only a single candidate, you can be pretty sure it is the right format. But you will often get many possibilities (especially with dates, months, minutes and hours all in the 0-10 range).

Example class:

import re
from itertools import product
from dateutil.parser import parse
from collections import defaultdict, Counter

COMMON_SPECIFIERS = [
    '%a', '%A', '%d', '%b', '%B', '%m',
    '%Y', '%H', '%p', '%M', '%S', '%Z',
]


class FormatFinder:
    def __init__(self,
                 valid_specifiers=COMMON_SPECIFIERS,
                 date_element=r'([\w]+)',
                 delimiter_element=r'([\W]+)',
                 ignore_case=False):
        self.specifiers = valid_specifiers
        joined = (r'' + date_element + r"|" + delimiter_element)
        self.pattern = re.compile(joined)
        self.ignore_case = ignore_case

    def find_candidate_patterns(self, date_string):
        date = parse(date_string)
        tokens = self.pattern.findall(date_string)

        candidate_specifiers = defaultdict(list)

        for specifier in self.specifiers:
            token = date.strftime(specifier)
            candidate_specifiers[token].append(specifier)
            if self.ignore_case:
                candidate_specifiers[token.upper()] = candidate_specifiers[token]
                candidate_specifiers[token.lower()] = candidate_specifiers[token]

        options_for_each_element = []
        for (token, delimiter) in tokens:
            if token:
                if token not in candidate_specifiers:
                    options_for_each_element.append(
                        [token])  # just use this verbatim?
                else:
                    options_for_each_element.append(
                        candidate_specifiers[token])
            else:
                options_for_each_element.append([delimiter])

        for parts in product(*options_for_each_element):
            counts = Counter(parts)
            max_count = max(counts[specifier] for specifier in self.specifiers)
            if max_count > 1:
                # this is a candidate with the same item used more than once
                continue
            yield "".join(parts)

And some sample tests:

def test_it_returns_value_from_question_1():
    s = "2014-01-01 00:12:12"
    sut = FormatFinder()
    candidates = sut.find_candidate_patterns(s)
    assert "%Y-%m-%d %H:%M:%S" in candidates


def test_it_returns_value_from_question_2():
    s = 'Jan. 04, 2017'
    sut = FormatFinder()
    candidates = sut.find_candidate_patterns(s)
    candidates = list(candidates)
    assert "%b. %d, %Y" in candidates
    assert len(candidates) == 1


def test_it_can_ignore_case():
    # NB: apparently the 'AM/PM' is meant to be capitalised in my locale! 
    # News to me!
    s = "JANUARY 12, 2018 02:12 am"
    sut = FormatFinder(ignore_case=True)
    candidates = sut.find_candidate_patterns(s)
    assert "%B %d, %Y %H:%M %p" in candidates


def test_it_returns_parts_that_have_no_date_component_verbatim():
    # In this string, the 'at' is considered as a 'date' element, 
    # but there is no specifier that produces a candidate for it
    s = "January 12, 2018 at 02:12 AM"
    sut = FormatFinder()
    candidates = sut.find_candidate_patterns(s)
    assert "%B %d, %Y at %H:%M %p" in candidates

To make it a bit clearer, here are some examples of using this code in an IPython shell:

In [2]: ff = FormatFinder()

In [3]: list(ff.find_candidate_patterns("2014-01-01 00:12:12"))
Out[3]:
['%Y-%d-%m %H:%M:%S',
 '%Y-%d-%m %H:%S:%M',
 '%Y-%m-%d %H:%M:%S',
 '%Y-%m-%d %H:%S:%M']

In [4]: list(ff.find_candidate_patterns("Jan. 04, 2017"))
Out[4]: ['%b. %d, %Y']

In [5]: list(ff.find_candidate_patterns("January 12, 2018 at 02:12 AM"))
Out[5]: ['%B %d, %Y at %H:%M %p', '%B %M, %Y at %H:%d %p']

In [6]: ff_without_case = FormatFinder(ignore_case=True)

In [7]: list(ff_without_case.find_candidate_patterns("JANUARY 12, 2018 02:12 am"))
Out[7]: ['%B %d, %Y %H:%M %p', '%B %M, %Y %H:%d %p']

Thank you for sharing this well written code and solution! However there is a logical bug when generating all possibilities for the dateformat if both date & month <= 12 and date != month. Example list(ff.find_candidate_patterns("2014-12-10")) returns only ['%Y-%m-%d'] while I was expecting to see ['%Y-%d-%m', '%Y-%m-%d']. — Mohammed Alfaki, Jan 07 '23 at 19:19
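
On the day/month point raised in the comment above: `FormatFinder` builds its tokens from the date that `dateutil` has already parsed, so the day/month reading is fixed before any candidates are generated. One possible workaround, sketched here as a hypothetical post-processing helper (`add_day_month_swaps` is not part of the answer's code), is to swap `%d` and `%m` in each candidate and keep the swapped variant whenever `strptime` accepts it:

```python
from datetime import datetime


def add_day_month_swaps(date_string, candidates):
    """Return candidates plus any %d/%m-swapped variant that strptime also accepts."""
    result = list(candidates)
    for fmt in candidates:
        if '%d' not in fmt or '%m' not in fmt:
            continue
        # Swap the two directives via a temporary placeholder character.
        swapped = (fmt.replace('%d', '\0')
                      .replace('%m', '%d')
                      .replace('\0', '%m'))
        if swapped in result:
            continue
        try:
            datetime.strptime(date_string, swapped)
        except ValueError:
            continue  # e.g. the would-be month is greater than 12
        result.append(swapped)
    return result


print(add_day_month_swaps('2014-12-10', ['%Y-%m-%d']))
# ['%Y-%m-%d', '%Y-%d-%m']
print(add_day_month_swaps('2014-12-31', ['%Y-%m-%d']))
# ['%Y-%m-%d']
```

This only covers the day/month ambiguity; other swaps (minutes versus seconds, for instance) would need the same treatment.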

score 3 · Answer 4 · answered Dec 31 '18 at 14:50

Idea:

1. Inspect the user's input date string and build a set of possible date formats.
2. Loop over the format set and use datetime.strptime to parse the date string with each possible format.
3. Format the date from step 2 with datetime.strftime; if the result equals the original date string, then this format is a possible date format.
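
The round-trip check in the last step matters because `strptime` is more lenient than `strftime`: it accepts `3` for `%m`, while `strftime` always writes `03`. A minimal sketch of just that check (the function name `round_trips` is illustrative):

```python
from datetime import datetime


def round_trips(date_string, fmt):
    """True if parsing with fmt and formatting back reproduces date_string exactly."""
    try:
        parsed = datetime.strptime(date_string, fmt)
    except ValueError:
        return False
    return parsed.strftime(fmt) == date_string


print(round_trips('1999-03-12', '%Y-%m-%d'))  # True
print(round_trips('1999-3-12', '%Y-%m-%d'))   # False: strftime writes '03', not '3'
print(round_trips('1999-03-12', '%Y-%d-%m'))  # True: day 3, month 12 is also valid
```

Note the consequence: strings without zero padding will not be recognised by this method, and strings with small numbers will still round-trip under several formats.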

Algorithm implementation

from datetime import datetime
import itertools
import re

FORMAT_CODES = (
    r'%a', r'%A', r'%w', r'%d', r'%b', r'%B', r'%m', r'%y', r'%Y',
    r'%H', r'%I', r'%p', r'%M', r'%S', r'%f', r'%z', r'%Z', r'%j',
    r'%U', r'%W',
)

TWO_LETTERS_FORMATS = (
    r'%p',
)

THREE_LETTERS_FORMATS = (
    r'%a', r'%b'
)

LONG_LETTERS_FORMATS = (
    r'%A', r'%B', r'%z', r'%Z',
)

SINGLE_DIGITS_FORMATS = (
    r'%w',
)

TWO_DIGITS_FORMATS = (
    r'%d', r'%m', r'%y', r'%H', r'%I', r'%M', r'%S', r'%U', r'%W',
)

THREE_DIGITS_FORMATS = (
    r'%j',
)

FOUR_DIGITS_FORMATS = (
    r'%Y',
)

LONG_DIGITS_FORMATS = (
    r'%f',
)

# Non format code symbols
SYMBOLS = (
    '-',
    ':',
    '+',
    'Z',
    ',',
    ' ',
)


if __name__ == '__main__':
    date_str = input('Please input a date: ')

    # Split with non format code symbols
    pattern = r'[^{}]+'.format(''.join(SYMBOLS))
    components = re.findall(pattern, date_str)

    # Create a format placeholder, eg. '{}-{}-{} {}:{}:{}+{}'
    placeholder = re.sub(pattern, '{}', date_str)

    formats = []
    for comp in components:
        if re.match(r'^\d{1}$', comp):
            formats.append(SINGLE_DIGITS_FORMATS)
        elif re.match(r'^\d{2}$', comp):
            formats.append(TWO_DIGITS_FORMATS)
        elif re.match(r'^\d{3}$', comp):
            formats.append(THREE_DIGITS_FORMATS)
        elif re.match(r'^\d{4}$', comp):
            formats.append(FOUR_DIGITS_FORMATS)
        elif re.match(r'^\d{5,}$', comp):
            formats.append(LONG_DIGITS_FORMATS)
        elif re.match(r'^[a-zA-Z]{2}$', comp):
            formats.append(TWO_LETTERS_FORMATS)
        elif re.match(r'^[a-zA-Z]{3}$', comp):
            formats.append(THREE_LETTERS_FORMATS)
        elif re.match(r'^[a-zA-Z]{4,}$', comp):
            formats.append(LONG_LETTERS_FORMATS)
        else:
            formats.append(FORMAT_CODES)

    # Create a possible format set
    possible_set = itertools.product(*formats)

    found = 0
    for possible_format in possible_set:
        # Create a format with possible format combination
        dt_format = placeholder.format(*possible_format)
        try:
            dt = datetime.strptime(date_str, dt_format)
            # Use the format to parse the date, then format the
            # date back to a string and compare with the original one
            if dt.strftime(dt_format) == date_str:
                print('Possible result: {}'.format(dt_format))
                found += 1
        except Exception:
            continue

    if found == 0:
        print('No pattern found')

Usage:

$ python3 reverse.py
Please input a date: 2018-12-31 10:26 PM
Possible result: %Y-%d-%M %I:%S %p
Possible result: %Y-%d-%S %I:%M %p
Possible result: %Y-%m-%d %I:%M %p
Possible result: %Y-%m-%d %I:%S %p
Possible result: %Y-%m-%M %I:%d %p
Possible result: %Y-%m-%M %I:%S %p
Possible result: %Y-%m-%S %I:%d %p
Possible result: %Y-%m-%S %I:%M %p
Possible result: %Y-%H-%d %m:%M %p
Possible result: %Y-%H-%d %m:%S %p
Possible result: %Y-%H-%d %M:%S %p
Possible result: %Y-%H-%d %S:%M %p
Possible result: %Y-%H-%M %d:%S %p
Possible result: %Y-%H-%M %m:%d %p
Possible result: %Y-%H-%M %m:%S %p
Possible result: %Y-%H-%M %S:%d %p
Possible result: %Y-%H-%S %d:%M %p
Possible result: %Y-%H-%S %m:%d %p
Possible result: %Y-%H-%S %m:%M %p
Possible result: %Y-%H-%S %M:%d %p
Possible result: %Y-%I-%d %m:%M %p
Possible result: %Y-%I-%d %m:%S %p
Possible result: %Y-%I-%d %M:%S %p
Possible result: %Y-%I-%d %S:%M %p
Possible result: %Y-%I-%M %d:%S %p
Possible result: %Y-%I-%M %m:%d %p
Possible result: %Y-%I-%M %m:%S %p
Possible result: %Y-%I-%M %S:%d %p
Possible result: %Y-%I-%S %d:%M %p
Possible result: %Y-%I-%S %m:%d %p
Possible result: %Y-%I-%S %m:%M %p
Possible result: %Y-%I-%S %M:%d %p
Possible result: %Y-%M-%d %I:%S %p
Possible result: %Y-%M-%S %I:%d %p
Possible result: %Y-%S-%d %I:%M %p
Possible result: %Y-%S-%M %I:%d %p

very interesting approach, thanks. Could you try the above with perhaps some date strings from some of the other answers? And maybe use the "most common" result as the result -- for example, "Possible result (all)" | "Likely result (one)" — David542, Dec 31 '18 at 19:27
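
Regarding a single "likely" result: one way this could be approached is to score each candidate by how many neighbouring directives appear in a conventional order and take the best one. The sketch below is purely a heuristic of my own; the `CONVENTIONAL_PAIRS` table and both function names are assumptions for illustration, not something from the answer or from any library.

```python
import re

# Hypothetical table: directive pairs that commonly sit next to each other
# in real-world date formats (year-month, month-day, hour-minute, ...).
CONVENTIONAL_PAIRS = {
    ('%Y', '%m'), ('%m', '%d'), ('%d', '%m'), ('%m', '%Y'), ('%d', '%Y'),
    ('%d', '%H'), ('%d', '%I'), ('%Y', '%H'), ('%Y', '%I'),
    ('%H', '%M'), ('%I', '%M'), ('%M', '%S'), ('%M', '%p'), ('%S', '%p'),
}


def score(fmt):
    """Count adjacent directive pairs in fmt that follow a conventional order."""
    directives = re.findall(r'%[A-Za-z]', fmt)
    return sum(pair in CONVENTIONAL_PAIRS
               for pair in zip(directives, directives[1:]))


def likely_format(candidates):
    """Return the highest-scoring candidate (the first one wins on ties)."""
    return max(candidates, key=score)


# A few of the candidates printed above for '2018-12-31 10:26 PM'
candidates = [
    '%Y-%d-%M %I:%S %p',
    '%Y-%m-%d %I:%M %p',
    '%Y-%m-%d %I:%S %p',
    '%Y-%H-%d %m:%M %p',
    '%Y-%S-%M %I:%d %p',
]
print(likely_format(candidates))
# %Y-%m-%d %I:%M %p
```

The pair table encodes my own guess about what is "common", so it would need tuning against real data before being trusted.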

Gaurav · Answer 5 · score 2 · 2018-12-30T05:11:16.903

My idea was to create a class something like this (it might not be accurate):

from datetime import datetime
import re
class DateTime(object):
    # Maps strptime directives to readable tokens; extend to cover all directives
    dateFormat = {"%d": "dd", "%Y": "YYYY", "%a": "Day", "%A": "DAY", "%w": "ww", "%b": "Mon", "%B": "MON", "%m": "mm",
                  "%H": "HH", "%I": "II", "%p": "pp", "%M": "MM", "%S": "SS"}

    def __init__(self, date_str, format):
        # Raises ValueError if date_str does not match the given format
        self.dateobj = datetime.strptime(date_str, format)
        self.format = format

    def parse_format(self):
        if self.format is None:
            return None
        # Replace each directive with its readable token; directives that
        # are missing from dateFormat are left unchanged
        return re.sub(r"%[A-Za-z]",
                      lambda m: DateTime.dateFormat.get(m.group(0), m.group(0)),
                      self.format)


nDate = DateTime("12 January, 2018", "%d %B, %Y")
print(nDate.parse_format())

edited Dec 30 '18 at 05:11

answered Dec 29 '18 at 13:53


how would that work with something like "Jan. 4, 2017" , or really anything? (I tried it with the most common date format, YYYY-MM-DD, and it didn't work) – David542 Dec 29 '18 at 22:23
you need to pass %Y-%m-%d and it will return YYYY-mm-dd – Gaurav Dec 30 '18 at 01:45
If you only pass the string, there is a good chance people will pass something like `01 of January, 2019 at 12:00 am`, whose string format would be `DD of MMMM, YYYY at HH:MM AA`. That would involve defining lots of patterns, because it contains words like "of" and "at". But if you pass the format along, e.g. `%d of %m,%Y at %h:%M %a`, then it will be able to parse more formats, I guess. – Gaurav Dec 30 '18 at 02:06
@David542 even if you try non common format it will give you output like `otherDate=DateTime("January 12, 2018 at 02:12 am","%B %d, %Y at %H:%M %p") print(otherDate.parse_format())` – Gaurav Dec 30 '18 at 05:09
The issue is too complicated to just whip up a small class to RELIABLY do it. – Sebastiaan van den Broek Dec 30 '18 at 05:16
@SebastiaanvandenBroek Yes, I agree with you. I just showed the idea; it could actually be built out as a whole new module – Gaurav Dec 30 '18 at 05:20
@SebastiaanvandenBroek I am not experienced enough yet to write a professional module (one that other developers could use). Hopefully one day I will be – Gaurav Dec 30 '18 at 05:21
Yeah sure, always keep learning but for this case I’d have a look at the dateutil’s source. Perhaps it can be modified so that when it finds a piece of string that is successfully parsed with a format, instead of using that as input for the date object just returning the successful format string instead. – Sebastiaan van den Broek Dec 30 '18 at 05:27
@SebastiaanvandenBroek When I was a beginner on Stack Overflow I used to think the developers here were too rude and eager to dislike a beginner's crazy thoughts, but with time, and with your message, I changed my mind. They do it to keep this platform running smoothly for the developer community. – Gaurav Dec 30 '18 at 16:32

score 0 · Answer 6 · answered Jan 04 '19 at 15:29

You can wrap the function to store the arguments along with the result any time you call the wrapped version:

from dateutil.parser import parse
from functools import wraps

def parse_wrapper(function):
    @wraps(function)
    def wrapper(*args):
        return {'datetime': function(*args), 'args': args}
    return wrapper

wrapped_parse = parse_wrapper(parse)
x = wrapped_parse("2014-01-01 00:12:12")
# {'datetime': datetime.datetime(2014, 1, 1, 0, 12, 12),
#  'args': ('2014-01-01 00:12:12',)}

answered Jan 04 '19 at 15:29

Engineero


that's an interesting approach, but how would you extract, for example "%Y-%m-%d %H:%M:%S" from the above? Of course we know what the input string is, but that's not much help for figuring out the date format pattern. – David542 Jan 04 '19 at 21:22
