I am using the datefinder module in Python to extract datetimes from a string, but it returns more dates than the string actually contains, even though each log line has only one date and time.

Code:

import datefinder
def date_using_datefinder(date_string):
    matches = datefinder.find_dates(date_string)
    for match in matches:
        print(match)

Input:

test3='''
[26/08/2018 06:58:29.126900]
[26/03/2004 06:58:29.126985][SDAP_CODEC][JET_AT_JUMP][43][HTXXHTXX] DUST_RPOP:QFI:9
[02/06/2003 06:58:29.254621][SDAP_CODEC][JET_AT_JUMP][43][HTXXHTXX] DUST_RPOP:QFI:9
[20/05/2022 06:58:29.124898][SDAP_CODEC][JET_AT_JUMP][43][HTXX] DUST_RPOP:QFI:9
[26/08/2020 06:58:29.136579][ALST][stx][29][ggg] JET_AT_JUMP:TRUX_MSGD_HTXX:13265261686865256:QWERT_DUMPING_TDD:45:DUST_RPOP_CVX:32:AXTP_DI:65576
'''

Output:

2018-08-26 06:58:29.126900
2004-03-26 06:58:29.126985
2003-02-06 06:58:29.254621
2022-05-20 06:58:29.124898
2020-08-26 06:58:29.136579
2045-08-02 00:00:00
2032-08-02 00:00:00

Why do the last two dates appear? As far as I can tell, they are nowhere in the string.

PS: I also tried the dateutil module, but it raises a ParserError.

For reference, the code is:

from datetime import datetime
from dateutil import tz
import dateutil.parser as dparser
import warnings
warnings.filterwarnings('ignore')

def date_Using_UtilModule(date_string):
    res = dparser.parse(date_string, fuzzy = True)
    return res

res = date_Using_UtilModule("[26/08/2020 06:58:29.136579][ALST][stx][29][ggg] JET_AT_JUMP:TRUX_MSGD_HTXX:13265261686865256:QWERT_DUMPING_TDD:45:DUST_RPOP_CVX:32:AXTP_DI:65576")
print(res)

output:
ParserError: Unknown string format: [26/08/2020 06:58:29.136579][ALST][stx][29][ggg] JET_AT_JUMP:TRUX_MSGD_HTXX:13265261686865256:QWERT_DUMPING_TDD:45:DUST_RPOP_CVX:32:AXTP_DI:65576

Note: regex will not work in my case because my log lines can follow arbitrary patterns and contain any datetime format, so I would rather not use regex.

codex
  • Your input has `:45:` and `:32:` in its last line. It seems the `datefinder` library you're using is picking those out as year numbers (and giving them today's day of the year?). You might have an easier time of things if you do some initial pre-parsing of your input lines (e.g. picking out only the contents of the first pair of square brackets). – Blckknght Aug 01 '21 at 20:30
  • Thanks, @Blckknght. I only gave a few input lines here, but the datetime may not be in square brackets and can appear anywhere in the string, so that may not help. – codex Aug 01 '21 at 20:38
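
Building on the pre-parsing idea in the comments above, here is a minimal sketch that avoids regex and does not depend on the timestamp being first in the line. It is only an illustration under stated assumptions: the function name `find_timestamps` and the `CANDIDATE_FORMATS` tuple are hypothetical, the date and time are assumed to be two adjacent tokens separated by whitespace or square brackets, and only the formats listed are recognized. It is not how datefinder or dateutil work internally.

```python
from datetime import datetime

# Assumed formats (hypothetical list); extend this tuple for other log layouts.
CANDIDATE_FORMATS = (
    "%d/%m/%Y %H:%M:%S.%f",
    "%d/%m/%Y %H:%M:%S",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
)


def find_timestamps(line):
    """Return every datetime found in `line` by testing adjacent token pairs."""
    # Treat square brackets as separators so "[26/08/2020" becomes "26/08/2020".
    tokens = line.replace("[", " ").replace("]", " ").split()
    found = []
    i = 0
    while i < len(tokens) - 1:
        candidate = tokens[i] + " " + tokens[i + 1]
        parsed = None
        for fmt in CANDIDATE_FORMATS:
            try:
                parsed = datetime.strptime(candidate, fmt)
                break
            except ValueError:
                continue  # this format does not fit; try the next one
        if parsed is not None:
            found.append(parsed)
            i += 2  # both tokens were consumed by the match
        else:
            i += 1
    return found


line = ("[26/08/2020 06:58:29.136579][ALST][stx][29][ggg] "
        "JET_AT_JUMP:TRUX_MSGD_HTXX:13265261686865256:"
        "QWERT_DUMPING_TDD:45:DUST_RPOP_CVX:32:AXTP_DI:65576")
for ts in find_timestamps(line):
    print(ts)
```

Because each candidate must match a complete format, stray fragments such as `:45:` and `:32:` are never read as years. The trade-off is that every expected format has to be listed explicitly, so this does not cover the "any datetime format" case.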

1 Answer

Let me help myself

I have created a Python library for this task; anyone else who needs it can use it too. I have tried to cover most date-time formats and will add more over time.

It's time to use our own library

Installation -> pip install MyDateTimeLib==0.1.2
Importing as -> from MyDateTimeLib import myfunction
How to use?  -> myfunction.date_find("passing date string")
Returns      -> a dictionary containing all the dates found in the string, or an empty dictionary if none are found
Check on     -> https://pypi.org/project/MyDateTimeLib/0.1.2/

DEMO:

data_for_date = '''
[26/08/2018 06:58:29.126900]
[26/03/2004 06:58:29.126985][SDAP_CODEC][JET_AT_JUMP][43][HTXXHTXX] DUST_RPOP:QFI:9
[02/06/2003 06:58:29.254621][SDAP_CODEC][JET_AT_JUMP][43][HTXXHTXX] DUST_RPOP:QFI:9[26/03/2036 06:58:29.126985]
[20/05/2022 06:58:29.124898][SDAP_CODEC][JET_AT_JUMP][43][HTXX] DUST_RPOP:QFI:9
[26/08/2020 06:58:29.136579][ALST][stx][29][ggg] JET_AT_JUMP:TRUX_MSGD_HTXX:13265261686865256:QWERT_DUMPING_TDD:45:DUST_RPOP_CVX:32:AXTP_DI:65576
'''

CODE:

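from MyDateTimeLib import myfunction

for line in data_for_date.splitlines():
    if len(line) > 1:  # skip empty lines
        dates = myfunction.date_find(line)  # dict of the dates found in this line
        print()  # blank line between log lines
        for key, value in dates.items():
            print(key, value)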

OUTPUT:

Date:0 2018-08-26 06:58:29.126900

Date:0 2004-03-26 06:58:29.126985

Date:0 2003-02-06 06:58:29.254621
Date:1 2036-03-26 06:58:29.126985

Date:0 2022-05-20 06:58:29.124898

Date:0 2020-08-26 06:58:29.136579
codex