
I have a log file like this:

[01012017 052235 500][1][INFO]> ----Amount   : 200

[01012017 052235 515][1][INFO]> ----Mix      : 1

[01012017 052235 515][1][INFO]> ----Currency : LKR

[01012017 052243 156][1][INFO]> ----Denomination

[01012017 052243 171][1][INFO]> -----CU  TYP 

I want to extract the date that follows the first opening square bracket, so I wrote the following Python code:

transactionDate = re.findall('\[(.*?)\s\w+\s\w+\]\[\w\]\[INFO\]\>\s+\w+Amount',strtosearch2,re.DOTALL)

This gives an empty list. The expected output is:

01012017

Can you please help me fix this error?

Taku

2 Answers


Your regex "\[(.*?)\s\w+\s\w+\]\[\w\]\[INFO\]\>\s+\w+Amount" has one mistake in it: the \w+ before Amount. You were trying to match ---- with \w+, but a dash (-) is not in the \w character set, which covers only letters, digits, and the underscore.

You need to change that part of the regex to include the dash. Using the character set [\w-]+ instead should solve your problem.


The final regex will be "\[(.*?)\s\w+\s\w+\]\[\w\]\[INFO\]\>\s+[\w-]+Amount"

When you use this regex, you will get your desired output:

['01012017']
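
To check the difference between the two patterns, here is a minimal sketch that runs both against a few lines copied from the question. The variable name strtosearch2 comes from the question; the names original, fixed, original_result and fixed_result are only illustrative.

```python
import re

# Sample text copied from the log excerpt in the question.
strtosearch2 = """[01012017 052235 500][1][INFO]> ----Amount   : 200
[01012017 052235 515][1][INFO]> ----Mix      : 1
[01012017 052235 515][1][INFO]> ----Currency : LKR
"""

# Pattern from the question: \w+ cannot match the leading dashes.
original = r'\[(.*?)\s\w+\s\w+\]\[\w\]\[INFO\]\>\s+\w+Amount'
# Corrected pattern: [\w-]+ accepts word characters and dashes.
fixed = r'\[(.*?)\s\w+\s\w+\]\[\w\]\[INFO\]\>\s+[\w-]+Amount'

original_result = re.findall(original, strtosearch2, re.DOTALL)
fixed_result = re.findall(fixed, strtosearch2, re.DOTALL)

print(original_result)  # []
print(fixed_result)     # ['01012017']
```

Writing the patterns as raw strings (the r prefix) keeps the backslashes from being interpreted by Python before the regex engine sees them.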
Taku

You could try using the following:

re.findall(r'\[(\d+)\s\d+\s\d+\]\[\d\]\[INFO\]\>', str2search)

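Note that I'm using \d (which matches digits only) instead of \w (which matches any "word" character: letters, digits, and the underscore).

Also, this pattern does not depend on the text after [INFO]>, so it matches every line in your log file, not just the Amount line, and findall returns one date per line.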
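
As a rough illustration of that behaviour, the sketch below applies the pattern to three lines copied from the question. The variable str2search follows the answer; pattern, all_dates, first and first_date are illustrative names, and using re.search to take only the first date is an assumption about what the asker wants, not part of the original answer.

```python
import re

# Sample text copied from the log excerpt in the question.
str2search = """[01012017 052235 500][1][INFO]> ----Amount   : 200
[01012017 052235 515][1][INFO]> ----Mix      : 1
[01012017 052235 515][1][INFO]> ----Currency : LKR
"""

pattern = r'\[(\d+)\s\d+\s\d+\]\[\d\]\[INFO\]\>'

# findall returns the captured date once for every matching line.
all_dates = re.findall(pattern, str2search)
print(all_dates)  # ['01012017', '01012017', '01012017']

# search stops at the first match, which yields a single date.
first = re.search(pattern, str2search)
first_date = first.group(1) if first else None
print(first_date)  # 01012017
```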

GotoCode