
I started playing with the Python 3.8 re library and found something strange. I first built my regex with an online tester, and once it worked there I tried it in Python.

test_string = "[Tues Jan 20 11:35:13.405644 2020] [access_compat:error] [pid 1871:tid 140301098780416] [client 192.168.123.9:59662] AH01797: client denied by server configuration: /var/www/website/401.html"

res = re.compile(r"(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2})|(?P<Type>[a-zA-Z]+_\w+:\w+)", re.VERBOSE)

for line in logfile:
    line = res.search(line)
    print(line.groupdict())

I'm trying to parse log lines like the one above, but I get the following result:

type : none

I don't know why this happens or how to fix it. Any ideas? The result I expect is:

{'time': Mon Jan 20 11:34:13, 'type': access_compat:error}
0stone0
robotiaga

2 Answers


Your pattern consists of 2 alternatives joined by |, so a single match captures only one of them and leaves the other group empty. Use a pattern that matches both parts in sequence instead, e.g.:

(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2}).+?(?P<Type>[a-zA-Z]+_\w+:\w+)

or

'time': (?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2}), 'type': (?P<Type>[a-zA-Z]+_\w+:\w+)

For a working example see https://regex101.com/r/v6g791/1
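
As a minimal sketch, the first pattern above could be used in Python as follows. The names `log_lines` and `parse_line` are made up for illustration, and the second list entry is a dummy line added only to show the no-match case, since `search` returns `None` when a line does not fit the pattern.

```python
import re

# Sample line from the question plus a dummy line that should not match.
log_lines = [
    "[Tues Jan 20 11:35:13.405644 2020] [access_compat:error] "
    "[pid 1871:tid 140301098780416] [client 192.168.123.9:59662] "
    "AH01797: client denied by server configuration: /var/www/website/401.html",
    "a line that does not match",
]

# Both named groups are matched in sequence; .+? lazily skips the text between them.
pattern = re.compile(
    r"(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2}).+?(?P<Type>[a-zA-Z]+_\w+:\w+)"
)


def parse_line(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = pattern.search(line)
    return match.groupdict() if match else None


parsed = [parse_line(line) for line in log_lines]
for entry in parsed:
    print(entry)
```

Checking for `None` before calling `groupdict()` avoids an AttributeError on lines that do not have the expected shape.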

Valdi_Bo

This regex seems to work:

r"\[(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2}).*\].*\[(?P<Type>[a-zA-Z]+_\w+:\w+)\]"

The code:

import re

test_string = "[Tues Jan 20 11:35:13.405644 2020] [access_compat:error] [pid 1871:tid 140301098780416] [client 192.168.123.9:59662] AH01797: client denied by server configuration: /var/www/website/401.html"
res = re.compile(r"\[(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2}).*\].*\[(?P<Type>[a-zA-Z]+_\w+:\w+)\]", re.VERBOSE)
print(res.search(test_string).groupdict())

Result:

{'Time': 'Tues Jan 20 11:35:13', 'Type': 'access_compat:error'}

Explanation:

  • You did not match the literal [ and ] characters. They must be escaped: \[<pattern>\]
  • You put the character | between your 2 patterns. This character means "match either A or B" (e.g.: r"A|B"), so one match fills only one of the two groups.

Source: https://docs.python.org/3/library/re.html
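
To illustrate the point about |, here is a small sketch using the pattern from the question. The names `LINE`, `ALTERNATION`, `first` and `merged` are only for this example. A single `search` stops at the first alternative that matches, so the other group is `None`; merging the results of `finditer` is one possible workaround if you want to keep the alternation.

```python
import re

LINE = (
    "[Tues Jan 20 11:35:13.405644 2020] [access_compat:error] "
    "[pid 1871:tid 140301098780416] [client 192.168.123.9:59662] "
    "AH01797: client denied by server configuration: /var/www/website/401.html"
)

# The pattern from the question: two alternatives separated by |.
ALTERNATION = re.compile(
    r"(?P<Time>\w+\s\w+\s\d+\s\d+:\d+:\d{2})|(?P<Type>[a-zA-Z]+_\w+:\w+)"
)

# search() returns the first match only, and that match used the Time alternative.
first = ALTERNATION.search(LINE).groupdict()
print(first)  # {'Time': 'Tues Jan 20 11:35:13', 'Type': None}

# finditer() yields every match; keep whichever group each match filled in.
merged = {}
for match in ALTERNATION.finditer(LINE):
    merged.update({k: v for k, v in match.groupdict().items() if v is not None})
print(merged)  # {'Time': 'Tues Jan 20 11:35:13', 'Type': 'access_compat:error'}
```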

RomainD