I have a question about Python regex. I don't have much information about Python regex. I am working with HTTP request messages and parsing them with regex. As you know, the HTTP GET messages are in this format.
GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: 10.2.0.12
Connection: Keep-Alive
I want to parse the URI, method, user-agent, and the host areas of the message. My regex for this job is:
r'^({0})\s+(\S+)\s+[^\n]*$\n.*^User-Agent:\s*(\S+)[^\n]*$\n.*^Host:\s*(\S+)[^\n]*$\n'.format('|'.join(methods)), re.MULTILINE|re.DOTALL)
But, when the message comes up with like
GET / HTTP/1.0
Host: 10.2.0.12
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Connection: Keep-Alive
I can not catch them because of the places of host or, user-agent changed. So I need a generic regex that will catch all of them, even if the places of host, method, uri are changed in the message.