
I have the following log message:

Aug 25 03:07:19 localhost.localdomainASM:unit_hostname="bigip1",management_ip_address="192.168.41.200",management_ip_address_2="N/A",http_class_name="/Common/log_to_elk_policy",web_application_name="/Common/log_to_elk_policy",policy_name="/Common/log_to_elk_policy",policy_apply_date="2020-08-10 06:50:39",violations="HTTP protocol compliance failed",support_id="5666478231990524056",request_status="blocked",response_code="0",ip_client="10.43.0.86",route_domain="0",method="GET",protocol="HTTP",query_string="name='",x_forwarded_for_header_value="N/A",sig_ids="N/A",sig_names="N/A",date_time="2020-08-25 03:07:19",severity="Eror",attack_type="Non-browser Client,HTTP Parser Attack",geo_location="N/A",ip_address_intelligence="N/A",username="N/A",session_id="0",src_port="39348",dest_port="80",dest_ip="10.43.0.201",sub_violations="HTTP protocol compliance failed:Bad HTTP version",virus_name="N/A",violation_rating="5",websocket_direction="N/A",websocket_message_type="N/A",device_id="N/A",staged_sig_ids="",staged_sig_names="",threat_campaign_names="N/A",staged_threat_campaign_names="N/A",blocking_exception_reason="N/A",captcha_result="not_received",microservice="N/A",tap_event_id="N/A",tap_vid="N/A",vs_name="/Common/adv_waf_vs",sig_cves="N/A",staged_sig_cves="N/A",uri="/random",fragment="",request="GET /random?name=' or 1 = 1' HTTP/1.1\r\n",response="Response logging disabled"

And I have the following RegEx:

request="(?<Flag1>.*?)"

I am now trying to match part of the text captured by the group named "Flag1" again. The new match I want to capture, as Flag2, is /random?name=' or 1 = 1'.

How can I match the needed text from another matched group (by number or by name) without inserting the new group inside the targeted group, like this:

request="(?<Flag1>\w+\s+(?<Flag2>.*?)\s+HTTP.*?)"

https://regex101.com/r/EcBv7p/1

Thanks.

moody
  • Why don't you want to insert it inside the Flag1 group? The match that you want, is part of the string matched by Flag1. – The fourth bird May 20 '22 at 11:37
  • Do you mean like `request="\w+\s+\K.*?(?=\s+HTTP[^"]*")` https://regex101.com/r/p8KgHd/1 or `(?<=request="\w+\s+).*?(?=\s+HTTP[^"]*")` https://regex101.com/r/uXARHG/1 – The fourth bird May 20 '22 at 13:03
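
One way to read the question is as a two-pass extraction: capture Flag1 first, then run a second pattern only on the captured text. The following is a minimal Python sketch of that idea, not something proposed in the thread. It assumes Python's `re` module, which spells named groups `(?P<name>...)` rather than `(?<name>...)`, and it uses a shortened excerpt of the log line; the variable names are illustrative.

```python
import re

# Shortened excerpt of the log line from the question (raw string, so the
# \r\n stays as literal backslash sequences, as in the log).
line = r'''uri="/random",fragment="",request="GET /random?name=' or 1 = 1' HTTP/1.1\r\n",response="Response logging disabled"'''

# Pass 1: capture the whole request field as Flag1.
flag1_match = re.search(r'request="(?P<Flag1>[^"]*)"', line)
flag1 = flag1_match.group("Flag1")

# Pass 2: search only inside the text already captured as Flag1.
flag2_match = re.search(r'^\w+\s+(?P<Flag2>.*?)\s+HTTP', flag1)
flag2 = flag2_match.group("Flag2")

print(flag1)
print(flag2)
```

The second pattern never sees the rest of the log line, so Flag2 is derived from Flag1 without being nested in the first regex.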

2 Answers


If I understand you correctly, you want to match whatever string a previous group has matched, right? In that case you can use a backreference of the form \n, in this case \1, to match the same text that your first capture group matched.
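
As a small illustration of what a backreference does, here is a Python sketch using a made-up sample sentence (not taken from the question). It shows that `\1` requires the same text to appear again in the input; it does not extract a portion of what the group captured.

```python
import re

# \1 must match the exact text captured by group 1 a second time,
# so this pattern finds a word that is immediately repeated.
doubled_word = re.compile(r'\b(\w+) \1\b')

hit = doubled_word.search("this is is a test")
miss = doubled_word.search("GET /random HTTP/1.1")

print(hit.group(1))
print(miss)
```

Because the request line contains no repeated text, the backreference finds nothing there.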

Kevin Holtkamp
  • What I'm looking for is to match again, in a new group, only some of the text that was captured in group X, not all of the captured text. – moody May 20 '22 at 12:57

You can use

request="(?<Flag1>[A-Z]+\s+(?<Flag2>\/\S+='[^']*')[^"]*)"

See the regex demo.

Details:

  • (?<Flag1> - Flag1 group:
    • [A-Z]+ - one or more uppercase ASCII letters
    • \s+ - one or more whitespaces
    • (?<Flag2>\/\S+='[^']*') - Group Flag2: /, one or more non-whitespace chars, =', zero or more chars other than ', and then a ' char
    • [^"]* - zero or more chars other than "
  • ) - end of Flag1 group.
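
To try the pattern outside regex101, here is a minimal Python sketch. It assumes Python's `re` module, so the named groups are written `(?P<Flag1>...)` and `(?P<Flag2>...)`, and the escaped `\/` is written as a plain `/`. The input is a shortened excerpt of the log line from the question.

```python
import re

# Shortened excerpt of the log line; a raw string keeps \r\n as literal
# backslash sequences, the way they appear in the log.
line = r'''fragment="",request="GET /random?name=' or 1 = 1' HTTP/1.1\r\n",response="Response logging disabled"'''

# Same structure as the answer's regex, with Python-style named groups.
pattern = re.compile(
    r'''request="(?P<Flag1>[A-Z]+\s+(?P<Flag2>/\S+='[^']*')[^"]*)"'''
)

match = pattern.search(line)
flag1 = match.group("Flag1")
flag2 = match.group("Flag2")

print(flag1)
print(flag2)
```

Both groups are filled by a single match: Flag2 is a contiguous piece of the text that Flag1 covers.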
Wiktor Stribiżew
  • What if I want Flag2 not to be attached to the same group, but to be in a new group that matches again only some of the text from Flag1, not all of it as in the example above? – moody May 20 '22 at 13:04
  • @moody Then please provide some new example string(s) with exact expected output. It is not clear what you mean by saying "let Flag2 non attached in the same group", they are never "attached" to anything. All groups just sub-match a piece of continuous text. They cannot match several disjoint parts of a string. – Wiktor Stribiżew May 20 '22 at 13:07
  • OK, the link below has a full example of what I'm trying to do: https://regex101.com/r/7UTYoV/1 I'm trying to improve the parsing time and the output fields. I ran a test on 10K records and the average parsing time was 8 seconds, which is very slow. That's why I'm trying to reduce the time by modifying the way of parsing in the link above. The regex has many groups and steps, which is why I'm trying to let some flags read from a previous group or flag. Thanks. – moody May 20 '22 at 14:36
  • @moody It is not clear what you want to achieve here. What makes it slow is certainly not the group placement, it is the overuse of the `.*?` pattern. The more such patterns there are to the left in the pattern, the longer it takes to process. What you need to do is unroll them all, or make them more precise (change to `[^"]*` where you need to match up to the leftmost `"`, or use the `\d+` one-or-more-digits pattern to match digits, etc.). See [this regex demo](https://regex101.com/r/7UTYoV/2). – Wiktor Stribiżew May 20 '22 at 15:12
  • Thanks a lot for your help. I just tried your modifications and was surprised that the average decreased from 8 to 2.60 seconds! I will follow your comments to improve it further. Good to know that overusing the .*? pattern slows down processing. – moody May 20 '22 at 15:49
  • Could you please explain this? ([^H]*(?:H(?!ost: )[^H]*)*Host: – moody May 20 '22 at 16:03
  • @moody Please use regex101.com to see the explanation of patterns. Here, it matches any non-`H`s followed by 0+ sequences of `H` not followed by `ost: ` and then zero or more non-`H`s. See [this answer of mine](https://stackoverflow.com/a/38018490/3832970) to understand the technique. – Wiktor Stribiżew May 20 '22 at 16:27
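
The two techniques mentioned in these comments can be sketched in Python as follows. This is an illustration under assumptions, not the regex from the linked demo: the sample strings are hypothetical, the group names are invented, and Python's `(?P<name>...)` syntax is used. It shows a lazy `.*?` replaced by the negated class `[^"]*`, and the unrolled pattern `[^H]*(?:H(?!ost: )[^H]*)*` consuming everything up to `Host: `.

```python
import re

# Hypothetical two-field sample in the same key="value" style as the log.
line = r'method="GET",request="GET /random HTTP/1.1\r\n"'

# Lazy version: .*? expands one character at a time and retries the rest.
lazy = re.compile(r'method="(?P<method>.*?)",request="(?P<request>.*?)"')
# Negated class: [^"]* runs straight to the next double quote.
tight = re.compile(r'method="(?P<method>[^"]*)",request="(?P<request>[^"]*)"')

lazy_fields = lazy.search(line).groupdict()
tight_fields = tight.search(line).groupdict()

# Hypothetical request text containing a Host header (literal \r\n).
request = r'GET /random HTTP/1.1\r\nHost: 10.43.0.201\r\n'

# Unrolled loop: non-H chars, then any H that does not start "Host: ",
# then more non-H chars, repeated, until the literal "Host: " is reached.
unrolled = re.compile(
    r'(?P<before>[^H]*(?:H(?!ost: )[^H]*)*)Host: (?P<host>[^\\"]*)'
)
host_match = unrolled.match(request)

print(tight_fields)
print(host_match.group("host"))
```

On this input both field patterns capture the same values; the difference is in how much work the engine does to get there, which matters when many such fields are chained in one regex.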