
For a security PoC in Java 1.8 (java.util.regex.*), I am trying to detect SQL injection attacks such as "union select from" in a log file, even when the payload is encoded to bypass a WAF. Examples from OWASP:

/*!%55NiOn*/ /*!%53eLEct*/
REVERSE(noinu)+REVERSE(tceles)
un?+un/**/ion+se/**/lect+

A dirty way to detect this with a regex would be to look for 3 consecutive, distinct letters from each of the character classes [unio], [selct] and [from].

So a fairly simple regex with few false positives would look like:

([unio])([unio&&[^\\1]])[unio&&[^\\1\\2]] => does not match uni

[unio][unio&&[^u]][unio&&[^un]] => does match uni

So I use character class subtraction, but using a capturing group (or a named capturing group) inside a subtraction seems impossible. I need it in order to detect REVERSE(noinu)+REVERSE(tceles) as well as /*!%55NiOn*/ /*!%53eLEct*/.
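One possible workaround for the "different letter than the ones already captured" requirement is a negative lookahead on the backreference instead of a class subtraction. The following is only a minimal sketch for the [unio] class; the class name `DistinctLetters`, the method `hasThreeDistinct` and the use of `Pattern.CASE_INSENSITIVE` are assumptions made for illustration, not part of the original regex.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class DistinctLetters {
    // (?!\1) rejects a repeat of the first captured letter;
    // (?!\1|\2) rejects a repeat of either of the two earlier letters.
    private static final Pattern THREE_DISTINCT_UNIO = Pattern.compile(
            "([unio])(?!\\1)([unio])(?!\\1|\\2)[unio]",
            Pattern.CASE_INSENSITIVE); // assumption: letter case should not matter

    // True when the input contains three consecutive, pairwise different
    // letters taken from the set {u, n, i, o}.
    static boolean hasThreeDistinct(String input) {
        return THREE_DISTINCT_UNIO.matcher(input).find();
    }
}
```

With this sketch, `hasThreeDistinct("uni")` and `hasThreeDistinct("REVERSE(noinu)")` are true, while `hasThreeDistinct("unu")` is false.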

Does anyone know how I could do it?

Thanks, and sorry for the crappy English.

Darud

1 Answer


If I understand your specification correctly, then the following should do the trick:

(([unio]|[selct]|[from])\2?(?!\2)){3,}+

For a detailed explanation see this Regex 101, but in short:

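  • match one character from one of the classes
  • optionally match the same character once more, then look ahead to make sure the next character is a different one
  • require at least three such matches in a row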
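The pattern above can be tried out in Java roughly as follows. This is a minimal sketch: the class name `MixAndMatch` and the method `firstRun` are hypothetical names introduced here, and the only change to the regex is the backslash doubling that a Java string literal requires.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class MixAndMatch {
    // Same regex as above; "\2" becomes "\\2" inside a Java string literal.
    private static final Pattern RUN = Pattern.compile(
            "(([unio]|[selct]|[from])\\2?(?!\\2)){3,}+");

    // Returns the first matching run of letters, or null when there is none.
    static String firstRun(String input) {
        Matcher m = RUN.matcher(input);
        return m.find() ? m.group() : null;
    }
}
```

Here `firstRun("rio")` returns `rio`, and `firstRun("valentin")` returns `lentin`, which shows how easily the mix-and-match version produces false positives.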

This will mix and match across the classes (i.e., it will find rio). If you want matches only within one specific class, without mixing, then as a first try I would suggest using three different regexes (one per class). It is definitely doable with one single regex, but the question is how readable that regex would be.
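The three-regex suggestion could look roughly like this in Java. It is a sketch under assumptions: the names `KeywordRuns`, `run` and `looksLikeUnionSelectFrom` are hypothetical, case-insensitive matching is assumed to be wanted, and the runs are required to appear in the order union, select, from.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class KeywordRuns {
    // Builds the single-class variant of the pattern for the given letters.
    private static Pattern run(String letters) {
        return Pattern.compile("(([" + letters + "])\\2?(?!\\2)){3,}",
                Pattern.CASE_INSENSITIVE); // assumption: ignore letter case
    }

    private static final Pattern UNION = run("unio");
    private static final Pattern SELECT = run("selct");
    private static final Pattern FROM = run("from");

    // True when a run from each class is found, one after the other.
    static boolean looksLikeUnionSelectFrom(String input) {
        int pos = 0;
        for (Pattern p : new Pattern[] {UNION, SELECT, FROM}) {
            Matcher m = p.matcher(input);
            if (!m.find(pos)) {
                return false;
            }
            pos = m.end(); // continue the search after this run
        }
        return true;
    }
}
```

Keeping one pattern per class means each regex has only two capture groups, so the backreference is always `\2`.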

Edit: my answer is based on this SO answer

Edit2: based on the comments of the OP the solution would be:

(([unio])\2?(?!\2)){3,}.*(([selct])\4?(?!\4)){3,}.*(([from])\6?(?!\6)){3,}.*
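As a rough illustration of how this combined regex might be applied to log lines in Java, here is a minimal sketch. The class `LogScan`, the method `suspiciousLines` and the `Pattern.CASE_INSENSITIVE` flag are assumptions added for the example; the regex itself is the one above, with backslashes doubled for the string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class LogScan {
    // Groups 2, 4 and 6 are the inner single-letter groups, hence \2, \4, \6.
    private static final Pattern UNION_SELECT_FROM = Pattern.compile(
            "(([unio])\\2?(?!\\2)){3,}.*"
          + "(([selct])\\4?(?!\\4)){3,}.*"
          + "(([from])\\6?(?!\\6)){3,}.*",
            Pattern.CASE_INSENSITIVE); // assumption: ignore letter case

    // Returns the lines in which all three runs occur in order.
    static List<String> suspiciousLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (UNION_SELECT_FROM.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

With this version a lone word such as `valentin` is no longer flagged, because a run from [selct] and a run from [from] must follow the [unio] run.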
D. Kovács
  • Nice! Your regex works! But I had to modify it because it matched too many false positives: `valentin`, which is a name, for example – Darud Jan 06 '17 at 13:42
  • From the limited specification that's what I could cook up :) I'm glad I could help, and you can build from this forward. Please consider accepting my answer. And happy *secure* coding \o/ – D. Kovács Jan 06 '17 at 13:45
  • Here is mine: `(([unio])\2?(?!\2)){3,}.*(([selct])\2?(?!\2)){3,}.*(([from])\2?(?!\2)){3,}.*` Thanks a lot! – Darud Jan 06 '17 at 13:53
  • @Darud that won't work, as `\2` is always the capture group `([unio])`. Put your regex into Regex 101 and it will tell you which capture groups `([selct])` and `([from])` are; you have to update the second and third look-aheads accordingly. – D. Kovács Jan 06 '17 at 14:10
  • Just be aware that this regex could be a performance bottleneck for very long inputs (your last link, for example, can take between 100 and 200 ms), and that you should normalize your strings before putting them into this regex: https://www.securecoding.cert.org/confluence/display/java/IDS01-J.+Normalize+strings+before+validating+them – D. Kovács Jan 06 '17 at 14:37
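The normalization step mentioned in the last comment might be sketched as follows, so that payloads such as `%55NiOn` are decoded before the regex runs. This is an assumption-laden sketch rather than a complete canonicalizer: the class `LogNormalizer` and the method `normalize` are hypothetical, and percent-decoding plus lower-casing are additions that go beyond the Unicode normalization the linked rule describes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class LogNormalizer {
    // Percent-decodes the input (assumption: the log holds URL-encoded data),
    // applies NFKC normalization, then lower-cases the result.
    static String normalize(String raw) {
        String decoded;
        try {
            decoded = URLDecoder.decode(raw, "UTF-8");
        } catch (IllegalArgumentException | UnsupportedEncodingException e) {
            // Malformed escape such as "100% sure": keep the raw text.
            decoded = raw;
        }
        return Normalizer.normalize(decoded, Normalizer.Form.NFKC)
                .toLowerCase(Locale.ROOT);
    }
}
```

For example, `normalize("/*!%55NiOn*/ /*!%53eLEct*/")` returns `/*!union*/ /*!select*/`, which can then be fed to the regex. Note that `URLDecoder` also turns `+` into a space.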