2

I've read about Evil RegExp and usually ensure a basic level of safety is in place when dealing with user input with regards to RegExp.

What I am unsure about is whether this issue is also present in Glob. I imagine it will come down to the individual implementations of Glob'ing' and in my particular instance I am using https://github.com/gobwas/glob/

I'd appreciate any advice available for how to test for this issue and potentially how to mitigate against it.

Illizian
  • 484
  • 1
  • 5
  • 13
  • Make sure to limit the size of the glob, and you should be fine. – Bergi Aug 01 '16 at 01:48
  • FYI if by "evil regex" you mean one that can cause infinitely recursive backtracking, you don't hvae to worry about that in Go. The regexp package in the standard library implements regular expressions correctly with finite state automata, so it never has to backtrack (at the cost of not supporting some features which you probably don't actually need anyways). For more info see: https://swtch.com/~rsc/regexp/regexp1.html – Sam Whited Aug 01 '16 at 04:30

1 Answers1

3

By "evil regex" I assume you mean a regex that falls victim to catastrophic backtracking.

From what you're describing, it seems like you're using a glob library to avoid these "evil regexes". Globs are essentially a weaker version of regex.

The thing that you're missing here is the fact that regexes don't have to be evil. This can be proven in plain Go, with no external libraries.

Try running this Go program:

package main

import (
    "fmt"; "regexp"
)

func main() {
    reg := regexp.MustCompile(`^([^z]*?,){11}P`)

    txt := `1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18zP`

    fmt.Println(reg.ReplaceAllString(txt, ""))
}

You might wonder why this code doesn't measure how much time execution took. It's because it's not needed (and also because I don't know much Go).

The regex will work in almost all regex flavors. You can try running it in Java, Perl or another similar flavor (I like using PCRE on https://regex101.com/#pcre), but the outcome will be one of two things:

  • A timeout
  • You get fed up with how long it's taking and stop the program

Yes, that combination causes catastrophic backtracking in most regex flavors. But not Go. Why?

Go doesn't use backtracking at all for its regexes, so it's not even a possibility. According to this site:

In Go, we find an optimized regular expression engine. This runs in linear time, making complex patterns faster. It is located in the regexp package.

Read more about the differences between backtracking and non-backtracking engines here.


Considering the glob library (according to that GitHub link) appears faster than Go's regexps, performance shouldn't be a problem.

Laurel
  • 5,965
  • 14
  • 31
  • 57
  • Are you sure there's not some way to attack the optimiser with large regexes that are hard to translate into state machines? – Bergi Aug 01 '16 at 01:46
  • @Bergi I'm not sure what you're asking. If the size of the regex is variable (usually it's not), it's O(m*n) where one is regex length and the other is input length. If you're asking if the regex compiler has trouble compiling regexes, I doubt it (and that would be a bug with Go). As long as the regex is valid, the engine should be fine. – Laurel Aug 01 '16 at 02:23
  • @Laurel that's fantastic news and another tick in the GoLang box. I'm actually using Glob for UX reasons, the pattern needs to be added by non-technical people. Let me read through the references and I'll I'll accept the answer. Thanks!! – Illizian Aug 01 '16 at 05:40