I'm trying to convert CamelCase to snake_case using the regex I found here. Here's a snippet of the code I'm using:

in := "camelCase"
var re1 = regexp.MustCompile(`(.)([A-Z][a-z]+)`)
out := re1.ReplaceAllString(in, "$1_$2")

The regex will match lCase. $1 here is l and $2 is Case, so using the replacement string "$1_$2" should result in camel_Case. Instead, it results in cameCase.

Changing the replacement string to "$1_" results in came. If I change it to "$1+$2", the result will be camel+Case as expected (see playground).

Right now, my workaround is to use "$1+$2" as the replacement string and then call strings.Replace to change the plus sign to an underscore. Is this a bug, or am I doing something wrong here?

Community
user-4859

1 Answer

The fix is to use ${1}_$2 (or ${1}_${2} for symmetry).
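As a minimal sketch of that fix, the program below applies the question's regex with the braced template. The helper name insertUnderscore and the variable name camelBoundary are made up for illustration and are not part of the original code.

```go
package main

import (
	"fmt"
	"regexp"
)

// camelBoundary is the pattern from the question: any character followed by
// an uppercase letter and one or more lowercase letters.
var camelBoundary = regexp.MustCompile(`(.)([A-Z][a-z]+)`)

// insertUnderscore (hypothetical helper name) puts an underscore between the
// two captured groups. The braces in ${1} end the group name before the
// underscore, so the template refers to group 1 rather than a group named "1_".
func insertUnderscore(s string) string {
	return camelBoundary.ReplaceAllString(s, "${1}_$2")
}

func main() {
	fmt.Println(insertUnderscore("camelCase")) // camel_Case
}
```

Passing the result through strings.ToLower would then give camel_case, if a fully lowercase result is what you want.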

Per https://golang.org/pkg/regexp/#Regexp.Expand (my emphasis):

In the template, a variable is denoted by a substring of the form $name or ${name}, where name is a non-empty sequence of letters, digits, and underscores.

...

In the $name form, name is taken to be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and, $10 is equivalent to ${10}, not ${1}0.

So $1_$2 is parsed as ${1_}$2: a reference to a group named 1_, followed by a reference to group 2.

As to why using $1_$2 (or $foo$2 for that matter) results in "cameCase," that same documentation says:

A reference to an out of range or unmatched index or a name that is not present in the regular expression is replaced with an empty slice.

So replacing with "$1_$2" is equivalent to replacing with just "$2".
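To check this equivalence yourself, a small comparison like the following may help. It runs the question's regex on "camelCase" with several templates; the helper name expand is invented for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// The pattern from the question.
var re = regexp.MustCompile(`(.)([A-Z][a-z]+)`)

// expand (hypothetical helper name) applies one replacement template to the
// question's input so the templates can be compared side by side.
func expand(template string) string {
	return re.ReplaceAllString("camelCase", template)
}

func main() {
	fmt.Println(expand("$1_$2"))   // cameCase: "1_" names no group, so it expands to nothing
	fmt.Println(expand("$2"))      // cameCase: same result as the line above
	fmt.Println(expand("$1+$2"))   // camel+Case: "+" cannot be part of a name, so $1 is group 1
	fmt.Println(expand("${1}_$2")) // camel_Case: braces end the name before the underscore
}
```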

user94559