Consider the following string: "${test.one}${test.two}". I would like my regex to return two matches, namely "test.one" and "test.two". To do that I have the following snippet:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTester {

    private static final Pattern pattern = Pattern.compile("\\$\\{((?:(?:[A-z]+(?:\\.[A-z0-9()\\[\\]\"]+)*)+|(?:\"[\\w/?.&=_\\-]*\")+)+)}+$");

    public static void main(String[] args) {
        String testString = "${test.one}${test.two}";

        Matcher matcher = pattern.matcher(testString);

        while (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println(matcher.group(i));
            }
        }
    }
}

I have some other stuff in there as well, because I want this to also be a valid match ${test.one}${"hello"}.

So, basically, I want it to match anything inside of ${} as long as it follows one of these formats: something.somethingelse (alphanumeric only there), something.somethingElse(), or "something inside of quotations" (alphanumeric plus some other characters). I have the main regex working, or so I think, but when I run the code, it prints only two groups:

${test.two}
test.two

I want the output to be

test.one
test.two

cloudwalker

2 Answers

Basically, the main problem with your regex is that it matches only at the end of the string, and [A-z] matches many more chars than just letters. Your grouping also seems off.

If you load your regex at regex101, you will see it matches

  • \$\{
  • ( - start of a capturing group
    • (?: - start of a non-capturing group
      • (?:[A-z]+ - start of a non-capturing group, and it matches 1+ chars between A and z (your first mistake)
        • (?:\.[A-z0-9()\[\]\"]+)* - 0 or more repetitions of a . and then 1+ letters, digits, (, ), [, ], ", \, ^, _, and a backtick
      • )+ - repeat the non-capturing group 1 or more times
      • | - or
      • (?:\"[\w/?.&=_\-]*\")+ - 1 or more occurrences of ", 0 or more word, /, ?, ., &, =, _, - chars and then a "
    • )+ - end of the outer non-capturing group, repeated 1+ times
  • ) - end of the capturing group
  • }+ - 1+ } chars
  • $ - end of string.
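Both problems in the breakdown above can be checked directly. The following is a minimal sketch, not part of the original answer: the class name AZRangeDemo, the helper groupOne, and the UNANCHORED variant (the question's pattern with the trailing +$ dropped) are hypothetical names introduced only for illustration. It shows that [A-z] accepts the six ASCII punctuation characters between Z and a, and that the trailing $ leaves only the last placeholder findable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AZRangeDemo {

    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
        "\\$\\{((?:(?:[A-z]+(?:\\.[A-z0-9()\\[\\]\"]+)*)+|(?:\"[\\w/?.&=_\\-]*\")+)+)}+$");

    // Hypothetical variant for illustration: same pattern without the trailing "+$".
    static final Pattern UNANCHORED = Pattern.compile(
        "\\$\\{((?:(?:[A-z]+(?:\\.[A-z0-9()\\[\\]\"]+)*)+|(?:\"[\\w/?.&=_\\-]*\")+)+)}");

    // Collects the Group 1 value of every match found in the input.
    static List<String> groupOne(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // The characters [ \ ] ^ _ ` sit between 'Z' and 'a' in ASCII.
        System.out.println("[\\]^_`".matches("[A-z]+"));     // true
        System.out.println("[\\]^_`".matches("[A-Za-z]+"));  // false

        String s = "${test.one}${test.two}";
        System.out.println(groupOne(ORIGINAL, s));    // [test.two]
        System.out.println(groupOne(UNANCHORED, s));  // [test.one, test.two]
    }
}
```

Dropping the anchor is only meant to isolate its effect here; the character classes and grouping still need the fixes described in this answer.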

To match any occurrence of your pattern inside a string, you need to use

\$\{(\"[^\"]*\"|\w+(?:\(\))?(?:\.\w+(?:\(\))?)*)}

See the regex demo, and read the Group 1 value after each match is found. Details:

  • \$\{ - a ${ substring
  • (\"[^\"]*\"|\w+(?:\(\))?(?:\.\w+(?:\(\))?)*) - Capturing group 1:
    • \"[^\"]*\" - ", 0+ chars other than " and then a "
    • | - or
    • \w+(?:\(\))? - 1+ word chars and an optional () substring
    • (?:\.\w+(?:\(\))?)* - 0 or more repetitions of . and then 1+ word chars and an optional () substring
  • } - a } char.
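To see which inputs the alternation described above accepts and rejects, here is a small sketch that tests whole strings with matches() instead of scanning with find(). The class name PlaceholderCheck and the method isPlaceholder are hypothetical, and the sample inputs are my own assumptions rather than cases from the question.

```java
import java.util.regex.Pattern;

public class PlaceholderCheck {

    // The pattern from this answer, written as a Java string literal.
    static final Pattern PLACEHOLDER = Pattern.compile(
        "\\$\\{(\"[^\"]*\"|\\w+(?:\\(\\))?(?:\\.\\w+(?:\\(\\))?)*)}");

    // True when the whole input is exactly one placeholder.
    static boolean isPlaceholder(String input) {
        return PLACEHOLDER.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "${test.one}",          // dotted name
            "${test.two()}",        // dotted name with ()
            "${\"hello world\"}",   // quoted text
            "${single}",            // one word, no dot
            "${test..one}",         // empty segment between dots
            "${.one}",              // leading dot
            "${}",                  // empty body
            "${\"unterminated}"     // missing closing quote
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isPlaceholder(s));
        }
    }
}
```

The first four samples are accepted and the last four are rejected. Note that \w also matches digits and underscores, and that a single word without a dot is accepted, so the pattern is slightly more permissive than "something.somethingelse" strictly read.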

See the Java demo:

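import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "${test.one}${test.two}\n${test.one}${test.two()}\n${test.one}${\"hello\"}";
// Group 1 captures the text between "${" and "}"
Pattern pattern = Pattern.compile("\\$\\{(\"[^\"]*\"|\\w+(?:\\(\\))?(?:\\.\\w+(?:\\(\\))?)*)}");
Matcher matcher = pattern.matcher(s);
// find() moves to each successive match anywhere in the input
while (matcher.find()) {
    System.out.println(matcher.group(1));
}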

Output:

test.one
test.two
test.one
test.two()
test.one
"hello"
Wiktor Stribiżew
  • Thank you, that was super helpful! I've been trying to get into regex stuff more here lately as it is something that I do not have a ton of experience with in the past, and it's definitely been a rocky start :) – cloudwalker May 01 '20 at 19:29
You could use the regular expression

(?<=\$\{")[a-z]+(?="\})|(?<=\$\{)[a-z]+\.[a-z]+(?:\(\))?(?=\})

which has no capture groups. The character classes [a-z] can be modified as required, provided they do not include a double quote, period or right brace.

Demo

Java's regex engine performs the following operations.

(?<=\$\{")  # positive lookbehind: assert '${"' immediately precedes
[a-z]+      # match 1+ lowercase letters
(?="\})     # positive lookahead: assert '"}' immediately follows
|           # or
(?<=\$\{)   # positive lookbehind: assert '${' immediately precedes
[a-z]+      # match 1+ lowercase letters
\.[a-z]+    # match '.' followed by 1+ lowercase letters
(?:\(\))?   # optionally match '()'
(?=\})      # positive lookahead: assert '}' immediately follows
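For completeness, here is a minimal Java sketch of how the lookaround pattern above might be used. The class name LookaroundDemo and the helper findAll are hypothetical, and the sample input is an assumption based on the strings in the question. Because the pattern has no capture groups, the whole match from group() is the value of interest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {

    // The lookaround pattern from this answer, as a Java string literal.
    static final Pattern LOOKAROUND = Pattern.compile(
        "(?<=\\$\\{\")[a-z]+(?=\"\\})|(?<=\\$\\{)[a-z]+\\.[a-z]+(?:\\(\\))?(?=\\})");

    // Collects every full match; the surrounding ${ } is asserted, not consumed.
    static List<String> findAll(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = LOOKAROUND.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "${test.one}${test.two()}${\"hello\"}";
        System.out.println(findAll(s));  // [test.one, test.two(), hello]
    }
}
```

With this pattern the quoted alternative yields hello without its surrounding double quotes, since the quotes are part of the lookarounds. As written, [a-z] also rejects uppercase letters and digits, and the unquoted branch requires exactly one dot.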
Cary Swoveland