In a Java program, I want to find all occurrences in a given String of these substrings: $$, or $\d (the symbol '$' followed by an integer).
My problem started when I added a constraint: a match should count only if it is not inside a substring delimited by a certain pair of character sequences.
For example, I want to ignore the matches if they are part of a substring surrounded by "/{" and "/}".
The following example finds all occurrences of $$ or $\d, but does not take into account the additional constraint of ignoring a match if it lies between "/{" and "/}".
public static final String PARAMETERS_PREFIX = "$";
public static final String ALL_PARAMS_SUFFIX = "$";
public static final String BEGIN_JAVA_EXPRESSION = "/{";
public static final String END_JAVA_EXPRESSION = "/}";
...
String test = "$1 xxx $$ " // $1 and $$ are matches
        + BEGIN_JAVA_EXPRESSION + "xxx $2 xxx" + END_JAVA_EXPRESSION; // $2 SHOULD NOT be a match
Set<String> symbolsSet = new LinkedHashSet<String>();
Pattern pattern = Pattern.compile(Pattern.quote(PARAMETERS_PREFIX) + "(\\d+|" + Pattern.quote(ALL_PARAMS_SUFFIX) + ")");
Matcher findingMatcher = pattern.matcher(test);
while (findingMatcher.find()) {
    String match = findingMatcher.group();
    symbolsSet.add(match);
}
return new ArrayList<String>(symbolsSet);
Besides finding the keywords that are not inside a delimited substring, I also want to replace only those keywords with given values afterwards. So simply removing everything between the delimiters before matching probably will not help: I need to end up with the original string in which the matched tokens are replaced by their values, while the tokens inside the delimited region are left unmodified. This should be easy once I find the right regex.
Could someone give me a hint about how to write the right regex for this problem?
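One possible direction, given only as a hedged sketch and not as the definitive regex: put the delimited region into the pattern as a first alternative, so the matcher consumes it whole, and treat only the second alternative as a real token. The class name `ParamScanner` and the methods `findParams` and `replaceParams` are hypothetical names introduced for illustration. The sketch assumes that delimited regions are well-formed and not nested; an unterminated "/{" would not protect the tokens after it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the original code.
class ParamScanner {
    public static final String PARAMETERS_PREFIX = "$";
    public static final String ALL_PARAMS_SUFFIX = "$";
    public static final String BEGIN_JAVA_EXPRESSION = "/{";
    public static final String END_JAVA_EXPRESSION = "/}";

    // Group 1: a whole delimited region (assumed well-formed and non-nested),
    //          consumed so that nothing inside it can match as a token.
    // Group 2: a parameter token, i.e. "$$" or "$" followed by digits.
    private static final Pattern PATTERN = Pattern.compile(
            "(" + Pattern.quote(BEGIN_JAVA_EXPRESSION) + ".*?"
                + Pattern.quote(END_JAVA_EXPRESSION) + ")"
            + "|(" + Pattern.quote(PARAMETERS_PREFIX)
                + "(?:\\d+|" + Pattern.quote(ALL_PARAMS_SUFFIX) + "))",
            Pattern.DOTALL);

    // Collects the distinct tokens found outside delimited regions, in order of appearance.
    static List<String> findParams(String input) {
        Set<String> symbols = new LinkedHashSet<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            if (m.group(2) != null) { // group 2 is null when a delimited region matched
                symbols.add(m.group(2));
            }
        }
        return new ArrayList<>(symbols);
    }

    // Replaces tokens outside delimited regions; delimited regions are copied back unchanged.
    static String replaceParams(String input, Function<String, String> valueFor) {
        Matcher m = PATTERN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = (m.group(2) != null)
                    ? valueFor.apply(m.group(2))
                    : m.group();
            // quoteReplacement keeps '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With the test string from the question, `findParams` would return `[$1, $$]`, and `replaceParams(test, t -> "<" + t + ">")` would return `<$1> xxx <$$> /{xxx $2 xxx/}`, leaving the `$2` inside the delimited region untouched.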