I have a search string. When it contains a dollar symbol, I want to capture all the characters after it, up to but not including a dot or a subsequent dollar symbol. The latter would start the next match. So for either of these search strings:
"/bla/$V_N.$XYZ.bla";
"/bla/$V_N.$XYZ";
I would want to return:
- V_N
- XYZ
If the search string contains percent symbols, I also want to return what's between the pair of % symbols.
The following regex seems to do the trick for that.
"%([^%]*?)%";
That is:
- Start and end with a %
- Have a capture group, the ()
- Have a character class containing anything except a % symbol (the leading caret negates the class)
- Repeated, but not greedily, with *?
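To make the breakdown above concrete, here is a minimal sketch of that pattern in use. The class name PercentDemo, the helper betweenPercents and the sample input are made up for illustration and are not part of my real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentDemo {
    // Collects the text between each pair of % symbols (capture group 1).
    static List<String> betweenPercents(String input) {
        Matcher m = Pattern.compile("%([^%]*?)%").matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, chosen only to exercise the pattern.
        System.out.println(betweenPercents("/bla/%ABC%.%DEF%.bla")); // prints [ABC, DEF]
    }
}
```

Only group 1 is collected here, so the surrounding % symbols are left out of the results.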
Where some languages allow %1, %2 for capture groups, Java uses the backslash\number syntax instead. So, this string compiles and generates output.
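As a small illustration of the backreference syntax, the sketch below requires the closing delimiter to be the same character as the opening one. The class name BackrefDemo, the helper wrapped and the # delimiter are assumptions made up for this example.

```java
import java.util.regex.Pattern;

class BackrefDemo {
    // Group 1 captures the opening delimiter; \\1 (the regex \1 written as a
    // Java string literal) requires exactly the same text again at the end.
    static final Pattern SAME_DELIMITER = Pattern.compile("([%#])(\\w+)\\1");

    // True when the whole input is a word wrapped in matching delimiters.
    static boolean wrapped(String input) {
        return SAME_DELIMITER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(wrapped("%ABC%")); // true: opens and closes with %
        System.out.println(wrapped("#ABC#")); // true: opens and closes with #
        System.out.println(wrapped("%ABC#")); // false: delimiters differ
    }
}
```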
I suspect the dollar symbol and dot need escaping, as they are special symbols:
- $ usually matches the end of the string.
- . is a metacharacter that matches any character.
I have tried escaping them with double backslashes (\\), both inside a character class, e.g. [^\\.\\$%], and using OR'd notation, e.g. %|\\$, in attempts to combine this logic, but I can't seem to get anything to play ball.
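For reference, here is a small check of how escaping behaves, assuming standard java.util.regex semantics. The class name EscapeDemo and the helper matches are hypothetical. It suggests the escaping itself is probably not what breaks my attempts, since inside a character class the dot and dollar are literal with or without the backslashes.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // True when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Outside a character class, $ and . are special unless escaped.
        System.out.println(matches("\\$", "$"));       // true: escaped dollar is a literal $
        System.out.println(matches(".", "x"));         // true: bare dot matches any character
        System.out.println(matches("\\.", "x"));       // false: escaped dot is only a literal dot

        // Inside a character class, both forms behave the same.
        System.out.println(matches("[.$%]", "$"));     // true
        System.out.println(matches("[\\.\\$%]", ".")); // true
        System.out.println(matches("[^.$%]", "."));    // false: the dot is excluded
    }
}
```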
I wonder if another pair of eyes can see how to solve this conundrum!
My attempts so far:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main {
    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        String pattern = "([%\\$])([^%\\.\\$]*?)\\1?";
        /* Either % or $ in first capture group ([%\\$])
         * Second capture group - anything except %, dot or dollar sign
         * non greedy group ( *?)
         * then a backreference to an optional first capture group \\1?
         * Have to use two \, since you escape \ in a Java string.
         */
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(search);
        List<String> results = new ArrayList<String>();
        while (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                results.add(m.group(i));
            }
        }
        for (String result : results) {
            System.out.println(result);
        }
    }
}
The following links may be helpful:
- An interactive Java playground where you can experiment and copy/paste code.
- Regex101
- Java RegexTester
- Java backreferences (the optional backreference \\1 in the regex).
- Link that summarises Regex special characters often found in languages
- Java Regex book EPub link
- Regex Info Website
- Matcher class in the Javadocs