I have an unusual requirement: I need to match a regex against an input string three characters at a time. For example, if my input is ABCDEFGHI, a search for the pattern BCD should return false, because I treat the input as ABC+DEF+GHI and compare the pattern only against these triplets. Similarly, the pattern DEF should return true, since it matches one of the triplets.
Given this rule, assume my input is QWEABCPOIUYTREWXYZASDFGHJKLABCMNBVCXZASXYZFGH and I want every substring that starts with the triplet ABC and ends with the triplet XYZ. For the input above, the output should be two strings: ABCPOIUYTREWXYZ and ABCMNBVCXZASXYZ.
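To make the triplet rule concrete, here is a minimal sketch of the check I have in mind. The class name TripletMatch and the helper anyTripletMatches are hypothetical names used only for this illustration, and the sketch assumes the pattern has to match a whole triplet (Matcher.matches()) rather than any part of it.

```java
import java.util.regex.Pattern;

class TripletMatch {
    // Hypothetical helper: returns true if any aligned 3-character chunk
    // of the text fully matches the regex.
    static boolean anyTripletMatches(String text, String regex) {
        Pattern p = Pattern.compile(regex);
        // Step by 3 so the chunks are ABC, DEF, GHI and never BCD.
        for (int i = 0; i + 3 <= text.length(); i += 3) {
            if (p.matcher(text.substring(i, i + 3)).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(anyTripletMatches("ABCDEFGHI", "BCD")); // false
        System.out.println(anyTripletMatches("ABCDEFGHI", "DEF")); // true
    }
}
```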
Also, I have to store these strings in an ArrayList. Below is my function:
public static void newFindMatches(String text, String startRegex, String endRegex, List<String> output) {
    int startPos = 0;
    int endPos = 0;
    int i = 0;
    // Making sure that substrings are always valid
    while (i < text.length() - 2) {
        // Substring for comparing triplets
        String subText = text.substring(i, i + 3);
        Pattern startP = Pattern.compile(startRegex);
        Pattern endP = Pattern.compile(endRegex);
        Matcher startM = startP.matcher(subText);
        if (startM.find()) {
            // If a match is found, set the start position
            startPos = i;
            for (int j = i; j < text.length() - 2; j += 3) {
                String subText2 = text.substring(j, j + 3);
                Matcher endM = endP.matcher(subText2);
                if (endM.find()) {
                    // If match for end pattern is found, set the end position
                    endPos = j + 3;
                    // Add the string between start and end positions to ArrayList
                    output.add(text.substring(startPos, endPos));
                    i = j;
                }
            }
        }
        i = i + 3;
    }
}
Upon running this function in main as follows:
String input = "QWEABCPOIUYTREWXYZASDFGHJKLABCMNBVCXZASXYZFGH";
String start = "ABC";
String end = "XYZ";
List<String> results = new ArrayList<String>();
newFindMatches(input, start, end, results);
for (int x = 0; x < results.size(); x++) {
    System.out.println("Output String number " + (x + 1) + " is: " + results.get(x));
}
I get the following output:
Output String number 1 is: ABCPOIUYTREWXYZ
Output String number 2 is: ABCPOIUYTREWXYZASDFGHJKLABCMNBVCXZASXYZ
Notice that the first string is correct. For the second string, however, the program starts again from the first ABC instead of from the position after the last end pattern. I want it to resume reading after the last end pattern, i.e. skip the first result and the unwanted characters such as ASDFGHJKL, and print only ABCMNBVCXZASXYZ as the second string.
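For reference, here is a minimal sketch of the behaviour I am after. It is an illustration under a few assumptions, not a confirmed fix: it stops the inner scan at the first end triplet with a break, starts looking for the end triplet one triplet after the start (my function above starts at the start triplet itself), uses Matcher.matches() on whole triplets instead of find(), and returns a new list instead of filling a parameter. The names TripletRangeFinder and findRanges are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class TripletRangeFinder {
    // Sketch: collect substrings that begin at a triplet matching startRegex
    // and end at the nearest following triplet matching endRegex.
    static List<String> findRanges(String text, String startRegex, String endRegex) {
        // Compile once, outside the loops.
        Pattern startP = Pattern.compile(startRegex);
        Pattern endP = Pattern.compile(endRegex);
        List<String> output = new ArrayList<>();
        int i = 0;
        while (i + 3 <= text.length()) {
            if (startP.matcher(text.substring(i, i + 3)).matches()) {
                // Assumption: the end triplet must come after the start triplet.
                for (int j = i + 3; j + 3 <= text.length(); j += 3) {
                    if (endP.matcher(text.substring(j, j + 3)).matches()) {
                        output.add(text.substring(i, j + 3));
                        i = j;  // the outer loop resumes after the end triplet
                        break;  // stop at the first end match for this start
                    }
                }
            }
            i += 3;
        }
        return output;
    }

    public static void main(String[] args) {
        String input = "QWEABCPOIUYTREWXYZASDFGHJKLABCMNBVCXZASXYZFGH";
        // Prints [ABCPOIUYTREWXYZ, ABCMNBVCXZASXYZ]
        System.out.println(findRanges(input, "ABC", "XYZ"));
    }
}
```

If a start triplet has no matching end triplet after it, this sketch adds nothing for it and moves on to the next triplet.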
Thanks for your responses