String s = #Section250342,Main,First/HS/12345/Jack/M,2000 10.00,
#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,
#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,
#Section251234,Main,First/HS/12345/Jack/M,2000 11.00

Wherever /Jack/M occurs in the string, I want to extract the section numbers (250342, 251234) and the values (10.00, 11.00) associated with it, using a regex.

I tried something like this https://regex101.com/r/4te0Lg/1 but it still doesn't work:

.Section(\d+(?:\.\d+)?).*/Jack/M
The fourth bird
Ajm Kir

2 Answers


If the only parts of each section that change are the section number, the person's name, and the last value (as in your example), you can build a pattern easily: take one of the sections where Jack appears and replace the numbers you want with capturing groups.

Example:

#Section250342,Main,First/HS/12345/Jack/M,2000 10.00

becomes:

#Section(\d+),Main,First/HS/12345/Jack/M,2000 (\d+\.\d{2})

If each section keeps the same overall format but the other parts may also vary, generalize the rest of the pattern like this:

#Section(\d+),\w+,(?:\w+/)*Jack/M,\d+ (\d+\.\d{2})

I'm assuming that "Main" is a class, "First/HS/..." is a path, and that the last value always has exactly 2 decimal places.

  • \d - A digit: [0-9]
  • \w - A word character: [a-zA-Z_0-9]
  • + - one or more times
  • * - zero or more times
  • {2} - exactly 2 times
  • () - a capturing group
  • (?:) - a non-capturing group
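
One token the list above does not cover is the dot: an unescaped . matches any character, while \. matches only a literal dot. The following is a minimal sketch of the difference; the class name DotEscape and the helper fullMatch are made up for illustration and are not part of the answer.

```java
import java.util.regex.Pattern;

public class DotEscape {

    // Hypothetical helper: true if the entire text matches the regex.
    static boolean fullMatch(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // Unescaped dot: accepts any single character between the digits.
        System.out.println(fullMatch("\\d+.\\d{2}", "10.00"));   // true
        System.out.println(fullMatch("\\d+.\\d{2}", "10x00"));   // true
        // Escaped dot: accepts only a literal "." between the digits.
        System.out.println(fullMatch("\\d+\\.\\d{2}", "10.00")); // true
        System.out.println(fullMatch("\\d+\\.\\d{2}", "10x00")); // false
    }
}
```

For values like 10.00 both forms match, so the distinction only matters if the input could contain something other than a dot in that position.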

For reference see: https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/regex/Pattern.html

A simple Java example of how to get the values from the capturing groups using java.util.regex.Pattern and java.util.regex.Matcher:

import java.util.regex.*;

public class GetMatch {

    public static void main(String[] args) {

        String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
        
        Pattern p = Pattern.compile("#Section(\\d+),\\w+,(?:\\w+/)*Jack/M,\\d+ (\\d+\\.\\d{2})");
        Matcher m;
        String[] tokens = s.split(",(?=#)"); //split the sections into different strings
        
        for(String t : tokens) //checks every string that we got with the split
        {   
            m = p.matcher(t);
            if(m.matches()) //if the string matches the pattern then print the capturing groups
                System.out.printf("Section: %s, Value: %s\n", m.group(1), m.group(2));
        }
    }
}
Yellow

You could use 2 capture groups, and use a tempered greedy token approach so the match does not cross into the next #Section followed by a digit.

#Section(\d+)(?:(?!#Section\d).)*\bJack/M,\d+\h+(\d+(?:\.\d+)?)\b

Explanation

  • #Section(\d+) Match #Section and capture 1+ digits in group 1
  • (?:(?!#Section\d).)* Match any character, zero or more times, as long as #Section followed by a digit does not start at that position
  • \bJack/M, A word boundary, then match Jack/M,
  • \d+\h+ Match 1+ digits and 1+ horizontal whitespace characters
  • (\d+(?:\.\d+)?) Capture group 2, match 1+ digits and an optional decimal part
  • \b A word boundary

Regex demo

In Java:

String regex = "#Section(\\d+)(?:(?!#Section\\d).)*\\bJack/M,\\d+\\h+(\\d+(?:\\.\\d+)?)\\b";
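
A minimal sketch of how this pattern might be applied to the whole input with Matcher.find(), without splitting the string first. The class name TemperedMatch, the method extract, and the "section=value" output format are assumptions made for illustration; the input is the single-line string from the question. Note that \h requires Java 8 or later, and that . does not match line breaks unless Pattern.DOTALL is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemperedMatch {

    // Pattern from the answer above, written as a Java string literal.
    private static final Pattern JACK = Pattern.compile(
        "#Section(\\d+)(?:(?!#Section\\d).)*\\bJack/M,\\d+\\h+(\\d+(?:\\.\\d+)?)\\b");

    // Hypothetical helper: returns one "section=value" entry
    // for every section that contains Jack/M.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = JACK.matcher(input);
        while (m.find()) { // find() scans for the next match anywhere in the input
            result.add(m.group(1) + "=" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,"
                 + "#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,"
                 + "#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,"
                 + "#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
        System.out.println(extract(s)); // prints [250342=10.00, 251234=11.00]
    }
}
```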
The fourth bird