2

I have a string in a JList that I am looking to split with a regex (for future simplicity if requirements change).

The string looks a lot like this:

ID: GF68464, Name: productname

The ID could be any combination of letters and numbers and could be any length.

I only want the ID to be matched, i.e. excluding "ID: " and everything from the comma following the ID onward.

Here is what I have so far, but it doesn't do what I want:

[^ID: ][a-zA-Z1-9][^,^.]

FURTHER INFO (EDIT)

I plan on extracting the ID to match against an array. (hence the need for a regex). Could this be done a different way?

4 Answers

4

You can try this:

ID:\s*(\w+),

and extract the 1st capturing group. You can also use lookarounds (+1 to @p.s.w.g).


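import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "ID: GF68464, Name: productname";

// Group 1 captures the word characters between "ID:" and the comma.
Matcher m = Pattern.compile("ID:\\s*(\\w+),").matcher(str);
if (m.find()) {
    System.out.println(m.group(1));  // prints GF68464
}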
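Since the question's edit mentions matching the ID against an array, here is a minimal sketch of how the capture-group approach might be wrapped for that. The class name `IdExtractor`, the method names, and the sample `knownIds` array are all hypothetical and not from the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdExtractor {
    // Compile once and reuse; group 1 holds the ID between "ID:" and the comma.
    private static final Pattern ID_PATTERN = Pattern.compile("ID:\\s*(\\w+),");

    /** Returns the ID from a list entry, or empty if no "ID: ...," part is found. */
    static Optional<String> extractId(String entry) {
        Matcher m = ID_PATTERN.matcher(entry);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True if the entry's ID appears in the given array of known IDs. */
    static boolean isKnown(String entry, String[] knownIds) {
        List<String> known = Arrays.asList(knownIds);
        return extractId(entry).map(known::contains).orElse(false);
    }

    public static void main(String[] args) {
        String[] knownIds = {"AB123", "GF68464"};  // made-up sample data
        String entry = "ID: GF68464, Name: productname";
        System.out.println(extractId(entry).orElse("<no ID>"));
        System.out.println(isKnown(entry, knownIds));
    }
}
```

Returning an `Optional` is just one way to handle entries that have no ID; returning `null` or throwing would work too, depending on how the list is populated.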
arshajii
  • nope, that appears to just match the entire string. Do forgive me if I'm making a faux pas – Cameron Miller Aug 15 '13 at 22:50
  • You can extract just the groups in `()`'s using the `Matcher.group` method - which should include ONLY the parenthesized text, and not the rest of the matched string. – CmdrMoozy Aug 15 '13 at 22:51
  • @CameronMiller It *matches* the entire string, but it *captures* the ID and the text following the comma in separate groups. You have to use the `.group` method to extract the relevant capture group. – p.s.w.g Aug 15 '13 at 22:52
  • Did the job. You sir, are a legend. – Cameron Miller Aug 15 '13 at 22:57
3

You could try using lookarounds:

(?<=ID:\s*)\w+(?=,)

This will match any sequence of one or more word characters that is preceded by "ID:" and any number of whitespace characters, and followed by a comma.
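A rough Java sketch of the lookaround idea follows. Depending on the Java version, an unbounded quantifier such as `\s*` inside a lookbehind may be rejected when the pattern is compiled, so this sketch uses a bounded `\s{0,10}` instead; the limit of 10 is an arbitrary assumption, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaroundId {
    // Lookbehind: "ID:" plus up to 10 whitespace characters must precede the match
    // (the bound of 10 is an arbitrary assumption). Lookahead: a comma must follow.
    private static final Pattern ID = Pattern.compile("(?<=ID:\\s{0,10})\\w+(?=,)");

    /** Returns the matched ID, or null when the input contains none. */
    static String findId(String input) {
        Matcher m = ID.matcher(input);
        // With lookarounds the whole match is the ID itself, so group() is enough.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findId("ID: GF68464, Name: productname"));
    }
}
```

Because the lookarounds consume no characters, the overall match is only the ID, so no capture group is needed.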

p.s.w.g
0

What you want is called a non-capturing group. There are already some fairly high-quality examples of doing this in Java on SO - for example, this question: What is a non-capturing group? What does a question mark followed by a colon (?:) mean?
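To illustrate how a non-capturing group behaves in Java, here is a small sketch (class and method names are hypothetical). It suggests that `(?:...)` only keeps the prefix out of the numbered groups; the prefix is still part of the overall match, so a capturing group is still needed to pull out the ID alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // (?:ID:\s*) groups the prefix without capturing it; (\w+) is capture group 1.
    private static final Pattern P = Pattern.compile("(?:ID:\\s*)(\\w+),");

    /** The full matched text, which still includes the non-captured prefix. */
    static String wholeMatch(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group() : null;
    }

    /** Only the ID, taken from capture group 1. */
    static String idOnly(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "ID: GF68464, Name: productname";
        System.out.println(wholeMatch(s));  // includes "ID: " and the comma
        System.out.println(idOnly(s));      // just the ID
    }
}
```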

Community
CmdrMoozy
  • I believe, s?he needs capturing group. – kirilloid Aug 15 '13 at 22:47
  • You could either place the "ID: " portion in a non-capturing group, or place the ID itself in a capture group and then only extract that group from the returned match. Either should work. – CmdrMoozy Aug 15 '13 at 22:49
  • I will admit that capture groups are probably the more straight-forward solution, but non-capturing groups are still a handy trick to know. – CmdrMoozy Aug 15 '13 at 22:52
0

Create a regex like /^[a-z A-Z 0-9]*,/, then use the match function and take the value matches[0], like this:

var regex = /^[a-z A-Z 0-9]*\,/;
var matches = your_string.match(regex);
var required_value = matches[0];

hope this helps

  • No, this won't work. It will reject the `:` in the `ID: `. You could add it to the character class, but then it would catch a lot more than what OP wanted. Also, you use a start anchor, but OP never said that it should only match at the start of the string / line. Finally, this appears to be JavaScript, but the question was specifically tagged as Java. – p.s.w.g Aug 16 '13 at 01:57
  • I am sorry, I took it for a JavaScript query. Thanks for the clarification – Akash Arora Aug 16 '13 at 19:49