I am trying to capture one or two pieces of information per line. On regexr my expression appears to work and captures what it should, but when I run it in code it captures from only a single string (on the same data as in regexr) and returns null for the rest.

I have tried building the expression here

And when switching to JS flavor it shows the capturing groups not working via the color overlays, but it shows them working correctly in the side pane. Even the simplest capturing group seems to not work.

What am I missing?

Input:

<@U0BUPU9QQ> 49
50
<@U0BUPU9QQ>
<@U0BUPU9QQ> noget 49 noget andet tekst 5 40
<@U0BUPU9QQ> noget andet tekst 5 40
<@U0BUPU9QQ|mn> has joined the channel

Output:

The output should be the ID inside the <> (without the @) and the last group of digits in the line. If there is no ID, then only the digits.

Onilol
Mathias Nielsen

1 Answer

Do not pay attention to the highlighting groups on regex101 for JS: if you see them in the MATCH INFORMATION pane on the right, they are matched and captured correctly.

In JS, here is the code that will fetch the capture groups (note that m[1] is the first capture group text, m[2] is the second group text, etc.):

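// The g flag lets exec() step through every match; the m flag makes ^ match at each line start.
var re = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/gm;
var str = '<@U0BUPU9QQ> 49\n50\n<@U0BUPU9QQ>\n<@U0BUPU9QQ> noget 49 noget andet tekst 5 40\n<@U0BUPU9QQ> noget andet tekst 5 40\n<@U0BUPU9QQ|mn> has joined the channel';
var m;

// exec() returns null once there are no more matches.
while ((m = re.exec(str)) !== null) {
    // m[1] is the ID (undefined when the line has no ID), m[2] is the digit sequence.
    document.write(m[1] + "<br/>" + m[2] + "<br/><br/>");
}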

Notes on the regex itself:

  • ^ - Start matching at the beginning of the line (due to m modifier)
  • (?:<@([A-Z0-9]+)>)? - an optional (due to ? quantifier) group matching
    • <@ - literal <@ symbols
    • ([A-Z0-9]+) - (Capture group 1) 1 or more uppercase ASCII letters or digits
    • > - closing angle bracket
  • .* - 0 or more characters other than a newline (as many as possible)
  • \b([0-9]+) - (Capture group 2) 1 or more digits that are preceded by a word boundary
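
To see how these parts behave on individual lines, here is a minimal sketch that applies the same pattern to one line at a time. The names LINE_RE and parseLine are hypothetical, introduced only for illustration, and the g and m flags are dropped on the assumption that each call receives a single line.

```javascript
// Same pattern as in the answer, but without the g and m flags,
// because each call is assumed to receive exactly one line.
const LINE_RE = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/;

// Hypothetical helper: returns { id, number } for a matching line,
// or null when the line contains no standalone digit sequence.
function parseLine(line) {
  const m = LINE_RE.exec(line);
  if (m === null) {
    return null;
  }
  return {
    id: m[1] === undefined ? null : m[1], // group 1 is optional
    number: m[2],                         // group 2: last digit sequence
  };
}

console.log(parseLine('<@U0BUPU9QQ> 49'));
console.log(parseLine('50'));
console.log(parseLine('<@U0BUPU9QQ|mn> has joined the channel'));
```

The digits inside U0BUPU9QQ are never captured as group 2, because each of them is preceded by a letter and therefore not by a word boundary.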

You can adjust the regex as per your requirements. Right now, it will match the ID (=the symbols inside optional <@...>), and the last digit sequence on a line. If you need the first digit sequence, use lazy matching .*? instead of the greedy one (.*).
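
As a rough illustration of the greedy versus lazy difference, the sketch below runs both variants against one of the sample lines from the question. The variable names are made up for this example.

```javascript
// Greedy .* backtracks from the end of the line, so group 2 is the LAST digit sequence.
const greedy = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/;
// Lazy .*? expands from the left, so group 2 is the FIRST digit sequence.
const lazy = /^(?:<@([A-Z0-9]+)>)?.*?\b([0-9]+)/;

const line = '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40';

const lastDigits = greedy.exec(line)[2];  // "40"
const firstDigits = lazy.exec(line)[2];   // "49"

console.log(lastDigits, firstDigits);
```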

Wiktor Stribiżew
  • _if there is no ID then only the digits._ It should be: `^(?:<@([A-Z0-9]+)>)*.*?\b([0-9]+)` – hjpotter92 Oct 20 '15 at 13:15
  • Well, I am not sure what exact regex is required here. I have added the explanation. – Wiktor Stribiżew Oct 20 '15 at 13:20
  • Using the pattern from @hjpotter92 together with your code, it finally works as intended. Thank you all. – Mathias Nielsen Oct 20 '15 at 13:38
  • @MathiasNielsen: Do you mean you needed the *first* digit sequence? Then, as I wrote in my answer, the lazy dot matching is what you needed. Your post contains this requirement: **the last group of digits in the line**. Great that you found the exact solution you needed. – Wiktor Stribiżew Oct 20 '15 at 13:47