3

I'm using this regex:

([\w\s]+)(=|!=)([\w\s]+)( (or|and) ([\w\s]+)(=|!=)([\w\s]+))*

to match a string such as this: i= 2 or i =3 and k!=4

When I try to extract values using m.group(index), I get: (i, =, 2, and k!=4, and, k, !=, 4).

Expected output: (i, =, 2, or, i, =, 3, and, k, !=, 4). How do I extract the values correctly?

P.S. m.matches() returns true.

rocketboy
abhi5306

4 Answers

3

You are trying to parse an expression with a regex. You might want to use a parser instead: this regex, once you get it working, cannot easily be extended any further, whereas a parser can be extended at any time.

For example, consider using ANTLR (see "ANTLR: Is there a simple example?").
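
As a rough illustration of the parser idea, here is a minimal hand-written sketch in Java. It does not use ANTLR. The class name ConditionParser, the Condition type, and the token grammar (operands are runs of word characters, operators are = and !=, connectives are or and and) are assumptions made for this example, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConditionParser {
    /** One comparison plus the connective joining it to the previous one (null for the first). */
    static final class Condition {
        final String connective, left, operator, right;

        Condition(String connective, String left, String operator, String right) {
            this.connective = connective;
            this.left = left;
            this.operator = operator;
            this.right = right;
        }

        @Override
        public String toString() {
            return (connective == null ? "" : connective + " ") + left + " " + operator + " " + right;
        }
    }

    // Assumed token grammar: optional whitespace, then "!=", "=", or a run of word characters.
    private static final Pattern TOKEN = Pattern.compile("\\s*(!=|=|\\w+)");

    /** Splits the input into tokens, rejecting anything the grammar does not cover. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                if (input.substring(pos).trim().isEmpty()) {
                    break; // only trailing whitespace is left
                }
                throw new IllegalArgumentException("Unexpected input near index " + pos);
            }
            tokens.add(m.group(1));
            pos = m.end();
        }
        return tokens;
    }

    static boolean isOperator(String s) {
        return s.equals("=") || s.equals("!=");
    }

    /** Parses: comparison ((or|and) comparison)* */
    static List<Condition> parse(String input) {
        List<String> t = tokenize(input);
        List<Condition> conditions = new ArrayList<>();
        String connective = null;
        int i = 0;
        while (true) {
            if (i + 3 > t.size() || !isOperator(t.get(i + 1))) {
                throw new IllegalArgumentException("Expected <operand> <operator> <operand> at token " + i);
            }
            conditions.add(new Condition(connective, t.get(i), t.get(i + 1), t.get(i + 2)));
            i += 3;
            if (i == t.size()) {
                return conditions;
            }
            connective = t.get(i++);
            if (!connective.equals("or") && !connective.equals("and")) {
                throw new IllegalArgumentException("Expected 'or' or 'and' but found: " + connective);
            }
        }
    }

    public static void main(String[] args) {
        // Prints one comparison per line: "i = 2", "or i = 3", "and k != 4"
        for (Condition c : parse("i= 2 or i =3 and k!=4")) {
            System.out.println(c);
        }
    }
}
```

Because each comparison ends up in its own object, adding operators or parentheses later means extending parse rather than rewriting one large regex.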

Community
Zoltán Haindrich
2

This is because your fourth set of parentheses (the one that you use for repeating expressions) is itself a capturing group, which is what's confusing you. Try making it a non-capturing group:

([\w\s]+)(=|!=)([\w\s]+)(?: (or|and) ([\w\s]+)(=|!=)([\w\s]+))*
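
A small Java sketch of what this pattern returns for the string in the question may help. The class and method names are made up for the example. Note that in java.util.regex a capturing group inside a repeated construct keeps only the text from its last iteration, so the non-capturing group removes the unwanted whole-clause group but the "or i =3" clause is still not available through group(int).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Same pattern as above; only the repeated wrapper is non-capturing.
    static final Pattern P = Pattern.compile(
        "([\\w\\s]+)(=|!=)([\\w\\s]+)(?: (or|and) ([\\w\\s]+)(=|!=)([\\w\\s]+))*");

    /** Returns groups 1..groupCount() for a full match, or an empty array if there is none. */
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            out[i - 1] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Seven groups remain; groups 4 to 7 hold only the last repetition ("and k!=4").
        // Prints: i|=| 2|and|k|!=|4
        System.out.println(String.join("|", groups("i= 2 or i =3 and k!=4")));
    }
}
```

To collect every clause, one option is to match a single clause at a time in a Matcher.find() loop instead of relying on the repeated group.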
zigdon
1

Description

Why not simplify your expression to match exactly what you're looking for?

!?=|(?:or|and)|\b(?:(?!or|and)[\w\s])+\b

Example

Sample Text

i= 2 or i =1234 and k!=4 

Matches Found

[0][0] = i
[1][0] = =
[2][0] = 2 
[3][0] = or
[4][0] =  i
[5][0] = =
[6][0] = 1234 
[7][0] = and
[8][0] =  k
[9][0] = !=
[10][0] = 4
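
In Java, a token pattern like this is typically applied with Matcher.find() in a loop rather than with matches(). The following is a minimal sketch under that assumption; the class name TokenDemo and the trim() call (used to drop the surrounding spaces visible in the matches above) are additions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenDemo {
    // The pattern from this answer, with backslashes escaped for a Java string literal.
    private static final Pattern TOKEN =
        Pattern.compile("!?=|(?:or|and)|\\b(?:(?!or|and)[\\w\\s])+\\b");

    /** Collects every match in order, trimming whitespace around each token. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group().trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [i, =, 2, or, i, =, 1234, and, k, !=, 4]
        System.out.println(tokenize("i= 2 or i =1234 and k!=4 "));
    }
}
```

One caveat: the lookahead (?!or|and) is not anchored to word boundaries, so an operand that contains "or" or "and" (for example a variable named order) would be split unexpectedly.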
animuson
Ro Yo Mi
0

Every pair of parentheses creates a capturing group, which you can later access by index. You can make a group you do not need non-capturing by writing it as (?: ... ); it is then skipped in the numbering used by Matcher.group(int).

qqilihq