2

I have a regex:

\[[^\[\]]*\]\s*[<>|<=|>=|=|>|<]\s*'?"?\w*'?"?

It basically just parses an equation like:

[household_roster_relationships_topersona_nameadditionalpersono] = "1"

It works well with the single-character operators '=', '>' and '<', but when the equation contains '<=', '>=' or '<>', the match stops at the first character of the operator.

I have created a demo on regex101.

How can I correct the regex so that it works in this situation?

Thomas Ayoub
Peter Huang

3 Answers

2

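Just replace your character class with an alternation group. Inside [...], the | is a literal character and the whole class matches exactly one character, so a two-character operator such as <= can never be matched by it: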

\[[^\[\]]*\]\s*(<>|<=|>=|=|>|<)\s*'?"?\w*'?"?
               ^              ^

See demo
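To make the difference concrete, here is a minimal sketch in Python. The question does not name a regex flavor, so the use of Python's `re` module, the helper `first_match` and the sample string `[age] <= "18"` are all assumptions made for illustration only.

```python
import re

# Original pattern: [<>|<=|>=|=|>|<] is a character class, so it matches
# exactly ONE of the characters <, >, | or = (| is literal inside [...]).
CHAR_CLASS = re.compile(r"""\[[^\[\]]*\]\s*[<>|<=|>=|=|>|<]\s*'?"?\w*'?"?""")

# Fixed pattern: (...) with | is an alternation between whole operators.
ALTERNATION = re.compile(r"""\[[^\[\]]*\]\s*(<>|<=|>=|=|>|<)\s*'?"?\w*'?"?""")


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


sample = '[age] <= "18"'  # hypothetical input, not from the question
broken = first_match(CHAR_CLASS, sample)
fixed = first_match(ALTERNATION, sample)

print(repr(broken))  # the match ends right after the '<'
print(repr(fixed))   # the whole comparison is matched
```

With the character class, everything after the single operator character is optional, so the match simply ends after `<`. With the alternation, `<=` is consumed as one unit and the value that follows is matched as well.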

Graham
Thomas Ayoub
1

You need to replace the character class with a grouping construct.

Use

\[[^[\]]*]\s*(?:<>|[<>]=|[=><])\s*['"]?\w*['"]?
             ^^^              ^     

See the regex demo. The (?:<>|[<>]=|[=><]) non-capturing group (only used for grouping subpatterns) matches either <>, <=, >=, =, > or <.

Note that I merged some alternative branches to make the pattern a bit more compact. Also, I think you just want to match either ' or " at the end, so a mere ['"]? (zero or one ' or ") should be enough.

Also, you do not need to escape a [ inside a character class ([[] matches a single [), and you do not need to escape ] outside a character class, where it matches a literal ] symbol.
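As a quick check of the compact pattern, the following Python sketch runs it against all six operators. Python's `re` module, the field name `[field]` and the deliberately reordered `MISORDERED` pattern are assumptions added here for illustration; they are not part of the answer.

```python
import re

# The compact pattern from this answer, unchanged.
COMPACT = re.compile(r"""\[[^[\]]*]\s*(?:<>|[<>]=|[=><])\s*['"]?\w*['"]?""")

# Hypothetical variant with the single-character branch listed FIRST,
# to show why the two-character operators have to be tried before it.
MISORDERED = re.compile(r"""\[[^[\]]*]\s*(?:[=><]|[<>]=|<>)\s*['"]?\w*['"]?""")

operators = ["<>", "<=", ">=", "=", ">", "<"]

# True for an operator if the whole comparison string is matched.
results = {
    op: COMPACT.fullmatch(f'[field] {op} "1"') is not None
    for op in operators
}

sample = '[field] <= "1"'
compact_hit = COMPACT.search(sample).group(0)
misordered_hit = MISORDERED.search(sample).group(0)

print(results)
print(repr(compact_hit), repr(misordered_hit))
```

Alternation tries its branches from left to right and stops at the first one that lets the rest of the pattern succeed. Because everything after the operator is optional, putting `[=><]` first would let the match end after `<`, which is the same symptom as in the question.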

Wiktor Stribiżew
  • Hi, that works well with the example in the demo. I have another example that breaks the regex: [diagnostic_procedures_performed] = 'UNDEFINED_CODE' and [other_procedure_date] > "01-01-1901". How would I fix that too? – Peter Huang Feb 16 '17 at 16:55
  • Replace `\w*` with `[\w-]*` (0+ word or `-` symbols) or with `[^"']*` (0+ chars other than `'` and `"`). The latter may be more generic. – Wiktor Stribiżew Feb 16 '17 at 17:06
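The two replacements suggested in the last comment can be compared with a short Python sketch. The use of Python's `re` module and the names `WORD_ONLY`, `WORD_OR_DASH`, `NOT_QUOTE` and `all_matches` are assumptions for illustration; the input string is the one from the comment above.

```python
import re

PREFIX = r"""\[[^[\]]*]\s*(?:<>|[<>]=|[=><])\s*"""

# Value part as in the answer (\w*), and the two variants from the comment.
WORD_ONLY = re.compile(PREFIX + r"""['"]?\w*['"]?""")
WORD_OR_DASH = re.compile(PREFIX + r"""['"]?[\w-]*['"]?""")
NOT_QUOTE = re.compile(PREFIX + r"""['"]?[^"']*['"]?""")

text = ("[diagnostic_procedures_performed] = 'UNDEFINED_CODE' "
        'and [other_procedure_date] > "01-01-1901"')


def all_matches(pattern, s):
    """Return the full text of every non-overlapping match in s."""
    return [m.group(0) for m in pattern.finditer(s)]


word_only = all_matches(WORD_ONLY, text)
word_or_dash = all_matches(WORD_OR_DASH, text)
not_quote = all_matches(NOT_QUOTE, text)

print(word_only)     # the date is cut off at the first '-'
print(word_or_dash)  # both comparisons are matched in full
print(not_quote)     # same result for this quoted input
```

One caveat with `[^"']*`: when the values are not quoted, nothing stops the negated class, so in a string such as `[a] = 1 and [b] = 2` the first match runs to the end of the input. For unquoted values, `[\w-]*` is likely the safer of the two.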
0

You need to use a group () instead of a character class [] to match operators that are more than one character long. See the updated regex.

m87