I'm trying to come up with a regular expression that will match the components of a PQL query bit by bit. Some examples of what I have in mind:
a==
[('a==')]
a==22
[('a==22')]
a=="b"
[('a=="b"')]
a=="b" and/or/not <- any of these
[('a=="b"', '{logical operator}')]
a=="b" or c.
[('a=="b"', 'or'), ('c.')]
a=="b" or c.d
[('a=="b"', 'or'), ('c.d')]
a=='b' and c=="
[("a=='b'", 'and'), ('c=="')]
Basically, whenever a new section of the PQL statement is entered, a new match should be created, and this should work for both string and numeric values.
My current expression looks like this:
([a-zA-Z.\-_]+[=!<>]{0,2}([\"\']?[a-zA-Z\-!._ "\']*?[\"\']|[0-9]*))[ ]?(and|or|not)?[ ]?
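For readability, here is the same expression laid out with Python's `re.VERBOSE` flag and a comment per piece. This is only an annotated restatement, assuming the pattern is used with Python's `re` module (the tuple output above suggests `re.findall`); the names `COMPACT` and `ANNOTATED` are made up for illustration.

```python
import re

# The expression exactly as written above.
COMPACT = r"""([a-zA-Z.\-_]+[=!<>]{0,2}([\"\']?[a-zA-Z\-!._ "\']*?[\"\']|[0-9]*))[ ]?(and|or|not)?[ ]?"""

# The same expression, spread out and commented.
ANNOTATED = re.compile(r"""
    (                               # group 1: the whole term
        [a-zA-Z.\-_]+               # field name
        [=!<>]{0,2}                 # comparison operator, 0 to 2 characters
        (                           # group 2: the value
            [\"\']?                 # optional opening quote
            [a-zA-Z\-!._ "\']*?     # string body, matched lazily
            [\"\']                  # closing quote, required by this branch
          |
            [0-9]*                  # or a (possibly empty) run of digits
        )
    )
    [ ]?
    (and|or|not)?                   # group 3: logical operator
    [ ]?
""", re.VERBOSE)

query = "a==22 and b=='c"
print(re.findall(COMPACT, query))
print(ANNOTATED.findall(query))
```

Laid out this way, one thing stands out: the string branch of group 2 cannot succeed without a closing quote, so a value that is still being typed has nothing to end on.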
It does a good job, but fails on something like this:
a==22 and b=='c
It thinks that c belongs in a new match, yielding the following:
[('a==22', '22', 'and'), ('b==', "'"), ('c')]
As opposed to:
[('a==22', '22', 'and'), ('b==', "'c")]
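One possible direction, sketched under assumptions rather than offered as a definitive fix: the string branch of the current expression requires a closing quote, so an unterminated value like 'c has nowhere to go. Making the closing quote optional, and letting the body run up to the next quote of the same kind, keeps the partial string inside its term. The names `PQL_PART` and `split_pql` are hypothetical, and this version returns `(term,)` or `(term, operator)` tuples as in the first examples instead of capturing the value separately.

```python
import re

# Hypothetical pattern: one term plus an optional trailing logical operator.
PQL_PART = re.compile(r"""
    (?P<term>
        [a-zA-Z._-]+            # field name
        [=!<>]{0,2}             # comparison operator, possibly incomplete
        (?:
            "[^"]*"?            # double-quoted string, closing quote optional
          | '[^']*'?            # single-quoted string, closing quote optional
          | [0-9]+              # number
        )?                      # the value may not have been typed yet
    )
    [ ]?
    (?P<op>(?:and|or|not)\b)?   # logical operator, as a whole word
    [ ]?
""", re.VERBOSE)


def split_pql(query):
    """Split a (possibly incomplete) query into (term,) or (term, operator) tuples."""
    parts = []
    for m in PQL_PART.finditer(query):
        term, op = m.group("term"), m.group("op")
        # Leave the operator out of the tuple when none followed the term.
        parts.append((term, op) if op else (term,))
    return parts


print(split_pql("a==22 and b=='c"))
print(split_pql('a=="b" or c.d'))
```

A side effect of this choice is that an unterminated string swallows everything after it, so in `a=="b and c` the text `and c` is treated as part of the string. That seems reasonable for input that is still being typed, but it is a design decision, not a requirement.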