I'm trying to come up with a regular expression that will match the components of a PQL query bit by bit. Some examples of what I have in mind:

a==
[('a==')]
a==22
[('a==22')]
a=="b"
[('a=="b"')]
a=="b" and/or/not <- any of these
[('a=="b"', '{logical operator}')]
a=="b" or c.
[('a=="b"', 'or'), ('c.')]
a=="b" or c.d
[('a=="b"', 'or'), ('c.d')]
a=='b' and c=="
[("a=='b'", 'and'), ('c=="')]

Basically, whenever a new section of the PQL statement begins, a new match should be created, and this should work for both string and numeric values.

My current expression looks like this:

([a-zA-Z.\-_]+[=!<>]{0,2}([\"\']?[a-zA-Z\-!._ "\']*?[\"\']|[0-9]*))[ ]?(and|or|not)?[ ]?

It does a good job, but fails on something like this:

a==22 and b=='c

It thinks that c belongs in a new match, yielding the following:

[('a==22', '22', 'and'), ('b==', "'"), ('c')]

As opposed to

[('a==22', '22', 'and'), ('b==', "'c")]
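One possible direction, offered only as a sketch and not as a tested solution for full PQL: give each quote style its own alternative and make the closing quote optional, so an unterminated string such as `'c` stays attached to its comparison. The names `NAME`, `OPERATOR`, `VALUE`, `PQL_PART`, and `split_pql` are hypothetical, and the inner capture group from the expression above is dropped here, so each tuple has two items instead of three.

```python
import re

# Hypothetical building blocks; none of these names come from the original post.
NAME = r"[a-zA-Z._\-]+"        # field name, e.g. a or c.d
OPERATOR = r"[=!<>]{0,2}"      # optional comparison operator, e.g. == or !=
# A value is a double-quoted string, a single-quoted string, or digits.
# The closing quote is optional so that a half-typed string still matches.
VALUE = r"""(?:"[^"]*"?|'[^']*'?|[0-9]*)"""

PQL_PART = re.compile("(" + NAME + OPERATOR + VALUE + ")[ ]?(and|or|not)?[ ]?")


def split_pql(query):
    """Return (comparison, logical_operator) tuples for a partial query."""
    return PQL_PART.findall(query)


print(split_pql("a==22 and b=='c"))
# [('a==22', 'and'), ("b=='c", '')]
print(split_pql('a=="b" or c.d'))
# [('a=="b"', 'or'), ('c.d', '')]
```

A caveat with this assumption: an unterminated quote swallows everything up to the next matching quote or the end of the input, so `a=='b and c==1` would be read as a single comparison. Whether that is acceptable depends on how the partial query is meant to be interpreted.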
lightstrike
  • I find it weird that singleton single/double quotes are valid... Otherwise, it would be simpler to parse. – Jerry May 06 '14 at 17:23
  • It looks like you're trying to build a parser with regexes, see [here](http://stackoverflow.com/questions/5389244/building-a-regex-based-parser). I was under the impression that that was a bad idea™, but the linked question seems to have some resources to support it... (my memory was that you needed a type one formal grammar, which cannot be parsed with regexes, but my memory may be hazy). – Mike H-R May 06 '14 at 17:42
  • @MikeH-R Not all formal grammars are unparseable with regular expressions, though, in practice most of them are. You're right to say type 1 and 0 grammars are _generally_ unparseable with Regular expressions, but all type 3 grammars, and many type 2 grammars are. If you can write your language to use those grammars, it will be parseable with regex. – FrankieTheKneeMan May 06 '14 at 19:59
  • Right you are, as I found out reading the link I posted, there are a number of resources on parsing with regexes which seem very useful. – Mike H-R May 06 '14 at 20:01
