I am dealing with boolean queries that use "AND", "OR", or "NOT" in an IR system, and was wondering how I would check for invalid queries. For example, these queries would be invalid:

"tom OR", "tom NOT", "beans AND OR beans", "NOT AND mad", "(cat AND dog" (since the parenthesis is never closed), etc.

These queries are invalid because a binary operator (AND, OR) needs a "word" on each side, while NOT needs one after it. Any tips on error checking for these things? I am pretty lost and would like a pointer in the right direction.

Thanks!

  • Do you trim the input, if you're looking for invalid, does that mean you can get the valid ones? Do you have a code example? – Sylhare Feb 08 '18 at 04:03
  • I am looking for the invalid entries, and if a query contains an invalid query, I want to throw an error to the user. Just a print statement. Just want to make sure for valid queries. I am in the middle of something rn but I can throw some code to show you what I mean later if you want. – LearningCoding Feb 08 '18 at 04:09

1 Answer

The best way is probably to parse the query into some kind of syntax tree. Because it's pretty simple, you could probably write the parser yourself. You could also use something like pyparsing to handle that for you.
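As a rough illustration of the hand-written route, here is a minimal recursive-descent sketch. The grammar is an assumption, not something stated in the question: NOT binds tighter than AND, AND binds tighter than OR, NOT is a prefix operator only (so "tom NOT jerry" would have to be written "tom AND NOT jerry"), and two adjacent words with no operator between them are rejected. The names `QueryError`, `Parser`, `tokenize`, and `is_valid` are made up for this example.

```python
import re

# A token is a parenthesis or any run of characters that is neither
# whitespace nor a parenthesis (assumed tokenization rule).
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")
OPERATORS = {"AND", "OR", "NOT"}


class QueryError(ValueError):
    """Raised when a boolean query is malformed (hypothetical name)."""


def tokenize(query):
    return TOKEN_RE.findall(query)


class Parser:
    """Builds a nested-tuple syntax tree, raising QueryError on bad input."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        tree = self.parse_or()
        if self.peek() is not None:
            # Leftover input, e.g. the trailing NOT in "tom NOT".
            raise QueryError(f"unexpected token {self.peek()!r}")
        return tree

    def parse_or(self):
        node = self.parse_and()
        while self.peek() == "OR":
            self.advance()
            node = ("OR", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_not()
        while self.peek() == "AND":
            self.advance()
            node = ("AND", node, self.parse_not())
        return node

    def parse_not(self):
        if self.peek() == "NOT":
            self.advance()
            return ("NOT", self.parse_not())
        return self.parse_atom()

    def parse_atom(self):
        tok = self.advance()
        if tok is None:
            raise QueryError("query ends where a term was expected")
        if tok == "(":
            node = self.parse_or()
            if self.advance() != ")":
                raise QueryError("missing closing parenthesis")
            return node
        if tok == ")" or tok in OPERATORS:
            raise QueryError(f"expected a term, found {tok!r}")
        return tok  # a plain search word


def is_valid(query):
    try:
        Parser(tokenize(query)).parse()
        return True
    except QueryError:
        return False
```

With this sketch, `is_valid` returns `False` for each of the example queries in the question and `True` for something like `"(cat OR dog) AND NOT fish"`. Catching `QueryError` yourself instead of calling `is_valid` gives you a message you can print to the user.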

Using a regular expression would probably be pretty painful: regular expressions can't keep track of arbitrarily nested structure such as balanced parentheses, so you'll pretty quickly end up in this situation.

wgoodall01