3

I am working on an Advanced Search feature where the expression I need to evaluate will look something like this with the parenthesis in place:

((Loan number is 1000 And
Lock Date is less than 12/03/2015) Or
Borrower SSN contains 12345) And
((Buy date is between 12/01/2015 and 23/02/2016 And
APR is less than 20000) Or
Loan amount is greater than 60000)

Or in simple words

((condition1 And condition2) Or condition 3) And ((condition4 And condition5) Or condition6).

If we look at the parenthesis, the condition1 and condition2 has to be evaluated first and then the output of this is executed with condition 3 and so on....

We have API to evaluate two conditions at a time. However the challenge in this context is

1) How to identify the corresponding parenthesis and evaluate them first. And then use this intermediate result for further evaluation?.

2)How to find unused parenthesis? For example (((condition1 And condition2))), in this case the though it is not required there are 3 starting and 3 closing parenthesis which is a valid expression.

I tried finding some algorithm here and here

However this takes token based manipulation which reads one character at a time and it is an arithmetic expression evaluation that computer understands. In my case these things are custom and we should find a algorithm to do this. Can anyone suggest a better approach for my scenario?

  • 1
    I would go with expression trees together with your api (or use linqkit if you don't want to create the expression trees yourself, although they look fairly simple) – Alexander Derck Apr 22 '16 at 07:31
  • 1
    You need a parser for your expression. Here's how to build one: http://stackoverflow.com/questions/2245962/is-there-an-alternative-for-flex-bison-that-is-usable-on-8-bit-embedded-systems/2336769#2336769 That article leads to others that will show you how to build trees to allow later evaluation, or how to evaluate the expression as it is parsed. – Ira Baxter Apr 22 '16 at 07:56
  • The shunting-yard algorithm allows you to convert an infix notation to a postfix notation, where 'inner' operators simply come before 'outer' ones. You can evaluate the result, as those articles do, but you can also turn that into an AST, or whatever format your API requires. So... what format does your API require? – Pieter Witvoet Apr 22 '16 at 08:30

1 Answers1

3

If I understand you correctly, you already got the expression evaluator. What you need is to split the evaluations according to the parenthesis. I'd use a loop, in which I'd find inner parenthesis groups, using this regex:

\(([^()]*)\)

Then, if found, replace them with the result of your evaluation routine, and repeat until a final string remains, without parenthesis.

Pseudo code:

Find a string enclosed by (), not containing any ()
If found
    Replace it with the evaluated value of the string (including parenthesis)
    Go again
Return result

About the unused parenthesis, let them be treated the same. They'll end up in your evaluation routine as a single value.

Check this fiddle. Instead of evaluating it returns a random number, 0 or 1, but it demonstrates the logic.

Hope this helps.

Regards.

SamWhan
  • 8,296
  • 1
  • 18
  • 45
  • 1
    In my opinion regular expression are not designed for parsing syntactically recursive expressions. For example can (partially) select string, but nothing to evaluation, communicating errors .... – Jacek Cz Apr 22 '16 at 08:24
  • @JacekCz I agree (to some extent - I frequently use regex for simple evaluation tasks). And the solution I proposed here doesn't use regex for the evaluation. It merely **identifies the parts in need of evaluation**. – SamWhan Apr 22 '16 at 08:52
  • Thanks Clas, that sets the base for me. – user3796454 Apr 22 '16 at 12:59
  • @ClasG, In the above case what should be the approach if the output of intermediate condition evaluation returns a complex object instead of string?. – user3796454 May 03 '16 at 08:03
  • You mention *We have API to evaluate two conditions at a time*, I guess what you send in to it is something like `Loan number is 1000 And Lock Date is less than 12/03/2015`, but what is the output? I assume that's the complex object you talk about - so can't your API process `[complex object] Or Borrower SSN contains 12345`? – SamWhan May 03 '16 at 09:12
  • @ClasG, I have started working on this. Actually my API does not take these strings directly instead I need to split them and pass it as appropriate parameter. As the nesting level increases handling this becomes tricky. We need to write some custom code to build that intelligence. I will work on that. Thanks a lot Clas :). Appreciate your help. – user3796454 May 04 '16 at 08:05