
I have a complex string coming from the UI like:

(region = "asia") AND ((status = null) OR ((inactive = "true") AND (department = "aaaa")) OR ((costcenter = "ggg") OR (location = "india")))

I need to split it and use the parts in my code, but the split has to respect the parentheses so that the grouping is preserved exactly as shown. On each iteration I want to break the expression down one level further, like this:

First time:

(region = "asia") AND 

((status = null) OR ((inactive = "true") AND (department = "aaaa")) OR ((costcenter = "ggg") OR (location = "india")))

Second time:

(region = "asia") AND 

(

(status = null) OR 

((inactive = "true") AND (department = "aaaa")) OR 
((costcenter = "ggg") OR (location = "india"))

)

and so on...

Any pointers on how to achieve this?

Lucas Trzesniewski
Upendra k
  • Context-free grammars. Consider using a parser generator tool like ANTLR. See the similar question and relevant answers at "Can regular expressions be used to match nested patterns?": http://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns – Andy Thomas Nov 17 '14 at 14:23
  • The link in the comment above points to a theoretical answer. But yes, you should implement a parser for this. – Lucas Trzesniewski Nov 17 '14 at 14:27

1 Answer


As it seems you are not willing to go in full-fledged parsing, and a single regex cannot handle arbitrarily nested parentheses, a step-wise reduction might do.

Here a list of variables is built. Each innermost (...) group is replaced by a placeholder of the form @i, and the i-th list entry holds the inner text of that group, which may itself contain earlier placeholders.

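import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParenReducer { // the class name is arbitrary

    // Matches an innermost (...) group, i.e. one without parentheses inside it.
    // Note: parentheses inside quoted string values would confuse this pattern.
    static final Pattern BRACED_REDEX = Pattern.compile("\\(([^()]*)\\)");

    // Replaces innermost groups by @i placeholders, one at a time, until no
    // parentheses are left. vars.get(i) holds the inner text of group @i.
    static String parse(String exp, List<String> vars) {
        for (;;) {
            Matcher m = BRACED_REDEX.matcher(exp);
            if (!m.find()) {
                break;
            }
            String var = "@" + vars.size();
            vars.add(m.group(1));
            // "@" plus digits contains no '$' or '\', so it is a safe replacement string.
            exp = m.replaceFirst(var);
        }
        vars.add(exp); // Add last unreduced expr too.
        return exp;
    }

    public static void main(String[] args) {
        String exp = "(region = \"asia\") AND ((status = null) OR ((inactive = \"true\") "
            + "AND (department = \"aaaa\")) OR ((costcenter = \"ggg\") OR "
            + "(location = \"india\")))";
        List<String> vars = new ArrayList<>();
        exp = parse(exp, vars);
        System.out.println("Root expression: " + exp);
        for (int i = 0; i < vars.size(); ++i) {
            System.out.printf("@%d = %s%n", i, vars.get(i));
        }
    }
}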

This will give

Root expression: @0 AND @8
@0 = region = "asia"
@1 = status = null
@2 = inactive = "true"
@3 = department = "aaaa"
@4 = @2 AND @3
@5 = costcenter = "ggg"
@6 = location = "india"
@7 = @5 OR @6
@8 = @1 OR @4 OR @7
@9 = @0 AND @8
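
The question asks for a top-down breakdown, one nesting level per iteration. Going from the list above back to that form could look roughly like the sketch below. It is only an illustration: the class name PlaceholderExpander and the method expandOnce are made up here, and it assumes that no quoted value contains text like "@1", which would be mistaken for a placeholder.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderExpander { // hypothetical helper, not part of the code above

    // A placeholder as produced by parse(): '@' followed by a list index.
    private static final Pattern VAR = Pattern.compile("@(\\d+)");

    // Expands every placeholder in expr by exactly one level and puts back
    // the parentheses that parse() stripped.
    static String expandOnce(String expr, List<String> vars) {
        Matcher m = VAR.matcher(expr);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String inner = vars.get(Integer.parseInt(m.group(1)));
            // quoteReplacement guards against '$' or '\' in the inner text.
            m.appendReplacement(sb, Matcher.quoteReplacement("(" + inner + ")"));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The list that parse() produced for the example expression.
        List<String> vars = Arrays.asList(
            "region = \"asia\"", "status = null", "inactive = \"true\"",
            "department = \"aaaa\"", "@2 AND @3", "costcenter = \"ggg\"",
            "location = \"india\"", "@5 OR @6", "@1 OR @4 OR @7", "@0 AND @8");

        String expr = vars.get(vars.size() - 1); // start at the root expression
        while (VAR.matcher(expr).find()) {       // stop when nothing is left to expand
            expr = expandOnce(expr, vars);
            System.out.println(expr);
        }
    }
}
```

Each pass of the loop corresponds to one "time" in the question: the first pass yields the two top-level operands, the next pass opens up the second operand, and so on until the original string is restored.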

For a full-fledged solution you could use the Java Scripting API and either borrow the JavaScript engine or make your own small scripting language.
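
A different route, closer to what the comments suggest, is a small hand-written recursive-descent parser that builds a tree directly. The following is a minimal sketch, not a complete solution. All names in it (BoolExprParser, Node, print) are invented for illustration, and it assumes the input looks like the example: every condition sits in its own parentheses and the only operators are AND and OR. It gives AND no precedence over OR; operators on one level are simply recorded in order.

```java
import java.util.ArrayList;
import java.util.List;

class BoolExprParser { // hypothetical name; grammar assumed from the example input

    // A leaf holds one condition, e.g. region = "asia".
    // An inner node holds its operands and the operators between them.
    static final class Node {
        final String leaf;
        final List<Node> operands = new ArrayList<>();
        final List<String> operators = new ArrayList<>();
        Node(String leaf) { this.leaf = leaf; }
    }

    private final String src;
    private int pos = 0;

    private BoolExprParser(String src) { this.src = src; }

    static Node parse(String src) {
        BoolExprParser p = new BoolExprParser(src);
        Node root = p.parseExpr();
        p.skipSpaces();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return root;
    }

    // expr := group ( ("AND" | "OR") group )*
    private Node parseExpr() {
        Node node = new Node(null);
        node.operands.add(parseGroup());
        for (;;) {
            skipSpaces();
            String op = src.startsWith("AND", pos) ? "AND"
                      : src.startsWith("OR", pos) ? "OR" : null;
            if (op == null) {
                break;
            }
            pos += op.length();
            node.operators.add(op);
            node.operands.add(parseGroup());
        }
        // A group around a single operand adds no structure, so unwrap it.
        return node.operands.size() == 1 ? node.operands.get(0) : node;
    }

    // group := "(" ( expr | condition ) ")"
    private Node parseGroup() {
        skipSpaces();
        expect('(');
        skipSpaces();
        boolean nested = pos < src.length() && src.charAt(pos) == '(';
        Node node = nested ? parseExpr() : new Node(parseCondition());
        skipSpaces();
        expect(')');
        return node;
    }

    // condition := text up to the closing ')'; parentheses inside "..." are skipped.
    private String parseCondition() {
        int start = pos;
        boolean inQuotes = false;
        while (pos < src.length() && (inQuotes || src.charAt(pos) != ')')) {
            if (src.charAt(pos) == '"') {
                inQuotes = !inQuotes;
            }
            pos++;
        }
        return src.substring(start, pos).trim();
    }

    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    // Prints the tree; each nesting level is indented two more spaces.
    static void print(Node n, String indent) {
        if (n.leaf != null) {
            System.out.println(indent + n.leaf);
            return;
        }
        for (int i = 0; i < n.operands.size(); i++) {
            if (i > 0) {
                System.out.println(indent + n.operators.get(i - 1));
            }
            print(n.operands.get(i), indent + "  ");
        }
    }

    public static void main(String[] args) {
        print(parse("(region = \"asia\") AND ((status = null) OR ((inactive = \"true\") "
            + "AND (department = \"aaaa\")) OR ((costcenter = \"ggg\") OR "
            + "(location = \"india\")))"), "");
    }
}
```

With a tree like this, "each iteration" in the question becomes one step down from a node to its operands, and the grouping from the parentheses is kept without any string surgery.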

Joop Eggen
  • I do not see this conveyed in the OP: *"you are not willing to go in full-fledged parsing"*. – Andy Thomas Nov 17 '14 at 15:15
  • @AndyThomas: Though weakened by "It seems, ..." that formulation indeed is a bit presumptuous. Maybe something grammar-based and compact like **StringTemplate** might be feasible too. My impression: if the OP first thinks of tackling this problem with regex, then terms like top-down parsing and grammars are probably not very attractive. _What is done here is already a bit of parsing (on LISP level): redex evaluation._ – Joop Eggen Nov 17 '14 at 15:25