
I have a linear expression from which I have to extract the operators at specific places. I don't need to extract all the operators. For example, my expression is

c*(a+b)+(a-b)/log(a+b)-(b-c/d)+(d-tan90)

The operators inside the brackets don't need to be extracted. Only the operators between two top-level elements should be extracted, so my desired output is

*,+,/,-,+

Can anyone help?

Subho
  • Advice: don't use regexes to "parse" expressions. Either find an existing expression parser, or write one yourself. – Stephen C Sep 21 '14 at 03:21
  • can you suggest one? – Subho Sep 21 '14 at 03:27
  • While this isn't as bad as trying to [parse HTML with regex](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454), you'd still be better off using or writing a parser like @StephenC suggests. It should be fairly trivial to iterate through the expression and throw away everything in between parentheses and store off the operators. – azurefrog Sep 21 '14 at 03:27
  • Do you need all the operators or just the first one? – Mzf Sep 21 '14 at 03:28
  • @Mzf No, not only the first one. In this case I need the output *,+,/,-,+, and the expression may be bigger in my actual problem. – Subho Sep 21 '14 at 03:30
  • @Subho If someone provides a regex solution, it would work only for the current example. Is that OK for you? – Avinash Raj Sep 21 '14 at 03:38
  • @Subho - google for "java expression parser" and take your pick. – Stephen C Sep 21 '14 at 03:39
  • @AvinashRaj absolutely not – Subho Sep 21 '14 at 03:40
  • @azurefrog - and that "solution" would be trivially broken by adding parentheses in inconvenient places. Unsatisfactory ... – Stephen C Sep 21 '14 at 03:41
  • @Subho then you need to provide all the possibilities. – Avinash Raj Sep 21 '14 at 03:42
  • OK, can you provide a solution for this one? Then I can get an idea. – Subho Sep 21 '14 at 03:42
  • @AvinashRaj - ... which is probably impossible. Hence the futility of attempting to do this using regexes. – Stephen C Sep 21 '14 at 03:43

2 Answers


Assuming there are no nested parentheses, you can do this by removing the character sequences you don't need. The character sequences you don't need are:

  • Any sequence starting with ( and ending with );
  • Any other character that isn't an operator.

You can throw out all those sequences using replaceAll. This statement will set operators to a string with all those removed, i.e. "*+/-+":

    operators = inputString.replaceAll("\\([^)]*\\)|[^-+*/]", "");

This causes any sequence composed of a (, followed by zero or more non-) characters, followed by ) to be replaced with ""; it also causes any character that is not -, +, *, or /, to be replaced with "". The first alternative is tested first, so the second one will only affect characters that aren't in parentheses. Note that the hyphen in [^-+*/] comes first, before any other characters, so that the - isn't interpreted as indicating a range of characters.
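As a quick way to try the statement above, here is a minimal, self-contained sketch that wraps the same regex in a small class. The class name `RegexOperatorExtractor` and the method name `topLevelOperators` are made up for illustration; only the pattern itself comes from this answer, and it still assumes there are no nested parentheses.

```java
import java.util.regex.Pattern;

class RegexOperatorExtractor {
    // Alternative 1 matches a whole "( ... )" group that contains no ')' inside it;
    // alternative 2 matches any single character that is not one of - + * /.
    private static final Pattern NOT_TOP_LEVEL_OPERATOR =
            Pattern.compile("\\([^)]*\\)|[^-+*/]");

    // Hypothetical helper: removes everything the pattern matches,
    // leaving only the operators found outside parentheses.
    static String topLevelOperators(String expression) {
        return NOT_TOP_LEVEL_OPERATOR.matcher(expression).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "c*(a+b)+(a-b)/log(a+b)-(b-c/d)+(d-tan90)";
        String operators = topLevelOperators(input);
        System.out.println(operators);                              // *+/-+
        // Split into single characters and join with commas (Java 8+).
        System.out.println(String.join(",", operators.split("")));  // *,+,/,-,+
    }
}
```

Compiling the pattern once is only a small convenience if the method is called repeatedly; calling `replaceAll` directly on the string, as in the statement above, gives the same result.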

If nested parentheses are a possibility, then don't use regexes. Regexes in Java cannot handle nested constructs. (I think there are some languages that support regex features that do handle them, but not Java. At least not the standard Java runtime. There could be a third-party Java library somewhere that supports it.) azurefrog's answer is the best approach.
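To make the nesting problem concrete, here is a small sketch that runs the same `replaceAll` pattern on an input with nested parentheses. The input string and the names `NestedParenthesesDemo` and `stripWithRegex` are invented for this illustration.

```java
class NestedParenthesesDemo {
    // Same pattern as in the replaceAll statement in this answer.
    static String stripWithRegex(String expression) {
        return expression.replaceAll("\\([^)]*\\)|[^-+*/]", "");
    }

    public static void main(String[] args) {
        // The only operators outside all parentheses are * and +.
        String nested = "a*((b+c)-d)+e";
        // The first alternative stops at the first ')', so it removes "((b+c)"
        // and leaves "-d)" behind. The '-' then survives as if it were top level.
        System.out.println(stripWithRegex(nested));  // *-+
    }
}
```

The extra `-` in the result is exactly the kind of error a depth counter avoids.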

Note: Now tested.

ajb
  • While nested parentheses cannot be handled using classical regular expressions, Java `Pattern` regexes are more powerful than that (from a theoretical standpoint). However, a regex that can cope with a recursive grammar is liable to be too **gnarly**; i.e. too horribly complex for most people to read / write. You should avoid that approach *for that reason*. – Stephen C Sep 21 '14 at 03:48

If all you really need is the operators, I think an expression parser is overkill.

It's simple to just loop through the characters and store off the operators. The only (minor) complexity is keeping track of a count of parentheses.

This snippet will give you the output desired, and will also work if you end up with nested expressions:

    // Needs: import java.util.ArrayList; import java.util.List;
    String expression = "c*(a+b)+(a-b)/log(a+b)-(b-c/d)+(d-tan90)";
    List<Character> operators = new ArrayList<Character>();
    int parentheses = 0;
    for (char c : expression.toCharArray()) {
        // throw away everything inside ( )
        if (c == '(') {
            parentheses++;
        } else if (c == ')') {
            parentheses--;
        }
        if (parentheses > 0) {
            continue;
        }

        // store operators outside ( )
        if (c == '+' || c == '-' || c == '*' || c == '/') {
            operators.add(c);
        }
    }
    System.out.println(operators);  // [*, +, /, -, +]

Note that I'm assuming you're working on a valid mathematical expression here. If you aren't sure that you're going to get good input, you'd need to validate it.
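As one possible starting point for that validation, here is a minimal sketch that only checks whether the parentheses are balanced. It does not check anything else about the expression (operands, operator placement, function names), and the names `ParenthesesCheck` and `hasBalancedParentheses` are hypothetical.

```java
class ParenthesesCheck {
    // Returns true if every ')' closes an earlier '(' and no '(' is left open.
    static boolean hasBalancedParentheses(String expression) {
        int depth = 0;
        for (char c : expression.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false;  // a ')' appeared before its matching '('
                }
            }
        }
        return depth == 0;  // anything above zero means an unclosed '('
    }

    public static void main(String[] args) {
        System.out.println(hasBalancedParentheses("c*(a+b)+(a-b)/log(a+b)"));  // true
        System.out.println(hasBalancedParentheses("c*(a+b"));                  // false
        System.out.println(hasBalancedParentheses("c*a)+(b"));                 // false
    }
}
```

Running a check like this before the loop means the parentheses counter can never go negative or end above zero, which is what the loop silently relies on.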

If you're planning on doing anything fancier, you might want to use an expression parser (such as Jep or Formula4J).

azurefrog