I want to reformat the brackets in a string, adding tabs and newlines, like a pretty-printer. For example:

 (foo) AND ((bar) OR (baz))

becomes

     (
          foo
     )
AND
     (
          (
               bar
          )
     OR
          (
               baz
          )
     )

I have tried:

   "((foo) OR ((bar)(baz)))".replaceAll("\\((.*?)\\)", "\\(\n\t$1\n\\)")

but it doesn't quite work.
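
To make the failure concrete, here is a minimal sketch that runs the replacement above and prints the result with the inserted whitespace made visible. The class name `RegexAttemptDemo` and the method `attempt` are hypothetical names used only for this illustration.

```java
final class RegexAttemptDemo {

    // The replacement attempted above, applied unchanged.
    static String attempt(String input) {
        return input.replaceAll("\\((.*?)\\)", "\\(\n\t$1\n\\)");
    }

    public static void main(String[] args) {
        String result = attempt("((foo) OR ((bar)(baz)))");
        // Show newlines and tabs as \n and \t so the structure is easy to read.
        System.out.println(result.replace("\n", "\\n").replace("\t", "\\t"));
    }
}
```

The lazy `.*?` stops at the first `)` it meets, so the first match is `((foo)`: the outermost opening bracket gets paired with the innermost closing one. Every match also receives exactly one tab regardless of how deeply it is nested, and the trailing `))` is never matched at all.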

JaceyB
  • The grammar of your parenthesized expression isn't regular and so you can't use a regular expression to parse and modify it *according to its syntax*. You may have to write a parser, e.g. one using recursive descent or similar. – laune Mar 25 '15 at 18:55
  • Don't try to parse non-regular things with regular expressions, or [this](http://stackoverflow.com/a/1732454/1361506) will happen to you. – azurefrog Mar 25 '15 at 19:28

1 Answer

laune and azurefrog are right. The expression you provided is not regular, so you can't use a regular expression engine to prettify it. In my opinion the best you can do is write a parser that handles this task. I always try to separate the parser logic from the code that deals with the business concern; notice that the formatting logic sits in an implementation of ExpressionListener. The code is below.

public class PrettyPrintExample {

    private interface ExpressionListener {

        void lb();

        void rb();

        void content(String content);

    }

    private enum Type {
        LB, RB, STRING, END
    }

    private static class Token {
        Type type;
        String value;

        public Token(Type type, String value) {
            super();
            this.type = type;
            this.value = value;
        }

        @Override
        public String toString() {
            return "Token [type=" + type + ", value=" + value + "]";
        }
    }

    private static class Lexer {

        private int current;
        private String input;

        public Lexer(String input) {
            this.input = input;
        }

        private char getChar() {
            return input.charAt(current++);
        }

        private void unputChar() {
            current--;
        }

        private boolean hasNextChar() {
            return current < input.length();
        }

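        Token next() {

            // Skip leading whitespace without reading past the end of the input
            // (trailing whitespace would otherwise throw StringIndexOutOfBoundsException).
            while (hasNextChar() && Character.isWhitespace(input.charAt(current))) {
                current++;
            }

            if (!hasNextChar()) {
                return new Token(Type.END, "");
            }

            char c = getChar();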

            if (c == '(') {
                return new Token(Type.LB, "(");
            }

            if (c == ')') {
                return new Token(Type.RB, ")");
            }

            unputChar();

            StringBuilder buffer = new StringBuilder();
            while (hasNextChar()) {

                c = getChar();

                if (c != '(' && c != ')' && !Character.isWhitespace(c)) {
                    buffer.append(c);
                } else {
                    unputChar();
                    break;
                }

            }

            return new Token(Type.STRING, buffer.toString());

        }
    }

    private static Lexer lexer;
    private static Token currentToken;

    public static void parse(String line, ExpressionListener listener) {
        lexer = new Lexer(line);
        currentToken = lexer.next();
        expression(listener);
        consume(Type.END);
    }

    private static void expression(ExpressionListener listener) {

        while (true) {

            if (match(Type.STRING)) {
                listener.content(currentToken.value);
                consume(Type.STRING);
            } else if (match(Type.LB)) {
                consume(Type.LB);
                listener.lb();
                expression(listener);
                consume(Type.RB);
                listener.rb();
            } else {
                break;
            }

        }

    }

    private static boolean match(Type type) {
        return type == currentToken.type;
    }

    private static void consume(Type type) {
        if (!match(type)) {
            throw new RuntimeException(String.format("Expected %s but found %s", type.name(), currentToken.type.name()));
        }
        currentToken = lexer.next();
    }

    public static void main(String[] args) {
        String line = "(foo) AND ((bar) OR (baz))";
        parse(line, new ExpressionListener() {

            private int indent = 0;

            private void indent(int indent) {
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < indent; i++) {
                    builder.append("\t");
                }
                System.out.print(builder.toString());
            }

            private void nl() {
                System.out.println();
            }

            @Override
            public void lb() {
                indent(++indent);
                System.out.print("(");
                nl();
            }

            @Override
            public void rb() {
                indent(indent);
                System.out.print(")");
                indent--;
                nl();
            }

            @Override
            public void content(String content) {
                indent(indent);
                System.out.print(content);
                nl();
            }
        });

    }

}

Output:

    (
    foo
    )
AND
    (
        (
        bar
        )
    OR
        (
        baz
        )
    )
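
Note that this output prints each operand at the same indentation as its brackets, whereas the question asks for the operand one level deeper. As a minimal sketch of one way to get the requested layout, the standalone class below walks the input once with a depth counter. It rests on an assumption inferred from the single example in the question: a word that directly follows `(` is an operand and goes one level deeper than its bracket, while any other word (such as `AND` or `OR`) stays at the current depth. The names `DepthPrettyPrinter`, `format`, `line` and `INDENT` are hypothetical and not part of the code above.

```java
final class DepthPrettyPrinter {

    private static final String INDENT = "\t"; // assumed indentation unit

    static String format(String input) {
        StringBuilder out = new StringBuilder();
        StringBuilder word = new StringBuilder();
        int depth = 0;
        boolean afterOpen = false; // true while the previous token was "("

        for (int i = 0; i <= input.length(); i++) {
            // A sentinel space past the last character flushes a trailing word.
            char c = i < input.length() ? input.charAt(i) : ' ';

            if (c != '(' && c != ')' && !Character.isWhitespace(c)) {
                word.append(c);
                continue;
            }
            if (word.length() > 0) {
                // Assumption: a word right after "(" is an operand (one level
                // deeper); any other word is an operator at the current depth.
                line(out, afterOpen ? depth + 1 : depth, word.toString());
                word.setLength(0);
                afterOpen = false;
            }
            if (c == '(') {
                depth++;
                line(out, depth, "(");
                afterOpen = true;
            } else if (c == ')') {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unbalanced ')' at index " + i);
                }
                line(out, depth, ")");
                depth--;
                afterOpen = false;
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Missing " + depth + " closing bracket(s)");
        }
        return out.toString();
    }

    // Appends one output line indented by the given number of levels.
    private static void line(StringBuilder out, int level, String text) {
        for (int i = 0; i < level; i++) {
            out.append(INDENT);
        }
        out.append(text).append('\n');
    }

    public static void main(String[] args) {
        System.out.print(format("(foo) AND ((bar) OR (baz))"));
    }
}
```

The same rule could probably be moved into an ExpressionListener implementation instead, by remembering in `lb()` that the next `content` call follows an opening bracket; the standalone form is used here only to keep the sketch self-contained.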

I know that a context-free-language parser is not the simplest possible solution for your case, but if your expressions become more complex, the parser will be easy to extend.

slavik