0

I am trying to find the best way to address the issue of redundant string concatenation caused by using code of the following form:

logger.debug("Entering loop, arg is: " + arg) // @1

In most cases the logger.level is higher than debug and the arg.toString() and the string concatenation are a waste that user up cpu cycles and briefly use up memory.

Before the introduction of varargs the recommended approach was to test the logger level first:

if (logger.isDebugEnabled())    
  logger.debug("Entering loop, arg is: " + arg);  // @2

But now the preferred form is

  logger.debug("Entering loop, arg is: {}", arg); // @3

It is not very difficult to prefix each logger.debug with if (logger.isDebugEnabled()) (and its equivalent for the other methods) in a script, but I am trying to find the best way to convert the first form to the third.

Any suggestions? The challenge is to insert the correct number brace pairs {} in the format string. I wish logback would append the remaining arguments not covered by the placeholder at the end but I cannot find a reference that it does that.

As an alternative, I am thinking to write a class Concatenator as pasted at end and convert the first form to

logger.debug(new Concatenator("Entering loop, arg is: ", arg)); // @4

The Concatenator class delays the call to arg.toString() and string concatenation until the logger calls toString(), thereby avoiding both if the logger is at a higher filter level. It does add the overhead of creating an Object[] and a Concatenator but that should be cheaper than the alternative.

Questions:

  • I think this conversion (@1->@4 -- replace + with , and enclose in new Contatenator(...)) is much easier. Is there something I am missing?
  • Am I correct that @4 is much better than @1?
public class Concatenator {
    final Object[] input;
    String output;
    public Concatenator(Object... input) {
        this.input = input;
    }
    public String toString() {
        if (output == null) {
            StringBuffer b = new StringBuffer();
            for (Object s : input) b.append(s.toString());
            output = b.toString();
        }
        return output;
    }
    public static void main(String args[]) {
        new Concatenator("a", "b", new X());
        System.out.println(new Concatenator("c", "d", new X()));
    }
}
class X {
    public String toString() {
        System.out.println("X.toString");
        return super.toString();
    }
}
Richard Fearn
  • 25,073
  • 7
  • 56
  • 55
Miserable Variable
  • 28,432
  • 15
  • 72
  • 133
  • I'm really confused as to what you're asking. In addition, you have the exact same string concatenation occurring in your "@4" example as you do in "@1" – Brian Roach Feb 12 '14 at 04:05
  • 2
    @BrianRoach thanks for pointing out my error in `@4`, I meant to convert the `+` to `,`. I will update the question and add some explanation. – Miserable Variable Feb 12 '14 at 04:14
  • So ... I mean, is it that hard to count your args? I'm still missing something, maybe? The code you've written is fine, I've just never found it a problem to say "I've got 4 things, I need 4 `{}` in my format string". It's no more unreasonable requirement than when using `printf`. – Brian Roach Feb 12 '14 at 04:21
  • @BrianRoach that is exactly what I am wondering -- is it easy to count the number of parameters and insert brace-pairs in a script? I might have hundreds of occurrences and don't want to do it manually. On the other hand, my `Concatenator` is quite likely an abomination. – Miserable Variable Feb 12 '14 at 04:26
  • Oh, ok. I get it now. I think my brain was just tired. You want to write something to go through the source files and do the conversion for you and are trying to simplify what it has to do? – Brian Roach Feb 12 '14 at 14:01
  • Yes, thanks for writing. I was wondering if I had gone completely bonkers myself :) – Miserable Variable Feb 12 '14 at 14:18
  • Someone with more prowess in REGEX than me should be able to define one that can at least do the counting of the required arguments. Assuming that your logging output always starts with a String (and not an object, like `stuff + "String..."`), you can abstract the following rules: (i) A parameter within the String results in two `+` (and is conveniently positioned between them for extraction). (ii) A parameter at the end results in a single `+`. So counting the number of `+`-signs and dividing by 2 (rounding up) gives you the number of parameters for a given logging statement. – sheltem Feb 12 '14 at 16:58
  • Cont.: A conversion would then work by counting and extracting the args via a regex, combining the String-parts (by replacing the 'holes' in them left by the extracted args with a `{}`) and then adding the args at the end - separated by commas. I am pretty sure it's possible with regex, just not for me, hence the comment and not answer. ;) – sheltem Feb 12 '14 at 17:00
  • @sheltem I think things like counting occurrences and adding a `{}` in the first string for each `+` is quite difficult with regexp -- think about line breaks and stuff. I can see how I can jump around `sexp` in emacs-lisp to do what I wrote -- and even that may break in some cases -- but I think anything more would probably required parsing. – Miserable Variable Feb 12 '14 at 17:11
  • Sure, there are a few problems with that approach (like the line-breaks adding `+`s), so ultimately the accuracy of that method depends on your code base. 100 % accuracy with special cases factored in will make it a lot more complex, but probably still doable in the end. Only you can say if it's worth the effort. I also like the Concatenator-idea as described by Brian. Even if it mimicks the behaviour of slf4j, it should be a lot easier to implement than the search-and-replace-regex. – sheltem Feb 12 '14 at 17:30

1 Answers1

1

Unfortunately your approach isn't going to change anything. In fact, it introduces an additional object instantiation/allocation (your Concatenator). You're also using StringBuffer which introduces synchronization overhead you don't need.

The problem is the method signature for SLF4J's Logger.debug() calls. The first argument is always a String. This means you're going to have to call:

logger.debug(new Concatenator("Entering loop, arg is: ", arg).toString());

which means ... you're doing exactly the same thing as Java is going to do, but with more overhead.

The Java compiler handles the String concatenation operator (+) by creating a StringBuilder and doing exactly what you're doing in your Concatenator class on toString().

logger.debug("Entering loop, arg is: " + arg);

becomes:

logger.debug(new StringBuilder()
                 .append("Entering loop, arg is: ")
                 .append(arg).toString());

(If you use javap to look at the generated bytecode, you'll see that's the case.)

So, your current approach is going to be more expensive than what you have now.

Edit: So, the way you could make this work is by doing ...

logger.debug("{}", new Concatenator("Entering loop, arg is: ", arg));

This way your Concatenator is passed as an Object and its toString() not called unless the logger needs to. Also, replace the StringBuffer in your class with StringBuilder.

And if I didn't answer your question directly ... is this better than the original? Probably; The string concatenation isn't occurring unless it needs to. You are, however, introducing an object instantiation/allocation. The only real way to see the differences would be to profile it / write a benchmark.

Brian Roach
  • 76,169
  • 12
  • 136
  • 161
  • Thanks. My example is broken with the problem of the first argument required to be a `String` -- which I would have figured out if I had actually tried it :) Your suggestion of additional parameter solves that nicely. `StringBuffer` and `StringBuilder` probably perform pretty similarly with local variables. I am pretty sure the additional object creation more than compensates for the savings from the eliminated toString/string-concat and even justifies the overhead for the cases when the `Concatenator.toString()`. Comments? – Miserable Variable Feb 12 '14 at 16:21
  • I'd just re-iterate that the overhead in StringBuffer is *not* inconsequential; local variables have nothing to do with it. That class does internal synchronizing which is literally going to be on an order of magnitude more expensive - that's why StringBuilder was added in 1.5. As for performance, I agree it *feels/seems* like this will be cheaper, but the only true measurement is ... measuring :) (and keep in mind writing a measurement is tricky with the JIT. If you tight loop this, for example, the JIT will optimize away the overhead in StringBuffer, where it wouldn't otherwise) – Brian Roach Feb 12 '14 at 17:16
  • local variable has very much to do with it -- if a static analysis can show that the reference cannot escape the method then synchronization calls can be bypassed. A quick search gives [one](http://kaioa.com/node/59) article showing negligible difference and [another](http://stackoverflow.com/a/2771852/18573) showing significant difference, though I suspect still not a major art of the total time. Anyway, my argument was not for this or that but about the overall approach. – Miserable Variable Feb 12 '14 at 18:20