5

I created this class to be immutable and to offer a fluent API:

public final class Message {
    public final String email;
    public final String escalationEmail;
    public final String assignee;
    public final String conversationId;
    public final String subject;
    public final String userId;

    public Message(String email, String escalationEmail, String assignee, String conversationId, String subject, String userId) {
        this.email = email;
        this.escalationEmail = escalationEmail;
        this.assignee = assignee;
        this.conversationId = conversationId;
        this.subject = subject;
        this.userId = userId;
    }

    public Message() {
        email = "";
        escalationEmail = "";
        assignee = "";
        conversationId = "";
        subject = "";
        userId = "";
    }

    public Message email(String e) { return new Message(e, escalationEmail, assignee, conversationId, subject, userId); }
    public Message escalationEmail(String e) { return new Message(email, e, assignee, conversationId, subject, userId); }
    public Message assignee(String a) { return new Message(email, escalationEmail, a, conversationId, subject, userId); }
    public Message conversationId(String c) { return new Message(email, escalationEmail, assignee, c, subject, userId); }
    public Message subject(String s) { return new Message(email, escalationEmail, assignee, conversationId, s, userId); }
    public Message userId(String u) { return new Message(email, escalationEmail, assignee, conversationId, subject, u); }

}

My question is: will the optimizer be able to avoid creating all the intermediate objects when a new object is built like this:

Message m = new Message()
    .email("foo@bar.com")
    .assignee("bar@bax.com")
    .subject("subj");

Is there anything to be gained from making a separate mutable builder object instead?

Update 2: After reading apangin's answer my benchmark is invalidated. I'll keep it here for reference of how not to benchmark :)

Update: I took the liberty of measuring this myself with this code:

public final class Message {
public final String email;
public final String escalationEmail;
public final String assignee;
public final String conversationId;
public final String subject;
public final String userId;

public static final class MessageBuilder {
    private String email;
    private String escalationEmail;
    private String assignee;
    private String conversationId;
    private String subject;
    private String userId;

    MessageBuilder email(String e) { email = e; return this; }
    MessageBuilder escalationEmail(String e) { escalationEmail = e; return this; }
    MessageBuilder assignee(String e) { assignee = e; return this; }
    MessageBuilder conversationId(String e) { conversationId = e; return this; }
    MessageBuilder subject(String e) { subject = e; return this; }
    MessageBuilder userId(String e) { userId = e; return this; }

    public Message create() {
        return new Message(email, escalationEmail, assignee, conversationId, subject, userId);
    }

}

public static MessageBuilder createNew() {
    return new MessageBuilder();
}

public Message(String email, String escalationEmail, String assignee, String conversationId, String subject, String userId) {
    this.email = email;
    this.escalationEmail = escalationEmail;
    this.assignee = assignee;
    this.conversationId = conversationId;
    this.subject = subject;
    this.userId = userId;
}

public Message() {
    email = "";
    escalationEmail = "";
    assignee = "";
    conversationId = "";
    subject = "";
    userId = "";
}

public Message email(String e) { return new Message(e, escalationEmail, assignee, conversationId, subject, userId); }
public Message escalationEmail(String e) { return new Message(email, e, assignee, conversationId, subject, userId); }
public Message assignee(String a) { return new Message(email, escalationEmail, a, conversationId, subject, userId); }
public Message conversationId(String c) { return new Message(email, escalationEmail, assignee, c, subject, userId); }
public Message subject(String s) { return new Message(email, escalationEmail, assignee, conversationId, s, userId); }
public Message userId(String u) { return new Message(email, escalationEmail, assignee, conversationId, subject, u); }


static String getString() {
    return new String("hello");
    // return "hello";
}

public static void main(String[] args) {
    int n = 1000000000;

    long before1 = System.nanoTime();

    for (int i = 0; i < n; ++i) {
        Message m = new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    long after1 = System.nanoTime();

    long before2 = System.nanoTime();

    for (int i = 0; i < n; ++i) {
        Message m = Message.createNew()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString())
                .create();
    }

    long after2 = System.nanoTime();



    System.out.println("no builder  : " + (after1 - before1)/1000000000.0);
    System.out.println("with builder: " + (after2 - before2)/1000000000.0);
}


}

I found the difference to be significant (the builder is faster) when the string arguments are all the same object rather than new objects (see the commented-out line in getString).

In what I imagine is a more realistic scenario, where all the strings are new objects, the difference is negligible, and whichever loop runs first is a tiny bit slower, presumably because of JVM startup (I tried both orders).

With "new String", the code was also many times slower overall (I had to decrease n), perhaps indicating that some optimization of "new Message" is going on, but not of "new String".

Inego
morten
  • What you might also gain from a builder is to avoid having to rewrite N methods when you add an (N+1)th field to your class. – khelwood Jul 22 '16 at 08:09
  • That is why the builder pattern is not doing it like this ... in the end, you create a lot of new objects; so if this code runs "often"; it will be creating "garbage" constantly. – GhostCat Jul 22 '16 at 08:09
  • @GhostCat That is my concern, but it is possible the jit can optimize that away. If so, I prefer this way. – morten Jul 22 '16 at 08:20
  • Short answer: "no". Long answer: "Noooooooooooooooooooooooo. It can't". – GhostCat Jul 22 '16 at 08:43

5 Answers

18

Yes, HotSpot JIT can eliminate redundant allocations in a local context.

This optimization is provided by Escape Analysis, enabled since JDK 6u23. It is often confused with on-stack allocation, but it is in fact much more powerful: rather than merely placing objects on the stack, it can eliminate the allocation altogether by replacing the object's fields with variables (Scalar Replacement), which are then subject to further optimizations.

The optimization is controlled by the -XX:+EliminateAllocations JVM option, which is ON by default.


Thanks to allocation elimination, both of your ways of creating a Message object work effectively the same way: neither allocates intermediate objects, only the final one.
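
To make the idea concrete, here is a rough hand-written illustration of what scalar replacement amounts to. It is only a conceptual sketch, not actual compiler output: Msg is a hypothetical three-field stand-in for Message, and afterScalarReplacement is a made-up name for code that keeps the intermediate objects' fields in local variables.

```java
// Hypothetical three-field stand-in for the Message class from the question.
final class Msg {
    final String email;
    final String assignee;
    final String subject;

    Msg(String email, String assignee, String subject) {
        this.email = email;
        this.assignee = assignee;
        this.subject = subject;
    }

    Msg() { this("", "", ""); }

    Msg email(String e)    { return new Msg(e, assignee, subject); }
    Msg assignee(String a) { return new Msg(email, a, subject); }
    Msg subject(String s)  { return new Msg(email, assignee, s); }
}

final class ScalarReplacementSketch {
    // What the source code says: four allocations, three of them short-lived.
    static Msg fluent(String e, String a, String s) {
        return new Msg().email(e).assignee(a).subject(s);
    }

    // Conceptually what is left once the calls are inlined and the
    // non-escaping intermediates are replaced by plain local variables.
    static Msg afterScalarReplacement(String e, String a, String s) {
        String email = "";      // "fields" of the initial new Msg()
        String assignee = "";
        String subject = "";
        email = e;              // .email(e) changes one "field" only
        assignee = a;           // .assignee(a)
        subject = s;            // .subject(s)
        return new Msg(email, assignee, subject); // the only object that escapes
    }

    public static void main(String[] args) {
        Msg x = fluent("foo@bar.com", "bar@bax.com", "subj");
        Msg y = afterScalarReplacement("foo@bar.com", "bar@bax.com", "subj");
        System.out.println(x.email.equals(y.email)
                && x.assignee.equals(y.assignee)
                && x.subject.equals(y.subject));
    }
}
```

Both methods return an object with the same field values; the difference is only in how many objects exist along the way.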

Your benchmark shows misleading results because it falls into several common microbenchmarking pitfalls:

  • it incorporates several benchmarks in a single method;
  • it measures an OSR stub instead of the final compiled version;
  • it does not do warm-up iterations;
  • it does not consume results, etc.

Let's measure it correctly with JMH. As a bonus, JMH has an allocation profiler (-prof gc) which shows how many bytes are really allocated per iteration. I've added a third test that runs with the EliminateAllocations optimization disabled to show the difference.

package bench;

import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
public class MessageBench {

    @Benchmark
    public Message builder() {
        return Message.createNew()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString())
                .create();
    }

    @Benchmark
    public Message immutable() {
        return new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    @Benchmark
    @Fork(jvmArgs = "-XX:-EliminateAllocations")
    public Message immutableNoOpt() {
        return new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    private String getString() {
        return "hello";
    }
}

Here are the results. Both builder and immutable perform equally and allocate just 40 bytes per iteration (exactly the size of one Message object).

Benchmark                                        Mode  Cnt     Score     Error   Units
MessageBench.builder                             avgt   10     6,232 ±   0,111   ns/op
MessageBench.immutable                           avgt   10     6,213 ±   0,087   ns/op
MessageBench.immutableNoOpt                      avgt   10    41,660 ±   2,466   ns/op

MessageBench.builder:·gc.alloc.rate.norm         avgt   10    40,000 ±   0,001    B/op
MessageBench.immutable:·gc.alloc.rate.norm       avgt   10    40,000 ±   0,001    B/op
MessageBench.immutableNoOpt:·gc.alloc.rate.norm  avgt   10   280,000 ±   0,001    B/op
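
The 40 and 280 figures can be cross-checked with a little arithmetic. The sketch below assumes a typical 64-bit HotSpot layout with compressed references (12-byte object header, 4-byte references, 8-byte alignment); those constants and the class name AllocationArithmetic are assumptions for illustration, not something the benchmark itself reports.

```java
// Back-of-the-envelope object size calculation under assumed layout constants.
final class AllocationArithmetic {
    static final int HEADER_BYTES = 12;    // assumed: mark word + compressed class pointer
    static final int REFERENCE_BYTES = 4;  // assumed: compressed references
    static final int ALIGNMENT = 8;        // assumed: default object alignment

    // Size of an object that holds only reference fields, rounded up to the alignment.
    static int objectSize(int referenceFields) {
        int raw = HEADER_BYTES + referenceFields * REFERENCE_BYTES;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        int messageSize = objectSize(6);        // Message has six String fields
        int objectsWithoutElimination = 1 + 6;  // new Message() plus six chained calls
        System.out.println(messageSize);                              // 40
        System.out.println(messageSize * objectsWithoutElimination);  // 280
    }
}
```

Under these assumptions, 280 B/op corresponds to seven Message objects per iteration, and 40 B/op to just the final one.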
apangin
  • I changed the accepted answer to this one. Thank you, I have learnt much more than I asked for reading this answer. – morten Jul 26 '16 at 09:07
0

My understanding is that the JIT compiler works by rearranging the existing code and performing basic statistical analysis. I don't think, though, that the JIT compiler can optimize object allocation.

Your Builder is incorrect, and your fluent API will not work the way you expect (creating just a single object per build).

You need to have something like:

  public class Message {
      public final String email;
      public final String escalationEmail;

      private Message(String email, String escalationEmail) {
          this.email = email;
          this.escalationEmail = escalationEmail;
      }

      public static class Builder {
          private String email;
          private String escalation;

          public static Builder createNew() {
              return new Builder();
          }

          public Builder withEmail(String email) {
              this.email = email;
              return this;
          }

          public Builder withEscalation(String escalation) {
              this.escalation = escalation;
              return this;
          }

          // Fails fast on an invalid email; returns this so the chain can continue.
          public Builder validate() {
              if (this.email == null || this.email.length() < 7) {
                  throw new RuntimeException("invalid email");
              }
              return this;
          }

          // The only place where a Message is actually allocated.
          public Message build() {
              return new Message(this.email, this.escalation);
          }
      }
  }

Then you can write something like:

Message.Builder.createNew()
               .withEmail("example@email.com")
               .withEscalation("escalation")
               .validate()
               .build();
Alexander Petrov
  • In what way will it not work the way I expect? Are you saying a Builder is necessary for performance? – morten Jul 22 '16 at 08:15
  • A Builder is essential for performance and memory here. You will create a single object at the end of the build operation. Your build will also be atomic. You can even add validation with a validate method if you want to, and the build will not proceed if the object is not valid. It is extremely flexible – Alexander Petrov Jul 22 '16 at 08:16
  • Your implementation generates a new object for each field that is set. Not good. Not to mention that there is no way to perform validation or anything more complicated. Another advantage of the Builder pattern is that you can have many different Builders based on your object type, so it can come in very handy when you have many default parameters. – Alexander Petrov Jul 22 '16 at 08:22
  • You have very good points about the flexibility of having a builder. In this instance that is something I do not need. I probably don't really need the performance either. The fluent api is just an example however. What I am really looking for is to know whether the jit (or even bytecode compiler) can optimize that away, also in other circumstances. I know there are certain times it can, but I do not have a good understanding of when it can and when it cannot. If I read between the lines, the answer is no it seems. – morten Jul 22 '16 at 08:30
  • *"I don't think though that the JIT compiler can optimize object allocation."* - of course, it can. The optimization is called `-XX:+EliminateAllocations` and it is ON by default. – apangin Jul 25 '16 at 21:57
0

First, your code doesn't use a builder approach and generates a lot of objects, but there is already an example of a builder here, so I will not add one more.

Then, regarding the JIT: the short answer is no (there is no optimization of new object creation, except for dead code). The long answer is no, but there are other mechanisms in the JVM that optimize things.

There is a string pool that avoids creating multiple String objects when using string literals. There is also a cache of objects for the primitive wrapper types (so if you create a Long object with Long.valueOf, the same object is returned each time you ask for the same small value, in the range -128 to 127). Regarding strings, there is also a string deduplication mechanism integrated into the G1 garbage collector since Java 8 update 20. You can test it with the following JVM options if you're using a recent JVM: -XX:+UseG1GC -XX:+UseStringDeduplication
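
The string pool and the wrapper cache can be observed with reference comparisons. The following is a minimal sketch; the class name PoolDemo and its method names are made up for illustration, and reference comparison with == is used here only to reveal object identity, not as a recommended way to compare values.

```java
// Demonstrates object identity for pooled string literals and cached Long values.
final class PoolDemo {
    // Two occurrences of the same literal refer to one pooled String object.
    static boolean sameLiteral() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) always creates a distinct object, even for equal content.
    static boolean newStringIsDistinct() {
        String a = "hello";
        String b = new String("hello");
        return a != b;
    }

    // intern() maps an equal string back to the pooled instance.
    static boolean internReturnsPooled() {
        String a = "hello";
        String b = new String("hello").intern();
        return a == b;
    }

    // Long.valueOf caches values from -128 to 127, so both calls return one object.
    static boolean smallLongIsCached() {
        Long a = Long.valueOf(127L);
        Long b = Long.valueOf(127L);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());
        System.out.println(newStringIsDistinct());
        System.out.println(internReturnsPooled());
        System.out.println(smallLongIsCached());
    }
}
```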

If you really want to optimize new object creation, you need to implement some sort of object pooling and make your objects immutable. But be careful: this is not a simple task, and you will end up with a lot of code dealing with object creation and with managing the pool size so that it does not exhaust your memory. So I advise you to do it only if it's really necessary.
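
As a rough idea of what such pooling could look like, here is a minimal sketch of a canonicalizing pool for an immutable value type. Subject, of and pooledCount are hypothetical names invented for this example, and the pool is deliberately unbounded, which is exactly the pool-size problem mentioned above; a real implementation would need some eviction or weak-reference strategy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable value type whose instances are shared through a pool.
final class Subject {
    // Unbounded pool: fine for a sketch, risky in production (it never shrinks).
    private static final Map<String, Subject> POOL = new ConcurrentHashMap<>();

    final String text;

    private Subject(String text) {
        this.text = text;
    }

    // Returns the pooled instance for this text, creating it on first use.
    static Subject of(String text) {
        return POOL.computeIfAbsent(text, Subject::new);
    }

    static int pooledCount() {
        return POOL.size();
    }

    public static void main(String[] args) {
        Subject a = Subject.of("subj");
        Subject b = Subject.of(new String("subj")); // equal text, different String object
        System.out.println(a == b);
        System.out.println(pooledCount());
    }
}
```

Because the instances are immutable, sharing them between callers is safe; the cost is the extra bookkeeping and the memory held by the pool itself.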

Lastly, object instantiation on the heap is a cheap operation unless you create millions of objects per second, and the JVM performs a lot of optimizations in many areas. So unless a good performance benchmark (or memory profiling) proves that you have an issue with object instantiation, don't think about it too much ;)

Regards,

Loïc

loicmathieu
-1

In the builder pattern, you should do it like this:

Message msg = Message.newBuilder()
    .email("foo@bar.com")
    .assignee("bar@bax.com")
    .subject("subj")
    .build();

Here Message.newBuilder() creates an object of the builder class (new itself is a reserved word in Java, so it cannot be used as a method name), and the functions email(..) and assignee(...) return this. The final build() function creates the Message object based on your data.

Mavlarn
-1

will the optimizer be able to avoid lots of object creation

No, but instantiation is a very cheap operation on the JVM. Worrying about this performance loss would be a typical example of premature optimization.

Is there anything to be gained from making a separate mutable builder object instead?

Working with immutables is generally a good approach. On the other hand, builders won't hurt you either if you use the builder instances in a small context, so that their mutable state is accessible only in a small, local environment. I don't see any severe disadvantages on either side; it is really up to your preference.

erosb
  • Be careful here. Yes, creating new objects is "cheap", but it still comes at a cost. And willfully creating **garbage** is never a good idea, especially if you don't have context and don't know how often such code will be used. It is not as if his non-builder design is so much better than a builder. – GhostCat Jul 22 '16 at 08:45
  • I accepted this answer as it seems most consistent with what I found out by trying to measure this. – morten Jul 24 '16 at 12:04
  • Why do you say No? HotSpot JIT **does** eliminate local allocations by replacing object fields with variables. The optimization is called Scalar Replacement. There is a JVM flag for it `-XX:+EliminateAllocations` and it is ON by default. – apangin Jul 25 '16 at 22:04
  • I didn't know about it. Thanks – erosb Jul 26 '16 at 11:55