LongAdder as an alternative to AtomicLong

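LongAdder adder = new LongAdder();
ExecutorService executor = Executors.newFixedThreadPool(2);
IntStream.range(0, 1000)
    .forEach(i -> executor.submit(adder::increment));
stop(executor);    // helper not shown here; assumed to shut down the executor and wait for the tasks
System.out.println(adder.sumThenReset());   // => 1000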

LongAccumulator is a more generalized version of LongAdder

LongBinaryOperator op = (x, y) -> 2 * x + y;
LongAccumulator accumulator = new LongAccumulator(op, 1L);

ExecutorService executor = Executors.newFixedThreadPool(2);
IntStream.range(0, 10)
    .forEach(i -> executor.submit(() -> accumulator.accumulate(i)));
stop(executor);

System.out.println(accumulator.getThenReset());     // => 2539 (not guaranteed: 2 * x + y is order-dependent)

I have some queries.

  1. Is LongAdder always preferred to AtomicLong?
  2. Is LongAccumulator preferred to both LongAdder and AtomicLong?
Tunaki
Ravindra babu

2 Answers

The difference between those classes, and when to use one over the other, is mentioned in the Javadoc. From LongAdder:

This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.

And from LongAccumulator:

This class is usually preferable to AtomicLong when multiple threads update a common value that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.

[...]

Class LongAdder provides analogs of the functionality of this class for the common special case of maintaining counts and sums. The call new LongAdder() is equivalent to new LongAccumulator((x, y) -> x + y, 0L).

Thus, which one to use depends on what your application intends to do. Neither class is strictly preferred in every case: they are the better choice only when high contention is expected and multiple threads update a common value.
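To make the quoted equivalence concrete, here is a minimal sketch that updates an AtomicLong, a LongAdder, and a LongAccumulator configured as a sum from the same tasks. The class name `CounterComparison` and the helper `countAll` are hypothetical, and the pool size and iteration count are arbitrary. It only illustrates that all three arrive at the same total; it says nothing about throughput, which would need a proper benchmark.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAccumulator;
import java.util.concurrent.atomic.LongAdder;

class CounterComparison {

    // Hypothetical helper: increments all three counters `times` times
    // on a pool of `threads` threads and returns the three totals.
    static long[] countAll(int threads, int times) {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        // Per the Javadoc, equivalent to new LongAdder().
        LongAccumulator sum = new LongAccumulator((x, y) -> x + y, 0L);

        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < times; i++) {
            executor.submit(() -> {
                atomic.incrementAndGet();
                adder.increment();
                sum.accumulate(1L);
            });
        }
        executor.shutdown();
        try {
            // Wait for every task so the totals below are final.
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { atomic.get(), adder.sum(), sum.get() };
    }

    public static void main(String[] args) {
        long[] totals = countAll(4, 10_000);
        System.out.println("AtomicLong:      " + totals[0]);
        System.out.println("LongAdder:       " + totals[1]);
        System.out.println("LongAccumulator: " + totals[2]);
    }
}
```

If the counter is read far less often than it is written (statistics, metrics), LongAdder is the natural fit. If the code needs the value returned by each update, for example to hand out unique sequence numbers, AtomicLong remains the right tool, because LongAdder's increment() returns nothing.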

Tunaki
  • Got the difference now. Can you explain the binary operator part in the above code snippet? – Ravindra babu Feb 28 '16 at 22:25
  • @ravindra `LongBinaryOperator`? That's a functional interface defining a functional method taking 2 `long`s as parameter and returning a `long`. So, it can be used to operate on 2 longs and return a long from some calculation. For example, summing 2 longs: `LongBinaryOperator longAdder = (x, y) -> x + y;` – Tunaki Feb 28 '16 at 22:27
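As a small illustration of the operator discussed in these comments, the sketch below applies a `LongBinaryOperator` by hand. The class `BinaryOperatorDemo` and its `fold` helper are hypothetical names introduced here; `fold` is a plain sequential loop, not how LongAccumulator works internally. It shows that the sum operator gives the same result in any order, while the question's `2 * x + y` does not. The LongAccumulator Javadoc states that the order of accumulation is not guaranteed, so the class is only suitable for functions where order does not matter.

```java
import java.util.function.LongBinaryOperator;
import java.util.stream.LongStream;

class BinaryOperatorDemo {

    // Hypothetical helper: folds `values` into `identity` from left to right using `op`.
    static long fold(LongBinaryOperator op, long identity, long[] values) {
        long result = identity;
        for (long v : values) {
            result = op.applyAsLong(result, v);
        }
        return result;
    }

    public static void main(String[] args) {
        LongBinaryOperator sum = (x, y) -> x + y;
        LongBinaryOperator op = (x, y) -> 2 * x + y;   // operator from the question

        long[] ascending = LongStream.range(0, 10).toArray();                    // 0, 1, ..., 9
        long[] descending = LongStream.range(0, 10).map(i -> 9 - i).toArray();   // 9, 8, ..., 0

        // A single application: 3 + 4
        System.out.println(sum.applyAsLong(3, 4));

        // Order-independent: both folds give the same total.
        System.out.println(fold(sum, 0L, ascending) + " vs " + fold(sum, 0L, descending));

        // Order-dependent: the two folds give different results.
        System.out.println(fold(op, 1L, ascending) + " vs " + fold(op, 1L, descending));
    }
}
```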
@Tunaki answered the question skillfully, but there is still one issue that affects the choice.

Adding to a Cell requires a Cell per thread. The internal code uses getProbe() in Striped64 which returns:

return UNSAFE.getInt(Thread.currentThread(), PROBE);

Probe refers to the per-thread field threadLocalRandomProbe in Thread, which is kept alongside threadLocalRandomSeed for ThreadLocalRandom.

My understanding is that the probe is unique to each thread. If a large number of threads are created and destroyed, a probe is created for each new thread.

Therefore, the number of Cells may become excessive. If someone has more detail on this I would like to hear from you.

edharned
  • It seems that you have posted your answer on the wrong question. – Ravindra babu Feb 28 '16 at 23:15
  • @ravindra how so? The question is about choice, I answered the question. – edharned Feb 28 '16 at 23:17
  • got it. Can you elaborate more on Probe and relation to these three classes? – Ravindra babu Feb 28 '16 at 23:23
  • There’s not a cell per thread—that would be much easier to achieve. The number of actual cells depends on the contention, in other words its maximum is related to the number of CPU cores. If you have significantly more threads than cores, they can’t all fight for cells as they can’t be running for real at the same time. But what numbers are we talking about? A cell, i.e. an encapsulated `long` value, per thread would be negligible compared to the overhead each thread has on its own. – Holger Feb 29 '16 at 12:07
  • There aren't many comments in the code about Probe. If Probe relates to a CPU core (as @Holger says) then LongAdder etc probably is a better choice than AtomicLong for high contention. – edharned Feb 29 '16 at 14:22
  • And further reading into the comments of Striped64 yields more understanding and confusion. The table size is number of CPUs: Runtime.getRuntime().availableProcessors() does not always return the number of hardware threads. See this JavaSpecialists newsletter http://www.javaspecialists.eu/archive/Issue220.htm But it still seems that LongAdder is better than AtomicLong for high contention. – edharned Feb 29 '16 at 15:09
  • To add slightly more precision to what Holger wrote: Javadoc of Striped64 (https://github.com/prometheus/client_java/blob/master/simpleclient/src/main/java/io/prometheus/client/Striped64.java) says: "The table size is doubled upon further contention until reaching the nearest power of two greater than or equal to the number of CPUS." So the table starts small, and is grown only when contention is actually detected. Also, each Cell requires at least (15*8=120)+16=136 bytes. Much longer than a long, but still much less than OS-level Threads (kBytes). This could change with fibers. – Rainer Blome Jan 09 '22 at 10:22
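The bound quoted in the last comment can be worked through as a rough calculation. The sketch below is not Striped64's code: the class `StripedTableSize` and the method `ceilingPowerOfTwo` are hypothetical, and the 136 bytes per Cell is the estimate from the comment above, not a measured value. It only computes the documented upper limit on the table size and the resulting worst-case footprint for one adder.

```java
class StripedTableSize {

    // Smallest power of two greater than or equal to n, for 1 <= n <= 2^30.
    static int ceilingPowerOfTwo(int n) {
        if (n < 1 || n > (1 << 30)) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        int size = 1;
        while (size < n) {
            size <<= 1;
        }
        return size;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();

        // Documented upper bound on the number of cells; the table only
        // grows this far if contention is actually detected.
        int maxCells = ceilingPowerOfTwo(cpus);

        // Assumption: 136 bytes per Cell, as estimated in the comment above.
        long worstCaseBytes = maxCells * 136L;

        System.out.println("Reported CPUs:        " + cpus);
        System.out.println("Maximum cells:        " + maxCells);
        System.out.println("Worst-case cell bytes: " + worstCaseBytes);
    }
}
```

On a machine reporting 6 processors, for instance, the cap is 8 cells, so roughly 1 KB per heavily contended adder under that per-cell estimate. This stays small next to the cost of a thread, but it can add up if an application creates very many adders.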