The primary reason it is "faster" is its contended performance. This is important because:
Under low update contention, the two classes have similar characteristics.
You'd use a LongAdder for very frequent updates, where the atomic CAS operations and native calls to Unsafe inside AtomicLong would cause contention (see the source and the volatile reads). There is also the risk of cache misses and false sharing across multiple AtomicLongs (although I have not looked at the class layout yet, there doesn't appear to be sufficient memory padding before the actual long field).
Under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
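To make the comparison concrete, here is a minimal sketch that hammers both counter types from several threads. The class name CounterDemo and the helper run are made up for illustration, and this is not a benchmark: it only shows that both classes give the same total, while the update path differs.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo class, not part of the JDK.
class CounterDemo {
    // Starts `threads` threads that each increment both counters `perThread` times,
    // then returns {AtomicLong total, LongAdder total}.
    static long[] run(int threads, int perThread) {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    atomic.incrementAndGet(); // every thread updates the same memory location
                    adder.increment();        // contended updates can be spread over several cells
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join makes all updates visible before we read the totals
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { atomic.get(), adder.sum() };
    }

    public static void main(String[] args) {
        long[] totals = run(8, 100_000);
        System.out.println("AtomicLong: " + totals[0] + ", LongAdder: " + totals[1]);
    }
}
```

Both totals come out as threads * perThread. To see the throughput difference you would need a proper harness such as JMH; a hand-rolled timing loop is easily distorted by JIT warm-up.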
The implementation extends Striped64, which is a data holder for 64-bit values. The values are held in cells, which are padded and spread (striped) across threads, hence the name. Each operation on the LongAdder modifies one of the values held in the Striped64. When contention occurs, a new cell is created and modified, so the original thread can finish concurrently with the contending one. When you need the final value, the values of all cells are simply added up.
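The striping idea can be sketched in a few lines. This is a deliberately simplified illustration, not the actual Striped64 code: the class StripedCounterSketch is hypothetical, it uses a fixed number of cells instead of growing on contention, it picks a cell from the thread's identity hash rather than a per-thread probe, and it has no padding (neighbouring elements of an AtomicLongArray can still share a cache line).

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical, simplified striped counter; NOT how Striped64 is really implemented.
class StripedCounterSketch {
    private final AtomicLongArray cells;
    private final int mask;

    // `stripes` must be a power of two so that (hash & mask) selects a valid cell.
    StripedCounterSketch(int stripes) {
        if (stripes <= 0 || Integer.bitCount(stripes) != 1) {
            throw new IllegalArgumentException("stripes must be a positive power of two");
        }
        this.cells = new AtomicLongArray(stripes);
        this.mask = stripes - 1;
    }

    // Adds x to the cell chosen for the current thread, so different threads
    // tend to update different cells instead of fighting over one field.
    void add(long x) {
        int idx = System.identityHashCode(Thread.currentThread()) & mask;
        cells.addAndGet(idx, x);
    }

    // Sums all cells. Like LongAdder.sum(), this is not an atomic snapshot
    // if other threads are still updating.
    long sum() {
        long total = 0;
        for (int i = 0; i < cells.length(); i++) {
            total += cells.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        counter.add(5);
        counter.add(7);
        System.out.println(counter.sum());
    }
}
```

The trade-off is visible even in this toy version: writes get cheaper because they are spread out, while reads get more expensive because sum() has to visit every cell.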
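Unfortunately, performance comes at a cost, which in this case is memory (as is often the case). The Striped64 cell table grows when many threads and updates are thrown at it, and every cell is padded, so it can take far more space than a single long. As far as I can tell from the JDK source, the table stops growing at roughly the number of CPUs (rounded up to a power of two), so the growth is bounded.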
Quote source: Javadoc for LongAdder