What is your development checklist for Java low-latency applications?

Question

I would like to create a comprehensive checklist for Java low-latency applications. Can you add your checklist here?

Here is my list
1. Make your objects immutable
2. Try to reduce the use of synchronized methods
3. Locking order should be well documented, and handled carefully
4. Use profiler
5. Use Amdahl's law, and find the sequential execution path
6. Use Java 5 concurrency utilities, and locks
7. Avoid Thread priorities as they are platform dependent
8. JVM warmup can be used
9. Prefer unfair locking strategy
10. Avoid context switching (too many threads are counterproductive)
11. Avoid boxing, un-boxing
12. Give attention to compiler warnings
13. The number of threads should be equal to or less than the number of cores

A low-latency application is tuned for every millisecond.
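Items 5 and 13 can be made concrete with a small calculation. Amdahl's law gives the best-case speedup as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores. The sketch below only evaluates that formula; the class name `AmdahlEstimate` and the 90% figure are illustrative assumptions, not measurements.

```java
public final class AmdahlEstimate {
    /**
     * Best-case speedup according to Amdahl's law:
     * 1 / ((1 - p) + p / n), with p the parallel fraction and n the core count.
     */
    public static double speedup(double parallelFraction, int cores) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || cores < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and cores >= 1");
        }
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
    }

    public static void main(String[] args) {
        // Illustrative assumption: 90% of the work parallelizes.
        for (int cores : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("cores=%d speedup=%.2f%n", cores, speedup(0.9, cores));
        }
    }
}
```

With p = 0.9 the speedup can never exceed 10 however many cores are added, which is why finding and shortening the sequential path matters more than adding threads.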

Many people write low latency Java applications that respond in much less than 1 ms. To me, low-latency in Java means sub-millisecond. — Ted Graham, Nov 22 '10 at 18:42
*"6. use locks"* => or even better, try to make your algorithm lock free. — assylias, Dec 20 '12 at 12:09
(A) Locking is not bad, contention is. Understand how to avoid contention (lock-free can be worse if contended CAS). (B) Little's law. (C) optimize around CPU caches — Ben Manes, Jan 07 '13 at 07:05

score 9 · Answer 1 · answered Apr 04 '10 at 14:36

Although immutability is good, it is not necessarily going to improve latency. Ensuring low-latency is likely to be platform dependent.

Apart from general performance, GC tuning is very important. Reducing memory usage helps the GC. In particular, try to reduce the number of middle-aged objects that need to be moved about: keep objects either long-lived or short-lived. Also avoid anything touching the perm gen.

Tom Hawtin - tackline

Hawtin, doesn't using immutable datastructures help latency when you don't have to synchronize around shared data? – Binil Thomas Jun 24 '10 at 16:08
this is probably the best answer here – bestsss May 30 '11 at 16:44

score 6 · Answer 2 · answered Apr 06 '10 at 06:56

avoid boxing/unboxing, use primitive variables if possible.
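A minimal sketch of what this means in practice; the class and method names are made up for illustration. The boxed version unboxes, adds, and re-boxes on every iteration, which can allocate a new `Long` each time unless the JIT manages to remove the allocation. The primitive version does plain arithmetic.

```java
public final class BoxingExample {
    // Boxed accumulator: every += unboxes the Long, adds, and boxes the result again.
    public static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: no wrapper objects involved.
    public static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000) == sumPrimitive(1_000));
    }
}
```

Both methods return the same value; the difference is the garbage produced on the way, which is what matters on a latency-sensitive path.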

Eleco

score 5 · Answer 3 · answered May 10 '10 at 15:51

Avoid context switching wherever possible on the message processing path. Consequence: use NIO and a single event loop thread (reactor).
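A rough sketch of such a single-threaded reactor, assuming a simple echo protocol on the loopback interface. `EchoReactor` is a hypothetical name, and the write loop is a simplification: a real server would register interest in `OP_WRITE` instead of spinning when the socket buffer is full. The `main` method uses `InputStream.readNBytes(int)`, which needs Java 11 or later.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public final class EchoReactor implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;
    // One buffer reused for every message: safe because only the loop thread touches it.
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(4096);

    public EchoReactor(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", port));
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    public int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) accept();
                    else if (key.isReadable()) echo((SocketChannel) key.channel());
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or failed: leave the loop.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client == null) return;
        client.configureBlocking(false);
        client.register(selector, SelectionKey.OP_READ);
    }

    private void echo(SocketChannel client) {
        try {
            buffer.clear();
            if (client.read(buffer) < 0) { client.close(); return; } // peer closed
            buffer.flip();
            while (buffer.hasRemaining()) client.write(buffer); // simplification, see lead-in
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes up a blocked select() so the loop can exit
        server.close();
    }

    public static void main(String[] args) throws Exception {
        try (EchoReactor reactor = new EchoReactor(0)) { // port 0: OS picks a free port
            new Thread(reactor, "reactor").start();
            try (Socket socket = new Socket("127.0.0.1", reactor.port())) {
                socket.getOutputStream().write("ping".getBytes(StandardCharsets.US_ASCII));
                byte[] reply = socket.getInputStream().readNBytes(4);
                System.out.println(new String(reply, StandardCharsets.US_ASCII));
            }
        }
    }
}
```

All accept, read, and write work happens on the one loop thread, so the message path needs no locks and no hand-off between threads.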

bobah

score 4 · Answer 4 · answered May 15 '14 at 04:41

Avoid extensive locking and multi-threading in order not to disrupt the enhanced features in modern processors (and their caches). Then you can use a single thread up to its unbelievable limits (6 million transactions per second) with very low latency.

If you want to see a real world low-latency Java application with enough details about its architecture have a look at LMAX:

The LMAX Architecture

score 4 · Answer 5 · answered Apr 04 '10 at 14:10

Buy, read, and understand Effective Java. It is also available online.

Thorbjørn Ravn Andersen

score 3 · Answer 6 · answered Jan 06 '13 at 13:36

Measure, measure and measure. Run benchmarks regularly, using data as close to real as possible on hardware as close to production as possible. Low-latency applications are often better considered as appliances, so you need to consider the whole deployed box, not just a particular method, class, package, application or JVM. If you do not build realistic benchmarks on production-like settings, you will have surprises in production.
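As a starting point, here is a minimal sketch of recording per-call latencies and reading percentiles rather than averages. The class `LatencyBenchmark`, the warm-up and iteration counts, and the workload in `main` are all placeholders. A hand-rolled loop like this is only a rough guide; a dedicated harness such as JMH handles many pitfalls it ignores.

```java
import java.util.Arrays;

public final class LatencyBenchmark {
    /** Nearest-rank percentile (0 < pct <= 100) of an ascending-sorted array. */
    public static long percentile(long[] sorted, double pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Runs the task, returning its per-call latencies in nanoseconds, sorted ascending. */
    public static long[] measure(Runnable task, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            task.run(); // give the JIT a chance to compile the hot path before timing
        }
        long[] samples = new long[iterations]; // preallocated: no allocation while timing
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    public static void main(String[] args) {
        double[] data = new double[1024]; // placeholder workload, not a realistic benchmark
        long[] sorted = measure(() -> Arrays.fill(data, 1.0), 20_000, 100_000);
        System.out.println("p50 ns: " + percentile(sorted, 50));
        System.out.println("p99 ns: " + percentile(sorted, 99));
        System.out.println("max ns: " + sorted[sorted.length - 1]);
    }
}
```

Looking at the tail (p99 and max) rather than the mean is the point: GC pauses and context switches show up there first.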

Dmytro · Answer 7 · 2020-05-04T22:49:01.723

Consider using non-blocking approaches rather than synchronisation.
Consider using volatile or atomic variables over blocking data structures and locks.
Consider using object pools.
Use arrays instead of lists as they are more cache-friendly.
Normally for small tasks sending data to other cores can take more time than processing on a single core because of locking and memory and cache access latency. Hence, consider processing a task by a single thread.
Decrease the frequency of accessing main memory and try to work with data stored in caches.
Consider using the server-side C2 JIT compiler, which focuses on performance optimizations, rather than C1, which focuses on quick startup time.
Make sure you don't have false sharing, where two fields used by different threads sit on the same cache line.
Read https://mechanical-sympathy.blogspot.com/
Consider using UDP over TCP
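To illustrate the points about non-blocking approaches and atomic variables, here is a small sketch of a shared counter that uses no locks. `AtomicLong` relies on compare-and-swap, while `LongAdder` (Java 8+) spreads updates over several cells to reduce contention when many threads write. The class name and thread counts are arbitrary choices for the example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public final class NonBlockingCounters {
    public static long countWithAtomic(int threads, int perThread) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) counter.incrementAndGet(); // CAS-based, no lock
        });
        return counter.get();
    }

    public static long countWithAdder(int threads, int perThread) throws InterruptedException {
        LongAdder adder = new LongAdder();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) adder.increment(); // striped cells under contention
        });
        return adder.sum(); // sum() is exact here because all writers have finished
    }

    private static void runAll(int threads, Runnable body) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(body);
            workers[i].start();
        }
        for (Thread worker : workers) worker.join();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithAtomic(4, 100_000));
        System.out.println(countWithAdder(4, 100_000));
    }
}
```

Neither variant blocks, but a heavily contended CAS can still be slow, so whether either beats a lock or a single-threaded design is something to measure.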

score 1 · Answer 8 · answered Aug 31 '12 at 14:59

Do not schedule more threads in your application than you have cores on the underlying hardware. Keep in mind that the OS, and potentially other services sharing the same hardware, also need threads to execute, so your application may be required to use fewer than the maximum number of cores available.
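A minimal sketch of sizing a worker pool from the core count while leaving some cores free. The class `CoreSizedPool` and the `reservedCores` parameter are assumptions for illustration; how many cores to reserve depends on what else runs on the box.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class CoreSizedPool {
    /** Cores visible to the JVM minus the ones left for the OS and other services, at least 1. */
    public static int workerCount(int reservedCores) {
        int cores = Runtime.getRuntime().availableProcessors();
        return Math.max(1, cores - reservedCores);
    }

    public static ExecutorService newPool(int reservedCores) {
        return Executors.newFixedThreadPool(workerCount(reservedCores));
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = newPool(1); // assumption: keep one core free
        for (int i = 0; i < 8; i++) {
            final int job = i;
            pool.submit(() -> System.out.println("job " + job + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Note that `availableProcessors()` reports what the JVM sees, which in a container or with CPU affinity set may be fewer than the physical cores.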

Brad

This is true for compute-intensive tasks, not necessarily for blocking/IO/other tasks where having more threads makes sense. If you have more threads than cores, however, you will need to 'herd' them such that the compute-intensive ones are pinned separately from the blocking ones. – Nitsan Wakart Jan 06 '13 at 13:30

score 1 · Answer 9 · answered Apr 12 '16 at 14:08

I think "Use mutable objects only where appropriate" is better than "Make your objects immutable". Many very low latency applications have pools of objects they reuse to minimize GC. Immutable objects can't be reused in that way. For example, if you have a Location class:

class Location {
    // Deliberately mutable so instances can be reused instead of reallocated.
    double lat;
    double lon;
}

You can create some on bootup and use them over and over again so they never cause allocations and the subsequent GC.

This approach is much trickier than using an immutable location object though, so it should only be used where needed.
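A sketch of what such a pool could look like, assuming it is only ever used from a single thread. `LocationPool` and its methods are hypothetical names; a pool shared between threads would need a concurrent structure and more care.

```java
import java.util.ArrayDeque;

public final class LocationPool {
    /** Mutable on purpose so that instances can be reused. */
    public static final class Location {
        double lat;
        double lon;
    }

    private final ArrayDeque<Location> free;

    /** Allocates every instance up front, e.g. at startup. */
    public LocationPool(int size) {
        free = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            free.push(new Location());
        }
    }

    /** Hands out a pooled instance; never allocates. */
    public Location acquire() {
        Location location = free.poll();
        if (location == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return location;
    }

    /** Returns an instance to the pool. The caller must not use it afterwards. */
    public void release(Location location) {
        location.lat = 0.0; // reset so stale data cannot leak to the next user
        location.lon = 0.0;
        free.push(location);
    }

    public static void main(String[] args) {
        LocationPool pool = new LocationPool(2);
        Location first = pool.acquire();
        first.lat = 51.5;
        first.lon = -0.1;
        pool.release(first);
        Location second = pool.acquire();
        System.out.println(first == second); // same object handed out again
    }
}
```

The trickiness mentioned above shows up in `release`: nothing stops a caller from keeping a reference and mutating an object that has already been handed to someone else.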

mohdajami · Answer 10 · 2010-04-04T14:45:35.987

0

Use StringBuilder instead of String when generating large Strings, for example queries.

edited Apr 04 '10 at 14:45

answered Apr 04 '10 at 14:06

mohdajami

only makes sense when you want to do something with the String, e.g. concatenating other Strings or reversing or suchlike. – Tedil Apr 04 '10 at 14:11
It usually makes no difference. The bytecode that the Java compiler generates for String concatenation uses StringBuilders! – Stephen C Apr 04 '10 at 14:18
Concatenation in a loop (or multiple statements) is the usual case where explicit `StringBuilder` wins. – Tom Hawtin - tackline Apr 04 '10 at 14:37
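To make the loop case concrete, here is a small sketch that builds the same query text both ways. The table and column names are invented for the example. With `+=` inside the loop each iteration produces a new intermediate String, whereas the explicit `StringBuilder` appends into one buffer.

```java
public final class QueryBuilderExample {
    // Concatenation in a loop: each += creates a new String and copies the old contents.
    public static String inClauseConcat(int[] ids) {
        String sql = "SELECT * FROM orders WHERE id IN (";
        for (int i = 0; i < ids.length; i++) {
            if (i > 0) sql += ", ";
            sql += ids[i];
        }
        return sql + ")";
    }

    // Explicit StringBuilder: one growing buffer, one final String.
    public static String inClauseBuilder(int[] ids) {
        StringBuilder sql = new StringBuilder("SELECT * FROM orders WHERE id IN (");
        for (int i = 0; i < ids.length; i++) {
            if (i > 0) sql.append(", ");
            sql.append(ids[i]);
        }
        return sql.append(')').toString();
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        System.out.println(inClauseConcat(ids).equals(inClauseBuilder(ids)));
    }
}
```

For real queries a parameterized statement is preferable to pasting values into SQL text; the example is only about String building.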

score 0 · Answer 11 · answered Apr 21 '10 at 21:20

Another important idea is to get it working first, then measure the performance, then isolate any bottlenecks, then optimize them, then measure again to verify improvement.

As Knuth said, "premature optimization is the root of all evil".

maerics

For some applications, success or failure depends on performance alone. Though premature optimization is wrong, that rule won't suit every application. A low-latency application has to be built with certain guidelines. – Mohan Narayanaswamy Apr 23 '10 at 22:33
I always liked "Make it work first, before you make it work fast". – Oversteer Nov 12 '13 at 14:25
I like to keep in mind that, having said this, Knuth dedicated most of his life to algorithmic efficiency. :) – L. Blanc Apr 12 '16 at 14:09
Keep in mind the full quote: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." – Leponzo May 27 '21 at 03:44

supernova · Answer 12 · 2018-01-03T01:44:59.110

0

In addition to the developer-level solutions already advised here, it can also be very beneficial to consider accelerated JIT runtimes, e.g. Zing, and off-heap memory solutions such as Terracotta BigMemory or Apache Ignite to reduce stop-the-world GC pauses. If a GUI is involved, using binary protocols such as Hessian or ZeroC Ice instead of web services is very effective.

edited Jan 03 '18 at 01:44

answered Jan 03 '18 at 01:38

supernova
