38

I would like to create a comprehensive checklist for low-latency Java applications. Can you add your checklist items here?

Here is my list
1. Make your objects immutable
2. Reduce the use of synchronized methods
3. Locking order should be well documented, and handled carefully
4. Use profiler
5. Use Amdahl's law, and find the sequential execution path
6. Use Java 5 concurrency utilities, and locks
7. Avoid Thread priorities as they are platform dependent
8. JVM warmup can be used
9. Prefer unfair locking strategy
10. Avoid context switching (too many threads are counterproductive)
11. Avoid boxing and unboxing
12. Pay attention to compiler warnings
13. The number of threads should be less than or equal to the number of cores

A low-latency application is tuned for every millisecond.

Mohan Narayanaswamy
  • 3
    Many people write low latency Java applications that respond in much less than 1 ms. To me, low-latency in Java means sub-millisecond. – Ted Graham Nov 22 '10 at 18:42
  • 1
    *"6. use locks"* => or even better, try to make your algorithm lock free. – assylias Dec 20 '12 at 12:09
  • 2
    (A) Locking is not bad, contention is. Understand how to avoid contention (lock-free can be worse if contended CAS). (B) Little's law. (C) optimize around CPU caches – Ben Manes Jan 07 '13 at 07:05

12 Answers

9

Although immutability is good, it is not necessarily going to improve latency. Ensuring low-latency is likely to be platform dependent.

Other than general performance, GC tuning is very important. Reducing memory usage will help the GC. In particular, try to reduce the number of middle-aged objects that need to be moved about: keep objects either long-lived or short-lived. Also avoid anything touching the perm gen.

Tom Hawtin - tackline
6

Avoid boxing and unboxing; use primitive variables where possible.
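
As a rough illustration of the point, the sketch below sums the same range with a boxed `Long` accumulator and with a primitive `long`. The class and method names are hypothetical. The boxed version unboxes and re-boxes on every iteration, which may allocate wrapper objects unless the JIT manages to eliminate them.

```java
// Hypothetical example class; the names are illustrative only.
final class BoxingSketch {

    // Boxed accumulator: each += unboxes, adds, and re-boxes the running total.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: no wrapper objects are involved.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000;
        // Both methods compute 0 + 1 + ... + (n - 1); only the allocation behaviour differs.
        System.out.println(sumBoxed(n) == sumPrimitive(n));
    }
}
```

Whether the difference matters in a given application is something to confirm with a profiler rather than assume.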

Eleco
5

Avoid context switching wherever possible on the message processing path. Consequence: use NIO and a single event loop thread (reactor).
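
A minimal sketch of what a single-threaded event loop can look like with `java.nio`, assuming a toy echo protocol on the loopback interface. The class name `ReactorSketch` and the method `echoOnce` are hypothetical; to keep it self-contained, the client runs on the same thread, which only works because the message is small enough to fit in the socket buffers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical example: one thread multiplexes accept and read events through a Selector.
final class ReactorSketch {

    // Sends a small message to an in-process echo server and returns the reply.
    static String echoOnce(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0 = any free port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(payload));

                ByteBuffer buffer = ByteBuffer.allocate(1024);
                int echoed = 0;
                long deadline = System.nanoTime() + 5_000_000_000L; // give up after 5 s
                // The event loop: no per-connection threads, so no context switches between them.
                while (echoed < payload.length) {
                    if (System.nanoTime() - deadline > 0) {
                        throw new IOException("echo timed out");
                    }
                    selector.select(1000);
                    for (SelectionKey key : selector.selectedKeys()) {
                        if (key.isAcceptable()) {
                            SocketChannel conn = server.accept();
                            if (conn != null) {
                                conn.configureBlocking(false);
                                conn.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            SocketChannel conn = (SocketChannel) key.channel();
                            buffer.clear();
                            int n = conn.read(buffer);
                            if (n < 0) {
                                conn.close();
                                continue;
                            }
                            buffer.flip();
                            while (buffer.hasRemaining()) {
                                conn.write(buffer); // echo the bytes straight back
                            }
                            echoed += n;
                            if (echoed >= payload.length) {
                                conn.close(); // done with this toy connection
                            }
                        }
                    }
                    selector.selectedKeys().clear();
                }

                // Read the echoed bytes back on the client side.
                ByteBuffer reply = ByteBuffer.allocate(payload.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the reply is complete or the stream ends
                }
                return new String(reply.array(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("ping"));
    }
}
```

A production reactor would also handle partial writes via `OP_WRITE`, connection errors, and back-pressure; this sketch leaves those out.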

bobah
4

Avoid extensive locking and multi-threading in order not to disrupt the enhanced features in modern processors (and their caches). Then you can use a single thread up to its unbelievable limits (6 million transactions per second) with very low latency.

If you want to see a real-world low-latency Java application with enough detail about its architecture, have a look at LMAX:

The LMAX Architecture

Amir Moghimi
4

Buy, read, and understand Effective Java. It is also available online.

Thorbjørn Ravn Andersen
3

Measure, measure and measure. Run benchmarks regularly, using data as close to real as possible on hardware as close to production as possible. Low-latency applications are often better considered as appliances, so you need to consider the whole deployed box, not just the particular method/class/package/application/JVM etc. If you do not build realistic benchmarks on production-like settings, you will have surprises in production.
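
A minimal sketch of per-call latency measurement with `System.nanoTime()` and nearest-rank percentiles. The class name, the placeholder workload, and the iteration counts are all assumptions. It is not a substitute for a proper harness such as JMH or for benchmarking the whole deployed system, but it shows why percentiles rather than averages are the usual thing to report.

```java
import java.util.Arrays;

// Hypothetical example: records one sample per call and reports tail percentiles.
final class LatencySketch {

    // Nearest-rank percentile over a sorted array of latencies in nanoseconds.
    static long percentile(long[] sortedNanos, double pct) {
        int rank = (int) Math.ceil(pct / 100.0 * sortedNanos.length);
        return sortedNanos[Math.max(0, rank - 1)];
    }

    // Times each call of the task individually and returns the sorted samples.
    static long[] measure(Runnable task, int iterations) {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    public static void main(String[] args) {
        Runnable task = () -> Math.sqrt(System.nanoTime()); // placeholder workload
        measure(task, 100_000);                              // warm-up pass, results discarded
        long[] samples = measure(task, 100_000);
        System.out.println("p50 ns:   " + percentile(samples, 50));
        System.out.println("p99 ns:   " + percentile(samples, 99));
        System.out.println("p99.9 ns: " + percentile(samples, 99.9));
    }
}
```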

Nitsan Wakart
2
  • Consider using non-blocking approaches rather than synchronisation.
  • Consider using volatile or atomic variables over blocking data structures and locks.
  • Consider using object pools.
  • Use arrays instead of lists as they are more cache-friendly.
  • For small tasks, sending data to other cores normally takes more time than processing it on a single core, because of locking and memory and cache access latency. Hence, consider processing a task on a single thread.
  • Decrease the frequency of accessing main memory and try to work with data stored in caches.
  • Consider choosing the server-side C2 JIT compiler, which focuses on performance optimizations, rather than C1, which focuses on quick startup time.
  • Make sure you don't have false sharing, where two fields used by different threads are situated on a single cache line.
  • Read https://mechanical-sympathy.blogspot.com/
  • Consider using UDP over TCP
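
To illustrate the point about atomic variables, here is a minimal sketch of a lock-free "maximum seen so far" tracker built on `AtomicLong.compareAndSet`. The class and method names are hypothetical. Under heavy contention the retry loop can itself become a bottleneck, so this is an option to measure, not a guaranteed win.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example: tracks the largest recorded value without taking a lock.
final class NonBlockingMaxSketch {

    private final AtomicLong maxSeen = new AtomicLong(Long.MIN_VALUE);

    // Retries the compare-and-set until it succeeds or the value is no longer larger.
    void record(long value) {
        long current = maxSeen.get();
        while (value > current && !maxSeen.compareAndSet(current, value)) {
            current = maxSeen.get();
        }
    }

    long max() {
        return maxSeen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        NonBlockingMaxSketch tracker = new NonBlockingMaxSketch();
        Thread a = new Thread(() -> { for (long i = 0; i < 1_000_000; i++) tracker.record(i); });
        Thread b = new Thread(() -> { for (long i = 0; i < 2_000_000; i++) tracker.record(i); });
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(tracker.max()); // prints 1999999, the largest value recorded
    }
}
```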
Dmytro
1

Do not schedule more threads in your application than you have cores on the underlying hardware. Keep in mind that the OS will need to run its own threads, and other services may share the same hardware, so your application may be required to use fewer than the maximum number of cores available.
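
A minimal sketch of sizing a fixed thread pool from the core count, assuming one core is left for the OS and other services. The helper `workerCount` and the choice to reserve exactly one core are assumptions to tune per deployment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: derives the worker count from the number of available cores.
final class PoolSizingSketch {

    // Leaves reservedCores for the OS and other services, but always keeps at least one worker.
    static int workerCount(int availableCores, int reservedCores) {
        return Math.max(1, availableCores - reservedCores);
    }

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        int workers = workerCount(cores, 1); // reserving one core is an assumption
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            int id = i;
            pool.execute(() -> System.out.println("worker " + id + " running"));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```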

Brad
  • 2
    This is true for compute-intensive tasks, not necessarily for blocking/IO/other tasks where having more threads makes sense. If you have more threads than cores, however, you will need to 'herd' them such that the compute-intensive ones are pinned separately from the blocking ones. – Nitsan Wakart Jan 06 '13 at 13:30
1

I think "Use mutable objects only where appropriate" is better than "Make your objects immutable". Many very low latency applications have pools of objects they reuse to minimize GC. Immutable objects can't be reused in that way. For example, if you have a Location class:

class Location {
    double lat;
    double lon;
}

You can create some on bootup and use them over and over again so they never cause allocations and the subsequent GC.

This approach is much trickier than using an immutable location object though, so it should only be used where needed.
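
A minimal sketch of such a pool, assuming single-threaded use. The `LocationPoolSketch` class and its `acquire`/`release` methods are hypothetical names, and the nested `Location` simply mirrors the class above.

```java
import java.util.ArrayDeque;

// Hypothetical example: a single-threaded pool of reusable Location objects (not thread-safe).
final class LocationPoolSketch {

    // Mutable value holder, mirroring the Location class above.
    static final class Location {
        double lat;
        double lon;
    }

    private final ArrayDeque<Location> free = new ArrayDeque<>();

    // Pre-allocates every instance up front so the hot path does not allocate.
    LocationPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            free.push(new Location());
        }
    }

    // Hands out a pooled instance; falls back to allocation if the pool is exhausted.
    Location acquire(double lat, double lon) {
        Location loc = free.isEmpty() ? new Location() : free.pop();
        loc.lat = lat;
        loc.lon = lon;
        return loc;
    }

    // Returns an instance to the pool; the caller must not use it afterwards.
    void release(Location loc) {
        free.push(loc);
    }

    public static void main(String[] args) {
        LocationPoolSketch pool = new LocationPoolSketch(2);
        Location first = pool.acquire(51.5, -0.1);
        pool.release(first);
        Location second = pool.acquire(40.7, -74.0);
        System.out.println(first == second); // true: the same instance was reused
    }
}
```

The trickiness mentioned above shows up in `release`: nothing stops a caller from holding on to a released object and seeing it overwritten by the next `acquire`.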

L. Blanc
0

Use StringBuilder instead of String concatenation when generating large strings, for example queries.
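
A minimal sketch of building a query in a loop with one `StringBuilder`, which is the case where an explicit builder is most likely to help. The class name, method name, table name, and query shape are hypothetical.

```java
// Hypothetical example: builds a parameterized IN clause with a single buffer.
final class QueryBuilderSketch {

    // Produces "SELECT * FROM <table> WHERE id IN (?, ?, ...)" without intermediate Strings.
    static String inClause(String table, int placeholders) {
        StringBuilder sb = new StringBuilder(64);
        sb.append("SELECT * FROM ").append(table).append(" WHERE id IN (");
        for (int i = 0; i < placeholders; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('?');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(inClause("orders", 3));
        // prints: SELECT * FROM orders WHERE id IN (?, ?, ?)
    }
}
```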

mohdajami
  • 2
    only makes sense when you want to do something with the String, e.g. concatenating other Strings or reversing or suchlike. – Tedil Apr 04 '10 at 14:11
  • 1
    It usually makes no difference. The bytecode that the Java compiler generates for String concatenation uses StringBuilders! – Stephen C Apr 04 '10 at 14:18
  • Concatenation in a loop (or multiple statements) is the usual case where explicit `StringBuilder` wins. – Tom Hawtin - tackline Apr 04 '10 at 14:37
0

Another important idea is to get it working first, then measure the performance, then isolate any bottlenecks, then optimize them, then measure again to verify improvement.

As Knuth said, "premature optimization is the root of all evil".

maerics
  • 1
    The success or failure of some applications depends entirely on performance. Although premature optimization is wrong, that rule won't suit every application. A low-latency application has to be built with certain guidelines. – Mohan Narayanaswamy Apr 23 '10 at 22:33
  • 1
    I always liked "Make it work first, before you make it work fast". – Oversteer Nov 12 '13 at 14:25
  • 1
    I like to keep in mind that, having said this, Knuth dedicated most of his life to algorithmic efficiency. :) – L. Blanc Apr 12 '16 at 14:09
  • Keep in mind the full quote: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." – Leponzo May 27 '21 at 03:44
0

In addition to the developer-level solutions already advised here, it can also be very beneficial to consider accelerated JIT runtimes (e.g. Zing) and off-heap memory solutions such as Terracotta BigMemory or Apache Ignite to reduce stop-the-world GC pauses. If a GUI is involved, using binary protocols such as Hessian or ZeroC Ice instead of web services is very effective.

supernova