Monday, 7 September 2015

Runtime Jitter - Zing vs Hotspot

This is an update to my last post exploring behaviour in the Oracle/OpenJDK JVM that forces a periodic safepoint under normal conditions.

After that post was published, Gil Tene from Azul Systems asked how their Zing JVM matched up against Oracle/OpenJDK in terms of runtime jitter.

At LMAX Exchange, we have been using the Zing JVM for several years, after we found that it removed a large proportion of our latency outliers, namely those caused by garbage collection pauses.

Measuring runtime jitter

These measurements were initially made to try to demonstrate jitter introduced to an application by the Linux kernel's scheduler. Since the kernel is responsible for allocating CPU time to runnable processes, these decisions can show up as sources of latency in well-tuned applications.

In this application, a 'producer' thread makes a call to System.nanoTime() (which under the hood uses the monotonic clock on Linux) and passes the result into an instance of the Disruptor. On the consuming side, an 'accumulator' thread also calls System.nanoTime(), then records the delta (in nanoseconds) into an HdrHistogram.

Using this method, we can explore the time taken to pass a message between two threads. In the majority of cases, this will be very quick, but there will be outliers introduced by the runtime (i.e. JVM) and the operating system (i.e. Linux scheduler).
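
As a rough illustration of the measurement described above, here is a minimal sketch that uses only the JDK. It is not the code behind the results in this post: a single-slot handoff guarded by two AtomicLong sequences stands in for the Disruptor, and a sorted long[] with a nearest-rank percentile stands in for HdrHistogram. The class and method names (TransitLatencySketch, measure, percentile) are made up for this example.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the producer/accumulator test described above.
final class TransitLatencySketch {

    // Passes `messages` timestamps from a producer thread to an accumulator
    // thread and returns the observed transit times in nanoseconds.
    static long[] measure(final int messages) {
        final AtomicLong published = new AtomicLong(); // last sequence written
        final AtomicLong consumed = new AtomicLong();  // last sequence read
        final long[] slot = new long[1];               // one-entry "ring buffer"
        final long[] deltas = new long[messages];

        Thread accumulator = new Thread(() -> {
            for (int i = 1; i <= messages; i++) {
                // Thread.yield() keeps the sketch usable on machines with few
                // cores; it adds jitter of its own, so a tuned harness would
                // more likely busy-spin on an isolated core.
                while (published.get() < i) {
                    Thread.yield();
                }
                deltas[i - 1] = System.nanoTime() - slot[0];
                consumed.set(i); // hand the slot back to the producer
            }
        }, "accumulator");

        Thread producer = new Thread(() -> {
            for (int i = 1; i <= messages; i++) {
                while (consumed.get() < i - 1) {
                    Thread.yield();
                }
                slot[0] = System.nanoTime(); // take the timestamp just before publishing
                published.set(i);            // volatile write makes slot[0] visible
            }
        }, "producer");

        accumulator.start();
        producer.start();
        try {
            producer.join();
            accumulator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
        return deltas;
    }

    // Nearest-rank percentile over an array that is already sorted ascending.
    static long percentile(long[] sorted, double pct) {
        int idx = (int) Math.ceil(pct / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, Math.min(sorted.length - 1, idx))];
    }

    public static void main(String[] args) {
        long[] deltas = measure(1_000_000);
        Arrays.sort(deltas);
        System.out.println("50.00%  " + percentile(deltas, 50.0));
        System.out.println("99.00%  " + percentile(deltas, 99.0));
        System.out.println("99.99%  " + percentile(deltas, 99.99));
        System.out.println("max     " + deltas[deltas.length - 1]);
    }
}
```

Because the producer waits for the slot to be handed back before publishing again, this sketch measures one handoff at a time and never queues messages as a ring buffer would. Numbers from it are not comparable with the tables in this post.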

While developing this application, I came across the 100-microsecond jitter introduced by Oracle/OpenJDK's forced safepoint behaviour. This is discussed in more detail in my previous post.

So how do these two JVMs fare against each other?

Comparing JVMs

In the results below, the effect of the forced safepoints is clear, giving a maximum jitter of around 100 microseconds:

Oracle/OpenJDK with forced safepoints enabled (default):

== Accumulator Message Transit Latency (ns) ==
mean                     269
min                      168
50.00%                   216
90.00%                   464
99.00%                   608
99.90%                   736
99.99%                   960
99.999%                 4352
99.9999%               15872
max                   106496
count                3595101

Disabling the forced safepoints removes the 100-microsecond outliers, bringing the maximum down to around 20 microseconds:

Oracle/OpenJDK with forced safepoints disabled:

== Accumulator Message Transit Latency (ns) ==
mean                     385
min                      152
50.00%                   352
90.00%                   464
99.00%                   640
99.90%                   768
99.99%                   864
99.999%                 3072
99.9999%               17408
max                    20480
count                3595101

Comparing this to Zing's default behaviour:


== Accumulator Message Transit Latency (ns) ==
mean                     263
min                      136
50.00%                   256
90.00%                   288
99.00%                   448
99.90%                   512
99.99%                   608
99.999%                 3200
99.9999%               10240
max                    13312
count                3595101

For those who like a visual representation, here's the comparison in chart form:

[Chart: comparison of the three runs above]

And in log-scale:

[Chart: the same comparison, log scale]

Zing doesn't need to force periodic safepoints during normal operation, so assuming that you don't have any other sources of jitter in your program, you'll get a flatter latency profile with out-of-the-box behaviour.

It is possible to restrict the forced safepoint behaviour of Oracle/OpenJDK, but the consequences of doing so are unclear. Dragons may well be involved.

At LMAX Exchange, we currently run our microbenchmarks on Oracle JDK, and we do suppress the periodic forced safepoint behaviour in order to reduce jitter in the results. So far, there have been no adverse effects, but our microbenchmarks only run for short periods of time.

About the tests

OracleJDK version:

java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Zing version:

java version "1.8.0-zing_15.05.0.0"
Zing Runtime Environment for Java Applications (build 1.8.0-zing_15.05.0.0-b8)
Zing 64-Bit Tiered VM (build 1.8.0-zing_15.05.0.0-b16-product-azlinuxM-X86_64, mixed mode)

OracleJDK flags:

Baseline run:

-XX:+DisableExplicitGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintTenuringDistribution -XX:-UseBiasedLocking -Xmx4g -Xms4g

Disabled forced safepoints run:

-XX:+UnlockDiagnosticVMOptions -XX:GuaranteedSafepointInterval=600000 -XX:+DisableExplicitGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintTenuringDistribution -XX:-UseBiasedLocking -Xmx4g -Xms4g
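
GuaranteedSafepointInterval is specified in milliseconds, so the value above raises the interval to ten minutes. To check that the setting has taken effect inside a running HotSpot JVM, one option is to read the flag back through the HotSpotDiagnosticMXBean. The sketch below is an illustration only and was not part of the tests in this post; the class name SafepointFlagCheck and the helper readFlag are made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper: reads a HotSpot VM option from inside the running JVM.
final class SafepointFlagCheck {

    // Returns the current value of the named VM option, or empty if this JVM
    // does not report it (unknown flag, non-HotSpot JVM, or possibly a
    // diagnostic flag that has not been unlocked).
    static Optional<String> readFlag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            // Thrown when the option does not exist or the bean is unavailable.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("GuaranteedSafepointInterval = "
            + readFlag("GuaranteedSafepointInterval").orElse("<not visible>"));
    }
}
```

Running this with the same flags as the benchmark should print the configured interval if the option is visible to the management interface.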

Zing JVM flags:

-XX:-UseMetaTicks -XX:-UseTickProfiler -XX:GenPauselessNewThreads=2 -XX:GenPauselessOldThreads=2 -XX:+ConcurrentDeflation -XX:+UseRdtsc

These tests were run on a highly-tuned Linux system, utilising such marvellous techniques as CPU isolation, thread affinity, cache-friendly location, and other magic fairy dust.

An upcoming blog post will go into more detail on how to get OS scheduler jitter down to the low tens of microseconds.