In computing, a benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. Benchmarking in Java boils down to measuring how long some operation takes. To find out which of several approaches is faster, we can write a small benchmark program, often called a microbenchmark.
Measuring the performance of code in applications, frameworks, and tools is essential for developers. To do it properly, however, we need to understand how the JVM actually executes Java bytecode, including dynamic compilation and optimization. According to Brian Goetz, without understanding the dynamic compilation process, it is almost impossible to correctly write or interpret performance tests for Java classes. Even with that knowledge, writing a proper benchmark program is still difficult.
Dynamic compilation and optimization
Java lets us create "write once, run anywhere" applications, and the JVM is what makes this possible. Java source code is converted to bytecode at compile time, and that bytecode runs on the JVM.
javac is a static compiler. It converts Java source code into bytecode and does very little optimization. Java is dynamically compiled: the bytecode is compiled to machine code while the program is executing, not before. The JVM constantly tracks what is going on in a Java application and optimizes dynamically based on what it observes.
Just-in-time compilation
The JVM has two principal components: the execution engine and the runtime. The execution engine consists of two major components: the garbage collector and the JIT compiler.
JIT compilation is an adaptive optimization for methods that are proven to be performance-critical. It helps improve the performance of Java applications.
How does the JIT work? First, it identifies performance-critical methods by maintaining an invocation counter for each method. The counter starts at a threshold value (-XX:CompileThreshold) and is decremented each time the method is called.
Second, once the counter hits zero, the JIT is triggered and the method is compiled and optimized. In this way, code that is executed frequently, and therefore benefits most from optimization, gets optimized. No time is wasted on infrequently executed code.
What would happen if the JIT compiler compiled every method? When the JVM starts, there are many method calls, and compiling all of them up front would increase start-up time. Instead, HotSpot interprets the code at first and analyzes it as it runs to detect the critical hot spots in the program. It avoids spending effort on infrequently executed code (most of the program) and devotes more attention to the performance-critical parts. This hot spot monitoring continues as the program runs, so the JVM adapts its optimizations on the fly to how the program is actually used.
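The effect of this warm-up phase can usually be observed with a small experiment. The following is a minimal sketch, not a rigorous benchmark: the class and method names (WarmupDemo, sumOfSquares, timeBatch) are made up for illustration, and the batch sizes are arbitrary. It times the same method in repeated batches; on a typical HotSpot JVM the first batches tend to be slower than the later ones, but the exact numbers depend on the JVM, its flags, and the machine.

```java
// Hypothetical demo class; the names are illustrative and not from any library.
class WarmupDemo {

    // A small method that is called often enough to become "hot".
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Calls sumOfSquares() repeatedly and returns the elapsed time in nanoseconds.
    static long timeBatch(int calls, int n) {
        long sink = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated result so the work is not trivially discardable.
        if (sink == 42) {
            System.out.println("unexpected sink value");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Arbitrary sizes: 10 batches of 20,000 calls each.
        for (int batch = 1; batch <= 10; batch++) {
            long nanos = timeBatch(20_000, 100);
            System.out.println("batch " + batch + ": " + (nanos / 1_000) + " us");
        }
    }
}
```

Running the same program with the HotSpot flag -XX:+PrintCompilation additionally prints a line for each method as it is compiled, which makes it possible to relate the drop in batch time to the compilation of sumOfSquares().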
On-stack replacement (OSR)
In early versions of the JVM, hot spots were identified and compiled, but the running code was not replaced until the method exited and was re-entered. The compiled version was used only from the next invocation of the method. In some cases, such as when all the computation is done in a single invocation of a method, the compiled version was never used.
OSR is the solution to this. It lets the JVM switch from interpreted (unoptimized) code to compiled code in the middle of a running loop or method.
How does this work?
- The JVM starts executing some method for the first time ever, in the interpreter, e.g. main().
- That method has a long-running loop, which is now being executed in the interpreter.
- The interpreter figures out that the method is hot and triggers a normal compilation.
- That compilation will be used the NEXT time this method is called, but if the method is, e.g., main(), there is no next time.
- Eventually, the interpreter triggers an OSR compilation. The OSR compilation is specialized for the bytecode at which it will be entered, which is typically the loop back-edge branch.
- Eventually, the OSR compilation completes.
- At this point, the interpreter jumps to the OSR code when it crosses the specialized entry bytecode – in the middle of the method.
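The scenario in the steps above can be reproduced with a program whose work happens inside one long loop in a single method invocation. This is only a sketch: OsrDemo and countMultiplesOfThree are hypothetical names, and the loop bound is an arbitrary value chosen to keep the loop running long enough for the JVM to notice it.

```java
// Hypothetical demo class; the names are illustrative only.
class OsrDemo {

    // All of the work happens in ONE invocation of this method, so a
    // normally compiled version would never get a chance to run.
    static long countMultiplesOfThree(int limit) {
        long count = 0;
        for (int i = 0; i < limit; i++) { // the loop back-edge is the usual OSR entry point
            if (i % 3 == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The method is called exactly once; there is no "next invocation".
        long result = countMultiplesOfThree(200_000_000);
        System.out.println("multiples of three: " + result);
    }
}
```

When this is run on HotSpot with -XX:+PrintCompilation, OSR compilations are marked with a % character in the output, so a line containing % for countMultiplesOfThree would indicate that the loop was replaced while it was running.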
Virtual method invocation is an important optimization bottleneck. What is virtual method invocation? The JVM calls the method that belongs to the actual class of the object a variable refers to, not the method defined by the variable's declared type. These calls require dynamic dispatch, which makes them considerably more expensive than direct calls.
Virtual method invocation example:
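The original listing is not available here, so the following is a stand-in sketch with hypothetical classes (Animal, Dog, Cat). Every variable has the declared type Animal, yet the method that runs is chosen from the object's actual class at run time.

```java
// Hypothetical classes used only to illustrate dynamic dispatch.
class VirtualCallDemo {

    static String describe(Animal animal) {
        // The declared type is Animal, but the JVM dispatches on the
        // runtime class of the object that 'animal' refers to.
        return animal.sound();
    }

    public static void main(String[] args) {
        Animal[] animals = { new Dog(), new Cat(), new Animal() };
        for (Animal animal : animals) {
            System.out.println(describe(animal));
        }
        // prints "Woof", "Meow" and "..." on separate lines
    }
}

class Animal {
    String sound() {
        return "...";
    }
}

class Dog extends Animal {
    @Override
    String sound() {
        return "Woof";
    }
}

class Cat extends Animal {
    @Override
    String sound() {
        return "Meow";
    }
}
```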
Once the JVM has identified hot spots, it performs extensive method inlining together with other optimizations. The benefits are:
- It reduces the dynamic frequency of method invocations and saves the time those invocations would take.
- It produces much larger blocks of code for the optimizer to work on, which can enable even more optimizations.
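To make the idea concrete, the sketch below shows a loop as a programmer would write it and a hand-written version of roughly what the optimizer gets to work with once the call has been inlined. The names (InliningSketch, square, sumWithCalls, sumInlinedByHand) are hypothetical, and the second method is an illustration written by hand, not actual JIT output.

```java
// Hypothetical demo class; the "inlined" variant is written by hand for illustration.
class InliningSketch {

    static int square(int x) {
        return x * x;
    }

    // As written: one call to square() per loop iteration.
    static long sumWithCalls(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += square(i);
        }
        return total;
    }

    // Conceptually what remains after inlining: the call is gone and the
    // loop body is one larger block that further optimizations can see.
    static long sumInlinedByHand(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumWithCalls(1_000));     // prints 332833500
        System.out.println(sumInlinedByHand(1_000)); // prints 332833500
    }
}
```

Both methods compute the same value; the difference is only in how much work is spent on calling and returning, and in how much code the optimizer can analyze at once.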
Java can change the pattern of method invocation at runtime and load classes dynamically. Dynamic class loading significantly complicates method inlining. What is dynamic class loading? It allows an application to be compiled without all of its dependencies being present; the required classes can be loaded later. Typical examples are:
- JDBC drivers.
- Frameworks and containers.
If you see code that calls Class.forName(), classes are being loaded dynamically.
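A minimal sketch of such a call is shown below. The class DynamicLoadingDemo and the method newListByName are hypothetical; the example loads a List implementation from the standard library by name, so the concrete class is not known to the compiler when the code is compiled.

```java
import java.util.List;

// Hypothetical demo class; only the java.lang and java.util calls are real APIs.
class DynamicLoadingDemo {

    // Loads a List implementation whose name is only known at run time.
    @SuppressWarnings("unchecked")
    static List<String> newListByName(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            Object instance = clazz.getDeclaredConstructor().newInstance();
            return (List<String>) instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name could equally come from a configuration file.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        List<String> list = newListByName(name);
        list.add("loaded dynamically");
        System.out.println(list.getClass().getName() + " -> " + list);
    }
}
```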
Dynamically loaded classes can bring new code into the program. Methods that were inlined under the old assumptions may then need to be inlined again. To do that, the JVM has to deoptimize the affected code dynamically and optimize it again.
Consider the code fragment below:
Foo foo = getFoo();
foo.doSomething();
Before the JVM can decide how to optimize the call to doSomething() in the code above, it has to consider several things.
- Will getFoo() return an instance of Foo?
- Will getFoo() return an instance of a subclass of Foo?
- Is Foo a final class ?
- Is doSomething() a final method?
If no loaded class extends Foo, or doSomething() is a final method, the JVM knows that the call has only one possible target and can optimize based on this information. If a class that extends Foo is loaded dynamically later, the JVM detects this, discards the code that relied on the old assumption, and optimizes again.
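The situation can be sketched with a small program in two phases. Everything except Foo, getFoo(), and doSomething() is hypothetical here (SpecialFoo, DeoptSketch, the useSubclass switch, and the iteration counts). In the first phase only Foo objects reach the call site; in the second phase a subclass appears. Whether and when the JVM actually deoptimizes, and exactly when it loads SpecialFoo, are implementation details that the program's output does not reveal.

```java
// Sketch only: Foo, getFoo() and doSomething() follow the fragment above;
// SpecialFoo and DeoptSketch are hypothetical names added for illustration.
class DeoptSketch {

    // Hypothetical switch that decides which class getFoo() instantiates.
    static boolean useSubclass = false;

    static Foo getFoo() {
        return useSubclass ? new SpecialFoo() : new Foo();
    }

    static long run(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            Foo foo = getFoo();
            total += foo.doSomething(); // the call the JVM would like to inline
        }
        return total;
    }

    public static void main(String[] args) {
        // Phase 1: only Foo instances ever reach the call site.
        System.out.println(run(1_000_000)); // prints 1000000

        // Phase 2: a subclass shows up, so a "single target" assumption no longer holds.
        useSubclass = true;
        System.out.println(run(1_000_000)); // prints 2000000
    }
}

class Foo {
    int doSomething() {
        return 1;
    }
}

class SpecialFoo extends Foo {
    @Override
    int doSomething() {
        return 2;
    }
}
```

On HotSpot, running this with -XX:+PrintCompilation may show previously compiled methods being reported as "made not entrant" around the start of the second phase, which is how invalidated code shows up in that log.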
Java HotSpot Client & Server compiler
The HotSpot JVM has two compilers: client and server. The client compiler, traditionally the default on client-class machines, is optimized for low memory use and fast startup. Its optimizations are relatively simple and quick to perform. It focuses on local code quality and does very few global optimizations.
The server compiler is for long-running server applications. It is optimized for maximum peak operating speed. It performs optimizations such as dead code elimination, loop invariant hoisting, common subexpression elimination, constant propagation, global value numbering, global code motion, and null-check and range-check elimination. Compilation takes longer, but the compiled code runs faster.
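Dead code elimination in particular matters for benchmarking, because a benchmark loop whose result is never used is a candidate for removal. The sketch below contrasts the two cases; the class and method names (DeadCodeSketch, timeDiscarded, timeConsumed) and the iteration count are hypothetical, and whether a given JVM actually eliminates the first loop is not guaranteed.

```java
// Hypothetical demo class; illustrates a benchmarking pitfall, not a proper benchmark.
class DeadCodeSketch {

    // Holds the result of the last timeConsumed() run so the work stays observable.
    static double lastSum;

    static double compute(int i) {
        return Math.sqrt(i);
    }

    // The result of compute() is thrown away, so an optimizing compiler
    // is free to remove the work and leave an (almost) empty loop.
    static long timeDiscarded(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            compute(i);
        }
        return System.nanoTime() - start;
    }

    // The results are accumulated and stored in a field, so the
    // computation cannot simply be dropped.
    static long timeConsumed(int iterations) {
        long start = System.nanoTime();
        double sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += compute(i);
        }
        lastSum = sum;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 50_000_000; // arbitrary size
        System.out.println("discarded: " + timeDiscarded(iterations) / 1_000_000 + " ms");
        System.out.println("consumed:  " + timeConsumed(iterations) / 1_000_000 + " ms");
    }
}
```

If the first timing comes out far smaller than the second, the likely explanation is that the measured work was optimized away, not that the code is fast.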
We can select the suitable compiler with a command-line switch (-client or -server) when starting the JVM.
The HotSpot JVM supports several advanced optimization techniques in order to achieve high performance. Some were explained above and others were only mentioned. Further optimizations include:
- Fast instanceof/checkcast
- Range check elimination
- Loop unrolling
- Feedback-directed optimizations
We can discuss these at another time.