Sunday, 26 June 2016

JIT Fun Part 4: Intrinsics and Inlining

Definition: Intrinsic


Intrinsic methods in Java are methods for which the JVM has its own native implementation built in. They still have a Java version in the JDK, but most of the time the JIT does not compile that version: the call gets inlined and the intrinsic implementation is used instead.

Note that which methods are made intrinsic may depend on the platform.

Examples of Intrinsic methods include Math functions (sin, cos, min, max). The full list is here.

Example


Code:

We run this method with the following JVM flags:

-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining

In the output we see the following:

 @ 97   java.lang.Math::max (11 bytes)   (intrinsic)

Woohoo! It got inlined with the intrinsic code. 
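
The original listing is not shown above, so here is a minimal stand-in that should produce a similar line. The class name IntrinsicDemo and the helper maxOf are made up for illustration; the only thing taken from the post is that the code calls Math.max often enough for the JIT to compile the caller.

```java
public class IntrinsicDemo {

    // Hypothetical helper: returns the largest element, using Math.max on every step.
    static int maxOf(int[] values) {
        int best = Integer.MIN_VALUE;
        for (int value : values) {
            best = Math.max(best, value);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] values = new int[1_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = (i * 31) % 977;
        }
        long total = 0;
        // Call the helper many times so the JIT considers it hot and compiles it.
        for (int run = 0; run < 20_000; run++) {
            total += maxOf(values);
        }
        // Print the result so the work above is actually used.
        System.out.println(total);
    }
}
```

Run it with java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining IntrinsicDemo and look for a java.lang.Math::max line in the output. The bytecode index, and whether the call is reported as intrinsic, can differ between JVM versions and platforms.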


JIT Fun Part 3: -XX:+PrintCompilation

Ok enough guessing about these graphs. Let's look at what's really happening.

But first...

Compiler levels


Level  Compiler
0      Interpreter
1      C1, and destined to stay in C1 forever
2      C1, but only pays attention to loop/method counters
3      C1, but gathers details for C2, e.g. counters, percentage of time a conditional evaluates to true
4      C2

Paths through these levels:

0 -> 3 -> 4

  • This is the most common case. The method is initially sent to C1 level 3 after a fair few calls. Then, if it is called a lot or contains a loop with lots of iterations, it gets promoted to the super fast level 4 (= C2).

0 -> 3 -> 1

  • This happens if the method is really tiny. It gets sent to level 3 where it is analysed and we realise that it will never go to C2 so we stick it into C1 level 1 forever.


0 -> 2 -> 3 -> 4

  • If C2 is busy at the time of promotion from level 0 then we know we won't get promoted for a while so we go hang out in level 2 rather than hopping off to 3 directly. Then we may get promoted to 4 if we deserve it. 

Example


I've got the following code:

And the times (endTime - startTime) look like this:




I ran the code with -XX:+PrintCompilation and I'm using Java 8, so tiered compilation is enabled by default.

Let's see what is happening to the Rawr main method:

90
    164  171 %     3       com.ojha.Rawr::main91 @ 50 (131 bytes)

What does this mean?

i (the iteration of my for loop, which is on the x axis) = 90
timestamp (milliseconds since VM start) = 164
compilation id = 171 (roughly, this was the 171st compile task to be queued)
% indicates on-stack replacement (the method contains a loop)
3: This is the most important bit! This means we are moving into C1 level 3, i.e. doing the first compilation.
50 is the bytecode index of the loop.
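
To make the column layout concrete, here is a rough sketch of a parser for one line of this output. The class CompilationLine and its field names are my own invention, and the regular expression only covers the line shapes shown in this post (timestamp, compilation id, flags, level, method, optional "@ bci", size, optional trailing note), so treat it as an illustration rather than a complete parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompilationLine {

    // Assumed layout: timestamp, id, flags, level, method, optional "@ bci", "(N bytes)", optional note.
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+)\\s+(\\d+)\\s+([%s!bn ]*?)\\s*(\\d)\\s+(\\S+)"
        + "(?:\\s+@\\s+(-?\\d+))?\\s+\\((\\d+) bytes\\)(?:\\s+(.*))?$");

    final long timestamp;    // time since VM start
    final int compileId;     // sequence number of the compile task
    final String flags;      // e.g. "%" for OSR, "!" for exception handlers
    final int level;         // 0 to 4, as in the table of compiler levels
    final String method;
    final Integer osrBci;    // bytecode index after "@", or null if absent
    final int bytecodeSize;
    final String note;       // e.g. "made not entrant", or "" if absent

    private CompilationLine(Matcher m) {
        timestamp = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        flags = m.group(3).trim();
        level = Integer.parseInt(m.group(4));
        method = m.group(5);
        osrBci = m.group(6) == null ? null : Integer.valueOf(m.group(6));
        bytecodeSize = Integer.parseInt(m.group(7));
        note = m.group(8) == null ? "" : m.group(8).trim();
    }

    static CompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised line: " + line);
        }
        return new CompilationLine(m);
    }

    boolean isOsr() { return flags.contains("%"); }

    boolean hasExceptionHandler() { return flags.contains("!"); }

    public static void main(String[] args) {
        // Sample line in the same shape as the output discussed above.
        CompilationLine c = parse("    164  171 %     3       com.ojha.Rawr::main @ 50 (131 bytes)");
        System.out.println(c.method + " level=" + c.level + " osr=" + c.isOsr() + " bci=" + c.osrBci);
    }
}
```

Lines that are missing the leading timestamp and id columns will be rejected with an IllegalArgumentException, which is deliberate: the sketch does not try to guess fields that are not there.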

More output

159
    167  176       3       com.ojha.Rawr::main (131 bytes)

231
    169  172       3       java.io.PrintStream::ensureOpen (18 bytes)
    169  182 %     4       com.ojha.Rawr::main @ 50 (131 bytes)

At i = 231, the loop has run enough iterations for the Rawr main method to be compiled by C2 (that's what the 4 means). Note that at the same time java.io.PrintStream.ensureOpen has been called enough times to be compiled at C1 level 3.

1818
    3       com.ojha.Rawr::main @ -2 (131 bytes)   made not entrant

This is basically saying that the level 3 version of the Rawr main method shouldn't be used any more (correct, because we should now use the level 4 version). We can see this as a dip in the graph at x = 1818.

1999
    193  234   !   3       java.io.PrintStream::println (24 bytes)
    193  182 %     4       com.ojha.Rawr::main @ -2 (131 bytes)   made not entrant

At the very end, Rawr main is over, so the level 4 version of the compiled code is made not entrant (i.e. nothing should 'enter' or call that compiled code).

Note that the ! indicates that the method contains an exception handler (a try/catch block).

JIT Fun Part 2: Throwing the JIT a curve ball

This post follows on from the previous post about visualising JIT.

Let's start with a simple, silly main method:

Plotting the elapsed time in ns:




We can see that:

Up to about the 70th run it is running in the interpreter
Then it drops into C1 until 200
At 200 it optimises again into C2

At 600 it hits our curve ball and deoptimises the method.
Method eventually drops back in C2 at around 800.

Curve ball:

The compiler assumed that our String curveBall was always going to be null and built that assumption into the compiled code. However, when we set it to something other than null, the compiler realised that it had made a wrong assumption and had to deoptimise the compiled method.
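
The listing for this post is not shown above, so the following is only a guess at its shape. The class name CurveBall, the helper work and the loop sizes are assumptions; what comes from the post is a String field that stays null for the first 600 runs and is then set to something else.

```java
public class CurveBall {

    // Stays null for the first 600 runs, then gets a value (the curve ball).
    static String curveBall = null;

    // Hypothetical workload: the null check is never true at first,
    // so the JIT may compile the method as if that branch did not exist.
    static long work(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            if (curveBall != null) {
                sum += curveBall.length();
            }
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int run = 0; run < 1000; run++) {
            if (run == 600) {
                // Breaks the "always null" assumption the compiled code may rely on.
                curveBall = "surprise";
            }
            long start = System.nanoTime();
            long result = work(10_000);
            long elapsed = System.nanoTime() - start;
            // Print run number, elapsed ns and the result (so the work is used).
            System.out.println(run + " " + elapsed + " " + result);
        }
    }
}
```

Whether a given JVM actually speculates on the null check, and at which run it deoptimises, depends on the JVM version and flags, so the exact shape of the graph may not match the one described here.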

JIT Fun Part 1: Quick JIT visualisation of tiered compilation

See the following code:
It's not doing anything exciting, just running the same inner loop 1000 times.

I have taken those times and put them in a file called 'output.txt'.
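
Since the listing itself is missing here, this is a minimal sketch of what such a timing loop could look like. The names JitTimings, innerLoop and measure are hypothetical, the loop sizes are guesses, and writing straight to output.txt (one time per line) is my assumption about how the file was produced.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class JitTimings {

    // Hypothetical workload: sums 0..n-1 so the loop has a result to return.
    static long innerLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Times each of `runs` calls to innerLoop and returns the elapsed nanoseconds per call.
    static long[] measure(int runs, int n) {
        long[] times = new long[runs];
        long sink = 0;
        for (int run = 0; run < runs; run++) {
            long start = System.nanoTime();
            sink += innerLoop(n);
            times[run] = System.nanoTime() - start;
        }
        // Reads the accumulated result so the timed work is not trivially unused.
        if (sink < 0) {
            System.out.println(sink);
        }
        return times;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        for (long t : measure(1000, 10_000)) {
            lines.add(Long.toString(t));
        }
        // One elapsed time (in ns) per line, ready to be plotted.
        Files.write(Paths.get("output.txt"), lines);
    }
}
```

Timing single calls with System.nanoTime is noisy, which is fine for eyeballing the interpreter, C1 and C2 phases on a graph but not for serious benchmarking.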

See the following Python function, which reads in the numbers from output.txt and plots them on a graph:

Here is the graph that it produces:



At first the code is running in the interpreter, then the C1 JIT, then it settles into the C2 JIT. A couple of spikes probably indicate GC.




Scala with Cats: Answers to revision questions

I'm studying the 'Scala with Cats' book. I want the information to stick so I am applying a technique from 'Ultralearning...