Instrumentation

Code instrumentation and profiling are so easy in a virtual machine environment like Jikes RVM! Recently, I have tried out three different kinds of instrumentation, using a slightly different technique in each case. Read on for more details:

Return Value Entropy

There are definitely patterns in return values! For example, consider the histogram below, which shows the frequencies of different integer values as return values, taken over all integer-returning methods in SPECjvm98. (Note that for clarity, frequencies saturate at 200 here; in fact, some go as high as 90,000!)

So, is it possible to predict sequences of return values for methods? Do some methods have "predictable" return values? Such methods are good candidates for return value prediction, which enables method-level speculative parallelism. I added code to the baseline compiler to print out return values, as part of the bytecode-to-assembly expansion of the ireturn operation. I also added code to the optimizing compiler to print out return values, as part of the BC2HIR transformation. I postprocessed these return value traces with a selection of Perl scripts to compute the information-theoretic entropy of each method's return value (treating a method's sequence of return values as a random variable, which is quite a generous assumption!). Assume that a method m can return one of n different values. The probability of the i-th value is p_i, estimated as the number of times the i-th value is returned divided by the total number of calls to method m. The entropy H is then computed by the following equation:
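
 H = - sum_{i=1..n} p_i * log2(p_i)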

The entropy is a measure of the predictability of a method's return values. A completely predictable method always returns the same value, so its entropy is 0. The worst-case entropy is log2(n), where the method is called n times and returns a different value each time.
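
As a concrete sketch (not the actual Perl postprocessing scripts mentioned above), the entropy of a recorded sequence of return values can be estimated along these lines in Java; the class and method names here are purely illustrative:

 import java.util.HashMap;
 import java.util.Map;

 public class ReturnValueEntropy {
     // Computes H = -sum(p_i * log2(p_i)) over the observed return values.
     static double entropy(int[] returnValues) {
         Map<Integer, Integer> counts = new HashMap<>();
         for (int v : returnValues) {
             counts.merge(v, 1, Integer::sum);
         }
         double n = returnValues.length;
         double h = 0.0;
         for (int c : counts.values()) {
             double p = c / n;                      // estimated probability p_i
             h -= p * (Math.log(p) / Math.log(2));  // accumulate -p_i * log2(p_i)
         }
         return h;
     }

     public static void main(String[] args) {
         // A constant return value gives entropy 0; n distinct values over
         // n calls give the worst case log2(n).
         System.out.println(entropy(new int[]{7, 7, 7, 7}));  // 0.0
         System.out.println(entropy(new int[]{1, 2, 3, 4}));  // ~2.0 = log2(4)
     }
 }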

For example, here is an extract of the entropies calculated for the _205_raytrace SPECjvm98 benchmark.

 ensemble entropy for method 8902 is 0.00 (n is 2)
 ensemble entropy for method 8859 is 0.00 (n is 443)
 ensemble entropy for method 8874 is 0.20 (n is 380)
 ensemble entropy for method 8860 is 0.00 (n is 1148)
 ensemble entropy for method 8673 is 0.00 (n is 1)
 ensemble entropy for method 8912 is 4.31 (n is 180)
 ensemble entropy for method 8847 is 8.97 (n is 4301)

Each method has a unique id number (actually the Jikes RVM compiled-method-id value). n is the number of times each method is called in a single execution of the benchmark. The entropy values indicate the predictability of each method's return value.

So far, this study has considered the entire set of return values over a complete run of a program. However, it is well known that many programs go through distinct phases. In some phases, a method may have predictable return values, whereas in other phases the same method may be entirely unpredictable. I measured this by keeping a sliding window of the last 10 values returned by each method. I computed the entropy within each window. As expected, this clearly shows that some methods have distinct phases of (un)predictability. Here is an example graph. The y axis measures the entropy of 10 consecutive return values in the sliding window, for a particular method. The x axis measures progress in time, which advances by one unit for each method invocation.

This graph exhibits at least three distinct phases in terms of the entropy (and hence the predictability) of return values for this method. This kind of entropy information may be useful in the context of an adaptive optimizing infrastructure for Jikes RVM: if we can dynamically monitor the return value entropy of each method, we can decide when a method is predictable and leverage appropriate optimizations such as speculative method-level parallelism.
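
A minimal sketch of that sliding-window measurement, assuming the window of the last 10 return values described above (the class and method names are illustrative, and it reuses the entropy() helper from the earlier sketch):

 import java.util.ArrayDeque;
 import java.util.Deque;

 public class WindowedEntropy {
     private static final int WINDOW = 10;
     private final Deque<Integer> window = new ArrayDeque<>();

     // Called once per return of the monitored method; prints the entropy of
     // the last WINDOW return values once the window has filled up.
     void record(int returnValue) {
         window.addLast(returnValue);
         if (window.size() > WINDOW) {
             window.removeFirst();
         }
         if (window.size() == WINDOW) {
             int[] values = window.stream().mapToInt(Integer::intValue).toArray();
             System.out.println(ReturnValueEntropy.entropy(values));
         }
     }
 }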

Memory Access Patterns

We wanted to find out whether programs are parallelizable by running them and observing their memory access patterns. We can check for dependences (which may prevent parallelization) offline, after the entire memory access trace has been recorded. If there are no problematic dependences, then the code (at least for that particular run) is parallelizable. At the moment, I am using the Soot bytecode analysis framework to instrument all heap memory accesses (object field and array accesses) and method calls/returns. The instrumentation is inserted statically; it has hooks into VM_Magic to convert object references into addresses that can be printed to stdout. At runtime, the instrumentation code runs inline with the user code and produces a stream of memory accesses. This is still work in progress.

For example, here is the trace output from some instrumented bytecode:

 memprof> Aw 1726600772 + 0
 memprof> Aw 1726600772 + 1
 memprof> Ar 1726600772 + 0
 memprof> Fw 1726600792 + x
 memprof> Ar 1726600772 + 0

A indicates Array access. F indicates Field access. r,w stand for read and write. The large number is the base address of the array/object in the heap. The small offset (integer or string) is the array index or field name whose value is being accessed.
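
To make the format concrete, here is a hypothetical sketch of the runtime helper that the Soot-inserted calls might target to produce lines like those above. The class and method names are purely illustrative; inside Jikes RVM the address conversion would go through VM_Magic, and identityHashCode is used here only so the sketch runs on a stock JVM:

 public class MemProf {
     // Stand-in for the VM_Magic object-to-address conversion mentioned above.
     private static long addressOf(Object base) {
         return System.identityHashCode(base);
     }

     public static void arrayRead(Object array, int index) {
         System.out.println("memprof> Ar " + addressOf(array) + " + " + index);
     }

     public static void arrayWrite(Object array, int index) {
         System.out.println("memprof> Aw " + addressOf(array) + " + " + index);
     }

     public static void fieldRead(Object obj, String field) {
         System.out.println("memprof> Fr " + addressOf(obj) + " + " + field);
     }

     public static void fieldWrite(Object obj, String field) {
         System.out.println("memprof> Fw " + addressOf(obj) + " + " + field);
     }

     public static void methodCall(String signature) {
         System.out.println("memprof> Mc " + signature);
     }

     public static void methodReturn(String signature) {
         System.out.println("memprof> Mr " + signature);
     }
 }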

Here is another example, from an instrumented version of the _205_raytrace benchmark:

 memprof> Mc <spec.benchmarks._205_raytrace.Point: float GetZ()>
 memprof> Fr -1876832116 + z
 memprof> Mr <spec.benchmarks._205_raytrace.Point: float GetZ()>
 memprof> Fw -1859055824 + z
 memprof> Mr <spec.benchmarks._205_raytrace.Point: spec.benchmarks._205_raytrace.Point 
  Combine(spec.benchmarks._205_raytrace.Point,spec.benchmarks._205_raytrace.Vector,float,float)>
 memprof> Mc <spec.benchmarks._205_raytrace.Point: float GetX()>
 memprof> Fr -1859055824 + x
 memprof> Mr <spec.benchmarks._205_raytrace.Point: float GetX()>

Mc indicates a method call; Mr indicates a method return. The object base addresses appear negative here because they seem to have overflowed, presumably being printed as signed 32-bit integers!
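
As a sketch of the offline dependence check mentioned earlier, the following scans a recorded trace in the format above and reports read-after-write dependences per (base address, offset) location. Splitting the trace into candidate parallel regions (for example at Mc/Mr boundaries) is omitted, and the file-based interface is an assumption:

 import java.io.BufferedReader;
 import java.io.FileReader;
 import java.util.HashMap;
 import java.util.Map;

 public class DependenceCheck {
     public static void main(String[] args) throws Exception {
         Map<String, Integer> lastWrite = new HashMap<>();  // location -> trace line of last write
         int lineNo = 0;
         try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
             String line;
             while ((line = in.readLine()) != null) {
                 lineNo++;
                 String[] tok = line.trim().split("\\s+");
                 // Array/field records look like: memprof> Aw <base> + <offset>
                 if (tok.length < 5 || !tok[0].equals("memprof>")) continue;
                 String kind = tok[1];
                 if (kind.startsWith("M")) continue;       // skip Mc/Mr records
                 String location = tok[2] + "+" + tok[4];  // base address + offset
                 if (kind.endsWith("w")) {
                     lastWrite.put(location, lineNo);
                 } else if (kind.endsWith("r") && lastWrite.containsKey(location)) {
                     System.out.println("RAW dependence on " + location
                         + ": write at trace line " + lastWrite.get(location)
                         + ", read at trace line " + lineNo);
                 }
             }
         }
     }
 }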

Instruction Counting

We wanted to discover how many instructions were being executed dynamically during a typical run of Jikes RVM. I added an extra VM_Magic method to read the cycle count from the TSC register (IA32) or from the simulator's profiling info (Jamaica). I inserted code to read the cycle count at the start and end of VM.boot(). My cycle counts for booting the OptAdaptive Jikes RVM, with all classes precompiled into the boot image, were:

 IA32:      223,338,990  (0.22 billion)
 Jamaica:   796,408,641  (0.80 billion)
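
In outline, the measurement looks something like the sketch below. readCycleCount() is a stand-in for the added VM_Magic method (which is not shown in the text), and System.nanoTime() is used only so the sketch runs outside Jikes RVM:

 public class BootCycles {
     // Stand-in for the VM_Magic call that reads the TSC register (IA32) or the
     // simulator's profiling counter (Jamaica).
     static long readCycleCount() {
         return System.nanoTime();
     }

     public static void main(String[] args) {
         long start = readCycleCount();
         // ... inside Jikes RVM, VM.boot() would run between the two reads ...
         long end = readCycleCount();
         System.out.println("cycles for boot: " + (end - start));
     }
 }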