AhmedEl-Mahdy

I was a Research Associate, and am now lecturing at Alexandria University, but I'm still collaborating with the Jamaica project (whenever I'm not teaching!).

Upgrading Jikes

For the moment I'm interested in how adaptive compilation can be useful in parallelisation. I'm also interested in doacross-style parallelisation. I thought I would use this area of the Wiki to help arrange ideas, brainstorm new ones and summarise any progress I've made.

Some house-keeping experiments:

  • I've done an experiment to verify that the adaptive compilation system works.
  • I've profiled many applications from the DaCapo and JOlden benchmarks. I was hoping to find a method (or a loop) that is executed heavily, but I didn't. I've only found one benchmark (ps in the DaCapo suite) that spends a significant portion of its time in a loop. It seems to be the interpretation loop: I guess the loop invokes execute() methods on instruction-like objects, and those objects might recursively call execute() on sub-objects. A doacross might be beneficial (see the sketch after this list).
  • Earlier on I found a very easy DOALL loop in the antlr program (also in the DaCapo suite). But that loop in fact runs many iterations of the program on different input sets (it is the harness program), so I don't feel it is a representative loop!
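
As a reminder to myself of what a doacross-style parallelisation of such an interpretation loop might look like, here is a minimal Java sketch. The worker layout, the dependentPrefix/independentSuffix split and all names are made up for illustration; they are not taken from ps or from the JaVM sources.

 import java.util.concurrent.atomic.AtomicInteger;
 
 // Minimal doacross sketch: iterations start in order; the dependent
 // prefix of iteration i waits for iteration i-1 to finish its prefix,
 // then the (assumed independent) suffix overlaps with other iterations.
 public class DoacrossSketch {
     static final int ITERATIONS = 1000;
     static final int WORKERS = 4;
     static final AtomicInteger turn = new AtomicInteger(0); // next iteration allowed to run its prefix
 
     public static void main(String[] args) throws InterruptedException {
         Thread[] workers = new Thread[WORKERS];
         for (int w = 0; w < WORKERS; w++) {
             final int id = w;
             workers[w] = new Thread(() -> {
                 // Each worker takes iterations id, id+WORKERS, id+2*WORKERS, ...
                 for (int i = id; i < ITERATIONS; i += WORKERS) {
                     while (turn.get() != i) { Thread.yield(); } // wait for the cross-iteration dependence
                     dependentPrefix(i);       // must run in iteration order
                     turn.incrementAndGet();   // release the next iteration
                     independentSuffix(i);     // assumed independent; runs in parallel
                 }
             });
             workers[w].start();
         }
         for (Thread t : workers) t.join();
     }
 
     static void dependentPrefix(int i)   { /* e.g. fetch/decode the next instruction-like object */ }
     static void independentSuffix(int i) { /* e.g. the bulk of execute() for that object */ }
 }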

Work plan:

  • Run a simple parallel program in JaVM to learn how the system works!
  • Run a parallel version of ps or antlr (whichever is easier) and analyse its performance. I want to examine what hinders performance.

Measuring System Load

I'll try to summarise here a simple analytical method for measuring system load. The writing is still in progress; I'll email the group when there is something meaningful to read!

I agree with Chris and Ian Rogers about using the number of free tokens as a system load metric. It just occurred to me that there is a simple performance formula (Little's law) that might be useful in estimating the number of free tokens, or at least in justifying other methods that rely on token counting.

In summary, to get the average number of "free" tokens in an observation time interval T, we could:

  1. measure the average time between two successive visits of a token to a token interface unit (TIU);
  2. multiply that by the number of tokens received divided by T.

Or, as an approximation: if we set the observation time to be the time between two successive visits of the same token, we could say that:

 number of free tokens = number of tokens received between two visits of the same token to the same TIU.
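
A tiny worked example (the numbers are made up) to check that the two views agree:

 Say the mean time between two visits of the same token to a TIU is R = 40 cycles,
 and over an observation window of T = 200 cycles the TIU receives 25 tokens.

   arrival rate      X = 25 / 200 = 0.125 tokens/cycle
   mean free tokens  N = R * X = 40 * 0.125 = 5

 With the approximation (setting T = R = 40 cycles), about 0.125 * 40 = 5 tokens
 arrive between two visits of the same token, so the count of tokens seen since
 the last revisit is itself an estimate of N.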

To implement that we could have a 5-bit saturating register counting the number of tokens, and another 5-bit register holding a token's ID. The count is incremented every time a token arrives, until the same token is seen again. At that point the count is latched into a special register (possibly a control register). The next token seen is then stored into the ID register and the same procedure starts again.

If a token takes more than 32 cycles to arrive, the latch is reset to zero and the same procedure starts again.
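
A small Java model of that counting mechanism, just to check the logic. The class and method names are mine, and I count token arrivals rather than cycles, which matches the 32-cycle reset only if a token can arrive every cycle:

 // Software model of the proposed TIU counter: a 5-bit saturating count of
 // token arrivals and a 5-bit register holding the ID of the token we are
 // waiting to see again. When that token reappears the count is latched;
 // if the counter saturates at 32 first, the latch is reset to zero.
 public class FreeTokenCounter {
     private static final int MAX = 32;     // 5-bit saturating counter
     private int trackedId = -1;            // token ID register (-1 = empty)
     private int count = 0;                 // arrivals since trackedId was stored
     private int latched = 0;               // last completed measurement
 
     // Called once per token arrival at the TIU; returns the latched estimate.
     public int onTokenArrival(int tokenId) {
         if (trackedId == -1) {
             trackedId = tokenId;           // start tracking the first token seen
             count = 0;
         } else if (tokenId == trackedId) {
             latched = count;               // same token seen again: latch the count
             trackedId = -1;                // restart with the next token that arrives
         } else {
             count++;
             if (count >= MAX) {            // counter saturated: give up and reset
                 latched = 0;
                 trackedId = -1;
             }
         }
         return latched;
     }
 }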

Why does that work (or maybe not!)?

 
                    --------------                          
                   |              |                        
Arrivals     ----->| Black box    |------>  Departures      
                   |              |                        
                    --------------               

Little's law [1] states that:

For a black box system, under an observation time interval T, if the number of arrivals equals the number of departures, then:

Mean number in the black box = Mean response time * Arrival rate

That simple relation might be useful in measuring the number of free tokens in Jamaica. We could model the system as follows:

  • We would observe the token flow at any link (between two successive token interface units), and pretend to cut that link (in other words, observe either the input or the output link of a TIU). We thus arrive at the following figure:
 
                    --------------                          
                   |              |                        
Token in     ----->| Token pools  |------>  Token out      
                   |              |                        
                    --------------               

  • Let's assume that during an observation time interval T no token is acquired or released by a context.
  • Since tokens are circulating, we may say that #token_out = #token_in (approximately).

Thus we could apply Little's law by:

  1. recording the number of cycles between two sightings of the same token (R) and taking its mean, then dividing the number of tokens that arrived during the observation window by T, which gives X (the throughput);
  2. we would then calculate the number of free tokens (N) as N = R * X.
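
To make those two steps concrete, here is a rough Java sketch of how a simulator (rather than the hardware) could produce the estimate from a trace of token arrivals. The trace format, class and method names are invented for illustration:

 import java.util.HashMap;
 import java.util.Map;
 
 // Offline estimate of the mean number of free tokens at one TIU link from a
 // trace of (cycle, tokenId) arrival events, using N = R * X (Little's law).
 public class LittleEstimator {
     public static double estimateFreeTokens(long[] cycles, int[] tokenIds, long T) {
         Map<Integer, Long> lastSeen = new HashMap<>(); // tokenId -> cycle of previous visit
         double revisitSum = 0;                         // sum of revisit times (for R)
         long revisitCount = 0;
 
         for (int i = 0; i < cycles.length; i++) {
             Long prev = lastSeen.put(tokenIds[i], cycles[i]);
             if (prev != null) {                        // same token seen again
                 revisitSum += cycles[i] - prev;        // time between the two visits
                 revisitCount++;
             }
         }
         if (revisitCount == 0 || T == 0) return 0;
 
         double R = revisitSum / revisitCount;          // mean revisit (response) time
         double X = (double) cycles.length / T;         // arrival rate: tokens per cycle over window T
         return R * X;                                  // Little's law: mean number of free tokens
     }
 
     public static void main(String[] args) {
         // Toy trace: 4 tokens circulating, each revisiting after 20 cycles,
         // observed over a 40-cycle window; the estimate should come out as 4.
         long[] cycles   = { 0,  5, 10, 15, 20, 25, 30, 35 };
         int[]  tokenIds = { 1,  2,  3,  4,  1,  2,  3,  4 };
         System.out.println("Estimated free tokens: " + estimateFreeTokens(cycles, tokenIds, 40));
     }
 }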

Caveats:

  • If the rate at which contexts acquire/release tokens is significant compared to the token in/out rates, then we will be estimating N inaccurately (as that violates Little's law).
  • Also, if #token_in is significantly different from #token_out, then again we will be estimating N inaccurately.

References

  1. Raj Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, 1st Edition, Wiley, 1991, pages 513-515.