Recent Changes - Search:

PmWiki

pmwiki.org

edit SideBar

ProjectProposals

Graph colouring register allocator

Graph colouring can produce better register allocation schemes, such as linear scan. This project will look to implement a graph colouring register allocator in the context of the Jikes RVM. It is expected this will benefit architectures with few registers such as IA32. The allocator will be written in Java as part of the Jikes RVM.

Related work

  • The Jikes RVM is a popular research virtual machine from IBM, amenable to extension and optimization. The Jamaica project has a successful track record in running useful Jikes RVM projects.

Goal

A working graph colouring register allocator with better runtime performance than a linear scan allocator, and as little runtime overhead as possible.


Web based Latex editor

Latex is a popular language for marking up academic theses and papers. The latex source is typically edited in a text editor. In many cases the editor features fewer spell checking abilities than a form spell checker in a web browser (such as Firefox's spell checker). In general there's a clear move away from stand alone applications such as latex, text editors and toward more integrated solution provided through a web site. Although wikis and the like can provide document creation abilities like latex, they typically lack most of latex's features, such as the packages for graphics and bibliographies. This project aims to wrap latex's rich features in a web site with greater power than a text editor.

Goal

The creation of a web site for the editing of latex source code. Latex written into the system should be compiled using latex to provide pdf output. The website should endeavor to be feature rich, for example allowing automatic saving of documents and giving editing assistance in the form of highlighting and tables of contents.


Java Operating System for PowerPC

Current Java Operating System research is focussed on readily available x86 systems. Whilst this is important, the PowerPC processor is present in many niche markets that are often early adopters of technology like Java Operating Systems. This project will look to port the JNode operating system, and the Jikes RVM with JNode operating system, to the PowerPC platform.

Related work

  • JNode is an open source Java OS released under the same licensing requirements as the Linux kernel.
  • The Jikes RVM is a popular research virtual machine from IBM, amenable to extension and optimization. The Jamaica project has a successful track record in running useful Jikes RVM projects.

Goal

A Java OS that will boot on a PowerPC or PowerPC system emulator.


Embedded JVM design

Many current JVMs are unsuited to embedded applications due to their size. Often performance has come first. This project will look at creating a version of the Jikes RVM for the embedded market. This will focus on libraries, choice of interpreter and compilers, how adaptive optimization can be better suited for embedded designs, and garbage collection design decisions.

Related work

  • The Jikes RVM is a popular research virtual machine from IBM, amenable to extension and optimization. The Jamaica project has a successful track record in running useful Jikes RVM projects.

Goal

A version of the Jikes RVM running in a reduced memory and performance foot print, but still incorporating important performance and memory optimizations.


Dynamically typed language virtual machine optimization

Modern dynamically typed languages such as Ruby and Perl, lack the kind of advanced virtual machine optimization present for statically typed languages, such as Java. An example of this is optimistic method inlining. Dynamically typed languages can have all components changed at runtime, however, this complexity is seldom exploited.

This project will extend the Parakeet project, a virtual machine executing dynamically typed parrot bytecodes building on the Jikes RVM, a JVM with an optimizing compiler written in Java.

Related work

  • Parakeet is a project developed within the Jamaica project last year.
  • The Jikes RVM is a popular research virtual machine from IBM, amenable to extension and optimization. The Jamaica project has a successful track record in running useful Jikes RVM projects.

Goal

An improved parrot bytecode execution engine, capable of new compiler optimizations, such as optimisitic optimizations.


Configurable JVM threading

Java threads are implementable in many ways on architectures and operating systems. For example, a Java thread may run and switch to another Java thread on a signal or an explicit instruction, this is the typical green threads approach to threading. This approach has advantages in simplifying garbage collector design. An alternate implementation is to map Java threads to underlying operating system threads. This has the advantage of being able to use all available processors and for the threading to be under the control of the OS kernel.

The purpose of this project is to implement a thread abstraction layer inside the Jikes RVM where the underlying threading configuration can be varied between operating system and virtual machine controlled threads.

This project will be co-supervised by developers in Silicon Valley specializing in bespoke hardware and software implementations.

Related work

  • There is much related work in commercial and opensource VMs. A successful candidate for this project will have to avoid exposure to existing VM threading designs to ensure the authenticity of their work and contribution back to opensource.
  • The Jikes RVM wiki contains details on the current threading model.

Goal

A configurable threading implementation that can be used to test the effect of threading models on JVM performance.


Bi-directional objects: an extension to a popular garbage collection toolkit

The memory management toolkit (MMTk) is a popular garbage collection framework used for a variety of different languages, such as Haskell and Java. In its current implementation, the expected object layout is with fields at ever increasing offsets from the beginning of the object. When scanning this object model, during garbage collection, an array of reference field offsets must be read to determine which fields of an object contain references. Whilst this object layout is popular, others have proposed object layouts that avoid the need for the offset array. They achieve this by grouping reference and numeric fields and growing the object, when a sub-class is added, by adding the new fields at either end of the object.

Related work

  • SableVM is a JVM with a bidirectional object layout. The paper "SableVM: A Research Framework for the Efficient Execution of Java Bytecode" describes how a bidirectional object layout works.
  • MMTk is part of the Jikes RVM project on SourceForge.
  • This paper by Dayong Gu, Clark Verbrugge, and Etienne M. Gagnon

Goal

A bi-directional object representation that can be successfully garbage collected by MMTk. It is expected that this will be tested by an implementation of the object model in the Jikes RVM.


A simple Java JIT

This project aims to rewrite a simple JIT compiler to include an optimisation called instruction folding. The existing JIT compiler maps Java bytecodes suchas:

  • iload 1 -- (2)
  • iconst 10 -- (1)
  • iadd -- (3)
  • istore 1 -- (2)

that perform an addition of 10 to local variable 1, to corresponding instructions that perform the JVM's operand stack in memory. The values in brackets represent the number of memory operations (loads or stores) required to perform these operations, the total is 8. However, adding 10 to a value in memory can be performed by a single load and store from/to memory. Instruction folding is an optimisation designed to solve this. Instead of having an operand stack in memory, this simple JIT compiler will emulate the stack at compile time. When a memory store operation is found (such as istore) then the stack will be inspected and the relavent instructions generated to hold values in registers.

It is proposed that this compiler should be a rewrite of the baseline compiler in the Jikes RVM.

Goal

A JVM that uses the newly written JIT compiler.

Java native compiler

The popularity of Java has led to the creation of many virtual machines. Java compilers that compile source or bytecode to machine code to be run as a normal executable file are relatively few in comparison. However, some runtime compilers offer the complexity of a static compiler. The aim of this project is to re-target an already written Java runtime compiler so that it creates code to be linked against and create a natively executable program that doesn't require a JVM.

The project will require the candidate first becoming familiar with the JVM used for this project, the Jikes RVM. Next the candidate will be required to adapt a harness that given a method name can compile it to machine code. The candidate will then have to take the machine code and use it to construct a machine code object file which will be the result of the compilation. The runtime system the object file can be linked against will be provided from the GCJ project which forms part of GCC. The final part of the project will be for the system to automatically discover what methods need compiling given a program name, and to use the appropriate parts of the system to create an executable.

Goal

A Java bytecode compiler that can compile Java programs into executable programs.

C compiler written in Java

Compilers are complex pieces of software that benefit from object-oriented design, strong typing and exception handling mechanisms. The C programming language is particularly popular. A C compiler written in Java would allow compiler ideas to be tried out with relative ease.

This project aims to marry a Java C parser with the Jikes RVM optimizing compiler so that simple C programs can be compiled into machine code. The resultant compiler can then work as a test bed for future compiler optimizations as well as proving an alternative to popular compilers such as GCC, and popular research compilers such as LCC (both of which are written in the C programming language with its inherent hazards).

Using the Jikes RVM infrastructure means that new optimzing compiler backends for the Jikes RVM (there are already backends for Intel and PowerPC) can be created and benefit immediately from a C compiler and virtual machine framework.

Related work

  • JavaCC is a Java parser generator that has an example C grammar.
  • SableCC is a Java parser generator that has 2 example C grammars.
  • The Jikes RVM has an optimizing compiler written in Java that operates upon a well defined intermediate form generating code arrays that can be dumped into object files.

Goal

A C compiler that can compile simple C programs into machine code.


Bayesian Compiler Optimization Planner

Bayesian probability is used to tell you which messages in your inbox are likely to be spam or not. You train the model by telling it what is spam, this then alters probabilites associated with words within the e-mail. The intermediate representation (IR) of a compiler can be thought of as a letter. We want to perform a set of compiler optimisations upon the IR, but we don't know what optimisations are good to choose at the different times within an adaptive compilation system. By creating a bayesian model of the compiler and then training it we can determine the compiler optimisations that have the highest probability of improving the compiled codes performance, and if we factor it into the network, the least compilation cost. We can then go from training the compiler against a set of benchmark programs, to using the trained data as an actual part of the production compiler.

Related work

Goal

An improved optimizing compiler, a bayesian network that describes it, a way of capturing information about compiler IR to predict useful compiler phases and insight into when compiler phases win you the most.


A Parallel Functional Language (SML) for a Chip Multiprocessor

Introduction

Due to their ability to express computation without the use of state, functional languages have long been considered as suitable vehicles for parallel programming where the co-ordination of shared state is a major complication. In spite of this, they have failed to gain acceptance partly due to reasons of efficiency and partly because the approach is unfamiliar to many programmers.

However, with thew now widely accepted view that multithreaded chip multiprocesors are set to become mainstream, there is renewed interest in the parallel functional approach.

The Jamaica project is investigating novel design approaches to both the hardware and software of chip multiprocessor systems. The current research platform is an instruction level simulator with a parallelising dynamic compilation virtual machine layer. The basis of the software system is a Java VM (Jikes) written in java.

MLJ is a compiler for Standard ML, written in ML, which outputs Java bytecodes and can thus enable ML programs to run on the Jamaica system. However, neither the MLJ system nor the current Jamaica parallelising VM would result in any parallelism being exploited. This project will investigate the issue of such parallelisation. This is likely to involve work both at the MLJ compiler level and at the Jikes VM level in order to devise design and implement the mechanisms to identify and execute potential parallelism.

Related Work

Goal

A system which will execute parallel functional programs on the Jamaica chip multiprocessor.


Exploration of Instruction and Hardware Choices (Graph Analysis)

Introduction

Hardware, at the gate level, can be seen as a graph. This graph is often known as the net list. Components within the hardware are typically designed to perform a higher level function. Higher order encapsulation sometimes reduces the possibility for lower level optimisation, it does however enable easier management of design.

Instructions within a compiler are often dealt with as trees with one result. However, SIMD architectures are capable of writing to multiple results.

Tree based algorithms (such as BURS) exist to produce all valid coverings of a tree. As a second phase the lowest cost covering is chosen. This project extends this idea so that it not only considers trees but also graphs. The project will produce a compiler compiler that reads in a specification of valid coverings. For all valid coverings an analysis will then be performed (to determine the most optimal covering) and then based on this productions used to prefered result.

Related Work

  • Proebsting developed the BURG BURS compiler compiler that is widely used
  • TWIG is another tree rewriting language
  • Graph rewriting considers (amongst other things) proving mathematically equations by performing series of rewrites
  • A similar system is considered here but the resultant compiler compiler was never released as open source, nor were the more generic uses of the compiler compiler considered.

Goal

A compiler compiler capable of producing a compiler that will process a graph and produce an equivalent graph with nodes merged, split and operations modified.


Superscalar Instruction Selection

Introduction

Instruction selection, instruction scheduling and register allocation are conventionally separate phases in a compiler. Instruction selection chooses which are the best instructions to execute based on a static costing (for example using the BURS algorithm). Instruction scheduling tries to improve the performance of the chosen instructions by performing local re-arranging to try to prevent pipeline stalls. Register allocation takes the temporary registers in use by the compiler this far and maps them on to architectural registers, and if there's not space, spills and fills the registers on to the stack.

For modern superscalar architectures making repeated use of a shared resources, such as floating point or memory units, can inhibit instruction parallelism. For this reason off-line compilers, such as GCC, make use of automata to simulate the processor pipeline and produce dynamically generated costings for instruction selections. These automata aren't just models of the architecture, but also take into account the likely overhead of spills and fills.

Related Work

Goal

The goal of this project is to improve the compiler backend for the Jikes RVM for a superscalar architecture (probably IA32). It is expected the new compiler backend would improve the overall performance of the Jikes RVM.


Fast ARM Emulator

Introduction

This project will investigate creating a new component for the JikesRVM that enables adaptive and optimizing compilation of ARM instructions. This will allow ARM codes to be moved to Java platforms, and the associated high performance will aid ARM prototype development.

Related Work

  • PearColator is a X86 and PowerPC DBT developed by the Jamaica project.
  • Komodo is the ARM debugger developed by the APT group.

Goal

The development of an ARM emulator capable of running Linux applications in the same way as PearColator runs X86 and PowerPC applications. Possible integration into existing ARM development environments, such as Komodo.


Graphical, multiprocessor, object-oriented, virtual machine debugger

Introduction

Modern CPUs and chip multiprocessors (multiple processing cores on a single chip) are complex parallel systems. Wave-form debugging or debugger (gdb) style debugging doesn't present the user of the processor simulator with information in the most intelligible form. Users would like to combine the wave form and gdb debugging with a graphical interface that would allow them to scratch away the surface of the simulated processor and take a look at what each of the individual elements within it are doing.

The aim of this project is to work with the Jamaica architectural simulator and debugger, written in Java, and work upon creating new visualisation tools.

Related Work

  • For the locally developed Balsa the GTK wave environment is used.
  • Dotty is used to calculate component placement in the Balsa visualization system with some limited Java implementations available.
  • There are a large number of ECAD visualization products available from companies such as Mentor graphics

Goal

To have a graphical visualization tool capable of being used to debug the next generations of Jamaica architectures. Also for use by virtual machine implementors wanting a fuller understanding of their dynamic systems and garbage collection.


Typed Instruction Set Design

Introduction

Superscalar performance allows the execution of more than one instruction per clock cycle. This can be compromised when the hardware suspects a data hazard exists (read-after-write, write-after-read, write-after-write).

Compilers are able to generate instructions sequences that are arranged to minimise pipeline stalls due to data hazards. However, the data hazards can be reduced further by the realisation that with a strongly typed language, such as Java, operations on different types of data can't have data hazards.

The aim of this project is to generate a simulator for an instruction set architecture (ISA) and evaluate the performance of having typed operations in the ISA. We aim to understand the trade-off between possibly bloating an ISA (with type information) and reducing data hazards.

Related Work

  • SimpleScalar is a free for non-commercial use super scalar simulator.
  • "Computer Architecture:A Quantative Approach" introduces DLX, an architecture and simulator that has superscalar implementations.
  • The Jamaica project has a number of in house simulation and fast binary translation tools.

Goals

A working simulator and Java translator with performance results showing the effect of having typed operations in the ISA.


Less/more - An Improved Pager

Introduction

Pager applications are used to look at large files without the cost of loading an editor. Current pager applications (such as less) support a number of search and line number facilities. However, their performance is slow in comparison to dedicated search tools (such as grep). A number of bottle-necks hinder their performace:

  • reliance on slow system calls (read instead of mmap)
  • expression matching optimised for regular expressions and use of outside libraries
  • linear searching as opposed to table based methods
  • no caching of search results

This project aims to re-write a fundamental UNIX tool for the benefit of the wider community.

Related Work

Goals

To implement a pager that improves performance compared to less when performing a sequence of operations on a large file (ideally both overall will reduce, realistically large savings can be made on user time). The release of the pager application as free software.


Seeds for other project ideas:

  • More work on JikesNODE to catch up with JNODE (ie get to full boot)
  • Jikes RVM and JX OS merger
  • Port JikesNODE to Jamaica
  • Port JikesNODE to Mac
  • Full machine Mac emulation (where did Mark Orchard get with this?)
  • Enhancing PearColator
  • New architectural ideas
  • New language ideas (in particular functional and parallel programming models)
Edit - History - Print - Recent Changes - Search
Page last modified on January 25, 2007, at 09:46 AM