
Thu16Mar06

Speculation and Transactional Memory subgroup meeting. IT413, 14:30, Thu 16 Mar 2006.

Present

Ian Watson, Behram Khan, Matt Horsnell, Mohammad Ansari, Gavin Brown, Jisheng Zhao, Andrew Dinn, Kostas Nikas, Ian Rogers, Jeremy Singer

Agenda

  • Discussion of the paper by Ananian and Rinard, "Efficient Object-Based Software Transactions". (pdf link)

Discussion

(iw) Start from the viewpoint that working in terms of objects is better than a flat memory-space. Thought so for a while, especially for transactions, then discovered this paper. A little confused about its philosophy. For instance, it seems to be saying that only one transactional reader is allowed at once.

(ad) Examine Figure 6 (version lists etc.). It is certainly possible to have multiple would-be readers while the race condition is being resolved.

(iw) The FLAG value is used to mark out data that has (effectively) been speculated on. A reader then has to chase up the version chain to find the true value and abort all other transactions in the chain.
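
For concreteness, here is a minimal Java sketch of that read path as we understood it. All of the names (TMObject, Version, the FLAG handling) are our own shorthand, not the paper's actual code.

    // Hypothetical sketch of the FLAG-based read path, not the paper's code.
    import java.util.concurrent.atomic.AtomicReference;

    class TMObject {
        // Sentinel marking a field whose true value lives in the version list.
        static final Object FLAG = new Object();

        Object field;                                       // the "flat" field slot
        final AtomicReference<Version> versions = new AtomicReference<>();

        static class Transaction {
            volatile boolean committed;
            void abort() { /* undo side-effects, mark as aborted */ }
        }

        static class Version {
            final Object value;          // value written by some transaction
            final Transaction owner;     // transaction that created this version
            final Version next;          // older versions further down the chain
            Version(Object value, Transaction owner, Version next) {
                this.value = value; this.owner = owner; this.next = next;
            }
        }

        // Non-transactional read: cheap in the common case, expensive on FLAG.
        Object read() {
            Object v = field;
            if (v != FLAG) {
                return v;                // fast path: no speculation on this field
            }
            // Slow path: chase up the version chain, aborting uncommitted owners.
            for (Version ver = versions.get(); ver != null; ver = ver.next) {
                if (ver.owner.committed) {
                    return ver.value;    // most recent committed value
                }
                ver.owner.abort();       // speculative writer loses the race
            }
            return null;                 // no committed version found
        }
    }

In this sketch the common case (no FLAG) is a single field load, which matches the stated aim of keeping non-transactional reads cheap.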

(ad) For full understanding, need to read the source code carefully, though it's quite tricky in places. The main aim seems to be to make non-transactional reads and writes efficient. Of course this has the effect of shifting complexity to the transaction handling code.

(iw) If cloned objects are formed using standard memory management mechanisms, then are they garbage-collected properly?

(ad) I presume so, since when transactions are aborted, corresponding versioned objects will be orphaned from the version list. Once they are no longer referenced, they are candidates for GC.

(gb) Can we establish the wider context of this paper?

(iw) Transactional memory is one scheme for implementing speculative parallelism. Code must be written in terms of transactional blocks. Programmers assume that these transactions always run to completion. There are mechanisms to deal with interference between transactions; such interference generally causes a transaction to abort, with all of its side-effects undone.
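
As a rough illustration of that programming model (our own stand-in names, not an API from the paper), an atomic block behaves like a retry loop around speculative execution:

    // Illustrative sketch of a transactional retry loop; Transaction,
    // AbortException and Atomic are our own stand-in names, not the paper's API.
    class AbortException extends RuntimeException {}

    class Transaction {
        static Transaction begin() { return new Transaction(); }
        boolean commit()  { return true; }  // stub: a real TM validates read/write sets here
        void rollback()   { }               // stub: a real TM undoes all side-effects here
    }

    interface TxBody { void run(Transaction tx); }

    class Atomic {
        // Programmers write the body as if it always runs to completion;
        // the runtime retries it until a conflict-free commit succeeds.
        static void atomic(TxBody body) {
            while (true) {
                Transaction tx = Transaction.begin();
                try {
                    body.run(tx);               // speculative execution
                    if (tx.commit()) return;    // side-effects become visible at once
                } catch (AbortException e) {
                    // interference with another transaction was detected
                }
                tx.rollback();                  // discard side-effects and retry
            }
        }
    }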

(ad) So data violations must be detected, transactions must be aborted and restarted.

(iw) The simplest scheme is TCC. When a transaction commits, it writes its speculative values back to main memory and broadcasts these changes to all other in-flight transactions, which are responsible for detecting violations and aborting. This is the big problem with a flat memory model: it is inefficient because you have to tell everyone about all the changes, even if they aren't interested. This paper seems interesting because it detects interference at the object level rather than at the memory level.
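
A crude software analogy of that commit step (TCC itself is a hardware proposal; everything named here is our own illustration) makes the broadcast cost visible:

    // Crude software analogy of a TCC-style commit broadcast (TCC itself is a
    // hardware scheme; all names here are our own illustration).
    import java.util.HashSet;
    import java.util.Set;
    import java.util.concurrent.CopyOnWriteArrayList;

    class TccTransaction {
        static final CopyOnWriteArrayList<TccTransaction> inFlight =
                new CopyOnWriteArrayList<>();

        final Set<Long> readSet = new HashSet<>();   // addresses read speculatively
        final Set<Long> writeSet = new HashSet<>();  // addresses written speculatively
        volatile boolean aborted;

        void commit() {
            // On commit, every other in-flight transaction is told about every
            // written address, whether or not it cares about that data.
            for (TccTransaction other : inFlight) {
                if (other == this) continue;
                for (long addr : writeSet) {
                    if (other.readSet.contains(addr)) {
                        other.aborted = true;    // violation: the reader must restart
                        break;
                    }
                }
            }
            inFlight.remove(this);               // writes are now "in main memory"
        }
    }

Every committer scans every in-flight transaction, regardless of whether they share any data, which is exactly the inefficiency being complained about.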

(ad) When an object is modified in a transaction, that information is stored with the object itself. Later transactions can easily abort / tell others to abort when they examine the object themselves.

(ad) The Load Linked / Store Conditional primitive is used to implement the transactional semantics. This allows transactions to monitor addresses to see whether someone else updates them.
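
Java exposes no LL/SC directly, so as a loose approximation of the idea (not the paper's mechanism) the following uses compare-and-set: the store only succeeds if the location is unchanged since it was loaded.

    // Loose Java approximation of Load Linked / Store Conditional using CAS:
    // the "store" succeeds only if nobody updated the location since the "load".
    import java.util.concurrent.atomic.AtomicReference;

    class MonitoredSlot<T> {
        private final AtomicReference<T> slot = new AtomicReference<>();

        // "Load linked": read the value and remember what was seen.
        T loadLinked() {
            return slot.get();
        }

        // "Store conditional": fails if another thread wrote in between,
        // which is how a transaction notices interference on this location.
        boolean storeConditional(T expected, T newValue) {
            return slot.compareAndSet(expected, newValue);
        }
    }

Unlike true LL/SC, CAS can miss an intervening write that restores the old value (the ABA problem), but the monitor-then-conditionally-update idea is the same.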

(iw) This is different to a hardware cache. There is no central log. The information is stored with the objects themselves.

(ad) This scheme should be efficient when there are no clashes, but it's not so clear how performance is affected by many conflicts.

(ir) How about having mushroom object caches in hardware? Perhaps it would be possible to put different object versions in the cache as pseudo-objects.

(Conversation then descended into confused banter about the exact behaviour of transactional readers as described in the paper.)

(ir) It might be possible to use a lazy memcpy to avoid the overhead of full object cloning for each transactional operation on an object. Similar to OS copy-on-write protection of shared pages between processes.

(ad) The issue is that transactions will always update any object fields they touch with a FLAG value, so the cloning has to be eager rather than lazy. One interesting point is the way they have special data structures to handle large objects and avoid complete clones.
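
A hypothetical sketch of that eager-cloning write path (again our own names and simplifications): the clone must exist before FLAG is planted, because afterwards the original slot no longer holds the real value.

    // Hypothetical sketch of the eager clone-on-transactional-write path
    // (our own shorthand, not the paper's data structures).
    class TxObject {
        static final Object FLAG = new Object();

        Object[] fields;        // original, "flat" field slots
        Object[] workingCopy;   // clone private to the writing transaction

        TxObject(int numFields) {
            fields = new Object[numFields];
        }

        // The first transactional write clones the whole object up front:
        // once FLAG is planted in a slot, the true value exists only in the
        // clone, so the copy cannot be deferred field by field.
        void txWrite(int fieldIndex, Object newValue) {
            if (workingCopy == null) {
                workingCopy = fields.clone();    // eager copy of every field
            }
            workingCopy[fieldIndex] = newValue;  // speculative update
            fields[fieldIndex] = FLAG;           // readers now take the slow path
        }

        // Commit makes the clone the canonical copy of the object.
        void commit() {
            fields = workingCopy;
            workingCopy = null;
        }
    }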

(iw) The proposed scheme seems to cause too many aborts, and is generally too complicated.

(ad) The idea of cloning objects for transactional modifications is good. We could use this. Also, it's nice that the hardware is optimized for the common (non-transactional) case.

(gb) Perhaps multiple readers are disallowed because they may eventually become writers?

(More confusion!)

(iw) One difficulty seems to be the lack of existing transactional benchmarks. No general-purpose programming languages support transactions directly.

(ad) There are many persistent extensions to Java that implement transactional behaviour! For example, a project at Glasgow Uni.

(ma) Working at the granularity of object-field accesses is probably as inefficient as the flat memory model(?)

(ad) Locking at the level of cache lines can induce false conflicts if several objects live on the same line. You don't get this problem with the object-based scheme.

(bk) Yes but there are proposed schemes that only allow one object per cache line, which also removes the problem of false conflicts!

(ad) An object cache seems to be the best way forward.

(iw) Not sure what to discuss next week. Any suggestions?

Page last modified on March 17, 2006, at 12:25 PM