Architectural stuff
New fields in the TSO:
- New global speculation-depth register: always counts the number of speculation frames
on the stack; incremented when starting speculation, decremented when finishing.
- Profiling stuff
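
A minimal sketch of the depth-counter discipline, in C. The names ToyTSO, spec_enter, spec_leave and is_speculating are hypothetical and stand in for whatever the real thread state object and entry/exit code look like; only the increment/decrement rule comes from the notes above.

```c
#include <assert.h>

/* Hypothetical stand-in for the thread state object; only the
   speculation-depth field discussed above is modelled. */
typedef struct {
    unsigned spec_depth;   /* number of speculation frames on the stack */
} ToyTSO;

/* Called when a speculation frame is pushed. */
static void spec_enter(ToyTSO *tso) {
    tso->spec_depth++;
}

/* Called when a speculation frame is popped; depth must be positive. */
static void spec_leave(ToyTSO *tso) {
    assert(tso->spec_depth > 0);
    tso->spec_depth--;
}

/* Non-zero exactly when at least one speculation frame is live. */
static int is_speculating(const ToyTSO *tso) {
    return tso->spec_depth != 0;
}
```

The point of keeping a count rather than a flag is that speculations nest: the thread is only back to normal evaluation once every frame has been popped.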
Speculation frames
The info table for a speculation frame points to the static spec-depth configuration
for that speculation point. (Points to, because the config is mutable, and the info
table has to be adjacent to the (immutable) code.)
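
The indirection can be pictured roughly as follows. This is a sketch under assumptions: the struct names and the depth_limit field are invented for illustration, and the real info-table layout is not described in these notes.

```c
/* Mutable per-speculation-point configuration; lives in writable data.
   The depth_limit field is a hypothetical example of its contents. */
typedef struct {
    int depth_limit;
} SpecConfig;

/* Immutable info table, laid out next to the (immutable) code.
   It cannot hold the config itself, so it holds a pointer to it. */
typedef struct {
    SpecConfig *const config;
} SpecInfoTable;

/* A speculation frame on the stack refers to its info table. */
typedef struct {
    const SpecInfoTable *info;
} SpecFrame;

/* Follow frame -> info table -> config to reach the mutable state. */
static SpecConfig *frame_config(const SpecFrame *f) {
    return f->info->config;
}
```

Because every frame for the same speculation point shares one info table, a write through frame_config is seen by all later speculations at that point.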
Abortion
Abortion is modelled by a special asynchronous exception ThreadAbort.
- In the scheduler, if a thread returns with ThreadBlocked and its SpecDepth is non-zero, send it
an asynchronous exception.
- In the implementation of the catch# primop, raise an asynchronous exception if
SpecDepth is non-zero.
- Timeout, administered by scheduler. Current story: abort if a speculation frame lasts from
one minor GC to the next. We detect this by seeing if there's a profiling frame on the stack --- a
profiling frame is added at a minor GC in place of a speculation frame (see Online Profiling).
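
The three triggers above can be summarised as a single predicate. This is only a sketch: ToyThread, ToyEvent and should_abort are hypothetical names, and the real checks live in different places (scheduler, catch# primop, GC) rather than in one function.

```c
#include <stdbool.h>

/* Hypothetical events at which abortion is considered. */
typedef enum {
    EV_RETURNED_BLOCKED,  /* thread returned to the scheduler with ThreadBlocked */
    EV_ENTERED_CATCH,     /* thread executed the catch# primop */
    EV_MINOR_GC           /* a minor GC is taking place */
} ToyEvent;

/* Hypothetical view of the thread state relevant to abortion. */
typedef struct {
    unsigned spec_depth;            /* SpecDepth */
    bool profiling_frame_on_stack;  /* left behind by the previous minor GC */
} ToyThread;

/* Returns true if the thread should be sent ThreadAbort. */
static bool should_abort(const ToyThread *t, ToyEvent ev) {
    switch (ev) {
    case EV_RETURNED_BLOCKED:
    case EV_ENTERED_CATCH:
        /* Blocking or catching is not allowed while speculating. */
        return t->spec_depth != 0;
    case EV_MINOR_GC:
        /* A profiling frame still present means a speculation has
           survived from one minor GC to the next: time out. */
        return t->profiling_frame_on_stack;
    }
    return false;
}
```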
When tearing frames off the stack, we start a new chunk at every speculation frame, as well as every
update frame. We proceed down to the deepest speculation frame.
The AP_STACK closure built for a speculation frame must be careful not to enter the
next AP_STACK closure up, because that would re-enter a possible loop.
Delivering an asynchronous exception to a thread that is speculating: the invariant is that there can be
no catch frames inside speculation (we abort in catch# when speculating). So the asynchronous exception just
tears off frames in the standard way until it gets to a catch frame, just as it would usually do.
Abortion can punish one or more of the speculation frames by decrementing their static config variables.
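
As a sketch of what punishment might look like: the ToySpecConfig type, its depth_limit field, and the choice to clamp at zero are all assumptions made for illustration. The notes only say that the static config variables are decremented.

```c
/* Hypothetical mutable config for one speculation point. */
typedef struct {
    int depth_limit;
} ToySpecConfig;

/* Punish each aborted speculation frame by decrementing the config it
   points to. Clamping at zero is an assumption, not from the notes. */
static void punish_frames(ToySpecConfig **configs, int n) {
    for (int i = 0; i < n; i++) {
        if (configs[i]->depth_limit > 0) {
            configs[i]->depth_limit--;
        }
    }
}
```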
Synchronous exceptions
Synchronous exceptions are treated much as before. The stack is discarded up to an update frame; the
thunk to be updated is overwritten with "raise x", and the process continues until a catch frame is reached.
When we find a spec frame, we allocate a "raise x" object, and resume execution with the return address
in the spec frame. In that way the spec frame is like a catch frame; it stops the unwinding process.
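
The unwinding rule can be sketched over a toy stack. Everything here (FrameKind, ToyFrame, unwind_sync, the flag standing in for overwriting a thunk) is hypothetical; the sketch only shows that update frames are passed through after poisoning their thunk, while spec frames and catch frames both stop the walk.

```c
typedef enum { F_ORDINARY, F_UPDATE, F_SPEC, F_CATCH } FrameKind;
typedef enum { STOP_NONE, STOP_AT_SPEC, STOP_AT_CATCH } StopReason;

typedef struct {
    FrameKind kind;
    int *thunk_raises;  /* F_UPDATE only: set to 1 to model "raise x" */
} ToyFrame;

/* Unwind from the top of the stack (index n-1) downwards.
   Returns the index of the frame that stops unwinding, or -1 if the
   whole stack was discarded; *why says which kind of frame stopped it. */
static int unwind_sync(ToyFrame *stack, int n, StopReason *why) {
    for (int i = n - 1; i >= 0; i--) {
        switch (stack[i].kind) {
        case F_UPDATE:
            /* Overwrite the thunk under evaluation with "raise x". */
            if (stack[i].thunk_raises) *stack[i].thunk_raises = 1;
            break;
        case F_SPEC:
            /* Resume at the spec frame's return address with a
               "raise x" object as the speculated value. */
            *why = STOP_AT_SPEC;
            return i;
        case F_CATCH:
            *why = STOP_AT_CATCH;
            return i;
        case F_ORDINARY:
            break;  /* simply discarded */
        }
    }
    *why = STOP_NONE;
    return -1;
}
```

Thunks whose update frames sit below the stopping frame are left untouched, which is what makes a failed speculation harmless to the enclosing computation.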
It's essential that every hard failure is caught, else speculation is unsafe. In particular, divide by zero
is hard to catch using OS support, so we test explicitly in library code. You can shoot yourself in the foot
by writing x `div#` 0, side-stepping the test.
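
For illustration, an explicit test of the kind meant here might look like the following in C. The DivResult type and checked_div are hypothetical names; the INT_MIN / -1 case is an extra check added because it is undefined behaviour in C, not something these notes discuss.

```c
#include <limits.h>

/* Result of a checked division: ok is 0 when the division was refused. */
typedef struct {
    int ok;
    int value;
} DivResult;

/* Test the divisor in software instead of relying on a hardware trap. */
static DivResult checked_div(int x, int y) {
    DivResult r = {0, 0};
    if (y == 0) {
        return r;  /* caller would raise a synchronous exception here */
    }
    if (x == INT_MIN && y == -1) {
        return r;  /* quotient does not fit in an int */
    }
    r.ok = 1;
    r.value = x / y;
    return r;
}
```

Calling the raw operation directly skips these tests, which is the foot-gun described above.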
Online profiling
Sampling can be more frequent than minor GC (by jiggling the end-of-block code) but cannot
be less frequent, because GC doesn't expect to see profiling frames.