NCG: Handle loops in register allocator
Fill in the missing parts in the register allocator so that it can
handle loops.
*) The register allocator now runs in the UniqSupply monad, as it needs
to be able to generate unique labels for fixup code blocks.
*) A few functions have been added to RegAllocInfo:
     mkRegRegMoveInstr -- generates a plain register-to-register move
     mkBranchInstr     -- used to be MachCodeGen.genBranch
     patchJump         -- changes the destination of a jump
*) The register allocator now makes sure that only one spill slot is used
for each temporary, even if it is spilled and reloaded several times.
This obviates the need for memory-to-memory moves in fixup code.
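
The one-slot-per-temporary rule can be illustrated with a minimal sketch.
This is not the allocator's actual code: Temp, Slot, SlotState and
spillSlotFor are hypothetical names chosen for the example, and the real
allocator tracks considerably more state.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins, not the types used in the NCG.
type Temp = Int   -- a temporary (virtual register)
type Slot = Int   -- index of a stack spill slot

data SlotState = SlotState
  { nextFree :: Slot               -- next unused spill slot
  , assigned :: Map.Map Temp Slot  -- slot already reserved for each temporary
  }

emptySlots :: SlotState
emptySlots = SlotState 0 Map.empty

-- Return the spill slot for a temporary. A fresh slot is allocated only
-- the first time the temporary is spilled; later spills reuse it.
spillSlotFor :: Temp -> SlotState -> (Slot, SlotState)
spillSlotFor t st =
  case Map.lookup t (assigned st) of
    Just s  -> (s, st)
    Nothing ->
      let s = nextFree st
      in  (s, SlotState (s + 1) (Map.insert t s (assigned st)))
```

Because a temporary always maps to the same slot, a temporary that is in
memory on both sides of a jump is already in the right place, so the fixup
code never has to copy it from one slot to another.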
LIMITATIONS:
*) The case where the fixup code needs to cyclically permute a group of
registers is currently unhandled. This will need more work once we come
across code where this actually happens.
*) Register allocation for code with loops is probably very inefficient
(both at compile-time and at run-time).
*) We still cannot compile the RTS via NCG, for various other reasons.
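
To make the first limitation concrete, here is a rough sketch of how a set
of fixup moves can be ordered, and where a cyclic permutation gets in the
way. It is an illustration only, not the code in this patch: Reg and
sequenceMoves are hypothetical names, and the sketch assumes that every
destination register appears at most once.

```haskell
import Data.List (partition)

type Reg = Int  -- hypothetical register number

-- Order a set of parallel moves (source, destination) so that no move
-- overwrites a register that a later move still has to read.
-- Returns Left with the leftover moves when no safe order exists; under
-- the distinct-destination assumption these moves form one or more cycles.
sequenceMoves :: [(Reg, Reg)] -> Either [(Reg, Reg)] [(Reg, Reg)]
sequenceMoves moves = go (filter (\(s, d) -> s /= d) moves)
  where
    go [] = Right []
    go pending =
      let sources          = map fst pending
          -- A move is safe once nothing pending reads its destination.
          isSafe (_, d)    = d `notElem` sources
          (ready, blocked) = partition isSafe pending
      in  case ready of
            [] -> Left pending               -- stuck: cyclic permutation
            _  -> (ready ++) <$> go blocked
```

For example, sequenceMoves [(1,2),(2,3)] yields Right [(2,3),(1,2)], while
sequenceMoves [(1,2),(2,1)] yields Left [(1,2),(2,1)]: swapping two
registers needs either a scratch register or an exchange instruction, which
is the case that remains unhandled.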