Optimise writing out the .s file
I noticed while working on the new IO library that GHC was writing out
the .s file in lots of little chunks. This turns out to be a result
of using multiple printDocs to avoid space leaks in the NCG, where
each printDoc finishes with an hFlush.
Worse, this makes poor use of the optimisation inside printDoc that
uses its own buffering to avoid hitting the Handle all the time.
So I hacked around this by making the buffering optimisation inside
Pretty visible from the outside, for use in the NCG. The changes are
quite small.
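
As a rough illustration of the difference (not the actual Pretty or NCG
code), the sketch below contrasts one write per chunk with a single
buffer shared across chunks. All names here (Sink, Buf, bufPut,
bufFlush, writeEachChunk) are hypothetical, the Sink merely stands in
for a Handle by recording each write, and the buffer is unbounded for
simplicity, whereas a real one would presumably be fixed-size and flush
when full.

```haskell
import Data.IORef

-- Hypothetical stand-in for a Handle: records every write it receives
-- (most recent first), so the number of writes can be counted.
type Sink = IORef [String]

sinkWrite :: Sink -> String -> IO ()
sinkWrite sink s = modifyIORef sink (s :)

-- Old pattern: every chunk is pushed to the sink on its own,
-- analogous to one printDoc (plus flush) per chunk.
writeEachChunk :: Sink -> [String] -> IO ()
writeEachChunk sink = mapM_ (sinkWrite sink)

-- New pattern: a buffer that is visible to the caller and shared
-- across chunks. Chunks are held in reverse order until flushed.
newtype Buf = Buf (IORef [String])

newBuf :: IO Buf
newBuf = Buf <$> newIORef []

bufPut :: Buf -> String -> IO ()
bufPut (Buf ref) s = modifyIORef ref (s :)

-- Emit everything accumulated so far as a single write, then reset.
bufFlush :: Buf -> Sink -> IO ()
bufFlush (Buf ref) sink = do
  chunks <- readIORef ref
  writeIORef ref []
  sinkWrite sink (concat (reverse chunks))

main :: IO ()
main = do
  let chunks = ["\t.text\n", "foo:\n", "\tret\n"]
  perChunk <- newIORef []
  writeEachChunk perChunk chunks
  shared <- newIORef []
  buf <- newBuf
  mapM_ (bufPut buf) chunks
  bufFlush buf shared
  n1 <- length <$> readIORef perChunk
  n2 <- length <$> readIORef shared
  putStrLn ("per-chunk writes: " ++ show n1
            ++ ", shared-buffer writes: " ++ show n2)
```

The contents reaching the sink are the same in both cases; only the
number of writes differs.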