+ - We insert spills for variables before the stack check! This is the reason for\r
+ some fishy code in StgCmmHeap.entryHeapCheck where we are doing some strange\r
+ things to fix up the stack pointer before GC calls/jumps.\r
+\r
+ The reason spills are inserted before the sp check is that at the entry to a\r
+ function we always store the parameters passed in registers to local variables.\r
+ The spill pass simply inserts spills at variable definitions. We instead should\r
+ sink the spills so that we can avoid spilling them on branches that never\r
+ reload them.\r
+\r
+ This will fix the spill before stack check problem but only really as a side\r
+ effect. A 'real fix' probably requires making the spiller know about sp checks.\r
+\r
+ EZY: I don't understand this comment. David Terei, can you clarify?\r
+\r
+ - Proc points pass all arguments on the stack, adding more code and\r
+ slowing things down a lot. We need to either fix this or, even better,\r
+ get rid of proc points.\r
+\r
+ - CmmInfo.cmmToRawCmm uses Old.Cmm, so it is called after converting Cmm.Cmm to\r
+ Old.Cmm. We should abstract it to work on both representations; it only needs to\r
+ convert a CmmInfoTable to [CmmStatic].\r
+\r
+ - MkGraph currently uses different semantics for <*> than Hoopl. Maybe\r
+ we could convert the codeGen/StgCmm* clients to Hoopl's semantics?\r
+ It's all deeply unsatisfactory.\r
+\r
+ - Improve performance of Hoopl.\r
+\r
+ A nofib comparison of the -fasm vs -fnew-codegen compilation parameters\r
+ (using the same ghc-cmm branch + libraries compiled by the old code generator)\r
+ is at http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.oldghchoopl.txt\r
+ - the code produced is 10.9% slower, the compilation is +118% slower!\r
+\r
+ The same comparison with ghc-head with zip representation is at\r
+ http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.oldghczip.txt\r
+ - the code produced is 11.7% slower, the compilation is +78% slower.\r
+\r
+ When compiling nofib, ghc-cmm + libraries compiled with -fnew-codegen\r
+ is 23.7% slower (http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.hooplghcoldgen.txt).\r
+ When compiling nofib, ghc-head + libraries compiled with -fnew-codegen\r
+ is 31.4% slower (http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.zipghcoldgen.txt).\r
+\r
+ So we generate a bit better code, but it takes us longer!\r
+\r
+ EZY: Also importantly, Hoopl uses dramatically more memory than the\r
+ old code generator.\r
+\r
+ - Are all blockToNodeList and blockOfNodeList really needed? Maybe we could\r
+ splice blocks instead?\r
+\r
+ In CmmContFlowOpt.blockConcat, using Dataflow seems too clumsy. Still,\r
+ a block concatenation function would probably be nicer than the\r
+ blockToNodeList / blockOfNodeList combo.\r
+\r
+ - lowerSafeForeignCall seems too low-level. Just use Dataflow. After that\r
+ delete splitEntrySeq from HooplUtils.\r
+\r
+ - manifestSP seems to touch a lot of the graph representation. It is\r
+ also slow for CmmSwitch nodes: O(block_nodes * switch_statements).\r
+ Maybe rewrite manifestSP to use Dataflow?\r
+\r
+ - Sort out Label, LabelMap, LabelSet versus BlockId, BlockEnv, BlockSet\r
+ dichotomy. Mostly this means global replace, but we also need to make\r
+ Label an instance of Outputable (probably in the Outputable module).\r
+\r
+ - NB that CmmProcPoint line 283 has a hack that works around a GADT-related\r
+ bug in 6.10.\r
+\r
+ - SDM (2010-02-26) can we remove the Foreign constructor from Convention?\r
+ Reason: we never generate code for a function with the Foreign\r
+ calling convention, and the code for calling foreign calls is generated\r
+\r
+ - AsmCodeGen has a generic Cmm optimiser; move this into new pipeline\r
+ EZY (2011-04-16): The mini-inliner has been generalized and ported,\r
+ but the constant folding and other optimizations still need to be\r
+ ported.\r
+\r
+ - AsmCodeGen has post-native-cg branch eliminator (shortCutBranches);\r
+ we ultimately want to share this with the Cmm branch eliminator.\r
+\r
+ - At the moment, references to global registers like Hp are "lowered" \r
+ late (in CgUtils.fixStgRegisters). We should do this early, in the\r
+ new native codegen, much in the way that we lower calling conventions.\r
+ Might need to be a bit sophisticated about aliasing.\r
+\r
+ - Question: currently we lift procpoints to become separate\r
+ CmmProcs. Do we still want to do this?\r
+ \r
+ NB: an advantage of continuing to do this is that\r
+ we can do common-proc elimination!\r
+\r
+ - Move to new Cmm rep:\r
+ * Make native CG consume New Cmm; \r
+ * Convert Old Cmm->New Cmm to keep old path alive\r
+ * Produce New Cmm when reading in .cmm files\r
+\r