+More notes (June 11)\r
+~~~~~~~~~~~~~~~~~~~~\r
+* Kill dead code assignArguments, argumentsSize in CmmCallConv.\r
+ Bake in ByteOff to ParamLocation and ArgumentFormat\r
+ CmmActuals -> [CmmActual]; similarly CmmFormals\r
+\r
+* Possible refactoring: Nuke AGraph in favour of \r
+ mkIfThenElse :: Expr -> Graph -> Graph -> FCode Graph\r
+ or even\r
+ mkIfThenElse :: HasUniques m => Expr -> Graph -> Graph -> m Graph\r
+ (Remember that the .cmm file parser must use this function)\r
+\r
+ or parameterise FCode over its envt; the CgState part seems useful for both\r
+\r
+* Move top and tail calls to runCmmContFlowOpts from HscMain to CmmCps.cpsTop\r
+ (and rename the latter!)\r
+\r
+* "Remove redundant reloads" in CmmSpillReload should be redundant; since\r
+ insertLateReloads is now gone, every reload is reloading a live variable.\r
+ Test and nuke.\r
+\r
+* Sink and inline S(RegSlot(x)) = e in precisely the same way that we\r
+ sink and inline x = e\r
+\r
+* Stack layout is very like register assignment: find non-conflicting assignments.\r
+ In particular we can use colouring or linear scan (etc).\r
+\r
+ We'd fine-grain interference (on a word by word basis) to get maximum overlap.\r
+ But that may make very big interference graphs. So linear scan might be\r
+ more attractive.\r
+\r
+ NB: linear scan does on-the-fly live range splitting.\r
+\r
+* When stubbing dead slots be careful not to write into an area that\r
+ overlaps with an area that's in use. So stubbing needs to *follow* \r
+ stack layout.\r
+\r
+\r
+More notes (May 11)\r
+~~~~~~~~~~~~~~~~~~~\r
+In CmmNode, consider splitting CmmCall into two: call and jump\r
+\r
Notes on new codegen (Aug 10)\r
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r
\r
This will fix the spill before stack check problem but only really as a side\r
effect. A 'real fix' probably requires making the spiller know about sp checks.\r
\r
- - There is some silly stuff happening with the Sp. We end up with code like:\r
- Sp = Sp + 8; R1 = _vwf::I64; Sp = Sp -8\r
- Seems to be perhaps caused by the issue above but also maybe a optimisation\r
- pass needed?\r
+ EZY: I don't understand this comment. David Terei, can you clarify?\r
\r
- - Proc pass all arguments on the stack, adding more code and slowing down things\r
- a lot. We either need to fix this or even better would be to get rid of\r
- proc points.\r
+ - Proc points pass all arguments on the stack, adding more code and\r
+ slowing down things a lot. We either need to fix this or even better\r
+ would be to get rid of proc points.\r
\r
- CmmInfo.cmmToRawCmm uses Old.Cmm, so it is called after converting Cmm.Cmm to\r
Old.Cmm. We should abstract it to work on both representations, it needs only to\r
we could convert codeGen/StgCmm* clients to the Hoopl's semantics?\r
It's all deeply unsatisfactory.\r
\r
- - Improve preformance of Hoopl.\r
+ - Improve performance of Hoopl.\r
\r
A nofib comparison of -fasm vs -fnewcodegen nofib compilation parameters\r
(using the same ghc-cmm branch +libraries compiled by the old codegenerator)\r
\r
So we generate a bit better code, but it takes us longer!\r
\r
+ EZY: Also importantly, Hoopl uses dramatically more memory than the\r
+ old code generator.\r
+\r
- Are all blockToNodeList and blockOfNodeList really needed? Maybe we could\r
splice blocks instead?\r
\r
a block catenation function would be probably nicer than blockToNodeList\r
/ blockOfNodeList combo.\r
\r
- - loweSafeForeignCall seems too lowlevel. Just use Dataflow. After that\r
+ - lowerSafeForeignCall seems too low-level. Just use Dataflow. After that\r
delete splitEntrySeq from HooplUtils.\r
\r
- manifestSP seems to touch a lot of the graph representation. It is\r
calling convention, and the code for calling foreign calls is generated\r
\r
- AsmCodeGen has a generic Cmm optimiser; move this into new pipeline\r
+ EZY (2011-04-16): The mini-inliner has been generalized and ported,\r
+ but the constant folding and other optimizations still need to be\r
+ ported.\r
\r
- AsmCodeGen has post-native-cg branch eliminator (shortCutBranches);\r
we ultimately want to share this with the Cmm branch eliminator.\r
- See "CAFs" below; we want to totally refactor the way SRTs are calculated\r
\r
- Pull out Areas into its own module\r
- Parameterise AreaMap\r
+ Parameterise AreaMap (note there are type synonyms in CmmStackLayout!)\r
Add ByteWidth = Int\r
type SubArea = (Area, ByteOff, ByteWidth) \r
ByteOff should not be defined in SMRep -- that is too high up the hierarchy\r
insert spills/reloads across \r
LastCalls, and \r
Branches to proc-points\r
- Now sink those reloads:\r
- - CmmSpillReload.insertLateReloads\r
+ Now sink those reloads (and other instructions):\r
+ - CmmSpillReload.rewriteAssignments\r
- CmmSpillReload.removeDeadAssignmentsAndReloads\r
\r
* CmmStackLayout.stubSlotsOnDeath\r
never pass variables to join points via arguments.)\r
\r
Furthermore, there is *no way* to pass q to J in a register (other\r
-than a paramter register).\r
+than a parameter register).\r
\r
What we want is to do register allocation across the whole caboodle.\r
Then we could drop all the code that deals with the above awkward\r