-Notes on new codegen (Sept 09)\r
+Notes on new codegen (Aug 10)\r
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r
\r
Things to do:\r
+ - We insert spills for variables before the stack check! This is the reason for\r
+ some fishy code in StgCmmHeap.entryHeapCheck where we are doing some strange\r
+ things to fix up the stack pointer before GC calls/jumps.\r
+\r
+ The reason spills are inserted before the sp check is that at the entry to a\r
+ function we always store the parameters passed in registers to local variables.\r
+ The spill pass simply inserts spills at variable definitions. We instead should\r
+ sink the spills so that we can avoid spilling them on branches that never\r
+ reload them.\r
+\r
+ This will fix the spill before stack check problem but only really as a side\r
+ effect. A 'real fix' probably requires making the spiller know about sp checks.\r
+\r
+ - There is some silly stuff happening with the Sp. We end up with code like:\r
+ Sp = Sp + 8; R1 = _vwf::I64; Sp = Sp -8\r
+ This seems to be caused by the issue above, but maybe an optimisation\r
+ pass is also needed?\r
+\r
+ - Proc points pass all arguments on the stack, adding more code and slowing\r
+ things down a lot. We either need to fix this or, even better, get rid of\r
+ proc points.\r
+\r
+ - CmmInfo.cmmToRawCmm uses Old.Cmm, so it is called after converting Cmm.Cmm to\r
+ Old.Cmm. We should abstract it to work on both representations; it only needs to\r
+ convert a CmmInfoTable to [CmmStatic].\r
+\r
+ - MkGraph currently uses different semantics for <*> than Hoopl. Maybe\r
+ we could convert the codeGen/StgCmm* clients to Hoopl's semantics?\r
+ It's all deeply unsatisfactory.\r
+\r
+ - Improve performance of Hoopl.\r
+\r
+ A nofib comparison of the -fasm vs -fnew-codegen compilation parameters\r
+ (using the same ghc-cmm branch + libraries compiled by the old code generator)\r
+ is at http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.oldghchoopl.txt\r
+ - the code produced is 10.9% slower, the compilation is 118% slower!\r
+\r
+ The same comparison with ghc-head with zip representation is at\r
+ http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.oldghczip.txt\r
+ - the code produced is 11.7% slower, the compilation is 78% slower.\r
+\r
+ When compiling nofib, ghc-cmm + libraries compiled with -fnew-codegen\r
+ is 23.7% slower (http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.hooplghcoldgen.txt).\r
+ When compiling nofib, ghc-head + libraries compiled with -fnew-codegen\r
+ is 31.4% slower (http://fox.auryn.cz/msrc/0517_hoopl/32bit.oldghcoldgen.zipghcoldgen.txt).\r
+\r
+ So we generate a bit better code, but it takes us longer!\r
+\r
+ - Are all blockToNodeList and blockOfNodeList really needed? Maybe we could\r
+ splice blocks instead?\r
+\r
+ In CmmContFlowOpt.blockConcat, using Dataflow seems too clumsy. Still,\r
+ a block catenation function would probably be nicer than the blockToNodeList\r
+ / blockOfNodeList combo.\r
+\r
+ - lowerSafeForeignCall seems too low-level. Just use Dataflow. After that\r
+ delete splitEntrySeq from HooplUtils.\r
+\r
+ - manifestSP seems to touch a lot of the graph representation. It is\r
+ also slow for CmmSwitch nodes: O(block_nodes * switch_statements).\r
+ Maybe rewrite manifestSP to use Dataflow?\r
+\r
+ - Sort out Label, LabelMap, LabelSet versus BlockId, BlockEnv, BlockSet\r
+ dichotomy. Mostly this means global replace, but we also need to make\r
+ Label an instance of Outputable (probably in the Outputable module).\r
+\r
+ - NB that CmmProcPoint line 283 has a hack that works around a GADT-related\r
+ bug in 6.10.\r
+\r
+ - SDM (2010-02-26) can we remove the Foreign constructor from Convention?\r
+ Reason: we never generate code for a function with the Foreign\r
+ calling convention, and the code for calling foreign calls is generated\r
+\r
+ - AsmCodeGen has a generic Cmm optimiser; move this into new pipeline\r
+\r
+ - AsmCodeGen has post-native-cg branch eliminator (shortCutBranches);\r
+ we ultimately want to share this with the Cmm branch eliminator.\r
+\r
+ - At the moment, references to global registers like Hp are "lowered" \r
+ late (in CgUtils.fixStgRegisters). We should do this early, in the\r
+ new native codegen, much in the way that we lower calling conventions.\r
+ Might need to be a bit sophisticated about aliasing.\r
+\r
+ - Question: currently we lift procpoints to become separate\r
+ CmmProcs. Do we still want to do this?\r
+ \r
+ NB: an advantage of continuing to do this is that\r
+ we can do common-proc elimination!\r
+\r
+ - Move to new Cmm rep:\r
+ * Make native CG consume New Cmm; \r
+ * Convert Old Cmm->New Cmm to keep old path alive\r
+ * Produce New Cmm when reading in .cmm files\r
+\r
- Consider module names\r
\r
- Top-level SRT threading is a bit ugly\r
\r
- See "CAFs" below; we want to totally refactor the way SRTs are calculated\r
\r
- - Change \r
- type CmmZ = GenCmm CmmStatic CmmInfo (CmmStackInfo, CmmGraph)\r
- to\r
- type CmmZ = GenCmm CmmStatic (CmmInfo, CmmStackInfo) CmmGraph\r
- -- And perhaps take opportunity to prune CmmInfo?\r
-\r
- - Clarify which fields of CmmInfo are still used\r
- - Maybe get rid of CmmFormals arg of CmmProc in all versions?\r
-\r
- - We aren't sure whether cmmToRawCmm is actively used by the new pipeline; check\r
- And what does CmmBuildInfoTables do?!\r
-\r
- - Nuke CmmZipUtil, move zipPreds into ZipCfg\r
-\r
- Pull out Areas into its own module\r
Parameterise AreaMap\r
Add ByteWidth = Int\r
-- rET_SMALL etc ==> CmmInfo\r
Check that there are no other imports from codeGen in cmm/\r
\r
+ - If you eliminate a label by branch chain elimination,\r
+ what happens if there's an Area associated with that label?\r
+\r
- Think about a non-flattened representation?\r
\r
- LastCall: \r
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/NewCodeGenPipeline\r
\r
\r
- - We believe that all of CmmProcPointZ.addProcPointProtocols is dead. What\r
+ - We believe that all of CmmProcPoint.addProcPointProtocols is dead. What\r
goes wrong if we simply never call it?\r
\r
- Something fishy in CmmStackLayout.hs\r
move the whole splitting game into the C back end *only*\r
(guided by the procpoint set)\r
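
The redundant stack-pointer pattern from the to-do list (Sp = Sp + 8; R1 = _vwf::I64; Sp = Sp - 8) could in principle be cleaned up by a small peephole pass. The following is only a toy sketch of that idea, not code from GHC: the Instr type, the "mentions Sp" flag, and sinkSp are all hypothetical names, and a real pass would have to work on Cmm nodes and account for calls and Sp-relative memory accesses.

```haskell
module Main where

-- Toy instruction type (an assumption for illustration; NOT GHC's CmmNode).
data Instr
  = AdjustSp Int        -- Sp = Sp + n (n may be negative)
  | Other String Bool   -- any other instruction; True if it reads or depends on Sp
  deriving (Eq, Show)

-- Accumulate Sp adjustments and sink them past instructions that do not
-- depend on Sp. Adjustments that cancel out disappear entirely.
sinkSp :: [Instr] -> [Instr]
sinkSp = go 0
  where
    -- Emit the pending adjustment, if there is one.
    flush :: Int -> [Instr]
    flush 0 = []
    flush n = [AdjustSp n]

    go :: Int -> [Instr] -> [Instr]
    go pending []                         = flush pending
    go pending (AdjustSp n : rest)        = go (pending + n) rest
    go pending (i@(Other _ True) : rest)  = flush pending ++ (i : go 0 rest)
    go pending (i@(Other _ False) : rest) = i : go pending rest

main :: IO ()
main = print (sinkSp [AdjustSp 8, Other "R1 = _vwf::I64" False, AdjustSp (-8)])
```

On the example from the notes the two adjustments cancel and only the assignment to R1 remains. An instruction flagged as depending on Sp forces the pending adjustment to be emitted before it, so the value of Sp seen by that instruction is unchanged.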
\r
- \r
----------------------------------------------------\r
Modules in cmm/\r
----------------------------------------------------\r
\r
--------- Dead stuff ------------\r
-CmmProcPoint Dead: Michael Adams\r
-CmmCPS Dead: Michael Adams\r
-CmmCPSGen.hs Dead: Michael Adams\r
-CmmBrokenBlock.hs Dead: Michael Adams\r
-CmmLive.hs Dead: Michael Adams\r
-CmmProcPoint.hs Dead: Michael Adams\r
-Dataflow.hs Dead: Michael Adams\r
-StackColor.hs Norman?\r
-StackPlacements.hs Norman?\r
-\r
+-------- Testing stuff ------------\r
HscMain.optionallyConvertAndOrCPS\r
testCmmConversion\r
-DynFlags: -fconvert-to-zipper-and-back, -frun-cps, -frun-cpsz\r
+DynFlags: -fconvert-to-zipper-and-back, -frun-cpsz\r
\r
-------- Moribund stuff ------------\r
+OldCmm.hs Definition of flowgraph of old representation\r
+OldCmmUtil.hs Utilities that operate mostly on CmmStmt\r
+OldPprCmm.hs Pretty print for CmmStmt, GenBasicBlock and ListGraph\r
CmmCvt.hs Conversion between old and new Cmm reps\r
CmmOpt.hs Hopefully-redundant optimiser\r
-CmmZipUtil.hs Only one function; move elsewhere\r
\r
-------- Stuff to keep ------------\r
-CmmCPSZ.hs Driver for new pipeline\r
+CmmCPS.hs Driver for new pipeline\r
\r
-CmmLiveZ.hs Liveness analysis, dead code elim\r
-CmmProcPointZ.hs Identifying and splitting out proc-points\r
+CmmLive.hs Liveness analysis, dead code elim\r
+CmmProcPoint.hs Identifying and splitting out proc-points\r
\r
CmmSpillReload.hs Save and restore across calls\r
\r
-CmmCommonBlockElimZ.hs Common block elim\r
+CmmCommonBlockElim.hs Common block elim\r
CmmContFlowOpt.hs Other optimisations (branch-chain, merging)\r
\r
CmmBuildInfoTables.hs New info-table \r
CmmStackLayout.hs and stack layout \r
CmmCallConv.hs\r
-CmmInfo.hs Defn of InfoTables, and conversion to exact layout\r
+CmmInfo.hs Defn of InfoTables, and conversion to exact byte layout\r
\r
---------- Cmm data types --------------\r
-ZipCfgCmmRep.hs Cmm instantiations of dataflow graph framework\r
-MkZipCfgCmm.hs Cmm instantiations of dataflow graph framework\r
+Cmm.hs Cmm instantiations of dataflow graph framework\r
+MkGraph.hs Interface for building Cmm for codeGen/Stg*.hs modules\r
+\r
+CmmDecl.hs Shared Cmm types of both representations\r
+CmmExpr.hs Type of Cmm expression\r
+CmmType.hs Type of Cmm types and their widths\r
+CmmMachOp.hs MachOp type and accompanying utilities\r
\r
-Cmm.hs Key module; a mix of old and new stuff\r
- so needs tidying up in due course\r
-CmmExpr.hs\r
CmmUtils.hs\r
CmmLint.hs\r
\r
PprC.hs Pretty print Cmm in C syntax\r
-PprCmm.hs Pretty printer for Cmm\r
-PprCmmZ.hs Additional stuff for zipper rep\r
-\r
-CLabel.hs CLabel\r
-\r
----------- Dataflow modules --------------\r
- Goal: separate library; for now, separate directory\r
-\r
-MkZipCfg.hs\r
-ZipCfg.hs\r
-ZipCfgExtras.hs\r
-ZipDataflow.hs\r
-CmmTx.hs Transactions\r
-OptimizationFuel.hs Fuel\r
-BlockId.hs BlockId, BlockEnv, BlockSet\r
-DFMonad.hs \r
+PprCmm.hs Pretty printer for CmmGraph.\r
+PprCmmDecl.hs Pretty printer for common Cmm types.\r
+PprCmmExpr.hs Pretty printer for Cmm expressions.\r
\r
+CLabel.hs CLabel\r
+BlockId.hs BlockId, BlockEnv, BlockSet\r
\r
----------------------------------------------------\r
Top-level structure\r
* HscMain.tryNewCodeGen\r
- STG->Cmm: StgCmm.codeGen (new codegen)\r
- Optimise: CmmContFlowOpt (simple optimisations, very self contained)\r
- - Cps convert: CmmCPSZ.protoCmmCPSZ \r
+ - Cps convert: CmmCPS.protoCmmCPS \r
- Optimise: CmmContFlowOpt again\r
- Convert: CmmCvt.cmmOfZgraph (convert to old rep) very self contained\r
\r
\r
\r
----------------------------------------------------\r
- CmmCPSZ.protoCmmCPSZ The new pipeline\r
+ CmmCPS.protoCmmCPS The new pipeline\r
----------------------------------------------------\r
\r
-CmmCPSZprotoCmmCPSZ:\r
+CmmCPS.protoCmmCPS:\r
1. Do cpsTop for each procedure separately\r
2. Build SRT representation; this spans multiple procedures\r
(unless split-objs)\r
\r
cpsTop:\r
- * CmmCommonBlockElimZ.elimCommonBlocks:\r
+ * CmmCommonBlockElim.elimCommonBlocks:\r
eliminate common blocks \r
\r
- * CmmProcPointZ.minimalProcPointSet\r
+ * CmmProcPoint.minimalProcPointSet\r
identify proc-points\r
no change to graph\r
\r
- * CmmProcPointZ.addProcPointProtocols\r
+ * CmmProcPoint.addProcPointProtocols\r
something to do with the MA optimisation\r
probably entirely unnecessary\r
\r
Manifest the stack pointer\r
\r
* Split into separate procedures\r
- - CmmProcPointZ.procPointAnalysis\r
+ - CmmProcPoint.procPointAnalysis\r
Given set of proc points, which blocks are reachable from each\r
Claim: too few proc-points => code duplication, but program still works??\r
\r
- - CmmProcPointZ.splitAtProcPoints\r
+ - CmmProcPoint.splitAtProcPoints\r
Using this info, split into separate procedures\r
\r
- CmmBuildInfoTables.setInfoTableStackMap\r
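
To make the "which blocks are reachable from each proc point" step concrete, here is a toy sketch over a plain successor map. It is not the CmmProcPoint implementation: Label, Graph, reachableFrom and procPointReach are hypothetical names, and stopping the traversal at other proc points is an assumption about the intended meaning of "reachable".

```haskell
module Main where

import qualified Data.Map as Map
import qualified Data.Set as Set

type Label = Int                    -- stand-in for BlockId (assumption)
type Graph = Map.Map Label [Label]  -- block -> successor blocks

-- Blocks reachable from one proc point without passing through
-- another proc point.
reachableFrom :: Graph -> Set.Set Label -> Label -> Set.Set Label
reachableFrom g procPoints start = go Set.empty [start]
  where
    go seen [] = seen
    go seen (l : ls)
      | l `Set.member` seen = go seen ls
      | otherwise           = go (Set.insert l seen) (next ++ ls)
      where
        next = [ s | s <- Map.findWithDefault [] l g
                   , not (s `Set.member` procPoints) ]

-- For every proc point, the set of blocks that would end up in its procedure.
procPointReach :: Graph -> Set.Set Label -> Map.Map Label (Set.Set Label)
procPointReach g pps =
  Map.fromList [ (p, reachableFrom g pps p) | p <- Set.toList pps ]

main :: IO ()
main = do
  let g   = Map.fromList [(1, [2, 3]), (2, [4]), (3, [4]), (4, [])]
      pps = Set.fromList [1, 3]
  mapM_ print (Map.toList (procPointReach g pps))
```

In this example block 4 is reachable from both proc point 1 and proc point 3. Splitting would therefore have to copy it into both procedures, which is the code duplication that the claim about too few proc points refers to.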
Figuring out proc-points\r
~~~~~~~~~~~~~~~~~~~~~~~~\r
Proc-points are identified by\r
-CmmProcPointZ.minimalProcPointSet/extendPPSet Although there isn't\r
+CmmProcPoint.minimalProcPointSet/extendPPSet Although there isn't\r
that much code, JD thinks that it could be done much more nicely using\r
a dominator analysis, using the Dataflow Engine.\r
\r
f's keep-alive refs to include h1.\r
\r
* The SRT info is the C_SRT field of Cmm.ClosureTypeInfo in a\r
- CmmInfoTable attached to each CmmProc. CmmCPSZ.toTops actually does\r
+ CmmInfoTable attached to each CmmProc. CmmCPS.toTops actually does\r
the attaching, right at the end of the pipeline. The C_SRT part\r
gives offsets within a single, shared table of closure pointers.\r
\r
Foreign calls\r
----------------------------------------------------\r
\r
-See Note [Foreign calls] in ZipCfgCmmRep! This explains that a safe\r
+See Note [Foreign calls] in CmmNode! This explains that a safe\r
foreign call must do this:\r
save thread state\r
push info table (on thread stack) to describe frame\r
Cmm representations\r
----------------------------------------------------\r
\r
-* Cmm.hs\r
+* CmmDecl.hs\r
The type [GenCmm d h g] represents a whole module, \r
** one list element per .o file **\r
Without SplitObjs, the list has exactly one element\r
\r
\r
-------------\r
-OLD BACK END representations (Cmm.hs): \r
+OLD BACK END representations (OldCmm.hs): \r
type Cmm = GenCmm CmmStatic CmmInfo (ListGraph CmmStmt)\r
-- A whole module\r
newtype ListGraph i = ListGraph [GenBasicBlock i]\r
\r
-------------\r
NEW BACK END representations \r
-* Not Cmm-specific at all\r
- ZipCfg.hs defines Graph, LGraph, FGraph,\r
- ZHead, ZTail, ZBlock ...\r
+* Uses Hoopl library, a zero-boot package\r
+* CmmNode defines a node of a flow graph.\r
+* Cmm defines CmmGraph, CmmTop, Cmm\r
+ - CmmGraph is a closed/closed graph + an entry node.\r
\r
- classes LastNode, HavingSuccessors\r
+ data CmmGraph = CmmGraph { g_entry :: BlockId\r
+ , g_graph :: Graph CmmNode C C }\r
\r
- MkZipCfg.hs: AGraph: building graphs\r
+ - CmmTop is a top level chunk, specialization of GenCmmTop from CmmDecl.hs\r
+ with CmmGraph as a flow graph.\r
+ - Cmm is a collection of CmmTops.\r
\r
-* ZipCfgCmmRep: instantiates ZipCfg for Cmm\r
- data Middle = ...CmmExpr...\r
- data Last = ...CmmExpr...\r
- type CmmGraph = Graph Middle Last\r
+ type Cmm = GenCmm CmmStatic CmmTopInfo CmmGraph\r
+ type CmmTop = GenCmmTop CmmStatic CmmTopInfo CmmGraph\r
\r
- type CmmZ = GenCmm CmmStatic CmmInfo (CmmStackInfo, CmmGraph)\r
- type CmmStackInfo = (ByteOff, Maybe ByteOff)\r
- -- (SP offset on entry, update frame space = SP offset on exit)\r
- -- The new codegen produces CmmZ, but once the stack is \r
- -- manifested we can drop that in favour of \r
- -- GenCmm CmmStatic CmmInfo CmmGraph\r
+ - CmmTop uses CmmTopInfo, which is a CmmInfoTable and CmmStackInfo\r
\r
- Inside a CmmProc:\r
- - CLabel: used\r
- - CmmInfo: partly used by NEW\r
- - CmmFormals: not used at all PERHAPS NOT EVEN BY OLD PIPELINE!\r
+ data CmmTopInfo = TopInfo {info_tbl :: CmmInfoTable, stack_info :: CmmStackInfo}\r
\r
-* MkZipCfgCmm.hs: smart constructors for ZipCfgCmmRep\r
- Depends on (a) MkZipCfg (Cmm-independent)\r
- (b) ZipCfgCmmRep (Cmm-specific)\r
+ - CmmStackInfo\r
\r
--------------\r
-* SHARED stuff\r
- CmmExpr.hs defines the Cmm expression types\r
- - CmmExpr, CmmReg, Width, CmmLit, LocalReg, GlobalReg\r
- - CmmType, Width etc (saparate module?)\r
- - MachOp (separate module?)\r
- - Area, AreaId etc (separate module?)\r
+ data CmmStackInfo = StackInfo {arg_space :: ByteOff, updfr_space :: Maybe ByteOff}\r
\r
- BlockId.hs defines BlockId, BlockEnv, BlockSet\r
+ * arg_space = SP offset on entry\r
+ * updfr_space = SP offset on exit\r
+ Once the stack is manifested, we could drop CmmStackInfo, i.e. get\r
+ GenCmm CmmStatic CmmInfoTable CmmGraph, but we do not do that currently.\r
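
The two records above can be tried in isolation. This is a minimal sketch under stated assumptions: ByteOff is taken to be Int, CmmInfoTable is replaced by a placeholder type, and dropStackInfo is a hypothetical helper showing what dropping CmmStackInfo after stack manifestation would amount to for the per-procedure info. It is not something the pipeline currently does.

```haskell
module Main where

type ByteOff = Int  -- assumption: byte offsets are plain Ints

-- Placeholder for the real info table type (assumption).
newtype CmmInfoTable = CmmInfoTable String
  deriving (Eq, Show)

data CmmStackInfo = StackInfo
  { arg_space   :: ByteOff        -- SP offset on entry
  , updfr_space :: Maybe ByteOff  -- SP offset on exit
  } deriving (Eq, Show)

data CmmTopInfo = TopInfo
  { info_tbl   :: CmmInfoTable
  , stack_info :: CmmStackInfo
  } deriving (Eq, Show)

-- Hypothetical: once the stack is manifested, only the info table is needed.
dropStackInfo :: CmmTopInfo -> CmmInfoTable
dropStackInfo = info_tbl

main :: IO ()
main = do
  let top = TopInfo (CmmInfoTable "f_info") (StackInfo 8 (Just 16))
  print (arg_space (stack_info top))
  print (dropStackInfo top)
```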
\r
--------------\r
\r
+* MkGraph.hs: smart constructors for Cmm.hs\r
+ Beware, the CmmAGraph defined here does not use AGraph from Hoopl,\r
+ as CmmAGraph can be open or closed at exit. See the notes in that module.\r
\r
-------------\r
-* Transactions indicate whether or not the result changes: CmmTx \r
- type Tx a = a -> TxRes a\r
- data TxRes a = TxRes ChangeFlag a\r
+* SHARED stuff\r
+ CmmDecl.hs - GenCmm and GenCmmTop types\r
+ CmmExpr.hs - defines the Cmm expression types\r
+ - CmmExpr, CmmReg, CmmLit, LocalReg, GlobalReg\r
+ - Area, AreaId etc (separate module?)\r
+ CmmType.hs - CmmType, Width etc (separate module?)\r
+ CmmMachOp.hs - MachOp and CallishMachOp types\r
+\r
+ BlockId.hs defines BlockId, BlockEnv, BlockSet\r
+-------------\r