X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fcomm%2Fthe-beast%2Fmangler.html;fp=docs%2Fcomm%2Fthe-beast%2Fmangler.html;h=1ad80f0d5c38e6aadaecbc4a415c5bd10c42badb;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2
diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html
new file mode 100644
index 0000000..1ad80f0
--- /dev/null
+++ b/docs/comm/the-beast/mangler.html
@@ -0,0 +1,79 @@
+
+
+
+ +
+ The Evil Mangler (EM) is a Perl script invoked by the Glorious Driver after the C compiler (gcc) has
+ translated the GHC-produced C code into assembly. Consequently, it is
+ only of interest if -fvia-C
is in effect (either explicitly
+ or implicitly).
+
+
+ The EM reads the assembly produced by gcc, re-arranges code blocks, and
+ nukes instructions that it considers non-essential. It
+ derives its evilness from its utterly ad hoc design and implementation,
+ which depend on the machine, the compiler, and whatnot. More precisely, the EM
+ performs the following tasks:
+
+ The EM is located in the Perl script ghc-asm.lprl.
+ The script reads the .s
file and chops it up into
+ chunks (that's how they are actually called in the script) that
+ roughly correspond to basic blocks. Each chunk is annotated with an
+ educated guess about what kind of code it contains (e.g., infotable,
+ fast entry point, slow entry point, etc.). The annotations also contain
+ the symbol introducing the chunk of assembly and whether that chunk has
+ already been processed or not.
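+
+ The chunking step described above can be sketched as follows. This is a
+ Python illustration, not the Perl of ghc-asm.lprl: the label syntax, the
+ suffix table (_info, _fast, _entry), and the Chunk record are assumptions
+ made for the example, and the real script uses its own patterns.

```python
import re
from dataclasses import dataclass, field

# A label at the start of a line, e.g. "foo_info:" (assumed label syntax).
LABEL_RE = re.compile(r"^([A-Za-z_.$][\w.$]*):")

# Hypothetical suffix-to-kind table used for the "educated guess".
KIND_BY_SUFFIX = [
    ("_info", "infotable"),
    ("_fast", "fast entry point"),
    ("_entry", "slow entry point"),
]


@dataclass
class Chunk:
    symbol: str                                 # symbol introducing the chunk
    kind: str                                   # guessed kind of code
    lines: list = field(default_factory=list)   # the assembly lines themselves
    processed: bool = False                     # has a later pass handled it?


def guess_kind(symbol):
    """Guess the kind of a chunk from the suffix of its introducing symbol."""
    for suffix, kind in KIND_BY_SUFFIX:
        if symbol.endswith(suffix):
            return kind
    return "unknown"


def split_into_chunks(asm_text):
    """Start a new chunk at every label; lines before the first label
    are collected in a 'preamble' chunk."""
    chunks = []
    current = Chunk(symbol="", kind="preamble")
    for line in asm_text.splitlines():
        m = LABEL_RE.match(line)
        if m:
            if current.lines:
                chunks.append(current)
            current = Chunk(symbol=m.group(1), kind=guess_kind(m.group(1)))
        current.lines.append(line)
    if current.lines:
        chunks.append(current)
    return chunks
```

+ Keeping the symbol, the guessed kind, and a processed flag on each chunk
+ mirrors the annotations mentioned above, so that later passes can reorder
+ or purge chunks without re-parsing the text.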
+
+ The parsing of the input into chunks as well as recognising assembly
+ instructions that are to be removed or altered is based on a large
+ number of Perl regular expressions sprinkled over the whole code. These
+ expressions are rather fragile as they heavily rely on the structure of
+ the generated code - in fact, they even rely on the right amount of
+ white space and thus on the formatting of the assembly.
+
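+ To illustrate the fragility, here is a small Python sketch. Both patterns
+ are hypothetical and are not taken from ghc-asm.lprl; the x86 instruction
+ is only an example. The strict pattern assumes exactly one tab before the
+ mnemonic and one tab before the operand, so it stops matching as soon as
+ the compiler formats the same instruction differently.

```python
import re

# Hypothetical whitespace-sensitive pattern: one tab, "jmp", one tab, operand.
STRICT_JMP = re.compile(r"^\tjmp\t\*%eax$")

# A more tolerant variant that accepts any run of spaces or tabs.
LOOSE_JMP = re.compile(r"^\s*jmp\s+\*%eax\s*$")


def matches(pattern, line):
    """Return True if the whole line matches the given pattern."""
    return pattern.match(line) is not None
```

+ With these definitions, matches(STRICT_JMP, "\tjmp *%eax") is False although
+ the instruction is unchanged, whereas LOOSE_JMP accepts both spellings.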
+ Afterwards, the chunks are reordered, some of them purged, and some
+ stripped of useless instructions. Moreover, some instructions are
+ manipulated (e.g., loads of fast entry points followed by indirect jumps
+ are replaced by direct jumps to the fast entry point).
+
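+ The jump rewrite mentioned above might look roughly like this in Python.
+ The sketch assumes x86 AT&T syntax ("movl $sym,%reg" followed by
+ "jmp *%reg") and assumes the register is not needed after the jump, so the
+ load can be dropped. Neither the patterns nor the function name come from
+ the real script.

```python
import re

# Assumed instruction shapes (x86, AT&T syntax).
LOAD_RE = re.compile(r"^\s*movl\s+\$(\w+),\s*(%\w+)\s*$")   # movl $sym,%reg
IJMP_RE = re.compile(r"^\s*jmp\s+\*(%\w+)\s*$")             # jmp *%reg


def direct_jumps(lines):
    """Replace a load of a symbol followed by an indirect jump through the
    same register with a single direct jump to that symbol."""
    out = []
    i = 0
    while i < len(lines):
        load = LOAD_RE.match(lines[i])
        jump = IJMP_RE.match(lines[i + 1]) if i + 1 < len(lines) else None
        if load and jump and load.group(2) == jump.group(1):
            out.append("\tjmp " + load.group(1))
            i += 2          # both instructions are consumed
        else:
            out.append(lines[i])
            i += 1
    return out
```

+ For example, the pair "movl $foo_fast1,%eax" / "jmp *%eax" becomes
+ "jmp foo_fast1", while a jump through a different register is left alone.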
+ The EM knows which part of the code belongs to function prologues and
+ epilogues as STG C adds tags of the
+ form --- BEGIN ---
and --- END ---
to the
+ assembly just before and after the code proper of a function.
+ It adds these tags using gcc's __asm__
feature.
+
+ Update: gcc 2.96 and later perform more aggressive basic
+ block re-ordering and dead code elimination. This seems to make the
+ whole --- END ---
tag business redundant -- in fact, if
+ proper code is generated, no --- END ---
tags survive gcc's
+ optimiser.
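+
+ As a rough Python sketch of how such tags can be used, the function below
+ splits the assembly lines of one function into prologue, body, and
+ epilogue. It assumes the tags appear verbatim on their own lines and at
+ most once per function, which may not hold for the real output. It also
+ copes with a missing --- END --- tag, as described in the update above.

```python
BEGIN_TAG = "--- BEGIN ---"
END_TAG = "--- END ---"


def split_function(lines):
    """Return (prologue, body, epilogue) for one function's assembly lines."""
    begin = next((i for i, l in enumerate(lines) if BEGIN_TAG in l), None)
    if begin is None:
        # No tag at all: treat everything as the code proper.
        return [], list(lines), []
    end = next(
        (i for i, l in enumerate(lines) if i > begin and END_TAG in l), None
    )
    if end is None:
        # The END tag did not survive the optimiser: no epilogue is found.
        return lines[:begin], lines[begin + 1:], []
    return lines[:begin], lines[begin + 1:end], lines[end + 1:]
```

+ The tag lines themselves are dropped from all three parts, since they
+ only serve as markers.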
+
+
+
+Last modified: Sun Feb 17 17:55:47 EST 2002
+
+
+
+