X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fcomm%2Fthe-beast%2Fsyntax.html;fp=docs%2Fcomm%2Fthe-beast%2Fsyntax.html;h=be5bbefa17a7c8bd1c6a09b8518bfd1270dd6725;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2 diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html new file mode 100644 index 0000000..be5bbef --- /dev/null +++ b/docs/comm/the-beast/syntax.html @@ -0,0 +1,99 @@ + + + + + The GHC Commentary - Just Syntax + + + +

The GHC Commentary - Just Syntax

+

+ The lexical and syntactic analysers for Haskell programs are located in fptools/ghc/compiler/parser/.

+ +

The Lexer

+

+ The lexer is a rather tedious piece of Haskell code contained in the module Lex. Its complexity stems partly from covering, in addition to Haskell 98, the whole range of GHC language extensions, and partly from its ability to analyse interface files as well as normal Haskell source. The lexer defines a parser monad P a, where a is the type of the result expected from a successful parse. More precisely, a result of type

+data ParseResult a = POk PState a
+		   | PFailed Message
+
+

+ is produced, where Message comes from ErrUtils (and is currently simply a synonym for SDoc).
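To make the shape of the parser monad concrete, here is a minimal sketch built around the ParseResult type above. It is not the code from Lex: the fields of PState, the String stand-in for Message, and the helpers nextChar and runP are assumptions made purely for illustration.

```haskell
-- Hypothetical stand-ins; the real PState and Message are richer.
type Message = String

data PState = PState
  { psLine   :: Int     -- current source line
  , psBuffer :: String  -- remaining input
  }

data ParseResult a
  = POk PState a
  | PFailed Message

-- A parser maps an incoming state to a ParseResult.
newtype P a = P { unP :: PState -> ParseResult a }

instance Functor P where
  fmap f (P m) = P $ \s -> case m s of
    POk s' a    -> POk s' (f a)
    PFailed msg -> PFailed msg

instance Applicative P where
  pure a = P $ \s -> POk s a
  P mf <*> P ma = P $ \s -> case mf s of
    PFailed msg -> PFailed msg
    POk s' f    -> case ma s' of
      POk s'' a   -> POk s'' (f a)
      PFailed msg -> PFailed msg

instance Monad P where
  P m >>= k = P $ \s -> case m s of
    POk s' a    -> unP (k a) s'   -- thread the updated state onwards
    PFailed msg -> PFailed msg    -- the first failure aborts the parse

-- Hypothetical primitive: consume one character, tracking line numbers.
nextChar :: P Char
nextChar = P $ \s -> case psBuffer s of
  []       -> PFailed "unexpected end of input"
  (c : cs) ->
    let line' = psLine s + (if c == '\n' then 1 else 0)
    in  POk (s { psLine = line', psBuffer = cs }) c

-- Hypothetical driver: run a parser on a string, starting at line 1.
runP :: P a -> String -> Either Message a
runP (P m) input = case m (PState 1 input) of
  POk _ a     -> Right a
  PFailed msg -> Left msg
```

With these definitions, runP ((,) <$> nextChar <*> nextChar) "ab" yields both characters, while running nextChar on the empty string produces a PFailed result that runP reports as a Left.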

+ The record type PState contains information such as the current source location, buffer state, contexts for layout processing, and whether Glasgow extensions are accepted (either due to -fglasgow-exts or due to reading an interface file). Most of the fields of PState store unboxed values; in fact, even the flag indicating whether Glasgow extensions are enabled is represented by an unboxed integer instead of by a Bool. My (= chak's) guess is that this is to avoid having to perform a case on a boxed value in the inner loop of the lexer.
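The following sketch shows what an unboxed integer flag looks like in GHC Haskell. The record FlagState, its field names, and the keyword lists are hypothetical and much smaller than the real PState; the sketch only illustrates the representation and makes no claim about its performance.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int#)

-- Hypothetical, much-reduced lexer state with unboxed fields.
data FlagState = FlagState
  { glaExts :: Int#   -- 0# = extensions off, anything else = on
  , column  :: Int#   -- current column, also unboxed
  }

mkState :: Bool -> FlagState
mkState True  = FlagState 1# 0#
mkState False = FlagState 0# 0#

-- The case scrutinises a raw machine integer, so the flag itself
-- is never a thunk or a pointer to a heap-allocated Bool.
classify :: FlagState -> String -> String
classify s word = case glaExts s of
  0# -> pick haskell98
  _  -> pick (haskell98 ++ extensions)
  where
    pick kws   = if word `elem` kws then "keyword" else "identifier"
    haskell98  = ["let", "in", "where"]   -- illustrative subset only
    extensions = ["forall"]               -- illustrative subset only
```

Here classify (mkState True) "forall" treats the word as a keyword, whereas with mkState False it is an ordinary identifier.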

+ The same lexer is used by the Haskell source parser, the Haskell interface parser, and the package configuration parser.

The Haskell Source Parser

+

+ The parser for Haskell source files is defined in the form of a parser specification for the parser generator Happy in the file Parser.y. The parser exports three entry points for parsing entire modules (parseModule), individual statements (parseStmt), and individual identifiers (parseIdentifier), respectively. The last two are needed for GHCi. All three require a parser state (of type PState) and are invoked from HscMain.

+ Parsing of Haskell is a rather involved process. The most challenging features are probably the treatment of layout and expressions that contain infix operators. The latter may be user-defined and so are not easily captured in a static syntax specification. Infix operators may also appear in patterns, for example on the left-hand sides of value definitions, and so GHC's parser treats patterns in the same way as expressions. In other words, as general expressions are a syntactic superset of patterns - ok, they nearly are - the parser simply attempts to parse a general expression in such positions. Afterwards, the generated parse tree is inspected to ensure that the accepted phrase indeed forms a legal pattern. This and similar checks are performed by the routines from ParseUtil. In some cases, these routines not only check for well-formedness but also transform the parse tree so that it fits the syntactic context in which it has been parsed; in fact, this happens for patterns, which are transformed from a representation of type RdrNameHsExpr into a representation of type RdrNamePat.
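The parse-as-expression, then check-and-convert idea can be sketched on a toy syntax tree. The types Expr and Pat and the function checkPattern below are hypothetical miniatures, not GHC's RdrNameHsExpr, RdrNamePat, or the actual ParseUtil routines; they only show the general technique under the assumption that constructor operators start with a colon.

```haskell
-- Hypothetical miniature syntax trees for illustration.
data Expr
  = EVar String
  | ECon String
  | EApp Expr Expr
  | EOp Expr String Expr   -- infix operator application
  | ELam String Expr       -- a legal expression, but never a pattern
  deriving (Eq, Show)

data Pat
  = PVar String
  | PCon String [Pat]
  deriving (Eq, Show)

-- Check that an already-parsed expression is a legal pattern and,
-- if so, convert it into the pattern representation.
checkPattern :: Expr -> Either String Pat
checkPattern expr = case unwind expr [] of
  (EVar v, [])   -> Right (PVar v)
  (ECon c, args) -> PCon c <$> mapM checkPattern args
  (EOp l op r, [])
    | take 1 op == ":" -> do        -- constructor operator such as (:)
        l' <- checkPattern l
        r' <- checkPattern r
        Right (PCon op [l', r'])
  _ -> Left ("not a legal pattern: " ++ show expr)
  where
    -- Flatten nested applications into a head and its argument list.
    unwind (EApp f x) args = unwind f (x : args)
    unwind hd         args = (hd, args)
```

For example, the tree for x : xs is accepted and becomes PCon ":" [PVar "x", PVar "xs"], while the trees for a + b or a lambda abstraction are rejected with an error message.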

The Haskell Interface Parser

+

+ The parser for interface files is also generated by Happy from ParseIface.y. Its main routine parseIface is invoked from RnHiFiles.readIface.

The Package Configuration Parser

+

+ The parser for configuration files is by far the smallest of the three and is defined in ParsePkgConf.y. It exports loadPackageConfig, which is used by DriverState.readPackageConf.

+ +Last modified: Wed Jan 16 00:30:14 EST 2002 + + + +