/****************************************************************
 * Grammar for interface files
 ****************************************************************/

This document purports to describe the syntax (and semantics?) of
interface files generated by GHC for use by Hugs.

/****************************************************************
 * ToDo:
 ****************************************************************/

o GHC currently generates "Functor( :Functor :Functor map )" in export
  lists. This is no longer legal and is very confusing besides - but
  what will GHC generate instead?

/****************************************************************
 * Closures generated by GHC
 ****************************************************************/

GHC generates a closure for the following objects (if exported):

o variables
o instance decls
o method selectors and superclass selectors
o selector functions (from record syntax)
o data constructors

If an object foo (respectively Foo) is declared in a module Bar, then
the closure is called Bar_foo_closure (respectively Bar_Foo_closure).

Whether the object is static or not is not reflected in the name.
The type or arity of the object is not reflected in the name.
The name is just Bar_foo_closure.

Modifications to the above:

1) Depending on the architecture, it might be necessary to add a
   leading underscore to the name.

2) We also have to apply the infamous Z-encoding:

   Code from somewhere inside GHC (circa 1994)

   * Z-escapes:

       "std"++xs -> "Zstd"++xs

       char_to_c 'Z'  = "ZZ"
       char_to_c '&'  = "Za"
       char_to_c '|'  = "Zb"
       char_to_c ':'  = "Zc"
       char_to_c '/'  = "Zd"
       char_to_c '='  = "Ze"
       char_to_c '>'  = "Zg"
       char_to_c '#'  = "Zh"
       char_to_c '<'  = "Zl"
       char_to_c '-'  = "Zm"
       char_to_c '!'  = "Zn"
       char_to_c '.'  = "Zo"
       char_to_c '+'  = "Zp"
       char_to_c '\'' = "Zq"
       char_to_c '*'  = "Zt"
       char_to_c '_'  = "Zu"
       char_to_c c    = "Z" ++ show (ord c)

   (There's a commented-out piece of code in rts/Printer.c which
   implements this.)
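As a rough illustration of how the naming rules above fit together, here
is a minimal Haskell sketch. It is not the code GHC uses. The names
charToC, zEncode and closureName are hypothetical, and the sketch makes
three assumptions that the text above does not state: ASCII letters and
digits (other than 'Z') pass through unchanged, the rest of a name
starting with "std" is escaped in the usual way after the "Z" prefix, and
both the module name and the object name are encoded while the "_"
separators are emitted literally.

```haskell
import Data.Char (isAlphaNum, isAscii, ord)

-- Per-character escapes, copied from the table above.
charToC :: Char -> String
charToC 'Z'  = "ZZ"
charToC '&'  = "Za"
charToC '|'  = "Zb"
charToC ':'  = "Zc"
charToC '/'  = "Zd"
charToC '='  = "Ze"
charToC '>'  = "Zg"
charToC '#'  = "Zh"
charToC '<'  = "Zl"
charToC '-'  = "Zm"
charToC '!'  = "Zn"
charToC '.'  = "Zo"
charToC '+'  = "Zp"
charToC '\'' = "Zq"
charToC '*'  = "Zt"
charToC '_'  = "Zu"
charToC c
  -- ASSUMPTION: plain ASCII letters and digits are left alone.
  | isAscii c && isAlphaNum c = [c]
  -- Fallback from the table: "Z" followed by the character code.
  | otherwise                 = 'Z' : show (ord c)

-- Encode a whole name, including the special case for a "std" prefix.
-- ASSUMPTION: the remainder after "std" is escaped character by character.
zEncode :: String -> String
zEncode ('s':'t':'d':xs) = "Zstd" ++ concatMap charToC xs
zEncode s                = concatMap charToC s

-- Build the closure name for an object declared in a module.
-- The Bool selects the architecture-dependent leading underscore.
-- ASSUMPTION: module and object names are both Z-encoded.
closureName :: Bool -> String -> String -> String
closureName leadingUnderscore modName objName =
  prefix ++ zEncode modName ++ "_" ++ zEncode objName ++ "_closure"
  where
    prefix = if leadingUnderscore then "_" else ""
```

Under these assumptions, closureName False "Bar" "foo" gives
"Bar_foo_closure", matching the example in the text, and
closureName False "Bar" "==" gives "Bar_ZeZe_closure".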
/****************************************************************
 * Lexical syntax
 ****************************************************************/

The lexical syntax is exactly the same as for Haskell with the
following additions:

Keywords
~~~~~~~~

We add:

  __export __interface __requires

Pragmas
~~~~~~~

GHC will use pragmas of the form: {-## ##-}.

These are always ignored by Hugs and may be ignored by GHC.
GHC will be able to use lazy parsing for these - just as it
currently does for unfoldings and the like.

Compiler generated names
~~~~~~~~~~~~~~~~~~~~~~~~

Are of the form _letter(letter|digit|symbol)*.

It's important that they can always be generated by putting "_l"
in front of a valid Haskell varid, varop, conid or conop.

It's also important that valid Haskell patterns such as _:_
should not be valid compiler generated names.

The letter indicates something about the kind of object it is
but all that Hugs needs to do is separate conid/ops from varid/ops -
which it does depending on whether the letter is uppercase.

/****************************************************************
 * Header
 ****************************************************************/

iface     : '__interface' ifaceName NUMLIT version 'where' Body

Body      : '__requires' STRINGLIT ';'
            { importDecl ';' }
            { instanceImportDecl ';' }
            { exportDecl ';' }
            { fixityDecl ';' }
            { classDecl ';' }
            { instanceDecl ';' }
            { typeDecl ';' }
            { valueDecl ';' }

version   : NUMLIT

/****************************************************************
 * Import-export stuff
 *
 * I believe the meaning of 'import' is "qualified import" - but
 * I'm not sure.
 *   - ADR
 ****************************************************************/

importDecl         : 'import' CONID NUMLIT

instanceImportDecl : 'instance' 'import' CONID NUMLIT

exportDecl         : '__export' CONID { Entity }

Entity      : EntityOcc
            | EntityOcc StuffInside
            | EntityOcc '|' StuffInside

EntityOcc   : Var
            | Data
            | '->'
            | '(' '->' ')'

StuffInside : '{' ValOcc { ValOcc } '}'

ValOcc      : Var
            | Data

/****************************************************************
 * Fixities
 ****************************************************************/

fixityDecl : 'infixl' optdigit op
           | 'infixr' optdigit op
           | 'infix'  optdigit op

/****************************************************************
 * Type declarations
 *
 * o data decls use "Data" on lhs and rhs to allow this decl:
 *
 *     data () = ()
 *
 * o data declarations don't have the usual Haskell syntax:
 *   o they don't have strictness annotations
 *   o they are given an explicit signature instead of a list of
 *     argument types
 *   o field selectors are given an explicit signature
 *
 * [Simon PJ asked me to look again at how much work it would take to
 *  handle the standard syntax. The answer is:
 *  o It takes an awful lot of code to process the standard syntax.
 *  o I can hardly reuse any of the existing code because it is too
 *    tightly interwoven with other parts of static analysis.
 *  o The rules for processing data decls are very intricate
 *    (and are worse since existentials and local polymorphism were
 *    added). Implementing a complicated thing twice (once in
 *    GHC and once in Hugs) is bad; implementing it a third time
 *    is Just Plain Wrong.
 * ]
 *
 * Data decls look like this:
 *
 *   data List a = Nil         :: forall [a] => List a
 *               | Cons{hd,tl} :: forall [a] => a -> List a -> List a
 *     where
 *       hd :: forall [a] => List a -> a
 *       tl :: forall [a] => List a -> List a
 *
 * o The tyvars on the lhs serve only to help infer the kind of List
 * o The type of each data constructor and selector is written
 *   explicitly.
 * o A small amount of work is required to figure out which
 *   variables are existentially quantified.
 * o GHC will require an inlining pragma to recover strictness
 *   annotations.
 ****************************************************************/

typeDecl : NUMLIT 'type'    TCName {TVBndr} '=' Type
         | NUMLIT 'data'    Data   {TVBndr} ['=' Constrs ['where' Sels]]
         | NUMLIT 'newtype' TCName {TVBndr} [ '=' Data AType ]

Constrs  : Constr {'|' Constr}

Constr   : Data [Fields] '::' Type

Fields   : '{' VarName {',' VarName} '}'

Sels     : Sel {';' Sel}

Sel      : VarName '::' ['!'] Type

/****************************************************************
 * Classes and instances
 *
 * Question: should the method signature include the class
 * constraint? That is, should we write the Eq decl like this:
 *
 *   class Eq a where { (==) :: a -> a -> Bool }  -- like Haskell
 *
 * or like this
 *
 *   class Eq a where { (==) :: Eq a => a -> a -> Bool }
 *
 * There's not much to choose between them but the second version
 * is more consistent with what we're doing with data constructors.
 ****************************************************************/

classDecl    : NUMLIT 'class' [ Context '=>' ] TCName {TVBndr} 'where' CSigs

instanceDecl : 'instance' [Quant] Class '=' Var

CSigs        : '{' CSig { ';' CSig } '}'

CSig         : VarName ['='] '::' Type

/****************************************************************
 * Types
 ****************************************************************/

Type    : Quant Type
        | BType '->' Type
        | BType

Context : '(' Class { ',' Class } ')'

Class   : QTCName { AType }

BType   : AType { AType }

AType   : QTCName
        | TVName
        | '(' ')'                             // unit
        | '(' Type ')'                        // parens
        | '(' Type ',' Type { ',' Type } ')'  // tuple
        | '[' Type ']'                        // list
        | '{' QTCName { AType } '}'           // dictionary

Quant   : 'forall' {TVBndr} [Context] '=>'

TVBndr  : TVName [ '::' AKind ]

Kind    : { AKind '->' } AKind

AKind   : VAROP                               // really '*'
        | '(' Kind ')'

/****************************************************************
 * Values
 ****************************************************************/

valueDecl : NUMLIT Var '::' Type

/****************************************************************
 * Atoms
 ****************************************************************/

VarName   : Var

TVName    : VARID

Var       : VARID
          | VAROP
          | '!'
          | '.'
          | '-'

Data      : CONID
          | CONOP
          | '(' ')'
          | '[' ']'

TCName    : CONID
          | CONOP
          | '(' '->' ')'
          | '[' ']'

QTCName   : TCName
          | QCONID
          | QCONOP

ifaceName : CONID

/****************************************************************
 * End
 ****************************************************************/