{-# OPTIONS_GHC -XModalTypes -XScopedTypeVariables -XFlexibleContexts -XMultiParamTypeClasses -ddump-types -XNoMonoPatBinds #-} module GArrowTutorial where import Data.Bits import Data.Bool (not) import GHC.HetMet.GArrow import GHC.HetMet.GuestLanguage hiding ( (-) ) import Control.Category import Control.Arrow import Prelude hiding ( id, (.) ) -- The best way to understand heterogeneous metaprogramming and -- generalized arrows is to play around with this file, poking at the -- examples until they fail to typecheck -- you'll learn a lot that -- way! -- Once you've built the modified compiler, you can compile this file -- with: -- -- $ inplace/bin/ghc-stage2 tutorial.hs -- -- -XModalTypes adds a new syntactical expression, "code brackets": code_fst = <[ \(x,y) -> x ]> -- This new expression is the introduction form for modal types: code_fst :: forall a b g. <[ (a,b) -> a ]>@g -- Think of <[T]>@g as being the type of programs written in language -- "g" which, when "executed", return a value of type "T". I mention -- "language g" because the *heterogeneous* aspect of HetMet means -- that we can limit the sorts of constructs allowed inside the code -- brackets, permitting only a subset of Haskell (you have to use -- Haskell syntax, though). -- There is a second new expression form, "~~", called "escape": code_fst_fst = <[ \z -> ~~code_fst (~~code_fst z) ]> -- Note that ~~ binds more tightly than any other operator. There is -- an alternate version, "~~$", which binds more weakly than any other -- operator (this is really handy sometimes!). To demonstrate this, -- the next two expressions differ only in superficial syntax: example1 foo bar = <[ ~~$ foo bar ]> example2 foo bar = <[ ~~( foo bar) ]> -- example3 foo bar = <[ ~~ foo bar ]> -- ... but the third one is completely different (and in fact, doesn't -- even parse, but we'll get to that in a moment) -- The escape operation must appear within code brackets. In truth, -- it is really a "hole" punched in the code brackets -- the thing to -- which the escape operator gets applied is typed as if it were -- *outside* the code brackets. It must have type <[T]>, and the -- escape operator allows it to be used *inside* code brackets as if -- it had type "T". -- So, the escape operator is basically a way of pasting code -- fragments into each other. -- This is where those type variables after the "@" sign come in: if -- you paste two pieces of code into a third, all three must be -- written in the same language. We express this by unifying their -- tyvars: compose_code :: forall g a b c. <[a->b]>@g -> <[b->c]>@g -> <[a->c]>@g compose_code x y = <[ \z -> ~~y (~~x z) ]> -- Now, try commenting out the type ascription above and uncommenting -- any of these three: -- -- compose_code :: forall g h a b c. <[a->b]>@h -> <[b->c]>@g -> <[a->c]>@g -- compose_code :: forall g h a b c. <[a->b]>@g -> <[b->c]>@h -> <[a->c]>@g -- compose_code :: forall g h a b c. <[a->b]>@g -> <[b->c]>@g -> <[a->c]>@h -- -- The typechecker won't let you get away with that -- you're trying -- to force a type which is "too polymorphic" onto paste2. If the -- compiler allowed that, the resulting metaprogram might try to -- splice together programs written in different languages, resulting -- in mayhem. -- NEW SCOPING RULES: The syntactical depth (or just "depth") of an -- expression is the number of surrounding code-brackets minus the -- number of surrounding escapes (this is strictly a syntax concept -- and has NOTHING to do with the type system!). It is very important -- to keep in mind that the scope of a bound variable extends only to -- expressions at the same depth! To demonstrate, the following -- expression will fail to parse: -- badness = \x -> <[ x ]> -- ...and in the following expression, the occurrence of "x" is bound -- by the first (outer) lambda, not the second one: no_shadowing_here = \x -> <[ \x -> ~~x ]> -- Lastly, you can wrap code-brackets around an identifier in a -- top-level, let, or where binding. Notice how GHC doesn't complain -- here about defining an identifier twice! foo = \x -> x+1 <[ foo ]> = <[ \(x::Bool) -> x ]> -- Now you can use foo (the second one!) inside code-brackets: bar x = <[ foo ~~x ]> bar :: forall g. <[Bool]>@g -> <[Bool]>@g -- In fact, the identifiers have completely unrelated types. Which -- brings up another important point: types are ALWAYS assigned -- "relative to" depth zero. So although we imagine "foo" existing at -- depth-one, its type is quite firmly established as <[ Bool -> Bool ]> -- It has to be this way -- to see why, consider a term which is more -- polymorphic than "foo": <[ foo' ]> = <[ \x -> x ]> -- Its type is: <[ foo' ]> :: forall a g . <[ a -> a ]>@g -- ...and there's no way to express the g-polymorphism entirely from -- within the brackets. -- So why does all of this matter? Mainly so that we can continue to use . We'd like -- the "+" operator to work "as expected" -- in other words, we'd like -- people to be able to write things like increment_at_level1 = <[ \x -> x + 1 ]> -- However, in unmodified haskell an identifier like (+) may have only -- one type. In this case that type is: -- -- (+) :: Num a => a -> a -> a -- -- Now, we could simply decree that when (+) appears inside code -- brackets, an "implicit ~~" is inserted, so the desugared expression -- is: -- -- increment_at_level1 = <[ \x -> ~~(+) x 1 ]> -- -- unfortunately this isn't going to work for guest languages that -- don't have higher-order functions. Haskell uses curried arguments -- because it has higher-order functions, but in a first-order guest -- language a more sensible type for (+) would be: -- -- (+) :: Num a => (a,a) -> a -- -- ... or even something less polymorphic, like -- -- (+) :: (Int,Int) -> Int -- -- so to maintain flexibility, we allow an identifier to have -- different types at different syntactic depths; this way type -- choices made for Haskell don't get imposed on guest languages that -- are missing some of its features. -- -- In hindsight, what we REALLY want is for increment_at_level1 to -- be desugared like this (much like the Arrow (|...|) syntax): -- -- increment_at_level1 = <[ \x -> ~~( <[x]> + <[1]> ) ]> -- -- ... because then we can declare -- -- instance Num a => Num <[a]> where ... -- -- or just -- -- instance Num <[Int]> where ... -- -- unfortunately there's a major problem: knowing how to do this sort -- of desugaring requires knowing the *arity* of a function. For -- symbols we can kludge it by checking Haskell's parsing rules (there -- are only a handful of unary symbols; all others are binary), but -- this is crude and won't work at all for non-symbol identifiers. -- And we can look at a type like x->y->z and say "oh, that's a -- two-argument function", but sometimes GHC doesn't know the complete -- type of an identifier in the midst of unification (i.e. "x has type -- Int->a for some a, where a could be Int or Int->Int"), so guessing -- the arity from the type cannot be done during parsing, which is -- when we need to do this. -- -- Okay, I think that's more or less a brain dump of why I changed the -- scoping rules and the problems with the other solutions I tried. -- -- I am very interested in hearing any suggestions on better ways of -- dealing with this, so long as you can still use operators like (+) -- in guest languages without higher-order functions. -- -------------------------------------------------------------------------------- -- Ye Olde and Most Venerable "pow" Function pow :: forall c. GuestIntegerLiteral c => GuestLanguageMult c Integer => Integer -> <[ Integer -> Integer ]>@c pow n = if n==0 then <[ \x -> 1 ]> else <[ \x -> x * ~~(pow (n - 1)) x ]> -- a more efficient two-level pow pow' :: forall c. GuestIntegerLiteral c => GuestLanguageMult c Integer => Integer -> <[ Integer -> Integer ]>@c pow' 0 = <[ \x -> 1 ]> pow' 1 = <[ \x -> x ]> pow' n = if n `mod` 2==0 then <[ \x -> (\y -> y*y) (~~(pow' $ n `shiftR` 2) x) ]> else <[ \x -> x * ~~(pow' $ n-1) x ]>