{-# OPTIONS_GHC -XModalTypes -XScopedTypeVariables -XFlexibleContexts -XMultiParamTypeClasses -ddump-types -XNoMonoPatBinds #-}
module GArrowTutorial
where
import Data.Bits
import Data.Bool (not)
import GHC.HetMet.GArrow
import GHC.HetMet.GuestLanguage hiding ( (-) )
import Control.Category
import Control.Arrow
import Prelude hiding ( id, (.) )
-- The best way to understand heterogeneous metaprogramming and
-- generalized arrows is to play around with this file, poking at the
-- examples until they fail to typecheck -- you'll learn a lot that
-- way!
-- Once you've built the modified compiler, you can compile this file
-- with:
--
-- $ inplace/bin/ghc-stage2 tutorial.hs
--
-- -XModalTypes adds a new syntactical expression, "code brackets":
code_fst = <[ \(x,y) -> x ]>
-- This new expression is the introduction form for modal types:
code_fst :: forall a b g. <[ (a,b) -> a ]>@g
-- Think of <[T]>@g as being the type of programs written in language
-- "g" which, when "executed", return a value of type "T". I mention
-- "language g" because the *heterogeneous* aspect of HetMet means
-- that we can limit the sorts of constructs allowed inside the code
-- brackets, permitting only a subset of Haskell (you have to use
-- Haskell syntax, though).
-- There is a second new expression form, "~~", called "escape":
code_fst_fst = <[ \z -> ~~code_fst (~~code_fst z) ]>
-- Note that ~~ binds more tightly than any other operator. There is
-- an alternate version, "~~$", which binds more weakly than any other
-- operator (this is really handy sometimes!). To demonstrate this,
-- the next two expressions differ only in superficial syntax:
example1 foo bar = <[ ~~$ foo bar ]>
example2 foo bar = <[ ~~( foo bar) ]>
-- example3 foo bar = <[ ~~ foo bar ]>
-- ... but the third one is completely different (and in fact, doesn't
-- even parse, but we'll get to that in a moment)
-- The escape operation must appear within code brackets. In truth,
-- it is really a "hole" punched in the code brackets -- the thing to
-- which the escape operator gets applied is typed as if it were
-- *outside* the code brackets. It must have type <[T]>, and the
-- escape operator allows it to be used *inside* code brackets as if
-- it had type "T".
-- So, the escape operator is basically a way of pasting code
-- fragments into each other.
-- This is where those type variables after the "@" sign come in: if
-- you paste two pieces of code into a third, all three must be
-- written in the same language. We express this by unifying their
-- tyvars:
compose_code :: forall g a b c. <[a->b]>@g -> <[b->c]>@g -> <[a->c]>@g
compose_code x y = <[ \z -> ~~y (~~x z) ]>
-- Now, try commenting out the type ascription above and uncommenting
-- any of these three:
--
-- compose_code :: forall g h a b c. <[a->b]>@h -> <[b->c]>@g -> <[a->c]>@g
-- compose_code :: forall g h a b c. <[a->b]>@g -> <[b->c]>@h -> <[a->c]>@g
-- compose_code :: forall g h a b c. <[a->b]>@g -> <[b->c]>@g -> <[a->c]>@h
--
-- The typechecker won't let you get away with that -- you're trying
-- to force a type which is "too polymorphic" onto paste2. If the
-- compiler allowed that, the resulting metaprogram might try to
-- splice together programs written in different languages, resulting
-- in mayhem.
-- NEW SCOPING RULES: The syntactical depth (or just "depth") of an
-- expression is the number of surrounding code-brackets minus the
-- number of surrounding escapes (this is strictly a syntax concept
-- and has NOTHING to do with the type system!). It is very important
-- to keep in mind that the scope of a bound variable extends only to
-- expressions at the same depth! To demonstrate, the following
-- expression will fail to parse:
-- badness = \x -> <[ x ]>
-- ...and in the following expression, the occurrence of "x" is bound
-- by the first (outer) lambda, not the second one:
no_shadowing_here = \x -> <[ \x -> ~~x ]>
-- Lastly, you can wrap code-brackets around an identifier in a
-- top-level, let, or where binding. Notice how GHC doesn't complain
-- here about defining an identifier twice!
foo = \x -> x+1
<[ foo ]> = <[ \(x::Bool) -> x ]>
-- Now you can use foo (the second one!) inside code-brackets:
bar x = <[ foo ~~x ]>
bar :: forall g. <[Bool]>@g -> <[Bool]>@g
-- In fact, the identifiers have completely unrelated types. Which
-- brings up another important point: types are ALWAYS assigned
-- "relative to" depth zero. So although we imagine "foo" existing at
-- depth-one, its type is quite firmly established as <[ Bool -> Bool ]>
-- It has to be this way -- to see why, consider a term which is more
-- polymorphic than "foo":
<[ foo' ]> = <[ \x -> x ]>
-- Its type is:
<[ foo' ]> :: forall a g . <[ a -> a ]>@g
-- ...and there's no way to express the g-polymorphism entirely from
-- within the brackets.
-- So why does all of this matter? Mainly so that we can continue to use . We'd like
-- the "+" operator to work "as expected" -- in other words, we'd like
-- people to be able to write things like
increment_at_level1 = <[ \x -> x + 1 ]>
-- However, in unmodified haskell an identifier like (+) may have only
-- one type. In this case that type is:
--
-- (+) :: Num a => a -> a -> a
--
-- Now, we could simply decree that when (+) appears inside code
-- brackets, an "implicit ~~" is inserted, so the desugared expression
-- is:
--
-- increment_at_level1 = <[ \x -> ~~(+) x 1 ]>
--
-- unfortunately this isn't going to work for guest languages that
-- don't have higher-order functions. Haskell uses curried arguments
-- because it has higher-order functions, but in a first-order guest
-- language a more sensible type for (+) would be:
--
-- (+) :: Num a => (a,a) -> a
--
-- ... or even something less polymorphic, like
--
-- (+) :: (Int,Int) -> Int
--
-- so to maintain flexibility, we allow an identifier to have
-- different types at different syntactic depths; this way type
-- choices made for Haskell don't get imposed on guest languages that
-- are missing some of its features.
--
-- In hindsight, what we REALLY want is for increment_at_level1 to
-- be desugared like this (much like the Arrow (|...|) syntax):
--
-- increment_at_level1 = <[ \x -> ~~( <[x]> + <[1]> ) ]>
--
-- ... because then we can declare
--
-- instance Num a => Num <[a]> where ...
--
-- or just
--
-- instance Num <[Int]> where ...
--
-- unfortunately there's a major problem: knowing how to do this sort
-- of desugaring requires knowing the *arity* of a function. For
-- symbols we can kludge it by checking Haskell's parsing rules (there
-- are only a handful of unary symbols; all others are binary), but
-- this is crude and won't work at all for non-symbol identifiers.
-- And we can look at a type like x->y->z and say "oh, that's a
-- two-argument function", but sometimes GHC doesn't know the complete
-- type of an identifier in the midst of unification (i.e. "x has type
-- Int->a for some a, where a could be Int or Int->Int"), so guessing
-- the arity from the type cannot be done during parsing, which is
-- when we need to do this.
--
-- Okay, I think that's more or less a brain dump of why I changed the
-- scoping rules and the problems with the other solutions I tried.
--
-- I am very interested in hearing any suggestions on better ways of
-- dealing with this, so long as you can still use operators like (+)
-- in guest languages without higher-order functions.
--
--------------------------------------------------------------------------------
-- Ye Olde and Most Venerable "pow" Function
pow :: forall c. GuestIntegerLiteral c => GuestLanguageMult c Integer => Integer -> <[ Integer -> Integer ]>@c
pow n =
if n==0
then <[ \x -> 1 ]>
else <[ \x -> x * ~~(pow (n - 1)) x ]>
-- a more efficient two-level pow
pow' :: forall c. GuestIntegerLiteral c => GuestLanguageMult c Integer => Integer -> <[ Integer -> Integer ]>@c
pow' 0 = <[ \x -> 1 ]>
pow' 1 = <[ \x -> x ]>
pow' n = if n `mod` 2==0
then <[ \x -> (\y -> y*y) (~~(pow' $ n `shiftR` 2) x) ]>
else <[ \x -> x * ~~(pow' $ n-1) x ]>