3 These are notes about a _simple_ but complete pattern-matching
4 compiler for Haskell. I presume familiarity with Phil's
5 pattern-matching stuff in Simon's book and use roughly the same notation.
7 Abbreviations: "p" for pattern, "e" (or "E") for expression, "g" for
8 guard, "v" for variable, "u" for new variable I made up. "[]" for
11 Subscripts: "p11" is really short for "p_{1,1}". Sometimes I'll use
12 a "?", as in "pm1 ... pm?", to mean the second subscript goes up to
13 something I'm really not worried about.
15 NB: LETRECS NOT DEALT WITH YET.
17 ---------------------------------------------------------------------
18 We need a slightly souped-up "match" for Haskell (vs the Phil-chapter
19 one). Simon suggested a re-arrangement of things, which I have then
20 further re-arranged...
25 Eliminate default arg of match (3rd arg in Phil-chapter match) in
26 favour of returning the variable (not special value) fail. Thus a
27 possible translation for
42 Now the issue of whether to duplicate code or share it becomes whether
43 to substitute copies of e2 or not. This is a decision we need to take
44 anyway for all other let-bound things, so why not for fail too? If
45 fail is used only once, we will certainly substitute for it.
47 We could even detect that fail is used only in a head position, so it
48 can be implemented as a stack-adjust then a jump. This might well
49 apply to other let-bound things too.
51 Now here's a proposal for the "match" function. The main difference is
52 1) no default argument
53 2) [contra simon's suggestion] Patterns are still per-row as in
55 3) [partain] even the input exprs are CoreExprs
57 OK, for a "match" for m equations each with n patterns:
60 -- n (variable) names, one per pattern column, bound
61 -- to the n expressions we are matching against the
64 -> [([Pat], CoreExpr)]
65 -- one pair for each of the m equations: the n
66 -- patterns in that equation, then the CoreExpr that
67 -- is evaluated if we get a match. The CoreExpr may
68 -- contain free "fail"s; some hackery required to
69 -- ensure that is OK; see below
72 -- the resulting code to do the matching
76 (1) a list of n (match-expression, pattern-column) pairs
77 (2) a list of m post-match expressions, expr i to be inserted
78 immediately after equation i's lhs matches
80 (1) a desugared expr equivalent of the whole "match"
85 [([p11, ..., p1n], e1), ..., ([pm1, ..., pmn], em)]
87 match [ (e1, [p11, ...,pm1]), ..., (en, [p1n, ...,pmn])]
90 ********* MEANS *********
96 fail = case (u1, ..., un) of
101 Alternatively, this specification could be given in terms of
102 pattern-matching lambdas, as in Phil's chapter.
104 NOT CHANGED BEYOND HERE
106 -------------------------------------------------------------------
107 Cranking through a good old function definition with the above:
109 f p11 p12 ... p1n | g11 = e11
114 f pm1 pm2 ... pmn | gm1 = em1
118 The "match" equivalent is:
121 match [ (u1, [p11, ...,pm1]), ..., (un, [p1n, ...,pmn])]
123 where fail = error "pattern-match for f failed\n"
124 E1 = if g11 then e11 else if g12 then ... else fail
126 Em = if gm1 then em1 else if gm2 then ... else fail
130 -------------------------------------------------------------------
131 It is helpful to me to think about the simple/base cases for this
138 corresponds to the syntactically bogus (zero equations!?)
141 () -> {- nothing!! -}
145 EMPTY RULE -- no more patterns
147 match [] [ ([], E1), ..., ([], Em) ]
149 [where, incidentally, each Ei will be of the form
150 (not that it has to be...)
163 becomes ("E1 [] E2 [] ... [] Em" in Phil's chapter...)
173 with any "fail" in Em being bound from an outer scope; perhaps it's
174 easier to see written as:
180 -------------------------------------------------------------------
181 HANDLING LAZY ("TWIDDLE") PATTERNS
183 For Haskell, the "mixture rule" (p.~88) looks at a pattern-column and
184 splits the equations into groups, depending on whether it sees
186 * all constructors, or
187 * all variables _OR LAZY PATTERNS_
189 The following example shows what "match" does when confronted by one
190 of these variables/lazy-patterns combinations. Note the use of the
203 match [(u1, [ v, ~p ])]
204 [ if g11 then e11 else if ... else fail, -- E1
205 if g21 then e21 else if ... else fail -- E2
207 where fail = error "no match in f\n"
209 which transmogrifies into
217 if g11 then e11 else if ... else fail
220 let free_var1_of_p = match [(u2, [ p ])] [ free_var1_of_p ]
222 free_var?_of_p = match [(u2, [ p ])] [ free_var?_of_p ]
224 if g21 then e21 else if ... else fail -- E2
227 where fail = error "no match in f\n"
229 For more specific match-failure error messages, one could insert
230 "let fail = ..."'s in strategic places.
232 -------------------------------------------------------------------
233 "match" EQUIVALENTS FOR VARIOUS HASKELL CONSTRUCTS
235 * function definition -- shown above
237 * pattern-matching lambda (souped up version in static semantics)
239 \ p1 p2 ... pn | g1 -> e1
247 match [ (u1, [p1]), ..., (un, [pn])]
248 [ if g1 then e1 else if ... then em else fail
250 where fail = error "no match in pattern-matching lambda at line 293\n"
252 * pattern-matching (simple, non-recursive) "let"
262 which has a "match" equivalent of
264 match [(e, [~p])] [ E ]
266 The full-blown Haskell "let" is more horrible:
275 case ( if g1 then e1 else... else if gn then en else error "?" ) of
278 thinking about which I am not able to sleep well at night.
279 (Won't those g's have things bound from inside p ?)
281 * pattern-matching (not-quite-so simple, non-recursive) "let"
294 p = if g1 then e1 else if ... else if gm then em else fail
295 where fail = "...some appropriate thing..."
299 match [ (if g1 ... then em else fail, [ ~p ]) ]
301 where fail = "...some appropriate thing..."
303 * "case" expressions (souped up version in static semantics)
316 match [ (e0, [p1, ..., pm]) ]
317 [ if g11 then e11 else if ... else fail -- E1
319 if gm1 then em1 else if ... else fail
321 where fail = error "pattern-matching case at line xxx failed\n"
323 * list comprehensions