[Prev][Next][Index][Thread]

typed versus untyped



In a seminar a couple of weeks ago at MIT, Henk Barendregt offered the slogan
``Church vs.  Curry'' to indicate two views connecting types and lambda
calculi.  Of course as Henk noted himself, Church did as much toward
developing the Curry approach as did Curry, so I'm a bit concerned that
Henk's slogan may be misleading.  Until a consensus emerges, I'll stick to
talking about ``typed vs. untyped''.

The canonical example of an effective procedure for type reconstruction is
the Hindley algorithm [TAMS,v.146,1969,29--40] which tries to assign simple
types built from the function space constructor, -->, to untyped lambda
terms.  This procedure plays a valuable role from both the typed and untyped
viewpoints.  Hindley's algorithm has been extended and incorporated by Milner
into the ML language, and more recently adopted by Turner in his Miranda
language, where it provides the feel of untyped LISP-style programming
without sacrificing the benefits of a strongly, statically typed language.

In the typed view, one regards an untyped term as (possibly) arising from a
fully typed term as a result of ERASING the type information.  Such untyped
erasures arise naturally even in the typed approach, since the typings are
irrelevant to reducing the terms---the Computer Science slogan here is that
``typing is a static discipline'' which, once SUCCESSFULLY completed, is
irrelevant at runtime.  The Hindley algorithm can be understood as exactly
deciding whether or not an untyped term is an erasure, and if so, it
calculates a ``most general'' typed term whose erasure is the untyped one.

In the untyped view, one understands typing procedures as efficient heuristic
proof procedures for theorems about the possible values/meanings of untyped
terms.  This is the view taken by Milner in his classic paper [JCSS,
v.17,1978,348--375] and also by Hindley is his recent papers [TCS,v.22,
1983,1--18, and 127--133].  Here a type-expression is taken to denote a
subset of possible values in a model of the untyped lambda calculus, and the
benefit of discovering that term M has type Tau is that one then knows that
the value of M is among the subset of values denoted by Tau.  Hindley shows
that his algorithm, supplemented with a rule that types are preserved by
beta-CONVERSION, yields is a sound and complete proof procedure for typing
assertions, M:Tau, under this interpretation.

Given that both typed and untyped approaches highlight the Hindley
procedure, the contrast between these perspectives may not seem serious.
Without taking sides here on ``typed vs. untyped'' (I'll do that another
time), let me emphasize that the two views DO lead to significantly different
theories.

I think an elementary observation crystalizes the difference: there are typable
terms which behave the same when USED IN TYPABLE WAYS, but not when used in
untypable ways.  So I see the two appraoches as differing at the fundamental
definition of what makes two terms EQUIVALENT.  Two terms may be equivalent
when viewed as arising by erasure from a typed setting, but it can be
inconsistent to equate them as untyped terms.

Here is a simple example worked out together with Jon Riecke:

PROPOSITION: There are terms, M and N, of the untyped lambda
calculus such that
(1)  M and N have the same most general Hindley type,
(2)  the only constant appearing in M and N is the Boolean TRUE,
(3)  M and N are observationally congruent (indistinguishable) wrt all
     contexts into which M and N are typably substitutable, including contexts
     containing fixed-point, conditional, and/or the usual arithmetic and
     boolean combinators,
(4)  M and N reduce to distinct normal forms, and so are computationally
     distinguishable by an untyped context.

PROOF:
Let	ONE   ::=  lam f.lam x. f x
	TWO   ::=  lam f.lam x. f(f x)
	THREE ::=  lam f.lam x. f(f(f x))
be the standard Church numerals.  Let
	F     ::=  lam f. (lam x.lam y. x)  f  (f(f(TWO(lam b. TRUE)))) .

Clearly, F reduces in two steps to the identity (lam f.f).  However, the
``garbage'' subterms of F play an important role wrt type-checking: you can
quickly check that the most general Hindley-typing of F requires
	b: bool
	f: (bool-->bool)-->(bool-->bool)	(Call this type `boolfunc'.)
	F: boolfunc-->boolfunc.

Let	M     ::=  (F ONE)
	N     ::=  (F THREE).

Since F reduces to the identity, M and N reduce to the normal forms ONE and
THREE, respectively. You can again easily see that M and N have the same most
general---indeed unique---type, namely, boolfunc.

Now look at the typed terms of type boolfunc from which ONE and THREE are
obtained by erasure, namely,
	ONE_boolfunc   ::= lam f:bool-->bool.lam x:bool. (f x),
	THREE_boolfunc ::= lam f:bool-->bool.lam x:bool. f(f(f x)) .

Under the classical semantics when --> means ALL total set-theoretic
functions and bool={true, false}, as well as in cpo semantics when --> means
continous functions and bool={true, false, bottom}, we know that ONE_boolfunc
and THREE_boolfunc MEAN THE SAME THING, and hence are OBSERVATIONALLY
CONGRUENT wrt all TYPED contexts, even typed contexts containing fixed-point,
conditional, integer, and in general all the combinators we can interpret
semantically.  Hence, the erasures ONE and THREE are observationally
congruent wrt to all TYPABLE contexts with a hole typable as boolfunc.
But M and N reduce to ONE and THREE and have unique Hindley-typing
boolfunc, so we conclude that (1--3) hold.

On the other hand ONE and THREE, being distinct normal forms, are
distinguishable by an untyped context, which proves (4).  To be more precise,
it follows from Bohm's Theorem that there is a pure (ie, constant-free)
untyped context C[.] such that C[M] has a normal form and C[N] is a rank 0
unsolvable.  Alternatively, use the impure context
	C_int[.] ::= IF ( [.] successor 0) = 1 THEN 0 ELSE bottom FI.

By the way, C_int[.] happens to be typable with the hole [.] typed at
(int-->int)-->(int--int), but of course C_int[M] does not type-check---as in
general no distinguishing context could.

Regards, A.