
RE: type safety



> This claim is extremely misleading.  For "partial" primitive
> operations like division, you can define the domain of the operation
> as a type in a sufficiently expressive type system.  In fact, the same
> "trick" can be used to eliminate array bounds checking by introducing
> subrange types as in Pascal.  But what have you gained?  The
> programmer must now insert an explicit case check for nearly every
> application of the primitive operation (e.g., array access) and throw
> an exception when the check fails.  

This is only true in very simple systems.  Xi and Pfenning's
proposal for DML provides enough support that the check can
be moved to the caller instead of the callee -- hence the
requirement becomes part of the (documented) interface, 
no runtime test is necessary in simple cases, and we have
a way to communicate invariants.  A similar approach is used
in TALx86 and Touchstone.  I expect to see nice treatments
of array bounds checks folded into next generation languages.
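
To give the flavor in OCaml (a rough sketch of the idea, not
DML's indexed types, which can tie an index to the length of
a particular array; the module and names below are made up
for illustration):

  (* The only way to obtain a checked_index is through check,
     so the bounds requirement becomes part of the interface
     and the caller decides where the one runtime test goes. *)
  module Bounds : sig
    type checked_index
    val check : 'a array -> int -> checked_index option
    val get   : 'a array -> checked_index -> 'a
  end = struct
    type checked_index = int
    let check a i =
      if 0 <= i && i < Array.length a then Some i else None
    (* Array.get still tests here; with DML-style singleton
       types the compiler could prove the access in bounds
       and drop the test altogether. *)
    let get a i = Array.get a i
  end

  (* The caller supplies the evidence once, then indexes. *)
  let sum_first_two a =
    match Bounds.check a 0, Bounds.check a 1 with
    | Some i, Some j -> Bounds.get a i + Bounds.get a j
    | _ -> 0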

And of course any static typing discipline will be 
incomplete.  This does not make such disciplines
unattractive or useless; rather, it makes this a hard
area that must carefully balance user-interface,
technical, and esthetic issues (what kinds of type
information will programmers write, what can be
inferred, what kinds of error messages make sense,
how is programming methodology influenced, what are
reasonable notions of "bad", etc.)  And
this is why it's been such an active area of research
for the past 10-15 years (as opposed to S-expressions.)
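
For instance (a deliberately ill-typed OCaml fragment, my
own example):

  (* A program that can never go wrong at runtime, yet the
     checker rejects it: the else branch is unreachable, but
     no decidable static discipline can see that in general. *)
  let x = if true then 1 else "one"   (* rejected: int vs. string *)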

Both you and Matthias seem to want to throw out
static typing because it will never be complete.
The same argument applies to garbage collection --
clearly it's useless for people to work in this
area because determining what's garbage is undecidable.
And trace-based collectors are incomplete, so let's
just toss them out and leave everything up to the
programmer.

I furthermore disagree with Matthias's characterization 
of programming language safety and the role of type
systems.  The first step in defining a language's
semantics is to define the possible program states.
For Scheme, we use an extremely impoverished context-free
technology for doing this (e.g., an LL(1) grammar).  For
ML or Haskell, we use a somewhat more sophisticated
context-sensitive approach for defining what the
program states are, distinguishing them from raw
pre-terms.  Even the "untyped" lambda calculus has
requirements about balanced parens and free variables.
ML and Haskell program states have the nice property
that there are only a few reachable "bad" program
states (e.g., division by zero or out-of-bounds array
accesses) without
changing the natural definition of transition.  Scheme,
on the other hand, must essentially change the natural
notion of transition to incorporate an online reference
monitor to determine whether we are about to transition
to a bad state.  
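
Here is a toy OCaml illustration of that reference monitor
(my own sketch, nothing from Matthias's message):

  (* With one universal value type, every conditional and
     application must consult tags at run time -- the online
     reference monitor baked into the transition relation. *)
  type value = Int of int | Bool of bool | Closure of (value -> value)

  type exp =
    | Const of value
    | If    of exp * exp * exp
    | App   of exp * exp

  exception Stuck of string        (* reaching a "bad" state *)

  let rec eval (e : exp) : value =
    match e with
    | Const v -> v
    | If (c, t, f) ->
        (match eval c with
         | Bool true  -> eval t
         | Bool false -> eval f
         | _ -> raise (Stuck "if: non-boolean test"))  (* tag check *)
    | App (f, a) ->
        (match eval f with
         | Closure g -> g (eval a)
         | _ -> raise (Stuck "app: non-function"))     (* tag check *)

  (* A typed semantics instead defines the program states so
     that terms like App (Const (Int 3), Const (Int 4)) are
     not states at all, and the checks above drop out of the
     transition relation. *)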

Hence, the type system comes into play before we even
have a notion of what a program state is, much less
what a "good" state is or a "bad" state.  And there's
an argument that by carefully defining your program
states, you can use more natural (and indeed efficient)
transitions in your semantics.  

It is also not clear to me how Matthias's notion of
"fully tagged" evaluation works for languages like
ML with polymorphism or for languages with inclusion-based
subtyping.  Each value may have to be tagged with
a number of different "types" and, in the presence
of polymorphism, we'll have to dynamically generate
tags and run over values to dynamically attach these
tags to them.  This becomes quite problematic when
refs (and other computational effects) come into play.
[Our recent ICFP paper and companion technical report
point to some of the trickiness.]  It's just not as
simple as it sounds -- *if you take the approach that
types (i.e., well-formedness of machine states) come
before you even consider what you mean by "bad"*.
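
To illustrate part of the trickiness (again my own toy
sketch, not Matthias's proposal): with a single universal
tagged representation, attaching tags to a polymorphically
typed value means traversing it once the instantiation is
known, and a ref cell can only be "tagged" by copying it,
which destroys sharing:

  type univ =
    | UInt  of int
    | UBool of bool
    | UList of univ list
    | URef  of univ ref

  (* Tagging a list is a traversal, driven by a tagger for
     whatever type the type variable was instantiated at. *)
  let tag_list (tag_elt : 'a -> univ) (xs : 'a list) : univ =
    UList (List.map tag_elt xs)

  (* For refs, the only way to build a tagged view is to
     allocate a fresh cell, so the tagged value no longer
     shares state with the original -- effects get in the way. *)
  let tag_ref (tag_elt : 'a -> univ) (r : 'a ref) : univ =
    URef (ref (tag_elt !r))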

Furthermore, there needs to be a fairly complex
set of rules that relate the set of tags on values
with contexts to determine whether an operation is
allowed or not.  Consider, for instance, a set of
ML modules with complex sharing constraints on some
abstract data type.  In some contexts, a value of
the abstract type may carry any of the modules' "tags"
for the type.  In other contexts (notably modules that
are privy to the definition of the abstract type) 
we can ignore the type tags and treat the underlying
value as a "raw" value of the implementation type.

What this means is that, in defining the transition
relation, you'll be forced to fold a static type
system, relating contexts and operations, into the
dynamic semantics to determine whether a transition
is allowed or not.  It just so happens that for languages
such as Scheme with trivial type systems, these
checks can be done in a context-insensitive manner.
But that's the problem with Scheme, so to speak --
it's completely unaware of context.

So, while at a gross level, Matthias is right about
"safety" in the traditional sense, I think there's
plenty of room for disagreement about the role that
type systems play with respect to safety.

-Greg Morrisett