RE: failures

A while back Stephan Kahrs uncovered an unsoundness in the 1990 version of
Standard ML having to do with datatype's escaping their scope.
Specifically, it was allowed that a datatype could be introduced via a let,
but to have a value of that type (say, a constructor) returned from the let.
This was "obviously wrong" by comparison with abstype's, but it took a bit
of work to show that the semantics as given originally was, in fact,
Martin has a good point about the role of semantics helping us to be sure.
But I would also say that in my experience a semantics can also be useful in
finding errors.  For example, we found the let/callcc bug by trying to push
through a soundness proof, and discovering that the proof didn't work.  It
didn't take long from there to discover a counterexample.
A similar example came up recently in connection with the notion of
applicative and generative functors in ML.  Moscow ML allows one to convert
a generative functor into an applicative functor by eta expansion.  Although
it made me uncomfortable we could find nothing concretely wrong with this,
until we undertook a thorough semantic analysis.  By examining the issue
from the point of view of abstract computational effects we observed that
this "feature" amounted to turning a partial function into a total function
by simply eta-expanding it.  This seemed more concretely wrong, and, as it
turns out, uncovers an unsoundness in the implementation.  (This is all
explained in a forthcoming POPL 2003 paper.)