Re: Formal semantics for C

On 27-Dec-2001, Matthias Felleisen <matthias@ccs.neu.edu> wrote:
> Fergus Henderson <fjh@cs.mu.oz.au> writes: 
> > Undefined behaviour can be handled by having the denotation of the program
> > as a whole be the universal set of all possible behaviours.
> If you call this mathematically useful, please don't let us stop you from
> spending your time on designing a "semantics" for C (or whatever). 

I am happy that Norrish and others have been working on formal semantics for C.
This is good foundational work which may prove useful in the future.

But that is not what I have been doing.  I myself happen to prefer
to put my efforts into things which are more directly useful.  I have
spent some of my time encouraging the C and C++ communities to reduce
the amount of undefined and unspecified behaviour in those languages.
But most of my efforts have been devoted to taking the other path:
designing and implementing a different language which eliminates almost
all the undefined and unspecified behaviour that you find in C and C++.

However, even for pure functional (or logic) languages,
and even when you are explicitly striving to minimize the amount
of unspecified and undefined behaviour in your language,
in practice there are still a number of areas in which
unspecified or even undefined behaviour is desirable.
So the same issues apply to languages like SML, Haskell,
and Mercury, although of course it applies to a much smaller
subset of the language than is the case for C or C++.

> It is my believe that C is an "assembly" language for building and
> "scripting" Unix and that the people who designed C didn't think about
> a mathematical, machine-independent semantics and, frankly, they and their
> friends who use it, probably don't care about it. 

I think everyone on the C standards committee values precisely specified
semantics, and understands the advantages of machine-independent semantics.
However, those are not the only things they value, and in many cases there
are other things which they value higher.

As for whether the semantics should be _mathematical_, well, currently
structured English ("Standardese") seems in most cases to be more useful
than mathematics for language definitions and programming language
standards, because it can be more easily understood by a wider audience.

The use of formal grammar notations for programming language syntax is
a notable exception.  In that case, the advantages of a formal notation
are clear.  These notations are well-known in the industry, are easy to
understand, and allow the semantics to be specified both concisely and
precisely.  Futhermore, there are a plethora of automated tools for
processing such grammars, and some of these tools are widely used.

But currently the state-of-the-art in tools for making use of formal
semantics seems to be well behind that for grammars.  Until we have
better tools, and these tools become widely used and widely accepted,
the advantages of mathematical specifications will be limited.

Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.