Re: Subject reduction fails in Java

Trying to comprehend the intricacies of transition semantics, I've just
thought over the whole recent discussion on Subject Reduction in Java
and come to the following conclusions. Part of them are summarizing,
while others - marked with (*) - are new or at least have not been
stated explicitly. In particular, evaluation order plays a surprising
role. Please correct me if I'm wrong in any respect. I also include some

A. In Java - at the source level -, Subject Reduction does not hold.
   In the context of transition semantics, this is due to the fact that 
   the Substitution Property does not hold. [In general in that context, 
(*)Subject Reduction seems to require the Substitution Property and two 
   invariance properties (Side Effect Property and Array Property), 
   mentioned in the recent type soundness papers of Donald and Sophia et
   al. Why exactly are they needed?]

B. The Substitution Property does not hold because of _each_ of 
   the following reasons.
   1. Method overloading may select - depending on the argument types -
      different methods with unrelated return types
   2. The conditional expression has a too strong typing requirement.
   3. The covariance of array types allows for badly-typed intermediate 
      states in assignments that are _later_ caught by a run-time check.

C. As far as we know, Java is Type-Safe. The question then is how to 
   prove this in the light of A and B. The known solutions are:

   ad 1. The only solution is to keep track of the outcome of the 
         static resolution of method overloading, thus introducing an
         intermediate language for which Subject Reduction holds.
         This can be done, as given in the Java specification, 
         by adding the argument types determined by the static  
         resolution  as annotations within the program.
(*)      In particular, contrary to several earlier statements, the
         extended beta-reduction (with casts) does _not_ solve this
         problem. Recall Phil's

	   class Counterexample {
	     static boolean overloaded (Object x) { return true; }
	     static int overloaded (String s) { return 42; }
	    static void m (Object x){System.out.println(overloaded(x));}
	     public static void main (String[] args) { m("hello"); }



         evaluates to 


         which, because of call-by-value, further evaluates to


         which again prints "42", not "true" as expected.

   ad 2. and 3. Both concepts are harmless when using an evaluation
         semantics, where the Substitution Property is not needed.
         But when one insists in using a transition semantics, one is
         faced with intermediate subterms whose type may have become 
         narrower, which renders the terms badly typed. For both 
         concepts, the intermediate language has to be adapted in order
         to faciliate Subject Reduction (from which the Type-Safety of
         the source language should emerge), as follows.
   ad 2. 
         (i) The ideal solution - even for the source level -  would be 
             that the type of the conditional is the LUB of the types of
             both branches. This LUB does not exist in current Java. 
(*)          "Compound Types" seem to provide it.
	        author = {Martin B{\"u}chi and Wolfgang Weck},
	        title = {{Java} Needs Compound Types},
	        institution = {Turku Center for Computer Science},
	        number = {182},
	        isbn = {952-12-0224-6},
	        note = {http://www.abo.fi/\symbol{126}
	        month = {May},
	        year = 1998}

         (ii) The problem is solved by the extended beta-reduction
              (which - at no run-time costs - inserts type casts),
(*)           but only because the conditional expression is non-strict:

             Recall the counterexample given by Benjamin.

		Integer   i = new Integer();
		HashTable h = new HashTable();
		Object o = m (i, h);

             Naively reducing the last line, we obtain

		Object o = (b ? i : h);

             which is badly typed, whereas

		Object o = (b ? (Object)i : (Object)h);

             is type-correct. Now if the 2nd and 3rd arguments of the 
	     expression had to be evaluated first, we would end up with
             the same problem again. But fortunately, the condition is
             evaluated first and one of both branches is selected.

         (iii) inspired by (i), one may employ a weaker typing rule that
               allows for _any_ upper bound in the intermediate   
               language. A (minor?) complication is non-unique
               intermediate typing.

         (iv) another solution is to annotate the intermediate language
              with the type of the overall expression as determined 
              statically. This preserves unique typing.

   ad 3. The typing rule for array assignment has to be relaxed in the 
         sense that widening between the rhs and the lhs is not 
(*)      required. This would not be necessary if the Java semantics
         specified that the rhs is evaluated _before_ the lhs.

- David

THAT's   _\\/,  For a more realistic picture, my address and other stuff
NOT me:  (@ @)  see http://www.in.tum.de/~oheimb/                    <><