Wednesday, January 02, 2008

Switch/Case and Autoboxing/Autounboxing

Because I have seen people try this several times in the last couple of weeks, I decided I might just as well write a post about it: a seemingly common misconception about Java 5's auto(un)boxing.

Although it has been a while since we moved to Java 5, only now are people slowly getting familiar with the new syntax features. Most of them learned Java beginning with 1.4 and have a history in the dBase or FoxPro world. So object-oriented programming, and Java as one implementation of it, are understood, though maybe not as deeply as you would expect. Some are especially impressed by the ease of use that autoboxing and autounboxing bring to the wrapper classes for primitives. I also find that feature quite useful, because objects falling out of the persistence framework have full-blown object types for boolean values or numbers, which makes them rather cumbersome to work with. Autounboxing helps a lot there:

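// Before Java 5: the wrappers have to be unboxed explicitly
if (theBusinessObject.isCompleted().booleanValue() && theBusinessObject.getNumber().intValue() > 5) {
...
}

// With Java 5: autounboxing lets the compiler do it
if (theBusinessObject.isCompleted() && theBusinessObject.getNumber() > 5) {
...
}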

Undeniably the second example is much easier to read. The benefit becomes even more obvious once you start doing calculations based on wrapped primitives. Nevertheless, problems may arise if you do not know what this new syntax does under the covers. In the above case it is quite clear that the compiler will just put the calls to ".booleanValue()" and ".intValue()" into the bytecode on your behalf.
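To make the point about calculations a bit more concrete, here is a small sketch. The class name WrapperMath and its two methods are made up for illustration. Both methods compute the same sum, once with every boxing and unboxing call written out and once leaving that work to the compiler:

```java
// Hypothetical example class, invented for this illustration.
class WrapperMath {

    // Pre-Java-5 style: every unboxing and boxing step is written out.
    static Integer sumExplicit(Integer a, Integer b) {
        return Integer.valueOf(a.intValue() + b.intValue());
    }

    // Java 5 style: the compiler inserts the intValue() and valueOf() calls.
    static Integer sumAuto(Integer a, Integer b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(sumExplicit(2, 3)); // prints 5
        System.out.println(sumAuto(2, 3));     // prints 5
    }
}
```

The int literals passed in "main()" are themselves autoboxed to match the Integer parameters.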

Consider this example:

public class Box {

 private static final Integer one = 1;
 private static final Integer two = 2;
 private static final Integer three = 3;

 public static void main(String[] args) {
  int myInt = three;
  switch (myInt) {
   case 1:
    System.out.println("One");
    break;
   case 2:
    System.out.println("Two");
    break;
   case 3:
    System.out.println("Three");
    break;
   default:
    System.out.println("None");
    break;
  }  
 }
}

Thanks to autoboxing, the reference type variables "one", "two" and "three" can be assigned using a primitive int on the right-hand side of the "=" sign. And because "three" is autounboxed in the first line of "main()", it can be assigned to "myInt". After that you find just a regular switch/case construct on that primitive int.

Disassembling the compiled class, e.g. with javap, reveals the "magic" behind this:


D:\temp>c:\jdk1.5.0_12\bin\javap -c Box
Compiled from "Box.java"
public class Box extends java.lang.Object{
...
public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field three:Ljava/lang/Integer;
   3:   invokevirtual   #3; //Method java/lang/Integer.intValue:()I
...

static {};
  Code:
   0:   iconst_1
   1:   invokestatic    #10; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   4:   putstatic       #11; //Field one:Ljava/lang/Integer;
   7:   iconst_2
   8:   invokestatic    #10; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   11:  putstatic       #12; //Field two:Ljava/lang/Integer;
   14:  iconst_3
   15:  invokestatic    #10; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   18:  putstatic       #2; //Field three:Ljava/lang/Integer;
   21:  return
}

At index 3 in "main()" you can see the automatically inserted call to java.lang.Integer.intValue(). This is the autounboxing. In the static initializer it goes the other way round: The compiler inserts java.lang.Integer.valueOf(int) at indexes 1, 8, and 15. Here the autoboxing takes place.

So far so easy. Now look at this:

public class Box {

 private static final Integer one = 1;
 private static final Integer two = 2;
 private static final Integer three = 3;

 public static void main(String[] args) {
  int myInt = three;
  switch (myInt) {
   case one:
    System.out.println("One");
    break;
   case two:
    System.out.println("Two");
    break;
   case three:
    System.out.println("Three");
    break;
   default:
    System.out.println("None");
    break;
  }  
 }

}

Trying to compile this will fail:

D:\temp>c:\jdk1.5.0_12\bin\javac Box.java
Box.java:10: constant expression required
                        case one:
                             ^
Box.java:13: constant expression required
                        case two:
                             ^
Box.java:16: constant expression required
                        case three:
                             ^
3 errors

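I have seen this pattern numerous times, and whenever someone comes across it they wonder how it differs from the first example and why they get a compile error. They expect unboxing to happen at each "case". What they do not realize is that a "case" label must be a compile-time constant expression, and a reference to an Integer field is not one, even if the field is "static final". As the static initializer in the javap output shows, the field's value is only produced at run time by a call to Integer.valueOf(int), and reading it as an int would take yet another method call to intValue(). That is not the same as putting the primitive value there, and it is of course illegal in that context. Had the three constants been declared as primitive "int", the second example would compile.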
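For comparison, here is a sketch of a variant that does compile. The class name BoxFixed, the upper-case constant names and the helper method describe() are my own choices for illustration. The only change that matters is that the case labels are now primitive int constants initialized with literals:

```java
// Variant of the Box example; names are illustrative.
class BoxFixed {

    // Primitive "static final" fields initialized with literals are
    // compile-time constants, so they are legal as case labels.
    private static final int ONE = 1;
    private static final int TWO = 2;
    private static final int THREE = 3;

    // The value being switched on may still be a wrapper.
    private static final Integer three = 3;

    static String describe(Integer value) {
        int myInt = value; // autounboxing, as in the first example
        switch (myInt) {
            case ONE:
                return "One";
            case TWO:
                return "Two";
            case THREE:
                return "Three";
            default:
                return "None";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(three)); // prints Three
    }
}
```

The wrapper type is no problem on the side of the value that gets switched on; only the labels have to be constants.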

I have found it helpful to show people the bytecode output that gets generated in the first case. As a side effect they also usually learn for the first time about the existence of disassemblers and decompilers :)

One last piece of advice: Eclipse has a feature to specify a different syntax coloring for places where autoboxing and autounboxing occur. I recommend defining a clearly recognizable format; I use underlined text in a dark red color, for example. I find it rather helpful as a reminder that in such situations a null check is sometimes a good idea. After all, reference types might be null, as opposed to primitive values.
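A minimal sketch of why the null check matters. The class NullUnboxing and its methods are invented for this post, and the Integer parameter stands in for a value coming out of the persistence framework:

```java
// Hypothetical example class, invented for this illustration.
class NullUnboxing {

    // Safe: the null check runs before the hidden intValue() call.
    static boolean isAboveFive(Integer number) {
        return number != null && number > 5;
    }

    // Shows what happens without the check: unboxing a null reference
    // compiles to number.intValue(), which throws a NullPointerException.
    static boolean unboxingFails(Integer number) {
        try {
            int value = number; // hidden call to number.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAboveFive(null));   // prints false
        System.out.println(isAboveFive(6));      // prints true
        System.out.println(unboxingFails(null)); // prints true
    }
}
```

Nothing in the source text of "int value = number;" hints at a method call, which is exactly why the special coloring is useful.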
