Common gotchas in Java

Overview

Java is a minimalist language with deliberately fewer features than other languages. Nevertheless, it has edge cases with strange effects, and even some common cases with surprising effects that trip up the unwary. If you are used to reading another language, you can easily read Java the wrong way, leading to confusion.

Variables are only references or primitives

That's right, variables are not Objects. This means that when you see the following, s is not an object and it is not a String; it is a reference to a String:

    String s = "Hello";

This answers many areas of confusion, such as:

Q: If String is immutable, how can I change it? e.g. s += "!";
A: You can't in normal Java. You can only change a reference to a String: s += "!" creates a new String and makes s refer to it, leaving the original String untouched.
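The answer above can be checked with a small sketch. The class and method names here (StringReferenceDemo, appendBang) are made up for illustration; the point is only that a second reference to the original String still sees the unchanged text after +=.

```java
class StringReferenceDemo {
    // Returns {original, reassigned} to show that += builds a new String
    // rather than changing the existing one.
    static String[] appendBang(String s) {
        String original = s; // a second reference to the same String object
        s += "!";            // creates a new String and points s at it
        return new String[] { original, s };
    }

    public static void main(String[] args) {
        String[] result = appendBang("Hello");
        System.out.println(result[0]); // prints "Hello"
        System.out.println(result[1]); // prints "Hello!"
    }
}
```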

== compares references, not their contents.

To add to the confusion, using == sometimes works. If you have two immutable values which are the same, the JVM can try to make the references the same too, e.g.

    String s1 = "Hi", s2 = "Hi";
    Integer a = 12, b = 12;

In both these cases, an object pool is used so the references end up being the same. s1 == s2 and a == b are both true because the JVM has made both references point to the same object. However, vary the code a little so the JVM doesn't pool the objects, and == returns false, perhaps unexpectedly. In this case you need to use equals.

    String s3 = new String(s1);
    Integer c = -222, d = -222;

    s1 == s2      // is true
    s1 == s3      // is false
    s1.equals(s3) // is true
    a == b        // is true
    c == d        // is false (different objects were created)
    c.equals(d)   // is true

For Integer, the object pool covers -128 up to at least 127 (the upper bound may be higher, depending on the JVM and its settings).
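As a minimal sketch of the safe alternatives, the following compares boxed values by content rather than by reference. BoxedCompareDemo and sameValue are hypothetical names; Objects.equals is the standard null-safe helper from java.util.

```java
import java.util.Objects;

class BoxedCompareDemo {
    // Null-safe comparison of the values, not of the references.
    static boolean sameValue(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    public static void main(String[] args) {
        Integer c = -222, d = -222;
        System.out.println(sameValue(c, d));              // prints "true"
        System.out.println(c.intValue() == d.intValue()); // prints "true", compares primitives

        Integer a = 12, b = 12;
        // Boxing a value in -128..127 always yields the pooled object.
        System.out.println(a == b);                       // prints "true"
    }
}
```

Outside the guaranteed range, the result of == on boxed values depends on the JVM, so it is best not to rely on it at all.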

Java passes references by value

All variables are passed by value, even references.  This means when you have a variable which is a reference to an object, this reference is copied, but not the object. e.g.

    public static void addWord(StringBuilder sb) {
        sb.append(" word"); // changes the object the caller also refers to
        sb = null;          // only clears this method's copy of the reference
    }

    StringBuilder sb = new StringBuilder("first");
    addWord(sb);
    addWord(sb);
    System.out.println(sb); // prints "first word word"

The object referenced can be changed, but changes to the copied reference have no effect on the caller.
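A classic consequence is that a swap method cannot work on its arguments. The sketch below uses made-up names (SwapDemo, swap, afterSwap) and assumes nothing beyond the standard library.

```java
class SwapDemo {
    // Swaps only the method's local copies of the two references.
    static void swap(StringBuilder x, StringBuilder y) {
        StringBuilder tmp = x;
        x = y;
        y = tmp;
    }

    // Shows that the caller's variables are unaffected by swap().
    static String afterSwap() {
        StringBuilder first = new StringBuilder("first");
        StringBuilder second = new StringBuilder("second");
        swap(first, second);
        return first + " " + second;
    }

    public static void main(String[] args) {
        System.out.println(afterSwap()); // prints "first second"
    }
}
```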

In most JVMs, the Object.hashCode() doesn't have anything to do with memory location

The value returned by Object.hashCode() has to remain constant for a given object. Otherwise, hash collections like HashSet or ConcurrentHashMap wouldn't work. However, the object can be anywhere in memory and can change location without your program being aware this has happened. Using the location for a hashCode wouldn't work (unless you have a JVM where objects are not moved).

For OpenJDK and the HotSpot JVM, the hashCode() is generated on demand and stored in the object's header. Using Unsafe you can see whether the hashCode() has been set and even change it by overwriting it.
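The stability of the default hash code can be observed without Unsafe. This is only a sketch: IdentityHashDemo and stableAcrossGc are hypothetical names, and System.gc() is merely a hint, so the object may or may not actually be moved between the two reads.

```java
class IdentityHashDemo {
    // Reads the identity hash before and after requesting a garbage collection.
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);
        System.gc(); // a request only; the JVM may ignore it
        int after = System.identityHashCode(o);
        // For a plain Object, hashCode() is the identity hash code.
        return before == after && after == o.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossGc()); // prints "true"
    }
}
```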

Object.toString() does something surprising rather than useful

The default behaviour of toString() is to print an internal name for a class and a hashCode().
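As a small sketch, the default text can be rebuilt by hand from the class name and the hash code in hexadecimal, which is how the Object.toString() documentation describes it. DefaultToStringDemo and manual are hypothetical names.

```java
class DefaultToStringDemo {
    // Rebuilds by hand what Object.toString() is documented to return.
    static String manual(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(o);         // java.lang.Object@ followed by hex digits
        System.out.println(manual(o)); // the same text as the line above
    }
}
```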

As mentioned, the hashCode is not the memory location, even though it is printed in hexadecimal. Also, the class name is confusing, especially for arrays. For example, a String[] is printed as [Ljava.lang.String; The [ signifies that it is an array, the L signifies that the element type is a class or interface rather than a primitive like byte (which, BTW, has a code of B), and the ; signifies the end of the class name. For example, say you have an array like

    String[] words = { "Hello", "World" };
    System.out.println(words);

This prints something like

    [Ljava.lang.String;@45ee12a7
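The internal names can be inspected directly with Class.getName(). This sketch uses made-up names (ArrayClassNameDemo, nameOf) and a few array types chosen for illustration.

```java
class ArrayClassNameDemo {
    // Returns the internal class name that the default toString() would print.
    static String nameOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new String[0])); // prints "[Ljava.lang.String;"
        System.out.println(nameOf(new byte[0]));   // prints "[B"
        System.out.println(nameOf(new int[0][0])); // prints "[[I"
    }
}
```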

Unfortunately, you have to know that the object is an array. For example, if all you have is an Object words, you have a problem: you have to know to cast it and call Arrays.toString(...) instead. This breaks encapsulation in a rather bad way and is a common source of confusion on StackOverflow.
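One way to cope when all you have is an Object is to check the type yourself. The helper below is a hypothetical sketch (ArrayToStringDemo and describe are not library names), and it only handles object arrays and int[]; other primitive array types would each need their own branch.

```java
import java.util.Arrays;

class ArrayToStringDemo {
    // Hypothetical helper: picks a readable form depending on the runtime type.
    static String describe(Object o) {
        if (o instanceof Object[]) {
            return Arrays.deepToString((Object[]) o); // also handles nested arrays
        }
        if (o instanceof int[]) {
            return Arrays.toString((int[]) o);
        }
        return String.valueOf(o); // falls back to the object's own toString()
    }

    public static void main(String[] args) {
        Object words = new String[] { "Hello", "World" };
        System.out.println(describe(words));                 // prints "[Hello, World]"
        System.out.println(describe(new int[] { 1, 2, 3 })); // prints "[1, 2, 3]"
    }
}
```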

I have asked different developers at Oracle about this and the impression I got is that it's too hard to fix it now. :(

Comments

  1. The default behavior for Object.toString() makes sense for pretty much everything except for arrays. For some arbitrary object I create, Java has no idea how I want that instance to be represented, so it prints what it knows: the name of the class and its hashcode.

    Unfortunately, Java does know more about an array: that it is a grouping of objects of a certain type, each having their own toString() method. It seems to me like the most logical way this would be represented as a string is to do something like: [1, 2, 3, null, null] (for an array of size 5). Or you could just have it print what Arrays.toString() prints, which removes the nulls at the end and instead makes it look like an ArrayList. But unfortunately, that's not the way it's done, which is a pity because the hashcode of the list is not very useful.
