Sunday, November 26, 2006

Java 5 crashes on Linux 2.6, too

Two and a half weeks ago I published a post about problems with random VM crashes using Java 5 on Linux with a 2.4 kernel.

Most of the feedback I got suggested upgrading to a more recent kernel version. Because a kernel upgrade is a major undertaking for our application (several thousand clients deployed with Java 1.4.2 on RH9), we needed to be sure it would actually help.

Because all the crash reports (see the original post) seemed to point toward the garbage collector, I wrote a little test application to stress it. It creates a configurable number of threads, each of which just keeps allocating byte[] arrays of random size. If an OutOfMemoryError occurs, the affected thread dies and gets replaced with a new one. You can find the code at the bottom of this post.

I started 4 instances of this tool under Ubuntu 6.06, each configured with 20 threads (first parameter) and up to 40MB of memory per thread (second parameter, in bytes).
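Since the second parameter is given in bytes, here is a minimal sketch of the arithmetic behind those numbers. The class GCStressSizing and its two helper methods are hypothetical and not part of the test tool; they only illustrate how 40MB translates into the argument value and what 20 threads could request at the same moment in the worst case.

```java
public class GCStressSizing {
    /** Converts megabytes to bytes, the unit of the second parameter. */
    public static long megabytesToBytes(int megabytes) {
        return megabytes * 1024L * 1024L;
    }

    /** Upper bound on the bytes held at one instant if every thread allocates the maximum. */
    public static long worstCaseDemand(int threads, long maxBytesPerThread) {
        return threads * maxBytesPerThread;
    }

    public static void main(String[] args) {
        long maxBytes = megabytesToBytes(40);
        long worstCase = worstCaseDemand(20, maxBytes);
        System.out.println("second parameter: " + maxBytes + " bytes");
        System.out.println("worst case per instance: "
                + worstCase / (1024 * 1024) + " MB");
    }
}
```

Because the sizes are random, the real demand usually stays well below that bound. Whether the bound exceeds the available heap depends on the heap settings of the VM, which are not stated here.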

After 4 days of continuous running, just as we had started to feel a little safer, one of the processes crashed, leaving an hs_err file behind that again told us the current activity was a full garbage collection.

We have now filed a bug report with Sun, which they have yet to review. They say that it currently takes around 3 weeks before you get an official bug id. However, I do not think any fix will arrive in time for us to use Java 5 for our next application release.

Does anyone know if there is a way of getting the crash files with debug symbols? Maybe this would allow us to do some more testing on our own. Is there some sort of debug-VM version available for download?

import java.util.Random;

public class GCStress {
    private static Thread[] threads;

    private static Random random = new Random();

    private static int maxBytes;

    private static int numThreads;

    public static void main(String[] args) {
        numThreads = Integer.parseInt(args[0]);
        maxBytes = Integer.parseInt(args[1]);
        threads = new Thread[numThreads];
        for (int x = 0; x < numThreads; x++) {
            threads[x] = new Thread(new Allocator(), "Thread_" + x);
            threads[x].start();
        }

        while (true) {
            for (int x = 0; x < numThreads; x++) {
                if (!threads[x].isAlive()) {
                    threads[x] = new Thread(new Allocator(), "Thread_r" + x);
                    threads[x].start();
                }
            }
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                // ignore
            }

        }
    }

    private static class Allocator implements Runnable {

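        public void run() {
            // An OutOfMemoryError is not caught here: it ends this thread,
            // and the loop in main() then starts a replacement.
            while (true) {
                int tSize = random.nextInt(maxBytes);
                System.out.println(Thread.currentThread().getName()
                        + " allocating " + tSize / 1024 + "kb");
                // Never read; the array only fills the heap and becomes
                // garbage on the next iteration.
                byte[] tMemFiller = new byte[tSize];
                try {
                    Thread.sleep(random.nextInt(200));
                } catch (InterruptedException e) {
                    // ignore
                }
            }
        }

    }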
}

1 comment:

Daniel Schneller said...

We are currently working with Sun to investigate this. There is a bug report filed under ID #6500147.