Monday, April 30, 2007

Exception in "CompilerThread0" java.lang.OutOfMemoryError

Ever since we switched our build process from Java 1.4 to Java 5 (1.5.0_09) we have seen FindBugs crash with an OutOfMemoryError when started from Ant. The whole thing is running under RedHat Enterprise Linux 4. The output is always the same:

[findbugs] Running FindBugs...
[findbugs] Exception in thread "CompilerThread0" java.lang.OutOfMemoryError: requested 134217736 bytes for Chunk::new. Out of swap space?

Our first attempts to increase the heap size with the -Xmx VM parameter did not help. We suspected the newer FindBugs release we had installed roughly at the same time, but this turned out to be wrong, because newer and older versions showed the same behaviour.

Armed with fresh knowledge about the SAP Memory Analyzer, just learned at jax.07, I tried to get a heap dump to find out what was causing the problem. Unfortunately the VM ignored my -XX:+HeapDumpOnOutOfMemoryError option completely (Sun's VM Options page lists it as -XX:-Heap... with a minus, so I tried that, too).

Getting suspicious, I realized that Java 5 VMs usually say something like "OutOfMemoryError: Java heap space" when the Java heap is exhausted, and that part was missing in this case.
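To make that distinction concrete, here is a minimal sketch that sorts an OutOfMemoryError message into rough categories based on its text alone. The class name OomMessages and the category labels are made up for illustration, and the substrings it looks for are an assumption based on the messages Sun's Java 5 VM typically prints; other VMs and versions may word things differently.

```java
// Hypothetical helper, not part of any library: classifies an
// OutOfMemoryError message purely by the text it contains.
final class OomMessages {

    /** Returns a rough category for the given OutOfMemoryError message. */
    static String classify(String message) {
        if (message == null) {
            return "unknown";
        }
        if (message.contains("Java heap space")) {
            return "java-heap";   // the Java heap is full; -Xmx is the knob to turn
        }
        if (message.contains("PermGen space")) {
            return "perm-gen";    // the permanent generation is full
        }
        if (message.contains("Out of swap space?")) {
            return "native";      // a native allocation inside the VM failed
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Message text taken from the FindBugs output shown above.
        String seen = "requested 134217736 bytes for Chunk::new. Out of swap space?";
        System.out.println(classify(seen)); // prints "native"
    }
}
```

A message in the "native" category is a hint that a larger -Xmx is unlikely to help, which matches what we saw.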

I had never before seen that strange "CompilerThread0" in an error message, same for the "requested xx bytes for Chunk::new. Out of swap space?" part. Looking around I stumbled across this bug report (and some more, all related to this one - just follow the related bugs list): Bug #6487381 "Additional path for 5.0 jvm crash on exhaustion of CodeBuffer".

Apparently something goes wrong when the HotSpot JIT tries to compile something to native code for faster execution. One of the other bugs you can find by following the link above mentions a problem in situations where there is not enough room for further compiles in what is called the CodeBuffer: instead of failing gently and just continuing to run the application without compiling, or throwing some older compiled code away, the VM would just crash. This should have been fixed with Java 5.0, but apparently there is another code path in the VM that will cause a hard failure. (I wonder what may be requesting another 128MB(!), but anyways...)
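For reference, the 128MB figure follows directly from the byte count in the error message. A small sketch of the arithmetic (the class and method names are made up for illustration):

```java
// Hypothetical helper: splits a byte count into whole MiB plus a remainder.
final class ChunkSize {

    static final long MIB = 1024L * 1024L;

    /** Number of whole MiB contained in the given byte count. */
    static long wholeMiB(long bytes) {
        return bytes / MIB;
    }

    /** Bytes left over after removing the whole MiB. */
    static long remainderBytes(long bytes) {
        return bytes % MIB;
    }

    public static void main(String[] args) {
        long requested = 134217736L; // value from the FindBugs error output
        System.out.println(wholeMiB(requested) + " MiB + "
                + remainderBytes(requested) + " bytes"); // prints "128 MiB + 8 bytes"
    }
}
```

So the compiler thread asked for exactly 128 MiB plus 8 bytes in a single allocation.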

From what the bug says this issue will be fixed with 5.0u12, however 5.0u11 is the most recent release to download, so no help there.

To find out whether a single method was reproducibly causing the crash, I tried the -XX:+PrintCompilation option. It produced lots and lots of output about compiled methods. A "guide" to its output can be found here: decyphering -XX:+PrintCompilation output.

As one might have expected, this did not reveal anything useful either. So the last resort was to switch to the Client VM which managed to complete the FindBugs run. In retrospect this explains why we were able to build without any problems on a Windows machine - the Client VM is default there unless you explicitly request the server.
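If you are not sure which VM flavor a given machine picks by default, a quick check is to print the standard java.vm.name system property. The sketch below is only an illustration: the class name VmFlavor is made up, and matching on the words "Server" and "Client" is an assumption based on the names Sun's HotSpot VMs report (for example "Java HotSpot(TM) Client VM").

```java
import java.util.Locale;

// Hypothetical helper: guesses the HotSpot flavor from the VM name.
final class VmFlavor {

    /** Returns "server", "client" or "unknown" for the given VM name. */
    static String flavorOf(String vmName) {
        if (vmName == null) {
            return "unknown";
        }
        String lower = vmName.toLowerCase(Locale.ROOT);
        if (lower.contains("server")) {
            return "server";
        }
        if (lower.contains("client")) {
            return "client";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // java.vm.name is a standard system property set by the VM itself.
        String vmName = System.getProperty("java.vm.name");
        System.out.println(vmName + " -> " + flavorOf(vmName));
    }
}
```

Running it once without options and once with -client or -server shows what the default is on that machine.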

For now we will leave it at that and try again as soon as 5.0u12 comes out.

7 comments:

Markus Kohler said...

Hi,
Have you tried to increase the code cache size with
the -XX:ReservedCodeCacheSize option?
Regards,
Markus Kohler

Daniel Schneller said...

Yes, I maxed it up to 256MB, double the size it requested. Also tried with equal initial and maximum heap sizes. To no avail.

Ralf Schmelter said...

Hi,

this error occurs because the JIT compiler could not allocate enough C-Heap (the one you get via malloc()), so increasing the Codecache size will not help (on the contrary, since it will decrease the amount of C-Heap available). The problem is that the JIT (the server compiler) has some places where it might use excessive amounts of memory (130MB in your case). Sun has added some heuristics to prevent this, but as you saw they will not always save you. There are a few possible solutions. First, you could file a bug report with Sun, but you probably need a smaller test case which shows the error. Second, you could find out the method whose compilation causes the error (with the compilation traces) and add it to a .hotspot_compiler file to exclude it from JIT compilation. Or you can try to reduce the Java-Heap and Perm-Gen size, so more C-Heap is available, and you might be lucky that this is enough. Or, as you found out yourself, you can use the client compiler, which doesn't use the sophisticated algorithms which need so much memory. And sometimes there is really not enough swap space available (as the message suggests), so increasing it would help. But most of the time it's not the available memory, but the available address space that's missing on x86.

Regards,
Ralf

Martin said...

|> Or, as you found out yourself, you can
|> use the client compiler, which doesn't
|> use the sophisticated algorithms which
|> need so much memory.

For me, just removing the "-server" option did not help.

Martin said...

|> For me, just removing the "-server"
|> option did not help.

But defining "-client" explicitly helps. Thanks.

adolfo said...

I had the same problem and it seems that it started with 1.5.0_06, because with 1.5.0_05 I cannot reproduce it...

Brenda said...

This issue just started happening for my standalone Java application which runs on a Solaris SPARC T200 box. My Java build is 1.5.0_23-b01.

This does not always happen. The program reads about 100 files daily,
and the error occurs randomly at different stages of execution.

Any new insights on this?
Thanks