Awt_Robot and File Handles
In January I wrote about translucent windows with Swing. I found the technique in O'Reilly's Swing Hacks book and was pretty pleased with the results.
However, for some time we have been experiencing problems with our application in production (running Sun's JRE 1.4.2 on RedHat 9). They were not obviously GUI related; for some strange reason a barcode reader attached to the machine randomly stopped working. We traced and debugged a whole lot, but could not find any cause other than a possible bug in the native driver layer for these (JavaPOS) devices. As a result we went to the driver vendor and had them look into it on the driver level.
After a while they came up with some results which revealed that there really was a problem in the native part of the driver; however, they could not reproduce it using their test tools, only when using our application.
Basically, what they found was a process called awt_robot that held a file handle on a device node in the /dev filesystem used to communicate with the barcode scanner. However, that process had not issued an open call on the file but had been using the already open file handle from the start. When our main application tried to close the handle at some point, that close call froze until someone killed awt_robot manually. Only then would the close call return and the application continue normally.
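For anyone who wants to check for this kind of situation themselves, here is a minimal sketch of how one might find out which processes hold a given file open on Linux. It is not the tool the vendor used (I do not know what they used); it simply scans /proc/<pid>/fd, and the function name holders_of is made up for this example.

```python
import os


def holders_of(path):
    """Return the sorted PIDs of processes that have `path` open.

    Linux only: relies on the /proc/<pid>/fd symlinks. Processes whose
    fd directory we are not allowed to read are silently skipped.
    """
    target = os.path.realpath(path)
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process vanished or permission denied
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(entry))
                    break
            except OSError:
                continue  # descriptor was closed while we were looking
    return sorted(pids)
```

Calling something like holders_of("/dev/ttyS0") (a hypothetical device node, ours was a different one) as root should list every process with a handle on it, including children that merely inherited the handle and never opened the file themselves.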
So that explained where the problem was, but not how it came to pass. Armed with that knowledge about awt_robot, we started looking around and found it to be a binary in $JAVA_HOME/jre/lib/i386/. Apparently it is part of the implementation of java.awt.Robot. No one here really knew that there was a separate program backing the Robot class, but when we debugged through the translucent-windows part of our code, we could see an instance of this program being forked by the java process upon first invocation. Once started, it would not go away until the VM itself was shut down.
Up to then we had (at least I had) believed that the native robot stuff was part of the JVM itself. As they say: You never stop learning...
Anyway (learning even more here), since any process forked under Linux inherits the file handles of its parent (I found this page from IBM with a quick Google search), awt_robot also inherits the (already open) handle on the device node. It seems that as long as the child process does not close this handle, the parent cannot do so either and has to wait for it. Since awt_robot usually does not exit before its parent java process stops, the handle is never actually closed during the application's runtime.
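The inheritance part is easy to see in a small experiment. The following is only a sketch under my own assumptions: a pipe stands in for the device node, and all names (eof_visible, demo) are invented for the example. It does not reproduce the blocking close we saw, which presumably depends on the driver; it only shows that a handle closed by the parent stays open underneath for as long as a forked child still holds its inherited copy.

```python
import os
import select


def eof_visible(read_fd, timeout=0.2):
    """Return True if the read end reports end-of-file within the timeout."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    return bool(ready) and os.read(read_fd, 1) == b""


def demo():
    read_fd, write_fd = os.pipe()        # stand-in for the device handle
    release_r, release_w = os.pipe()     # only used to tell the child to exit

    pid = os.fork()
    if pid == 0:
        # Child: it never opens anything, yet it owns copies of all of
        # the parent's descriptors, including write_fd.
        os.close(release_w)
        os.read(release_r, 1)            # block until the parent releases us
        os._exit(0)

    os.close(release_r)
    os.close(write_fd)                   # the parent closes *its* copy ...
    before = eof_visible(read_fd)        # ... but the child still holds one

    os.close(release_w)                  # let the child exit
    os.waitpid(pid, 0)
    after = eof_visible(read_fd)         # now the last copy is gone

    os.close(read_fd)
    return before, after


if __name__ == "__main__":
    print(demo())
```

On Linux I would expect this to print (False, True): the read end only sees end-of-file once the child has gone away, even though the parent closed its own handle long before.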
So in the end, our hardware problem turned out to be a very well hidden GUI problem. We commented out the code for the fancy windows and everything ran smoothly again. We did not check whether this problem also occurs on the Windows version of Java, but I believe one should at least be aware of the situation.
These comments originate from my old blog and I find them interesting enough to repost here: