Saturday, December 30, 2006

Java 5 crash - the saga continues

It really seems to be a problem related to the GC options mentioned here. However there seems to be more to it. With the rather aggressive setting of only 6 seconds between full GCs we were able to reproduce the problem, however only on a very specific combination of hardware, OS and Java VM.

Running on RedHat 9 with Sun's Java VM 1.5.0_09 we could only see the problem on one machine that appears to be the same as all the others we use. However as this very same machine shows no problems at all when running our application with Java 1.4.2_08 or 1.6.0RC - even with the 6 seconds interval - I do not believe in faulty hardware.

I removed the machine's hard drive and put it into an identical system (as far as we can see and are told by its manufacturer). In that box (and two more I tried) I cannot get any crash with either Java 1.4, 1.5 or 1.6. Strangely enough even the drive from one of those other machines mounted into the problematic box showed no problems. We really do not have any clue here.

Nevertheless I have now moved everything back to where it was and have increased the GC interval to 4 hours. So far I have not seen any more crashes. The system will be running until January 3rd. If there are no crashes between now and then, we will consider this parameter a solution to our immediate problem, even though the underlying cause may never be found...

Friday, December 15, 2006

Some progress on Java 5 on Linux crash

In two previous posts (first here and second here) I reported on Java 5 VM crashes on Linux machines.

Digging deeper into the problem with external support led to some new evidence. Apparently the problem is in some way related to regular garbage collections initiated by the so-called "GC Daemon" thread. It gets spawned when you use RMI in some fashion or other and triggers full GCs in order to get rid of unreachable remote objects.

One can specify the interval (in milliseconds) between calls to the garbage collector using -Dsun.rmi.dgc.server.gcInterval and -Dsun.rmi.dgc.client.gcInterval. With our application using RMI to call remote services we reduced this value to as little as 6 seconds. As we expected, this let us reproduce our problem much more often than before. In 4 days we observed 8 VM crashes, each of them with very similar hs_err files.

This means that my test program might have revealed yet another bug: it did produce some crashes, even though it does not use anything connected with RMI.

In the freshly released Java SE 6 the default interval was increased from one minute to one hour (see Bug #6200091 and the RMI release notes).

As a workaround we might just increase those values for our application explicitly to lessen the risk of related crashes, however a real fix would be great.
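As a minimal sketch of that workaround, the following helper builds the two JVM flags for a given interval. The class and method names are made up for illustration; only the two property names come from the text above, and the values shown are the 6 seconds we used for testing and the one hour that Java SE 6 uses as its default.

```java
// Hypothetical helper: formats the RMI DGC interval flags for a JVM command line.
class GcIntervalFlags {
    static final String CLIENT = "sun.rmi.dgc.client.gcInterval";
    static final String SERVER = "sun.rmi.dgc.server.gcInterval";

    // The interval is given in milliseconds, as the properties expect.
    static String[] flagsFor(long intervalMillis) {
        return new String[] {
            "-D" + CLIENT + "=" + intervalMillis,
            "-D" + SERVER + "=" + intervalMillis
        };
    }

    public static void main(String[] args) {
        long sixSeconds = 6L * 1000L;        // aggressive setting used to provoke the crash
        long oneHour = 60L * 60L * 1000L;    // default interval in Java SE 6
        for (String flag : flagsFor(sixSeconds)) {
            System.out.println(flag);
        }
        for (String flag : flagsFor(oneHour)) {
            System.out.println(flag);
        }
    }
}
```

The printed flags would be passed to the java launcher in front of the main class name.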

Sunday, December 03, 2006

Follow Up: F-Secure's response

Not too long ago I wrote about a problem concerning F-Secure Anti-Virus 2007 and the Kerio Personal Firewall in this article. At the end of it I said I would inform F-Secure about the problem. I did and this is about their response.

On November 13th I used the support form on F-Secure's website (Germany) to report to them the problem I had experienced. My report was about 2K long (I still have it) and included precise information about the situation, what I had found out and what to do to prevent it. I suggested having the installer issue a warning concerning 3rd party personal firewalls, especially since those seemed to be no problem with the 2006 version. I also included a link to my even more verbose blog post.

On November 14th I got a response from one of their support agents. Apart from a lengthy auto-generated intro on how to issue correct problem reports (comes with every mail, does not have anything to do with your individual request) I got a very brief answer (I translated this from German, but I tried to be as exact as possible):

Dear F-Secure Customer, thank you very much for your request to our technical support. Unfortunately the firewall software you use is not compatible with our software. If you need a firewall, we recommend using our "Internet Security". Should you have further questions or need assistance with the actions above please call under ... We wish you a pleasant day. Regards, ...

If I was asked I could probably tell the ready-made text blocks used in this answer apart...

I replied again, because this did not satisfy me at all. I told them that I would have expected a little more verbosity, especially since I had provided a detailed analysis of the problem. Furthermore I told them about the same problem happening to several people I know - all not too tech-savvy people who called me for assistance. Some of them were so pis^H^H^H dissatisfied that they will probably not buy a new subscription for F-Secure once their current one has expired.

I specifically asked two simple questions: a) Whether my analysis was correct and if it could be used to prevent the problem from happening should I need to upgrade more machines. b) If it would be possible to follow my suggestion and simply have the installer display a warning that as of version 2007 personal firewalls might cause severe problems. This would have been enough from my standpoint - even though detecting the remaining files of a Kerio PFW would have been even nicer.

This was their answer (again translated):

Dear F-Secure Customer, thank you very much for your request to our technical support. regarding a) It is not possible to run version 2007 on systems on which Kerio Firewall has not been completely removed. regarding b) Our setup's sidegrade feature does unfortunately not detect all versions of the Kerio Firewall. Otherwise it will insist on uninstalling the Firewall before it starts to install. Should you have further questions or need assistance with the actions above please call under ... We wish you a pleasant day. Regards, ...

As the icing on the cake on November 21st I got a feedback request about how content I was with the recent support issue I had filed to ensure optimum support with technical difficulties. I in fact filled out the form, complaining about the very brief and impersonal answers that did not meet my expectations. I have not heard anything since then.

Is this great? I know someone who will seriously think about renewing his own subscription...

Sunday, November 26, 2006

Java 5 crashes on Linux 2.6, too

Two and a half weeks ago I published a post about problems with random VM crashes using Java 5 on Linux with a 2.4 kernel.

Most of the feedback I got suggested upgrading to a more recent kernel version. Because this represents a major undertaking for our application (several thousand clients deployed with Java 1.4.2 on RH9) we needed to be sure this would work.

Because all the crash reports - see the original post - seemed to point in the GC's direction, I wrote a little test application to stress the garbage collector. It creates a configurable number of threads, each of which just keeps allocating byte[] arrays of random size. In case an OutOfMemoryError occurs, the thread gets replaced with a new one. You can find the code at the bottom of this post.

I started 4 instances of this tool under Ubuntu 6.06, each configured with 20 threads (first parameter) and up to 40MB of memory per thread (second parameter, in bytes).

After 4 days of continuous running - we had just started to feel a little safer - one of the processes crashed, leaving an hs_err file behind, again telling us the current activity was a full garbage collection.

We have now filed a bug report with Sun which is yet to be reviewed by them. They say that it currently takes around 3 weeks before you get an official bug id, however I do not think any fix will be in time for us to use Java 5 for our next application release.

Does anyone know, if there is a way of getting the crash files with debug symbols? Maybe this would allow us to do some more testing on our own. Is there some sort of debug-VM version available for download?

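import java.util.Random;

public class GCStress {
    private static Thread[] threads;

    // java.util.Random is thread-safe, so all allocator threads may share it
    private static Random random = new Random();

    private static int maxBytes;

    private static int numThreads;

    public static void main(String[] args) {
        numThreads = Integer.parseInt(args[0]); // number of allocator threads
        maxBytes = Integer.parseInt(args[1]);   // upper bound per allocation, in bytes
        threads = new Thread[numThreads];
        for (int x = 0; x < numThreads; x++) {
            threads[x] = new Thread(new Allocator(), "Thread_" + x);
            threads[x].start();
        }

        // Watchdog: replace every thread that died, e.g. from an OutOfMemoryError
        while (true) {
            for (int x = 0; x < numThreads; x++) {
                if (!threads[x].isAlive()) {
                    threads[x] = new Thread(new Allocator(), "Thread_r" + x);
                    threads[x].start();
                }
            }
            try {
                Thread.sleep(1000); // assumption: the original pause length was lost
            } catch (InterruptedException e) {
                // ignore
            }
        }
    }

    private static class Allocator implements Runnable {

        public void run() {
            while (true) {
                int tSize = random.nextInt(maxBytes);
                // the first half of this statement was lost; the thread name is a reconstruction
                System.out.println(Thread.currentThread().getName()
                        + " allocating " + tSize / 1024 + "kb");
                byte[] tMemFiller = new byte[tSize];
                try {
                    Thread.sleep(100); // assumption: the original pause length was lost
                } catch (InterruptedException e) {
                    ; // ignore
                }
            }
        }
    }
}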

Tuesday, November 21, 2006

Amazingly simple - Collection Initialization

On Todd Huss' blog I just came across a very simple way of initializing a collection with a set of predefined values. It is so simple that it is amazing people do not use it way more often. For my part, I have seen this use of instance initializers for the first time, although they are nothing sooo special...
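For illustration, here is a minimal sketch of what such an initialization might look like. I am assuming the technique meant is the so-called "double brace" idiom, i.e. an anonymous subclass whose instance initializer block fills the collection; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the instance initializer idiom itself is the point.
class CollectionInitDemo {

    static List<String> colors() {
        // Outer braces: anonymous subclass of ArrayList.
        // Inner braces: instance initializer, run right after construction.
        return new ArrayList<String>() {{
            add("red");
            add("green");
            add("blue");
        }};
    }

    public static void main(String[] args) {
        System.out.println(colors()); // prints [red, green, blue]
    }
}
```

One thing to keep in mind: every such expression creates an additional anonymous class, so it is convenient for small constant collections rather than something to sprinkle everywhere.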

Saturday, November 18, 2006

MySQL/InnoDB slowness with Blobs

Reading about Peter Zaitsev's feature idea about Finding columns which a query needs to access - which I would really like to see implemented - reminded me of a bug report I filed in 2004 and which bit me again only a few days ago. You can find it under Bug #7074 in the MySQL bug tracking tool. Although it is filed as a feature request, I think one should be aware of this, as it may cause problems in your applications (it did in ours).

Basically it is about explicitly specifying which columns you need in a result set, instead of just using SELECT *. This is generally a good idea, however if the table contains BLOB columns, it becomes even more important, as it may affect performance heavily in an unexpected manner.

From the bug report:

MySQL first reads all the selected columns, and only after that checks the WHERE.

This may lead to long running queries, even if you do not use the BLOB column in the WHERE clause and even if there is no data to retrieve based on the query conditions.
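A minimal sketch of how an application could make the explicit column list the default. The helper, the table and the column names are all hypothetical and not taken from our code base; the helper does no escaping, so it is only meant for identifiers that are constants in the program.

```java
// Hypothetical helper that always spells out the columns instead of using SELECT *.
class ExplicitColumns {

    static String select(String table, String whereClause, String... columns) {
        if (columns.length == 0) {
            throw new IllegalArgumentException("name at least one column");
        }
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(columns[i]);
        }
        sql.append(" FROM ").append(table).append(" WHERE ").append(whereClause);
        return sql.toString();
    }

    public static void main(String[] args) {
        // The (hypothetical) BLOB column "content" is simply not requested.
        System.out.println(select("documents", "owner_id = ?", "id", "title"));
        // prints SELECT id, title FROM documents WHERE owner_id = ?
    }
}
```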

For more details see the bug report.

Monday, November 13, 2006

System Lockup: F-Secure AV 2007 and Kerio Firewall

Recently I received a notification about F-Secure Anti-Virus 2007 being available. As an F-Secure customer you are entitled to upgrade from the 2006 version if your subscription is valid. So I downloaded the installation package and performed the upgrade.

After the obligatory reboot things started to fall apart. My computer would not respond for more than about 30 seconds after I had logged in. Opening the Start menu would work, maybe even opening e. g. the Control Panel sub menu. However nothing else would work after this point. Using Ctrl-Alt-Del to get the Task Manager just allowed me to "wipe" the start menu from the screen, no more action would be possible.

What made me suspicious was a little dialog I had to dismiss right after logging in that informed me about my Kerio Personal Firewall not being found by the system-tray GUI. Because conflicting firewalls are known to cause lockup problems like this, I originally bought F-Secure Anti-Virus instead of the whole Internet Security package. Anti-Virus 2006 had been working fine in conjunction with the separate personal firewall.

I rebooted to see if this was some sort of transient problem with the first reboot after the install. This time I did not even get an Explorer to launch and show me my desktop. Apart from the wallpaper and a mouse pointer I could not see anything. Hitting Ctrl-Alt-Del again let me launch the Task Manager. I tried to start explorer.exe from there, to no avail.

I decided to uninstall the personal firewall. I tried to boot into Safe Mode, just to see that it would not come up and instead die with a blue screen. To be fair, I have to say that I had not tried Safe Mode for a loong time, so I do not know if it would have worked before my problems started.

My only way to resolve this was to boot into the Vista RC installation I luckily had not deleted yet and to disable the startup of the firewall service in the XP install. To do so I loaded the windows\system32\config\system registry hive into the Vista regedit and set the startup type (ControlSet00x\Services\servicename\Start) to 4 - which means disabled - in the firewall service node of the active ControlSet001. You can see which control set is the one for "normal" Windows startup by looking at the SYSTEM\Select\Default value.
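For reference, the Start value of a service node is a small number with a fixed meaning. The following sketch just encodes that mapping; the enum and class names are made up for illustration, the numeric values are the ones Windows uses.

```java
// Meaning of the "Start" value under ...\Services\<servicename> in the registry.
enum ServiceStartType {
    BOOT(0), SYSTEM(1), AUTOMATIC(2), MANUAL(3), DISABLED(4);

    private final int registryValue;

    ServiceStartType(int registryValue) {
        this.registryValue = registryValue;
    }

    int registryValue() {
        return registryValue;
    }

    static ServiceStartType fromRegistryValue(int value) {
        for (ServiceStartType type : values()) {
            if (type.registryValue == value) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown Start value: " + value);
    }
}

class ServiceStartDemo {
    public static void main(String[] args) {
        System.out.println(ServiceStartType.fromRegistryValue(4));    // prints DISABLED
        System.out.println(ServiceStartType.DISABLED.registryValue()); // prints 4
    }
}
```

It is worth writing down the original value of each service before changing it, so it can be restored later.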

Upon restart the situation did not change, the same problem as before. Because I was not sure whether just disabling the Kerio service had been enough, I decided to uninstall it. To do so I had to disable F-Secure Anti-Virus, too. So I loaded Vista again and opened the XP registry. Luckily the F-Secure services all have human readable key names, all starting with "F-Secure", so it was very easy to disable them as well.

Back in XP I was for the first time able to do more than wait for the lock-up. I uninstalled Kerio using the Control Panel's "Add/Remove Programs" applet and rebooted, after I had set the F-Secure services back to their original startup settings.

Guess what... It still did not work... I came to the conclusion that there must be some sort of a bug in the 2007 version of F-Secure Anti-Virus. In the meantime my father had called, complaining about the same problem, which at the time seemed to support my theory. At that point, however, I did not yet know that he used the Sunbelt Personal Firewall, too.

After going through the whole boot Vista - load XP registry - disable services - reboot to XP hassle I finally uninstalled Anti-Virus 2007, rebooted and re-installed 2006. At this point I had restored the situation where I had originally left off - minus the Kerio Personal Firewall.

For some reason I did not want to believe that F-Secure would ship such a lousy product. I fired up regedit again and opened the services subtree. There I reviewed every one of them, not knowing what exactly to look for. Finally I found these two entries:


Windows Registry Editor Version 5.00

"DisplayName"="Kerio HIPS Driver"
"TraceFile"="C:\\Programme\\Kerio\\Personal Firewall 4\\logs\\khips.log"


The ImagePath in the second node reads "%systemroot%\system32\drivers\khips.sys" when viewed with regedit. Searching the net for that name reveals that it is the "Kerio Host Intrusion Prevention Service". Obviously this is a remainder of the Kerio Personal Firewall that I thought I had removed.

In the Device Manager one can also see this service when the "View/Show Hidden Devices" option is enabled. It will show up under "Non-PnP-Drivers" (sorry if the option names are a little off, I am trying to guess their names, because I use a German Windows).

As soon as I had removed both of the registry keys above ( contains a reference to fwdrv) and rebooted, I could use F-Secure Anti-Virus 2007 without any problems. I will file this with F-Secure now...

JavaPosse podcast on Java GPL'ing

The guys of the JavaPosse have just released a special issue of their podcast in which they interview Mark Reinhold (chief engineer for Java SE), Rich Sands (community marketing manager for Java SE) and Eric Chu (senior director of the Client Systems Group and head of its Java ME initiatives).

Dick Wall posted a story called "Questions about Open Source Java? This Podcast may have the answers!" which leads to the podcast. Very interesting stuff, especially concerning the famous question "Will my app have to become Open Source, too, if I use Sun's Java?".

Be sure to cast your vote for the item on :)

Saturday, November 11, 2006

Vista Aero and QuickTime

Very cool behaviour of QuickTime on Vista (ok, RC1, but I do not think this will become better):

This is shown when you start the QuickTime control panel applet. Notice that the publisher information is displayed as "Microsoft Windows Publisher", so you have no idea that this was really the QuickTime applet. It could have been any other process in the background, too.

Tuesday, November 07, 2006

Java 5 random VM crashes

We are currently evaluating the consequences of migrating our application from Java 1.4 to Java 5. While initial tests revealed only simple issues (like variables called enum etc.) we are now seeing a much more severe problem: Random VM crashes.

Currently we see this on Linux (Kernel 2.4) only, however even there we cannot reliably reproduce the problem. On a single machine we have seen two crashes in a week. Notably the application was not being used, it was just started and waiting for user input. Some background threads are running in this situation, however they do not do any work, either. They just poll some database tables for external changes, but there were none.

All of a sudden a VM would crash, leaving a hs_err_pid1234.txt behind. This is what they look like (shortened):

# An unexpected error has been detected by HotSpot Virtual Machine:
#  SIGSEGV (0xb) at pc=0x402989b9, pid=8736, tid=1094691632
# Java VM: Java HotSpot(TM) Client VM (1.5.0_09-b03 mixed mode, sharing)
# Problematic frame:
# V  []

---------------  T H R E A D  ---------------

Current thread (0x08099230):  VMThread [id=8737]

siginfo:si_signo=11, si_errno=0, si_code=1, si_addr=0x00000008

EAX=0x00000000, EBX=0x403b7aec, ECX=0x00000008, EDX=0x6861ce38
ESP=0x413fa3d0, EBP=0x413fa3e0, ESI=0x403aa980, EDI=0x403c5ac8
EIP=0x402989b9, CR2=0x00000008, EFLAGS=0x00010206

Top of Stack: (sp=0x413fa3d0)
0x413fa3d0:   0806c558 0806c558 403b7aec 403af5c4
0x413fa3e0:   413fa400 40298ccc 403af5c4 413fa448
0x413fa3f0:   413fa410 40329d7c 413fa448 403b7aec
0x413fa400:   413fa430 40334754 403c5ac8 403af5c4
0x413fa410:   413fa4f0 403262a1 413fa448 413fa454
0x413fa420:   403630e3 403b7aec 00000002 0806c438
0x413fa430:   413fa470 40170ff0 403c5ac8 00000000
0x413fa440:   00000001 00000001 42133220 00000001 

Instructions: (pc=0x402989b9)
0x402989a9:   08 49 89 08 8b 40 08 8b 14 88 8b 42 04 8d 48 08
0x402989b9:   8b 40 08 52 51 ff 50 58 8b 06 83 c4 10 8b 08 85 

Stack: [0x4137a000,0x413fb000),  sp=0x413fa3d0,  free space=512k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
C  []

VM_Operation (0x4a74d258): full generation collection, mode: safepoint, requested by thread 0x0815d230

# An unexpected error has been detected by HotSpot Virtual Machine:
#  SIGSEGV (0xb) at pc=0x40189396, pid=18790, tid=1094691632
# Java VM: Java HotSpot(TM) Client VM (1.5.0_09-b03 mixed mode, sharing)
# Problematic frame:
# V  []

---------------  T H R E A D  ---------------

Current thread (0x08099230):  VMThread [id=18791]

siginfo:si_signo=11, si_errno=0, si_code=1, si_addr=0x59c83398

EAX=0x59c83398, EBX=0x403b7aec, ECX=0x6eb6f518, EDX=0x5bc99a08
ESP=0x413fa0a8, EBP=0x413fa0c0, ESI=0x5bc99a04, EDI=0x6eb6f520
EIP=0x40189396, CR2=0x59c83398, EFLAGS=0x00010202

Top of Stack: (sp=0x413fa0a8)
0x413fa0a8:   6afc194c 5bc99a08 6eb6f524 403b7aec
0x413fa0b8:   403aa980 403c5ac8 413fa0e0 402989c1
0x413fa0c8:   6eb6f288 5bc999f8 0806c558 0806c558
0x413fa0d8:   403b7aec 403af5c4 413fa100 40298ccc
0x413fa0e8:   403af5c4 413fa148 413fa110 40329d7c
0x413fa0f8:   413fa148 403b7aec 413fa130 40334754
0x413fa108:   403c5ac8 403af5c4 413fa1f0 403262a1
0x413fa118:   413fa148 413fa154 403630e3 403b7aec 

Instructions: (pc=0x40189396)
0x40189386:   8d 14 86 89 55 ec 39 d6 73 18 8b 06 85 c0 74 0a
0x40189396:   8b 00 83 e0 03 83 f8 03 75 18 83 c6 04 3b 75 ec 

Stack: [0x4137a000,0x413fb000),  sp=0x413fa0a8,  free space=512k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
V  []
C  []

VM_Operation (0x48f8e4d8): full generation collection, mode: safepoint, requested by thread 0x08653a08

Looking through the Sun bug database I found several reports about similar crashes, however they were all closed as not reproducible. This is our problem, too. Right now the application has been running for 5 days without a problem. Nevertheless this is not too comforting, as we would have several thousand VMs in production use. Should we decide to migrate, even a 0.1% chance of this crash per VM and day would leave us with several problem reports a day, which we cannot accept.
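To make the arithmetic explicit, here is a back-of-the-envelope sketch. The figure of 3000 VMs is only a stand-in for "several thousand", and the 0.1% is read as a per-VM, per-day probability; both are assumptions for illustration, as are the class and method names.

```java
import java.util.Locale;

// Hypothetical estimate: expected number of crashes per day across all VMs.
class CrashEstimate {

    static double expectedCrashesPerDay(int vmCount, double dailyCrashProbability) {
        // Expected value of independent events: count times probability.
        return vmCount * dailyCrashProbability;
    }

    public static void main(String[] args) {
        int vmCount = 3000;                   // assumed stand-in for "several thousand"
        double dailyCrashProbability = 0.001; // 0.1% per VM and day
        double expected = expectedCrashesPerDay(vmCount, dailyCrashProbability);
        System.out.println(String.format(Locale.ROOT, "%.1f", expected)); // prints 3.0
    }
}
```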

Any comments and hints will be greatly appreciated.

Thursday, November 02, 2006

Beryl on Edgy Eft

As announced previously I spent some time getting Beryl to work on my newly upgraded Edgy Eft installation. Although it did not go as smoothly as I would have hoped, it was not too troublesome either.

Dual head still seems to be a major problem in many areas in Linux. This is definitely something the Windows people do not have to worry about nearly as much, but ok, this may partly be related to the hardware vendors not providing some sort of unified and/or open drivers.

Nevertheless it is now working, after some changes to my xorg.conf. Before those I always got an error message from Beryl, complaining about a missing RandR extension.

The effects are really nice, some of them are however too slow for my taste in the default settings. After speeding them up a little (I do not like to wait for a context-menu to wobble into view, if it wobbles for more than a fraction of a second) I really liked it. There are some issues left, but I assume this is because of the ongoing development. E. g. window resizing is a little strange if you grab a window's top edge and move the mouse up and down. One would expect the window to remain in place and gain or lose height from the top, i. e. where you drag. However sometimes windows seem to be resized on the bottom.

Video playback is also choppy, but that seems to depend on the file I play back. Probably due to different codecs, however I have not really looked deeper into it.

From what I have seen so far, I believe there is very much potential in this :)

For those interested, these are the contents of my xorg.conf:

Section "Files"
        FontPath        "/usr/share/X11/fonts/misc"
        FontPath        "/usr/share/X11/fonts/cyrillic"
        FontPath        "/usr/share/X11/fonts/100dpi/:unscaled"
        FontPath        "/usr/share/X11/fonts/75dpi/:unscaled"
        FontPath        "/usr/share/X11/fonts/Type1"
        FontPath        "/usr/share/X11/fonts/100dpi"
        FontPath        "/usr/share/X11/fonts/75dpi"
        FontPath        "/usr/share/fonts/X11/misc"
        # path to defoma fonts
        FontPath        "/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType"
EndSection

Section "Module"
        Load    "i2c"
        Load    "bitmap"
        Load    "ddc"
        Load    "dri"
        Load    "extmod"
        Load    "freetype"
        Load    "glx"
        Load    "int10"
        Load    "type1"
        Load    "vbe"
EndSection

Section "InputDevice"
  Driver       "kbd"
  Identifier   "Keyboard[0]"
  Option       "Protocol" "Standard"
  Option       "XkbLayout" "de"
  Option       "XkbModel" "pc105"
  Option       "XkbRules" "xfree86"
EndSection

Section "InputDevice"
  Driver       "mouse"
  Identifier   "Mouse[1]"
  Option       "Buttons" "6"
  Option       "Device" "/dev/input/mice"
  Option       "Name" "Logitech USB Wheel Mouse"
  Option       "Protocol" "explorerps/2"
  Option       "Vendor" "Sysp"
  Option       "ZAxisMapping" "4 5"
EndSection

Section "Monitor"
  HorizSync    30-121
  Identifier   "Monitor[0]"
  ModelName    "H750"
  VendorName   "Hansol"
  VertRefresh  56-70
EndSection

Section "Screen"
  DefaultDepth 24
  SubSection "Display"
    Depth      15
    Modes      "1280x1024"
  EndSubSection
  SubSection "Display"
    Depth      16
    Modes      "1280x1024"
  EndSubSection
  SubSection "Display"
    Depth      24
    Modes      "1280x1024"
  EndSubSection
  SubSection "Display"
    Depth      8
    Modes      "1280x1024"
  EndSubSection
  Device       "Device[0]"
  Identifier   "Screen[0]"
  Monitor      "Monitor[0]"
EndSection

Section "Device"
  BoardName    "GeForce 7600GS"
  BusID        "1:0:0"
  Driver       "nvidia"
  Identifier   "Device[0]"
  Option       "AddARGBGLXVisuals" "True"
  Option       "RenderAccel" "True"
  Option       "TwinView"
  Option       "SecondMonitorHorizSync" "60-71"
  Option       "MetaModes" "1280x1024,1280x1024"
  Option       "TwinViewOrientation" "RightOf"
  Option       "SecondMonitorVertRefresh" "50-160"
  VendorName   "NVidia"
EndSection

Section "ServerLayout"
  Identifier   "Layout[all]"
  InputDevice  "Keyboard[0]" "CoreKeyboard"
  InputDevice  "Mouse[1]" "CorePointer"
  Option       "Clone" "off"
  Option       "Xinerama" "off"
  Screen       "Screen[0]"
EndSection

Section "DRI"
    Group      "video"
    Mode       0660
EndSection

Section "Extensions"
EndSection

Sunday, October 29, 2006

Upgrade to Edgy Eft

Yesterday I read about the final release of Ubuntu 6.10, Edgy Eft. Because of my good experiences with Dapper Drake I decided to upgrade. Having sort of a Debian history I was quite confident that a dist-upgrade would work quite flawlessly, especially since I had not made any deep modifications of my system (definitely a point for Ubuntu here! :)).


I went to the Ubuntu homepage and read the Upgrade Notes. I had always wondered - however not really bothered to find out either - what this alternate install CD was. Now I know that it is used to save bandwidth in the upgrade process, because it can be used as a local repository source. I do however wonder, why this cannot be done with the regular install media. But anyways...


I ran the CD based upgrade as described in the upgrade notes. A graphical tool came up and asked me whether I wanted to download updates of more recent packages from the net. I said yes, suspecting that there would not be that many of them, as the whole distribution had just come out. However, after analyzing my system for a while, it told me that about 250MB of newer packages would have to be downloaded. I decided to abort here, because I had a bad feeling about being one of several thousand people hitting the repository server.

I re-ran the tool and this time said "no" when it asked me if I had a cheap/fast internet connection. Still the updater claimed it needed about 250MB of data, however I suspected this was just a badly formulated message that appeared no matter what. So I let it go from here (acknowledging the "point of no return" warning) this time.

A progress bar showed up and a label claiming 1117 packages needed to be fetched. The first 900 or so went very fast, seemingly from the mounted CD image. However then things started to become ugly. Looking at the process list revealed that apt was happily downloading packages from the net with an astounding speed of between 7 and 15kb(!)/s... Nothing fancy of course... Just OpenOffice, some GTK libraries, several dictionaries, the GIMP help files in German and English and so forth. All in all the upgrade from Dapper to Edgy took me around 9 hours, 8.5 of which were just used for downloads I had tried to avoid in the first place.

Afterwards it occurred to me that this alternate install CD did not contain everything I had installed, so I guess it just did not have any chance but to download those packages, however in that case I would have liked a DVD to download via BitTorrent first.

Manual cleanup

Once the upgrade tool was done, it rebooted the machine. When the Grub menu came up I had to manually choose the new kernel, because of my manually modified choice of configurations. This was ok. However for some reason I went through a text-mode boot process where I would have expected a nice usplash screen. Looking at the log later revealed that it complained about not having a configuration for 640x480. I have two DFPs, both 1280x1024, so I do not know why it would have tried the lower resolution. I added vga=794 to the kernel line in the menu.lst to resolve this. Once it worked I could see some nice artwork :)

Next I logged in and found myself unable to start a simple gnome-terminal. Choosing it from the Applications menu made a "Starting terminal..." appear in the task bar for a few seconds, but no terminal opened. Running xterm worked however. Googling the web I almost immediately found this entry in the Ubuntu Forums. Obviously it has something to do with the X11 configuration. Strangely enough this had always worked with Dapper.

What worked however (and without any further ado) was video playback, even with correct colors. I did not have the time to dig deeper into my wrong video colors problem, and as it seems this will not be necessary any more :)

Another thing I found to be not working was Azureus. It came up with its splash screen and almost immediately terminated again. Starting it from a terminal brought this up:

ds@yavin:~$ azureus 
changeLocale: *Default Language* != English (United States). Searching without country..
changeLocale: Searching for language English in *any* country..
changeLocale: no message properties for Locale 'English (United States)' (en_US), using 'English (default)'
# An unexpected error has been detected by HotSpot Virtual Machine:
#  SIGSEGV (0xb) at pc=0xb0527d02, pid=8613, tid=3085334192
# Java VM: Java HotSpot(TM) Client VM (1.5.0_08-b03 mixed mode, sharing)
# Problematic frame:
# C  []
# An error report file with more information is saved as hs_err_pid8613.log
# If you would like to submit a bug report, please visit:
Aborted (core dumped)

I found some people on the net with what looks like the same problem (see here and here); they at least managed to get a stack trace. I decided to try the "official" Azureus version from Sourceforge: I backed up /usr/share/java/Azureus2.jar and replaced it with Azureus2.5.0.0.jar. This solved the problem for me, but it is something I would not have expected from a final release. This is not some obscure feature not working, but the whole app not coming up...

Next steps

Next thing I'll try is setting up a 3D desktop environment. I will probably go with the description in this forum entry. I will keep updating as I go...

Friday, October 27, 2006

Flash 9 beta in Ubuntu Dapper

Maybe I shouldn't do vacations anymore... This time I got a nasty flu two days after my return. Well, slowly I am feeling better and thought I might just tell you that the installation of the Flash Player 9 beta for Linux worked like a charm on my Ubuntu Dapper Drake (6.06) machine.

I just downloaded the archive from Adobe Labs, uninstalled the previous version using apt-get and put the new file into my private plugin directory. Apart from the

ds@yavin: ~$ sudo apt-get remove flashplugin-nonfree

everything is described in the readme.txt file that comes included in the archive.

It is really great to have synchronous audio/video playback for the first time under Linux. About time, but hey, thanks anyway :)

Saturday, October 14, 2006

MySQL 5.0: DECIMALs queried with Strings

We are currently preparing a MySQL 4.1 to MySQL 5.0 migration. First tests showed a very nasty problem, however.

One of our test cases incorporates queries against DECIMAL columns that use strings as the queried values. In MySQL 4.1 this works flawlessly; in 5.0 it does not. The reason is that, in contrast to 4.1, the newer server version does a (in my opinion very stupid) conversion from String to double, which in many cases cannot store the precise value.
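The underlying effect is easy to demonstrate in Java. This is only a sketch of the general binary floating point problem, not of what the MySQL server does internally; the class and method names are made up for illustration.

```java
import java.math.BigDecimal;

// Hypothetical demo: does a decimal literal survive a detour through double?
class DecimalVsDouble {

    static boolean survivesDoubleRoundTrip(String literal) {
        BigDecimal exact = new BigDecimal(literal);
        // new BigDecimal(double) exposes the exact binary value the double holds
        BigDecimal viaDouble = new BigDecimal(Double.parseDouble(literal));
        return exact.compareTo(viaDouble) == 0;
    }

    public static void main(String[] args) {
        System.out.println(survivesDoubleRoundTrip("0.5")); // prints true
        System.out.println(survivesDoubleRoundTrip("0.1")); // prints false
        // Shows the digits that are actually stored for 0.1
        System.out.println(new BigDecimal(Double.parseDouble("0.1")));
    }
}
```

Values like 0.5 have an exact binary representation, values like 0.1 do not, which is why a comparison against an exact DECIMAL can suddenly fail to match.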

This may lead to very subtle bugs, especially when using an optimistic locking approach as we do. We only noticed the problem, because we got a ConcurrentModificationException, as an update query that contained a string-ized BigDecimal did not match any rows.

See MySQL bug reports 23260 and 22290 for more details.

Right now this leaves us with few options but to not migrate to 5.0, as our application has several hundreds of thousands of lines where most of the database access is handled by an OR mapping layer, but there are also numerous cases of manually crafted SQL which would be hard to identify and analyse individually.

What I absolutely do not get is that with the introduction of precision math they also begin to use floats and doubles, whereas most people request the new math because it should make monetary calculations more reliable. Instead all of a sudden existing applications are likely to exhibit all sorts of weird problems, from calculation errors to completely different behaviour (see above). Answering complaints about this with a simple "this is documented behaviour" is a bad excuse if you ask me.

I do really like MySQL, it is a great product and it has served me well for years. I have always appreciated the involvement of the community very much, however cases like this may be what (at least partially at this time) makes the difference between the "really big players" and MySQL.

Monday, October 02, 2006

MySQL replication timeout trap

Today I spent several hours trying to find a problem in our application until I found out there was a problem on the MySQL side. In our setup we have several slave machines that replicate data from a master server. The slaves are configured with a series of replicate-do-table directives in the my.cnf file so that only parts of the schema get replicated. The remaining tables are modified locally, so to avoid conflicts they are not updated with data coming from the master.

We do however have the need to replicate data from the master for some special-case tables. To solve this we usually have a column that indicates whether a record was created on a master or a slave machine and use an appropriate WHERE clause in all updates. This avoids collisions in concurrent INSERTs on the master and the slave. The application knows which of the two records to use.

Due to historical reasons I do not want to elaborate on, we did not use such a distinction column for one table (let's call it T1). Instead we created a new table T2 that stores data generated on the slave. T2 is only written to when the slave is separated from the network. As soon as it gets reconnected, the data from T2 is sent to an application server which merges it with the master's T1 table.

This usually ensures that T1 is up to date on the slave, too (with some seconds lag, of course), through replication.

However a customer noticed that records that were inserted into T2 and later sent to the application server did not show up in the slave's T1 table, even after several minutes, leaving the application with very out-of-date information.

Assuming some sort of replication error I connected to the slave and issued a SHOW SLAVE STATUS command (I dropped uninteresting rows of the output):

*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: MASTER001
            Master_Log_File: MASTER001.009268
        Read_Master_Log_Pos: 68660485
             Relay_Log_File: SLAVE005-relay-bin.000071
              Relay_Log_Pos: 27920448
      Relay_Master_Log_File: MASTER001.009268
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
         Replicate_Do_Table: ...
Replicate_Wild_Ignore_Table: ...
                 Last_Errno: 0
               Skip_Counter: 0
        Exec_Master_Log_Pos: 68660485
            Relay_Log_Space: 162138564
      Seconds_Behind_Master: 0
1 row in set (0.00 sec)

As you can see, both the Slave_IO thread and the Slave_SQL thread are running. So no replication problem here, everything is fine. Or so it seems.

Suspecting a problem with our application code I went through it, line by line, because at certain points it issues STOP SLAVE and START SLAVE commands. However I could not find anything I would not have expected.

However, what I found when re-running SHOW SLAVE STATUS several times in the course of a few minutes made me curious: it did not show any changes to the position in the master or relay log files. This effectively means that no updates are replicated from the master, even though everything appears to be alright.
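The manual check described above (compare two snapshots of SHOW SLAVE STATUS and see whether any position moved) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the sample text is an abbreviated copy of the output shown earlier, and "no movement" can of course also mean that the master is simply idle.

```python
SAMPLE = """\
             Slave_IO_State: Waiting for master to send event
            Master_Log_File: MASTER001.009268
        Read_Master_Log_Pos: 68660485
             Relay_Log_File: SLAVE005-relay-bin.000071
              Relay_Log_Pos: 27920448
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
        Exec_Master_Log_Pos: 68660485
"""

# Fields that should change while statements are being replicated.
POSITION_FIELDS = (
    "Master_Log_File", "Read_Master_Log_Pos",
    "Relay_Log_File", "Relay_Log_Pos", "Exec_Master_Log_Pos",
)


def parse_slave_status(text):
    """Turn the vertical 'name: value' output into a dict of strings."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip separator and summary lines without a colon
            status[key.strip()] = value.strip()
    return status


def looks_stalled(earlier, later):
    """True if both threads claim to run but no log position has moved."""
    running = (later.get("Slave_IO_Running") == "Yes"
               and later.get("Slave_SQL_Running") == "Yes")
    unchanged = all(earlier.get(f) == later.get(f) for f in POSITION_FIELDS)
    return running and unchanged


if __name__ == "__main__":
    first = parse_slave_status(SAMPLE)
    second = parse_slave_status(SAMPLE)  # taken a few minutes later
    print("suspicious:", looks_stalled(first, second))
```

On a busy master, a result of True over several minutes is a strong hint that the slave is not receiving anything.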

Even then I did not have the right idea but suspected some sort of bug in the old 4.1.12 release of MySQL we are using. But upgrading the slave to the latest 4.1.21 release did not solve the problem either. It could still be easily reproduced by unplugging the cable, creating some data in T2, reconnecting and waiting for the data to be inserted into T1 on the master but not the slave.

I only got it when I saw that a simple STOP SLAVE; START SLAVE fixed the problem. (I know, I should have had this idea earlier...)

The reason for this strange behaviour is the default setting for the slave-net-timeout variable (3600s). Once a slave has connected to its master it waits for data to arrive (Slave_IO_State: Waiting for master to send event). If it does not receive any data within slave-net-timeout seconds it considers the connection broken and tries to reconnect. To prevent frequent reconnects when there is little activity on the master this setting defaults to one hour.

However it does not reliably notice a disconnect that occurs less than one hour after the last replicated statement. That is exactly what happened: Unplugging the network cable broke the master-slave connection, however the slave did not notice and therefore still displayed a "no problems" status. Had I waited for an hour I would have seen the data arrive alright...

So to get quicker updates I just had to decrease the timeout value in the slave's config file:

[mysqld]
# Consider the master connection dead after 30s without data, then retry every 30s
slave-net-timeout = 30
master-connect-retry = 30

Because in our case the activity on the master side is quite heavy, 30s without any updates is plenty and most certainly an indication of a dropped network connection. With these settings the slaves notice a broken connection within half a minute and immediately try to reconnect. If the network is still down they will keep trying in 30 second intervals (master-connect-retry).
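To get a feeling for what the two values mean in practice, here is a toy timeline model in Python. It rests on simplifying assumptions of mine: the old connection is silently dead, the slave only notices once slave-net-timeout has elapsed since the last received event, and it then retries at fixed master-connect-retry intervals. The function name is hypothetical, and the 60 second retry used for the "default" case is the documented default of master-connect-retry.

```python
def first_successful_reconnect(last_event, network_back, net_timeout, retry):
    """Time (in seconds) of the first reconnect attempt that can succeed.

    last_event   -- when the slave last received data from the master
    network_back -- when the network connection is restored
    net_timeout  -- slave-net-timeout in seconds
    retry        -- master-connect-retry in seconds
    """
    attempt = last_event + net_timeout  # slave gives up on the old connection
    while attempt < network_back:       # network still down: try again later
        attempt += retry
    return attempt


if __name__ == "__main__":
    # Cable unplugged right after the last event, plugged back in after 5 minutes.
    print("defaults (3600/60):", first_successful_reconnect(0, 300, 3600, 60))
    print("tuned    (30/30):  ", first_successful_reconnect(0, 300, 30, 30))
```

In this model the slave with default settings stays out of date for the better part of an hour, while the tuned one catches up almost as soon as the network returns.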

So lesson learned: do not trust the output of SHOW SLAVE STATUS unconditionally!

Tuesday, September 26, 2006

Windows: TCP Port conflicts above 1024

We have repeatedly run into problems with our JBoss application servers not being able to start after a Windows system reboot, because their configured network ports (e. g. 1099) had already been claimed by some other process.

It took quite a while to find the reason, because often a start attempt would suddenly succeed after just trying again several times, without stopping or starting any other programs.

What turned out to be the reason is the dynamic port allocation for ports above 1024 (so called ephemeral ports). If any process requests a random port, it may get one of those you would like to use for your own applications.

On Windows 2000/2003 Server installations as well as on Windows XP Pro you can reserve port ranges (even if they only cover a single port) for your applications. Effectively they are not reserved for anything specific, but just excluded from the dynamic allocation. To do so, create or edit the ReservedPorts registry value (type REG_MULTI_SZ/Multi-String Value; see the Knowledge Base article mentioned below for its exact location):


In this value specify port ranges in the format xxxx-yyyy, one range per line, with xxxx and yyyy being the lowest and highest port of the range to be reserved. To reserve a single port, just use the same value for both (e. g. 1099-1099).
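If you maintain such a list for several servers, a small check can help to catch typos before they end up in the registry. The following Python sketch only validates the xxxx-yyyy format described above; it does not touch the registry, and the function names are my own invention.

```python
def parse_reserved_ports(entries):
    """Parse strings like '1099-1099' into (low, high) tuples.

    'entries' is the list of lines you intend to put into the multi-string value.
    """
    ranges = []
    for entry in entries:
        low_text, sep, high_text = entry.strip().partition("-")
        if not sep:
            raise ValueError("expected xxxx-yyyy, got %r" % entry)
        low, high = int(low_text), int(high_text)
        if not 0 < low <= high <= 65535:
            raise ValueError("invalid port range %r" % entry)
        ranges.append((low, high))
    return ranges


def is_reserved(port, ranges):
    """True if the port is covered by one of the parsed ranges."""
    return any(low <= port <= high for low, high in ranges)


if __name__ == "__main__":
    ranges = parse_reserved_ports(["1099-1099", "8080-8083"])
    for port in (1099, 1100, 8082):
        print(port, "reserved" if is_reserved(port, ranges) else "not reserved")
```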

What I find interesting is the surprisingly high frequency with which this problem occurs. Even though I do not know for sure, I suspect Windows just starts random allocations at 1024 and counts upwards. That would at least to some degree explain why the problem occurred so often for 1099 but not for other, higher ports.

Responsible for assigning those random ports is the "RPC endpoint mapper", itself reachable via TCP port 135. There are several components that make use of this service, many of them included in Microsoft products. Most of these can be individually configured to request specific ranges or single ports to be used, but if you need specific ports to be reserved for your application under all circumstances, using the parameter above seems more sensible.

For further information see the original Microsoft Knowledgebase item #812873, which describes the ReservedPorts setting and several others. It is reachable from #832017, which explains in more detail which products can be individually configured.

Sunday, September 24, 2006

Video colors wrong on Ubuntu

Today I was surprised to see a video in Totem with very strange colors. It almost looks as if one color component is missing. I was already looking for problems with Totem on the net, when I noticed that on the second screen of my dual head setup everything was fine. Moving the window half-way between the two screens gives me one half with the right and one half with wrong colors.

It seems to affect different players (Xine and Mplayer, too) and all sorts of video material (tried DVD, mpg and xvid).

Though I do not know for sure, I suspect the latest upgrade package that also upgraded the nvidia driver. If I do not find another solution I will try downgrading to the previous version again.

Friday, September 22, 2006

MySQL Index Analyzer: 0.04 released

Over on the MySQL Index Analyzer site I have just released version 0.04 as a downloadable package.

This is the first GUI version that allows real analysis and has more features than the command line version.

A quick overview of what is new:

  • Swing GUI
  • Analysis features as on the command line
  • Copying of generated ALTER TABLE statements to the clipboard
  • Information on data and index size distribution
  • Rudimentary analysis of possible disk space savings

So go have a look :)

Thursday, September 21, 2006

The Register: Interesting Article on software standards

"The Register" has an interesting article called Dumb customers and dumber software standards that I recommend reading.

It is about interoperability between companies and their software products, and argues that much innovation could be gained on all sides if they aimed to work more closely together. A core passage: "Customers should not be deterred from participating in standards groups for fear of giving away business plans or valuable competitive information, Snow added. [...] 'We have stuff we do in the financial services business - I don't speak about that stuff at places like this or in standards organizations because there's a potential loss of competitive advantage. On the other hand, we have a lot of other stuff that isn't core to our business that we'd be better served to commodify [sic],' he said."

I definitely recommend having a look.

Tuesday, September 19, 2006

MySQL Index Analyzer: First GUI version

Over on my MySQL Index Analyzer blog I just posted two screenshots of the first working GUI drafts. If you are interested, you will have to check this version out from the SVN repository, I will prepare a new distribution package soon.

Monday, September 18, 2006

Ubuntu kernel update broke X11

Yesterday I installed an Ubuntu (Dapper Drake) kernel upgrade through the system upgrade mechanism. The new version is 2.6.15-27-686. Today, when I tried to boot, I just got a console logon; X11 refused to start because of problems with the configuration.

The following log was output to /var/log/Xorg.0.log, but from what I could see, the messages were contradictory. First the NVIDIA GPU is found and several messages from the module are displayed, but at the end of the log it says Failed to load the NVIDIA kernel module!.

(II) NVIDIA X Driver  1.0-8762  Mon May 15 13:09:21 PDT 2006
(II) NVIDIA Unified Driver for all Supported NVIDIA GPUs
(II) Primary Device is: PCI 01:00:0
(--) Chipset NVIDIA GPU found
(--) Chipset NVIDIA GPU found
(II) Setting vga for screen 0.
(II) Setting vga for screen 1.
(**) NVIDIA(0): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(0): RGB weight 888
(==) NVIDIA(0): Default visual is TrueColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(0): Option "NoLogo" "0"
(**) NVIDIA(0): Option "AllowGLXWithComposite" "true"
(**) NVIDIA(0): Option "UseDisplayDevice" "CRT"
(**) NVIDIA(0): Option "UseEdidDpi" "FALSE"
(**) NVIDIA(0): Enabling RENDER acceleration
(II) NVIDIA(0): Support for GLX with the Damage and Composite X extensions is
(II) NVIDIA(0):     enabled.
(EE) NVIDIA(0): Failed to load the NVIDIA kernel module!
(EE) NVIDIA(0):  *** Aborting ***
(**) NVIDIA(1): Depth 24, (--) framebuffer bpp 32
(==) NVIDIA(1): RGB weight 888
(==) NVIDIA(1): Default visual is TrueColor
(==) NVIDIA(1): Using gamma correction (1.0, 1.0, 1.0)
(**) NVIDIA(1): Option "NoLogo" "0"
(**) NVIDIA(1): Option "AllowGLXWithComposite" "true"
(**) NVIDIA(1): Option "UseDisplayDevice" "DFP"
(**) NVIDIA(1): Option "UseEdidDpi" "FALSE"
(**) NVIDIA(1): Enabling RENDER acceleration
(EE) NVIDIA(1): Failed to load the NVIDIA kernel module!
(EE) NVIDIA(1):  *** Aborting ***
(II) UnloadModule: "nvidia"
(II) UnloadModule: "ramdac"
(II) UnloadModule: "fb"
(II) UnloadModule: "nvidia"
(II) UnloadModule: "ramdac"
(II) UnloadModule: "fb"
(EE) Screen(s) found, but none have a usable configuration.

Fatal server error:
no screens found

It took me some time to notice that the automatic upgrade had only downloaded the new kernel package, but left out the linux-restricted-modules package. After manually installing a version matching the new kernel, I got X11 back immediately:

$ sudo apt-get install linux-restricted-modules-2.6.15-27-686

I do not know if this is "normal" behaviour, but I would definitely like to not have to repair my system after a seemingly flawless upgrade....

Friday, September 15, 2006

Prevent Windows automatic updates reboot

Another one of those reminder posts: How to prevent the reboot triggered by the Windows automatic update feature.

  • Disable automatic updates completely. Not recommended.

  • Change the policy. Recommended.

    1. Start - Run - gpedit.msc

    2. Local Group Policy - Computer Configuration - Administrative Templates - Windows Components - Windows Update

      If you do not see the last element, use the context menu on Administrative Templates to add a new template. Pick the wuau.adm template and close the dialogs. Windows Update should now show up.

    3. Activate "No auto-restart for scheduled Automatic Updates installations" and/or use a different warning interval.

On XP Home go to or create the following registry key/value HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\WindowsUpdate\AU\NoAutoRebootWithLoggedOnUsers (Dword) and set it to either 0 (allow reboots) or 1 (disallow reboots).

Friday, September 08, 2006

Log Buffer #9: a Carnival of the Vanities for DBAs

Like Craig Mullins last week, I wrote a short post about the Log Buffer recently when I found it in my blog's referrer list, and was promptly asked if I would like to compile one myself.

So here I am, welcoming all of you to the 9th issue of the Log Buffer, a Carnival of the Vanities for the DBA community. Once again you will find a plethora of links to all sorts of information on the one thing that keeps many of us both fed and sometimes close to blank despair.

Myself being some sort of a mixture between a software developer and database admin I have had a fair amount of time over the last years to get experience especially with MySQL. I did some work with the MS SQL server, too, so I am happy to have at least one item to cover for this faction:

Joseph Sack made sort of a mental note about SQL Server 2005 Instant File Initialization on his SQL Server blog. If you have been unnerved by waiting for your SQL Server to physically fill up its data files before allowing you to use them, go have a look. This can save a lot of time and may be especially interesting when expanding an existing installation.

While this is sadly the only item I have for the Microsoft fans, the MySQL people seem to be a lot chattier.

Peter Zaitsev is one of them. Formerly a MySQL AB employee he started his own consulting firm and was recently joined by his former - and new - colleague Vadim Tkachenko. During their time at MySQL AB they were already very active on their MySQL Performance Blog, but leaving the company does not seem to have affected their ambitions in any way.

There were three very interesting articles published during the last week alone, probably the most noteworthy, even if not completely new or even limited to databases, being Peter's warning that Even Minor Upgrades Are Not Always Safe. I am sure many of us (*raises hand*) have been through the trouble of upgrading to a bug-fix release of (insert favorite software package here) and only after having transferred it into production been confronted with its nasty little side effects. So from personal experience I can only support Peter's view on the matter.

Roland Bouman of MySQL AB published a detailed article on Refactoring MySQL Cursors, both on the O'Reilly Databases blog as well as his own blogspot blog. While the post itself centers on MySQL's new (since version 5) support of cursors, it surely holds valuable general information that can be applied to any other system as well. I have to admit that when reading it I felt reminded of several situations where I found myself or other people doing exactly what Roland describes. So if you have ever written a cursor, maybe you should have a look at this and afterwards decide again whether your choice was wise.

Jay Pipes has two interesting items, even more so as they point in somewhat different directions. The first one is his mentioning of an Interview with Rohit Nadhani from Webyog concerning the recent open-sourcing of their popular SQLyog MySQL GUI frontend which is now available in both a community and professional edition.

If you have some resources to spare and are still looking for something useful to do with them, be sure to have a look at Jay's second article entitled Build Farm Scripts Available on MySQL Forge which is an announcement of the progress of the very interesting MySQL Build Farm initiative. They aim to provide a unified way of building MySQL on a variety of platforms and concentrate the results in order to improve quality.

Another post from Peter Zaitsev is about the Slow Query Log Analysis Tools. These include a patch to the MySQL server which increases the slow-query log threshold resolution from seconds to milliseconds. While most people will probably not want to use a patched server in production it might still be worth a look for benchmarking and performance tuning during development.

Also on the MySQL Performance Blog, Vadim Tkachenko provides a helpful hint towards the useful GROUP_CONCAT extension to GROUP BY. It can help prevent looping on the application side by combining values of a group into a single field value. Example snippets for PHP are provided, but can easily be adapted to suit other needs as well.

While we are at it, Mark Rittman is not only writing about GROUP BY...ROLLUP, but also about his plans for Comparing Oracle 10g Aggregation Techniques. He is going to present his findings about initial aggregation speed, ease of use and speed as well as space requirements at the UK Oracle User Group Conference in Birmingham this November. Since Mark says he is going to keep us informed about his progress, a regular look at his blog is definitely advisable.

This week Pete Finnigan gives a link to a Nice idea on audition using trace events in his Oracle security forums. The basic thought is to use Oracle error messages and attach events to them to improve Oracle security auditing, possibly finding some intruder poking around.

Although written using some Oracle wording, Dominic Giles' post On the subject of I/O is really an excellent and mostly database-agnostic article. It covers the common problem of non-technical folks (and even some of the technical ones, too) underestimating the extreme importance of a well-balanced system design regarding the I/O subsystem, memory and CPU power.

Written by another Oracle expert - Chris Foot - but also interesting for all DBAs out there, regardless of the products they use, is The Non-Technical Art of Being a Successful DBA - Naming Conventions. It contains recommendations and best-practice advice to help prevent what Chris calls Moonlight Script Hunting, and outlines how to keep even large installations with lots of different databases running in a well-structured way.

Staying with Oracle for one more item: Eddi Awad has published some examples for the Cool Undocumented SQL Function: REVERSE which seems to have been available since Oracle 8i, if not officially documented. While its practical use may not be immediately evident, reading this article might get you some fresh ideas.

Two interesting posts come from Xapbr. The first started with him looking for some recommendations in a MySQL IRC channel and ended up as A review of the Glom graphical database frontend for PostgreSQL, somewhat similar to Microsoft Access or Filemaker Pro, but with a "real" database behind it.

Xapbr's second post is a notice that Version 0.1.149 of innotop has been released. While previous versions of this great tool for keeping an eye on the low level status of MySQL's InnoDB storage engine had some stability problems, those have been addressed now. I have already tried it and so far I haven't encountered any of the problems I used to have.

More on PostgreSQL can be found at Greg's Postgres stuff, which provides some tools that might come in handy at times for Finding Multi-Column Keys using the INFORMATION_SCHEMA. Apart from being outright useful they might also be seen as a starting point for more things to be done with database metadata.

Re-inventing Greg's method to prevent re-inventing nicely shows PostgreSQL's ability to use different sorts of languages to implement custom functionality. Robert Treat demonstrates how to implement an email validity check in pl/php using the Pear::Validate module, mirroring the same feature implemented earlier by Greg in pl/perl with EMail::Valid.

To finish this week's edition I'd like to point you to Sean McCown's What's a Tiny Mistake? on the Database Underground blog. While I primarily find it to be a nice obituary for "The Crocodile Hunter" Steve Irwin, Sean also draws a parallel to the IT business by warning software developers and database folks that small changes can have disastrous effects.

Tuesday, August 29, 2006

MySQL Index Analyzer 0.02 package ready for download

I just uploaded the first ready-to-use package of the MySQL Index Analyzer (version 0.02). You can download it via its download site.

I recommend going to the MIA homepage to see what's new and how to use it.

Monday, August 28, 2006

Old (Bad) habits die hard

Recently I was painfully reminded of the fact that habits, once acquired, are hardly ever shed.

I usually consider myself someone who tries to write software after I have thought it through. I do not mean "over-engineering", "over-abstracting" and "over-prepare-for-anything-that-might-ever-come'ing". However I also believe that starting hacking blindly is not a good thing either. And I try to write "nice" code, even though it might be a little more work, as long as it is easier to read or just plain more stable (which is often the same).

Sometimes however, especially under a tight schedule, by force of habit I (and probably any developer out there) tend to do things that upon later review make me feel deeply embarrassed. Just so I did a couple of weeks ago...

I had to write a component that translates data from a legacy system, stored in plain text files, into a relational database, accessed through an object relational mapping layer. Each entry in the SQL database is generated from a single text file. The information used as the primary key is encoded into the filenames.

For some reason I cannot understand anymore today, I decided to keep track of the files I had already processed by appending their names to a new plain text protocol file. At the time I probably thought it was quicker to implement than to come up with a new kind of protocol value object to store it in the database for each entry. However, in retrospect I doubt that this approach was really faster to implement. (Even worse, this code is deployed into a J2EE application server where you are not even really allowed to do file i/o if you follow the EJB spec.)

Anyway, this translation component includes a polling mechanism that regularly inspects a directory for new, unprocessed files. While nothing looked wrong during my tests, after some time in production the processing (better: the polling) became continuously slower, as the list of files already processed grew longer and longer. So the time to find out whether a file was new or had already been processed gradually increased. To make matters worse, the number of file names available is limited by the design of the legacy program, so they start to roll over after a while, making the poller believe that a file had already been processed and could be skipped, while in fact there was new data in it. To make this work again I would have had to include the file modification timestamp in the list of already processed items and compare it on every poll cycle as well.

So in the end I had to refactor the whole thing and roll out an update of our application that stores the necessary meta-information in the database server to get an acceptable throughput again. To avoid the file modification date check we additionally implemented an archive mechanism for the legacy files to keep the number of items in a single folder limited and the names unique.
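Just to illustrate the two problems (the slow lookup and the re-used file names), here is a minimal in-memory sketch in Python. It is not the code we rolled out, which keeps this information in the database; the class and method names are invented for this example.

```python
class ProcessedFiles:
    """Remembers processed files by name together with their modification time."""

    def __init__(self):
        # file name -> modification time at the moment it was processed
        self._seen = {}

    def needs_processing(self, name, mtime):
        """A file is new if its name is unknown or its timestamp has changed."""
        return self._seen.get(name) != mtime

    def mark_processed(self, name, mtime):
        self._seen[name] = mtime


if __name__ == "__main__":
    tracker = ProcessedFiles()
    tracker.mark_processed("DATA0001.TXT", 1156000000)

    # Same file seen again on the next poll cycle: skip it.
    print(tracker.needs_processing("DATA0001.TXT", 1156000000))
    # The legacy system re-used the name for new data: process it.
    print(tracker.needs_processing("DATA0001.TXT", 1156999999))
```

The dictionary lookup takes roughly the same time no matter how many files have been processed, whereas scanning a growing text file gets slower with every entry.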

However all of this shows that people tend to fall back into old habits, once they get under pressure. It can be really hard to notice this in time and make yourself do it "right" from the start.

Here is my private list of things I notice people (including myself, of course) doing time and again despite knowing better:

  • Use files instead of other, more suitable forms of storage
  • Hard code strings ("I will come back to I18N-ize this later...")
  • Include debugging output via System.out.println instead of through the logging mechanism
  • Add checks for null's in places where you should not get any null in the first place instead of trying to find out the real cause.
  • Change the code, but not the documentation - it is always fun to look for a bug or unexpected behaviour that is caused, because a method does not do what you would expect from its comment
  • Change the behaviour, but not the names (of methods, fields, variables...). This should never happen, given the excellent renaming support modern IDEs offer.

Feel free to add to this list anything that comes to your mind :)

Saturday, August 26, 2006

MySQL Index Analyzer (MIA): Refactoring, Part 1

Development of a more structured version is in progress. I just committed a set of changes to the repository that changes the current, rather monolithic, single-class design to what will be the first step of a more modular one.

Currently the generation of ALTER TABLE statements has been removed and the output format of the analysis is slightly different. But that is mere cosmetic.

What is much more important is that now the gathering of database schema information is based on a pluggable system, designed around an interface called IndexDescriptorProvider. Up to now I have just ported the MySQL 4 stuff that was already in the first version to this new architecture. Please feel free to have a look at it and tell me what you think.

Next thing I'll do is implement a provider based on the INFORMATION_SCHEMA database available in MySQL 5.x to see if I missed anything.

Go to for details.

Friday, August 25, 2006

MySQL Index Analyzer Basic Documentation

I posted some Basic Documentation for MySQL Index Analyzer including a simple example. This is intended to help you get started with the tool without having to read the code.

Any ideas and suggestions for improvements are appreciated :)

Tuesday, August 22, 2006

Windows compressed folders... again

I wrote about the "buggyness" of the Windows Compressed Folders feature before. However today I found out an even more unbelievable bug.

I copied a ZIP file containing several thousand files onto a Windows 2003 Server machine (with SP1). Once it was there I connected via RDP and right-clicked the file in the Explorer, choosing "Extract All" from the context menu.

After clicking "Next" to get past the "Welcome to the Decompression Idio^H^H^H^HWizard" I chose the target directory and hit "Next" again. Immediately I was asked whether I wanted to overwrite a file already existing in the target location. As I had created a new folder to hold the archive contents I was somewhat surprised by this dialog; so I cancelled the operation and tried again.

Now you won't believe it: Once the decompression starts upon clicking the second "Next" button, this very same button remains enabled and functioning! Obviously I had (by accident) clicked twice on the button in my first attempt. Turns out you can click it again and again, as long as the decompression is not finished and it will ask you if you want to overwrite the files it just put there itself....

Now is this embarrassing for MS or what?

Oh, and while we are at it: Be sure to have a good laugh while reading "Dear Sir Bill Gates: invoice enclosed" and the follow-up "Stupid operating systems or stupid operators?".

Monday, August 21, 2006

MySQL Index Analysis Tool

Back in January I posted a simple MySQL duplicate index finder tool. Because I read requests for such a tool on the MySQL Performance Blog I decided to open a new project on Google's code hosting service as well as a new blog to track it.

So if you are interested and maybe even want to contribute to it, go have a look.

Saturday, August 19, 2006


Looking at the list of referrers for my 1st blog, I found one coming from Mike Kruckenberg. First of all it is a nice post to read for anyone who (has to) work with databases, so be sure you take a look.

Moreover I learned about LogBuffer which I did not know before. So if anyone else dealing with databases does not know it yet, maybe you will find it as interesting as I did.

Monday, August 14, 2006

Windows XP image preview broken

Today a friend called, asking if I had any idea why his Windows XP machine did not display any picture miniature in the "My Pictures" folder anymore. He had already tried to reset file type associations and some other experiments, but without any luck. Moreover using JPG files as desktop background did not work anymore.

After some thinking he remembered that the problem might first have occurred after installing the first WMF hotfix published by Ilfak Guilfanov even before Microsoft provided a patch.

This led us to the solution. First we made sure that the patch was uninstalled by issuing the following command:

msiexec.exe /X{E1CDC5B0-7AFB-11DA-8CD6-0800200C9A66}

If it tells you that the program is not currently installed, you should be able to remove it from the "Add/Remove Software" control panel applet or by running "c:\Program Files\WindowMetafile\Fixunins000.exe".

Finally we re-registered the Windows picture and fax viewer library by issuing these commands:

regsvr32 -u %windir%\system32\shimgvw.dll
regsvr32 %windir%\system32\shimgvw.dll

This should re-enable previews and also restore the desktop background functions.

From JRoller to Blogger?

This is my first entry to my freshly opened blog. I will try this out for a little while to see if I like it more than JRoller, where I currently keep my blog. I will probably keep the two in sync for a while to get an idea of which one suits me better.

Sunday, August 13, 2006

Ubuntu Framebuffer Console

Just another one of those "reminder" posts. After installing Ubuntu it booted with the splash screen in 640x480. As I can never remember the mode number for 1280x1024, here it is:

kernel   /boot/vmlinuz-2.6.15-25-686 root=/dev/hda1 ro quiet splash vga=794

This of course also lets you see much more text on a VT.
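For other resolutions and colour depths, the mode numbers follow the usual VESA framebuffer table. The sketch below simply encodes that table as a lookup; the function name vga_mode is made up for illustration, and the table only covers the four common resolutions, so anything else raises a KeyError.

```python
# Kernel "vga=" values for the common VESA framebuffer modes.
# Keys: (width, height); inner keys: colour depth in bits per pixel.
VESA_MODES = {
    (640, 480):   {8: 0x301, 15: 0x310, 16: 0x311, 24: 0x312},
    (800, 600):   {8: 0x303, 15: 0x313, 16: 0x314, 24: 0x315},
    (1024, 768):  {8: 0x305, 15: 0x316, 16: 0x317, 24: 0x318},
    (1280, 1024): {8: 0x307, 15: 0x319, 16: 0x31A, 24: 0x31B},
}


def vga_mode(width, height, depth):
    """Return the decimal number to put after vga= on the kernel line."""
    return VESA_MODES[(width, height)][depth]


if __name__ == "__main__":
    # 1280x1024 at 16 bits per pixel is the mode used in the kernel line above.
    print(vga_mode(1280, 1024, 16))  # prints 794
```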

Monday, July 24, 2006

X11 Dual Head with nVidia

This is more a reminder for myself than a regular post. Because I have been fiddling around with the xorg.conf to get my Linux desktop right with dual head and the right resolution on each screen, I post my configuration here. If anybody finds it useful, they are welcome to copy it.

I am currently using Ubuntu 6.06 (Dapper Drake) which comes with X.Org 7.0. My machine has a GeForce 5600XT based graphics board by LeadTek. It has a DVI-I and a VGA port. I installed the closed-source nvidia driver to get hardware acceleration and extended features the X.Org nv driver does not provide. Moreover I could not get dual head right with the X.Org driver.

The following configuration sets up two screens, one for the left display (analogue TFT, 1280x1024) and one for the right (CRT, 1024x768).

# /etc/X11/xorg.conf (xorg X Window System server configuration file)
# This file was generated by dexconf, the Debian X Configuration tool, using
# values from the debconf database.
# Edit this file with caution, and see the /etc/X11/xorg.conf manual page.
# (Type "man /etc/X11/xorg.conf" at the shell prompt.)
# This file is automatically updated on xserver-xorg package upgrades *only*
# if it has not been modified since the last upgrade of the xserver-xorg
# package.
# If you have edited this file but would like it to be automatically updated
# again, run the following command:
#   sudo dpkg-reconfigure -phigh xserver-xorg

Section "Files"
 FontPath "/usr/share/X11/fonts/misc"
 FontPath "/usr/share/X11/fonts/cyrillic"
 FontPath "/usr/share/X11/fonts/100dpi/:unscaled"
 FontPath "/usr/share/X11/fonts/75dpi/:unscaled"
 FontPath "/usr/share/X11/fonts/Type1"
 FontPath "/usr/share/X11/fonts/100dpi"
 FontPath "/usr/share/X11/fonts/75dpi"
 # path to defoma fonts
 FontPath "/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType"
EndSection

Section "Module"
 Load "i2c"
 Load "bitmap"
 Load "ddc"
# Load "dri"
 Load "extmod"
 Load "freetype"
 Load "glx"
 Load "int10"
 Load "type1"
 Load "vbe"
EndSection

Section "InputDevice"
 Identifier "Generic Keyboard"
 Driver  "kbd"
 Option  "CoreKeyboard"
 Option  "XkbRules" "xorg"
 Option  "XkbModel" "pc105"
 Option  "XkbLayout" "de"
 Option  "XkbVariant" "nodeadkeys"
EndSection

Section "InputDevice"
 Identifier "Configured Mouse"
 Driver  "mouse"
 Option  "CorePointer"
 Option  "Device"  "/dev/input/mice"
 Option  "Protocol"  "ExplorerPS/2"
 Option  "ZAxisMapping"  "4 5"
 Option  "Emulate3Buttons" "true"
EndSection

# First head of the card (left display)
Section "Device"
 Identifier "nvidia0"
 BoardName "GeForce FX 5600XT"
 Driver  "nvidia"
 BusID  "PCI:1:0:0"
 Screen  0
    #Option "ConnectedMonitor" "CRT"
    Option "UseDisplayDevice" "CRT"
    Option "NoLogo" "0"
    Option "RenderAccel" "true"
    Option "AllowGLXWithComposite" "true"
    Option "UseEdidDpi" "FALSE"
EndSection

# Second head of the same card (right display)
Section "Device"
 Identifier "nvidia1"
 BoardName "GeForce FX 5600XT"
 Driver  "nvidia"
 BusID  "PCI:1:0:0"
 Screen  1
    #Option "ConnectedMonitor" "CRT"
    Option "UseDisplayDevice" "CRT"
    Option "NoLogo" "0"
    Option "RenderAccel" "true"
    Option "AllowGLXWithComposite" "true"
    Option "UseEdidDpi" "FALSE"
EndSection

Section "Extensions"
 Option "Composite" "Enable"
EndSection

Section "Monitor"
 Identifier "H750"
 Option  "DPMS"
EndSection

Section "Monitor"
 Identifier "Amaga"
 Option  "DPMS"
EndSection

Section "Screen"
 Identifier "Screen0"
 Device  "nvidia0"
 Monitor  "H750"
 DefaultDepth 24
 SubSection "Display"
  Depth  24
  Modes  "1280x1024" "1024x768" "800x600" "640x480"
 EndSubSection
EndSection

Section "Screen"
 Identifier "Screen1"
 Device  "nvidia1"
 Monitor  "Amaga"
 DefaultDepth 24
 SubSection "Display"
  Depth  24
  Modes  "1024x768" "800x600" "640x480"
 EndSubSection
EndSection

Section "ServerLayout"
 Identifier "Dual Head Layout"
 Screen 0 "Screen0"
 Screen 1 "Screen1" RightOf "Screen0"
 InputDevice "Generic Keyboard"
 InputDevice "Configured Mouse"
 Option  "Xinerama" "On"
EndSection

Section "DRI"
 Mode 0666
EndSection

Wednesday, July 19, 2006

Fedora Core 5, Part 1

Now this is the late "Part 1 of 2" (Part 2 was published first). Took me longer than expected, but I finally got it done.

In the beginning there was Windows...

I have been working with all versions of Windows starting with 3.0 on a 386SX16. Over the years I have gained extensive knowledge about many aspects of the inner workings, the weaknesses and strengths. Up to now I have been using Windows XP as my "main" operating system. It certainly has its flaws, but all told it is a fine piece of software.

I also believe I have a fairly deep insight into Linux. I started with Debian Potato and have since used both newer Debian versions (Woody and Sarge) as well as several flavors of Red Hat (beginning with RH9, up to FC5) and some SuSE, too. Right now I am using Ubuntu Dapper Drake 6.06 to write this. However the majority of the Linux systems I work with are servers, most of them without even running X11.

While I have always liked Windows as a desktop OS I have become more and more annoyed by the plethora of worms and viruses that have surfaced over the last few years. Moreover I do not like the direction that Microsoft is taking. I was skeptical when the first versions of XP required you to activate your copy, but I could live with that, because independent sources found it to be anonymous (enough). However this whole "genuine advantage" thing they are doing now, collecting more and more information about your hard- and software, hash-values of the BIOS and other stuff is just what people were careful about before, but now they seem to accept it. Well, I decided, I will not. As a Java developer I luckily do not necessarily have to work on Windows. All my favorite tools are there on Linux, too; Eclipse, MySQL and whatever else I need. Ok, I admit that I will probably not get rid of Windows completely, but maybe a VMware setup will do. Now that the VMware server is available for free, this could be an interesting alternative.

The question for me just was: Which distribution suits me best as a Windows replacement?

Fedora Core 5

Dual Boot

First I tried FC5 as a replacement, because from what I heard it was a good choice for up-to-date software and a broad community supporting it. So I installed it on a separate hard drive to see if I liked it.

Doing so I came across a number of issues that I think will make many people (continue to) think of Linux as not (yet) ready for the desktop. Most of those are really small things and quite easy to fix, but taken together they can really annoy people who are not very interested in how the system works, but just want it to let them do their daily work.

The thing I found most annoying was that the setup program did not get the dual boot with Windows right. XP got integrated into the grub.conf file automatically, however the installer did not take into account that it was installing Fedora on a different disk. This produced a non-working grub menu entry. I could not boot Windows, except by changing the boot drive in the BIOS setup.
I had to manually make grub swap the drive order which finally allowed me to boot into Windows again; this is something the installer should definitely take care of! I do not want to know how many people managed to cause real damage while trying to get back to their other operating system(s). So for anyone interested, this is my config (grub.conf, Windows on /dev/hda and Linux on /dev/sda):

title           Microsoft Windows XP Professional
root            (hd1,0)
map             (hd0) (hd1)
map             (hd1) (hd0)
chainloader     +1

Graphics Setup

Concerning graphics, the installer set up a single monitor configuration, using the free nv driver. I was impressed to find a "Dual Head" tab in "System-Administration-Display" which offered to simply choose the type and resolution of the second monitor.
This however subsequently prevented the X server from starting again. That really annoyed me. After I had manually removed the dual head settings from the xorg.conf file I searched the web for the error message I read in the X server's log:

Fatal server error: Requested Entity already in use!

I quickly found lots of people having the same problem, all of them basically recommending to either leave dual head off (not an option), recompile the driver from source (not sure why, as I was using just stock FC5 components) or install the binary-only nvidia driver directly from nVidia. I went with the last option, taking the prepared rpm from

Being (too) optimistic I went for the Dual Head settings again and got the same problem as before. The "left" card was configured to use the nvidia driver, the right one was set to nv again.
Now this is not what I expect from a state of the art distribution. I understand it is basically the hardware manufacturers' fault to not release their drivers so they can be included with the distributions. But hey, at least I would expect two things:

  1. Let the user test new settings first. Learn from Windows here: restore the previous settings if no confirmation happens within 30s.
  2. Do not mess up the config file again by putting in the default driver, but if in doubt, ask whether to use the same driver as for the first card.

I am not sure who exactly to blame for these problems, but frankly: I don't care! This is what keeps people from accepting Linux distributions for everyday use. It's 2006 and we are just used to not breaking something by wanting to switch on a second monitor.


One more thing that has always been a very important issue to me is fonts. I know there are different opinions concerning font anti-aliasing and sub-pixel anti-aliasing ("ClearType" in Windows-speak). There are those who like it and those who don't. I count myself a member of the "like it" faction.
Some weeks ago I saw a video about typography in Windows Vista on Microsoft's Channel 9 blog (see here). Before that I did not know about the very minute tuning and delicate tweaking that takes place to produce fonts that are readable both at very large and especially very small sizes where the automatic scaling of vectors produces many undesirable artifacts.

In Fedora I also chose to have my fonts anti-aliased on a sub-pixel level. Despite the fact that one of my first actions was to copy my Tahoma, Verdana and Trebuchet MS TrueType font files from Windows to Linux, it still looked wrong. It is hard to describe, but just looking at the desktop icon labels did not feel right. At a size of 8 the Windows screen looked much clearer and more readable than the one on Fedora. Searching the web I learned even more about font rendering, especially concerning hinting and the patents connected with it. Well, maybe that's something you have to live with and get used to.


The "Anaconda" installer successfully detected both my Creative Labs SoundBlaster Live and the on-board Intel sound device. It even let me choose which one to use by default. I went for the SoundBlaster, because that's where the speakers are attached.
I ran into the same problem again that I had with FC4 (see this earlier post): Sound as root, but sad silence as a regular user.

I finally found what has probably been the issue with FC4, too: The setup had indeed defined the SoundBlaster as my sound card of choice, but apparently it only did so for the root user. When logged in as a normal user it used the Intel sound device. So it was not a permissions problem at all, as I had suspected back then. That of course explains the lack of error messages and the complete uselessness of all volume sliders and muting switches; the sound was there, there were just no speakers attached to let me know.
It may be beneficial to be able to select different preferred devices per user, but hey, why not use the installation default for all users as a start? One more thing on the "need not be" list.

"Second Impressions"

I used FC5 for some weeks, only booting into Windows occasionally, mostly to do home-banking. Basically I liked it, but it still did not have the "ease of use" factor I would have liked. Package management with yum for example is way too slow for my taste. Especially at the beginning, when I needed to install numerous packages as I went, I learned to hate its progress indicator walking slowly while working through some XML files.

Ubuntu "Dapper Drake"

One day I downloaded the free VMware server and installed Ubuntu 6.06 (Dapper Drake) in a virtual machine. I immediately felt at home, probably because I have known Debian for so long. It only took a couple of days to purge FC5 and install Dapper as my "new primary" Windows replacement. The default setup is very well done. It's just one CD, so there is not as much choice as with the Fedora DVD (in fact, there is no choice at all in the installer, just like Windows). But hey, that's ok, because most of the stuff you need is either already there (Open Office, Firefox, GAIM...) or can easily be installed via apt-get or the very nice graphical software management tools (and apt is way faster than yum...).

Right now, after only three evenings, I am set up very nicely and finishing this article. I have to admit, though, that my recent experience with the FC5 setup sped up the process somewhat. Please do not get me wrong: Fedora is a fine system. If you are looking for a distro to use, I definitely recommend having a look at it. But also be sure to give Ubuntu a try ;)

For me this could be the beginning of a very good relationship :)

Tuesday, July 11, 2006

Enhanced MySQL Administrator Graphs

Update: MySQL 4 and 5 behave differently.
In MySQL 5 there is a steady activity on some handlers, probably caused by the status queries themselves. This does not happen in 4.1. I have opened a support call with MySQL and will see what to do about it.

Update 2: See the follow-up post for more information on how to work around this.

MySQL Administrator is one of the graphical tools MySQL provide to manage their database servers. Apart from other things like server daemon control and a log file viewer this tool includes visual controls to display the load of the database server.

Even though the out-of-the-box configuration already contains some useful diagrams, I added some new ones and modified the existing ones. If you'd like to use them, please feel free to do so; they can be downloaded here: mysqladmin_health.xml. Please note that this is not a Java file, but I had to rename it to be able to upload it to JRoller. Just remove the ".java" from the filename.

There is a file in the Administrator installation directory that serves as a default. However for each user a copy is placed in his/her home directory. If this private copy exists, it will be used instead. On Windows you have to put it into the directory you reach by entering %userprofile% into the address bar of an Explorer window.

From there on the further path depends on your Windows language version. On a German Windows it is Anwendungsdaten\MySQL, on an English version it is Application Data\MySQL. Please make sure that no instance of MySQL Administrator is running, because on exit it will overwrite that file with its currently running settings.
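As a small illustration of how the target path is put together, here is a sketch that joins the pieces named above. The function name and the example profile directory are made up; the folder names and the file name are the ones mentioned in this post. It uses ntpath so that the Windows-style path is built the same way on any platform.

```python
import ntpath

# Localized name of the application data folder, as described in the post.
APPDATA_FOLDER = {
    "de": "Anwendungsdaten",
    "en": "Application Data",
}


def health_file_path(userprofile, language):
    """Build the per-user path of mysqladmin_health.xml below the profile directory."""
    return ntpath.join(
        userprofile, APPDATA_FOLDER[language], "MySQL", "mysqladmin_health.xml"
    )


if __name__ == "__main__":
    # Hypothetical profile directory, for illustration only.
    print(health_file_path(r"C:\Documents and Settings\me", "en"))
```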

If you have useful customizations, please feel free to share them, too.

Here you can see the new graphs and a short explanation:

  • Individual counters for sent and received bytes
    Traffic Graph
  • InnoDB pages read/written ratio and number of temporary tables created and how many of them were on disk (as opposed to in memory)
    InnoDB reads and temp tables
  • History of threads in the server and their idle/working ratio
    Idle and working threads
  • Number and type of DML statements performed.
    History and type of DML statements
  • History of calls to the Read_First handler.
    History of Read First Handler calls
  • History of calls to the Read_Next handler.
    History of Read Next Handler calls
  • History of calls to the Read_Prev handler.
    History of Read Prev Handler calls
  • History of calls to the Read_Rnd handler.
    History of Read Random Handler calls
  • History of calls to the Read_Rnd_Next handler.
    Number of Read Random Next Handler calls

For more details on the different handler types, see the "Status variables" tab in MySQL Administrator or the MySQL manual (see the correct manual page for your MySQL version!)

MySQL 5.0, Bug 10210

In my previous post about enhancing the graphs the MySQL Administrator displays I added a remark that there seems to be a difference between MySQL 4.1 and 5.0.

As it seems, this has already been reported as MySQL Bug #10210 and fixed, however only for 5.1. Summarizing the bug report is easy: They implemented Heisenberg (or better: the observer effect). You cannot query the counters for e.g. the number of temporary tables created without modifying them as you go. In 4.1 all the SHOW STATUS... commands (see the manual page) could be executed without modifying the values displayed, because they were immediately sent to the client. In 5.0 a temporary table with the result is created and then sent. This allows the use of the data in stored procedures.

It is however very annoying, because it increases the noise in measuring significantly. One (somewhat clumsy) workaround is to modify the formula for the graphs you define in MySQL Administrator by subtracting the number that would otherwise be displayed on an idle server. This of course only works if you only have one instance of the Administrator connected to a particular server (otherwise the other sessions' increments still show up). Moreover this makes statistics on 4.1 servers wrong, so make sure you know how to read the data if you have a multi-version environment.
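The arithmetic behind that workaround can be sketched outside of MySQL Administrator as well. The following is only an illustration with made-up function names and sample numbers: it takes a series of readings of one counter at a fixed interval, estimates the increment caused by the status queries themselves from readings on an idle server, and subtracts that from later readings.

```python
def deltas(samples):
    """Per-interval increases between consecutive counter readings."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def observer_offset(idle_samples):
    """Estimate the increment per interval that the status queries cause.

    Assumes the readings were taken on an otherwise idle server, so the
    smallest step seen is attributed to the monitoring itself.
    """
    return min(deltas(idle_samples))


def corrected_deltas(samples, offset):
    """Subtract the observer offset from each interval, never going below zero."""
    return [max(delta - offset, 0) for delta in deltas(samples)]


if __name__ == "__main__":
    idle = [100, 101, 102, 103]   # hypothetical readings on an idle server
    busy = [200, 206, 207, 211]   # hypothetical readings under load
    offset = observer_offset(idle)
    print(offset, corrected_deltas(busy, offset))
```

As with the formula in the Administrator, the correction is only valid while exactly one monitoring session is connected, and it has to be left out for 4.1 servers.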

Wednesday, June 28, 2006

Fedora Core 5, Part 2

The title of this might seem a little strange as there has not been a "part one" yet, however I feel that I'd like to share this first nonetheless. Part 1 is going to be a little more exhaustive on several aspects of my searching for an alternative to Windows XP as a desktop system.

This one might also be called "DVD playback on Fedora Core 5".

One of the more disappointing aspects of my new Fedora Core 5 install is the apparent lack of multimedia support. Sure, there is the default totem player, but I got the impression that it tried hard to avoid playing back anything I would have liked to see or hear.

DVDs especially were a concern for me. I have bought lots of movies and TV series over the years and I surely like to watch them once in a while, even if it is only a background window playing Star Wars (to be honest, as often as I have seen those, I would not need the picture anymore anyway ;-).

I was close to reverting to Windows when I could not find any pre-installed software that would "just" require an additional decoder. I did not even find a hint on how to do it. I understand that Fedora contains only free software, but a simple text note or a link to a more detailed explanation of how to get the stuff you need on your own would have been nice.

On FedoraForum I found this thread containing some basic steps on how to get going with DVDs and video files. I suggest taking a look at it. I had set up the Livna repository before for the nvidia drivers, so I skipped that step. But everything else went very smoothly.
Following the "todo list" the only thing I noticed was that there were some messages about single packages not found, but it seems everything got installed just fine.

After that I could have my first look at The Revenge Of The Sith on Linux. :)

Tuesday, June 27, 2006

MySQL Replication: Error 1053

When setting up MySQL replication there are some things to remember. Although the setup is quite easy if you thoroughly read the documentation on MySQL's developer site you might still hit some issues.

We have quite a large scale replication setup (MySQL 4.1.12) with several hundred slaves. Today we saw a very strange situation: All of the slaves stopped replicating and claimed that a statement had been partially executed on the master side. The exact message was

Query partially completed on the master (error on master: 1053) and was aborted. There is a chance that your master is inconsistent at this point. If you are sure that your master is ok, run this query manually on the slave and then restart the slave with SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;

The error code 1053 means as much as "server shutdown".

We checked the master and could not find anything unusual. The server had not been shut down at all and nothing seemed wrong with the master replication settings either.

In the end we found out what had caused the problem: Someone had tried to create a dump of the master server using mysqldump --master-data. This implies a FLUSH TABLES statement. Because that statement took too long it was aborted using the MySQL KILL command. However because that statement had already been replicated to the slaves and was now aborted, the slaves took it for "partially executed" (which is usually something bad). The error code you can see on a killed client is 1053. So the slaves decided that something serious had happened and stopped the communication with their master.
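When many slaves stop at once, it can help to pull the master error code out of the message programmatically before deciding what to do. The sketch below only parses the text quoted above; the function names are made up, and it deliberately does not decide whether skipping the statement is safe, since that still requires checking the master by hand.

```python
import re

# Matches the "(error on master: NNNN)" part of the slave's error message.
MASTER_ERROR = re.compile(r"error on master: (\d+)")

# 1053 is what a killed statement reports, as described above.
SERVER_SHUTDOWN = 1053


def master_error_code(last_error):
    """Return the master error code from the slave's message, or None if absent."""
    match = MASTER_ERROR.search(last_error)
    return int(match.group(1)) if match else None


def may_be_killed_statement(last_error):
    """True if the message carries code 1053, which a KILL on the master can cause."""
    return master_error_code(last_error) == SERVER_SHUTDOWN


if __name__ == "__main__":
    message = (
        "Query partially completed on the master (error on master: 1053) "
        "and was aborted."
    )
    print(master_error_code(message), may_be_killed_statement(message))
```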

I will have to look more closely into the documentation, but probably I will have to file a bug report as the replication should not suffer from this special case.

Friday, June 16, 2006

XP activation hassles

As I wrote in my previous post I just bought a new notebook computer that came with a preinstalled Windows XP Home Edition. After I had reinstalled it, XP Home's activation would fail with a message about an invalid product key.

Already suspecting something nasty going on, I searched the net for a while and found that in April 2005 Microsoft published a TechNet article titled Preserving OEM Pre-Activation when Re-installing Windows XP. From what it says, Microsoft disabled the internet activation for all OEM products to prevent system builders from pre-activating their machines and selling the product keys separately.

I was trying to activate at 0100h in the morning, and of course nobody was available at Microsoft. I thought that was ok, because you have 30 days to activate, but when I went to it told me that I had to activate first, because otherwise it could not run the Genuine Advantage check! Now that's to my liking! As a paying customer you have to wait till you can install the 45(!) patches that were available at that time.

Has anyone ever wondered what happens if Microsoft decides that XP has reached its end of life? If I once bought a Windows 3.1 license I can still install it today - given that I have a machine that's not too fast for it :) - but will I be able to do so with XP in 10 years?

Anyway, I indeed had to wait till the next day to go through the telephone activation procedure. Because I already knew that the automated phone activation would not work I just entered lots of 1's when asked to provide the activation code. After that I was put through to a living being, who first asked me to provide the first and the last block of my activation code. Then she wanted me to enter the product key from the sticker attached to the notebook (again). I do not know why, but I decided to play dumb and just do it. Finally, after I had told her the whole activation code and answered her question on how many machines I had installed XP to (honestly, who would tell them anything but "1", even if he/she did install it multiple times?), she gave me the final activation sequence.

First thing after that was to save the wpa.* files from the system32 folder to a CD. This should allow me to install the machine time and again if I want to without having to go through the activation again (of course only, if the hardware stays the same). To restore a saved activation state, just boot into Safe Mode and put the files from the CD back into the system32 folder.
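The copy step itself is trivial, but for completeness here is a minimal sketch of it. The function name and the directories in the usage example are assumptions for illustration; the only thing taken from above is the wpa.* file pattern. The same function works in both directions: system32 to a backup folder for saving, and back again (from Safe Mode) for restoring.

```python
import glob
import os
import shutil


def copy_wpa_files(src_dir, dst_dir):
    """Copy all wpa.* files from src_dir to dst_dir and return their names."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, "wpa.*"))):
        shutil.copy2(path, dst_dir)  # copy2 also keeps the file timestamps
        copied.append(os.path.basename(path))
    return copied


if __name__ == "__main__":
    # Hypothetical paths; adjust to the actual Windows directory and backup target.
    print(copy_wpa_files(r"C:\WINDOWS\system32", r"D:\wpa-backup"))
```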