Tuesday, August 29, 2006

MySQL Index Analyzer 0.02 package ready for download

I just uploaded the first ready-to-use package of the MySQL Index Analyzer (version 0.02). You can download it via its download site.

I recommend going to the MIA homepage to see what's new and how to use it.

Monday, August 28, 2006

Old (Bad) habits die hard

Recently I was painfully reminded that habits, once acquired, hardly ever go away.

I usually consider myself someone who tries to think software through before writing it. I do not mean "over-engineering", "over-abstracting" and "over-prepare-for-anything-that-might-ever-come'ing". However, I also believe that hacking away blindly is not a good thing either. And I try to write "nice" code, even though it might be a little more work, as long as it is easier to read or just plain more stable (which is often the same).

Sometimes, however, especially under a tight schedule, force of habit makes me (and probably any developer out there) do things that make me feel deeply embarrassed upon later review. That is exactly what happened a couple of weeks ago...

I had to write a component that translates data from a legacy system, stored in plain text files, into a relational database, accessed through an object-relational mapping layer. Each entry in the SQL database is generated from a single text file. The information used as the primary key is encoded in the filenames.

For some reason I can no longer understand, I decided to keep track of the files I had already processed by appending their names to a new plain text protocol file. At the time I probably thought it was quicker to implement than to come up with a new kind of protocol value object and store it in the database for each entry. In retrospect, however, I doubt that this approach was really faster to implement. (Even worse, this code is deployed into a J2EE application server where you are not even really allowed to do file I/O if you follow the EJB spec.)

Anyway, this translation component includes a polling mechanism that regularly inspects a directory for new, unprocessed files. While nothing looked wrong during my tests, after some time in production the processing (or rather, the polling) became continuously slower, as the list of files already processed grew longer and longer. So the time needed to find out whether a file had already been processed or still had to be gradually increased. To make matters worse, the number of file names available is limited by the design of the legacy program, so they start to roll over after a while, making the poller believe that a file had already been processed and could be skipped, while in fact there was new data in it. To make this work again, I would have had to include the file modification timestamp in the list of already processed items and compare it on every poll cycle as well.

So in the end I had to refactor the whole thing and roll out an update of our application that stored the necessary meta-information in the database to get acceptable throughput again. To avoid the file modification date check, we additionally implemented an archive mechanism for the legacy files that keeps the number of items in a single folder limited and the names unique.
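The bookkeeping problem described above can be sketched in a few lines. This is only an illustration in Python, not the code of the actual application: the function name `poll_once`, the in-memory dict standing in for the database table, and the `handle` callback are all assumptions. The point is that each poll does a constant-time lookup keyed on the file name, and that the modification time is stored alongside it so a rolled-over file name with new content is not skipped.

```python
from pathlib import Path
from typing import Callable


def poll_once(directory: Path, seen: dict, handle: Callable[[Path], None]) -> list:
    """Process every file whose (name, modification time) has not been seen yet.

    `seen` maps file name -> modification time (in nanoseconds) of the version
    that was last processed. Here it is a plain dict; in a real deployment it
    would presumably be a database table keyed on the file name.
    """
    processed = []
    for path in sorted(directory.iterdir()):
        if not path.is_file():
            continue
        # Read the timestamp once, before handling, so the stored value
        # describes exactly the version that was handed to `handle`.
        mtime = path.stat().st_mtime_ns
        if seen.get(path.name) == mtime:
            continue  # same name and same timestamp: already processed
        handle(path)
        seen[path.name] = mtime
        processed.append(path.name)
    return processed
```

Because the lookup is a dictionary access rather than a scan through an ever-growing text file, the cost of a poll cycle depends on the number of files in the directory, not on the number of files ever processed.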

However, all of this shows that people tend to fall back into old habits once they get under pressure. It can be really hard to notice this in time and make yourself do it "right" from the start.

Here is my private list of things I notice people (including myself, of course) doing time and again despite knowing better:

  • Use files instead of other, more suitable forms of storage
  • Hard code strings ("I will come back to I18N-ize this later...")
  • Include debugging output via System.out.println instead of through the logging mechanism
  • Add checks for nulls in places where you should not get any null in the first place instead of trying to find out the real cause.
  • Change the code, but not the documentation - it is always fun to look for a bug or unexpected behaviour that is caused by a method not doing what you would expect from its comment
  • Change the behaviour, but not the names (of methods, fields, variables...). This should never happen, given the excellent renaming support modern IDEs offer.

Feel free to add to this list anything that comes to your mind :)
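As a small illustration of the debugging-output habit from the list, here is a sketch in Python (the project discussed above is Java, where the same contrast applies to System.out.println versus a logging framework). The logger name `legacy_import` and the function `parse_record` are made up for the example.

```python
import logging

# A named logger can be switched on or off and redirected by configuration,
# which a bare print statement cannot.
logger = logging.getLogger("legacy_import")


def parse_record(line: str) -> dict:
    """Split a 'key=value' line into a one-entry dict."""
    key, _, value = line.partition("=")
    # Instead of print(key, value): emitted only if DEBUG is enabled,
    # and formatted lazily by the logging module.
    logger.debug("parsed key=%r value=%r", key, value)
    return {key.strip(): value.strip()}
```

With the default configuration the debug line stays silent; calling `logging.basicConfig(level=logging.DEBUG)` once at startup makes it visible without touching `parse_record`.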

Saturday, August 26, 2006

MySQL Index Analyzer (MIA): Refactoring, Part 1

Development of a more structured version is in progress. I just committed a set of changes to the repository that changes the current, rather monolithic, single-class design to what will be the first step of a more modular one.

Currently the generation of ALTER TABLE statements has been removed and the output format of the analysis is slightly different. But that is merely cosmetic.

What is much more important is that now the gathering of database schema information is based on a pluggable system, designed around an interface called IndexDescriptorProvider. Up to now I have just ported the MySQL 4 stuff that was already in the first version to this new architecture. Please feel free to have a look at it and tell me what you think.
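To give an idea of what a pluggable design of this kind can look like, here is a rough sketch in Python. Only the name IndexDescriptorProvider comes from the post; the method name `get_index_descriptors`, the `IndexDescriptor` fields, the `StaticProvider` class and the prefix-based redundancy rule are assumptions made for the example and are not taken from the MIA code.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class IndexDescriptor:
    table: str
    name: str
    columns: tuple[str, ...]
    unique: bool = False


class IndexDescriptorProvider(Protocol):
    """Anything that can list the indexes of a database (hypothetical API)."""

    def get_index_descriptors(self, database: str) -> list[IndexDescriptor]: ...


class StaticProvider:
    """Provider backed by a fixed list, handy for trying out the analysis."""

    def __init__(self, descriptors):
        self._descriptors = list(descriptors)

    def get_index_descriptors(self, database: str) -> list[IndexDescriptor]:
        return list(self._descriptors)


def find_redundant_indexes(provider: IndexDescriptorProvider, database: str):
    """Return (redundant, covering) index name pairs.

    Assumed rule: a non-unique index is redundant if its columns are a
    leading prefix of another index on the same table.
    """
    descriptors = provider.get_index_descriptors(database)
    pairs = []
    for a in descriptors:
        if a.unique:
            continue  # a unique index also enforces a constraint
        for b in descriptors:
            if a is b or a.table != b.table:
                continue
            if b.columns[: len(a.columns)] != a.columns:
                continue
            same_length = len(a.columns) == len(b.columns)
            if same_length and not b.unique and a.name < b.name:
                continue  # report exact duplicates only once
            pairs.append((a.name, b.name))
    return pairs
```

The analysis only depends on the provider interface, so a provider for another schema source can be dropped in without changing `find_redundant_indexes`.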

Next thing I'll do is implement a provider based on the INFORMATION_SCHEMA database available in MySQL 5.x to see if I missed anything.

Go to http://mysql-index-analyzer.blogspot.com for details.

Friday, August 25, 2006

MySQL Index Analyzer Basic Documentation

I posted some Basic Documentation for MySQL Index Analyzer including a simple example. It is intended to help you get started with the tool without having to read the code.

Any ideas and suggestions for improvements are appreciated :)

Tuesday, August 22, 2006

Windows compressed folders... again

I wrote about the "bugginess" of the Windows Compressed Folders feature before. Today, however, I found an even more unbelievable bug.

I copied a ZIP file containing several thousand files onto a Windows 2003 Server machine (with SP1). Once it was there I connected via RDP and right-clicked the file in the Explorer, choosing "Extract All" from the context menu.

After clicking "Next" to get past the "Welcome to the Decompression Idio^H^H^H^HWizard" I chose the target directory and hit "Next" again. Immediately I was asked whether I wanted to overwrite a file already existing in the target location. As I had created a new folder to hold the archive contents I was somewhat surprised by this dialog; so I cancelled the operation and tried again.

Now you won't believe it: Once the decompression starts upon clicking the second "Next" button, this very same button remains enabled and functioning! Obviously I had (by accident) clicked twice on the button in my first attempt. Turns out you can click it again and again, as long as the decompression is not finished and it will ask you if you want to overwrite the files it just put there itself....

Now is this embarrassing for MS or what?

Oh, and while we are at it: be sure to have a good laugh while reading "Dear Sir Bill Gates: invoice enclosed" and the follow-up "Stupid operating systems or stupid operators?".

Monday, August 21, 2006

MySQL Index Analysis Tool

Back in January I posted a simple MySQL duplicate index finder tool. Because I read requests for such a tool on the MySQL Performance Blog I decided to open a new project on Google's code hosting service as well as a new blog to track it.

So if you are interested and maybe even want to contribute to it, go have a look.

Saturday, August 19, 2006

LogBuffer

Looking at the list of referrers for my 1st blog, I found one coming from Mike Kruckenberg. First of all, it is a nice post to read for anyone who works (or has to work) with databases, so be sure to take a look.

Moreover, I learned about LogBuffer, which I did not know before. So if anyone else dealing with databases does not know it yet, maybe you will find it as interesting as I did.

Monday, August 14, 2006

Windows XP image preview broken

Today a friend called, asking if I had any idea why his Windows XP machine no longer displayed any thumbnails in the "My Pictures" folder. He had already tried resetting file type associations and some other experiments, but without any luck. Moreover, using JPG files as desktop background did not work anymore.

After some thinking he remembered that the problem might first have occurred after installing the first WMF hotfix, published by Ilfak Guilfanov even before Microsoft provided a patch.

This led us to the solution. First we made sure that the patch was uninstalled by issuing the following command:

msiexec.exe /X{E1CDC5B0-7AFB-11DA-8CD6-0800200C9A66}

If it tells you that the program is not currently installed, you should be able to remove it from the "Add/Remove Software" control panel applet or by running "c:\Program Files\WindowMetafile\Fixunins000.exe".

Finally we re-registered the Windows picture and fax viewer library by issuing these commands:

regsvr32 -u %windir%\system32\shimgvw.dll
regsvr32 %windir%\system32\shimgvw.dll

This should re-enable previews and also restore the desktop background functions.

From JRoller to Blogger?

This is my first entry in my freshly opened blog at blogger.com. I will try this out for a little while to see if I like it more than JRoller, where I currently keep my blog. I will probably keep the two in sync for a while to get an idea of which one suits me better.

Sunday, August 13, 2006

Ubuntu Framebuffer Console

Just another one of those "reminder" posts. After installing Ubuntu it booted with the splash screen in 640x480. As I can never remember the mode number for 1280x1024, here it is:

kernel   /boot/vmlinuz-2.6.15-25-686 root=/dev/hda1 ro quiet splash vga=794

This of course also lets you see much more text on a VT.
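For reference, the mode number is not arbitrary: to the best of my knowledge the value passed to vga= is the VESA mode number plus 0x200, so 794 is 0x31A, i.e. VESA mode 0x11A (1280x1024 at 16 bits per pixel). The following Python sketch encodes the commonly cited part of that table; the function name `vga_parameter` is made up, and modes beyond these depend on the video BIOS.

```python
# The Linux "vga=" parameter is the VESA mode number plus 0x200.
VESA_OFFSET = 0x200

# Commonly cited VESA modes: (width, height) -> {bits per pixel: mode number}
VESA_MODES = {
    (640, 480): {8: 0x101, 15: 0x110, 16: 0x111, 24: 0x112},
    (800, 600): {8: 0x103, 15: 0x113, 16: 0x114, 24: 0x115},
    (1024, 768): {8: 0x105, 15: 0x116, 16: 0x117, 24: 0x118},
    (1280, 1024): {8: 0x107, 15: 0x119, 16: 0x11A, 24: 0x11B},
}


def vga_parameter(width: int, height: int, depth: int) -> int:
    """Return the decimal value to put after vga= on the kernel line."""
    return VESA_MODES[(width, height)][depth] + VESA_OFFSET


print(vga_parameter(1280, 1024, 16))  # 794
```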