Tuesday, January 10, 2006

MySQL Indices

I did not get around to go after the sound problem with my new Fedora installation. However I did write a little program I had had in my mind for some time now: A MySQL duplicate index detection tool.

Where I work we use a Java application framework that generates database schemas from UML class models. Unfortunately no one took care to deactivate a feature that creates an extra index on every table, that is the same as the primary key. This is obviuosly a waste of space and takes longer to update. From what I understand from the MySQL manual the optimizer will never use it, because it would be slower than accessing the primary key (which is the clustered index in InnoDB).

So instead of finding all those superfluous indices' (generated) names manually, I decided to write a little tool to help me. While I was working at it I sized it up a little to find any index B that is a complete prefix of a longer index A. Unless A is substantially longer than B I guess it will be faster to use A for reading while dropping B altogether, thus saving the update time and disk space.

I wrote it as a simple Java class, that maybe someone will find useful. It produces output similar to this:

Table foo
--------------------------------------------------------------------------------
PRIMARY (cols: 1:X_usa_id, 2:agsl, 3:a_index)
   obsoletes IDX_1(cols: 1:X_usa_id, 2:agsl,3:a_index)
   obsoletes IDX_3(cols: 1:X_usa_id)
IDX_4 (cols: 1:X_usa_id, 2:a_index, 3:agsl)
   obsoletes IDX_3(cols: 1:X_usa_id)

Table bar
--------------------------------------------------------------------------------
...

Currently it does not take into account the uniqueness and/or cardinality of the indices, because that was not a concern in my project. May I will update it to do so sometime. If anyone else cares to improve it, I should definitely like a copy.

Link works now DuplicateIndexFinder.java

No comments: