One of the key features MySQL likes to advertise is its modular architecture, which allows different storage engines to live below the same SQL layer. In practice the application and/or database designer can choose from a variety of low-level data storage implementations, each offering different characteristics, and may pick one on a per-table basis (even though I personally believe most designs will use the same engine for all tables of a particular schema).
The idea behind this is that, for example, people who do not need transactions should not have to pay for them at all; there may be a performance impact involved which they cannot afford to take. Moreover, some specialized types of index or column might not be available in all engines. Basically the concept is very interesting and can be really useful for developers.
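As a quick illustration (table names are made up for the example), the engine is just a clause on CREATE TABLE, so a single schema can mix engines freely:

mysql> create table audit_log (msg TEXT) engine=MyISAM;     -- no transactions, little overhead
mysql> create table accounts (
    ->   id INT PRIMARY KEY,
    ->   balance DECIMAL(10,2)
    -> ) engine=InnoDB;                                      -- transactional, referential integrity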
However there is a weakness that in my opinion needs some serious work: the interface between the common SQL layer and the storage engines seems rather limited with respect to what storage engines can do to inform the layer above about status and error conditions.
For example there is no (elegant) way to find out the details of a constraint violation when using the InnoDB storage engine. While you will get an error message saying that some statement failed due to a violation of referential integrity constraints, you have to use the generic "innodb status" command to get any details. However this will not only tell you about the error you care about at that particular moment, but also give you lots of information on lots of other stuff inside InnoDB. That is necessary, because you do not have any other means of finding out about those either, e. g. when you are investigating a performance problem.
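Roughly, the workflow looks like this (a sketch; the exact error number and text depend on the server version, and on older servers the command is plain SHOW INNODB STATUS):

mysql> insert into child (parent_id) values (999);   -- no matching parent row
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (...)
mysql> show engine innodb status\G
...
------------------------
LATEST FOREIGN KEY ERROR
------------------------
...

The "LATEST FOREIGN KEY ERROR" section is the only place where the constraint details show up, buried between semaphore, buffer pool and transaction statistics.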
From what I learned from a consultant some time ago, this is due to the limited interface specification through which MySQL itself (the upper layer) and the storage engines talk to each other. Because this protocol has to be somewhat generic, messages from the bottom to the upper layers have to be wrapped into some form of special result set which you then have to parse and understand on your own. Moreover, if memory serves me right, there is a limitation on how much data can be transferred at a time (it could be a limitation of the client as well). Because of this you will not even always get a full InnoDB status output: it will be truncated if it gets bigger than 64k.
While this is not particularly nice, it is a limitation I believe is acceptable, especially in the case of InnoDB, because the innodb_monitor feature allows you to get the full output into a log file.
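For reference, the monitor is toggled by creating (and later dropping) a specially named InnoDB table; while it exists, the server periodically appends the full, untruncated status output to the error log:

mysql> create table innodb_monitor (a INT) engine=InnoDB;   -- start periodic output
mysql> drop table innodb_monitor;                           -- stop it again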
What I consider much worse, however, is that error messages from the underlying storage engine are often mapped to more generic MySQL messages in an unpredictable way.
Time and again I have run into problems that present you with an error message that has nothing to do with the actual problem. For example, in a replication scenario you might get error 1236, claiming that there is something wrong with replication position counters, when it turns out that this message can also mean a full disk on the master. If you know enough about the implementation details you might see how this message comes to pass, but if you are troubleshooting a production system that is not what you want to spend your time on. Moreover, I tend to forget these peculiarities if they occur seldom enough. Just recently I found a machine spitting out strange errors on each and every query I issued (InnoDB):
mysql> select count(*) from child;
ERROR 1033 (HY000): Incorrect information in file: './test/child.frm'
mysql>
Now, just having this message, what would you expect is wrong here? File corruption? Broken SCSI controller? Faulty memory?
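The message template itself carries no more context than that; the perror utility that ships with MySQL just confirms how generic it is (output may vary slightly between versions):

shell> perror 1033
MySQL error code 1033: Incorrect information in file: '%-.200s'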
When you use Google to search for "ERROR 1033 (HY000)" you will get all sorts of results, most of them suggesting to try myisamchk (not very useful for InnoDB) or REPAIR TABLE statements. Often you will find someone claiming that restoring from the latest backup might be the only option.
While all of this is certainly valid advice for the problems this error message was originally intended to report, in my case it all just led in the wrong direction.
It turns out that something was wrong with the my.cnf configuration file. This was on a machine set up using the "Streaming Slave Deployment" mechanism I described in an earlier article. For some reason the script that usually adapts the config file automatically after downloading the data files had not been started, so a default my.cnf was still in place.
Unfortunately the InnoDB data file sizes configured in that default file did not match the files downloaded from the master server. This is a listing of the data directory:
-rw-rw---- 1 mysql mysql 555745280 2007-08-31 20:26 ibdata1
-rw-rw---- 1 mysql mysql 104857600 2007-08-31 20:26 ibdata2
-rw-rw---- 1 mysql mysql   5242880 2007-08-31 20:26 ib_logfile0
-rw-rw---- 1 mysql mysql   5242880 2007-08-31 20:24 ib_logfile1
drwxr-xr-x 2 mysql mysql      4096 2007-08-31 20:27 mysql
drwx------ 2 mysql mysql      4096 2007-05-12 12:24 test
It may not be obvious, but 555745280 bytes for ibdata1 is not 512MB, it is 530MB. Nevertheless the MySQL server started even with this wrong configuration; however, every statement would fail with the message above.
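A quick calculation confirms the mismatch:

shell> echo $((555745280 / 1024 / 1024))
530
shell> echo $((512 * 1024 * 1024))
536870912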
Shutting down the server and correcting the innodb_data_file_path line to match the actual file sizes restored everything to a working state.
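The original line from the broken config is not preserved here, but given the file sizes in the listing above, the corrected setting would presumably have looked something like this:

innodb_data_file_path = ibdata1:530M;ibdata2:100M   # exact attributes (e.g. autoextend) unknown

With that in place, queries ran again: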
mysql> select count(*) from child;
+----------+
| count(*) |
+----------+
|        9 |
+----------+
1 row in set (0.03 sec)
While I really like MySQL and find it generally rather easy to configure for very high performance, this is definitely a major weakness I would like to see improved in future versions. For the time being I will try to post about anything that strikes me as odd enough for someone else to be interested in, too :)