On March 4th I wrote about an observation we made concerning InnoDB locking behavior in combination with foreign key constraints.
Since then we have been in contact with MySQL support and they in turn with the InnoDB developers. One of the conclusion I can already draw from the whole thing is that another level of indirection (InnoDB being a part of Oracle, i. e. an external resource to MySQL, too) does not tend to speed things up...
But be that as it is, here is a more thorough explanation we got about how the more aggressive locking comes to pass in case a foreign key constraint references a parent table's primary key. As the original response from the support staff spans several emails and includes some internal stuff I will paraphrase to protect the innocent (i. e. me ;-)).
First - just to remind myself - some notes on how the foreign key checks work. For any update you do to a child table, the DBMS needs to make sure that you cannot concurrently remove the parent table's row the FK refers to. Otherwise you could end up with a request to roll back the transaction that was intended to change a child row and be left without a suitable parent value. To do so, locks are placed on the index records being referenced by a child table's foreign key.
In the example transaction 1 locks a row in the parent table to update it. It does so by locking the primary key. It uses an X (exclusive) lock on the particular value of the PK for one row in this table. An exclusive lock blocks access to the entire row, because the primary key is a clustered index in InnoDB. That means the key itself is part of the record - in contrast to non-clustered indexes where a copy of index columns' values is placed in a location outside the actual records. So regardless of whether or not the key values themselves are changed, the record is locked completely.
Again: Because of InnoDB's structure all primary key locks are X locks, as the key itself is part of the row. In contrast to that an S (shared) lock only locks entries in a secondary index, which are stored outside the actual records.
Now, transaction 2 intends to update a record in a dependent table. It has to take a lock on the foreign key there whose reference key is the primary key in the parent table. However the exact row the child record refers to in the parent table has already been locked by transaction 1, hence operation 2 has to wait.
Transaction 3 needs to change a column in a row belonging to a child table, too. This column has a foreign key to the parent table, but not one whose parent is a primary key. It instead references a secondary key of the parent table. That means transaction 3 does not have to wait, because it does not require X lock on the entire row in the parent table. It only requires an S lock on the secondary key. As no other transaction has taken such a lock on that index entry, transaction 3 can proceed.
While all this explains perfectly why we experience a locking situation in one case, but not in the other, it still leaves the question open on how to go on from there.
Basically the one conclusion that can be drawn here already is this:
Either you get undesirable locks when foreign keys reference their parent tables' primary keys, our you have to define superfluous indexes that cost disk space and slow down write operations. In some cases you cannot even choose, because the optimizer might still decide to go for the primary key, even though you defined a secondary one, depending on the specific conditions for any given statement.
The latter part of that is especially true if you go for the - usually suggested - short primary keys in InnoDB tables, e. g. single numeric key columns. While those are generally a good idea especially for InnoDB they work against you in the scenario at hand.
Right now, the only suggestion we got from the MySQL folks is to rather go for less indexes, because the cost they introduce is usually significant in contrast to the locking problems. This is hard to quantify however and may depend heavily on your application's access patterns. Making matters still worse is InnoDB's current (5.0) limitation that it will lock all rows that are examined while executing any particular DML statement. Only in 5.1 only rows that get actually changed will be locked, alleviating the situation to some degree.
There might still be something coming here, however right now I do not think there is a big chance of this being changed any time soon. The one thing I will ask for however is to improve the documentation on foreign keys to have people know about this stuff in advance.