MySQL: Add primary key to table with duplicates

Maybe this is obvious, but I post it anyway, just to remind myself should I need it again.

Recently I had to change a table that I had not completely thought through when I first created it. The structure was so simple, I did not think I could do anything wrong with it:

CREATE TABLE `parent` (
  `par_id` bigint(20) NOT NULL,
  `somevalue` varchar(20) default NULL,
  PRIMARY KEY  (`par_id`)

CREATE TABLE `child` (
  `x_parid` bigint(20) default NULL,
  `value` bigint(10) default NULL,
  KEY `fk_parid` (`x_parid`),
  CONSTRAINT `child_ibfk_1` FOREIGN KEY (`x_parid`) REFERENCES `parent` (`par_id`)

There is a 1:0..* relationship between parent and child. Some sample data:

mysql> select * from parent;
| par_id | somevalue    |
|      1 | Parent No. 1 | 
|      2 | Parent No. 2 | 
|      3 | Parent No. 3 | 
3 rows in set (0.00 sec)

mysql> select * from child;
| x_parid | value |
|       1 |    10 | 
|       1 |    12 | 
|       1 |    15 | 
|       2 |    25 | 
|       2 |    26 | 
|       2 |    26 | 
|       3 |    31 | 
|       3 |    31 | 
|       3 |    31 | 
9 rows in set (0.00 sec)

Clearly it is possible (and intended) that there can be several children with references to the same parent, even if the have equal values. Inserting data using my application was no problem. However when I tried to read back a parent including its children, I got an error from the persistence framework (proprietary 3rd party) that there was a probable inconsistency in the data, because there were several identical children. Turns out, the thing needs a primary key on every table to manage it in memory.

So how do you add a primary key to the children table when it already has data? First attempt:

mysql> ALTER TABLE child
    ->   ADD COLUMN child_id BIGINT(20) NOT NULL FIRST,
    ->   ADD PRIMARY KEY(child_id);
ERROR 1062 (23000): Duplicate entry '0' for key 1

Of course, adding a column fills it with either NULL or with the data types default value, in this case 0. Declaring it as a primary key then ought to fail, because all 9 rows got the default value.

So we need a some initial distinct values for the column (in the application the persistence layer will provide generated keys once there is primary key column). For this the auto increment feature comes in handy:

mysql> ALTER TABLE child
    ->   ADD PRIMARY KEY(child_id);
Query OK, 9 rows affected (0.06 sec)
Records: 9  Duplicates: 0  Warnings: 0

mysql> select * from child;
| child_id | x_parid | value |
|        1 |       1 |    10 | 
|        2 |       1 |    12 | 
|        3 |       1 |    15 | 
|        4 |       2 |    25 | 
|        5 |       2 |    26 | 
|        6 |       2 |    26 | 
|        7 |       3 |    31 | 
|        8 |       3 |    31 | 
|        9 |       3 |    31 | 
9 rows in set (0.00 sec)

Because we do not need the auto incrementing anymore, we can remove it again:

Query OK, 9 rows affected (0.08 sec)
Records: 9  Duplicates: 0  Warnings: 0

Unfortunately this cannot be combined into a single ALTER TABLE statement, because the parser seems to first check if the statement's parts are valid, which is not the case for the last MODIFY COLUMN segment as the child_id column does not exist at that time.


Anonymous said…

Good post. Helped me quite a bit.

With respect,
Anonymous said…
Many thanks, you helped me out a treat.

Anonymous said…
Thanks, that helped me a lot. I was having problems with Devart's MySQL entity framework support and that's exactly what I needed.
Anonymous said…
Hi Daniel,

I have a slightly different problem but it falls under the primary key issue: my table perfume has three columns (including perfume_id as primary key) and is currently empty. I am trying to insert values into the column perfume_brand but I am getting an error message that says duplicate entry for 0 for key primary. How am I suppose to populate this table so that the perfume brand values get assigned a primary key? thanks
Make sure your primary key column is defined as "Auto Increment". Otherwise it is up to you to provide unique key values. For the first record the default value 0 will be inserted if you do not tell the database anything else. But for the second record this default value will not be valid as it is already taken.
PĂȘle-MĂȘle said…
Thanks . Got it, I recreated a table and set the ID as auto_increment
Anti-Dan said…
Worst advice ever, still cannot change the primary key for the table due to duplicates, by following your advice.
<irony>Jeez, Blogger must have swallowed the useful part of your comment...</irony>
Maybe if you had presented a little more detail one could be of help, but hey, chance gone. Won't feed the troll any more.
Unknown said…
I've used this page 4 times so far :-) Thank you, and please leave the page up for the next 4 times I use it!
Unknown said…
plz tell me how clear the mysql promt

mysql> ;;;;;;;;;;;,,,,,,,,,,,,,
mysql> ;;;;;;;;;;;,,,,,,,,,,,,,
mysql> ;;;;;;;;;;;,,
mysql> how to we clear the prompt
how we get new promt without closing.
Anonymous said…
Thanks, it helps me !
Anonymous said…
no shit you cant use a primary key if you have duplicate id numbers, anti dan....

wouldnt that kind of defeat the purpose of a primary key?
Unknown said…
Thanks. This helped me.
Anonymous said…
its really really really really really really really really helpful
Anonymous said…
Thanks a lot Daniel. Your post helped me a lot!

Aza D. Oberman said…
I'm getting the dreaded "Duplicate entry for key 'PRIMARY'" when initially populating a freshly created table using MySQL under DBI:

CREATE TABLE `adminFiles` (
`Identifier` varchar(35) NOT NULL,
`admin_email` varchar(40) DEFAULT NULL,
`admin_phone` char(15) DEFAULT NULL,
`Password` varchar(75) DEFAULT NULL COMMENT 'use RickRack.cgi',
`Identification` varchar(40) DEFAULT NULL,
`region` char(20) NOT NULL,
`temp_entry` int(1) DEFAULT NULL,
PRIMARY KEY (`Identifier`)

After preparing "INSERT INTO adminFiles (Identifier, admin_email, admin_phone, Password, Identification, region, temp_entry) VALUES (?, ?, ?, ?, ?, ?, ?)" my very first attempt to add the following generated the "Duplicate entry for key 'PRIMARY'" error:

sth->execute('Ubermiller', '', '2022335734', 'c5c36t', 'Admin', 'DCnd', '0');

Deleting the entire database and creating a new one doesn't help.

This PRIMARY key is not an integer that can be autoincremented -- the "fix" most favored here.

It seems far fetched but could the ORDINAL_POSITION of the columns be in play here?

Popular posts from this blog

SAXParseException: -1:-1: Premature End Of File - Misleading error

Amiga 500 Restoration: A501 Memory Expansion

Something to know about Toshiba external USB HDD