Sharing an auto_increment value across multiple MySQL tables

The title is SEO bait – you can’t do it. We’ve seen a few recurring patterns trying to achieve similar – and I thought I would share with you my favorite two:

Option #1: Use a table to insert into, and grab the insert_id:

CREATE TABLE option1 (id int not null primary key auto_increment) engine=innodb;

# each insert does one operations to get the value:
INSERT INTO option1 VALUES (NULL);
# $connection->insert_id();

Option #2: Use a table with one just row:

CREATE TABLE option2 (id int not null primary key) engine=innodb;
INSERT INTO option2 VALUES (1); # start from 1

# each insert does two operations to get the value:
UPDATE option2 SET id=@id:=id+1;
SELECT @id;

So which is better? I don’t think it’s that easy to tell at a first glance, since option 2 does look more elegant – but if the next value is fetched as part of a transaction – I can see a potential for many other transactions to back up waiting on a lock (more on that in a second).

To start with a naive test, I booted two EC2 small instances in the same availability zone. Ping times are ~0.5ms between nodes. –skip-name resolve is enabled on the server. There is some skew from the machine being virtualized. My simulation is:

Create a new sequence value
insert into another table

The testing options are:

Option 1 – auto_commit each statement
Option 2 – auto_commit each statement
Option 1 – nest each loop in a transaction
Option 2 – nest each loop in a transaction

The raw results are:

option1 usetransactions =19 seconds for x10000 iterations.
option1 ignoretransactions = 13 seconds for x10000 iterations.
option2 usetransactions = 27 seconds for x10000 iterations.
option2 ignoretransactions =22 seconds for x10000 iterations.

Alright – option1 seems quicker. The problem is that to be like most applications, we can’t really tell until a little concurrency is applied. Using only the “transactional” test in a few more concurrency options:

(results are transactions/second – more is better).

Option #1 starts to flatten out after 4 threads – and this is probably just because the machine I am running it against has 1 CPU. Option #2 stays flat the whole time.. and while we are running it, most of the threads are perpetually in a state of ‘Searching rows for update’ – which is what I suspect is better described as waiting on a lock.

Option #2 will likely scale better in auto_commit, since locks are held for the duration of a transaction, but this is not always possible to do if you have already started modifying data before you need an auto_increment number but you do not want to commit yet.

Full disclosure

Entry posted by Morgan Tocker | 7 comments

Add to: | | | |