The title is SEO bait – you can’t do it. We’ve seen a few recurring patterns trying to achieve similar – and I thought I would share with you my favorite two:
Option #1: Use a table to insert into, and grab the insert_id:
CREATE TABLE option1 (id int not null primary key auto_increment) engine=innodb; # each insert does one operations to get the value: INSERT INTO option1 VALUES (NULL); # $connection->insert_id();
Option #2: Use a table with one just row:
CREATE TABLE option2 (id int not null primary key) engine=innodb; INSERT INTO option2 VALUES (1); # start from 1 # each insert does two operations to get the value: UPDATE option2 SET id=@id:=id+1; SELECT @id;
So which is better? I don’t think it’s that easy to tell at a first glance, since option 2 does look more elegant – but if the next value is fetched as part of a transaction – I can see a potential for many other transactions to back up waiting on a lock (more on that in a second).
To start with a naive test, I booted two EC2 small instances in the same availability zone. Ping times are ~0.5ms between nodes. –skip-name resolve is enabled on the server. There is some skew from the machine being virtualized. My simulation is:
- Create a new sequence value
- insert into another table
The testing options are:
- Option 1 – auto_commit each statement
- Option 2 – auto_commit each statement
- Option 1 – nest each loop in a transaction
- Option 2 – nest each loop in a transaction
The raw results are:
option1 usetransactions =19 seconds for x10000 iterations. option1 ignoretransactions = 13 seconds for x10000 iterations. option2 usetransactions = 27 seconds for x10000 iterations. option2 ignoretransactions =22 seconds for x10000 iterations.
Alright – option1 seems quicker. The problem is that to be like most applications, we can’t really tell until a little concurrency is applied. Using only the “transactional” test in a few more concurrency options:
(results are transactions/second – more is better).
Option #1 starts to flatten out after 4 threads – and this is probably just because the machine I am running it against has 1 CPU. Option #2 stays flat the whole time.. and while we are running it, most of the threads are perpetually in a state of ‘Searching rows for update’ – which is what I suspect is better described as waiting on a lock.
Option #2 will likely scale better in auto_commit, since locks are held for the duration of a transaction, but this is not always possible to do if you have already started modifying data before you need an auto_increment number but you do not want to commit yet.
Entry posted by Morgan Tocker | 7 comments
PlanetMySQL Voting: Vote UP / Vote DOWN