Global Transaction Identifiers are in MySQL 5.6.5 DMR

Global Transaction Identifiers are in!

I am very happy, and especially proud, to announce that the replication team has delivered global transaction identifiers in the MySQL 5.6.5 Development Milestone Release (DMR). It is a big, game-changing feature that will make life easier for many of our users. With it in place, it is much simpler to track replication progress through the replication topology, which removes part of the burden of deploying and administering complex multi-tier replication topologies. As stated before, this feature is an enabler: it gives the user much more flexibility when deploying replication and tracking the data that is replicated. In particular, it is a foundation for reliable, automated failover and switchover with slave promotion, reducing the need for third-party HA infrastructure (which adds cost and complexity). In fact, my good colleague Chuck has already written a couple of utilities that build on global transaction identifiers and implement automated failover and switchover. You should have a look at those as well!

Having said that, let's have a look at what is under the hood.

Structure, Life Cycle and Tracking of GTIDs

It has been discussed before what a global transaction identifier (GTID) is. Just to recap, a global transaction identifier is a tuple (SID, GNO). SID is normally the SERVER_UUID and GNO is a sequence number (1 for the first transaction committed on SID, 2 for the second, and so on). Basically, a GTID is a logical identifier that maps into physical coordinates (log file name, file offset). Physical coordinates are likely to differ from one server to the next. Global transaction identifiers, in contrast, stay the same everywhere.
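
To make the tuple concrete, here is a minimal Python sketch that splits the textual form of a GTID into its SID and GNO parts. The names Gtid and parse_gtid are hypothetical helpers made up for this illustration, not part of MySQL; the sample identifier is the one that shows up in the binary log listings later in this post.

```python
import uuid
from typing import NamedTuple


class Gtid(NamedTuple):
    """Hypothetical in-memory form of a GTID: the (SID, GNO) tuple."""
    sid: uuid.UUID  # normally the SERVER_UUID of the originating server
    gno: int        # sequence number, starting at 1


def parse_gtid(text: str) -> Gtid:
    """Split 'SID:GNO' at the last colon and validate both halves."""
    sid_text, gno_text = text.rsplit(":", 1)
    gno = int(gno_text)
    if gno < 1:
        raise ValueError("GNO starts at 1, got %d" % gno)
    return Gtid(uuid.UUID(sid_text), gno)


first = parse_gtid("4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1")
second = parse_gtid("4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:2")
print(first.sid == second.sid, first.gno, second.gno)
```

The two identifiers share the SID and differ only in GNO, which is all a server needs to tell the two transactions apart, whatever file name and offset they end up at locally.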

Now, about the identifier life cycle… When a transaction is executed for the first time on the master, the master assigns it a GTID. Since SID is the server's UUID, it remains constant for transactions executed on that server. GNO, on the other hand, is generated automatically when the transaction commits (it is the smallest number not yet used as GNO for any other transaction with the same SID). GTIDs are persisted in the binary log as a new log event type, called Gtid_log_event, that holds the actual identifier. As such, when the group of events for a given transaction is written to the binary log, a new Gtid_log_event is written as well, preceding the group.
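
The "smallest number not yet used" rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the server's code; next_gno and the plain set of used numbers are assumptions made for the example.

```python
from typing import Set


def next_gno(used: Set[int]) -> int:
    """Return the smallest positive integer not yet used as a GNO for this SID."""
    gno = 1
    while gno in used:
        gno += 1
    return gno


used_gnos: Set[int] = set()
for _ in range(3):
    # Each commit takes the next free number: 1, then 2, then 3.
    used_gnos.add(next_gno(used_gnos))
print(sorted(used_gnos))
```

Note that the rule fills holes: if GNOs 1 and 3 were taken for a given SID, the next transaction would get 2.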

Once the GTID is in the binary log, the identifier flows seamlessly through the replication stream. The Gtid_log_event is read and sent by the dump thread, received and stored in the slave's relay log by the IO thread, and read from the relay log by the SQL thread. Nothing new here. However, when the SQL thread executes the transaction, it does not generate a new identifier. Instead, it preserves the same identifier, relaying it to its own binary log. Thus, the server ensures that a replayed transaction gets the same GTID it was assigned on the master. In fact, two properties hold:

No transaction is executed more than once on the same server. (Therefore, before applying a transaction, a server checks that it has not applied it before. If it has, it skips it.)
Two different transactions cannot have the same GTID.

All in all, the server must not and will not execute a transaction if that transaction's GTID already exists in the binary log, or if some other, concurrent client is executing a transaction with the same identifier. Ultimately, this means that the server keeps a record of the set of transactions it has seen/executed. This is most useful in failover scenarios, where the DBA no longer needs to work out by hand the point in the replication stream from which the slave should pick up when it is redirected to a new master. Since slaves know what they have processed, they can auto-position themselves in the replication stream.
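
As a rough illustration of that bookkeeping, the Python sketch below models a server that remembers which GTIDs it has executed and silently skips any it has seen before. The Applier class is hypothetical and ignores concurrency entirely; it only shows the "check first, then apply and record" idea.

```python
from typing import List, Set, Tuple

Gtid = Tuple[str, int]  # (SID, GNO), kept as a plain tuple for brevity


class Applier:
    """Toy model of a server tracking the set of executed GTIDs."""

    def __init__(self) -> None:
        self.executed: Set[Gtid] = set()
        self.binlog: List[Tuple[Gtid, str]] = []

    def apply(self, gtid: Gtid, statement: str) -> bool:
        """Apply the transaction unless its GTID was already executed."""
        if gtid in self.executed:
            return False  # seen before: skip, do not re-execute
        self.binlog.append((gtid, statement))  # the original GTID is preserved
        self.executed.add(gtid)
        return True


SID = "4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7"
server = Applier()
applied_first = server.apply((SID, 1), "CREATE TABLE t1 (a INT)")
applied_again = server.apply((SID, 1), "CREATE TABLE t1 (a INT)")
print(applied_first, applied_again, len(server.binlog))
```

Replaying the same transaction a second time is a no-op, which is exactly what keeps a slave consistent when it is pointed at a new master that offers it transactions it already has.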

To sum up, the current GTID implementation comprises two major building blocks:

The transaction identifier. It serves a simple and yet very powerful purpose. It uniquely identifies a set of events.
The state machine that keeps track of which transactions a server has seen. It is a key part of the procedure of switching slaves to a new master as well as keeping the data of a slave consistent by preventing undesirable re-execution of transactions.

CHANGE MASTER Made Easy (Fail-over facilitator)

In a GTID-enabled topology, the master-slave handshake is slightly different from what it has been until now. When the slave connects to a new master, it tells the master which GTIDs it has in its relay or binary log. The master can therefore work out which transactions the slave is missing and send only those. This is all automatic and means that the slave does not need to know any non-local data. Before GTIDs, the slave had to know positions in the master's binary log. That is not needed now. More importantly, the user does not have to know anything about replication positions, so no position (re)calculation has to be done when redirecting slaves to a new master. The user can just issue:

CHANGE MASTER TO MASTER_HOST='...', MASTER_PORT=SOME_PORT, MASTER_USER='...', MASTER_AUTO_POSITION=1;

The new parameter, MASTER_AUTO_POSITION=1, should be used instead of the position parameters MASTER_LOG_FILE and MASTER_LOG_POS. It tells the server to use GTIDs (i.e., the GTID protocol handshake between master and slave). Then, when the slave is started, master and slave automatically agree on which transactions the slave is missing, and replication resumes from the correct point. Again, the DBA has only one thing to do when switching the slave to a new master: use MASTER_AUTO_POSITION=1. Nothing else.
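
Conceptually, what the master does during this handshake is a set difference. The Python sketch below is a simplified model under that assumption, not the actual wire protocol: transactions_to_send is a hypothetical function, and the binary log is reduced to an ordered list of (GTID, statement) pairs.

```python
from typing import Iterable, List, Set, Tuple

Gtid = Tuple[str, int]  # (SID, GNO)


def transactions_to_send(
    master_binlog: Iterable[Tuple[Gtid, str]],
    slave_gtids: Set[Gtid],
) -> List[Tuple[Gtid, str]]:
    """Return, in binlog order, the transactions the slave has not reported."""
    return [(gtid, stmt) for gtid, stmt in master_binlog if gtid not in slave_gtids]


SID = "4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7"
master_binlog = [
    ((SID, 1), "CREATE TABLE t1 (a INT)"),
    ((SID, 2), "INSERT INTO t1 VALUES (1)"),
]
slave_gtids = {(SID, 1)}  # what the slave announces when it connects

missing = transactions_to_send(master_binlog, slave_gtids)
print(missing)
```

No file name or offset appears anywhere in the exchange, which is why the same CHANGE MASTER statement works regardless of which master the slave is pointed at.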

Restrictions and Design Changes

There are a few restrictions if the user wants to use this new feature, since some MySQL functionality is not compatible with GTIDs. I do not want to go into all the gory details, but here is a list of constructs that are automatically blocked by the server to ensure robust operation in GTID mode, with a brief explanation of why:

Non-transactional updates, such as updates to MyISAM tables. There are a few cases in which using non-transactional tables might result in duplicate GTIDs throughout a replication chain, either because transactions include changes to non-transactional tables or because master and slave use different engine types.
CREATE TABLE … SELECT. In RBR, CREATE TABLE … SELECT is split into two transactional groups of events: one for the CREATE TABLE and one for the row events. Thus, in a replication chain, both groups of events could end up getting the same identifier. Since a server skips transactions that it has seen before, the row events would not be applied by a server further down the chain.
[CREATE|DROP] TEMPORARY TABLE executed inside a transaction. CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE are special statements: they can be executed inside a transaction (there is no implicit commit), but they cannot be rolled back. This is similar to updates to non-transactional tables. There are scenarios in which replication could break and/or parts of transactions might be replayed twice while failing over to a new master. I will skip the details here, as this is a somewhat convoluted example.

Given the three cases above, to enable global transaction identifiers, one has to disable such offending statements by starting the server with the switch:

--disable-gtid-unsafe-statements

On the design changes, it is worth noting that the approach has changed a bit since the last time I blogged. The current implementation leaves out the planned indexes that would map identifiers to physical coordinates. We are still looking into how to solve all the issues around this.

Hands-on

To start making use of GTIDs, the entire replication infrastructure needs to be configured to use GTIDs. In addition, all servers should "speak" GTIDs, meaning that no transaction without an identifier should still be pending execution. Starting the server with GTIDs ON requires four switches (two of which you already know from long before):

--log-bin
--log-slave-updates
--gtid-mode=ON
--disable-gtid-unsafe-statements

Obviously, the server needs the binary log turned ON (--log-bin). It also requires --log-slave-updates to be ON, since the SQL thread must persist GTID data while relaying those events. Furthermore, a master may be demoted to a slave's role at some point in time. Turning this switch ON saves the user from having to restart the server when that moment arrives.

There are also two new options. One (--disable-gtid-unsafe-statements) was already explained in the "Restrictions" section. The other, --gtid-mode, is a simple switch that turns on the GTID feature.

That's it. Starting mysqld with these options gets us a server configured to use GTIDs. This can be checked by issuing:

mysql> SHOW VARIABLES LIKE '%gtid_mode%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode     | ON    |
+---------------+-------+
1 row in set (0,01 sec)

OK, great. Now, let's play a bit with it. Let's create a table:

mysql> use test;
Database changed
mysql> CREATE TABLE t1 (a INT);
Query OK, 0 rows affected (0,02 sec)

mysql> SHOW BINLOG EVENTS;
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name          | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| master-bin.000001 |   4 | Format_desc    |         1 |         117 | Server ver: 5.6.6-m8-debug-log, Binlog ver: 4                     |
| master-bin.000001 | 117 | Previous_gtids |         1 |         144 |                                                                   |
| master-bin.000001 | 144 | Gtid           |         1 |         188 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1' |
| master-bin.000001 | 188 | Query          |         1 |         281 | use `test`; CREATE TABLE t1 (a INT)                               |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+

We can notice a few new things in the binary log already. There is a Previous_gtids event, which I have not mentioned before so as not to clutter this blog entry. Let's just assume that it is an event that keeps track of the transaction identifiers that existed in a set of previously purged binary log files. Anyway, the most interesting part for now is the Gtid event. It precedes the CREATE TABLE and assigns it the unique identifier:

'4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1'

Very good! Let's insert a value into the table:

mysql> INSERT INTO t1 VALUES (1);
Query OK, 1 row affected (0,01 sec)

This results in the following set of events being logged to the binary log:

mysql> SHOW BINLOG EVENTS;
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name          | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| master-bin.000001 |   4 | Format_desc    |         1 |         117 | Server ver: 5.6.6-m8-debug-log, Binlog ver: 4                     |
| master-bin.000001 | 117 | Previous_gtids |         1 |         144 |                                                                   |
| master-bin.000001 | 144 | Gtid           |         1 |         188 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1' |
| master-bin.000001 | 188 | Query          |         1 |         281 | use `test`; CREATE TABLE t1 (a INT)                               |
| master-bin.000001 | 281 | Gtid           |         1 |         325 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:2' |
| master-bin.000001 | 325 | Query          |         1 |         400 | BEGIN                                                             |
| master-bin.000001 | 400 | Query          |         1 |         495 | use `test`; INSERT INTO t1 VALUES (1)                             |
| master-bin.000001 | 495 | Xid            |         1 |         522 | COMMIT /* xid=11 */                                               |
+-------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
8 rows in set (0,00 sec)

We can see right away that another identifier was assigned to the new transaction. This time: '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:2'.

Same SID, 4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7, different GNO, 2.

Now, let's go over to another server and set it up as a slave of this one. One can do that by simply issuing (note: I am using the root user just for demonstration purposes):

mysql> CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=13000, MASTER_USER='root', MASTER_AUTO_POSITION=1;
Query OK, 0 rows affected (0,00 sec)

Inspecting SHOW SLAVE STATUS, one can find that everything is set up:

mysql> SHOW SLAVE STATUS\G
(...)
                  Master_Host: 127.0.0.1
                  Master_User: root 
                  Master_Port: 13000
(...)
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
1 row in set (0,00 sec)

You can see two new fields here, Retrieved_Gtid_Set and Executed_Gtid_Set. The first one is the set of GTIDs pulled from the master. The second one is the set of GTIDs actually executed.

Continuing… Let's start the slave:

mysql> START SLAVE;
Query OK, 0 rows affected (0,00 sec)

And let's inspect the slave status again:

mysql> SHOW SLAVE STATUS\G
(...)
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 13000
(...)
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
(...)
           Retrieved_Gtid_Set: 4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1-2
            Executed_Gtid_Set: 4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1-2
1 row in set (0,00 sec)

Inspecting the RELAY log on the slave, one can find:

mysql> show relaylog events in 'slave-relay-bin.000002';
(...)
| slave-relay-bin.000002 |  345 | Gtid           |         1 |         188 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1' |
| slave-relay-bin.000002 |  389 | Query          |         1 |         281 | use `test`; CREATE TABLE t1 (a INT)                               |
| slave-relay-bin.000002 |  482 | Gtid           |         1 |         325 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:2' |
| slave-relay-bin.000002 |  526 | Query          |         1 |         400 | BEGIN                                                             |
| slave-relay-bin.000002 |  601 | Query          |         1 |         495 | use `test`; INSERT INTO t1 VALUES (1)                             |
| slave-relay-bin.000002 |  696 | Xid            |         1 |         522 | COMMIT /* xid=11 */                                               |
+------------------------+------+----------------+-----------+-------------+-------------------------------------------------------------------+
13 rows in set (0,00 sec)

Looking into the slave's BINARY log, one can find:

mysql> SHOW BINLOG EVENTS;
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| Log_name         | Pos | Event_type     | Server_id | End_log_pos | Info                                                              |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
| slave-bin.000001 |   4 | Format_desc    |         2 |         117 | Server ver: 5.6.6-m8-debug-log, Binlog ver: 4                     |
| slave-bin.000001 | 117 | Previous_gtids |         2 |         144 |                                                                   |
| slave-bin.000001 | 144 | Gtid           |         1 |         188 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1' |
| slave-bin.000001 | 188 | Query          |         1 |         281 | use `test`; CREATE TABLE t1 (a INT)                               |
| slave-bin.000001 | 281 | Gtid           |         1 |         325 | SET @@SESSION.GTID_NEXT= '4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:2' |
| slave-bin.000001 | 325 | Query          |         1 |         400 | BEGIN                                                             |
| slave-bin.000001 | 400 | Query          |         1 |         495 | use `test`; INSERT INTO t1 VALUES (1)                             |
| slave-bin.000001 | 495 | Xid            |         1 |         522 | COMMIT /* xid=15 */                                               |
+------------------+-----+----------------+-----------+-------------+-------------------------------------------------------------------+
8 rows in set (0,00 sec)

Notice how the original identifiers were indeed preserved.

Getting back to the master, one interesting new variable to observe is GTID_DONE. This variable contains the set of logged transactions. One can query it to find out which transactions were actually seen/executed by the server:

mysql> SELECT @@GLOBAL.GTID_DONE;
+------------------------------------------+
| @@GLOBAL.GTID_DONE                       |
+------------------------------------------+
| 4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1-2 |
+------------------------------------------+
1 row in set (0,00 sec)

This is very cool. The same thing can be done on the slave side:

mysql> SELECT @@GLOBAL.GTID_DONE;
+------------------------------------------+
| @@GLOBAL.GTID_DONE                       |
+------------------------------------------+
| 4B2CBA63-8082-11E1-BE2D-F0DEF11A08B7:1-2 |
+------------------------------------------+
1 row in set (0,00 sec)

OK, this was just an appetizer! There are a few more variables, and additional extensions both to the replication layer and to mysqlbinlog, that are worth checking out! I will let you uncover those yourself with the help of the great online manual, which already documents much of this feature.

Summary

This post provides a very brief insight into Global Transaction Identifiers. It is a major replication feature that made it into MySQL 5.6.5, and one that is extremely important for people building highly available systems based on MySQL replication.

Since it is a big feature and a big change in replication behavior, I have only covered a small part of it in this post. Think of it as a very brief introduction. Still, it touches on the basics needed to get going: what a GTID is and what its life cycle looks like, how to activate the feature, and how to connect servers using the GTID protocol (and thus how easy and simple it has become to switch a slave over to a new master). It also mentions some of the restrictions one has to deal with when using this feature.

It concludes with a very short overview of how a server behaves when GTID is turned ON.

I cannot end this post without referring readers, again, to the very nice and extremely interesting blog post from Chuck. It presents two new MySQL utilities that automate a couple of the most complex replication administration tasks: switchover and failover. These utilities already build on global transaction identifiers and take much of the pain away when promoting a slave, taking a master down for maintenance, and so forth… And, by the way, MySQL 5.6 is full of new, interesting and very useful replication features. You can find more details in the developer zone replication article. Go and have a look.

I hope you have a great time trying this new feature out and, more importantly, that it fits nicely into your own use cases going forward.

Have fun!
