The Case for Global Transaction Identifiers
"Global Transaction Identifiers" is a feature that has been requested every now and then. And it is not so much about what it actually is, but rather about what it enables MySQL users to do. Having a logical identifier associated with each transaction instead of a physical one (filename + offset), provides more flexibility and removes the burden of complex math from userland scripts. We have put out there an early access release (based on 5.6 codebase) of our ongoing effort to implement the global transaction identifiers and we would like some early feedback. Keep in mind that this is NOT something to use in production as it is in very early development stages. That said...
What exactly is a global transaction identifier?
A global identifier is a tag that pin-points a set of changes resulting from the execution of a transaction.
Why do we need global transaction identifiers?
If every transaction has its own universally unique identifier, it becomes a lot easier to follow changes through a complex replication stream. It is easier for us, humans, to visualize and understand what is going on, consequently, the algorithms we write for dealing with binary logs and replication tend to be far less convoluted.
In practice, what are the benefits of this all?
Designing a Global Transaction Identifier
The replication team has been working on several nice features that you must have noticed before (multi-threaded slave, row-based replication enhancements, ...). But now, we are adding global transaction identifiers to the list.
Global Transaction Identifier
The goal of this task is to augment MySQL binary log with Global Transaction Identifiers. Thus each transaction has an associated global identifier (GTID), which is essentially a pair:
GTID = <SID, GNO>
In practice a transaction is logged as a group of events in the binary log. Thus sometimes we refer to groups instead of transactions - I ended up use them intermixed in the text below. Actually, there are a few details about this, but to avoid risking excessive and unnecessary complexity in what I will describe below, I will just omit those.
The following describes more clearly each part of an GTID.
Indexes and Relaying Transactions in the Replication Stream
In a typical MySQL replication setup there is one master server and a set of slave servers retrieving the changes from that one master and replaying those changes locally, against their own databases. Thus, GTIDs are added to transaction on the originating server - the master. As such, the following major changes have to be done on the originating side:
Slaves relay changes from the master. This means that transactions that are replayed by a slave thread will keep the original GTID. Furthermore, slaves also maintain indexes to keep track of their relay logs, and its content is also flushed asynchronously. As in the master, on slave restart, a recovery routine is run to make sure that the indexes are consistent.
The current snapshot does not yet relay GTIDs through the replication protocol, so we do not get to see the identifiers flowing all the way to the slave. But we can inspect the master binary log and have a look at the identifiers...
Early Access: Exercising the Labs Snapshot.
We have uploaded a snapshot of our current work, to labs, which you can try out. It's buggy and it's incomplete, but it lays the ground to what we will be delivering in the future. So... how can we show off a bit of what we have done? Currently, we can issue a set of commands on the master and look into the resulting binary log to search for information regarding the new transaction identifiers. For instance, issuing the following commands on a server with binary log enabled:
Will get you an output very similar to the following one, when inspecting the binary log with the mysqlbinlog tool:
In the output above one can find an additional line of metadata related to global transaction identifiers and a few new variables related to GTIDs. For now, lets just concentrate in finding the global transaction identifier. Looking at the line starting with "#Subgroup", one can find and '<UUID>:1' and '<UUID>:2'. These relate to the two transactional groups, the one that consists only of the DDL 'CREATE TABLE...' and the second group is the one that consists of the set of 'INSERT INTO...' statements that comprise the explicit transaction issued.
Now... we can filter out one of the transactions just by issuing (lets skip the create table):
Summary
This post provides some insights on the work the replication team is doing on designing and implementing global transaction ids. It gives a general overview of what the problem is and roughly what the solution is and how it is trying to solve a long standing requirement such as easier fail-over.
The good news are that we are not just designing anymore, we are already implementing it and you can even get a recent snapshot of this feature branch. Go... download it, look at the code, build the branch (or download a binary one), play a little bit with it. In the current implementation, global transaction ids are not yet part of the replication protocol, but you can see them by inspecting the master binary log, using mysqlbinlog tool.
Enjoy!![]()
PlanetMySQL Voting: Vote UP / Vote DOWN
"Global Transaction Identifiers" is a feature that has been requested every now and then. And it is not so much about what it actually is, but rather about what it enables MySQL users to do. Having a logical identifier associated with each transaction instead of a physical one (filename + offset), provides more flexibility and removes the burden of complex math from userland scripts. We have put out there an early access release (based on 5.6 codebase) of our ongoing effort to implement the global transaction identifiers and we would like some early feedback. Keep in mind that this is NOT something to use in production as it is in very early development stages. That said...
What exactly is a global transaction identifier?
A global identifier is a tag that pin-points a set of changes resulting from the execution of a transaction.
Why do we need global transaction identifiers?
If every transaction has its own universally unique identifier, it becomes a lot easier to follow changes through a complex replication stream. It is easier for us, humans, to visualize and understand what is going on, consequently, the algorithms we write for dealing with binary logs and replication tend to be far less convoluted.
In practice, what are the benefits of this all?
- Fail-over: automation of fail-over suddenly becomes a lot easier. Instead of working through the physical coordinates to decide which slave is most up to date, with respect to the master it is going to replace, one can just compare global transaction identifiers of the last applied transactions. Slave promotion gets easier. However, the major benefit comes when switching over the slaves to the new master. They can reference a global transaction identifier and not have to convert binary log filenames and offsets between different servers.
- Session consistency and Hierarchical replication: Offloading the master, through some hierarchical replication scheme, often works very well, especially if tied together with an intelligent load-balancer. Sometimes the load balancer sends read queries to a slave down in the hierarchy chain and at the same time it has to be sure that session consistency is guaranteed (updates on the master must have already been applied on the slave being queried). Making sure that the update has flowed all the way down to the desired slave without global transaction identifiers is laborious (one needs to climb down the hierarchy and follow the changes on every level in the hierarchy). On the other hand, with global transaction identifiers, one can just wait for the transaction with the desired identifier to be applied on a given slave and then run the query.
- Enabler for multi-master update everywhere replication: This is a complex problem to solve, but the fact that transactions can be uniquely identified and distinguished from one another lays the ground for establishing (at least partial) order between them. This property is often important in such replication setups (even for dealing with conflict detection).
Designing a Global Transaction Identifier
The replication team has been working on several nice features that you must have noticed before (multi-threaded slave, row-based replication enhancements, ...). But now, we are adding global transaction identifiers to the list.
Global Transaction Identifier
The goal of this task is to augment MySQL binary log with Global Transaction Identifiers. Thus each transaction has an associated global identifier (GTID), which is essentially a pair:
GTID = <SID, GNO>
In practice a transaction is logged as a group of events in the binary log. Thus sometimes we refer to groups instead of transactions - I ended up use them intermixed in the text below. Actually, there are a few details about this, but to avoid risking excessive and unnecessary complexity in what I will describe below, I will just omit those.
The following describes more clearly each part of an GTID.
- SID => currently it is a 128-bit number that identifies the server where the transaction/group of events was first committed. SID is normally the server UUID, but may be something different if a transaction is generated by something else other than than a regular MySQL Server. For example, for NDB, it identifies the Cluster.
- GNO => is a 64-bit sequence number: 1 for the first changes logged on SID, 2 for the second changes, and so on. No change can have GNO 0.
Indexes and Relaying Transactions in the Replication Stream
In a typical MySQL replication setup there is one master server and a set of slave servers retrieving the changes from that one master and replaying those changes locally, against their own databases. Thus, GTIDs are added to transaction on the originating server - the master. As such, the following major changes have to be done on the originating side:
- Annotate existing binary log events with the GTID that they belong to... or create a new type of event that stores a GTID associated with a set of subsequent events in the replication stream.
- Create an index to quickly find out which are the physical coordinates that map into the logical identifiers, ie, the GTID. This makes looking up for which binary log file and at which offset a transaction is in, given a certain GTID, very easy and quick. It is especially useful for a dump thread, when it starts, to quickly find the physical position from which it should start reading and sending events to the slave.
- SID: The unique source identifier for this group of events/transaction.
- GNO: the sequence number of the group of events/transaction.
- LGID: This a local identifier like an auto-increment primary.
- binlog file: name of binary log where this group is stored.
- binlog pos: offset in binary log where this group starts.
- binlog length: length of this group in binary log.
- group end: true if this is the last set of events with GTID.
Slaves relay changes from the master. This means that transactions that are replayed by a slave thread will keep the original GTID. Furthermore, slaves also maintain indexes to keep track of their relay logs, and its content is also flushed asynchronously. As in the master, on slave restart, a recovery routine is run to make sure that the indexes are consistent.
The current snapshot does not yet relay GTIDs through the replication protocol, so we do not get to see the identifiers flowing all the way to the slave. But we can inspect the master binary log and have a look at the identifiers...
Early Access: Exercising the Labs Snapshot.
We have uploaded a snapshot of our current work, to labs, which you can try out. It's buggy and it's incomplete, but it lays the ground to what we will be delivering in the future. So... how can we show off a bit of what we have done? Currently, we can issue a set of commands on the master and look into the resulting binary log to search for information regarding the new transaction identifiers. For instance, issuing the following commands on a server with binary log enabled:
shell> SET AUTOCOMMIT=0;
shell> CREATE TABLE t1 (a INT) Engine=InnoDB;
shell> INSERT INTO t1 VALUES (1);
shell> INSERT INTO t1 VALUES (2);
shell> INSERT INTO t1 VALUES (3);
shell> COMMIT;
Will get you an output very similar to the following one, when inspecting the binary log with the mysqlbinlog tool:
$ mysqlbinlog -v var/mysqld.1/data/master-bin.000001
(...)
# at 114
# Subgroup(#1, D5375118-EF7C-11E0-8C85-F0DEF11A08B7:1, END, COMMIT, binlog(no=0, pos=114, len=107, oals=0))
SET UGID_NEXT='D5375118-EF7C-11E0-8C85-F0DEF11A08B7:1', UGID_END=1, UGID_COMMIT=1/*!*/;
#111005 11:08:04 server id 1 end_log_pos 221 Query thread_id=1 exec_time=0 error_code=0
use test/*!*/;
SET TIMESTAMP=1317838084/*!*/;
SET @@session.pseudo_thread_id=1/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=0/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
CREATE TABLE t1 (a int) Engine=InnoDB
/*!*/;
# at 221
# Subgroup(#2, D5375118-EF7C-11E0-8C85-F0DEF11A08B7:2, END, COMMIT, binlog(no=0, pos=221, len=387, oals=27))
SET UGID_NEXT='D5375118-EF7C-11E0-8C85-F0DEF11A08B7:2', UGID_END=0, UGID_COMMIT=0/*!*/;
#111005 11:08:10 server id 1 end_log_pos 296 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1317838090/*!*/;
BEGIN
/*!*/;
# at 296
#111005 11:08:10 server id 1 end_log_pos 391 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1317838090/*!*/;
INSERT INTO t1 VALUES (1)
/*!*/;
# at 391
#111005 11:08:13 server id 1 end_log_pos 486 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1317838093/*!*/;
INSERT INTO t1 VALUES (2)
/*!*/;
# at 486
# Subgroup(#2, D5375118-EF7C-11E0-8C85-F0DEF11A08B7:2, END, COMMIT, binlog(no=0, pos=221, len=387, oals=27))
SET UGID_END=1, UGID_COMMIT=1/*!*/;
#111005 11:08:16 server id 1 end_log_pos 581 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1317838096/*!*/;
INSERT INTO t1 VALUES (3)
/*!*/;
# at 581
#111005 11:08:18 server id 1 end_log_pos 608 Xid = 11
COMMIT/*!*/;
SET UGID_NEXT='AUTOMATIC'/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
In the output above one can find an additional line of metadata related to global transaction identifiers and a few new variables related to GTIDs. For now, lets just concentrate in finding the global transaction identifier. Looking at the line starting with "#Subgroup", one can find and '<UUID>:1' and '<UUID>:2'. These relate to the two transactional groups, the one that consists only of the DDL 'CREATE TABLE...' and the second group is the one that consists of the set of 'INSERT INTO...' statements that comprise the explicit transaction issued.
Now... we can filter out one of the transactions just by issuing (lets skip the create table):
$ mysqlbinlog -v --exclude-ugids=D5375118-EF7C-11E0-8C85-F0DEF11A08B7:1 var/mysqld.1/data/master-bin.000001Or we could even not print identifiers at all:
(...)
$ mysqlbinlog -v --skip-ugids var/mysqld.1/data/master-bin.000001There are a couple of more switches implemented in the mysqlbinlog tool that are useful to handle contents on the binary log, based on the global transaction identifier. But I'll leave it up to you to check that out.
(...)
Summary
This post provides some insights on the work the replication team is doing on designing and implementing global transaction ids. It gives a general overview of what the problem is and roughly what the solution is and how it is trying to solve a long standing requirement such as easier fail-over.
The good news are that we are not just designing anymore, we are already implementing it and you can even get a recent snapshot of this feature branch. Go... download it, look at the code, build the branch (or download a binary one), play a little bit with it. In the current implementation, global transaction ids are not yet part of the replication protocol, but you can see them by inspecting the master binary log, using mysqlbinlog tool.
Enjoy!
PlanetMySQL Voting: Vote UP / Vote DOWN