I have received many requests about MySQL High Availability architectures. There are a lot of tools out there, but how can we use them to achieve MySQL HA without over-engineering everything?
There is no single generic architecture that answers this demand. Of course there are leading solutions, but putting them together can result in many different designs, so I will start a series of articles on that topic. Feel free, as usual, to comment, but also to recommend other tools that could be covered in a future post.
So today’s post is about MySQL InnoDB Cluster with HashiCorp’s Consul.
Architecture
This is the overview of the deployed architecture:
As you can see, we have 3 data centers, although in fact only two DCs are on premises and the third one is in the cloud. A lot of the requests I receive are about having only 2 DCs, or at least capacity for only two DCs (storage and CPU). But we already know that to achieve High Availability, we need at least 3 of everything (3 nodes, 3 DCs, …). So in our design, DC1 and DC2 each run an independent MySQL InnoDB Cluster in Single-Primary mode. We also have asynchronous replication going from one member of dc1 to the Primary Master of dc2, and the reverse way too.
In this post, I won’t cover the asynchronous replication failover that moves the asynchronous slave task to another member of the same group when the Primary role is moved to another member.
Then we have one Consul member in DC3 that is there only to provide quorum and to answer DNS requests.
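Just as an illustration, this is roughly how such a Consul server could be started in dc3. This is only a sketch based on what appears later in this post (the .mysql DNS domain, port 53 and the monitordc3 hostname); the data directory and the WAN join address are placeholders for your own setup:

# Hypothetical sketch: Consul server agent in dc3 (monitordc3), only for quorum and DNS.
# The .mysql domain and DNS port 53 match the dig test later in this post; the data dir
# and the WAN join address (a dc1 node) are assumptions. Add a dc2 server there as well.
consul agent -server -bootstrap-expect=1 \
  -datacenter=dc3 -node=monitordc3 \
  -data-dir=/var/lib/consul -client=0.0.0.0 \
  -domain=mysql -dns-port=53 \
  -retry-join-wan=192.168.50.2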
mysqlha service
In Consul, we create the mysqlha service like this:
{ "services": [ { "name": "mysqlha", "tags": [ "writes", "reads" ], "port": 3306, "checks": [ { "script": "/etc/consul.d/scripts/check_mysql.sh", "interval": "5s" } ] } ] }
The check_mysql.sh script is just a simple bash script that connects to the read/write port of the local MySQL Router running on each member of the InnoDB Cluster (Group) and checks whether a Primary, non-R/O member is reachable.
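I don’t reproduce the exact script here, but a minimal sketch of such a check could look like the following (the check user, its password and the Router R/W port 6446 are assumptions; Consul treats exit code 0 as passing and anything above 1 as critical):

#!/bin/bash
# Hypothetical sketch of /etc/consul.d/scripts/check_mysql.sh (not the original script).
# Connect through the local MySQL Router R/W port and verify that the member we reach
# is a writable Primary (super_read_only = OFF).
RO=$(mysql -h 127.0.0.1 -P 6446 -u consul_check -pconsulpwd -N -B \
     -e "SELECT @@global.super_read_only" 2>/dev/null)

if [ "$RO" == "0" ]; then
  echo "Primary (R/W) reachable through the local Router"
  exit 0
fi
echo "No writable Primary reachable through the local Router"
exit 2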
This is what we can see in consul’s UI:
Failover rule
Now we will add a rule (query) for the failover:
curl --request POST --data '{
  "Name": "mysqlha",
  "Service": {
    "Service": "mysqlha",
    "Failover": {
      "Datacenters": ["dc1","dc2"]
    },
    "OnlyPassing": true
  }
}' http://127.0.0.1:8500/v1/query
We need to perform this operation on one member of each DC (dc1, dc2 and dc3).
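Besides the DNS interface used below, the prepared query can also be executed directly over Consul’s HTTP API, which is handy to see which datacenter actually answered (the json.tool pipe is only there for readability):

curl -s http://127.0.0.1:8500/v1/query/mysqlha/execute | python -m json.tool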
Once this is done, we can test the query on dc3 (where no mysqlha service is running):
[root@monitordc3 ~]# dig @127.0.0.1 -p 53 mysqlha.query.mysql

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.4 <<>> @127.0.0.1 -p 53 mysqlha.query.mysql
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28736
;; flags: qr aa rd; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mysqlha.query.mysql.          IN    A

;; ANSWER SECTION:
mysqlha.query.mysql.    0    IN    A    192.168.50.4
mysqlha.query.mysql.    0    IN    A    192.168.50.3
mysqlha.query.mysql.    0    IN    A    192.168.50.2
We can see that the DNS returns all 3 nodes of DC1 (192.168.50.x), perfect! We can of course connect to any of them using the Router ports (6446 and 6447).
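As a quick usage example (the app_user account below is just a placeholder), an application can then simply point at the service name:

# Writes go through the Router R/W port of whichever node the DNS answer returned
mysql -h mysqlha.query.mysql -P 6446 -u app_user -p
# Reads can use the Router R/O port
mysql -h mysqlha.query.mysql -P 6447 -u app_user -p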
The next step is related to the application servers. We have two options here: either we run a Consul client on each application server to perform the DNS resolution, or we simply configure DNS resolution on the application server itself. In the latter case, the resolvers must be the node in dc3 plus one node in the other DC (or we could create a fake extra DC used only for DNS resolution). This is because the service is also running in dc2, but we want to be redirected to the primary DC as long as it is up. We could query mysqlha.query.dc1.mysql, but that only works while a Consul server in dc1 is reachable, which is not the case if the full DC goes down.
So if we have an app server in dc2, its name resolution setup should be:
search lan localdomain
nameserver 192.168.70.2
nameserver 192.168.50.2
Asynchronous Slave
As the network latency between our two data centers can be high, we opted for asynchronous replication between the two DCs (remember, dc2 is used as a GEO DR site).
So now we need to set up asynchronous replication between both DCs. I opted for the Primary Master of each DC to be both the async master and the async slave. Being the slave is mandatory for the Primary Master, as it is the only member allowed to write data.
I created the replication grants on each Primary Master (in dc1 and dc2), and I set the purged GTIDs before starting the asynchronous replication:
Traditional replication setup
I started with the user and the required privileges:
mysql> CREATE USER 'async_repl'@'%' IDENTIFIED BY 'asyncpwd';
Query OK, 0 rows affected (0.50 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'async_repl'@'%';
Query OK, 0 rows affected (0.11 sec)
Don’t forget to set GTID_PURGED according to what you see on the other Primary Master. Example:
First I check on mysql4dc2 (the current Primary Master in DC2):
mysql4dc2> show global variables like 'GTID_EXECUTED';
+---------------+--------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                |
+---------------+--------------------------------------------------------------------------------------+
| gtid_executed | 0de7c84a-118e-11e8-b11c-08002718d305:1-44, a6ae95b5-118d-11e8-9ad8-08002718d305:1-27 |
+---------------+--------------------------------------------------------------------------------------+
And then, on the server in DC1, run:

mysql> SET @@GLOBAL.GTID_PURGED='0de7c84a-118e-11e8-b11c-08002718d305:1-44, a6ae95b5-118d-11e8-9ad8-08002718d305:1-27';
We do the same on all machines in dc1 with the GTID_EXECUTED from DC2.
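The post does not show the statement used to initially point one Primary at the other, so here is a minimal sketch of how the DC1 Primary Master could be made an async slave of DC2, assuming mysql4dc2 is still the DC2 Primary; the mirror of this would be run on the DC2 Primary towards DC1:

# Hypothetical sketch: run on the DC1 Primary Master, using the async_repl account created above.
mysql -u root -p <<'SQL'
CHANGE MASTER TO
  MASTER_HOST='mysql4dc2',
  MASTER_USER='async_repl',
  MASTER_PASSWORD='asyncpwd',
  MASTER_AUTO_POSITION=1;
START SLAVE;
SQL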
Now we will add a new service in Consul (on each MySQL node in DC1 and DC2):
{ "name": "mysqlasync", "tags": [ "slave" ], "port": 3306, "checks": [ { "script": "/etc/consul.d/scripts/check_async_mysql.sh", "interval": "5s" } ] }
And don’t forget to copy the check_async_mysql.sh script to /etc/consul.d/scripts/ (this script is not made for production and may contain bugs, it’s just an example). You also need to edit this script to set SLAVEOFDC to the opposite DC’s name.
This script also uses Consul’s key/value store to record which node is the Primary Master in each DC:
[root@mysql1dc1 consul.d]# curl -s "http://localhost:8500/v1/kv/primary?raw&dc=dc1"
mysql1dc1
[root@mysql1dc1 consul.d]# curl -s "http://localhost:8500/v1/kv/primary?raw&dc=dc2"
mysql4dc2
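I don’t reproduce the original script here, but purely as an illustration (and with the same “not for production” caveat), a simplified version of such a check could look like this; the check user and password are placeholders:

#!/bin/bash
# Hypothetical, simplified sketch of /etc/consul.d/scripts/check_async_mysql.sh.
# SLAVEOFDC must be set to the opposite DC's name (e.g. dc2 on the DC1 nodes).
SLAVEOFDC=dc2

# Only the writable Primary Master of the local group should run the async slave.
RO=$(mysql -h 127.0.0.1 -u consul_check -pconsulpwd -N -B \
     -e "SELECT @@global.super_read_only" 2>/dev/null)
if [ "$RO" != "0" ]; then
  echo "Not the Primary Master: no async slave expected here"
  exit 1   # warning only
fi

# Record this node as the Primary Master of the local DC in Consul's K/V store.
curl -s -X PUT -d "$(hostname -s)" "http://localhost:8500/v1/kv/primary" > /dev/null

# Check that the asynchronous replication coming from the opposite DC is running.
MASTER=$(curl -s "http://localhost:8500/v1/kv/primary?raw&dc=${SLAVEOFDC}")
RUNNING=$(mysql -h 127.0.0.1 -u consul_check -pconsulpwd \
          -e "SHOW SLAVE STATUS\G" 2>/dev/null | grep -c "Running: Yes")
if [ "$RUNNING" -ge 2 ]; then
  echo "Async slave of ${MASTER} (${SLAVEOFDC}) is running"
  exit 0
fi
echo "Async slave from ${SLAVEOFDC} is not running"
exit 2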
After reloading Consul, you can see something like this in the UI on the Primary Master that is also the asynchronous slave:
In Action
Let’s see it in action:
In the video, we can see that when we kill the remaining non-Primary member in DC1, the group loses quorum. At that moment the size of the group is 2, so when that member gets killed, even though the Primary is still active, it only sees 1 member out of 2, which is not more than 50%. Therefore it stops being available, to avoid any split-brain situation.
Then, when we restart the nodes we killed in DC1, we have to force quorum on the single remaining node so that the other instances can rejoin the group after their reboot. This is how:
Force Quorum:
MySQL mysql2dc1:3306 ssl JS > cluster.forceQuorumUsingPartitionOf('root@mysql2dc1:3306')
Restoring replicaset 'default' from loss of quorum, by using the partition composed of [mysql2dc1:3306]

Please provide the password for 'root@mysql2dc1:3306':
Restoring the InnoDB cluster ...

The InnoDB cluster was successfully restored using the partition from the instance 'root@mysql2dc1:3306'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset
are removed or joined back to the group that was restored.
Rejoin the first instance:
MySQL mysql2dc1:3306 ssl JS > cluster.rejoinInstance('root@mysql3dc1:3306')
...
The instance 'root@mysql3dc1:3306' was successfully rejoined on the cluster.

The instance 'mysql3dc1:3306' was successfully added to the MySQL Cluster.
And rejoin the last one:
MySQL mysql2dc1:3306 ssl JS > cluster.rejoinInstance('root@mysql1dc1:3306')
...
Rejoining instance to the cluster ...

The instance 'root@mysql1dc1:3306' was successfully rejoined on the cluster.

The instance 'mysql1dc1:3306' was successfully added to the MySQL Cluster.
And finally we can see that all members are back online:
MySQL mysql2dc1:3306 ssl JS > cluster.status();
{
    "clusterName": "MyGroupDC1",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "mysql2dc1:3306",
        "ssl": "REQUIRED",
        "status": "OK",
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
        "topology": {
            "mysql1dc1:3306": {
                "address": "mysql1dc1:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysql2dc1:3306": {
                "address": "mysql2dc1:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            },
            "mysql3dc1:3306": {
                "address": "mysql3dc1:3306",
                "mode": "R/O",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    },
    "groupInformationSourceMember": "mysql://root@mysql2dc1:3306"
}
Conclusion
As you noticed, MySQL Group Replication and MySQL InnoDB Cluster can be the base of a full MySQL High Availability architecture spanning multiple data centers. MySQL InnoDB Cluster can of course be combined with a multitude of other solutions like Orchestrator, ProxySQL, … I’ll try to illustrate these solutions when I have some time, stay tuned!