Until MySQL 5.5, the only variable available to detect a network connectivity problem between master and slave was slave-net-timeout. This variable specifies the number of seconds to wait for more binary log events from the master before aborting the connection and establishing it again. With a default value of 3600 seconds, it has historically been a badly configured variable, and stalled connections or high latency peaks went undetected for a long time or were never detected at all. We needed an active master/slave connection check, and this is where replication’s heartbeat can help us.
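As a rough sketch of how this timeout can be inspected and lowered on the slave: the value 60 below is only an illustrative assumption, not a recommendation, and the IO thread is restarted so that a new connection is opened with the new value.

```sql
-- Run on the slave: show the current timeout, in seconds.
SHOW GLOBAL VARIABLES LIKE 'slave_net_timeout';

-- Lower the timeout; 60 is an arbitrary example value.
SET GLOBAL slave_net_timeout = 60;

-- Reconnect to the master so the connection uses the new timeout.
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
```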
This feature was introduced in 5.5 as another parameter to the CHANGE MASTER TO command. After you enable it, the master sends “beat” packets (of 106 bytes) to the slave every X seconds while it has no binary log events to send, where X is a value you define. If the network link goes down or the latency rises above the time threshold, the slave IO thread will disconnect and try to connect again. This means we now measure the connection time or latency, not the time without binary log events. We’re actively checking the communication.
How can I configure replication’s heartbeat?
It is very easy to set up, with negligible overhead:
mysql_slave > STOP SLAVE;
mysql_slave > CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD=1;
mysql_slave > START SLAVE;
MASTER_HEARTBEAT_PERIOD is a value in seconds in the range 0 to 4294967, with millisecond resolution; a value of 0 disables heartbeats. After the loss of a beat, the slave IO thread will disconnect and try to connect again. Here is the SHOW SLAVE STATUS output after an error:
mysql_slave > show slave status\G
[...]
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
[...]
Last_IO_Errno: 2003
Last_IO_Error: error reconnecting to master 'rsandbox@127.0.0.1:19972' - retry-time: 60 retries: 86400
[...]
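The retry-time in the error above is the interval the slave waits between reconnection attempts (60 seconds here). As a sketch, assuming a shorter interval such as 10 seconds suits your environment, it can be changed with the MASTER_CONNECT_RETRY option of CHANGE MASTER TO:

```sql
-- Run on the slave. 10 seconds is an arbitrary example value.
STOP SLAVE;
CHANGE MASTER TO MASTER_CONNECT_RETRY = 10;
START SLAVE;

-- The Connect_Retry field shows the interval now in use.
SHOW SLAVE STATUS\G
```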
It is interesting to note that a 5.5 slave with replication’s heartbeat enabled can connect to a 5.1 master without breaking replication. Of course, the heartbeat will not work in this case, because the master doesn’t know what a beat is or how to send it.
What status variables do I have?
There are two: the heartbeat period and the number of beats received.
mysql_slave > SHOW STATUS LIKE '%heartbeat%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| Slave_heartbeat_period    | 1.000 |
| Slave_received_heartbeats | 1476  |
+---------------------------+-------+
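Because the period has millisecond resolution, fractional values are accepted too. A minimal sketch, with 0.5 seconds chosen arbitrarily for illustration, followed by a check of what the slave is actually using:

```sql
-- Run on the slave. 0.5 is an arbitrary example value.
STOP SLAVE;
CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 0.5;
START SLAVE;

-- Confirm the configured period.
SHOW STATUS LIKE 'Slave_heartbeat_period';

-- Run this a few times on an idle link: the counter should keep growing.
SHOW STATUS LIKE 'Slave_received_heartbeats';
```

If Slave_received_heartbeats stops increasing while no binary log events are arriving, that suggests the beats are not reaching the slave.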
Conclusion
If you need to know exactly when the connection between your master and slaves breaks, replication’s heartbeat is the easiest and fastest solution to implement.