Read consistency has been a small concern for adopters of MySQL InnoDB Cluster (see this post and this one).
This is why MySQL now supports (since 8.0.14) a new consistency model to avoid such situations when needed.
Nuno Carvalho and Aníbal Pinto already posted a blog series I highly encourage you to read:
- Group Replication – Consistency Levels
- Group Replication: Preventing stale reads on primary fail-over! (you can also check this post)
- Group Replication – Consistent Reads
- Group Replication – Consistent Reads Deep Dive
After those great articles, let’s see how this works with some examples.
The environment
This is how the environment is set up:
- 3 members: mysql1, mysql2 & mysql3
- the cluster runs in Single-Primary mode
- mysql1 is the Primary Master
- some extra sys views are installed
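To confirm this layout before running the examples, a query along these lines can be run on any member. It is only a minimal sketch, assuming the standard performance_schema tables shipped with MySQL 8.0; the post itself does not show this query.

```sql
-- List the members of the group with their state and role.
-- In Single-Primary mode, exactly one member should report PRIMARY.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
  FROM performance_schema.replication_group_members
 ORDER BY MEMBER_HOST;
```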
Example 1 – EVENTUAL
This is the default behavior (group_replication_consistency='EVENTUAL'). The scenario is the following:
- we display the default value of the session variable controlling the Group Replication consistency on the Primary and on one Secondary
- we lock a table on a Secondary master (mysql3) to block the applying of the transaction coming from the Primary
- we demonstrate that even if we commit a new transaction on mysql1, we can read the table on mysql3 and the new record is missing (the write could not happen due to the lock)
- once unlocked, the transaction is applied and the record is visible on the Secondary master (mysql3) too.
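The scenario above can be sketched in SQL as follows. This is an approximation, not the exact session from the demo: the layout of t1 (an auto-increment id and a name column) is hypothetical, and LOCK TABLES ... READ is assumed as the way the table gets locked.

```sql
-- Assumption: t1 is a hypothetical table such as
--   CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(20));

-- On mysql1 and on mysql3: display the session value (EVENTUAL by default)
SELECT @@session.group_replication_consistency;

-- On mysql3: hold a read lock so the applier cannot write to t1
LOCK TABLES t1 READ;

-- On mysql1: commit a new transaction
INSERT INTO t1 (name) VALUES ('example row');

-- On mysql3: the read returns immediately, without the new record
SELECT * FROM t1;

-- On mysql3: release the lock so the queued transaction can be applied,
-- after which the new record becomes visible here as well
UNLOCK TABLES;
SELECT * FROM t1;
```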
Example 2 – BEFORE
In this example, we will illustrate how we can avoid inconsistent reads on a Secondary master:
As you can see, once we have set the session variable controlling the consistency, read operations on the table (the server is READ-ONLY) wait for the apply queue to be empty before returning the result set.
We could also notice that the wait time (timeout) for this read operation is very long (8 hours by default) and can be shortened. We used SET wait_timeout=10 to set it to 10 seconds.
When the timeout is reached, the following error is returned:
ERROR: 3797: Error while waiting for group transactions commit on group_replication_consistency= 'BEFORE'
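As a rough sketch of this example, the statements below would be run on mysql3 while a transaction from the Primary is stuck in the apply queue (for instance because another session holds a lock on t1). The table name t1 and the use of a separate session are assumptions made for illustration.

```sql
-- On mysql3, in a session other than the one holding the lock on t1

-- Ask for reads that wait until the apply queue has been processed
SET SESSION group_replication_consistency = 'BEFORE';

-- Shorten the maximum wait from the default 28800 seconds (8 hours)
SET SESSION wait_timeout = 10;

-- Blocks until the queued transactions are applied; if that does not
-- happen within about 10 seconds, the statement fails with error 3797
SELECT * FROM t1;
```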
Example 3 – AFTER
It’s also possible to return from commit on the writer only when all members have applied the change. Let’s check this in action:
This can be considered as synchronous writes, as the return from commit happens only when all members have applied it. However, you could also notice that with this consistency level, wait_timeout has no effect on the write. In fact, wait_timeout only affects read operations when the consistency level is different from EVENTUAL.
This means that this can lead to several issues if you lock a table for any reason. If the DBA needs to perform maintenance operations that require locking a table for a long time, it’s mandatory not to run queries in AFTER or BEFORE_AND_AFTER during such maintenance.
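A minimal sketch of a write in AFTER mode could look like this. The columns of t1 are hypothetical, and the lock on mysql3 is assumed to be a LOCK TABLES ... READ held by another session.

```sql
-- Assumption: t1 is a hypothetical table such as
--   CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(20));

-- On mysql1 (the Primary)
SET SESSION group_replication_consistency = 'AFTER';

-- With autocommit enabled, this statement returns only once the other
-- members have applied the change. If t1 is locked on mysql3, it keeps
-- waiting until the lock is released; wait_timeout does not cut it short.
INSERT INTO t1 (name) VALUES ('after row');
```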
Example 4 – Scope
In the following video, I just want to show you the “scope” of these “waits” for transactions that are in the applying queue.
We will lock t1 again, but on a Secondary master we will perform a SELECT from table t2. The first time we will keep the default value of group_replication_consistency (EVENTUAL), and the second time we will change the consistency level to BEFORE:
We could see that as soon as there are transactions in the apply queue, if you change the consistency level to one that includes BEFORE, the query needs to wait for the previous transactions in the queue to be applied, whether or not those events are related to the same table(s) or record(s). It doesn’t matter.
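The scope of the wait can be sketched with two sessions on mysql3. The columns of t1 are hypothetical, and LOCK TABLES ... READ is an assumption about how the table is locked; only the table names t1 and t2 come from the example.

```sql
-- On mysql3, session A: block the applier on t1
LOCK TABLES t1 READ;

-- On mysql1: queue a transaction that touches only t1
INSERT INTO t1 (name) VALUES ('queued row');

-- On mysql3, session B: with the default EVENTUAL level,
-- a read on the unrelated table t2 returns right away
SELECT COUNT(*) FROM t2;

-- On mysql3, session B: with BEFORE, the same read now waits for the
-- queued t1 transaction, although t2 is not involved in it
SET SESSION group_replication_consistency = 'BEFORE';
SELECT COUNT(*) FROM t2;

-- On mysql3, session A: releasing the lock lets the queue drain,
-- and the pending read in session B completes
UNLOCK TABLES;
```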
Example 5 – Observability
Of course it’s possible to check what’s going on and if queries are waiting for something.
BEFORE
When group_replication_consistency is set to BEFORE (or includes it), while a transaction is waiting for the applying queue to be committed, it’s possible to track those waiting transactions by running the following query:
SELECT * FROM information_schema.processlist
WHERE state='Executing hook on transaction begin.';
AFTER
When group_replication_consistency is set to AFTER (or includes it), while a transaction is waiting for the transaction to be committed on the other members too, it’s possible to track those waiting transactions by running the following query:
SELECT * FROM information_schema.processlist
WHERE state='waiting for handler commit';
It’s also possible to get even more information by joining the processlist and InnoDB Trx tables:
SELECT *, TIME_TO_SEC(TIMEDIFF(now(),trx_started)) lock_time_sec
FROM information_schema.innodb_trx JOIN information_schema.processlist
ON processlist.ID=innodb_trx.trx_mysql_thread_id
WHERE state='waiting for handler commit' ORDER BY trx_started\G
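To see how much work those waiting queries depend on, the size of the queues on each member can also be checked. This is a complementary sketch that is not part of the original examples; it assumes the replication_group_member_stats table from performance_schema in MySQL 8.0.

```sql
-- Per member: transactions pending conflict detection, and transactions
-- received from the group that are still waiting to be applied
SELECT MEMBER_ID,
       COUNT_TRANSACTIONS_IN_QUEUE,
       COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE
  FROM performance_schema.replication_group_member_stats;
```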
Conclusion
This consistency model is a wonderful feature, but it could become dangerous if abused without full control of your environment.
I would avoid setting anything that includes AFTER globally if you don’t completely control your environment. Table locks, DDLs, logical backups and snapshots could all delay the commits, and transactions could start piling up on the Primary Master. But if you control your environment, you now have complete freedom to choose the consistency you need on your MySQL InnoDB Cluster.
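One way to keep that control is to raise the consistency level only in the sessions that need it, rather than globally. The following is a sketch of that approach under that assumption, not a recommendation taken from the examples above.

```sql
-- Compare the global default with the value used by this session
SELECT @@global.group_replication_consistency  AS global_level,
       @@session.group_replication_consistency AS session_level;

-- Raise the level for this session only
SET SESSION group_replication_consistency = 'AFTER';

-- ... run the writes that need the stronger guarantee here ...

-- Reset the session value to the current global value
SET SESSION group_replication_consistency = DEFAULT;
```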