
From the initial release, one of the biggest complaints I had about Group Replication is that it allowed “stale” reads and there was no way to prevent them or to even know that you read “stale” data. That was a huge limitation. Thankfully, Oracle released features to control the consistency levels, and it was exactly a year ago! I don’t know about you, but I personally was confused by naming it group_replication_consistency=’AFTER’ or ‘BEFORE’.
So now I want to try to make sense of it and share my understanding (even if it is one year later).
Setup:
- 3 Node Percona Server for MySQL 8.0.18 running in Group Replication.
We will start with the default group_replication_consistency=’EVENTUAL’ and work from there. So let’s consider a very simple table:
CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `cnt` int(11) DEFAULT NULL, PRIMARY KEY (`id`) )
With over 10mln rows:
select count(*) from t1; +----------+ | count(*) | +----------+ | 10485760 | +----------+
And we will do a very simple action. On Node 1 we will update the table and on Node 2 we will read data.
Node 1:
UPDATE t1 SET cnt=11; Query OK, 10485760 rows affected (1 min 20.23 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
And immediately after that on the Node 2:
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 10 | +------+ 1 row in set (0.00 sec)
There are a few points to note:
- On Node 1 it took 1min 20 sec for the transaction to execute
- On Node 2 we got the result immediately, but we essentially got a “stale” read in that data was already updated, but we got an old version of it.
How can we get a better outcome:
group_replication_consistency=’BEFORE’;
Let’s look into the consistency level ‘BEFORE’. It says that a transaction on Node 2 will wait until the previously committed transaction on the Node 1 transaction is committed also on Node 2. So let’s change the set session to group_replication_consistency=’BEFORE’; on Node 2 and execute a similar update again.
Node 1 (still group_replication_consistency=’EVENTUAL’;):
UPDATE t1 SET cnt=12; Query OK, 10485760 rows affected (1 min 18.66 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
And after this, on Node 2 (with set session group_replication_consistency=’BEFORE’;):
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 12 | +------+ 1 row in set (1 min 11.45 sec)
So there are very notable changes:
- The transaction on Node 2 returned the correct result now.
- But now it took 1 min 11 sec to return it (instead of 0sec as in the previous case). Basically, the transaction waited on the transaction from Node 1 to be applied on Node 2
- The execution time on Node 1 did not change.
This mode allowed us to do exactly what we wanted – prevent stale reads. Great outcome! So what about group_replication_consistency=’AFTER’ ?
group_replication_consistency=’AFTER’;
This mode says that a transaction on Node 1 will not return OK until it makes sure that other nodes applied the transaction. To see it in action, we will put Node 1 into group_replication_consistency=’AFTER’ and Node 2 into group_replication_consistency=’EVENTUAL’;.
Node 1 (in group_replication_consistency=’AFTER’):
UPDATE t1 SET cnt=13; Query OK, 10485760 rows affected (3 min 0.46 sec) Rows matched: 10485760 Changed: 10485760 Warnings: 0
After that, Node 2 (in group_replication_consistency=’EVENTUAL’):
select cnt from t1 where id=10001; +------+ | cnt | +------+ | 13 | +------+ 1 row in set (0.00 sec)
Here the situation is different. Now the execution time on Node 1 doubled, as the transaction waits to be committed on all nodes and after that on Node 2 the execution is immediate.
This mode still avoids “stale” reads, but in this case, we shifted the wait time from Node 2 to Node 1, and this is how we can view the difference between ‘BEFORE’ and ‘AFTER’ consistency modes in Group Replication.
Both modes provide a consistent view, but:
- In ‘BEFORE’ mode: the readers on secondary nodes will be blocked, waiting on the moment when the consistent view is available, and
- In ‘AFTER’ mode the writers are blocked till other nodes get a consistent view.
So which mode to choose? Actually I think it is good to have an option here. You can choose, if you want, to put wait time on your readers or on your writers; the decision is up to how your application is designed.