
In this blog, I want to evaluate Group Replication scaling capabilities when handling several writers, that is, when read-write connections are established to multiple nodes (in this case, two nodes). This setup is identical to the one in my previous post, Evaluating Group Replication Scaling Capabilities in MySQL.
For this test, I deploy multiple bare metal servers, where each node and the client run on dedicated hardware, connected to each other by a 10Gb network.
I use a three-node Group Replication setup.
Hardware specifications:
System | Supermicro; SYS-F619P2-RTN; v0123456789 (Other)
Service Tag | S292592X0110239C
Platform | Linux
Release | Ubuntu 18.04.4 LTS (bionic)
Kernel | 5.3.0-42-generic
Architecture | CPU = 64-bit, OS = 64-bit
Threading | NPTL 2.27
SELinux | No SELinux detected
Virtualized | No virtualization detected
# Processor ##################################################
Processors | physical = 2, cores = 40, virtual = 80, hyperthreading = yes
Models | 80xIntel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
Caches | 80x28160 KB
# Memory #####################################################
Total | 187.6G
For the benchmark, I use a sysbench-tpcc 1000W dataset, prepared as:
./tpcc.lua --mysql-host=172.16.0.11 --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --time=300 --threads=64 --report-interval=1 --tables=10 --scale=100 --db-driver=mysql --use_fk=0 --force_pk=1 --trx_level=RC prepare
The configs, scripts, and raw results are available on our GitHub.
The workload is “in-memory,” that is, the data (about 100GB) should fit into the InnoDB buffer pool (also 100GB).
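The actual server configuration is in the GitHub repository linked below; purely as an illustration (the 100GB figure comes from the workload description above, and in MySQL 8.0 this variable is dynamic), the buffer pool could be sized like this:

-- Illustrative sketch only: make the InnoDB buffer pool large enough for the ~100GB dataset.
-- innodb_buffer_pool_size is dynamic in MySQL 8.0, so it can also be resized online;
-- the server rounds the value to a multiple of the chunk size.
SET GLOBAL innodb_buffer_pool_size = 100 * 1024 * 1024 * 1024;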
For the MySQL version, I use MySQL 8.0.19.
I use the following command line:
./tpcc.lua --mysql-host=172.16.0.12,172.16.0.13 --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --time=$time --threads=$i --report-interval=1 --tables=10 --scale=100 --trx_level=RR --db-driver=mysql --report_csv=yes --mysql-ignore-errors=3100,3101,1213,1180 run
This establishes an active connection to TWO separate nodes in Group Replication.
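Since writes go to more than one member, the group has to run in multi-primary mode (single-primary mode accepts writes on only one member). A minimal sanity check, using the MySQL 8.0 performance_schema table, is to confirm that all members report the PRIMARY role:

-- In multi-primary mode (group_replication_single_primary_mode=OFF),
-- every ONLINE member should show MEMBER_ROLE = 'PRIMARY'.
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;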
To make sure that transactions do not work with stale data, I use group_replication_consistency='BEFORE' on all nodes. See more on this in my previous post, Making Sense of MySQL Group Replication Consistency Levels.
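As a quick reference (a sketch, not the exact config from the repository), the setting can be applied at runtime on each node, since group_replication_consistency is a dynamic variable:

-- Run on every member; with BEFORE, a transaction waits until all preceding
-- transactions are applied locally before it executes, which avoids stale reads.
SET GLOBAL group_replication_consistency = 'BEFORE';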
Results
Let’s review the results. First, let’s take a look at how performance changes as the number of user threads increases from 1 to 256 on three nodes.
It is interesting to see how the throughput becomes unstable as we increase the number of threads. To see it in more detail, let’s draw the chart with an individual scale for each thread count:
As we can see, there is a lot of variation starting from 2 threads. Let’s check the 8- and 64-thread runs with 1-sec resolution.
There are multiple periods when throughput is 0.
These are the results at 1-sec intervals (one throughput value per second):
393, 610.99, 603.01, 3140.97, 4822.97, 4865.93, 2454.05, 1038, 939.99, 1340.02, 1549.01, 1561, 626, 0, 0, 66, 0, 0, 63, 0, 0, 69, 0, 0, 268, 369, 367, 331, 385, 356, 258, 0, 0
There is basically an 11-second-long stall during which the cluster could not handle transactions.
Conclusion
Unless I missed some tuning parameters that would improve the performance, it is hard to recommend a multi-writer setup based on Group Replication. Multi-primary mode is disabled by default in MySQL Group Replication, and based on these results, I would not recommend enabling it.