
As our Percona Kubernetes Operator for Percona XtraDB Cluster gains in popularity, I am getting questions about its performance and how to measure it properly. Sysbench is the most popular tool for database performance evaluation, so let’s review how we can use it with Percona XtraDB Cluster Operator.
Operator Setup
I will assume that you already have the operator running (if not, that is a topic for a different post). We have documentation on how to get it going, and we will start a three-node cluster using the following cr.yaml file:
apiVersion: pxc.percona.com/v1-3-0
kind: PerconaXtraDBCluster
metadata:
  name: cluster1
  finalizers:
    - delete-pxc-pods-in-order
spec:
  secretsName: my-cluster-secrets
  sslSecretName: my-cluster-ssl
  sslInternalSecretName: my-cluster-ssl-internal
  allowUnsafeConfigurations: false
  pxc:
    size: 3
    image: percona/percona-xtradb-cluster-operator:1.3.0-pxc
    resources:
      requests:
        memory: 1G
        cpu: 600m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    podDisruptionBudget:
      maxUnavailable: 1
    volumeSpec:
      emptyDir: {}
    gracePeriod: 600
  proxysql:
    enabled: false
    size: 3
    image: percona/percona-xtradb-cluster-operator:1.3.0-proxysql
    resources:
      requests:
        memory: 1G
        cpu: 600m
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 2Gi
    podDisruptionBudget:
      maxUnavailable: 1
    gracePeriod: 30
  pmm:
    enabled: false
    image: percona/percona-xtradb-cluster-operator:1.3.0-pmm
    serverHost: monitoring-service
    serverUser: pmm
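Assuming the manifest is saved as cr.yaml (the file name here is just what I use in this post), the cluster is started with a standard kubectl apply:

kubectl apply -f cr.yaml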
If we are successful, we will have three pods running.
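The listing below, with the IP and NODE columns, comes from the standard wide pod listing (nothing operator-specific is assumed here):

kubectl get pods -o wide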
NAME             READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
cluster1-pxc-0   1/1     Running   0          2m27s   192.168.139.65   node-3   <none>           <none>
cluster1-pxc-1   1/1     Running   0          95s     192.168.247.1    node-2   <none>           <none>
cluster1-pxc-2   1/1     Running   0          73s     192.168.84.130   node-1   <none>           <none>
It’s important to note that the allocated IP addresses are internal Pod addresses and are not routable outside of the Kubernetes cluster.
Sysbench on a Host External to Kubernetes
In this part, let’s assume we want to run the client (sysbench) on a separate host that is not part of the Kubernetes cluster. How do we do it? We need to expose one (or more) of the pods to the outside world, and for this we use a Kubernetes Service of type NodePort:
kubectl expose po cluster1-pxc-0 --type=NodePort

kubectl get svc
NAME             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)                                                        AGE
cluster1-pxc-0   NodePort   10.104.69.70   <none>        3306:30160/TCP,4444:31045/TCP,4567:30671/TCP,4568:30029/TCP   8s
So here we see that port 3306 (the MySQL port) is exposed as port 30160 on node-3 (the node where pod cluster1-pxc-0 is running). Please note that this traffic goes through the kube-proxy on node-3, which handles incoming connections on port 30160 and routes them to the cluster1-pxc-0 pod. Kube-proxy by itself introduces some networking overhead.
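For reference, kubectl expose creates a Service object behind the scenes; a manifest along these lines would be roughly equivalent. I show only the MySQL port here, and the selector assumes the standard per-pod label that StatefulSet-managed pods carry; in practice kubectl expose copies the pod’s own labels:

apiVersion: v1
kind: Service
metadata:
  name: cluster1-pxc-0
spec:
  type: NodePort
  selector:
    # StatefulSet pods get this per-pod label; kubectl expose uses the pod's labels as the selector
    statefulset.kubernetes.io/pod-name: cluster1-pxc-0
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
      # nodePort is assigned by Kubernetes when omitted (30160 in the output above)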
To find the IP address of Node-3:
kubectl get nodes -o wide
NAME     STATUS   ROLES    AGE   VERSION   INTERNAL-IP     EXTERNAL-IP
node-3   Ready    <none>   29m   v1.17.2   147.75.56.103   <none>
So now we can connect the dots: point the mysql client at IP 147.75.56.103, port 30160, and create the sbtest database, which we need in order to run sysbench:
mysql -h147.75.56.103 -P30160 -uroot -proot_password
> create database sbtest;
And now we can prepare the data for sysbench (never mind some of the parameters; we will come back to them later).
sysbench oltp_read_only --tables=10 --table_size=1000000 --mysql-host=147.75.56.103 --mysql-port=30160 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest prepare
Sysbench Running Inside Kubernetes
Running sysbench inside Kubernetes makes all these networking steps unnecessary and simplifies things considerably, while making one thing more complicated: how do you actually start a pod with sysbench?
To start, we need an image with sysbench, and conveniently we already have one on Docker Hub, available as perconalab/sysbench, so we will use that one. With the image, you can prepare a yaml file and start a pod with kubectl create -f sysbench.yaml (see the sketch at the end of this section), or, as I prefer, invoke it directly from the command line (which is a little bit elaborate):
kubectl run -it --rm sysbench-client --image=perconalab/sysbench:latest --restart=Never -- bash
This way, Kubernetes will schedule the sysbench-client pod on any available node, which may not be what we want. To schedule sysbench-client on a specific node, we can use:
kubectl run -it --rm sysbench-client --image=perconalab/sysbench:latest --restart=Never --overrides='{ "apiVersion": "v1", "spec": { "nodeSelector": { "kubernetes.io/hostname": "node-3" } } }' -- bash
This will start sysbench-client on node-3. Now, from the pod’s command line, we can access mysql using just the cluster1-pxc-0 hostname:
sysbench oltp_read_only --tables=10 --table_size=1000000 --mysql-host=cluster1-pxc-0 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest prepare
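For completeness, a minimal sysbench.yaml (as mentioned earlier) might look something like the following; the pod name, image, and nodeSelector mirror the command-line examples above, while the sleep command is my own choice to keep the pod alive so you can exec into it (assuming the image provides GNU sleep):

apiVersion: v1
kind: Pod
metadata:
  name: sysbench-client
spec:
  restartPolicy: Never
  nodeSelector:
    kubernetes.io/hostname: node-3
  containers:
    - name: sysbench-client
      image: perconalab/sysbench:latest
      # Keep the container running so we can exec into it and run sysbench interactively
      command: ["sleep", "infinity"]

After kubectl create -f sysbench.yaml, something like kubectl exec -it sysbench-client -- bash drops you into the same shell as the kubectl run examples.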
A Quick Intro to Sysbench
Although we have covered sysbench multiple times, I was asked to provide a basic intro for different scenarios, so I would like to review some basic options for sysbench.
Prepare Data
Before running a benchmark, we need to prepare the data. From our previous example:
sysbench oltp_read_only --tables=10 --table_size=1000000 --mysql-host=147.75.56.103 --mysql-port=30160 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest prepare
This will create ten tables with 1mln rows each; each table is about 250MB in size, for a total of 2.5GB of data. This gives us an idea of what knobs we can turn to generate more or less data.
If we want, say, 25GB of data, we can use either 100 tables with 1mln rows each or ten tables with 10mln rows each. For 50GB of data, we can use 200 tables with 1mln rows, ten tables with 20mln rows, or any combination of tables and rows that gives 200mln rows in total.
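For example, to prepare roughly 25GB using the first combination above (100 tables of 1mln rows each), only --tables changes relative to the earlier prepare command; the host, port, and credentials are the same placeholders used throughout this post:

sysbench oltp_read_only --tables=100 --table_size=1000000 --mysql-host=147.75.56.103 --mysql-port=30160 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest prepare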
Running Benchmark
Sysbench’s OLTP scenarios provide the oltp_read_only and oltp_read_write scripts, and as you can guess from the names, oltp_read_only generates only SELECT queries, while oltp_read_write generates SELECT, UPDATE, INSERT, and DELETE queries.
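Sysbench bundles other OLTP scripts as well; depending on how it was installed, the Lua scripts usually live under /usr/share/sysbench/ (the exact path may differ in your image), which you can check with:

ls /usr/share/sysbench/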
Examples:
Read-only
sysbench oltp_read_only --tables=10 --table_size=1000000 --mysql-host=147.75.198.7 --mysql-port=32385 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest --time=300 --threads=16 --report-interval=1 run
Read-write
sysbench oltp_read_write --tables=10 --table_size=1000000 --mysql-host=147.75.198.7 --mysql-port=32385 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest --time=300 --threads=16 --report-interval=1 run
Parameters to Play
From our example, you can see some parameters you can play with:
- --threads – how many user threads will connect to the database and generate queries; --threads=1 will generate a single-threaded load (see the shell sketch after this list for scanning several thread counts).
- --time – how long to run the benchmark. It may vary from a very short period (60 sec or so) to very long (hours and hours) if we want to see the stability of long runs.
- --report-interval – how often to report in-progress results. I often use one second to see the variance in performance with one-second resolution.
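To see how throughput scales with concurrency (as mentioned in the --threads item above), one common pattern is to repeat a short run over several thread counts; this is just a shell sketch reusing the read-only example, and the 60-second duration and thread counts are my own choices:

for t in 1 2 4 8 16 32; do
  # Run the same read-only workload with an increasing number of threads
  sysbench oltp_read_only --tables=10 --table_size=1000000 --mysql-host=147.75.198.7 --mysql-port=32385 --mysql-user=root --mysql-password=root_password --mysql-db=sbtest --time=60 --threads=$t --report-interval=1 run
done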
Results Interpretation
Running sysbench with one of the examples above, you will see output like the following:
[ 289s ] thds: 16 tps: 1872.97 qps: 37476.47 (r/w/o: 26237.63/6623.91/4614.93) lat (ms,95%): 10.09 err/s: 0.00 reconn/s: 0.00
[ 290s ] thds: 16 tps: 1913.93 qps: 38289.67 (r/w/o: 26808.07/6797.76/4683.84) lat (ms,95%): 9.73 err/s: 0.00 reconn/s: 0.00
[ 291s ] thds: 16 tps: 1562.75 qps: 31250.00 (r/w/o: 21874.50/5545.11/3830.39) lat (ms,95%): 23.95 err/s: 0.00 reconn/s: 0.00
[ 292s ] thds: 16 tps: 1817.99 qps: 36399.89 (r/w/o: 25473.92/6422.98/4502.99) lat (ms,95%): 11.24 err/s: 0.00 reconn/s: 0.00
[ 293s ] thds: 16 tps: 1632.31 qps: 32609.29 (r/w/o: 22832.40/5761.11/4015.77) lat (ms,95%): 24.38 err/s: 0.00 reconn/s: 0.00
[ 294s ] thds: 16 tps: 1917.99 qps: 38368.81 (r/w/o: 26857.87/6779.97/4730.98) lat (ms,95%): 9.56 err/s: 0.00 reconn/s: 0.00
[ 295s ] thds: 16 tps: 1744.97 qps: 34917.38 (r/w/o: 24441.56/6188.89/4286.92) lat (ms,95%): 13.46 err/s: 0.00 reconn/s: 0.00
[ 296s ] thds: 16 tps: 1913.02 qps: 38279.50 (r/w/o: 26790.35/6746.09/4743.06) lat (ms,95%): 9.91 err/s: 0.00 reconn/s: 0.00
[ 297s ] thds: 16 tps: 1723.01 qps: 34408.22 (r/w/o: 24086.16/6090.04/4232.03) lat (ms,95%): 15.83 err/s: 0.00 reconn/s: 0.00
[ 298s ] thds: 16 tps: 1725.63 qps: 34530.62 (r/w/o: 24173.84/6105.70/4251.09) lat (ms,95%): 16.41 err/s: 0.00 reconn/s: 0.00
[ 299s ] thds: 16 tps: 1895.97 qps: 37925.41 (r/w/o: 26552.59/6658.90/4713.93) lat (ms,95%): 9.73 err/s: 0.00 reconn/s: 0.00
[ 300s ] thds: 16 tps: 1866.92 qps: 37358.45 (r/w/o: 26153.91/6589.73/4614.81) lat (ms,95%): 9.91 err/s: 0.00 reconn/s: 0.00
SQL statistics:
    queries performed:
        read:                            7307986
        write:                           1695485
        other:                           1436509
        total:                           10439980
    transactions:                        521999 (1739.67 per sec.)
    queries:                             10439980 (34793.42 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.0510s
    total number of events:              521999

Latency (ms):
         min:                                    3.52
         avg:                                    9.19
         max:                                  463.32
         95th percentile:                       15.83
         sum:                              4799073.43

Threads fairness:
    events (avg/stddev):           32624.9375/864.41
    execution time (avg/stddev):   299.9421/0.02
The first part is the interval reports (every second, as we asked). There we can see how many threads are running, and the most interesting parts are the “tps” and “lat” columns, which report throughput and latency, respectively, for the given period of time.
In general, when we compare different experiments, we want to see higher throughput and lower latency.
The last part is the total statistics. The lines we usually pay attention to are:
transactions: 521999 (1739.67 per sec.)
And
Latency (ms):
More transactions and lower latency typically correspond to better performance.