Channel: Planet MySQL

Galera Cluster for MySQL 8.0.36 released


Codership is pleased to announce a new Generally Available (GA) release of the multi-master Galera Cluster for MySQL 8.0, consisting of MySQL-wsrep 8.0.36 (release notes, download), with Galera replication library 4.18 (release notes, download) implementing wsrep API version 26. This release incorporates all changes to MySQL 8.0.36, adding a synchronous option for your MySQL High Availability solutions.

There have been some notable changes in the Galera replication library 4.18. We fixed bugs around garbd: hangs caused by uncaught exceptions in the GCS layer now end in a graceful exit, and a crash that could occur during graceful node shutdowns when SSL is used, leaving the cluster non-Primary, has also been fixed. The socket_ssl_compression option is deprecated; it is no longer applied, and users will receive a warning if it is explicitly set. We do not consider this option functional. Commit cut tracking on node leave has been fixed, although the GCS protocol version had to be bumped for backwards compatibility. We now ensure that the gcomm join process runs within the gcomm service thread.

In MySQL 8.0.36-26.17, a notable fix ensures the --wsrep-recover option can accurately recover the GTID from log files that contain non-text bytes. For CLONE SST, improvements include reading port settings from my.cnf if not explicitly set in wsrep_sst_address, a switch to caching_sha2_password for user creation during SST (moving away from mysql_native_password), and enhanced diagnostic messages, especially in SSL contexts. Diagnostic capabilities have been expanded to include transaction sequence numbers and source IDs for ignored errors, alongside more precise logging for row-based replication buffer file creation and doublewrite recovery, contingent on wsrep_debug settings.

In an effort to standardize Total Order Isolation (TOI) error voting, the cluster now relies solely on MySQL error codes. This change aims to mitigate inconsistencies stemming from locale differences and non-deterministic execution paths, though it necessitates a bump in the wsrep protocol version to ensure backward compatibility, and prevent cluster splits during upgrades. Moreover, a combination of native deadlock and BF abort scenarios in stored procedures, which previously resulted in assertion failures, has been addressed by refining the error handling process to ignore wsrep BF abort error for sub-statements until after their execution. Fixes were implemented for autocommit in SELECT FOR UPDATE queries to prevent BF abort-induced inconsistencies and assertions within InnoDB, alongside improvements in error handling for such queries within transactions.

Please download the latest software and update your Galera Clusters! We continue to provide repositories for popular Linux distributions, and we encourage you to use them. Contact us for more information about what Galera Cluster Enterprise Edition can do for you.


How to Remove Duplicate Rows in MySQL


Duplicates pose an ongoing risk to data consistency and overall database efficiency. This article explores the issue of duplicate records, including their origins, the effects they have on databases, and strategies for swiftly detecting and permanently removing duplicates.

The post How to Remove Duplicate Rows in MySQL appeared first on Devart Blog.

MySQL Shorts - Episode #59 is Released

Episode #59 of MySQL Shorts is now available!

Galera Manager March 2024 Release now includes UI improvements and a SSH console tab


Codership is pleased to announce a new release of Galera Manager. This brings the installer to version 1.13.0 (you can check this by typing ./gm-installer version, which will report gm-installer version 1.13.0 (linux/amd64)) and the Galera Manager GUI itself to 1.8.4. Users will notice many usability improvements, and multiple fixes for issues filed at the galera-manager-support issue tracker.

The biggest user-facing item in this release is a “jobs” tracker. It can be a flat list or a hierarchical view, and you will notice that it covers not just a cluster-wide view, but also individual nodes. Naturally, Jobs is also listed in the left-hand menu; this shows jobs that do not belong to any cluster.

Another feature users will notice is that the SSH console is now a tab in the menu right after Jobs. You can still pop it out, but it is a little more integrated within the UI.

In addition to that, unsupported Ubuntu 18.04 has been removed, and there is full support for Debian 12.

A few notes from the internal changelog that is published in the application (and also on the Galera Manager Changelog):

  • 1.8.4
  • feature: Start using infrastructure profiles
  • misc: Removed unnecessary dependency for cluster charts creation
  • feature: Faster user feedback for started jobs
  • feature: Added terminal tab
  • feature: add full support for Debian 12
  • fix: remove unsupported Ubuntu 18.04
  • feature: Display total job duration
  • feature: (UI) show job parent-child dependencies
  • feature: (UI) display jobs not belonging to any cluster

Please download Galera Manager and we look forward to your feedback. Note that we also plan to have a webinar on the latest release of Galera Manager, so keep your eyes peeled on the blog, and on our mailing list!

How to Connect to a MySQL Database on a DigitalOcean Droplet Using dbForge Studio

How to Install and Use MySQL 8 on Ubuntu 22.04

MySQL is a free, open-source relational database management system developed by Oracle. This tutorial will show you how to install MySQL 8 on an Ubuntu 22.04 server.

How Missing Primary Keys Break Your Galera Cluster

Any Galera documentation about limitations will state that tables must have primary keys. It states that DELETEs are unsupported and that other DML statements could have unwanted side effects, such as inconsistent ordering: rows can appear in a different order on different nodes in your cluster. If you are not actively relying on row order, this could seem acceptable. Deletes […]

MySQL HeatWave Faster Point-in-time Recovery


Identifying and profiling problematic MySQL queries

MySQL has built-in functionality for collecting statistics on and profiling your MySQL queries. Learn how to leverage these features to identify problems.

Learning SQL Exercise


I’ve been using Alan Beaulieu’s Learning SQL to teach my SQL Development class with MySQL 8. It’s a great book overall, but Chapter 12 lacks a complete exercise. Here’s all that the author provides to the reader, which is inadequate for most readers to work through the concept of a transaction.

Exercise 12-1

Generate a unit of work to transfer $50 from account 123 to account 789. You will need to insert two rows into the transaction table and update two rows in the account table. Use the following table definitions/data:

                      Account:
account_id     avail_balance    last_activity_date
-----------    --------------   ------------------
       123               450    2019-07-10 20:53:27
       789               125    2019-06-22 15:18:35

                      Transaction:
txn_id    txn_date      account_id    txn_type_cd    amount
------    ----------    ----------    -----------    ------
  1001    2019-05-15           123    C                 500
  1002    2019-06-01           789    C                  75

Use txn_type_cd = ‘C’ to indicate a credit (addition), and use txn_type_cd = ‘D’ to indicate a debit (subtraction).

New Exercise 12-1

The problem with the exercise description is that the sakila database, which is used for most of the book, doesn’t have transaction or account tables. Nor are there any instructions about general accounting practices or principles. These missing components make it hard for students to understand how to build the transaction.

The first thing the exercise’s problem definition should specify is how to create the account and transaction tables, like:

  1. Create the account table like this, with an initial auto-increment value of 1001:

    -- +--------------------+--------------+------+-----+---------+----------------+
    -- | Field              | Type         | Null | Key | Default | Extra          |
    -- +--------------------+--------------+------+-----+---------+----------------+
    -- | account_id         | int unsigned | NO   | PRI | NULL    | auto_increment |
    -- | avail_balance      | double       | NO   |     | NULL    |                |
    -- | last_activity_date | datetime     | NO   |     | NULL    |                |
    -- +--------------------+--------------+------+-----+---------+----------------+
    
  2. Create the transaction table like this, with an initial auto-increment value of 1001:

    -- +----------------+--------------+------+-----+---------+----------------+
    -- | Field          | Type         | Null | Key | Default | Extra          |
    -- +----------------+--------------+------+-----+---------+----------------+
    -- | txn_id         | int unsigned | NO   | PRI | NULL    | auto_increment |
    -- | txn_date       | datetime     | YES  |     | NULL    |                |
    -- | account_id     | int unsigned | YES  |     | NULL    |                |
    -- | txn_type_cd    | varchar(1)   | NO   |     | NULL    |                |
    -- | amount         | double       | YES  |     | NULL    |                |
    -- +----------------+--------------+------+-----+---------+----------------+
    

Checking accounts are liabilities to banks, which means you credit a liability account to increase its value and debit a liability account to decrease its value. You should insert the initial rows into the account table with a zero avail_balance. Then, make these initial deposits:

  1. Credit the transaction table with $500 for an account_id value of 123 and a txn_type_cd value of ‘C’.
  2. Credit the transaction table with $75 for an account_id value of 789 and a txn_type_cd value of ‘C’.

Write an update statement to set the avail_balance column values equal to the aggregate sum of the transaction table’s rows, which treats credit transactions (those with a ‘C’ in the txn_type_cd column) as positive numbers and those with a ‘D’ in the txn_type_cd column as negative numbers.

Generate a unit of work to transfer $50 from account 123 to account 789. You will need to insert two rows into the transaction table and update two rows in the account table, as follows:

  1. Debit the transaction table with $50 for an account_id value of 123 and a txn_type_cd value of ‘D’.
  2. Credit the transaction table with $50 for an account_id value of 789 and a txn_type_cd value of ‘C’.

Apply the prior update statement to set the avail_balance column values equal to the aggregate sum of the transaction table’s rows, which treats credit transactions (those with a ‘C’ in the txn_type_cd column) as positive numbers and those with a ‘D’ in the txn_type_cd column as negative numbers.

Here’s the solution to the problem:

-- +--------------------+--------------+------+-----+---------+----------------+
-- | Field              | Type         | Null | Key | Default | Extra          |
-- +--------------------+--------------+------+-----+---------+----------------+
-- | account_id         | int unsigned | NO   | PRI | NULL    | auto_increment |
-- | avail_balance      | double       | NO   |     | NULL    |                |
-- | last_activity_date | datetime     | NO   |     | NULL    |                |
-- +--------------------+--------------+------+-----+---------+----------------+

DROP TABLE IF EXISTS account, transaction;

CREATE TABLE account
( account_id          int unsigned PRIMARY KEY AUTO_INCREMENT
, avail_balance       double       NOT NULL
, last_activity_date  datetime     NOT NULL )
 ENGINE=InnoDB 
 AUTO_INCREMENT=1001 
 DEFAULT CHARSET=utf8mb4 
 COLLATE=utf8mb4_0900_ai_ci;

-- +----------------+--------------+------+-----+---------+----------------+
-- | Field          | Type         | Null | Key | Default | Extra          |
-- +----------------+--------------+------+-----+---------+----------------+
-- | txn_id         | int unsigned | NO   | PRI | NULL    | auto_increment |
-- | txn_date       | datetime     | YES  |     | NULL    |                |
-- | account_id     | int unsigned | YES  |     | NULL    |                |
-- | txn_type_cd    | varchar(1)   | NO   |     | NULL    |                |
-- | amount         | double       | YES  |     | NULL    |                |
-- +----------------+--------------+------+-----+---------+----------------+

CREATE TABLE transaction
( txn_id         int unsigned  PRIMARY KEY AUTO_INCREMENT
, txn_date       datetime      NOT NULL
, account_id     int unsigned  NOT NULL
, txn_type_cd    varchar(1)
, amount         double
, CONSTRAINT transaction_fk1 FOREIGN KEY (account_id)
 REFERENCES account(account_id))
 ENGINE=InnoDB
 AUTO_INCREMENT=1001
 DEFAULT CHARSET=utf8mb4
 COLLATE=utf8mb4_0900_ai_ci;

-- Insert initial accounts.
INSERT INTO account
( account_id
, avail_balance
, last_activity_date )
VALUES
( 123
, 0
,'2019-07-10 20:53:27');

INSERT INTO account
( account_id
, avail_balance
, last_activity_date )
VALUES
( 789
, 0
,'2019-06-22 15:18:35');

-- Insert initial deposits.
INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( CAST(NOW() AS DATE)
, 123
,'C'
, 500 );

INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( CAST(NOW() AS DATE)
, 789
,'C'
, 75 );

UPDATE account a
SET    a.avail_balance = 
 (SELECT  SUM(
            CASE
              WHEN t.txn_type_cd = 'C' THEN amount
              WHEN t.txn_type_cd = 'D' THEN amount * -1
            END) AS amount
 FROM     transaction t
 WHERE    t.account_id = a.account_id
 AND      t.account_id IN (123,789)
 GROUP BY t.account_id);

SELECT * FROM account;
SELECT * FROM transaction;

-- Transfer $50 from account 123 to account 789.
INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( CAST(NOW() AS DATE)
, 123
,'D'
, 50 );

INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( CAST(NOW() AS DATE)
, 789
,'C'
, 50 );

UPDATE account a
SET    a.avail_balance = 
 (SELECT  SUM(
            CASE
              WHEN t.txn_type_cd = 'C' THEN amount
              WHEN t.txn_type_cd = 'D' THEN amount * -1
            END) AS amount
 FROM     transaction t
 WHERE    t.account_id = a.account_id
 AND      t.account_id IN (123,789)
 GROUP BY t.account_id);

SELECT * FROM account;
SELECT * FROM transaction;

The results are:

+------------+---------------+---------------------+
| account_id | avail_balance | last_activity_date  |
+------------+---------------+---------------------+
|        123 |           450 | 2019-07-10 20:53:27 |
|        789 |           125 | 2019-06-22 15:18:35 |
+------------+---------------+---------------------+
2 rows in set (0.00 sec)

+--------+---------------------+------------+-------------+--------+
| txn_id | txn_date            | account_id | txn_type_cd | amount |
+--------+---------------------+------------+-------------+--------+
|   1001 | 2024-04-01 00:00:00 |        123 | C           |    500 |
|   1002 | 2024-04-01 00:00:00 |        789 | C           |     75 |
|   1003 | 2024-04-01 00:00:00 |        123 | D           |     50 |
|   1004 | 2024-04-01 00:00:00 |        789 | C           |     50 |
+--------+---------------------+------------+-------------+--------+
4 rows in set (0.00 sec)
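
Because Chapter 12 is about transactions, it also helps to see the transfer written as an explicit unit of work. Here is a minimal sketch, wrapping the two inserts and two updates in START TRANSACTION ... COMMIT, with the balance changes applied directly rather than recomputed from the aggregate:

-- Transfer $50 from account 123 to account 789 as a single unit of work.
START TRANSACTION;

INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( NOW()
, 123
,'D'
, 50 );

INSERT INTO transaction
( txn_date
, account_id
, txn_type_cd
, amount )
VALUES
( NOW()
, 789
,'C'
, 50 );

UPDATE account
SET    avail_balance = avail_balance - 50
,      last_activity_date = NOW()
WHERE  account_id = 123;

UPDATE account
SET    avail_balance = avail_balance + 50
,      last_activity_date = NOW()
WHERE  account_id = 789;

-- Make the transfer permanent; issue ROLLBACK instead if any step fails.
COMMIT;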

As always, I hope this helps those trying to understand how SQL can solve problems that would otherwise be coded in external imperative languages like Python.

caching_sha2_password Support for ProxySQL Is Finally Available!

ProxySQL recently released version 2.6.0, and going through the release notes, I focused on the following: Added support for caching_sha2_password! This is great news for the community! The caching_sha2_password authentication method for frontend connections is now available. This has been a long-awaited feature … Why? Because in MySQL 8, caching_sha2_password has been the default authentication method. Starting from MySQL […]

Database Management Now Monitors MySQL HeatWave Clusters and Lakehouse

The initial release focused on MySQL OLTP observability. Building upon the existing foundation, we are pleased to announce an extension of its monitoring capabilities encompassing MySQL HeatWave clusters and Lakehouse-enabled DB Systems. With this enhancement, the Database Management service empowers data professionals with deeper insights and provides proactive monitoring opportunities, ensuring robust performance and informed decision-making.

Using the Oracle Cloud TypeScript SDK Part 6 - Updating a MySQL HeatWave Backup

Oracle offers a variety of SDKs for interacting with Oracle Cloud Infrastructure resources. In this post we discuss how to update some properties of a backup of a MySQL HeatWave instance.

MySQL NDB Cluster replication: Introduction

Learn more about the basic single-channel replication setup as well as the high-end active-active circular or merge replication setups. In this series we will describe how it works, both at a high level and at a detailed technical level.

dbdeployer Tutorial on Mac

Not very long ago (well, maybe a little longer, this post has been in draft for more than a year), in the span of less than 5 days, I suggested to several colleagues that they reproduce a problem they had with MySQL in a "more simple environment".  Such a simpler environment can be created with dbdeployer.  dbdeployer is a tool to create "MySQL Sandboxes" on a Mac (laptop or desktop) or on Linux (vm,

EMEA MySQL Meetups in April 2024!

Japan: Special MySQL Meetup on April 11!

Creating a MySQL HeatWave Instance With the OCI CLI

The Oracle Cloud Infrastructure (OCI) command line interface (CLI) allows users to manage OCI resources. In this post, we will discuss how to use the OCI CLI to retrieve reference data and create a new MySQL HeatWave instance.

Summer 2023: Fuzzing Vitess at PlanetScale

My name is Arvind Murty, and from May to July of 2023, I worked on Vitess via an internship with PlanetScale. I was first introduced to Vitess when I was in high school as a potential open-source project for me to work on. I had been interested in working on one because they’re a relatively easy way to get some real-world experience in large-scale software development. Vitess seemed like a good place to start, so I started contributing, mostly on internal cleanup.

Ask Me Anything About MySQL 5.7 to 8.0 Post EOL

We met with Vinicius Grippa, a Senior Support Engineer at Percona. He is also active in the open source community and was recognized as a MySQL Rock Star in 2023. In the previous interview with Vinicius, we discussed the upcoming End of Life (EOL) for MySQL 5.7. Now that MySQL 5.7 has reached EOL, MySQL 8 […]

Top 5 Reasons To Attend MySQL and HeatWave Summit 2024

Join us at MySQL and HeatWave Summit 2024, on May 1, 2024 in Redwood Shores, California. Free, in-person event, open to all. Learn about Generative AI and Machine Learning. Get one-on-one help and feature demos. Hear from leading companies like NVIDIA, Meta, LY Corp, CERN, and Panasonic Avionics. And network with the MySQL Community.

Listing and Updating MySQL HeatWave Instances with the OCI CLI

The Oracle Cloud Infrastructure (OCI) command line interface (CLI) allows users to manage OCI resources. In this post, we will discuss how to use the OCI CLI to retrieve a list of MySQL HeatWave instances, retrieve more detailed information for a specific instance, and update properties of that instance.

Percona Monitoring and Management Setup on Kubernetes with NGINX Ingress for External Databases

It’s a common scenario to have a Percona Monitoring and Management (PMM) server running on Kubernetes and also desire to monitor databases that are running outside the Kubernetes cluster. The Ingress NGINX Controller is one of the most popular choices for managing the inbound traffic to K8s. It acts as a reverse proxy and load […]

17 Years of Insecure MySQL Client !

Yes, this is a catchy title, but it is true, and it got you reading this post :-).  Another title could have been “Please load this mysql-dump: what could go wrong ?”.  As you guessed, loading a dump is not a risk-free operation.  In this post, I explain how the insecure MySQL client makes this operation risky and how to protect against it. And if you think this post is not

MySQL at OSC (Open Source Conference) Japan 2023 - Recap.

Profiling memory usage in MySQL

Learn how to visualize the memory usage of a MySQL connection

Backing up and Restoring a MySQL HeatWave Instance with the OCI CLI

The Oracle Cloud Infrastructure (OCI) command line interface (CLI) allows users to manage OCI resources. In this post, we will discuss how to use the OCI CLI to create a backup of a MySQL HeatWave instance and then restore that backup to a new HeatWave instance.

Creating a MySQL HeatWave Read Replica with the OCI CLI

The Oracle Cloud Infrastructure (OCI) command line interface (CLI) allows users to manage OCI resources. In this post, we will discuss how to use the OCI CLI to create a read replica of a MySQL HeatWave instance.

A Guide to Better Understanding MySQL Charset Levels

We usually receive and see some questions regarding the charset levels in MySQL, especially after the deprecation of utf8mb3 and the new default utf8mb4. If you understand how the charset works on MySQL but have some questions regarding this change, please check out Migrating to utf8mb4: Things to Consider by Sveta Smirnova. Some of the questions […]

MySQL Shorts - Episode #60 is Released

Episode #60 of MySQL Shorts is now available!

How to Use MySQL Performance Schema to Troubleshoot and Resolve Server Issues

This blog was originally published in January 2023 and was updated in April 2024. Recently I was working with a customer wherein our focus was to carry out a performance audit of their multiple MySQL database nodes. We started looking into the stats of the performance schema. While working, the customer raised two interesting questions: how […]

How to Find Duplicate, Unused, and Invisible Indexes in MySQL

This blog was originally published in January 2023 and was updated in April 2024. A MySQL index is a data structure used to optimize the performance of database queries at the expense of additional writes and storage space to keep the index data structure up to date. It is used to quickly locate data without having to […]

MySQL 101: How to Find and Tune a Slow MySQL Query

This blog was originally published in June 2020 and was updated in April 2024. One of the most common support tickets we get at Percona is the infamous “database is running slower” ticket.  While this can be caused by a multitude of factors, it is more often than not caused by a bad or slow MySQL […]

Creating a MySQL HeatWave Configuration with the OCI CLI

The Oracle Cloud Infrastructure (OCI) command line interface (CLI) allows users to manage OCI resources. In this post, we will discuss how to use the OCI CLI to create a MySQL HeatWave configuration that can be specified when creating a new instance.

How to Add, Show, and Drop MySQL Foreign Keys


A key is typically defined as a column or a group of columns that are used to uniquely locate table records in relational databases (including MySQL, of course). And now that we've covered MySQL primary keys on our blog, it's time to give you a similarly handy guide on foreign keys.

The post How to Add, Show, and Drop MySQL Foreign Keys appeared first on Devart Blog.

MySQL NDB Cluster replication: Single-channel replication

This is the second article in our blog series about MySQL NDB Cluster replication. It describes the basic concepts and functional parts of single-channel replication, which is used for replicating data between clusters.

Did MyDumper LIKE Triggers?

Yes, but now it likes them more, and here is why. Using the LIKE clause to filter triggers or views from a specific table is common. However, it can play a trick on you, especially if you don’t get to see the output (i.e., in a non-interactive session). Let’s take a look at a simple example […]

When COMMIT Is the Slowest Query


When COMMIT is the slowest query, it means your storage is slow. Let’s look at an example.

MySQL: Latency and IOPS


When talking about storage performance, we often hear three terms that are important in describing it. They are

  • bandwidth
  • latency
  • I/O operations per second (IOPS)

When you talk to storage vendors and cloud providers, they will gladly provide you with numbers on bandwidth and IOPS, but latency numbers will be hard to come by. To evaluate storage, especially for MySQL, you need just one number, and that is the 99.5th percentile commit latency for a random 16 KB disk write. But let’s start at the beginning.

Bandwidth

The bandwidth of an IO subsystem is the amount of data it can read or write per second. For good benchmarking numbers, this is usually measured while doing large sequential writes. A benchmark would, for example, write megabyte-sized blocks sequentially to a benchmark file, and then later read them back.

Many disk subsystems have caches, and caches have capacities. The benchmark will exercise the cache and measure cache speed, unless you manage to disable the caches, or deplete them by creating a workload large enough to exceed the cache limit. For example, on a server with 128 GB of memory, your benchmark will either have to disable these caches, or use a file of 384 GB or larger to actually measure storage system performance.

Writes can be forced to disk at the filesystem level by means of fdatasync(), O_DIRECT or O_SYNC, depending on what you want to measure and what your operating system and drivers provide. But if the storage itself provides caches, it may be that these are persistent (such as battery-backed memory), and while that counts as persistent, it may not be the storage performance we want to measure. It could be that this is what we get until the cache is full, and then we get different numbers from the next tier of storage. Depending on what we need for our evaluation, we might need to deplete or disable this cache layer to get numbers from the tier we care about.

Strangely, read benchmarking is harder – we can often easily bypass write caches by asking for write-through behavior, but if data is in memory, our read request will never reach the actual storage under test. Instead it will be served by some higher, faster tier of the storage subsystem. That’s good in normal operations, but bad if we want to find out worst-case performance guarantees.

From an uncached enterprise hard disk you would expect upwards of 200 MB/sec streaming write bandwidth; from an uncached enterprise flash drive, about twice that, around 400 MB/sec. For short bursts or from specialized storage you can get up to 10x that, but sustained rates are often disappointingly low.

By bundling storage in a striped RAID-0 variant, you can get a lot more. By providing an abundance of battery-backed memory, you can get speeds that saturate your local bus speed or whatever network is in the path.

Latency

Write and read latency is the amount of time it takes to write or read a piece of data. For a database application developer, write latency usually amounts to “how long do I have to wait for a reasonably sized COMMIT to return”.

To better understand database write latency, we need to understand database writes. The long version of that is explained in MySQL Transactions - the physical side.

The short version is that MySQL prepares a transaction in memory in the log buffer, and on commit writes the log buffer to the redo log. This is a sequentially structured, on-disk ring buffer that accepts the write. A commit will return once data is in there (plus replication shenanigans, if applicable). For a fast database, it is mandatory for the redo log to be on fast, persistent storage. Writes to the database happen in relatively tiny blocks, much smaller than actual database pages.
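
Whether that redo log write actually reaches disk at every commit is governed by a handful of InnoDB settings. A quick, read-only way to check them (this is just the usual short list, not an exhaustive one):

-- 1 (the default) flushes the redo log to disk at every commit;
-- 0 and 2 trade persistence guarantees for lower commit latency.
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Size of the in-memory log buffer that transactions are prepared in.
SHOW VARIABLES LIKE 'innodb_log_buffer_size';

-- How InnoDB asks the operating system to write and flush its files.
SHOW VARIABLES LIKE 'innodb_flush_method';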

Later, out of the critical path, the database writes out the actual modified data pages as 16 KB random writes. This is important, but it happens a long time after the commit in a batched fashion, often minutes after the actual commit.

Still, a filesystem benchmark that uses randomly written 16 KB pages usually characterizes the behavior of a storage subsystem well enough that good predictions of its real-world performance can be made. So run a fio benchmark that writes single-threadedly to a very large test file, doing random writes with a 16 KB block size, and look for the clat numbers and the latency histogram in the output.
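
A minimal fio invocation along those lines might look like this (file path, size, and runtime are illustrative; pick a test file larger than all the caches involved, and adjust the sync/direct flags depending on what you want to measure, as discussed in the bandwidth section):

# Single-threaded 16 KB random writes, syncing every write like a commit would.
fio --name=commit-latency \
    --filename=/data/fio/testfile \
    --size=384G \
    --rw=randwrite \
    --bs=16k \
    --ioengine=libaio \
    --direct=1 \
    --fdatasync=1 \
    --numjobs=1 \
    --runtime=300 \
    --time_based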

The relevant part of the fio output might look like this:

...
  write: IOPS=2454, BW=38.4MiB/s (40.2MB/s)(11.2GiB/300001msec)
...
    clat percentiles (usec):
     |  1.00th=[  202],  5.00th=[  237], 10.00th=[  249], 20.00th=[  269],
     | 30.00th=[  285], 40.00th=[  302], 50.00th=[  314], 60.00th=[  330],
     | 70.00th=[  347], 80.00th=[  396], 90.00th=[  482], 95.00th=[  570],
     | 99.00th=[ 2212], 99.50th=[ 2966], 99.90th=[ 6915], 99.95th=[ 7832],
     | 99.99th=[ 9503]
...

Be sure to check the units, here usec (µs, microseconds, 1E-06 sec). fio changes them around as needed. Then look at the numbers in the clat histogram.

You will see relatively linear numbers, here up to the 95th percentile, and then a sudden and steep increase. Most storage systems are bimodal: you will see “good” behavior and numbers in the low percentiles, and then numbers from the “bad” writes above. It is important to treat the two sets of numbers differently and to remember the cutoff percentile as well. This storage has a clat of about 0.57 ms up to the 95th percentile, and then 2.0 ms or more for about 5% of all writes. That’s not good, but these numbers are from a specific multithreaded test, not single-threaded writing, and I just included them to show what they would look like.

For reference, a local NVME disk should give you <100 µs for the good writes, and present a cutoff at the 99.90th percentile (data from 2019).

A good disk-based NetApp filer connected with FC/AL, as sold in 2012, would present itself at ~500 µs write latency, with a very high cutoff at the 99.5th or 99.9th percentile. You would mostly be talking to the battery-backed storage in that unit, so even with hard disks in it, it would be decent enough to run a busy database on. If your storage is worse, latency-wise, than this 12-year-old piece of hardware, you will probably not be happy with it under load for database workloads.

IOPS

“I/O Operations Per Second” sounds a lot like latency, but is not. It is the total I/O budget of your storage. For old enterprise hard disks with a 5 ms seek time, you get around 200 IOPS from a single disk. With a disk array, you get a multiple of that – more if you have more spindles. That is why in the old times before flash, DBAs strongly preferred arrays of many tiny disks over arrays of a few large disks. That is also why disk-based database write performance is completely dependent on the presence of battery-backed storage.

With modern flash and NVME, paths to the actual storage in the flash chips are parallel. This is one of the big advantages of NVME over SSD or SATA, in fact. A single enterprise NVME flash drive can provide you with up to 800.000 IOPS.

But that is not a latency number. A single commit will still take around 1/20.000 sec, around 50µs. So if you execute single row insert/commit pairs in a loop, you will not see more than 20.000 commit/s, even if the single NVME drive can do 40x that. To leverage the full potential of the drive, you will need to access it 40-way parallel. Most databases cannot do that.

That is because the workload a database executes is inherently dependent on the order of commits.

Each transaction creates a number of locks that are necessary for the transaction to be executed correctly. Rows that are being written to are exclusively locked (X-locked), rows that are being referenced by foreign key constraints can be share-locked (S-locked), and additional locks may be taken out explicitly. Most people are running MySQL in a simplified model (for example using group replication) where only X-locks matter, and the set of primary keys that are written to (and hence get X-locked) is the write-set.

Two transactions are potentially parallel-executable when the intersection of their write-sets is empty, that is, when they write to row sets that are non-overlapping. Most execution models for MySQL (especially regarding parallel replication) will try to execute transactions in parallel until they find an overlap. Then they will block until all transactions are done, and start the next batch of parallel transactions. This is not optimal, but it is simple, prevents re-ordering of transactions, and is easy for developers to understand. It will also not break heartbeat writes that are often used to check for replication delay.

In reality, the execution-width of such a transaction stream varies wildly and is completely subject to what the applications do. For execution guarantees and service-level objectives, when using MySQL, the IOPS budget of the storage is immaterial. We can only make guarantees for the database based on storage latency. That is not a number you will find in the AWS catalog. So go, measure.
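
One crude way to see the sequential commit rate of a given setup from the SQL side is a tight insert/commit loop. Here is a sketch (the table and procedure names are made up for illustration; run it from the mysql client and divide the row count by the elapsed time it reports):

-- Probe table; InnoDB, so every COMMIT has to hit the redo log.
DROP TABLE IF EXISTS commit_probe;
CREATE TABLE commit_probe
( id       int unsigned PRIMARY KEY AUTO_INCREMENT
, payload  varchar(32) NOT NULL )
 ENGINE=InnoDB;

DELIMITER //
CREATE PROCEDURE commit_rate_probe(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    START TRANSACTION;
    INSERT INTO commit_probe (payload) VALUES ('x');
    COMMIT;
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

-- 10,000 single-row insert/commit pairs; commit/s is roughly
-- 10000 divided by the elapsed time the client prints.
CALL commit_rate_probe(10000);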

In the workloads I have seen, the degree of parallel execution has, over longer stretches of time, often hovered around 3-5. So your 2012 NetApp filer with 500 µs commit latency will, if the IOPS budget is big enough, execute around 2000 sequential commits/s, and will perform at around 6000-10.000 IOPS. You can guarantee only 2000 commit/s, because that is worst-case performance, and as a DBA you do not control the application workload.

Jitter

Jitter is the amount of variance you get in latency. Most of the time, we care a lot about jitter, especially with transactional workloads.

Jitter matters, because if we hum along at 10.000 write-commit/s and we get a hiccup of 1/10 s (100 ms), we are likely looking at 1000 stalled processes in SHOW PROCESSLIST. That is, if our max_connections is too low, we die. And even if we don’t die, the application people will hate us.

You can see the effect of jitter or lock pileups, for example by running innotop -m Q -d 0.1 for a minute. If things are fine, the number of shown active, non-idle connections will be relatively constant and smaller than the number of CPU threads available to your machine.

TL;DR

  • Bandwidth is the MB/s you get from your storage.
    • Work with 200 MB/s for disk, 400 MB/s for bulk flash, and celebrate burst speeds of 4 GB/s for a short time.
  • Latency, for MySQL, is how long you wait for a 16 KB random-write to happen in fio.
    • 2012’s NetApp gives you 500 µs, 0.5 ms. Today’s NVME gives you 100 µs, 0.1 ms.
    • The cloud with very remote storage often gives you 1-2 ms.
    • commit-rate = 1/latency is the number of turns per second you get from a for-loop running “insert-commit”.
  • IOPS is the total, parallel IO budget for your database.
    • IOPS/commit-rate is the required degree of parallelism needed to eat the whole buffet, that is, use up all IOPS.
    • Transaction parallelism in MySQL is application-dependent, and is usually defined by how long a run of intersection-free write-sets you can find.
    • For the cases and workloads I have seen (webshops), it varies wildly over time, and is often 4-ish.
    • A single MySQL instance will usually not be able to consume all IOPS offered by even a single enterprise NVME drive (800k IOPS). This would require a 40-wide parallel path, 10x more than what workloads typically can offer.
  • Jitter matters.
    • That is why we look at 99th percentile and higher for completion latency (clat) in fio.
  • Cloud storage often sucks.
    • 1 ms or even 3 ms clat give you 300-1000 commit/s, in a loop. Often that is borderline insufficient. You can weaken persistence guarantees (innodb_flush_log_at_trx_commit), complicate your model by introducing other technologies (that are faster, because they have weaker persistence guarantees), or run MySQL yourself on i3 instances with local storage (but that is not why you chose the cloud).
  • Guarantees can be made only on things you control, and that may be bandwidth and latency, but never parallelism.
    • Parallelism is controlled by the application, not by the DBA.

The MySQL adaptive hash index

The adaptive hash index helps to improve the performance of already-fast B-tree lookups.



