
Configuring GTID and binary logging


This tutorial requires a service restart, since some of the flags presented here cannot be changed dynamically.

What is GTID and why do I need it? Directly from the MySQL documentation (the excerpt is taken as is and uses different jargon than this post: where the docs say master/slave, we use primary/replica):

A global transaction identifier (GTID) is a unique identifier created and associated with each transaction committed on the server of origin (the master). This identifier is unique not only to the server on which it originated, but is unique across all servers in a given replication topology.

GTID assignment distinguishes between client transactions, which are committed on the master, and replicated transactions, which are reproduced on a slave. When a client transaction is committed on the master, it is assigned a new GTID, provided that the transaction was written to the binary log. Client transactions are guaranteed to have monotonically increasing GTIDs without gaps between the generated numbers. If a client transaction is not written to the binary log (for example, because the transaction was filtered out, or the transaction was read-only), it is not assigned a GTID on the server of origin.

In theory you can use replication with only binary logging enabled; however, replication with GTID is significantly more reliable. And while some providers don’t require it, at Google Cloud GTID is mandatory.

Representation

A GTID is represented by a pair of coordinates: the server_uuid paired with the transaction_id, which is an integer. Example of a valid GTID:

GTID = 8b5dbf2a-45b4-11e8-81bc-42010a800002:25

To understand more how this impacts replication, I recommend reading the section GTID Format and Storage  in the MySQL documentation.

Enabling GTID

Thankfully, to enable it you don’t need to do much: edit your mysqld.cnf file to set these variables:

server-id = 2 # Or any other number, we recommend to not be 1
log-bin = mysql-bin # Or any other valid value

gtid_mode = ON
enforce-gtid-consistency = true

Restart the database server to load up the new configuration with sudo service mysql restart.
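Once the server is back up, a quick sanity check confirms that GTIDs are active and being assigned. This is only a minimal sketch using stock MySQL variables; the exact gtid_executed value will of course differ on your server:

SELECT @@global.gtid_mode, @@global.enforce_gtid_consistency, @@global.server_id;

-- after a few writes, the executed GTID set should start growing
SELECT @@global.gtid_executed;
SHOW MASTER STATUS;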

Side effects

Some applications may hit errors due to the enforce-gtid-consistency flag. That usually happens because the application is performing a non-transactional action that cannot be replicated from inside a transaction.

If you do the following:

START TRANSACTION;

CREATE TEMPORARY TABLE `tmp_users` ( id INTEGER );

COMMIT;

It is not good practice, I may add. You will get this error:

ERROR 1787 (HY000): Statement violates GTID consistency: CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE can only be executed outside transactional context. These statements are also not allowed in a function or trigger because functions and triggers are also considered to be multi-statement transactions.

What you are basically doing is telling the database to open a transaction, which is fine; however, the following command is a CREATE TEMPORARY TABLE. That statement is bound to the current connection, and because it will not be assigned a transaction_id, the statement cannot be replicated. Temporary tables are not replicated.

If your application happens to do that, all you need to do is move the creation of temporary tables outside of the transaction, as shown below. Unfortunately, Magento does not do that.
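As a sketch of the fix (the table and column names are only illustrative), the temporary table is simply created before the transaction is opened:

CREATE TEMPORARY TABLE `tmp_users` ( id INTEGER );

START TRANSACTION;
-- work with tmp_users and the permanent tables here
COMMIT;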

See something wrong in this tutorial? Please don’t hesitate to message me through the comments or the contact page.


How to Compile Percona Server for MySQL 5.7 in Raspberry Pi 3

Percona Server for MySQL on a Raspberry Pi

In this post I’ll give you the steps to compile Percona Server for MySQL 5.7.22 on a Raspberry Pi 3. Why? Well, because in general this little computer is cheap, has low power consumption, and is great to use as a test machine for developers.

By default, Raspbian OS includes very few versions of MySQL available to install:

$ apt-cache search mysql | grep server
...
mariadb-server-10.0 - MariaDB database server binaries
mariadb-server-10.1 - MariaDB database server binaries
mariadb-server-core-10.0 - MariaDB database core server files
mariadb-server-core-10.1 - MariaDB database core server files
mysql-server - MySQL database server binaries and system database setup [transitional] (5.5)
...

If you want to install MySQL or MariaDB on an ARM architecture using official pre-built binaries, you are limited to those distributions and versions.

Roel Van de Paar wrote, some time ago, the post “Percona Server on the Raspberry Pi: Your own MySQL Database Server for Under $80”, using Fedora ARM as the OS on the first versions of the Raspberry Pi boards.

Now we will use the latest version of the Raspberry Pi 3 with Raspbian OS. In my case, the OS version is this:

$ cat /etc/issue
Raspbian GNU/Linux 9 \n \l

The Installation of Percona Server for MySQL on Raspberry Pi 3

Let’s start. We will need many dev packages and cmake to compile the source code. Here is the command line to update or install all these packages:

apt-get install screen cmake debhelper autotools-dev libaio-dev wget automake libtool bison libncurses-dev libz-dev bzr libgcrypt11-dev build-essential flex autoconf mysql-client zlib1g-dev libboost-dev

Now we need to download the Percona Server for MySQL 5.7.22 source code and then we can proceed to compile.

Before starting to compile the source code, we will need to extend the swap memory. This is necessary to avoid encountering memory problems at compilation time.

$ dd if=/dev/zero of=/swapfile1GB bs=1M count=1024
$ chmod 0600 /swapfile1GB
$ mkswap /swapfile1GB
$ swapon /swapfile1GB

Now we can check the memory and confirm that the swap is active

$ free -m

This is the output in my case

$ free -m
total used free shared buff/cache available
Mem: 927 176 92 2 658 683
Swap: 1123 26 1097

I recommend using a screen session to compile the source code, because it takes a lot of time.

$ cd /root
$ screen -SL compile_percona_server
$ wget https://www.percona.com/downloads/Percona-Server-LATEST/Percona-Server-5.7.22-22/source/tarball/percona-server-5.7.22-22.tar.gz
$ tar xzf percona-server-5.7.22-22.tar.gz
$ cd percona-server-5.7.22-22
$ cmake  -DDOWNLOAD_BOOST=ON -DWITH_BOOST=$HOME/my_boost .
$ make
$ make install

After it has compiled and installed successfully, it’s time to create our datadir directory for this Percona Server version, and the mysql user. Feel free to use other directory names.

$ mkdir /var/lib/mysql
$ useradd mysql -d /var/lib/mysql
$ chown mysql.mysql /var/lib/mysql

Now it’s time to create a minimal my.cnf config file to start MySQL; you can use the example below:

$ vim /etc/my.cnf
[mysql]
socket=/var/lib/mysql/mysql.sock
[mysqld]
datadir = /var/lib/mysql
server_id = 2
binlog-format = row
log_bin = /var/lib/mysql/binlog
innodb_buffer_pool_size = 128M
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
[mysqld_safe]
log-error=/var/lib/mysql/mysqld.log
pid-file=/var/lib/mysql/mysqld.pid

Then we need to initialize the initial databases/schemas and the ibdata and ib_logfile files with the following command:

$ /usr/local/mysql/bin/mysqld --initialize-insecure --user=mysql --basedir=/usr/local/mysql --datadir=/var/lib/mysql

Now, finally, it’s time to start MySQL:

$ /usr/local/mysql/bin/mysqld_safe --defaults-file=/etc/my.cnf --user=mysql &

We can check if MySQL started or not, by taking a look at the mysqld.log file

$ cat /var/lib/mysql/mysqld.log
...
2018-08-13T16:44:55.067352Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2018-08-13T16:44:55.067576Z 0 [Note] IPv6 is available.
2018-08-13T16:44:55.067680Z 0 [Note]   - '::' resolves to '::';
2018-08-13T16:44:55.067939Z 0 [Note] Server socket created on IP: '::'.
2018-08-13T16:44:55.258576Z 0 [Note] Event Scheduler: Loaded 0 events
2018-08-13T16:44:55.259525Z 0 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.7.22-22-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Source distribution

In our example it started ok.

Remember MySQL server was installed and started using an alternative path.
Now it’s time to connect and check if everything is running well.

$ /usr/local/mysql/bin/mysql -uroot
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.22-22-log Source distribution
Copyright (c) 2009-2018 Percona LLC and/or its affiliates
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
+--------------------+
4 rows in set (0.00 sec)
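Since we initialized the datadir with --initialize-insecure, the root account currently has an empty password. A sensible first step (a sketch; pick your own password) is to set one, keeping in mind that the later commands in this post connect with -uroot and no password, so they would then need a -p flag:

ALTER USER 'root'@'localhost' IDENTIFIED BY 'a-strong-password';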

To check the version that’s running, you can use the following command:

mysql> SHOW VARIABLES LIKE "%version%";
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+---------------------+
| innodb_version | 5.7.22-22 |
| protocol_version | 10 |
| slave_type_conversions | |
| tls_version | TLSv1,TLSv1.1 |
| version | 5.7.22-22-log |
| version_comment | Source distribution |
| version_compile_machine | armv7l |
| version_compile_os | Linux |
| version_suffix | -log |
+-------------------------+---------------------+
9 rows in set (0.02 sec)

Improving Performance

Keep in mind that if you configure the datadir directory on the same microSD card where you are running the OS, MySQL will run slowly: creating a new table may take a few seconds. So I recommend that you use a separate USB SSD disk and move the datadir directory to that SSD disk. That’s more useful and the performance is much better.

I hope you enjoyed this guide on how to use a tiny server to install Percona Server for MySQL.
If you want to test other versions, please go ahead: the steps will be very similar to these.

Other related posts about Raspberry Pi

The post How to Compile Percona Server for MySQL 5.7 in Raspberry Pi 3 appeared first on Percona Database Performance Blog.

MySQL @Oracle Developer Community LAD Tour 2018


We are happy to announce that MySQL got an opportunity to be part of the Oracle Developer Community LAD Tour 2018 on last-minute notice. The event is held on Aug 24 in Mexico City, Mexico, where you will be able to find - among other interesting talks - a MySQL talk given by the local Sales Consultant, Manuel Contreras. Manuel will be talking about “MySQL 8.0”.

We are looking forward to seeing & talking to you there!

Question about Semi-Synchronous Replication: the Answer with all the Details


I was recently asked a question by mail about MySQL Lossless Semi-Synchronous Replication. As I think the answer could benefit many people, I am answering it in a blog post. The answer brings us to the internals of transaction committing, of semi-synchronous replication, of MySQL (server) crash recovery, and of storage engine (InnoDB) crash recovery. I am also debunking some misconceptions that I have often seen and heard repeated by many. Let’s start by stating one of those misconceptions.

One of those misconceptions is the following (this is NOT true): semi-synchronous enabled slaves are always the most up-to-date slaves (again, this is NOT true). If you hear it yourself, then please call people out on it to avoid this spreading more. Even if some slaves have semi-synchronous replication disabled (I will use semi-sync for short in the rest of this post), these could still be the most up-to-date slaves after a master crash. I guess this false idea is coming from the name of the feature, not much can be done about this anymore (naming is hard). The details are in the rest of this post.

Back to the question I received by mail, it can be summarized as follows:

  • In a deployment where a MySQL 5.7 master is crashed (kill -9 or echo c > /proc/sysrq-trigger), a slave is promoted as the new master;
  • when the old master is brought back up, transactions that are not on the new master are observed on this old master;
  • is this normal in a lossless semi-sync environment?

The answer to that question is yes: it is normal to have transactions on the recovered old master that are not on the new master. This is not a violation of the semi-sync promise. To understand this, we need to go in detail about semi-sync (MySQL 5.5 and 5.6) and lossless semi-sync (MySQL 5.7).

Semi-Sync and Lossless Semi-Sync

Semi-sync replication was introduced in MySQL 5.5. Its promise is that every transaction where the client has received a COMMIT acknowledgment would be replicated to a slave. It had a caveat though: while a client is waiting for this COMMIT acknowledgment, other clients could see the data of the committing transaction. If the master crashes at this moment (without a slave having received the transaction), it is a violation of transaction isolation. This is also known as phantom read: data observed by a client has disappeared. This is not very satisfactory.

Lossless semi-sync replication was introduced in MySQL 5.7 to solve this problem. With lossless semi-sync, we keep the promise of semi-sync (every transaction where clients have received a COMMIT acknowledgment is replicated), with the additional promise that there are no phantom reads. To understand how this works, we need to dive into the way MySQL commits transactions.
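As an aside, here is a minimal sketch of how lossless semi-sync is typically enabled on a MySQL 5.7 master and slave; the plugin and variable names are the standard ones shipped with MySQL, and AFTER_SYNC is the lossless wait point (the default in 5.7), while AFTER_COMMIT gives the old 5.5/5.6 behaviour:

-- on the master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_SYNC';

-- on the slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;  -- restart the IO thread so the slave registers as semi-sync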

The Way MySQL Commits Transactions

When MySQL commits a transaction, it is going through the following steps:

  1. Prepare the transaction in the storage engine (InnoDB),
  2. Write the transaction to the binary logs,
  3. Complete the transaction in the storage engine,
  4. Return an acknowledgment to the client.

The implementations of semi-sync and lossless semi-sync insert themselves into the above process.

Semi-sync in MySQL 5.5 and 5.6 happens between steps #3 and #4. After “completing” the transaction in the storage engine, a semi-sync master waits for one slave to confirm the replication of the transaction. As this happens after the storage engine has “completed” the transaction, other clients can see this transaction: this is the cause of our phantom read. Also, unrelated to phantom reads, if the master crashes at that moment, this transaction will be in the database after bringing the master back up, as it has been fully “completed” in the storage engine.

It is important to realize that for semi-sync (and lossless-semi-sync), transactions are written to the binary logs in the same way as in standard (non-semi-sync) replication. In other words, standard and semi-sync replication behave exactly the same way up to and including step #2. Also, once transactions are in the binary logs, they are visible to all slaves, not only to the semi-sync slaves. So a non-semi-sync slave could receive a transaction before the semi-sync slaves. This is why it is false to assume that the semi-sync slaves are the most up-to-date slaves after a master crash.

It is false to assume that the semi-sync slaves are the most up-to-date slaves after a master crash.

In lossless semi-sync, waiting for transaction replication happens between steps #2 and #3. At this point, the transaction is not “completed” in the storage engine, so other clients do not see its data yet. But even if this transaction is not “completed”, a master crash at that moment and a subsequent restart would cause this transaction to be in the database. To understand why, we need to dive into MySQL and InnoDB crash recovery.

MySQL and InnoDB Crash Recovery

During InnoDB crash recovery, transactions that are not “completed” (have not reached step #3 of transaction committing) are rolled back. So a transaction that is not yet committed (has not reached step #1) or a transaction that is not yet written to the binary logs (has not reached step #2) will not be in the database after InnoDB crash recovery. However, if InnoDB rolled back a transaction that has reached the binary logs (step #2) but that is not “completed” (step #3), this would mean a transaction that could have reached a slave would disappear from the master. This would create data inconsistency in replication and would be bad.

Once a transaction reaches the binary logs it should roll forward.

To avoid the data inconsistency described above, MySQL does its own crash recovery before storage engine crash recovery. This recovery consists of making sure that all the transactions in the binary logs are flagged as “completed”. So if a transaction is between steps #2 and #3 at the time of the crash, it is flagged as “completed” in the storage engine during MySQL crash recovery, and it is rolled forward during storage engine crash recovery. In the case where this transaction has not reached at least one slave at the moment of the crash, it will appear on the master after crash recovery. It is important to note that this could happen even without semi-sync.

Having extra transactions on a recovered master can happen even without semi-sync.

The extra transactions that are visible on the recovered old master are because of the way MySQL and InnoDB carry out crash recovery. This is more likely to happen in a lossless semi-sync environment because of the delay introduced between steps #2 and #3 of the way MySQL commits transactions, but it could also happen without semi-sync if the timing is right.

The Facebook Trick to Avoid Extra Transactions

There is an original trick to avoid having extra transactions on a recovered master. This trick was presented by Facebook during a talk at Percona Live a few years ago (sorry, I cannot find any link to this, please post a comment below if you know of public content about this). The idea is to force MySQL to roll-back (instead of rolling forward) the transactions that are not yet “completed” in the storage engine. It must be noted that this should only be done on an old master that has been replaced by a slave. If it is done on a recovering master without failing over to a slave, a transaction that could have reached a slave would disappear from the master.

To trick MySQL into rolling back the non “completed” transactions, Facebook truncates the binary logs before restarting the old master. This way, MySQL thinks that the crash happened before writing to the binary logs (step #2). So MySQL crash recovery will not flag the transactions as “complete” in the storage engine and these will be rolled back during storage engine crash recovery. This avoids the recovered old master having extra transactions. Obviously, because these transactions were once in the binary logs, they could have been replicated to slaves. So the Facebook trick avoids the old master being ahead of the new master, possibly at the cost of bringing the old master behind the new master.

I know that Facebook then re-slaves the recovered old master to the new master, but I am not sure that this is possible with standard MySQL. The Facebook variant of MySQL includes additional features, and I think one of those is to put GTIDs in the InnoDB Redo logs. With this, and after the recovery of the old master, the GTID state of the database can be determined even if the binary logs are gone. In standard MySQL, I think that truncating the binary logs will result in losing the GTID state of the database, which will prevent re-slaving the old master to the new master. However, as InnoDB crash recovery prints the binary log position of the last committed transaction, I think re-slaving the old master to a Binlog Server would be possible in a semi-sync environment.

You can read more about semi-synchronous replication at Facebook below:

Debunking Other Misconceptions

Before closing this post, I would like to debunk other misconceptions that I often hear. Some people say that semi-sync (or lossless semi-sync) increases the availability of MySQL. In my humble opinion, this is false. Semi-sync and lossless semi-sync actually lower availability, there is no increase here.

Lossless semi-sync is not a high availability solution.

The statement that semi-sync and lossless semi-sync have lower availability than standard replication is justified by the introduction of new situations where transactions could be prevented from committing. As an example, if no semi-sync slaves are present, transactions will not be able to commit. The promise of lossless semi-sync is not about increasing availability, it is about preventing the loss of committed transactions in case of a crash. The cost of this promise is the added COMMIT latency and the new cases where COMMIT would be prevented from succeeding (thus reducing availability).

Group Replication is not a high availability solution.

For the same reasons, Group Replication (or Galera or Percona XtraDB Cluster) reduces availability. Group Replication also brings the promise of preventing the loss of committed transactions at the cost of adding COMMIT latency. There is also another cost to Group Replication: failing COMMIT in some situations (I do not know of any situation in standard MySQL where COMMIT can fail; if you know of one, please post a comment below). An example of COMMIT failing is mentioned in my previous post on Group Replication certification. This additional cost introduces another interesting promise, but as this is not a post on Group Replication, I am not covering it here.

Group Replication also introduces cases where COMMIT can fail.

This does not mean that lossless semi-sync and Group Replication cannot be used as a building block for a high availability solution, but by themselves and without other important components, they are not a high availability solution.

Thoughts about rpl_semi_sync_master_{timeout,wait_no_slave}

Above, I write that there are situations where a transaction will be prevented from committing. One of those situations is when there are no semi-sync slaves or when those slaves are not acknowledging transactions (for any good or bad reasons). There are two parameters to bypass this: rpl_semi_sync_master_wait_no_slave and rpl_semi_sync_master_timeout. Let’s talk about these a little.

The rpl_semi_sync_master_wait_no_slave parameter allows MySQL to bypass the semi-sync wait when there are not enough semi-sync slaves (semi-sync in MySQL 5.7 can wait for more than one slave and this behavior is controlled by the rpl_semi_sync_master_wait_for_slave_count parameter). The default value for the “wait_no_slave” parameter is ON, which means it still waits even if there are not enough semi-sync slaves. This is a safe default as it enforces the promise of semi-sync (not acknowledging COMMIT before the transaction is replicated to slaves). Even if setting this parameter to OFF is voiding that promise, I like that it exists (details below). However, I would not run MySQL unattended with waiting disabled in a full semi-sync environment.

The rpl_semi_sync_master_timeout parameter allows MySQL to short-circuit waiting for slaves after a timeout, acknowledging COMMIT to the client even if the transaction was not replicated. Its default is 10 seconds, which I think is wrong. After 10 seconds, there are probably thousands of transactions waiting for commit on the master and MySQL is already struggling. If we want to prevent MySQL from struggling, this parameter should be lower. However, if we want a zero-loss failover (and failover is taking more than 10 seconds), we should not commit transactions without replicating them to slaves, in which case this parameter should be higher. Higher or lower, which one should be used…
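For concreteness, here is a sketch of how these knobs can be inspected and changed at runtime; the values are only examples, not recommendations:

SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_%';

-- keep waiting even when no semi-sync slave is connected (the default)
SET GLOBAL rpl_semi_sync_master_wait_no_slave = ON;

-- number of semi-sync acknowledgments to wait for (MySQL 5.7)
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;

-- milliseconds to wait before falling back to asynchronous behaviour
SET GLOBAL rpl_semi_sync_master_timeout = 10000;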

Using a “low” value for rpl_semi_sync_master_timeout looks very strange to me in a full semi-sync environment. It looks like the DBA cannot choose between committing as often as possible (standard non-semi-sync replication) or only committing transactions that are replicated (semi-sync). There is no way to have the best of both worlds here:

  • either someone wants high success rate on commit, which means that the DBA does not deploy semi-sync (and the cost of this is to lose committed transactions on failover),
  • or someone wants high persistence on committed transactions, in which case the DBA deploys semi-sync at the cost of lowering the probability of a successful commit (and increasing commit latency).

I see one situation where these parameters are useful: transitioning from a non-semi-sync environment to a full semi-sync environment. During this transition, we want to learn about the new restrictions of semi-sync without causing too much disruption in production, and these parameters come in handy here. But once in a full semi-sync deployment, where we fully want to avoid losing committed transactions when a master crashes, I would not consider it a good idea to let transactions commit without being replicated to slaves.

As a last comment on this, there are thoughts that a full semi-sync enabled master should probably crash itself when it is blocked for too long in waiting for slave acknowledgment. This is an interesting idea as it is the only way that MySQL has to unblock clients. I am not sure if this is implemented in some variant of MySQL though (maybe the Facebook variant).

I hope this post clarified semi-sync and lossless semi-sync replication. If you still have questions about this or on related subjects, feel free to post them in the comments below.

The post Question about Semi-Synchronous Replication: the Answer with all the Details appeared first on Percona Community Blog.

Get the Auditors in.


Here I have been looking into using the MySQL Enterprise Edition Audit Log plugin for 5.7. We have many options to audit (filters, encryption, compression, Workbench, rotation & purging, viewing the log, etc.) and, when the plugin is active, it’s quite clear cut what we are and aren’t auditing.

If you’re looking to go deep into the Audit Plugin, as part of the Enterprise Edition, you’ll want to look at the following Support note:

Master Note for MySQL Enterprise Audit Log Plugin (Doc ID 2299419.1)

And if you’re looking for other Audit Plugin examples, I’d recommend Tony Darnell’s blog post:

https://scriptingmysql.wordpress.com/2014/03/14/installing-and-testing-the-mysql-enterprise-audit-plugin/

 

Install

Venturing onwards, have a read of the install (or upgrade) steps:

https://dev.mysql.com/doc/refman/8.0/en/audit-log-installation.html

and then what a “filter” is:

https://dev.mysql.com/doc/refman/8.0/en/audit-log-filtering.html

That said, although I started with a new install, it’s more than likely you won’t. So let’s install the plugin accordingly.

Remember, this is the Audit Plugin only available with the Enterprise Edition binaries. So we will need to download the MySQL Server from http://edelivery.oracle.com or from http://support.oracle.com, “Patches & Updates”.

Prepare the env

mkdir -p /opt/mysql/audit

ls -lrt /usr/local/mysql/mysql-advanced-5.7.18-linux-glibc2.5-x86_64
ls -lrt /usr/local/mysql/mysql-commercial-8.0.12-linux-glibc2.12-x86_64

(we’ll use the 8.0.12 binaries later)

cd /usr/local/mysql/mysql-advanced-5.7.18-linux-glibc2.5-x86_64

Edit the my.cnf commenting out the audit log params which we will use later.

vi my_audit.cnf
..
port=3357
..
[mysqld]
#plugin-load =audit_log.so
#audit-log =FORCE_PLUS_PERMANENT
..
basedir =/usr/local/mysql/mysql-advanced-5.7.18-linux-glibc2.5-x86_64
..

Initialize & startup

bin/mysqld --defaults-file=my_audit.cnf --initialize-insecure

Yes, using --initialize-insecure defeats the whole object of auditing, but here we’re testing. I expect the environment you’ll be using already has some minimum security in place.

bin/mysqld --defaults-file=my_audit.cnf &
bin/mysql --defaults-file=my_audit.cnf -uroot

SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME LIKE 'audit%';

SELECT @@audit_log_filter_id;

To install the audit log plugin, we have to run:

bin/mysql -uroot -S /opt/mysql/audit/mysql_audit.sock < /usr/local/mysql/mysql-advanced-5.7.18-linux-glibc2.5-x86_64/share/audit_log_filter_linux_install.sql
Result
OK

Now in another window:

tail -100f /opt/mysql/audit/data/audit.log

 

Confirm which version has the audit_log tables in InnoDB or MyISAM (latter won’t work on GR/IdC for obvious reasons):
– 5.7.18 -> MyISAM
– 8.0.12 -> InnoDB

bin/mysql -uroot -S /opt/mysql/audit/mysql_audit.sock
show create table mysql.audit_log_user;

 

Auditing time has come

Check if any user account is being audited:

SELECT * from mysql.audit_log_user;

And what filter, if any, is active:

SELECT @@audit_log_filter_id;

As we haven’t created anything yet, it’s all empty.

Now to create a user to audit (Thanks Tony Darnell!):

CREATE USER 'audit_test_user'@'localhost' IDENTIFIED BY 'audittest123';
GRANT ALL PRIVILEGES ON *.* TO 'audit_test_user'@'localhost';

Create an audit filter to log only the connections of the previously created user:
either

SELECT audit_log_filter_set_filter('log_connection', '{ "filter": { "class": { "name": "connection" } } }');

or

SELECT audit_log_filter_set_filter('log_connection', '{ "filter": { "log": false ,"class": { "log": true, "name": "connection" } }}');

And assign the filter just created to the user account we want to audit:

SELECT audit_log_filter_set_user('audit_test_user@localhost', 'log_connection');

Make sure that all auditing changes have been committed and set in proverbial stone:

SELECT audit_log_filter_flush()\G

So what filter did we create or do we have?

SELECT * from mysql.audit_log_filter;

Now login with that user and run some SQL, whilst another window has a tail -100f running on the audit.log:

bin/mysql -uaudit_test_user -paudittest123 -S /opt/mysql/audit/mysql_audit.sock

SELECT @@audit_log_filter_id;
SELECT * from mysql.audit_log_user;

Now exit and reconnect to see in the tail of the audit.log the disconnect & connect.
In the window with a “tail -100f audit.log” we will only see:

<AUDIT_RECORD>
<TIMESTAMP>2018-08-21T14:58:34 UTC</TIMESTAMP>
<RECORD_ID>2_2018-08-21T14:50:37</RECORD_ID>
<NAME>Connect</NAME>
<CONNECTION_ID>6</CONNECTION_ID>
<STATUS>0</STATUS>
<STATUS_CODE>0</STATUS_CODE>
<USER>audit_test_user</USER>
<OS_LOGIN/>
<HOST>localhost</HOST>
<IP/>
<COMMAND_CLASS>connect</COMMAND_CLASS>
<CONNECTION_TYPE>Socket</CONNECTION_TYPE>
<PRIV_USER>audit_test_user</PRIV_USER>
<PROXY_USER/>
<DB/>
</AUDIT_RECORD>

but no sql being audited.

Let’s create a filter just for SQL queries, without logging connections, as we already know how to create a filter for that:

SELECT audit_log_filter_set_filter('log_sql', '{ "filter": { "log": true ,"class": { "log": false, "name": "connection" } }}');
SELECT audit_log_filter_set_user('audit_test_user@localhost', 'log_sql');
SELECT audit_log_filter_flush()\G

Run some selects on any table to view the result in the audit.log.

Now activate logging of I/U/D but not for Selects / Reads:

SELECT audit_log_filter_set_filter('log_IUD', '{
  "filter": {
    "class": {
      "name": "table_access",
        "event": {
          "name": [ "insert", "update", "delete" ]
        }
    }
  }
 }');

Let’s apply it to the user:

SELECT audit_log_filter_set_user('audit_test_user@localhost', 'log_IUD');
SELECT audit_log_filter_flush()\G

Let’s confirm the user has the new filter applied:

SELECT * from mysql.audit_log_user;

Let’s create a table and test some I/U/D:

create database nexus;
use nexus;
create table replicant (
`First name` varchar(40) not null default '',
`Last name` varchar(40) not null default '',
`Replicant` enum('Yes','No') not null default 'Yes'
) engine=InnoDB row_format=COMPACT;
INSERT INTO `replicant` (`First name`,`Last name`,`Replicant`)
VALUES
('Roy','Hauer','Yes'),
('Rutger','Batty','Yes'),
('Voight','Kampff','Yes'),
('Pris','Hannah','Yes'),
('Daryl','Stratton','Yes'),
('Rachael','Young','Yes'),
('Sean','Tyrell','Yes'),
('Rick','Ford','No'),
('Harrison','Deckard','Yes');
DELETE FROM replicant where `First name`='Rick';
UPDATE replicant set `Replicant` = 'No' where `First name` = 'Harrison';
INSERT INTO replicant (`First name`,`Last name`,`Replicant`) VALUES ('Rick','Ford','No');
UPDATE replicant set `Replicant` = 'Yes' where `First name` = 'Harrison';

Create a filter for both login connections & I/U/D actions:

SELECT audit_log_filter_set_filter('log_connIUD', '{
  "filter": {
    "class": [
      {"name": "connection" },
      {"name": "table_access",
         "event": {
           "name": [ "insert", "update", "delete" ]
         }
      }
    ]
  }
 }');

Apply it / make it stick and then confirm:

SELECT audit_log_filter_set_user('audit_test_user@localhost', 'log_connIUD');
SELECT audit_log_filter_flush()\G
SELECT * from mysql.audit_log_user;

Now re-run the I/U/D & exit/connect and view the audit.log.

Upon assigning a filter to a specific account (user+host),  the previous filter is automatically replaced.
So let’s apply the log_connection filter to all users, i.e. “%”:

SELECT audit_log_filter_set_filter('log_connection', '{ "filter": { "class": { "name": "connection" } } }');
SELECT audit_log_filter_set_user('%', 'log_connection');
SELECT audit_log_filter_flush()\G
SELECT * from mysql.audit_log_user;

Although we have just assigned the log_connection filter to all users, the audit_test_user has the log_IUD filter assigned specifically, which means that no logins for this user are being recorded in the audit.log. We would have to use the log_connIUD filter for that.

Maybe we don’t want to log anything for the root user, so we can remove root’s logging from the log_connection filter:

SELECT audit_log_filter_remove_user('root@localhost');
SELECT audit_log_filter_flush()\G
SELECT * from mysql.audit_log_user;

If we log off and log back on we’ll observe that the root user is removed just for that session. Logging is enabled again via the generic filter for the root user once logged on again.

Given that the previous change is only per session, we’ll now create a “log_nothing” filter and apply it to the user accounts for which we don’t want anything to be logged:

SELECT audit_log_filter_set_filter('log_nothing', '{ "filter": { "log": false } }');
SELECT audit_log_filter_set_user('root@localhost', 'log_nothing');
SELECT audit_log_filter_flush()\G
SELECT * from mysql.audit_log_user;

Try logging on:

mysql -uroot -S /opt/mysql/audit/mysql_audit.sock

and view the audit.log tail output. “root” is no longer being logged.
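As a side note, when you are done experimenting, a filter definition can be dropped altogether with audit_log_filter_remove_filter(); a small sketch (note that any accounts still assigned to the removed filter simply stop being filtered):

SELECT audit_log_filter_remove_filter('log_nothing');
SELECT audit_log_filter_flush()\G
SELECT * from mysql.audit_log_filter;
SELECT * from mysql.audit_log_user;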

 

I hope this has helped give an insight into some examples of how to audit MySQL. There are many, many more examples, e.g. no logging for DDL, or logging just specific tables and/or schemas. It is entirely up to you what you log… or not.

Happy auditing!


Upgrading MySQL to 8.0.12 with Audit plugin.


As a spin-off from the previous post, https://mysqlmed.wordpress.com/2018/08/23/get-the-auditors-in/, I thought that it would be good to see how well the Audit plugin upgrades to MySQL 8. The big change in auditing is that the tables change from MyISAM to InnoDB, so keep your eyes open.

I’m using the previously used instance in version 5.7.18.

Preparation

Before we do anything, let’s make sure auditing will be in place when we restart the instance with 8.0.12:

Uncomment the plugin-load & audit-log params we had originally commented out. After all, this is something we should have done in the last post (apologies!):

vi my_audit.cnf:
  ..
  [mysqld]
  plugin-load =audit_log.so
  audit-log =FORCE_PLUS_PERMANENT
  ..

Restart the 5.7 instance so we upgrade from a rebooted / ‘as real as can be expected’ scenario:

bin/mysqladmin --defaults-file=my_audit.cnf -uroot shutdown
bin/mysqld --defaults-file=my_audit.cnf &

With the tail of the audit.log still running, login again as the audit_test_user:

bin/mysql -uaudit_test_user -paudittest123 -S /opt/mysql/audit/mysql_audit.sock

Observe that the login is being audited with our audit.log tail:

SELECT * from mysql.audit_log_user;
SELECT * from mysql.audit_log_filter;
SELECT @@audit_log_filter_id;

Let’s check the instance with the upgrade checks:

https://dev.mysql.com/doc/refman/8.0/en/upgrading-strategies.html

Run the check-upgrade script:

mysqlcheck -S /opt/mysql/audit/mysql_audit.sock -uroot --all-databases --check-upgrade

Check for partitioned tables:

SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE ENGINE NOT IN ('innodb', 'ndbcluster')
AND CREATE_OPTIONS LIKE '%partitioned%';

Check for table names:

SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE LOWER(TABLE_SCHEMA) = 'mysql'
and LOWER(TABLE_NAME) IN
(
'catalogs',
'character_sets',
'collations',
'column_statistics',
'column_type_elements',
'columns',
'dd_properties',
'events',
'foreign_key_column_usage',
'foreign_keys',
'index_column_usage',
'index_partitions',
'index_stats',
'indexes',
'parameter_type_elements',
'parameters',
'resource_groups',
'routines',
'schemata',
'st_spatial_reference_systems',
'table_partition_values',
'table_partitions',
'table_stats',
'tables',
'tablespace_files',
'tablespaces',
'triggers',
'view_routine_usage',
'view_table_usage'
);

If you find anything, then now’s the moment to change the names, partitions, etc.

Shutdown the instance:

bin/mysqladmin --defaults-file=my_audit.cnf -uroot shutdown

 

MySQL 8

Change to 8.0.12 binaries.

Copy & edit the my.cnf.

cp /usr/local/mysql/mysql-advanced-5.7.18-linux-glibc2.5-x86_64/my_audit.cnf /usr/local/mysql/mysql-commercial-8.0.12-linux-glibc2.12-x86_64/my_audit80.cnf

Get rid of query_cache & sql_mode params from the my.cnf, and adjust the audit log params:

vi my_audit80.cnf
   ..
   port=3380
   ..
   [mysqld]
   plugin-load =audit_log.so
   audit-log =FORCE_PLUS_PERMANENT
   ..
   basedir =/usr/local/mysql/mysql-commercial-8.0.12-linux-glibc2.12-x86_64
   ..

 

Start the instance with the 8.0.12 binaries:

bin/mysqld --defaults-file=my_audit80.cnf &

And time to upgrade:

bin/mysql_upgrade -S /opt/mysql/audit/mysql_audit.sock -uroot
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Upgrading system table data.
Checking system database.
mysql.audit_log_filter OK
mysql.audit_log_user OK
mysql.columns_priv OK
mysql.component OK
mysql.db OK
mysql.default_roles OK
mysql.engine_cost OK
mysql.func OK
mysql.general_log OK
mysql.global_grants OK
mysql.gtid_executed OK
mysql.help_category OK
mysql.help_keyword OK
mysql.help_relation OK
mysql.help_topic OK
mysql.innodb_index_stats OK
mysql.innodb_table_stats OK
mysql.ndb_binlog_index OK
mysql.password_history OK
mysql.plugin OK
mysql.procs_priv OK
mysql.proxies_priv OK
mysql.role_edges OK
mysql.server_cost OK
mysql.servers OK
mysql.slave_master_info OK
mysql.slave_relay_log_info OK
mysql.slave_worker_info OK
mysql.slow_log OK
mysql.tables_priv OK
mysql.time_zone OK
mysql.time_zone_leap_second OK
mysql.time_zone_name OK
mysql.time_zone_transition OK
mysql.time_zone_transition_type OK
mysql.user OK
Found outdated sys schema version 1.5.1.
Upgrading the sys schema.
Checking databases.
nexus.replicant OK
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed.

Ok, all ok.

Double checking:

ls -lrt /opt/mysql/audit/data/*upg*
-rw-rw-r-- 1 khollman khollman 6 ago 22 19:58 /opt/mysql/audit/data/mysql_upgrade_info

cat /opt/mysql/audit/data/mysql_upgrade_info
8.0.12

 

Let’s check the audit plugin status:

bin/mysql -uroot -S /opt/mysql/audit/mysql_audit.sock

SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME LIKE 'audit%';
+-------------+---------------+
| PLUGIN_NAME | PLUGIN_STATUS |
+-------------+---------------+
| audit_log | ACTIVE |
+-------------+---------------+

SELECT @@audit_log_filter_id;

SELECT * from mysql.audit_log_user;
+-----------------+-----------+----------------+
| USER            | HOST      | FILTERNAME     |
+-----------------+-----------+----------------+
| %               |           | log_connection |
| audit_test_user | localhost | log_connIUD    |
| root            | localhost | log_nothing    |
+-----------------+-----------+----------------+

SELECT * from mysql.audit_log_filter;

Here we see the table is now InnoDB:

show create table mysql.audit_log_user;

Now that we see everything is as it was, we suddenly remember we saw something about upgrading in the install section of the manual:

https://dev.mysql.com/doc/refman/8.0/en/audit-log-installation.html

Note

As of MySQL 8.0.12, for new MySQL installations, the USER and HOST columns in the audit_log_user table used by MySQL Enterprise Audit have definitions that better correspond to the definitions of the User and Host columns in the mysql.user system table. For upgrades to an installation for which MySQL Enterprise Audit is already installed, it is recommended that you alter the table definitions as follows:

So we run the following commands:

ALTER TABLE mysql.audit_log_user
DROP FOREIGN KEY audit_log_user_ibfk_1;
ALTER TABLE mysql.audit_log_filter
CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci;
ALTER TABLE mysql.audit_log_user
CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_as_ci;
ALTER TABLE mysql.audit_log_user
MODIFY COLUMN USER VARCHAR(32);
ALTER TABLE mysql.audit_log_user
ADD FOREIGN KEY (FILTERNAME) REFERENCES mysql.audit_log_filter(NAME);

Let’s tail the audit.log, which hasn’t had anything written to it since we started the upgrade:

tail -100f /opt/mysql/audit/data/audit.log

Now to log in as our audit_test_user user:

bin/mysql -uaudit_test_user -paudittest123 -S /opt/mysql/audit/mysql_audit.sock nexus

And run some I/U/D:

DELETE FROM replicant where `First name`='Rick';
UPDATE replicant set `Replicant` = 'No' where `First name` = 'Harrison';
INSERT INTO replicant (`First name`,`Last name`,`Replicant`) VALUES ('Rick','Ford','No');
UPDATE replicant set `Replicant` = 'Yes' where `First name` = 'Harrison';

Exit, and watch it all being logged, except the SELECTs.

A “status;” shows us:

--------------
bin/mysql Ver 8.0.12-commercial for linux-glibc2.12 on x86_64 (MySQL Enterprise Server - Commercial)

Connection id: 18
Current database: 
Current user: audit_test_user@localhost
SSL: Not in use
Current pager: stdout
Using outfile: ''
Using delimiter: ;
Server version: 8.0.12-commercial MySQL Enterprise Server - Commercial
...

 

Upgraded! With Auditing in place.

Comparing Data At-Rest Encryption Features for MariaDB, MySQL and Percona Server for MySQL

Encryption at rest MariaDB MySQL Percona Server

Protecting the data stored in your database may have been at the top of your priorities recently, especially with the changes that were introduced earlier this year with GDPR.

There are a number of ways to protect this data, which until not so long ago would have meant either using an encrypted filesystem (e.g. LUKS), or encrypting the data before it is stored in the database (e.g. AES_ENCRYPT or other abstraction within the application). A few years ago, the options started to change, as Alexander Rubin discussed in MySQL Data at Rest Encryption, and now MariaDB®, MySQL®, and Percona Server for MySQL all support encryption at-rest. However, the options that you have—and, indeed, the variable names—vary depending upon which database version you are using.

In this blog post we will take a look at what constitutes the maximum level of at-rest encryption that can be achieved with each of the latest major GA releases from each provider. To allow a fair comparison across the three, we will focus on the file-based key management; keyring_file plugin for MySQL and Percona Server for MySQL along with file_key_management plugin for MariaDB.

MariaDB 10.3

The MariaDB team take the credit for leading the way with at-rest encryption, as most of their features have been present since the 10.1 release (most notably the beta release of 10.1.3 in March 2015). Google donated the tablespace encryption, and eperi donated per-table encryption and key identifier support.

The current feature set for MariaDB 10.3 comprises the following variables:

Maximising at-rest encryption with MariaDB 10.3

Using the following configuration would give you maximum at-rest encryption with MariaDB 10.3:

plugin_load_add = file_key_management
file_key_management_filename = /etc/mysql/keys.enc
file_key_management_filekey = FILE:/etc/mysql/.key
file_key_management_encryption_algorithm = aes_cbc
innodb_encrypt_log = ON
innodb_encrypt_tables = FORCE
innodb_encrypt_threads = 4
encrypt_binlog = ON
encrypt_tmp_disk_tables = ON
encrypt_tmp_files = ON
aria_encrypt_tables = ON

This configuration would provide the following at-rest protection:

  • automatic and enforced InnoDB tablespace encryption
  • automatic encryption of existing tables that have not been marked with
    ENCRYPTED=NO
  • 4 parallel encryption threads
  • encryption of temporary files and tables
  • encryption of Aria tables
  • binary log encryption
  • an encrypted file that contains the main encryption key

You can read more about preparing the keys, as well as the other key management plugins in the Encryption Key Management docs.
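If you prefer per-table control on top of the global settings, a sketch of MariaDB’s table options looks like this; the table and column names are just illustrative, and the key id must exist in the key file referenced above. Note that the ENCRYPTED=NO variant is only accepted when innodb_encrypt_tables is ON, not FORCE as in the configuration above:

CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100))
  ENGINE=InnoDB ENCRYPTED=YES ENCRYPTION_KEY_ID=2;

ALTER TABLE audit_trail ENCRYPTED=NO;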

There is an existing bug related to encrypt_tmp_files (MDEV-14884), which causes the use of mysqld --help --verbose to fail. If you are using the official MariaDB Docker container for 10.3, this will leave you unable to keep mysqld up and running. Messages similar to these would be visible in the Docker logs:
ERROR: mysqld failed while attempting to check config
command was: "mysqld --verbose --help --log-bin-index=/tmp/tmp.HDiERM4SPx"
2018-08-15 13:38:15 0 [Note] Plugin 'FEEDBACK' is disabled.
2018-08-15 13:38:15 0 [ERROR] Failed to enable encryption of temporary files
2018-08-15 13:38:15 0 [ERROR] Aborting

N.B. you should be aware of the limitations for the implementation, most notably log tables and files are not encrypted and may contain data along with any query text.

One of the key features supported by MariaDB that is not yet supported by the other providers is the automatic, parallel encryption of tables that occurs when simply enabling innodb_encrypt_tables. This avoids the need to mark the existing tables for encryption using ENCRYPTED=YES, although at the same time it also does not automatically add the comment, and so you would not see this information. Instead, to check for encrypted InnoDB tables in MariaDB you should check information_schema.INNODB_TABLESPACES_ENCRYPTION, an example query being:
mysql> SELECT SUBSTRING_INDEX(name, '/', 1) AS db_name,
   ->   SUBSTRING_INDEX(name, '/', -1) AS db_table,
   ->   COALESCE(ENCRYPTION_SCHEME, 0) AS encrypted
   -> FROM information_schema.INNODB_SYS_TABLESPACES
   -> LEFT JOIN INNODB_TABLESPACES_ENCRYPTION USING(name);
+---------+----------------------+-----------+
| db_name | db_table             | encrypted |
+---------+----------------------+-----------+
| mysql   | innodb_table_stats   |      1    |
| mysql   | innodb_index_stats   |      0    |
| mysql   | transaction_registry |      0    |
| mysql   | gtid_slave_pos       |      0    |
+---------+----------------------+-----------+

As can be inferred from this query, the system tables in MariaDB 10.3 are still predominantly MyISAM and as such cannot be encrypted.

MySQL

Whilst the enterprise version of MySQL has support for a number of data at-rest encryption features as of 5.7, most of them are not available to the community edition. The latest major release of the community version sees the main feature set comprise:

The enterprise edition adds the following extra support:

Maximising at-rest encryption with MySQL 8.0

Using the following configuration would give you maximum at-rest encryption with MySQL 8.0:

early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql/keyring
innodb_redo_log_encrypt=ON
innodb_undo_log_encrypt=ON

This configuration would provide the following at-rest protection:

  • optional InnoDB tablespace encryption
  • redo and undo log encryption

You would need to create new tables, or alter existing tables, with the ENCRYPTION=Y option, which would then be visible by examining information_schema.INNODB_TABLESPACES, an example query being:
mysql> SELECT TABLE_SCHEMA AS db_name,
   ->    TABLE_NAME AS db_table,
   ->    CREATE_OPTIONS LIKE '%ENCRYPTION="Y"%' AS encrypted
   -> FROM information_schema.INNODB_TABLESPACES ts
   -> INNER JOIN information_schema.TABLES t ON t.TABLE_SCHEMA = SUBSTRING_INDEX(ts.name, '/', 1)
   ->                                        AND t.TABLE_NAME = SUBSTRING_INDEX(ts.name, '/', -1);
+---------+-----------------+-----------+
| db_name | db_table        | encrypted |
+---------+-----------------+-----------+
| sys     | sys_config      |     1     |
+---------+-----------------+-----------+
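For completeness, this is roughly what marking tables for encryption looks like in MySQL 8.0 with the keyring_file plugin loaded; the table names here are just examples:

CREATE TABLE customers (id INT PRIMARY KEY, name VARCHAR(100)) ENCRYPTION='Y';
ALTER TABLE orders ENCRYPTION='Y';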

N.B. You are able to encrypt the tablespaces in 5.7, in which case you should use information_schema.INNODB_SYS_TABLESPACES, as the internal system views on the data dictionary were renamed (InnoDB Changes).

Unfortunately, whilst all of the tables in the mysql schema use the InnoDB engine (except for the log tables), you cannot encrypt them and instead get the following error:

ERROR 3183 (HY000): This tablespace can't be encrypted.

Interestingly, you are led to believe that you can indeed encrypt the general_log and slow_log tables, but this is in fact a bug (#91791).

Percona Server for MySQL

Last, but not least we have Percona Server for MySQL, which, whilst not completely matching MariaDB for features, is getting very close. As we shall see shortly, it does in fact have some interesting differences to both MySQL and MariaDB.

The current feature set for 5.7, which does indeed exceed the features provided by MySQL 5.7 and for the most part 8.0, is as follows:

Maximising at-rest encryption with Percona Server for MySQL 5.7

Using the following configuration would give you maximum at-rest encryption with Percona Server 5.7:

early-plugin-load=keyring_file.so
keyring_file_data=/var/lib/mysql-keyring/keyring
innodb_temp_tablespace_encrypt=ON
innodb_encrypt_online_alter_logs=ON
innodb_encrypt_tables=FORCE
encrypt_binlog=ON
encrypt_tmp_files=ON

This configuration would provide the following at-rest protection:

  • automatic and enforced InnoDB tablespace encryption
  • encryption of temporary files and tables
  • binary log encryption
  • encryption when performing online DDL

There are some additional features that are due for release in the near future:

  • Encryption of the doublewrite buffer
  • Automatic key rotation
  • Undo log and redo log encryption
  • InnoDB system tablespace encryption
  • InnoDB tablespace and redo log scrubbing
  • Amazon KMS keyring plugin

Just like MySQL, encryption of any existing tables needs to be specified via ENCRYPTION=Y in an ALTER; however, new tables are automatically encrypted. Another difference is that in order to check which tables are encrypted you should check the flag set against the tablespace in information_schema.INNODB_SYS_TABLESPACES, an example query being:
mysql> SELECT SUBSTRING_INDEX(name, '/', 1) AS db_name,
   ->    SUBSTRING_INDEX(name, '/', -1) AS db_table,
   ->    (flag & 8192) != 0 AS encrypted
   -> FROM information_schema.INNODB_SYS_TABLESPACES;
+---------+---------------------------+-----------+
| db_name | db_table                  | encrypted |
+---------+---------------------------+-----------+
| sys     | sys_config                |      1    |
| mysql   | engine_cost               |      1    |
| mysql   | help_category             |      1    |
| mysql   | help_keyword              |      1    |
| mysql   | help_relation             |      1    |
| mysql   | help_topic                |      1    |
| mysql   | innodb_index_stats        |      1    |
| mysql   | innodb_table_stats        |      1    |
| mysql   | plugin                    |      1    |
| mysql   | servers                   |      1    |
| mysql   | server_cost               |      1    |
| mysql   | slave_master_info         |      1    |
| mysql   | slave_relay_log_info      |      1    |
| mysql   | slave_worker_info         |      1    |
| mysql   | time_zone                 |      1    |
| mysql   | time_zone_leap_second     |      1    |
| mysql   | time_zone_name            |      1    |
| mysql   | time_zone_transition      |      1    |
| mysql   | time_zone_transition_type |      1    |
| mysql   | gtid_executed             |      0    |
+---------+---------------------------+-----------+

Here you will see something interesting! We are able to encrypt most of the system tables, including two that are of significance, as they can contain plain text passwords:

+---------+-------------------+-----------+
| db_name | db_table          | encrypted |
+---------+-------------------+-----------+
| mysql   | servers           |      1    |
| mysql   | slave_master_info |      1    |
+---------+-------------------+-----------+

In addition to the above, Percona Server for MySQL also supports using the opensource HashiCorp Vault to host the keyring decryption information using the keyring_vault plugin; utilizing this setup (provided Vault is not on the same device as your mysql service, and is configured correctly) gains you an additional layer of security.

You may also be interested in my earlier blog post on using Vault with MySQL, showing you how to store your credentials in a central location and use them to access your database, including the setup and configuration of Vault with Let’s Encrypt certificates.

Summary

There are significant differences both in terms of features and indeed variable names, but all of them are able to provide encryption of the InnoDB tablespaces that will be containing your persistent, sensitive data. The temporary tablespaces, InnoDB logs and temporary files contain transient data, so whilst they should ideally be encrypted, only a small section of data would exist in them for a finite amount of time which is less of a risk, albeit a risk nonetheless.

Here are the highlights:

+---------------------------+--------------+-----------------+--------------------+
|                           | MariaDB 10.3 | MySQL 8.0       | Percona Server 5.7 |
+---------------------------+--------------+-----------------+--------------------+
| encrypted InnoDB data     | Y            | Y               | Y                  |
| encrypted non-InnoDB data | Aria-only    | N               | N                  |
| encrypted InnoDB logs     | Y            | Y               | TBA                |
| automatic encryption      | Y            | N               | Y                  |
| enforced encryption       | Y            | N               | Y                  |
| automatic key rotation    | Y            | N               | TBA                |
| encrypted binary logs     | Y            | N               | Y                  |
| encrypted online DDL      | ?            | N               | Y                  |
| encrypted keyring         | Y            | Enterprise-only | N                  |
| mysql.slave_master_info   | N            | N               | Y                  |
| mysql.servers             | N            | N               | Y                  |
| Hashicorp Vault           | N            | N               | Y                  |
| AWS KMS                   | Y            | Enterprise-only | TBA                |
+---------------------------+--------------+-----------------+--------------------+

 

Extra reading:

 

The post Comparing Data At-Rest Encryption Features for MariaDB, MySQL and Percona Server for MySQL appeared first on Percona Database Performance Blog.

Log Buffer #554: A Carnival of the Vanities for DBAs


This Log Buffer Edition covers Cloud, Oracle, and MySQL.

Cloud:

Google Cloud is announcing an easy way to back up and replay your streaming pipeline events directly from the Cloud Console via a new collection of simple import/export templates. If you are a developer interested in data stream processing, you’ll likely find this feature very handy.

Google network engineering uses a diverse set of vendor equipment to route user traffic from an internet service provider to one of our serving front ends inside a GCP data center.

Over the last two decades, the IT profession has developed new ways of working that are intended to deliver better business value more quickly and at lower risk.

Today’s digital transformations pose a number of challenges, or certainly major changes, for finance professionals in enterprises. Yet finance is a critical player in these transformations and in the “transformed” enterprise.

This post provides an overview of launching, setting up, and configuring a Hyper-V enabled host, launching a guest virtual machine (VM) within Hyper-V running on i3.metal.

This one provides an overview of moving a common blogging platform, WordPress, running on an on-premises virtualized Microsoft Hyper-V platform to AWS, including re-pointing the DNS records.

Oracle:

Oracle has just released the Oracle Autonomous Transaction Processing Cloud Service. This is the second service of the Oracle Autonomous Database Cloud Services, after Oracle Autonomous Data Warehouse Cloud Service which launched earlier this year.

First Steps with Prometheus and Grafana on Kubernetes on Windows.

If you want to batch-process over a number of objects in Exasol, scripts that work with the data dictionary might do the trick.

Spark is a very popular environment for processing data and doing machine learning in a distributed environment. When working in a development environment, you might work on a single node. This can be your local PC or laptop, as not everyone will have access to a multi-node distributed environment. But what if you could spin up some docker images, thereby creating additional nodes for you to test out the scalability of your Spark?

Oracle VBCS allows us to build multiple flows within the application. This is great – it helps to split application logic into different smaller modules, although VBCS doesn’t offer (in the current version) declarative support to build menu structure to navigate between the flows.

MySQL:

InnoDB Parallel Flushing was introduced with MySQL 5.7 (as single-threaded flushing was no longer feasible), and is implemented as dedicated parallel threads (cleaners) which are invoked in the background once per second to do LRU-driven flushing first (in case there are no free pages, or too few) and then REDO-driven flushing.

MySQL 8.0 introduced a new feature that allows you to persist configuration changes from inside MySQL. Previously, you could execute SET GLOBAL to change the configuration at runtime, but you needed to update your MySQL configuration file in order to persist the change.
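As a quick illustration of that feature (a minimal sketch; max_connections is just an example variable, the syntax is as documented for MySQL 8.0):

-- change a dynamic variable and persist it across restarts (written to mysqld-auto.cnf)
SET PERSIST max_connections = 500;

-- review what has been persisted, and remove a setting if needed
SELECT * FROM performance_schema.persisted_variables;
RESET PERSIST max_connections;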

Both MySQL and MariaDB publish a respectable list of customers who are using their database as their core data infrastructure.

Progress information is implemented through the Performance Schema using the stage events. In version 8.0.12 there are currently seven stages that can provide this information for ALTER TABLE statements on InnoDB tables.
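A minimal sketch of how that progress can be pulled out, assuming the stage instruments and consumers are not already enabled (names as documented for MySQL 8.0):

-- enable the InnoDB ALTER TABLE stage instruments and the stage consumers
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME LIKE 'stage/innodb/alter%';
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'events_stages_%';

-- while the ALTER TABLE runs, watch its progress from another connection
SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED
  FROM performance_schema.events_stages_current;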

When loading massive amounts of data into NDB when testing the new adaptive checkpoint speed I noted that checkpoints slowed down as the database size grew.


MySQL 5.7.* and mysqli


After installing MySQL 5.7.22 and PHP 7.1.17 on Fedora 27, you need to install the mysqli library. First, verify whether the mysqli library is already installed. You can do that with the following mysqli_check.php program:


Check mysqli Install

You can test the preceding PHP program with the following URL in a browser:

http://localhost/mysqli_check.php

If the mysqli program isn’t installed, you can install it as follows by opening the yum interactive shell:

[root@localhost html]# yum shell
Last metadata expiration check: 1:26:46 ago on Wed 22 Aug 2018 08:05:50 PM MDT.
> remove php-mysql
No match for argument: php-mysql
Error: No packages marked for removal.
> install php-mysqlnd
> run
================================================================================================
 Package                 Arch               Version                   Repository           Size
================================================================================================
Installing:
 php-mysqlnd             x86_64             7.1.20-1.fc27             updates             246 k
Upgrading:
 php                     x86_64             7.1.20-1.fc27             updates             2.8 M
 php-cli                 x86_64             7.1.20-1.fc27             updates             4.2 M
 php-common              x86_64             7.1.20-1.fc27             updates             1.0 M
 php-fpm                 x86_64             7.1.20-1.fc27             updates             1.5 M
 php-json                x86_64             7.1.20-1.fc27             updates              73 k
 php-pdo                 x86_64             7.1.20-1.fc27             updates             138 k
 php-pgsql               x86_64             7.1.20-1.fc27             updates             135 k

Transaction Summary
================================================================================================
Install  1 Package
Upgrade  7 Packages

Total download size: 10 M
Is this ok [y/N]: y

After you type y and press the return key, you should see a detailed log of the installation.

After you install the mysqli library, you exit the yum interactive shell with the quit command as shown:

> quit
Leaving Shell
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.

You can now retest by re-running the mysqli_check.php program with the following URL:

http://localhost/mysqli_check.php

Image processing is not generally installed by default. You should use the following yum command to install the PHP Image processing library:

yum install -y php-gd

Or, you can use dnf (Dandified yum), like:

dnf install -y php-gd


If you encounter an error trying to render an image like this:

Call to undefined function imagecreatefromstring() in ...

The php-gd package is not enabled. You can verify the contents of the php-gd package with the following rpm command on Fedora or CentOS:

rpm -ql php-gd

On PHP 7.1, it should return:

/etc/php-zts.d/20-gd.ini
/etc/php.d/20-gd.ini
/usr/lib/.build-id
/usr/lib/.build-id/50
/usr/lib/.build-id/50/11f0ec947836c6b0d325084841c05255197131
/usr/lib/.build-id/b0/10bf6f48ca6c0710dcc5777c07059b2acece77
/usr/lib64/php-zts/modules/gd.so
/usr/lib64/php/modules/gd.so

Then, you might choose to follow some obsolete note from ten or more years ago to include gd.so in your /etc/php.ini file. That’s not necessary.

The most common reason for incurring this error is tied to migrating old PHP 5 code forward. Sometimes folks used logic like the following to print a Portable Network Graphics (png) file stored natively in a MySQL BLOB column:

  header('Content-Type: image/x-png');
  imagepng(imagecreatefromstring($image));

If it was stored as a Portable Network Graphics (png) file, all you needed was:

  header('Content-Type: image/x-png');
  print $image;

As always, I hope this helps those looking for a solution.

MySQL InnoDB Cluster 8.0.12 – avoid old reads on partitioned members


We received feedback about how a member should act when leaving the group, and the majority of users wanted a node that drops out of the group to kill all connections and shut down. I totally agree with that behavior and it's now the default in MySQL 8.0.12.

This new feature is explained in WL#11568.

Before this change, the server went into super read-only mode when dropping out of the group and allowed users connected to this server, or new connections (if you don't use the router), to read old data.

Let’s check this out in the following video:

So now in MySQL 8.0.12, there is a new option called group_replication_exit_state_action and, by default, when a node is evicted from the group it aborts mysqld (shuts down MySQL). This terminates all user connections and avoids stale reads… Let’s have a look at this in action:

Hey!? What happened here? We were still able to read old data if we made a new connection directly to the server! Of course this is not recommended. All existing connections have been killed, those connected via the router or directly, but we are still able to connect again directly to the server and read data. This is of course a good reason to always use the router (or another routing/proxy solution that monitors the cluster).
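For reference (this is not part of the original demo), the option itself can be inspected and changed from SQL. A minimal sketch, with the value names taken from the MySQL documentation:

-- check what a member does when it is evicted from the group
SELECT @@global.group_replication_exit_state_action;

-- ABORT_SERVER shuts mysqld down; READ_ONLY only switches it to super_read_only
SET GLOBAL group_replication_exit_state_action = 'ABORT_SERVER';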

But can we also avoid this ?

Yes of course ! In fact this is not a MySQL issue per se, but it is systemd that restarts the mysqld process. This is the default behavior.

Let’s see how to change this and how it acts:

As you could notice, once we have configured systemd to not restart mysqld on failure, it works as expected.

Enjoy MySQL InnoDB Cluster and don’t forget to register to Oracle Open World if you want to learn more about MySQL 8.0 and InnoDB Cluster !

Vitess Almost Weekly Digest


This week, we continue the digest from the Slack discussions for Jul 25 2018 to Aug 2 2018 .

Update stream


Jian [Jul 25th at 1:27 PM]
hi there, I'm new to Vitess, now I'm following the user-guide from vitess.io to explore vitess, in update stream section, I notice they have change log, where could I see these change logs so I can have a better understanding of the update stream?

sougou 
That's the only documentation we have about the update stream, but we'll be fixing docs for all vitess very soon.

Jian 
sure sure, thank you very much!

vamsi 
@sougou even if documentation is not ready yet, is there some info you can provide to Jian about where he can see change logs?

sougou 
The end to end test can actually be handy. Let me get the link.

sougou 

sougou 

Jian [1 day ago]
:+1:

Fixing a failed MigrateServedTypes

Vidhi [2:10 AM]
Hi

for slave rollback, this will work? ./lvtctl.sh MigrateServedTypes -reverse test_keyspace/0  rdonly

sougou [6:36 AM]
yes. that should work

Vidhi [6:48 AM]
If in case some error came during master switch, as for rollback (no reads and writes are happening), if I update the old master end-point in zookeeper . Will it work?

sougou [6:50 AM]
i think you have to manually repair. can you show me where it failed?

Vidhi [6:50 AM]
It didnt failed yet. I havent done the switch. Just want to figure out rollback plan if something went wrong
Can you please elaborate on manually repair. How to do that?

sougou [6:51 AM]
let me look it up

for master switch, what vtctld does is the following:
set a shard control record to disable query service on source master, and issue a refresh which also sets the source master read-only (edited)
then waits for replication to catch up.
Once caught up, it sets the shard control record to enable query service on destination masters, and issue a refresh on destination masters that makes them read-write.
If there is a failure in the middle, you have to manually do or undo the setting of the tablet control
using SetShardTabletControl command
and then issue a RefreshStateByShard to the relevant tablets
i'm working on improving this part: https://github.com/vitessio/vitess/pull/4034
sougou
#4034 vreplication: change to use new _vt.vreplication
This change deprecates _vt.blp_checkpoint in favor  
of vreplication, which stands for Vitess Replication.  
The goal is to make vreplication a standalone sub-module  
of vitess that other services can use, including the  
resharding worflow.

The big change in the model is that vreplication is not owned by the resharding workflow. The workflow instead creates vreplication streams as needed, and controls them individually. The stream id for a replication is now generated by vreplication, which the resharding workflow stores and tracks.

This also means that a vreplication stream can be directly created and managed by anyone as needed. This allows for newer and more flexible workflows in the future.

Vidhi [7:00 AM]
Can you share the complete command to do these steps. I coulnt find it vitess docs

sougou [7:01 AM]
vtctl -h gives me this: SetShardTabletControl [--cells=c1,c2,...] [--blacklisted_tables=t1,t2,...] [--remove] [--disable_query_service]
to enable query service, you probably should use --remove
to disable --disable_query_service
I haven't used these myself. So, you should test them out yourself to make sure they work as intended.
You can try it out on the source master while it's serving queries to see if it stops serving
and re-enable it with --remove 

Vidhi [7:04 AM]
Sure, will try this setup on stage first.
Thank you very much for the help :)

Reading from replicas


skyler [Jul 31]
Does vtgate support rewrite rules similar to ProxySQL? We’re using ProxySQL to send queries to a replica if it’s not too laggy.

Does vtgate, or some other component of the stack, support something similar?


I haven’t found much in docs and via google, so I assume no, but I thought I’d ask anyway.


sougou
@skyler can you give an example?


skyler
The actual config is pretty lengthy, but what we’re doing is matching for the string `/*SLAVE OK*/` at the beginning of every query. If that string exists, then we route a query to a read replica if it’s replication lag is less than some threshold. If a replica’s replication lag is greater than the threshold, ProxySQL “shuns” it, which means that it removes the replica from the list of replicas that are available for querying.


sougou
this is supported differently by vitess


sougou
you can specify db name as `db@replica`


sougou
and the tolerances you mention can be specified to vttablet


skyler

Oh interesting, that’s very cool.


Reconstructing zk data


vamsi [Jul 31st]

Do people who use vitess with ZK generally backup ZK data regularly? If not, what would happen if ZK data is somehow corrupt or if ZK dies for some unexpected reason?

sougou

zk data can be reconstructed if needed.

it's mostly metadata about keyspaces and shards
but it's still a good idea to back it up

vamsi
any tools that can reconstruct it?

sougou
to manually reconstruct? they would be the vtctl commands like `CreateKeyspace` etc.
You could probably write a shell script to do this
Will be interesting if we could do a feature that generates this.

ameet
@vamsi we are using consul.  We backup the vitess metadata every 30 mins. It has saved us at least once where an operator deleted the metadata by mistake. Also, we manually backup before doing the cutover operation for a shard split

sougou
If you loose all data. I think these steps will also work:
1. Recreate all the cells
2. Restart all vttablets
3. Perform `TabletExternallyReparented` on all master tablets
Your system should be pretty much restored to the old state.


Are primary keys needed



faut [Aug 1st]

Is it imperative for tables to have a primary key in vitess?

derekperkins

it’s pretty much imperative in MySQL to have a PK, but I don’t think Vitess adds any more need for it. Are you wanting to run sharded or non-sharded?
faut
non-sharded. We have some tables that don’t have PKs, and vitess throws `cannot identify primary key of statement` on updates and inserts.

sougou
it will work if you change mysql to RBR


Can sequence tables be in a sharded keyspace




captaineyesight [Aug 1st]

Hi. I’m looking at sequences and I’m a little confused. Lets say I have a sharded cluster: foo 00-80 and foo 80-FF. In foo, I have a table named bar that has a lovely vschema that splits it between shards. Where does the bar_seq table go? 00-80 or 80-FF or should it be in a completely different place?

weitzman

The sequence table does not need to be in the same keyspace. The vitess examples tend to use a keyspace called “lookup” or something like that
The sequence table only has one row, so if you put it in the same keyspace it would end up in whatever tablet the primary key “0” maps to
If someone really didn’t want to go through the trouble of having multiple keyspaces there might be an argument to do that, but under normal circumstances you’d probably want the sequences in an unsharded keyspace

captaineyesight
thanks

sougou (update)
Submitted https://github.com/vitessio/vitess/pull/4134: vschema: allow pins in vschema. This allows you to pin a table to a specific shard by assigning a keyspace id to it.


Creating replicas for devs


faut [Aug 2nd]
What are the suggestion for devs in minikube and simulating the effects of vitess (Assuming they will just run mysql with a DB named the same as the keyspace? So if they write toxic queries they know before they get to a staging environment etc. And is it possible to dump a keyspace(sharded/unsharded) so you can replicate that in a standalone mysql? ie: Is it possible to migrate out of vitess? (edited)

sougou
@faut I don't fully understand the question. Are you talking about migrating into vitess, or out? To migrate out, you can just start sending queries directly to the mysql instances and tear down the vitess components. You could also replicate the data out and failover.

faut
:+1: Makes sense. But we’d need to rebuild/revert all the sharding?

sougou
Or reimplement sharding at the app layer
If mysql can handle, you can also merge back all the shards into one

faut
And do you have suggestions for how to ‘replicate’ the database for devs. Or what to do for a dev environment? running vitess locally seems overkill.

sougou
If it's just to make the data available to devs, you can always setup a standalone replica from a vitess master.

faut
How can I restore that standalone from the backups created from vitess backup?

sougou
yeah. you can restore from those backups and point the restored db to the master
if you're lazy, you could make vitess do it for you
bring up a replica vttablet. once it's brought up, kill just the vttablet (and delete its tablet record)

faut
is manually restoring the data just a case of copying the GCS bucket to datadir?

sougou
i believe so (don't know the mechanism for GCS)
vitess copies the data files into the datastore as files
so, if bucket==file, it should work the same way in reverse

faut
cool. Then theoretically I should be able to make a backup by just copying the files there. Then restoring from that on vttablet.

sougou
should work

faut
Thank you, I’ve got a couple of ideas I will try.



Hackathon!


raj.veerappan [Aug 2nd]

Another question on https://vitess.io/overview/scaling-mysql/#migrating-production-data-to-vitess


In that approach, you'd enable MySQL replication from your source database to the Vitess master database.


In the replication approach, does "Vitess master database" mean use the VTGate as the replication slave? Or the VTTablet of the master or the mysql of the master? If it's mysql of the master, does that populate the schema properly in Vitess?
faut
hey raj, if you’re planning to do a production migration to vitess maybe we can chat. We’re also planning to move to vitess so we’re struggling through similar issues.

raj.veerappan
I'm just doing this for a hackathon to prove things out and see if it'll work for us

sougou
People have adopted more approaches than those mentioned in that write-up. We need to update it with the new strategies

sougou
Dual-writes seems to be a popular approach
In that particular descrption, I think it meant mysql->mysql

raj.veerappan
what happens to the schema in that case?
I guess I thought updates to the schema have to go through vtgate

sougou
not necessary
even after you're fully migrated to vitess, you can deploy schema changes directly to the mysqls
and people often do, using tools like gh-ost, etc
the `ApplySchema` is just a convenience

raj.veerappan
hmm, ok, I made that assumption because one approach I tried was to copy over the data files from my non-vitess mysql to the data directories of the vitess mysql instances. Then when I fired up vtgate and used the mysql command line client to inspect the db, I could see all the tables were there
but when I tried to select rows from a table, vtgate complained that it didn't recognize the table

sougou
ohh. you still need a `vschema`
something that describes how your shards are layed out
https://vitess.io/user-guide/vschema/

sougou [18 days ago]
if the target db is not sharded yet, the vschema is a simple json that lists the table names

raj.veerappan
nice! thank you, will try that now

sougou
examples/demo/schema/lookup/vschema.json
{
 "sharded": false,
 "tables": {
   "user_seq": {
     "type": "sequence"
   },
   "music_seq": {
     "type": "sequence"
   },
   "name_keyspace_idx": {}
 }
}
tables should have no types. the `sequence` tables are special case

raj.veerappan
right was gonna say, I didn't think I needed to create those until I sharded things

sougou
vitess will work without a vschema as long as there's only one keyspace, because it knows there's only one
as soon as you have more than one, it needs to know where to route the queries

raj.veerappan
when you say work without a vschema, will it function purely as a "connection pool" or will it still need to parse the queries and will only support the statements it supports?

sougou
it will still do some work, but most queries will just be passed through

raj.veerappan
one of the reasons I tried copying over the data files directly was that when I tried restoring from a mysqldump vtgate complained that it couldn't handle one of the insert statements to a many-to-many mapping table because it didn't understand the primary key

sougou
it's probably because the mysqls are setup as SBR
we recommend RBR now. Hopefully we can deprecate SBR support soon :slightly_smiling_face:

raj.veerappan
oh interesting, I didn't realize that would affect mysqldump

faut
would the vschema tables just be: `tables: { user: {} }`?

sougou
`"user":...` yeah

raj.veerappan
will retry importing using mysql dump after switching all the vitess instances to RBR, seems easier than creating that json

faut
raj, are you working in GCP or baremetal?

sougou
if it's a single keyspace, you shouldn't need that json (irrespective of how you do the import)

raj.veerappan
@sougou I think I may just be in a weird state right now because the mysql import failed halfway, will start over after wiping things out and see if I can just copy the data files over without doing anything with vschema

raj.veerappan
@faut I'm just doing baremetal for the hackathon, if we start using it in production it would be with k8s/AWS (edited)

faut
I had the same problems when I mounted the datadir for a single database. It showed all the tables if i did `show tables` it showed everything. But any query would say. the table didnt exist. Even direct to mysql

sougou
it may be related to vttablet not having reloaded the schema
vttablet reloads the schema every X minutes

faut
I did a vschema reload. But the problem is with mysql. Because even when querying directly it would fail

sougou
this is vttablet seeing the table. vschema is for vtgate (edited)

raj.veerappan
vttablet reloads the schema every X minutes
is there a way to force this?

sougou
yeah. `vtctl ReloadSchema`

faut
raj, if you come right with the datapath mounting please let me know. I couldn’t get it to work

sougou
there is a way to make vttablet auto-detect by making it watch the replication stream. most people prefer not to use that feature
i think the flag is `-enable_replication_watcher` (not at my comp)

raj.veerappan
I wiped everything out and restarted and copied the data files over, when I login through vtgate I see the tables but in the UI for vitess the schema says empty and I'm not able to select from any of the tables in mysql client connected to vtgate
did `vtctl ReloadSchema` against my master vttablet but the schema did not populate in web UI
so will try using the json and enumerate the table names
actually, will switch all the vitess mysql instances to RBR and try loading from mysqldump first
nice, that seems to be the way to go, only problem now is that our mysqldump has tables with foreign key constraints on tables that are defined further down in the dump and vtgate doesn't like that, will need to edit the dump and reorder the create table statements

sougou
whatever works :slightly_smiling_face:

Raj.veerappan
problem is that it seems like vtgate does not support disabling foreign key checks for loading from dump

raj.veerappan
even trying to disable for session throws

```mysql> set foreign_key_checks=0;
ERROR 1105 (HY000): vtgate: http://localhost:15001/: unsupported construct: set foreign_key_checks=0```
(edited)

raj.veerappan
well, I found a janky workaround that makes this easy, create a schema only mysqldump, open up mysql cli onto vtgate, run `source ` repeatedly until the table count stabilizes. Then source your data only dump, super janky but it works for my hackathon :slightly_smiling_face:
I made it work the proper way, didn't realize I just needed to load the mysqldump directly against the vitess mysql master instance and reloadschema and everything would "just work"

sougou
yeah. that would be the best.

faut
The problem for me with the mysqldump is the downtime. Snapshotting a disk and using it as a mount is much quicker. I have got things to work with the mysqldump. Just trying to figure out the best way to migrate in production.








Configuring the app to use VTGate

Sean Gillespie [Aug 2nd]

Is there documentation on setting up an app to use vtgate?  I can’t find much beyond saying the apps can use it like MySQL
sougou
there's not much to it. just point the app at vtgate on the mysql port
https://vitess.slack.com/archives/C0PQY0PTK/p1527271545000268
Command to connect to vtgate: `mysql -h 127.0.0.1 -P 15306 -u mysql_user --password=mysql_password`
Posted in #vitessMay 25th

if you have many vtgates, you can put them behind an ELB

Sean Gillespie
Where do you set the user/pass?

sougou
in a credentials file like this https://github.com/vitessio/vitess/blob/master/examples/local/mysql_auth_server_static_creds.json
examples/local/mysql_auth_server_static_creds.json
```{
 "mysql_user": [
   {
     "MysqlNativePassword": "*9E128DA0C64A6FCCCDCFBDD0FC0A2C967C6DB36F",
     "Password": "mysql_password",```
...
and give that to vtgate (look at vtgate-up.sh) in that same directory

Overriding the db name


raj.veerappan [Aug 2nd]

unfortunately looks like flyway relies on `information_schema` for a bunch of logic and that's not available through vtgate

sougou
if you connect to a specific shard, vtgate will pass it through
it should be an unsharded keyspace, or something like `ks:-80`

raj.veerappan
but then the db name will be `vt_db` instead of just `db`
I'll just disable flyway for now since migrations will probably need to be reworked if we use vitess

sougou
you have another option
you can override the dbname
vttablet command line `-init_db_name_override` (edited)
and name the db as `db` instead of `vt_db`

raj.veerappan [18 days ago]
lol, that might simplify things


CopySchemaShard and foreign keys



raj.veerappan [Aug 2nd]

Seems like the `./lvtctl.sh CopySchemaShard test_keyspace/0 target/0` doesn't work if `test_keyspace` has tables with foreign keys in it

sougou
yeah. You can do a custom schema deploy in that case
it's only a convenience

raj.veerappan
is there a gist for that too :slightly_smiling_face:
I guess I only need to deploy the schema for the particular tables that I'm vertically sharding?
will just do a `show create table` on it on test_keyspace and just run directly using mysql on `target`

sougou
yup

raj.veerappan
if vtworker `cannot find MASTER tablet for destination shard for target/0` even though I did the `InitShardMaster` step, is there something else I need to do?
I see the `target` keyspace in the web ui with its shards and one tagged as master correctly

sougou
check the status page for vttablet `/debug/status` and the logs. Maybe it didn't initialize correctly

raj.veerappan
status is healthy

sougou
and it shows up as master in vtctld?

raj.veerappan
yes

sougou
the vtworker would have written a logfile
can you see if it has more info there?
can you also show me your vtworker command?

raj.veerappan
`./sharded-vtworker.sh VerticalSplitClone --tables my_table target/0`
will check the log file
the only error besides the `cannot find MASTER...` one is `proc.go:85] unexpected error on port 0: Get http://localhost:0/debug/pid: dial tcp [::1]:0: connect: can't assign requested address, trying to start anyway`

sougou
what is the full error? (that error can come from three different places)

raj.veerappan
ohh, just noticed that it was in a cell that doesn't match mine
ahh, I updated the cell name in the other scripts but not in `sharded-vtworker.sh`

sougou
that will do it :slightly_smiling_face:

raj.veerappan
that was it :slightly_smiling_face:
been at it all day, starting to miss things

sougou [18 days ago]
don't forget about `MigrateServedFrom` (not `MigrateServedTypes`)

Manual for benchmark toolset dbt2-0.37.50.15

Manual for dbt2-0.37.50.15

My career has been focused on two important aspects of DBMSs. The
first is the recovery algorithms to enable the DBMS to never be down.
The second is efficient execution of OLTP in the DBMS.

When I started a short career as a consultant in 2006 I noted that I had
to spend more and more time setting up and tearing down NDB Clusters
to perform benchmarks.

The particular benchmark I started developing was DBT2. I downloaded
the dbt2-0.37 version. I quickly noted that it was very hard to run a
benchmark in this version in an automated manner.

My long-term goal was to achieve a scenario where an entire benchmark
could be executed with only one command. This goal took several years to
achieve and many versions of this new dbt2-0.37 tree.

Since I automated more and more, the scripts I developed are layered
such that I have base scripts that execute start and stop of individual
nodes. On top of this script I have another script that can start and
stop a cluster of nodes. In addition I have a set of scripts that execute
benchmarks. On top of all those scripts I have a script that executes
the entire thing.

The dbt2-0.37.50 tree also fixed one bug in the benchmark execution and
ensured that I could handle hundreds of thousands of transactions per second
and still handle the statistics in the benchmark.

Later I also added support for executing Sysbench and flexAsynch. Sysbench
support was added by forking sysbench-0.4.12 and fixing some scalability
issues in the benchmark and added support for continuous updates of
performance (defaults to one report per 3 seconds).

Today I view the dbt2-0.37.50 tree and sysbench-0.4.12 as my toolbox.
It would be quite time consuming to analyse any new features from a
performance point of view without this tool. This means that I think
it is worthwhile to continue developing this toolset for my own purposes.

About 10 years ago we decided to make this toolset publicly available
at the MySQL website. One reason for this was to ensure that anyone
that wants to replicate the benchmarks that I report on my blog is able to
do this.

Currently my focus is on developing MySQL Cluster and thus the focus
on the development of this toolset is centered around these 3 benchmarks
with NDB. But the benchmark scripts still support running Sysbench and
DBT2 for InnoDB as well. I used to develop InnoDB performance
improvements as well for a few years (and MySQL server improvements)
and this toolset was equally important at that time.

When I wrote my book I decided to write a chapter to document this
toolset. This chapter of my book is really helpful to myself. It is easy
to forget some detail of how to use it.

Above is a link to the PDF file of this chapter if anyone wants to
try out and use these benchmark toolsets.

My last set of benchmark blogs on benchmarks of MySQL Cluster
in the Oracle Cloud used these benchmark scripts.

The toolset is not intended for production setups of NDB, but I am
sure it can be used for this with some adaptation of the scripts.

For development setups of MySQL Cluster we are developing
MySQL Cluster Configurator (MCC) sometimes called
Auto Installer.

For production setups of NDB we are developing MySQL Cluster
Manager (MCM).

Happy benchmarking :)

MySQL Shell: Built-In Help


It can be hard to recall all the details of how a program and its API work. The usual way to handle that is to look at the manual or a book. Another, and in my opinion nicer, way is to have built-in help, so you can find the information without switching between the program and a browser. This blog discusses how to obtain help when you use MySQL Shell.

MySQL Shell is a client that allows you to execute queries and manage MySQL through SQL commands and JavaScript and Python code. It is a second generation command-line client with additional WebOps support. If you have not installed MySQL Shell yet, then you can download it from MySQL’s community downloads, Patches & Updates in My Oracle Support (MOS) (for customers), or Oracle Software Delivery Cloud (for trial downloads). You can also install it through MySQL Installer for Microsoft Windows.
MySQL Shell: Get help for a table object

MySQL Shell has a very nice and comprehensive built-in help. There is of course the help output produced using the --help option if you invoke the shell from the command line:

PS: MySQL> mysqlsh --help
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Usage: mysqlsh [OPTIONS] [URI]
       mysqlsh [OPTIONS] [URI] -f <path> [script args...]
       mysqlsh [OPTIONS] [URI] --dba [command]
       mysqlsh [OPTIONS] [URI] --cluster

  -?, --help                    Display this help and exit.
  -e, --execute=<cmd>           Execute command and quit.
  -f, --file=file               Process file.
  --uri=value                   Connect to Uniform Resource Identifier. Format:
                                [user[:pass]@]host[:port][/db]
  -h, --host=name               Connect to host.
  -P, --port=#                  Port number to use for connection.
...

However, this help is not what makes MySQL Shell special. It is the help that you can see from within the shell when working in JavaScript or Python that is worth some extra attention. There is support both for general help and for obtaining help through objects.

General Help

The first layer of help is what is also known from the old mysql command-line client. A command consisting of a backslash and a ?, h, or help (\?, \h or \help) will show information about the general usage of MySQL Shell:

mysql-py> \?
The Shell Help is organized in categories and topics. To get help for a
specific category or topic use: \? <pattern>

The <pattern> argument should be the name of a category or a topic.

The pattern is a filter to identify topics for which help is required, it can
use the following wildcards:

- ? matches any single character.
- * matches any character sequence.

The following are the main help categories:

 - AdminAPI       Introduces to the dba global object and the InnoDB cluster
                  administration API.
 - Shell Commands Provides details about the available built-in shell commands.
 - ShellAPI       Contains information about the shell and util global objects
                  as well as the mysql module that enables executing SQL on
                  MySQL Servers.
 - SQL Syntax     Entry point to retrieve syntax help on SQL statements.
 - X DevAPI       Details the mysqlx module as well as the capabilities of the
                  X DevAPI which enable working with MySQL as a Document Store

The available topics include:

- The dba global object and the classes available at the AdminAPI.
- The mysqlx module and the classes available at the X DevAPI.
- The mysql module and the global objects and classes available at the
  ShellAPI.
- The functions and properties of the classes exposed by the APIs.
- The available shell commands.
- Any word that is part of an SQL statement.

SHELL COMMANDS

The shell commands allow executing specific operations including updating the
shell configuration.

The following shell commands are available:

 - \                   Start multi-line input when in SQL mode.
 - \connect    (\c)    Connects the shell to a MySQL server and assigns the
                       global session.
 - \exit               Exits the MySQL Shell, same as \quit.
 - \help       (\?,\h) Prints help information about a specific topic.
 - \history            View and edit command line history.
 - \js                 Switches to JavaScript processing mode.
 - \nowarnings (\w)    Don't show warnings after every statement.
 - \option             Allows working with the available shell options.
 - \py                 Switches to Python processing mode.
 - \quit       (\q)    Exits the MySQL Shell.
 - \reconnect          Reconnects the global session.
 - \rehash             Refresh the autocompletion cache.
 - \source     (\.)    Loads and executes a script from a file.
 - \sql                Switches to SQL processing mode.
 - \status     (\s)    Print information about the current global session.
 - \use        (\u)    Sets the active schema.
 - \warnings   (\W)    Show warnings after every statement.

GLOBAL OBJECTS

The following modules and objects are ready for use when the shell starts:

 - dba    Used for InnoDB cluster administration.
 - mysql  Support for connecting to MySQL servers using the classic MySQL
          protocol.
 - mysqlx Used to work with X Protocol sessions using the MySQL X DevAPI.
 - shell  Gives access to general purpose functions and properties.
 - util   Global object that groups miscellaneous tools like upgrade checker.

For additional information on these global objects use: <object>.help()

EXAMPLES
\? AdminAPI
      Displays information about the AdminAPI.

\? \connect
      Displays usage details for the \connect command.

\? check_instance_configuration
      Displays usage details for the dba.check_instance_configuration function.

\? sql syntax
      Displays the main SQL help categories.

This shows which commands and global objects are available. But there is more: you can also get help about the usage of MySQL Shell, such as how to use the Admin API (for MySQL InnoDB Cluster), how to connect, or the SQL syntax. The search for relevant help topics is context sensitive; for example, searching for the word select returns different results depending on the mode and whether you are connected:

  • In Python or JavaScript mode without a connection, it is noted that information was found in the mysqlx.Table.select and mysqlx.TableSelect.select categories.
  • In Python or JavaScript mode with a connection, the SELECT SQL statement is included as a category.
  • In SQL mode the actual help text for the SELECT SQL statement is returned (requires a connection).

For example, to get help about the select method of a table object:

MySQL  Py > \? mysqlx.Table.select
NAME
      select - Creates a TableSelect object to retrieve rows from the table.

SYNTAX
      Table.select(...)
           [.where([expression])]
           [.group_by(...)[.having(condition)]]
           [.order_by(...)]
           [.limit(numberOfRows)[.offset(numberOfRows)]]
           [.lock_shared([lockContention])]
           [.lock_exclusive([lockContention])]
           [.bind(name, value)]
           [.execute()]

DESCRIPTION
      This function creates a TableSelect object which is a record selection
      handler.

      This handler will retrieve all the columns for each included record.

      The TableSelect class has several functions that allow specifying what
      records should be retrieved from the table, if a searchCondition was
      specified, it will be set on the handler.

      The selection will be returned when the execute function is called on the
      handler.
...

To get help for the SELECT SQL statement:

mysql-py> \? SQL Syntax/SELECT
Syntax:
SELECT
    [ALL | DISTINCT | DISTINCTROW ]
      [HIGH_PRIORITY]
      [STRAIGHT_JOIN]
      [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
      [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
      [PARTITION partition_list]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}
      [ASC | DESC], ... [WITH ROLLUP]]
    [HAVING where_condition]
    [WINDOW window_name AS (window_spec)
        [, window_name AS (window_spec)] ...]
    [ORDER BY {col_name | expr | position}
      [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        export_options
      | INTO DUMPFILE 'file_name'
      | INTO var_name [, var_name]]
    [FOR UPDATE | LOCK IN SHARE MODE]]
    [FOR {UPDATE | SHARE} [OF tbl_name [, tbl_name] ...] [NOWAIT | SKIP LOCKED] 
      | LOCK IN SHARE MODE]]

SELECT is used to retrieve rows selected from one or more tables, and
can include UNION statements and subqueries. See [HELP UNION], and
http://dev.mysql.com/doc/refman/8.0/en/subqueries.html. A SELECT
statement can start with a WITH clause to define common table
expressions accessible within the SELECT. See
http://dev.mysql.com/doc/refman/8.0/en/with.html.

...

URL: http://dev.mysql.com/doc/refman/8.0/en/select.html


mysql-py> \sql
Switching to SQL mode... Commands end with ;

mysql-sql> \? select
Syntax:
SELECT
    [ALL | DISTINCT | DISTINCTROW ]
      [HIGH_PRIORITY]
      [STRAIGHT_JOIN]
      [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
      [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
      [PARTITION partition_list]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}
      [ASC | DESC], ... [WITH ROLLUP]]
    [HAVING where_condition]
    [WINDOW window_name AS (window_spec)
        [, window_name AS (window_spec)] ...]
    [ORDER BY {col_name | expr | position}
      [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        export_options
      | INTO DUMPFILE 'file_name'
      | INTO var_name [, var_name]]
    [FOR UPDATE | LOCK IN SHARE MODE]]
    [FOR {UPDATE | SHARE} [OF tbl_name [, tbl_name] ...] [NOWAIT | SKIP LOCKED] 
      | LOCK IN SHARE MODE]]

SELECT is used to retrieve rows selected from one or more tables, and
can include UNION statements and subqueries. See [HELP UNION], and
http://dev.mysql.com/doc/refman/8.0/en/subqueries.html. A SELECT
statement can start with a WITH clause to define common table
expressions accessible within the SELECT. See
http://dev.mysql.com/doc/refman/8.0/en/with.html.
...

Note here how it is possible to get the help for the SELECT statement both from the Python (and JavaScript) as well as SQL modes, but the search term is different.

Tip: To get information about SQL statements, you must be connected to a MySQL instance.

When you use the JavaScript or Python modes there is another way to get  help based on your object. Let’s look at that.

Object Based Help

If you are coding in MySQL Shell using JavaScript or Python, it may happen that you need a hint on how to use a given object, for example a table object. You can use the method described in the previous section to get help by searching for mysqlx.Table; however, you can also access the help directly from the object.

All of the X DevAPI objects in MySQL Shell have a help() method that you can invoke to get help for the object. For example, if you have an object named city for the city table in the world schema, then calling city.help() returns information about the table object:

mysql-py> \use world
Default schema `world` accessible through db.

mysql-py> city = db.get_table('city')
mysql-py> city.help()
NAME
      Table - Represents a Table on an Schema, retrieved with a session created
              using mysqlx module.

DESCRIPTION
      Represents a Table on an Schema, retrieved with a session created using
      mysqlx module.

PROPERTIES
      name
            The name of this database object.

      schema
            The Schema object of this database object.

      session
            The Session object of this database object.

FUNCTIONS
      delete()
            Creates a record deletion handler.

      exists_in_database()
            Verifies if this object exists in the database.

      get_name()
            Returns the name of this database object.

      get_schema()
            Returns the Schema object of this database object.

      get_session()
            Returns the Session object of this database object.

      help([member])
            Provides help about this class and it's members

      insert(...)
            Creates TableInsert object to insert new records into the table.

      is_view()
            Indicates whether this Table object represents a View on the
            database.

      select(...)
            Creates a TableSelect object to retrieve rows from the table.

      update()
            Creates a record update handler.

As you can see, the built-in help in MySQL Shell is a powerful resource. Make sure you use it.

Another Post on the Percona Community Blog, Bug Activities, and Percona Live Europe

I published another article on the Percona Community Blog. This time, it is about Semi-Synchronous Replication. You can read the post here: Question about Semi-Synchronous Replication: the Answer with all the Details. I previously wrote about my motivation to publish on the Percona Community Blog. Things have not changed: I still believe it is a great community initiative that I want to

Configuring InnoDB Thread Concurrency for Performance


InnoDB depends on operating system threads to process requests from user transactions; these transactions include requests to InnoDB before commit or rollback. On modern operating systems and servers with multi-core processors, where context switching is efficient, most workloads run well without any limit on the number of concurrent threads. InnoDB can also efficiently control the number of concurrently executing operating system threads (and thus the number of requests that are processed at any one time) to minimize context switching between threads. If the number of concurrently executing threads is at the pre-defined limit, a new request sleeps for a short time before it tries again. A request that cannot be rescheduled after the sleep is put in a first-in/first-out queue and is eventually processed. Threads waiting for locks are not counted in the number of concurrently executing threads. To limit the number of concurrent threads, you can configure the MySQL system variable innodb_thread_concurrency (system variables are explained well in the MySQL documentation here). Once the number of executing threads reaches the limit, additional threads sleep for the number of microseconds configured in innodb_thread_sleep_delay before being placed into the queue.

The default value of innodb_thread_concurrency is 0, which means there is no limit on the number of concurrently executing threads. To reduce excessive context switching you can set innodb_thread_concurrency > 0. Now, how do you size innodb_thread_concurrency? There are several factors we consider before recommending a value, such as your CPU, memory, Linux distribution/kernel, your application's database architecture, and other variables like innodb_adaptive_hash_index (AHI). So we recommend that you benchmark thoroughly before setting a value for innodb_thread_concurrency.

innodb_thread_sleep_delay is applicable only when innodb_thread_concurrency is greater than 0. That is also why MySQL works faster on servers with multi-core, high-performance CPUs compared to several servers with moderately performing CPUs.

When you have configured innodb_thread_concurrency > 0, SQL statements that comprise multiple row operations are assigned a specific number of tickets by InnoDB, specified in the global system variable innodb_concurrency_tickets (default 5000), which allows the thread to be scheduled repeatedly with minimal resource usage. If the tickets run out, the thread is evicted and innodb_thread_concurrency is observed again, which may eventually move the thread back into the FIFO (first-in, first-out) queue of waiting threads.
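To make this concrete, here is a minimal sketch of setting these variables at runtime (the numbers are purely illustrative, not recommendations; benchmark before adopting anything):

-- cap the number of concurrently executing threads inside InnoDB
SET GLOBAL innodb_thread_concurrency = 16;

-- microseconds a thread sleeps before it is placed in the FIFO queue
SET GLOBAL innodb_thread_sleep_delay = 10000;

-- number of tickets a statement gets before it must re-enter the queue
SET GLOBAL innodb_concurrency_tickets = 5000;

-- verify the current values
SHOW GLOBAL VARIABLES LIKE 'innodb_thread%';
SHOW GLOBAL VARIABLES LIKE 'innodb_concurrency_tickets';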

The post Configuring InnoDB Thread Concurrency for Performance appeared first on MySQL Consulting, Support and Remote DBA Services.


Webinar Tuesday, 8/28: Forking or Branching – Lessons from the MySQL Community


Please join Percona’s CEO, Peter Zaitsev, as he presents Forking or Branching – Lessons from the MySQL Community on Tuesday, August 28th, 2018 at 7:00 AM PDT (UTC-7) / 10:00 AM EDT (UTC-4).

 

The MySQL Community offers a great example of various forks and branches, with MariaDB being the most well-known fork, and companies like Percona, Facebook and Alibaba maintaining their own branches.

In this presentation we will look at the history of MySQL, the causes of MySQL forking and branching, and discuss the benefits and drawbacks of both approaches, using specific examples from the MySQL ecosystem.

Register for the webinar.

Peter Zaitsev, CEO and Co-Founder

Peter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 140 professionals in 30 plus countries, Peter’s venture now serves over 3000 customers – including the “who’s who” of internet giants, large enterprises and many exciting startups. Inc. 5000 named Percona to their list in 2013, 2014, 2015 and 2016. Peter was an early employee at MySQL AB, eventually leading the company’s High-Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University where he majored in Computer Science. Peter is a co-author of High-Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Database Performance Blog. He has also been tapped as a contributor to Fortune and DZone, and his ebook Practical MySQL Performance Optimization is one of Percona’s most popular downloads.

 

The post Webinar Tuesday, 8/28: Forking or Branching – Lessons from the MySQL Community appeared first on Percona Database Performance Blog.

MySQL 8: Load Fine Tuning With Resource Groups


MySQL Resource Groups, introduced in MySQL 8, provide the ability to manipulate the assignment of running threads to specific resources, thereby allowing the DBA to manage application priorities. Essentially, you can assign a thread to a specific virtual CPU. In this post, I’m going to take a look at how these might work in practice.

Let us start with a disclaimer.

What I am going to discuss here is NOT common practice. This is advanced load optimization, and you should approach/implement it ONLY if you are 100% sure of what you are doing, and, more importantly, if you know what you are doing, and why you are doing it.

Overview

MySQL 8 introduced a feature that is explained only in a single documentation page. This feature can help a lot if used correctly, and hopefully they will not deprecate or remove it after five minutes. It is well hidden in the Optimization: Optimizing the MySQL Server chapter.

I am talking about resource groups. Resource groups permit assigning threads running within MySQL to particular groups so that threads execute according to the resources available to this group. Group attributes enable control over resources to enable or restrict resource consumption by threads in the group. DBAs can modify these attributes as appropriate for different workloads.

Currently, CPU affinity (i.e., assigning to a specific CPU) is a manageable resource, represented by the concept of “virtual CPU” as a term that includes CPU cores, hyperthreads, hardware threads, and so forth. MySQL determines, at startup, how many virtual CPUs are available. Database administrators with appropriate privileges can associate virtual CPUs with resource groups and assign threads to these groups.
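For reference, this is what group creation and thread-level assignment look like in SQL. A minimal sketch, where the group name, VCPU range and thread id are placeholders (the thread id is the one from performance_schema.threads, not the processlist id):

-- create a group pinned to two virtual CPUs with a reduced priority
CREATE RESOURCE GROUP rg_batch TYPE = USER VCPU = 2-3 THREAD_PRIORITY = 10;

-- find the performance_schema thread id of a running connection
SELECT THREAD_ID, PROCESSLIST_ID, PROCESSLIST_USER
  FROM performance_schema.threads
 WHERE PROCESSLIST_USER = 'app2';

-- assign that thread to the group
SET RESOURCE GROUP rg_batch FOR 1234;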

In short, you can define that a specific thread (ergo a connection, unless you use connection pooling or ProxySQL with multiplexing) will run on a given CPU and with a given priority.

Setting this by thread can be:

  1. Dangerous
  2. Not useful

Dangerous, because if you set this on a thread when using connection pooling or ProxySQL with multiplexing, you may end up assigning the limitation to queries that you instead wanted to run efficiently.

Not useful because unless you spend the time looking at the processlist (full), and/or have a script running all the time that catches what you need, 99% of the time you will not be able to assign the group efficiently.

So? Another cool useless feature???

Nope…

Resource groups can be referenced inside a single statement, which means I can have only that query utilizing that resource group. Something like this will do the magic:

SELECT /*+ RESOURCE_GROUP(NAME OF THE RG) */ id, millid, date,active,kwatts_s FROM sbtest29 WHERE id=44

But if I run:

SELECT id, millid, date,active,kwatts_s FROM sbtest29 WHERE id=44

No resource group utilization even if I am using the same connection.

This is cool, isn’t it?

What is the possible usage?

In general, you can see this as a way to limit the negative impact of queries that you know will be problematic for others.

Good examples are:

  • ETL processes for data archiving, reporting, data consolidation and so on
  • Applications that are not business critical and can wait, while your revenue generator application cannot
  • GUI Client applications, used by some staff of your company, that mainly create problems for you while they claim they are working.

“Marco, that could make sense … but what should I do to have it working? Rewrite my whole application to add this feature?”

Good question! Thanks!

We can split the task of having a good Resource Group implementation into three steps:

  1. You must perform an analysis of what you want to control. You need to identify the source (such as the client IP, if it is fixed, or the username) and design the settings you want for your resource groups. Identify if you only want to reduce the CPU priority, or if you want to isolate the queries on a specific CPU, or a combination of the two.
  2. Implement the resource groups in MySQL.
  3. Implement a way to inject the string comment into the SQL.

About the last step, I will show you how to do this in a straightforward way with ProxySQL, but hey… this is really up to you. I will show you the easy way, but if you prefer a more difficult route, that’s good for me too.

The Setup

In my scenario, I have a very noisy secondary application written by a very, very bad developer that accesses my servers, mostly with read queries, and occasionally with write updates. Reads and writes are obsessive and create an impact on the MAIN application. My task is to limit the impact of this secondary application without having the main one affected.

To do that I will create two resource groups, one for WRITE and another for READ.

The first group, Write_app2, will have no CPU affinity, but will have the lowest (19) priority:

CREATE RESOURCE GROUP Write_app2 TYPE=USER THREAD_PRIORITY=19;

The second group, Select_app2, will have CPU affinity AND the lowest priority:

CREATE RESOURCE GROUP Select_app2 TYPE=USER VCPU=5 THREAD_PRIORITY=19;

Finally, I have identified that the application is connecting from several sources BUT it uses a common username APP2. Given that, I will use the user name to inject the instructions into the SQL using ProxySQL (I could have also used the IP, or the schema name, or destination port, or something in the submitted SQL. In short, any possible filter in the query rules).

To do that I will need four query rules:

insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(80,6033,'app1',80,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(81,6033,'app1',81,1,3,'^SELECT.*',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(82,6033,'app2',80,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(83,6033,'app2',81,1,3,'^SELECT.*',1);

To identify and redirect the query for R/W split.

INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,comment) VALUES (32,1,'app2',"(^SELECT)\s*(.*$)","\1 /*+ RESOURCE_GROUP(Select_app2) */ \2 ",0,"Lower prio and CPU bound on Reader");
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,comment) VALUES (33,1,'app2',"^(INSERT|UPDATE|DELETE)\s*(.*$)","\1 /*+ RESOURCE_GROUP(Write_app2) */ \2 ",0,"Lower prio on Writer");

And a user definition like:

insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('app2','test',1,80,'mysql',1);
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('app1','test',1,80,'mysql',1);

One important step you need to do ON ALL the servers you want to include in the Resource Group utilization is to be sure you have the CAP_SYS_NICE capability set.

On Linux, resource group thread priorities are ignored unless the CAP_SYS_NICE capability is set. MySQL package installers for Linux systems should set this capability. For installation using a compressed tar file binary distribution or from source, the CAP_SYS_NICE capability can be set manually using the setcap command, specifying the pathname to the mysqld executable (this requires sudo access). You can check the capabilities using getcap. For example:

shell> sudo setcap cap_sys_nice+ep <Path to your mysqld executable>
shell> getcap ./bin/mysqld
./bin/mysqld = cap_sys_nice+ep

If manual setting of CAP_SYS_NICE is required, then you will need to do it every time you perform a new install.

As a reference here is a table about CPU priority:

Priority Range Windows Priority Level
-20 to -10 THREAD_PRIORITY_HIGHEST
-9 to -1 THREAD_PRIORITY_ABOVE_NORMAL
0 THREAD_PRIORITY_NORMAL
1 to 10 THREAD_PRIORITY_BELOW_NORMAL
11 to 19 THREAD_PRIORITY_LOWEST
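
To tie the table back to the SQL: USER groups like the two created here can only take priorities from 0 to 19 (normal to lowest), while the negative, higher-priority range is reserved for SYSTEM groups. A quick sketch with illustrative group names:

CREATE RESOURCE GROUP Batch_low TYPE=USER THREAD_PRIORITY=19;     -- fine: lowest priority for a USER group
CREATE RESOURCE GROUP Sys_boost TYPE=SYSTEM THREAD_PRIORITY=-10;  -- negative values are for SYSTEM groups only
-- CREATE RESOURCE GROUP Bad_idea TYPE=USER THREAD_PRIORITY=-5;   -- rejected: out of range for a USER group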

 

Summarizing here the whole set of steps on my environment:

1) Check the CAP_SYS_NICE

getcap /opt/mysql_templates/mysql-8P/bin/mysqld
setcap cap_sys_nice+ep /opt/mysql_templates/mysql-8P/bin/mysqld

2) Create the user in MySQL and resource groups

create user app2@'%' identified by 'test';
GRANT ALL PRIVILEGES ON `windmills2`.* TO `app2`@`%`;
CREATE RESOURCE GROUP Select_app2 TYPE=USER VCPU=5 THREAD_PRIORITY=19;
CREATE RESOURCE GROUP Write_app2 TYPE=USER THREAD_PRIORITY=19;

To check :

SELECT * FROM INFORMATION_SCHEMA.RESOURCE_GROUPS;
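
Besides the INFORMATION_SCHEMA view, once some App2 traffic is flowing it is worth confirming that its threads really pick up the groups. A minimal check, assuming the app2 user from this setup:

SELECT THREAD_ID, PROCESSLIST_USER, PROCESSLIST_INFO, RESOURCE_GROUP
  FROM performance_schema.threads
 WHERE PROCESSLIST_USER = 'app2';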

3) Create ProxySQL user and rules

insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('app2','test',1,80,'mysql',1);
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('app1','test',1,80,'mysql',1);
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(83,6033,'app2',80,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(84,6033,'app2',81,1,3,'^SELECT.*',1);
insert into mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) values(85,6033,'app2',80,0,3,'.',1);
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,comment) VALUES (32,0,'app2',"(^SELECT)\s*(.*$)","\1 /*+ RESOURCE_GROUP(Select_app2) */ \2 ",0,"Lower prio and CPU bound on Reader");
INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply,comment) VALUES (33,0,'app2',"^(INSERT|UPDATE|DELETE)\s*(.*$)","\1 /*+ RESOURCE_GROUP(Write_app2) */ \2 ",0,"Lower prio on Writer");
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

For several reasons I will add the resource groups query rules as INACTIVE for now.
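
Keeping them in place but inactive makes flipping the feature on and off trivial from the ProxySQL admin interface when the tests (or a maintenance window) call for it; something along these lines:

UPDATE mysql_query_rules SET active=1 WHERE rule_id IN (32,33);
LOAD MYSQL QUERY RULES TO RUNTIME;
-- and back off once done
UPDATE mysql_query_rules SET active=0 WHERE rule_id IN (32,33);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;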

Done…

Testing

Will this work?
We need to see the impact of the bad application on my production application.
Then we need to see IF implementing the tuning will work or not.

To do a basic check I run four tests:

  • test 1: run both apps with read/write traffic and the RG rules disabled
  • test 2: run one application at a time, without RG
  • test 3: run only App2 with RG, to see the cost on its execution
  • test 4: run both apps, to see what happens with RG

Test 1

Master

Test 1 master 1 current CPU core utilization

Slave

This test aims to give us an idea, right away, of what happens when both applications are running without limits.
As we can see during the test, all cores are utilized, some more consistently and some a bit less so, but nothing huge.
What is interesting is to see the effect on the response time and the number of events each application can execute:

The execution graph indicates a very high execution time for Insert and Delete for App1, with the results showing very bad performance: only nine inserts, 1333 deletes, and 165 selects.

But what is the application actually supposed to do? Test 2 will tell us, creating our de facto baseline.

Test 2

In this test I ran each application separately, so there was no interference.
Master App1

Test 2 master 1 current CPU core utilization

Master App2

Slave App1

Test 2 slave1 current CPU core utilization

Slave App2

Test 2 slave 2 current CPU core utilization

Nothing significantly different in the CPU utilization when App1 was running, while we can see a bit less utilization in the case of App2.

The impact on the performance is, however, more significant:

Execution time for inserts and deletes drops significantly for App1, and we can see that the application SHOULD be able to insert ~1320 events and perform a significantly higher number of operations. The same goes for App2, but here we care more about the OLTP application than the ETL one.

So, what will happen to App2 (ETL) performance IF we activate the Resource Group rules? Let’s see with test 3.

Test 3

Running only App2 with active resource groups
Master App2

Slave App2

On the master, the RG settings just reduce the priority; given that no other process is running and no other application is connected, the impact is not high.
On the other hand, on the slave we can clearly see that App2 can now only use core 5, as indicated in our configuration.
So far so good; what will the performance loss be? Let’s see:

Comparing tests 2 and 3, we can see that applying the resource groups has a small but real impact on our ETL application. That is expected, desired, and must be noted. The impact is not high in this test, but in the real world it can extend the running time.

It’s time to combine all and see what is going on.

Test 4

Run our OLTP application while the ETL is running under Resource Group.
Master

Slave

Looking at the CPU utilization these graphs are very similar to the ones in test1, but the result is totally different:

The execution time for App1 (OLTP) has dropped significantly while the performance has increased, almost as if nothing else were running. At the same time App2 has lost performance, and this must be taken into account, but it will not stop or prevent the ETL process from running.


It is possible to do more tuning if the ETL is too compromised, or perhaps to modify the server layout, for example adding a slave dedicated to ETL reads. The combinations and possibilities are many.

Conclusion

Just looking at the final graphs helps us reach our conclusions:

Comparing the two tests 1 and 4 we can see how using the Resource Group will help us to correctly balance the workload and optimize the performance in the case of unavoidable contention between different applications.

At the same time, using Resource Groups alone as a blanket setting is not optimal, because it can defeat its own purpose: instead of providing an improvement, it can unpredictably affect all the traffic. It is also not desirable to modify application code to implement it at the query level, given the cost and time that would require.
The introduction of ProxySQL with query rewriting allows us to use the per-query option without any code modification, and lets us specify what we want with a very high level of granularity.

Once more do not do this by yourself unless you are more than competent and know 100% what you are doing. In any case, remember that an ETL process may take longer and that you need to plan your work/schedule accordingly.

Good MySQL everyone.

References

The post MySQL 8: Load Fine Tuning With Resource Groups appeared first on Percona Database Performance Blog.


How To design a better Ansible Role for MySQL Environment ?


In our early days with Ansible, we just wrote simple playbooks and ad-hoc commands against a very long Ansible hosts file. When we planned to use Ansible extensively in our daily production use cases, we realized that simple playbooks would not scale up to our expectations.

Even though we had options for separate variables, handlers, and template files according to our requirements, this unorganized approach didn’t help. It looked very messy, and the code made me unhappy whenever I looked at it. That’s when we decided to use Ansible Roles.

My understanding of Ansible Roles?

The role is the primary mechanism for breaking a playbook into multiple files; you can think of it roughly like a Python package. Roles help to group multiple tasks, Jinja2 template files, variable files, and handlers into a clean directory structure. This helps us reduce syntax errors while developing and also makes it easier to scale for future requirements.

The rule of thumb for developing an Ansible role is: don’t develop a single role to do everything, as it might break. Try to focus on a specific goal, for example one role for installing MySQL and another for installing the application server.

How to create an Ansible Role?

ansible-galaxy is the command to manage Ansible roles in a shared repository. This command has a lot of sub-commands, but we are only going to use ansible-galaxy init. The ansible-galaxy init <role name> command creates the skeleton framework of a role. By default, the role is created under the current working directory.

[ec2-user@ip-172-31-28-102 ~]$ ansible-galaxy init mysql
- mysql was created successfully

 

Discussing The Ansible Role Directory Structure

Our MySQL role directory consists of the defaults, files, handlers, meta, tasks, templates, tests, and vars folders. We will discuss each directory’s characteristics in a little more detail below.

[ec2-user@ip-172-31-28-102 ~]$ tree mysql
mysql
|-- defaults
|   `-- main.yml
|-- files
|-- handlers
|   `-- main.yml
|-- meta
|   `-- main.yml
|-- README.md
|-- tasks
|   `-- main.yml
|-- templates
|-- tests
|   |-- inventory
|   `-- test.yml
`-- vars
    `-- main.yml

8 directories, 8 files

 

defaults/main.yml

The name defaults refers to the preexisting values of user-configurable settings.

This directory contains the default variables for the role. In all our role development, we define every mutable variable for the role here, because this location has the lowest precedence and can be easily overridden by variables from group_vars, host_vars, or playbook vars.

Eg:

###########################################################
############Percona-utils Role Variable####################
###########################################################

###########################################################
# Percona Repo Variable #
###########################################################
percona_redhat_repo_url: "https://www.percona.com/redir/downloads/percona-release/redhat/percona-release-0.1-4.noarch.rpm"
percona_debian_repo_url: "https://repo.percona.com/apt/percona-release_0.1-4.{{ ansible_distribution_release }}_all.deb"

#########################################################################################################
# Percona installation state installed/latest, PMM Client version and PMM Client Re-Install "yes" or "".#
#########################################################################################################
common_percona_util_package_state: installed
percona_package_state: installed
pmm_client_version: "1.8.0"
pmm_client_reinstall: "no" # "yes" or "no"

 

files

Most of the time, the copy module uses this folder.

handlers/main.yml

This is the place where we write all the handlers we are going to use in a role. In our tasks, we can just notify a handler by name, and it will be automatically called and executed at the end of the play.

Personally, I prefer not to write handlers. A handler is similar to a task, but it only executes when the task that notifies it has actually changed the state of the machine, and it usually runs at the end of the play, after all of the tasks have run.

In some situations, if a later task fails, then the next time we re-run the play the task that notifies the handler will report "ok" (unchanged), so the handler never runs.

I know we can work around this with changed_when, but I don’t like to complicate my code. Using register, I store the output and evaluate a condition; based on that evaluation, we run a task that does what the handler would have done. I find this simpler.

Eg:

- name: Check Percona repo is already configured
  stat:
    path: "{{ red_percona_repofile_path }}"
  register: percona_repofile_status

- name: Installing PMM Client
  package:
    name: "{{ item }}"
    state: present
  with_items: "{{ red_pmm_client_packages }}"
  when: percona_repofile_status.stat.exists == True

 

meta/main.yml

Using this, we can define meta information about the role: author, company, description, license, dependencies, and so on.

Here, dependencies are very important; we can’t just ignore them. Using dependencies, we can specify the list of roles that need to run before the rest of the roles included in the playbook. When the playbook runs, all dependency roles execute first automatically, and then the other roles continue. This helps to avoid a lot of human error when handling inter-role dependencies.

tasks/main.yml

The tasks directory is the place where we put all our plays to install, configure, manage services, and so on.

Eg :

main.yml

#######################################################
# Percona Repo for Redhat and Debian #
#######################################################
- import_tasks: percona-repo-RedHat.yml
  when: ansible_os_family == "RedHat"
  static: no

- import_tasks: percona-repo-Debian.yml
  when: ansible_os_family == "Debian"
  static: no

#####################################################
# Install Common Utils Packages #
#####################################################
- import_tasks: utils-setup.yml
  static: no

######################################################
# PMM Client Installation for RedHat and Debian #
######################################################
- import_tasks: pmm-client-setup-RedHat.yml
  when: ansible_os_family == "RedHat"
  static: no

- import_tasks: pmm-client-setup-Debian.yml
  when: ansible_os_family == "Debian"
  static: no

 

percona-repo-Redhat.yml

---
#############################################################################################
# Installing Percona Repo For RedHat #
#############################################################################################
- name: Check Percona repo is already configured.
  stat:
    path: "{{ red_percona_repofile_path }}"
  register: percona_repofile_status

- name: Enable RedHat Optional repo.
  command: yum-config-manager --enable rhui-REGION-rhel-server-optional
  when: ansible_distribution == "RedHat"

- name: Install Percona repo.
  yum:
    name: "{{ percona_redhat_repo_url }}"
    state: present
  register: percona_install_result
  when: percona_repofile_status.stat.exists == False

- name: Amazon Linux changing releasever to 7 default.
  command: sed -i 's/$releasever/7/g' "{{ red_percona_repofile_path }}"
  when: ansible_distribution == "Amazon"

templates

A template is just a text file with special syntax for specifying variables that should be replaced by values. Ansible uses the Jinja2 templating engine to implement templates.

In our case, we use a template to build the MySQL configuration file dynamically.

Eg:

- name: Copy my.cnf global MySQL configuration.
  template:
    src: mysql_conf.j2
    dest: "{{ mysql_config_file }}"
    owner: root
    group: root
    force: "{{ overwrite_global_mycnf }}"
    mode: 0644

# {{ ansible_managed }}

[client]
port = {{ mysql_port }}
socket = {{ mysql_socket }}

[mysqld]
port = {{ mysql_port }}
bind-address = {{ mysql_bind_address }}
datadir = {{ mysql_data_dir }}
socket = {{ mysql_socket }}
pid-file = {{ mysql_pid_file }}
{% if mysql_skip_name_resolve %}
skip-host-cache
skip-name-resolve
{% endif %}
{% if mysql_sql_mode %}
sql_mode = {{ mysql_sql_mode }}
{% endif %}

# Logging configuration.
{% if mysql_log_error == 'syslog' or mysql_log == 'syslog' %}
syslog
syslog-tag = {{ mysql_syslog_tag }}
{% else %}
{% if mysql_log %}
log = {{ mysql_log }}
{% endif %}
log-error = {{ mysql_log_error }}
{% endif %}

# Slow query log configuration.
{% if mysql_slow_query_log_enabled %}
slow_query_log = 1
slow_query_log_file = {{ mysql_slow_query_log_file }}
long_query_time = {{ mysql_slow_query_time }}
{% endif %}

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links = 0

# User is ignored when systemd is used (fedora >= 15).
user = mysql

# http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html
#performance_schema
{% if mysql_version|string == "5.7" %}
performance_schema
{% endif %}

# Memory settings.
key_buffer_size = {{ mysql_key_buffer_size }}
max_allowed_packet = {{ mysql_max_allowed_packet }}
table_open_cache = {{ mysql_table_open_cache }}
sort_buffer_size = {{ mysql_sort_buffer_size }}
read_buffer_size = {{ mysql_read_buffer_size }}
read_rnd_buffer_size = {{ mysql_read_rnd_buffer_size }}
myisam_sort_buffer_size = {{ mysql_myisam_sort_buffer_size }}
query_cache_type = {{ mysql_query_cache_type }}
query_cache_size = {{ mysql_query_cache_size }}
query_cache_limit = {{ mysql_query_cache_limit }}
{% if mysql_max_connections | int > 3000 %}
max_connections = 3000
thread_cache_size = {{ (3000 * 0.15) | int }}
{% elif mysql_max_connections | int < 150 %}
max_connections = 150
thread_cache_size = {{ (150 * 0.15) | int }}
{% else %}
max_connections = {{ mysql_max_connections }}
thread_cache_size = {{ (mysql_max_connections | int * 0.15) | int }}
{% endif %}
max_connect_errors = {{ mysql_max_connect_errors }}
tmp_table_size = {{ mysql_tmp_table_size }}
max_heap_table_size = {{ mysql_max_heap_table_size }}
group_concat_max_len = {{ mysql_group_concat_max_len }}
join_buffer_size = {{ mysql_join_buffer_size }}

vars/main.yml

vars also holds variables for our roles, just like defaults. The variables which reside under vars are more difficult to override due to their higher precedence, so if we need to make a variable effectively immutable, we can declare it under vars.

# vars file for common
red_percona_utils_packages:
  - innotop
  - percona-toolkit
  - perl-Sys-Statistics-Linux
  - nagios-plugins-perl

red_pmm_client_packages:
  - pmm-client-{{ pmm_client_version }}
  - percona-nagios-plugins.noarch
  - perl-DBI.x86_64
  - perl-Nagios-Plugin.noarch

red_percona_repofile_path: "/etc/yum.repos.d/percona-release.repo"

I think I have covered all the topics related to roles, but there are many more things to discuss and share.

Finally, the Playbook Order of Execution

  • Any pre_tasks defined in the play.
  • Any handlers triggered so far will be run.
  • Each role listed in roles will execute in turn. Any role dependencies defined in the roles meta/main.yml will be run first, subject to tag filtering and conditionals.
  • Any tasks defined in the play.
  • Any handlers triggered so far will be run.
  • Any post_tasks defined in the play.
  • Any handlers triggered so far will be run.

 

This Week in Data with Colin Charles 49: MongoDB Conference Opportunities and Serverless Aurora MySQL

Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers, and provides musings on the open source database community.

Beyond the MongoDB content that will be at Percona Live Europe 2018, there is also a bit of an agenda for MongoDB Europe 2018, happening on November 8 in London—a day after Percona Live in Frankfurt. I expect you’ll see a diverse set of MongoDB content at Percona Live.

The Percona Live Europe Call for Papers closes TODAY! (Friday August 17, 2018)

From Amazon, there have been some good MySQL changes. You now have access to time delayed replication as a strategy for your High Availability and disaster recovery. This works with versions 5.7.22, 5.6.40 and later. It is worth noting that this isn’t documented as working for MariaDB (yet?). It arrived in MariaDB Server in 10.2.3.

Another MySQL change from Amazon? Aurora Serverless MySQL is now generally available. You can build and run applications without thinking about instances: previously, the database function was not all that focused on serverless. This on-demand auto-scaling serverless Aurora should be fun to use. Only Aurora MySQL 5.6 is supported at the moment and also, be aware that this is not available in all regions yet (e.g. Singapore).

Releases

  • pgmetrics is described as an open-source, zero-dependency, single-binary tool that can collect a lot of information and statistics from a running PostgreSQL server and display it in easy-to-read text format or export it as JSON for scripting.
  • PostgreSQL 10.5, 9.6.10, 9.5.14, 9.4.19, 9.3.24, and 11 Beta 3 have been released; the two fixed security vulnerabilities may inspire an upgrade.

Link List

Industry Updates

  • Martin Arrieta (LinkedIn) is now a Site Reliability Engineer at Fastly. Formerly of Pythian and Percona.
  • Ivan Zoratti (LinkedIn) is now Director of Product Management at Neo4j. He was previously on founding teams, was the CTO of MariaDB Corporation (then SkySQL), and is a long time MySQL veteran.

Upcoming Appearances

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

 

The post This Week in Data with Colin Charles 49: MongoDB Conference Opportunities and Serverless Aurora MySQL appeared first on Percona Database Performance Blog.
