Channel: Planet MySQL

Percona XtraDB Cluster 5.7.23-31.31 Is Now Available


Percona is glad to announce the release of Percona XtraDB Cluster 5.7.23-31.31 on September 26, 2018. Binaries are available from the downloads section or from our software repositories.

Percona XtraDB Cluster 5.7.23-31.31 is now the current release, based on the corresponding Percona Server for MySQL 5.7.23 and Galera/Codership WSREP API releases.

Deprecated

The following variable is deprecated starting from this release:

wsrep_convert_lock_to_trx

This variable, which defines whether locking sessions should be converted to transactions, is deprecated in Percona XtraDB Cluster 5.7.23-31.31 because it is rarely used in practice.
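Before upgrading, you can check whether a node still sets this variable; a quick sketch (the LIKE pattern match on variable names is case-insensitive):

SHOW GLOBAL VARIABLES LIKE 'wsrep_convert_lock_to_trx';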

Fixed Bugs

  • PXC-1017: Memcached access to InnoDB was not replicated by Galera.
  • PXC-2164: The SST script prevented SELinux from being enabled.
  • PXC-2155: wsrep_sst_xtrabackup-v2 did not delete all folders on cleanup.
  • PXC-2160: In some cases, the MySQL version was not detected correctly with the Xtrabackup-v2 method of SST (State Snapshot Transfer).
  • PXC-2199: When the DROP TRIGGER IF EXISTS statement was run for a non-existent trigger, the node GTID was incremented instead of the cluster GTID.
  • PXC-2209: The compression dictionary was not replicated in PXC.
  • PXC-2202: In some cases, a disconnected cluster node was not shut down.
  • PXC-2165: SST could fail if either wsrep_node_address or wsrep_sst_receive_address were not specified.
  • PXC-2213: NULL/VOID DDL transactions could commit in a wrong order.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

 



Debug a slow WordPress site


A few days ago a colleague asked me to help him figure out why his WordPress website was running so slowly. Everything seemed fine: the CPU was not busy, and memory usage was low. It was not a network issue, because ping times were fast, and the MySQL database was not slow either. So it was hard for me to debug.

The site initially returned a 504 error, but it could sometimes load the page (after a long wait). I checked the Nginx error log and found that the upstream connection was not responding. So it had to be something wrong with the PHP code.

I tried many fixes I found on the internet, with no success. Most of them were about increasing the timeout or the execution time. After doing that, the page could load, but it was still extremely slow (it took more than a minute).

When I sat quietly for a while to think, I realized that plugins could be the issue. There was no reason a WordPress site should be that slow unless some new extension (or plugin) was misbehaving. But how could I find out which plugin was working well and which one was not? In theory I could turn them off one by one, but that would be very time consuming.

Luckily, PHP-FPM has a configuration option that logs slow-running scripts to a file. By default this feature is disabled, so nothing was logged. I opened the PHP-FPM config file and changed two settings:

slowlog = /var/log/php-fpm/www-slow.log
request_slowlog_timeout = 10s

After restarting the php-fpm service and reloading the page, I could clearly see which script was causing the issue. Here is the log:

[0x00007f1516a17770] curl_exec() /var/www/html/wp-includes/Requests/Transport/cURL.php:162
[0x00007f1516a176a0] request() /var/www/html/wp-includes/class-requests.php:379
[0x00007f1516a17590] request() /var/www/html/wp-includes/class-http.php:370
[0x00007f1516a173f0] request() /var/www/html/wp-includes/class-http.php:589
[0x00007f1516a17330] post() /var/www/html/wp-includes/http.php:187
[0x00007f1516a17280] wp_remote_post() /var/www/html/wp-content/plugins/akismet/class.akismet.php:1107
[0x00007f1516a170e0] http_post() /var/www/html/wp-content/plugins/akismet/class.akismet.php:64
[0x00007f1516a17040] check_key_status() /var/www/html/wp-content/plugins/akismet/class.akismet.php:68
[0x00007f1516a16f90] verify_key() /var/www/html/wp-content/plugins/jetpack/class.jetpack.php:7082
[0x00007f1516a16ef0] is_akismet_active() /var/www/html/wp-content/plugins/jetpack/modules/sharedaddy/sharing-service.php:72
[0x00007f1516a16d40] get_all_services() /var/www/html/wp-content/plugins/jetpack/modules/sharedaddy/sharing-service.php:188
[0x00007f1516a16c00] get_blog_services() /var/www/html/wp-content/plugins/jetpack/_inc/lib/class.core-rest-api-endpoints.php:2516
[0x00007f1516a16ab0] prepare_options_for_response() /var/www/html/wp-content/plugins/jetpack/_inc/lib/core-api/class.jetpack-core-api-module-endpoints.php:404
[0x00007f1516a16950] get_all_options() /var/www/html/wp-content/plugins/jetpack/_inc/lib/admin-pages/class.jetpack-react-page.php:337
[0x00007f1516a168a0] get_flattened_settings() /var/www/html/wp-content/plugins/jetpack/_inc/lib/admin-pages/class.jetpack-react-page.php:287
[0x00007f1516a16680] page_admin_scripts() /var/www/html/wp-content/plugins/jetpack/_inc/lib/admin-pages/class.jetpack-admin-page.php:117
[0x00007f1516a165f0] admin_scripts() /var/www/html/wp-includes/class-wp-hook.php:286
[0x00007f1516a16500] apply_filters() /var/www/html/wp-includes/class-wp-hook.php:310
[0x00007f1516a16480] do_action() /var/www/html/wp-includes/plugin.php:453
[0x00007f1516a16360] do_action() /var/www/html/wp-admin/admin-header.php:118

Aha, so curl_exec() was the issue. It seems the curl request was not successful, and the slowness came from the script waiting for the response to that request. Going a bit further up the stack trace, we can see that Akismet and Jetpack were the two plugins making that curl_exec() call. I went to the WordPress admin (which was also very slow) and disabled these two plugins, and the site became fast again.

This might be a piece of cake for an experienced PHP developer, but I am not that guy. I am a Rails guy who can play around with the server, which is why I am happy to have at least partly resolved the issue. I do not know why the curl request failed to load the resource; perhaps something was wrong with the network. So next time you see an issue like this on a PHP site, remember to turn on the slowlog config so you know what is going on.

MySQL Track at Southern California Linux Expo 2019 CFP Open

The Call for Papers for the next Southern California Linux Expo, aka SCaLE, is open, and I need your help with the MySQL Track. We have had a MySQL track for the past few years, and this year I have permission from the organizers of SCaLE to have a group of MySQL community members review the talks for this track. This year was pretty good, but to make 2019 even better we need more submissions from more people AND THIS MEANS YOU!!

The link above has the details on how to register and submit a talk, and the process is fairly simple. But if you would like help with your submission, want to 'rubber duck' ideas, or want a quick review before you submit, please contact me (@stoker, david.stokes @ Oracle.com) or find me at a show.

So what type of talks do we need? We need to cover material for newbies, intermediates, and the experienced. There are lots of developers, DevOps folks, hobbyists, and students just waiting for you to share your knowledge with them. For a large show, we have a fairly intimate talk space which is friendly to first-time presenters.

Some ideas:

1. So how do you do backups and restores?  What sort of things get deleted and what steps do you take to keep your data safe? Policies? Procedures?  How does a developer who fudged a row/table get their data back in your organization?

2. How do you reduce SQL injection, the N+1 problem, and O(N) searches in your code?

3. How is your Kubernetes/Docker/Ansible environment for your developers managed?

4. Five things about being an 'accidental' MySQL DBA I wish I knew when I took over the databases

5. MySQL Replication best practices.

6. Data normalization how-to

7. Query optimization for novices. Query optimization for the advanced SQL coder.

And anything else  you think that at least one other person would like to hear about (and there are more - believe me).

So please make SCaLE 17x even better! You have until the end of October to submit.

Running MySQL Shell 8.0 with Docker


In a previous blog, I discussed how to pull, install and run MySQL 8.0 with Docker, and showed how to connect to the MySQL server running in that container.

Now I will show you how to connect to the same Docker instance using MySQL Shell, a tool for working with the Document Store and for creating InnoDB Clusters.

Installing Docker, Starting MySQL, and Connecting using MySQL Shell

First, you grab Docker: https://docs.docker.com/install .

Then, you pull and run MySQL 8.0 (Linux) by running the following. Note that I'm not using a password, which is acceptable only because this is a fleeting test container:

$ docker run --name mysql8 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -d mysql/mysql-server

Unable to find image ‘mysql/mysql-server:latest’ locally

Pull complete

Run the following command to get the status of your MySQL container. Look for the word 'healthy' to know that it is running:

$ docker ps
CONTAINER ID  IMAGE               COMMAND                  CREATED         STATUS                    PORTS   NAMES
8534edee97f   mysql/mysql-server  "/entrypoint.sh mysq…"   11 minutes ago  Up 11 minutes (healthy)   3306/

Now log in to your containerized server, connecting with MySQL Shell:

$ docker exec -it mysql8 mysqlsh -uroot --mysql -S/var/lib/mysql/mysql.sock

Creating a Classic session to ‘root@/var%2Flib%2Fmysql%2Fmysql.sock’

Please provide the password for ‘root@/var%2Flib%2Fmysql%2Fmysql.sock’:

Fetching schema names for autocompletion… Press ^C to stop.

Your MySQL connection id is 13

Server version: 8.0.12 MySQL Community Server – GPL

No default schema selected; type \use <schema> to set one.

MySQL  localhost  JS >

That’s it! You are now ready to use MySQL Shell 8.0 to try the Document Store and set up your InnoDB Cluster. For more information on MySQL Shell 8.0, please check here: https://dev.mysql.com/doc/mysql-shell/8.0/en/.

“The statements and opinions expressed here are my own and do not necessarily represent those of the Oracle Corporation.”

-Kathy Forte, Oracle MySQL Solutions Architect

Announcement: Alpha Build of Percona Server 8.0


Alpha Build of Percona Server 8.0 released

An alpha version of Percona Server 8.0 is now available in the Percona experimental software repositories. This is a 64-bit release only. 

You may experiment with this alpha release by running it in a Docker container:

$ docker run -d -e MYSQL_ROOT_PASSWORD=password -p 3306:3306 perconalab/percona-server:8.0.12.alpha

When the container starts, connect to it as follows:

$ docker exec -ti $(docker ps | grep -F percona-server:8.0.12.alpha | awk '{print $1}') mysql -uroot -ppassword

Note that this release is not ready for use in any production environment.

Percona Server 8.0 alpha is available for the following platforms:

  • RHEL/Centos 6.x
  • RHEL/Centos 7.x
  • Ubuntu 14.04 Trusty
  • Ubuntu 16.04 Xenial
  • Ubuntu 18.04 Bionic
  • Debian 8 Jessie
  • Debian 9 Stretch

Note: The list of supported platforms may be different in the GA release.

Fixed Bugs:

  • PS-4788: Setting log_slow_verbosity and enabling the slow_query_log could lead to a server crash
  • PS-4814: TokuDB ‘fast’ replace into is incompatible with 8.0 row replication
  • PS-4834: The encrypted system tablespace has empty uuid

Other fixed bugs: PS-4788, PS-4631, PS-4736, PS-4818, PS-4755

Unfinished Features

The following features are work in progress and are not yet in a working state:

  • Column compression with Data Dictionaries
  • Native Partitioning for TokuDB and for MyRocks
  • Encryption
    • Key Rotation
    • Scrubbing

Known Issues

  • PS-4803: ALTER TABLE … ADD INDEX … LOCK crash | handle_fatal_signal (sig=11) in dd_table_has_instant_cols
  • PS-4896: handle_fatal_signal (sig=11) in THD::thread_id likely due to enabling innodb_print_lock_wait_timeout_info
  • PS-4820: PS crashes with keyring_vault encryption
  • PS-4796: 8.0 DD and atomic DDL breaks DROP DATABASE for engines that store files in database directory
  • PS-4898: Crash during PAM authentication plugin installation.
  • PS-1782: Optimizer chooses wrong plan when joining 2 tables
  • PS-4850: Toku hot backup plugin dumps tons of info to stdout with no way to disable it
  • PS-4797: rpl.rpl_master_errors failing, likely due to binlog encryption
  • PS-4800: Recovery of prepared XA transactions seems broken in 8.0
  • PS-4853: Installing audit_log plugin causes server to crash
  • PS-4855: Replace http with https in http://bugs.percona.com in server crash messages
  • PS-4857: Improve error message handling for compressed columns
  • PS-4895: Improve error message when encrypted system tablespace was started without keyring plugin
  • PS-3944: Single variable to control logging in QRT
  • PS-4705: crash on snapshot size check in RocksDB
  • PS-4885: Using ALTER … ROW_FORMAT=TOKUDB_QUICKLZ leads to InnoDB: Assertion failure: ha_innodb.cc:12198:m_form->s->row_type == m_create_info->row_type


Scaling Percona Monitoring and Management (PMM)


Starting with PMM 1.13, PMM uses Prometheus 2 for metrics storage, which tends to be the heaviest consumer of CPU and RAM. Thanks to Prometheus 2 performance improvements, PMM can scale to more than 1000 monitored nodes per instance in its default configuration. In this blog post we will look into PMM scaling and capacity planning: how to estimate the resources required, and what drives resource consumption.

PMM tested with 1000 nodes

We have now tested PMM with up to 1000 nodes, using a virtualized system with 128GB of memory, 24 virtual cores, and SSD storage. We found PMM scales pretty linearly with the available memory and CPU cores, and we believe that a higher number of nodes could be supported with more powerful hardware.

What drives resource usage in PMM?

Depending on your system configuration and workload, a single node can generate very different loads on the PMM server. The main factors that impact the performance of PMM are:

  1. Number of samples (data points) injected into PMM per second
  2. Number of distinct time series they belong to (cardinality)
  3. Number of distinct query patterns your application uses
  4. Number of queries you have on PMM, through the user interface or the API, and their complexity

These specifically can be impacted by:

  • Software version – modern database software versions expose more metrics
  • Software configuration – some metrics are only exposed in certain configurations
  • Workload – a large number of database objects and high concurrency will increase both the number of samples ingested and their cardinality
  • Exporter configuration – disabling collectors can reduce the amount of data collected
  • Scrape frequency – controlled by METRICS_RESOLUTION

All these factors together may impact resource requirements by a factor of ten or more, so do your own testing to be sure. However, the numbers in this article should serve as good general guidance and a starting point for your research.

On the system supporting 1000 instances we observed the following performance:

Performance PMM 1000 nodes load

As you can see, we have more than 2,000 scrapes/sec performed, providing almost two million samples/sec and more than eight million active time series. These are the main numbers that define the load placed on Prometheus.

Capacity planning to scale PMM

Both CPU and memory are very important resources for PMM capacity planning. Memory is the more important of the two, as Prometheus 2 does not have good options for limiting memory consumption: if it does not have enough memory to handle your workload, it will run out of memory and crash.

We recommend at least 2GB of memory for a production PMM installation. A test installation with 1GB of memory is possible, but it may not be able to monitor more than one or two nodes without running out of memory. With 2GB of memory you should be able to monitor at least five nodes without problems.

With more powerful systems (8GB or more) you can plan for approximately eight monitored systems per 1GB of memory, or about 15,000 samples ingested/sec per 1GB of memory.

To calculate the CPU resources required, allow for about 50 monitored systems per core (or 100K metrics/sec per CPU core). For example, monitoring 400 nodes would call for roughly eight CPU cores and, at eight systems per 1GB, about 50GB of RAM.

One problem you’re likely to encounter if you’re running PMM with 100+ instances is the Home Dashboard, which becomes far too heavy with such a large number of servers. We plan to fix this issue in future releases of PMM, but for now you can work around it in two simple ways:

You can select a single host, for example "pmm-server", in your Home Dashboard and save it before adding a large number of hosts to the system.

set home dashboard for PMM

Or you can make some other dashboard of your choice and set it as the home dashboard.

Summary

  • More than 1,000 monitored systems is possible per single PMM server
  • Your specific workload and configuration may significantly change the resources required
  • If deploying with 8GB or more, plan 50 systems per core, and eight systems per 1GB of RAM


This Week in Data with Colin Charles #54: Percona Server for MySQL is Alpha


Join Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

I consider this to be the biggest news for the week: the Alpha Build of Percona Server for MySQL 8.0. Experiment with it in a Docker container. It is still missing column compression with dictionary support, native partitioning for TokuDB and MyRocks (excited to see that this is coming!), and encryption key rotation and scrubbing. All in all, this should be a fun release to try, test, and also to file bugs for!

Database paradigms are changing, and it is interesting to see Cloudflare introducing Workers KV, a key-value store that is eventually consistent and highly distributed (across their global network of 152+ data centers). You can have up to 1 billion keys per namespace, keys up to 2kB in size, values up to 64kB, and eventual global consistency within 10 seconds. Read more about the cost and other technical details too.

For some quick glossing, from a MySQL Federal Account Manager, comes Why MySQL is Harder to Sell Than Oracle (from someone who has done both). Valid concerns, and always interesting to hear the barriers MySQL faces even after 23 years in existence! For analytics, maybe this is where the likes of MariaDB ColumnStore or ClickHouse might come into play.

Lastly, for all of you asking me about when Percona Live Europe Frankfurt 2018 speaker acceptances and agendas are to be released, I am told by a good source that it will be announced early next week. So register already!

Releases

Link List

Upcoming Appearances

Feedback

I look forward to feedback/tips via Twitter @bytebot.


How To Fix MySQL Replication After an Incompatible DDL Command


MySQL supports replicating to a slave that is one release higher. This allows us to easily upgrade our MySQL setup to a new version by promoting the slave and pointing the application to it. However, though unsupported, there are times when the slave deployed runs a MySQL version one release lower. In this scenario, if your application performed much better on an older version of MySQL, you would like a convenient option to downgrade: you can simply promote the slave to get the old performance back.

The MySQL manual says that ROW-based replication can be used to replicate to a lower version, provided that no replicated DDL statements are incompatible with the slave. One such incompatible command is ALTER USER, which is a new feature in MySQL 5.7 and not available on 5.6:

ALTER USER 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Executing that command would break replication. Here is an example of a broken slave in non-GTID replication:

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 36915649
               Relay_Log_File: mysql_sandbox5641-relay-bin.000006
                Relay_Log_Pos: 36174552
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 36174373
              Relay_Log_Space: 36916179
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
*** redacted ***
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:03:40
*** redacted ***
                Auto_Position: 0
1 row in set (0.00 sec)

Skipping the statement does not resume replication:

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.02 sec)
mysql> SET GLOBAL sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)
mysql> SHOW SLAVE STATUS\G

Fixing non-GTID replication

When you check slave status, replication still isn’t fixed. To fix it, you must manually skip to the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000002 and 36174373 respectively. We can use mysqlbinlog on the master to determine the next position:

mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=36174373 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000002 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 36174373
#180918 22:03:40 server id 1  end_log_pos 36174438 CRC32 0xc7e1e553 	Anonymous_GTID	last_committed=19273	sequence_number=19277	rbr_only=no
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174438
#180918 22:03:40 server id 1  end_log_pos 36174621 CRC32 0x2e5bb235 	Query	thread_id=563	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
SET @@session.pseudo_thread_id=563/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 36174621
#180918 22:03:40 server id 1  end_log_pos 36174686 CRC32 0x86756b3f 	Anonymous_GTID	last_committed=19275	sequence_number=19278	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 36174686
#180918 22:03:40 server id 1  end_log_pos 36174760 CRC32 0x30e663f9 	Query	thread_id=529	exec_time=0	error_code=0
SET TIMESTAMP=1537279420/*!*/;
BEGIN
/*!*/;
# at 36174760
#180918 22:03:40 server id 1  end_log_pos 36174819 CRC32 0x48054daf 	Table_map: `sbtest`.`sbtest1` mapped to number 226

Based on the output above, the next binary log position is 36174621. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=36174621;
START SLAVE;

Verify that the slave threads are now running by executing SHOW SLAVE STATUS\G:

Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 306841423
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 190785290
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 226959625
              Relay_Log_Space: 270667273
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 383
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Opening tables
           Master_Retry_Count: 86400
*** redacted ***
                Auto_Position: 0

To make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';
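If you plan to keep using the same session, you may also want to re-enable binary logging afterwards (a small addition, not in the original post; sql_log_bin is session-scoped, so it also resets when you disconnect):

SET SESSION sql_log_bin = 1;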

Done.

GTID replication

For GTID replication, in addition to injecting an empty transaction for the offending statement, you’ll need to skip it using the non-GTID solution provided above. Once replication is running again, you can flip it back to GTID.

Here’s an example of a broken GTID slave:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 14364967
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 8630318
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
*** redacted ***
                   Last_Errno: 1064
                   Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 12468343
              Relay_Log_Space: 10527158
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1064
               Last_SQL_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A784' at line 1' on query. Default database: ''. Query: 'ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8''
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 180918 22:32:28
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:2280-8530
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-7403
                Auto_Position: 1
1 row in set (0.00 sec)
mysql> SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+---------------+---------------------------------------------+
| Variable_name | Value                                       |
+---------------+---------------------------------------------+
| gtid_executed | 00005723-0000-0000-0000-000000005723:1-7403 |
+---------------+---------------------------------------------+
1 row in set (0.00 sec)

Since the last position executed is 7403, you’ll need to create an empty transaction for the offending sequence 7404.

STOP SLAVE;
SET GTID_NEXT='00005723-0000-0000-0000-000000005723:7404';
BEGIN;
COMMIT;
SET GTID_NEXT=AUTOMATIC;
START SLAVE;

Note: If you have MTS enabled, you can also get the offending GTID coordinates from the Last_SQL_Error field of SHOW SLAVE STATUS\G.

The next step is to find the next binary log position. The current binary log (Relay_Master_Log_File) and position (Exec_Master_Log_Pos) executed are mysql-bin.000003 and 12468343 respectively. We can again use mysqlbinlog on the master to determine the next position:
mysqlbinlog -v --base64-output=DECODE-ROWS --start-position=12468343 /ssd/sandboxes/msb_5_7_23/data/mysql-bin.000003 | head -n 30
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 12468343
#180918 22:32:19 server id 1  end_log_pos 12468408 CRC32 0x259ee085 	GTID	last_committed=7400	sequence_number=7404	rbr_only=no
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7404'/*!*/;
# at 12468408
#180918 22:32:19 server id 1  end_log_pos 12468591 CRC32 0xb349ad80 	Query	thread_id=142	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
SET @@session.pseudo_thread_id=142/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C latin1 *//*!*/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
ALTER USER 'testuser'@'localhost' IDENTIFIED WITH 'mysql_native_password' AS '*3A2EB9C80F7239A4DE3933AE266DB76A7846BCB8'
/*!*/;
# at 12468591
#180918 22:32:19 server id 1  end_log_pos 12468656 CRC32 0xb2019f3f 	GTID	last_committed=7400	sequence_number=7405	rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00005723-0000-0000-0000-000000005723:7405'/*!*/;
# at 12468656
#180918 22:32:19 server id 1  end_log_pos 12468730 CRC32 0x76b5ea6c 	Query	thread_id=97	exec_time=0	error_code=0
SET TIMESTAMP=1537281139/*!*/;
BEGIN
/*!*/;
# at 12468730
#180918 22:32:19 server id 1  end_log_pos 12468789 CRC32 0x48f0ba6d 	Table_map: `sbtest`.`sbtest8` mapped to number 115

Based on the output above, the next binary log position is 12468591. To fix the slave, run:

STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=12468591, MASTER_AUTO_POSITION=0;
START SLAVE;

Notice that I added MASTER_AUTO_POSITION=0 above to disable GTID replication for now. You can run SHOW SLAVE STATUS\G to confirm that MySQL is running fine:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 5723
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 446194575
               Relay_Log_File: mysql_sandbox5641-relay-bin.000002
                Relay_Log_Pos: 12704248
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
*** redacted ***
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 25172522
              Relay_Log_Space: 433726939
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
*** redacted ***
        Seconds_Behind_Master: 2018
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 00005723-0000-0000-0000-000000005723
             Master_Info_File: /ssd/sandboxes/msb_5_6_41/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
*** redacted ***
           Retrieved_Gtid_Set: 00005723-0000-0000-0000-000000005723:7405-264930
            Executed_Gtid_Set: 00005723-0000-0000-0000-000000005723:1-14947
                Auto_Position: 0

Since it’s running fine, you can now revert to GTID replication:

STOP SLAVE;
CHANGE MASTER TO MASTER_AUTO_POSITION=1;
START SLAVE;

Finally, to make the slave consistent with the master, execute the compatible query on the slave.

SET SESSION sql_log_bin = 0;
GRANT USAGE ON *.* TO 'testuser'@'localhost' IDENTIFIED BY 'testuser';

Summary

In this article, I’ve shared how to fix replication when it breaks due to an incompatible command being replicated to the slave. So far, I’ve only identified ALTER USER as an incompatible command for 5.6. If there are other incompatible commands, please share them in the comments section. Thanks in advance.



MySQL Explain Example – Explaining MySQL EXPLAIN using StackOverflow data


I personally believe that the best way to deliver a complicated message to an audience is by using a simple example. So in this post, I chose to demonstrate how to obtain insights from MySQL’s EXPLAIN output, using a simple SQL query which fetches data from StackOverflow’s publicly available dataset.

The EXPLAIN command provides information about how MySQL executes queries. EXPLAIN can work with SELECT, DELETE, INSERT, REPLACE, and UPDATE statements.
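Using it is as simple as prefixing the statement with the EXPLAIN keyword; a minimal sketch against the posts table defined in the next section:

EXPLAIN SELECT * FROM posts WHERE Id = 1;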

We’ll first analyze the original query, then attempt to optimize the query and look into the optimized query’s execution plan to see what changed and why.

This is the first article in a series of posts. Each post will walk you through a more advanced SQL query than the previous post, while demonstrating more insights which can be obtained from MySQL’s execution plans.

The query and database structure

The following is the structure of the two tables used by this example query (posts and votes):

CREATE TABLE `posts` (
   `Id` int(11) NOT NULL,
   `AcceptedAnswerId` int(11) DEFAULT NULL,
   `AnswerCount` int(11) DEFAULT NULL,
   `Body` longtext CHARACTER SET utf8 NOT NULL,
   ...
   `OwnerUserId` int(11) DEFAULT NULL,
   ...
   `Title` varchar(250) CHARACTER SET utf8 DEFAULT NULL,
   `ViewCount` int(11) NOT NULL,
   PRIMARY KEY (`Id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `votes` (
   `Id` int(11) NOT NULL,
   `PostId` int(11) NOT NULL,
   `UserId` int(11) DEFAULT NULL,
   `BountyAmount` int(11) DEFAULT NULL,
   `VoteTypeId` int(11) NOT NULL,
   `CreationDate` datetime NOT NULL,
   PRIMARY KEY (`Id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1

The following SQL query will find the details of users who added my StackOverflow questions to their favorites list. For the sake of this example, my user id is 12345678.

SELECT
  v.UserId,
  COUNT(*) AS FavoriteCount
FROM
  Votes v
  JOIN Posts p ON p.id = v.PostId
WHERE
  p.OwnerUserId = 12345678
  AND v.VoteTypeId = 5  -- (Favorites vote)
GROUP BY
  v.UserId
ORDER BY
  FavoriteCount DESC
LIMIT
  100;

The original query’s execution duration was very long. I stopped waiting and cancelled the execution after more than a minute had passed.

Explaining the original EXPLAIN

This is the original EXPLAIN plan for this query:

Before rushing to optimize the query, let’s take a closer look at the output of the EXPLAIN command, to make sure we fully understand all aspects of it. The first thing we notice is that it can include more than one row. The query we’re analyzing involves two tables, which are joined using an inner join, and each of these tables is represented by a different row in the execution plan above. As an analogy to the coding world, you can look at the concept of an inner join as very similar to a nested loop: MySQL chooses the table it thinks will be best to start the journey with (the outer "loop") and then accesses the next table using the values from the outer "loop".

Each of the rows in the EXPLAIN contains the following fields:

  • id – In most cases, the id field presents a sequential number of the SELECT query the row belongs to. The query above contains neither subqueries nor unions, so the id for both rows is 1, as there is really only one query.
  • select_type – The type of SELECT query. In our case, it’s a simple query as it contains no subqueries or unions. In more complex cases, it will contain other types such as SUBQUERY (for subqueries), UNION (second or later statements in a union), DERIVED (a derived table) and others. More information about select types can be found in MySQL’s docs.
  • table – the table name, or alias, this row refers to. In the screenshot above, you can see ‘v’ and ‘p’ mentioned, as those are the aliases defined for the tables votes and posts.
  • type – defines how the tables are accessed / joined. The most common access types you’ll see are the following, sorted from worst to best: ALL, index, range, ref, eq_ref, const, system. As you can see in the EXPLAIN, the table votes is the first table accessed, using the ALL access type, which means MySQL will scan the entire table, using no indexes, so it will go through over 140 million records. The posts table is then accessed using the eq_ref access type. Other than the system and const types, eq_ref is the best possible join type: the database will access one row from this table for each combination of rows from the previous tables.
  • possible_keys – The candidate indexes MySQL can choose from to look up rows in the table. Some of the indexes in this list can actually be irrelevant, as a result of the execution order MySQL chose. In general, MySQL can use indexes to join tables. That said, it won’t use an index on the first table’s join column, as it will go through all of its rows anyway (except rows filtered by the WHERE clause).
  • key – This column indicates the actual index MySQL decided to use. It doesn’t necessarily mean it will use the entire index, as it can choose to use only part of the index, from the left-most side of it.
  • key_len – This is one of the important columns in the EXPLAIN output. It indicates the length of the key that MySQL decided to use, in bytes. In the EXPLAIN output above, MySQL uses the entire PRIMARY index (4 bytes). We know that because the only column in the PRIMARY index is Id, which is defined as an INT => 4 bytes. Unfortunately, there is no easier way to figure out which part of the index is used by MySQL than summing the lengths of all columns in the index and comparing that to the key_len value (see the sketch after this list).
  • rows – Indicates the number of rows MySQL believes it must examine from this table to execute the query. This is only an estimate. Usually, high row counts mean there is room for query optimization.
  • filtered – The percentage of rows that remain after applying the conditions in the WHERE clause. These rows will be joined to the table in the next row of the EXPLAIN plan. As mentioned previously, this is a guesstimate as well, so MySQL can be wrong about this estimation.
  • extra – Contains more information about the query processing. Let’s look into the extras for our query:
    • using where – The WHERE clause is used to restrict which rows are fetched from the current table (votes) and matched with the next table (posts).
    • using temporary – As part of the query processing, MySQL has to create a temporary table, which in many cases can result in a performance penalty. In most cases, it will indicate that one of the ORDER BY or GROUP BY clauses is executed without using an index. It can also happen if the GROUP BY and ORDER BY clauses include different columns (or in different order).
    • using filesort – MySQL is forced to perform another pass on the results of the query to sort them. In many cases, this can also result in a performance penalty.
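To make the key_len arithmetic concrete, here is a minimal sketch using a small hypothetical table (not part of the StackOverflow schema):

CREATE TABLE t (
  a INT NOT NULL,
  b INT NOT NULL,
  KEY idx_ab (a, b)
);

-- Both columns of idx_ab can be used: key_len = 4 + 4 = 8 bytes.
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;

-- Only the left-most column can be used: key_len = 4 bytes.
EXPLAIN SELECT * FROM t WHERE a = 1;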

Optimizing a slow query using MySQL’s EXPLAIN

What can we learn from this specific EXPLAIN plan?

  1. MySQL chooses to start with the votes table. The EXPLAIN output shows that it has to go through 145,045,878 rows, which is about all the rows in the table. That’s nasty.
  2. The Extra column indicates that MySQL tries to reduce the number of rows it inspects using the WHERE clause, but it estimates that this will only reduce them to 10%, which is still bad, as 10% is about 14 million rows. A possible conclusion is that the condition v.VoteTypeId = 5 isn’t selective enough, so millions of rows will be joined to the next table.
  3. Looking at the WHERE clause, we can see there is another condition, p.OwnerUserId = 12345678, which looks very selective and should drastically reduce the number of rows to inspect. The posts table contains ~40 million records, while only 57 rows are returned when applying the condition p.OwnerUserId = 12345678 (see the quick check after this list), which means it is very selective. In this case, it would be best if MySQL started the execution with the posts table, to take advantage of this selective condition. Later in this post, we’ll see which change gets MySQL to choose that order.
  4. Looking at the possible_keys values for both tables, we can see that MySQL doesn’t have a lot of optional indexes to choose from. More precisely, there is no index in possible_keys that contains the columns mentioned in the WHERE clause.
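The selectivity claims above are easy to verify directly; a quick check using the counts quoted in this post:

-- Very selective: only 57 rows match (per this article's dataset).
SELECT COUNT(*) FROM Posts WHERE OwnerUserId = 12345678;

-- Not selective: roughly 10% of ~145 million rows match.
SELECT COUNT(*) FROM Votes WHERE VoteTypeId = 5;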

Therefore, we’ll add the following two indexes. Each index starts with the column mentioned in the WHERE clause. The index for the votes table also includes the joined column.

ALTER TABLE `Posts` ADD INDEX `posts_idx_owneruserid` (`OwnerUserId`);
ALTER TABLE `Votes` ADD INDEX `votes_idx_votetypeid_postid` (`VoteTypeId`,`PostId`);

The optimized query’s EXPLAIN output

This is the new EXPLAIN output after adding the indexes:

What changed?

  1. The first change we see is that MySQL chose to start with the posts table this time (hurray!). It uses the new index to filter out the rows and estimates to filter all but 57 records, which are then joined to the second table, votes.
  2. The second change we see, by looking at the key column, is that indexes are used for lookups and filtering in both tables.
  3. By looking at the key_len column, we can see that the composite index for the votes table is used in full – 8 bytes, which covers both the VoteTypeId and PostId columns.
  4. The last important change we see is the number of rows MySQL estimates it needs to inspect in order to evaluate the query. It estimates it needs to inspect 57 * 2 = 114 rows, which is great compared to the millions of records in the original execution path.

So looking at the execution duration now, we can see a drastic improvement, from a very slow query which never actually returned, to only 0.063 seconds:

Conclusions

Summarizing the most important aspects of reading an EXPLAIN and optimizing this simple query:

  1. Run the EXPLAIN command to inspect the planned execution path for your SQL query.
  2. Look at the table order MySQL chose for the execution. Does it make sense? If not, ask yourself why MySQL got it wrong, and what’s missing.
  3. Find the conditions in the WHERE clause which are the most selective ones and make sure you create the optimal indexes to include them. You can read more information on how to create the optimal indexes here.
  4. Look for the places MySQL doesn’t use an index for look-ups and filtering, as those may be the weak spots.
  5. Look for the rows where MySQL shows a very high estimation of rows it needs to inspect to evaluate the query.

In our next post for this series, we’ll analyze a more complex query, looking into more insights the EXPLAIN can provide for other query structures such as sub-queries.

ClickHouse: Two Years!


Following my post from a year ago, https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what has happened with ClickHouse over this past year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite make the top 100, but the climb from position 174 to 106 is still impressive. Its DB-Engines Ranking score tripled, from 0.54 last September to 1.57 this September.

And indeed, in my conversations with customers and partners, the narrative has changed from "ClickHouse, what is it?" to "We are using or considering ClickHouse for our analytics needs".

2. ClickHouse changed its versioning schema. Unfortunately, it moved from the unconventional …; 1.1.54390; 1.1.54394 naming structure to the still unconventional 18.6.0; 18.10.3; 18.12.13 naming structure, where "18" is the year of the release.

Now to the more interesting technical improvements.

3. Support for the more traditional JOIN syntax. Now if you join two tables you can use SELECT ... FROM tab1 ANY LEFT JOIN tab2 ON tab1_col=tab2_col.

So now, if we take a query from the workload described in https://www.percona.com/blog/2017/06/22/clickhouse-general-analytical-workload-based-star-schema-benchmark/ we can write this:

SELECT     C_REGION,     sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder ANY INNER JOIN customer
ON LO_CUSTKEY=C_CUSTKEY
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

instead of the monstrous:

SELECT
    C_REGION,
    sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder
ANY INNER JOIN
(
    SELECT
        C_REGION,
        C_CUSTKEY AS LO_CUSTKEY
    FROM customer
) USING (LO_CUSTKEY)
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

4. Support for DELETE and UPDATE operations. This has probably been the most requested feature since the first ClickHouse release.
ClickHouse uses an LSM-tree-like structure (MergeTree) that is not friendly to single-row operations. To highlight this specific limitation, ClickHouse uses the ALTER TABLE UPDATE / ALTER TABLE DELETE syntax, emphasizing that these statements are executed as bulk operations, so please treat them as such. Updating or deleting rows in ClickHouse should be an exceptional operation rather than a part of your day-to-day workload.

We can update a column like this: ALTER TABLE lineorder UPDATE LO_DISCOUNT = 5 WHERE LO_CUSTKEY = 199568
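The DELETE counterpart uses the same bulk-mutation syntax; a small sketch against the same table:

ALTER TABLE lineorder DELETE WHERE LO_CUSTKEY = 199568;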

5. ClickHouse added a feature which I call Dictionary Compression, but ClickHouse uses the name "LowCardinality". It is still experimental, but I hope it will soon be production ready. Basically, it allows the server to internally replace long strings with a short list of enumerated values.

For example, consider the table from our example, lineorder, which contains 600,037,902 rows but has only five different values for the column LO_ORDERPRIORITY:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
┌─LO_ORDERPRIORITY─┐
│ 1-URGENT         │
│ 5-LOW            │
│ 4-NOT SPECIFIED  │
│ 2-HIGH           │
│ 3-MEDIUM         │
└──────────────────┘

So we can define our table as:

CREATE TABLE lineorder_dict (
        LO_ORDERKEY             UInt32,
        LO_LINENUMBER           UInt8,
        LO_CUSTKEY              UInt32,
        LO_PARTKEY              UInt32,
        LO_SUPPKEY              UInt32,
        LO_ORDERDATE            Date,
        LO_ORDERPRIORITY        LowCardinality(String),
        LO_SHIPPRIORITY         UInt8,
        LO_QUANTITY             UInt8,
        LO_EXTENDEDPRICE        UInt32,
        LO_ORDTOTALPRICE        UInt32,
        LO_DISCOUNT             UInt8,
        LO_REVENUE              UInt32,
        LO_SUPPLYCOST           UInt32,
        LO_TAX                  UInt8,
        LO_COMMITDATE           Date,
        LO_SHIPMODE             LowCardinality(String)
)Engine=MergeTree(LO_ORDERDATE,(LO_ORDERKEY,LO_LINENUMBER),8192);

How does this help? Firstly, it offers space savings: the table will take less space in storage, as it will use integer values instead of strings. Secondly, performance: the filtering operation will be executed faster.

For example: here’s a query against the table with LO_ORDERPRIORITY stored as String:

SELECT count(*)
FROM lineorder
WHERE LO_ORDERPRIORITY = '2-HIGH'
┌───count()─┐
│ 119995822 │
└───────────┘
1 rows in set. Elapsed: 0.859 sec. Processed 600.04 million rows, 10.44 GB (698.62 million rows/s., 12.16 GB/s.)

And now the same query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT count(*)
FROM lineorder_dict
WHERE LO_ORDERPRIORITY = '2-HIGH'
┌───count()─┐
│ 119995822 │
└───────────┘
1 rows in set. Elapsed: 0.350 sec. Processed 600.04 million rows, 600.95 MB (1.71 billion rows/s., 1.72 GB/s.)

This is 0.859 sec vs 0.350 sec (for the LowCardinality case).

Unfortunately this feature is not optimized for all use cases, and in aggregation it actually performs more slowly.

An aggregation query against table with LO_ORDERPRIORITY as String:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘
5 rows in set. Elapsed: 1.200 sec. Processed 600.04 million rows, 10.44 GB (500.22 million rows/s., 8.70 GB/s.)

Versus an aggregation query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder_dict
┌─LO_ORDERPRIORITY─┐
│ 4-NOT SPECIFIED  │
│ 1-URGENT         │
│ 2-HIGH           │
│ 3-MEDIUM         │
│ 5-LOW            │
└──────────────────┘
5 rows in set. Elapsed: 2.334 sec. Processed 600.04 million rows, 600.95 MB (257.05 million rows/s., 257.45 MB/s.)

This is 1.200 sec vs 2.334 sec (for the LowCardinality case).

6. And the last feature I want to mention is the better support of Tableau Software: this required ODBC drivers. It may not seem significant, but Tableau is the number one software for data analysts, and by supporting it, ClickHouse will reach a much wider audience.

Summing up: ClickHouse has definitely become much more user friendly than it was a year ago!


Archive MySQL Data In Chunks Using Stored Procedure


In a DBA’s day-to-day activities, we perform archive operations on our transactional database servers to improve query performance and control disk space. Archiving is a very expensive operation, since it involves a huge number of reads and writes. So it is mandatory to run the archive queries in …
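The core idea is to delete in small batches instead of one huge statement; a minimal sketch with hypothetical table names, cutoff date, and batch size (the full post wraps this logic in a stored procedure loop):

-- Copy one chunk into the archive table, then remove it from the source.
INSERT INTO orders_archive
SELECT * FROM orders
WHERE created_at < '2017-01-01'
ORDER BY id
LIMIT 1000;

DELETE FROM orders
WHERE created_at < '2017-01-01'
ORDER BY id
LIMIT 1000;

-- Repeat until ROW_COUNT() returns 0.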


Prepared Statements for MySQL: PDO, MySQLi, and X DevAPI

Recently I ran across a prominent PHP developer who incorrectly claimed that only PDO allows binding values to variables for prepared statements. A lot of developers use prepared statements to reduce the potential for SQL injection, and that is a good first step. But there are some features that you may not know about.

What is a Prepared Statement?


The MySQL Manual states: "The MySQL database supports prepared statements. A prepared statement or a parameterized statement is used to execute the same statement repeatedly with high efficiency."

So far, so good. But there is also a performance benefit to consider. From the same source: "The prepared statement execution consists of two stages: prepare and execute. At the prepare stage a statement template is sent to the database server. The server performs a syntax check and initializes server internal resources for later use."

So it is a two-step process. Set up the query as a template, and then plug in the value. If you need to reuse the query, just plug a new value into the template.
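The same two-stage flow can be seen in plain SQL, which is what the PHP extensions below drive through the client protocol; a minimal sketch:

PREPARE ins FROM 'INSERT INTO test (id) VALUES (?)';
SET @id = 1;
EXECUTE ins USING @id;

-- Reuse the template with a new value.
SET @id = 2;
EXECUTE ins USING @id;

DEALLOCATE PREPARE ins;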

So let's look at how it is done.

PDO


On PHP.NET, there are a lot of really great examples. 

/* Prepared statement, stage 1: prepare */
$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "password");
if (!($stmt = $pdo->prepare("INSERT INTO test(id) VALUES (?)"))) {
    echo "Prepare failed: " . implode(" ", $pdo->errorInfo());
}

So this is our template with a 'place holder', designated as a question mark (?).
And then it is executed.

$id = 1;
if (!$stmt->bindValue(1, $id, PDO::PARAM_INT)) {
    echo "Binding parameters failed.";
}
if (!$stmt->execute()) {
    echo "Execute failed: " . implode(" ", $stmt->errorInfo());
}

So if we wanted to insert an $id with a value of 2, we would just assign that value ($id = 2) and rerun the $stmt->bindValue()/$stmt->execute() pair again.

So those are the basics. But what does this look like with the other two extensions?

MySQLi


So what does the MySQLi version look like? Once again question marks are used as placeholders.

$stmt = $mysqli->prepare("INSERT INTO CountryLanguage VALUES (?, ?, ?, ?)");

$stmt->bind_param('sssd', $code, $language, $official, $percent);

$code = 'DEU';
$language = 'Bavarian';
$official = "F";
$percent = 11.2;

/* execute prepared statement */
$stmt->execute();

printf("%d Row inserted.\n", $stmt->affected_rows);

But what is that sssd stuff? That is where you declare the types of the variables you want to bind. Use 's' for string, 'i' for integer, 'd' for double, and 'b' for a blob (binary large object). So you get the advantage of type checking.

X DevAPI

The much newer X DevAPI is for the new X Protocol and the MySQL Document Store. Unlike the other two examples, it is not Structured Query Language (SQL) based.


$res = $coll->modify('name like :name')->arrayInsert('job[0]', 'Calciatore')->bind(['name' => 'ENTITY'])->execute();

$res = $table->delete()->orderby('age desc')->where('age < 20 and age > 12 and name != :name')->bind(['name' => 'Tierney'])->limit(2)->execute();

Note that this is not an object-relational mapper, as it is the protocol itself and not something mapping objects to SQL.

Wrap Up

So now you know how to use prepared statements with all three PHP MySQL Extensions.


Transaction Processing in NewSQL

This is a list of references for transaction processing in NewSQL systems. The work is exciting. I don't have much to add and wrote this to avoid losing interesting links. My focus is on OLTP, but some of these systems support more than that.

By NewSQL I mean the following. I am not trying to define "NewSQL" for the world:
  1. Support for multiple nodes because the storage/compute on one node isn't sufficient.
  2. Support for SQL with ACID transactions. If there are shards then cross-shard operations can be consistent and isolated.
  3. Replication does not prevent properties listed above when you are willing to pay the price in commit overhead. Alas synchronous geo-replication is slow and too-slow commit is another form of downtime. I hope NewSQL systems make this less of a problem (async geo-replication for some or all commits, commutative operations). Contention and conflict are common in OLTP and it is important to understand the minimal time between commits to a single row or the max number of commits/second to a single row.
NewSQL Systems
  • MySQL Cluster - this was NewSQL before NewSQL was a thing. There is a nice book that explains the internals. There is a company that uses it to make HDFS better. Cluster seems to be more popular for uses other than web-scale workloads.
  • VoltDB - another early NewSQL system that is still getting better. It was after MySQL Cluster but years before Spanner and came out of the H-Store research effort.
  • Spanner - XA across-shards, Paxos across replicas, special hardware to reduce clock drift between nodes. Sounds amazing, but this is Google so it just works. See the papers that explain the system and support for SQL. This got the NewSQL movement going.
  • CockroachDB - the answer to implementing Spanner without GPS and atomic clocks. From that URL they explain it as "while Spanner always waits after writes, CockroachDB sometimes waits before reads". It uses RocksDB and they help make it better.
  • FaunaDB - FaunaDB is inspired by Calvin and Daniel Abadi explains the difference between it and Spanner -- here and here. Abadi is great at explaining distributed systems, see his work on PACELC (and the pdf). A key part of Calvin is that "Calvin uses preprocessing to order transactions. All transactions are inserted into a distributed, replicated log before being processed." This approach might limit the peak TPS on a large cluster, but I assume that doesn't matter for a large fraction of the market.
  • YugaByte - another user of RocksDB. There is much discussion about it in the recent Abadi post. Their docs are amazing -- slides, transaction IO path, single-shard write IO path, distributed ACID and single-row ACID.
  • TiDB - I don't know much about it but they are growing fast and are part of the MySQL community.
Other relevant systems

Replicating data from MySQL to Oracle


In our work, we used to get a lot of requirements for replicating data from one data source to another. Previously I wrote about replication from MySQL to Redshift.

In this blog I am going to explain how to replicate data from MySQL to Oracle using Tungsten Replicator.

1.0. Tungsten Replicator :

It is an open source replication engine that supports data extraction from MySQL and MySQL variants such as RDS, Percona Server, and MariaDB, as well as Oracle, and allows the extracted data to be applied to other data sources such as Vertica, Cassandra, Redshift, etc.

Tungsten Replicator includes support for parallel replication, and advanced topologies such as fan-in and multi-master, and can be used efficiently in cross-site deployments.

1.1.0. Architecture :

There are three major components in Tungsten Replicator.

1. Extractor / Master Service
2. Transaction History Log (THL)
3. Applier / Slave Service

1.1.1. Extractor / Master Service :

The extractor component reads data from MySQL’s binary log and writes that information into the Transaction History Log (THL).

1.1.2. Transaction History Log (THL) :

The Transaction History Log (THL) acts as a translator between two different data sources. It stores transactional data from different data servers in a universal format, using the replicator service acting as a master, which can then be processed by the Applier / Slave service.

1.1.3. Applier / Slave Service :

All the raw row-data recorded on the THL logs is re-assembled or constructed into another format such as JSON or BSON, or external CSV formats that enable the data to be loaded in bulk batches into a variety of different targets.

Statement-based information is therefore not supported for heterogeneous deployments, so it is mandatory that the binary log format on MySQL is ROW (with full image).

1.2.0. Schema Creation :

Since this heterogeneous replication does not replicate SQL statements, including the DDL statements that would normally define and generate the table structures, a different method must be used.

Tungsten Replicator includes a tool called ddlscan, which can read the schema definition from MySQL or Oracle and translate it into the schema definition required on the target database.

1.3.0. Pre Requisites:

1.3.1. Server Packages:

  • JDK 7 or higher
  • Ant 1.8 or higher
  • Ruby 2.4
  • Net-SSH
  • Net-SCP

1.3.2. MySQL:

  • All the tables to be replicated must have a primary key.

The following configuration should be enabled on MySQL:

binlog-format = row
binlog-row-image = full
character-set-server=utf8
collation-server=utf8_general_ci
default-time-zone='+00:00'

2.0. Requirements :

User creation in Oracle :

CREATE USER accounts_places IDENTIFIED BY accounts_1 DEFAULT TABLESPACE ACCOUNTS_PUB QUOTA UNLIMITED ON ACCOUNTS_PUB;

GRANT CONNECT TO accounts_places;

GRANT ALL PRIVILEGES TO accounts_places;

User creation in MySQL :

root@localhost:(none)> create user 'tungsten'@'%' identified by 'secret';
Query OK, 0 rows affected (0.01 sec)

root@localhost:(none)> GRANT ALL PRIVILEGES ON *.* TO 'tungsten'@'%';
Query OK, 0 rows affected (0.00 sec)

We need to replicate the NOTES_TESTING table from the accounts schema on MySQL to Oracle. The structure of the table is given below.

CREATE TABLE NOTES_TESTING (
  ID INT(11) NOT NULL AUTO_INCREMENT,
  NOTE TEXT,
  CREATED_AT DATETIME DEFAULT NULL,
  UPDATED_AT DATETIME DEFAULT NULL,
  PERSON_ID INT(11) DEFAULT NULL,
  ADDED_BY INT(11) DEFAULT NULL,
  PRIMARY KEY (ID)
);

Note :

The above table was created in MySQL. Be aware that a few MySQL datatypes are not supported in Oracle.

3.0. Implementation:

The implementation consists of following steps.

  1. Installation / Building tungsten from source
  2. Preparing the equivalent schema for Oracle
  3. Configuring Master service
  4. Configuring Slave service
  5. Generating worker tables (temp tables used by Tungsten) for replication to be created on Oracle
  6. Start the replication

3.1. Installation / Building From Source:

  • Download the source package from the GIT.
# git clone https://github.com/continuent/tungsten-replicator.git
  • Compile this package it will generate the tungsten-replicator.tar file.
# sh tungsten-replicator/builder/build.sh
# mkdir -p tungsten
  • Once the tar file is generated, extract it under the folder tungsten 
# tar --strip-components 1 -zxvf tungsten-replicator/builder/build/tungsten-replicator-5.2.1.tar.gz -C tungsten/

3.2. Preparing equivalent table for Oracle :

Tungsten Replicator ships with a DDL extractor, ddlscan, which reads table definitions from MySQL and creates the appropriate Oracle table definitions to use during replication.

./bin/ddlscan -user root -url 'jdbc:mysql:thin://mysql-stg:3306/accounts' -pass root2345 -template ddl-mysql-oracle.vm -db ACCOUNTS > access_log.ddl

This ddlscan run extracts the MySQL table definitions and stores them in the access_log.ddl file.

The table structure will look like this:

DROP TABLE ACCOUNTS.notes_testing;
CREATE TABLE ACCOUNTS.notes_testing
(
ID NUMBER(10, 0) NOT NULL,
NOTE CLOB /* TEXT */,
CREATED_AT DATE,
UPDATED_AT DATE,
PERSON_ID NUMBER(38,0),
ADDED_BY NUMBER(38,0),
PRIMARY KEY (ID)
);

Then we need to load this DDL into Oracle, like this:

# cat access_log.ddl | sqlplus sys/oracle as sysdba

Check that the tables were created inside the correct accounts schema:

SQL> desc notes_testing;
 Name               Null?    Type
 --------       --------   -----------------
 ID             NOT NULL    NUMBER(10, 0)
 NOTE                       CLOB
 CREATED_AT                 DATE
 UPDATED_AT                 DATE
 PERSON_ID                  NUMBER(38,0)
 ADDED_BY                   NUMBER(38,0)

3.3. Configuring Master Service:

  • Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten
#./tools/tpm configure defaults --reset
  • Configure the Master service on the directory of your choice, We have used /opt/master
  • Following commands will prepare the configuration file for Master service.
./tools/tpm configure master \
--install-directory=/opt/master \
--enable-heterogeneous-service=true \
--enable-heterogeneous-master=true \
--members=mysql-stg \
--master=mysql-stg
./tools/tpm configure master --hosts=mysql-stg \
--replication-user=tungsten \
--replication-password=password \
--skip-validation-check=MySQLUnsupportedDataTypesCheck
  • Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install
Configuration is now complete. 
For further information, please consult
Tungsten documentation, which is available at docs.continuent.com.

NOTE  >> Command successfully completed
  • Now Master service will be configured under /opt/master/
  • Start the tungsten Master service.
[root@mysql-stg tungsten]# /opt/master/tungsten/cluster-home/bin/startall
Starting Tungsten Replicator Service...
Waiting for Tungsten Replicator Service.
running: PID:15141
  • Verify it’s working by checking the master status.
[root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 0
appliedLatency  : 1.412
role            : master
serviceName     : master
serviceType     : local
started         : true
state           : ONLINE
Finished services command...
[root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : mysql-bin.000134:0000000000000652;-1
appliedLastSeqno       : 0
appliedLatency         : 1.412
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : 1
clusterName            : master
currentEventId         : mysql-bin.000134:0000000000000652
currentTimeMillis      : 1536839268029
dataServerHost         : mysql-stg
extensions             : 
host                   : mysql-stg
latestEpochNumber      : 0
masterConnectUri       : thl://localhost:/
masterListenUri        : thl://mysql-stg:2112/
maximumStoredSeqNo     : 0
minimumStoredSeqNo     : 0
offlineRequests        : NONE
pendingError           : NONE
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: NONE
pipelineSource         : jdbc:mysql:thin://mysql-stg:3306/tungsten_master?noPrepStmtCache=true
relativeLatency        : 13.029
resourcePrecedence     : 99
rmiPort                : 10000
role                   : master
seqnoType              : java.lang.Long
serviceName            : master
serviceType            : local
simpleServiceName      : master
siteName               : default
sourceId               : mysql-stg
state                  : ONLINE
timeInStateSeconds     : 12.854
timezone               : GMT
transitioningTo        : 
uptimeSeconds          : 13.816
useSSLConnection       : false
version                : Tungsten Replicator 5.2.1
Finished status command...
  • If the master did not start properly, refer to the error log (/opt/master/service_logs/trepsvc.log).

3.4. Configuring Slave service:

  • Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten
#./tools/tpm configure defaults --reset
  • Configure the Slave service on the directory of your choice, We have used /opt/slave
  • Following commands will prepare the configuration file for Slave service.
./tools/tpm configure slave \
--install-directory=/opt/slave \
--enable-heterogeneous-service=true \
--members=mysql-stg
./tools/tpm configure slave --hosts=mysql-stg \
--datasource-type=oracle \
--datasource-host=172.17.4.106 \
--datasource-port=1526 \
--datasource-oracle-sid=PLACES \
--datasource-user=accounts_places \
--datasource-password=accounts_1 \
--svc-applier-filters=dropstatementdata,replicate \
--property=replicator.filter.replicate.do=accounts --property=replicator.applier.dbms.getColumnMetadataFromDB=false \
--skip-validation-check=InstallerMasterSlaveCheck \
--rmi-port=10002 \
--thl-port=2113 \
--master-thl-port=2112 \
--master-thl-host=mysql-stg
  • Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install
Configuration is now complete.  
For further information, please consult
Tungsten documentation, which is available at docs.continuent.com.
NOTE  >> Command successfully completed

3.5. Starting Replication:

  • Once the slave is configured, start the slave:
[root@mysql-stg tungsten]# /opt/slave/tungsten/cluster-home/bin/startall
Starting Tungsten Replicator Service...
Waiting for Tungsten Replicator Service.
running: PID:17039
  • Verify it’s working by checking the slave status.
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: -1
appliedLatency  : -1.0
role            : slave
serviceName     : slave
serviceType     : unknown
started         : true
state           : OFFLINE:ERROR
Finished services command...
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : 1
clusterName            : slave
currentEventId         : NONE
currentTimeMillis      : 1536839732221
dataServerHost         : 172.17.4.106
extensions             : 
host                   : 172.17.4.106
latestEpochNumber      : -1
masterConnectUri       : thl://mysql-stg:2112/
masterListenUri        : null
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : NONE
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: NONE
pipelineSource         : thl://mysql-stg:2112/
relativeLatency        : -1.0
resourcePrecedence     : 99
rmiPort                : 10002
role                   : slave
seqnoType              : java.lang.Long
serviceName            : slave
serviceType            : local
simpleServiceName      : slave
siteName               : default
sourceId               : 172.17.4.106
state                  : ONLINE
timeInStateSeconds     : 4.476
timezone               : GMT
transitioningTo        : 
uptimeSeconds          : 77.996
useSSLConnection       : false
version                : Tungsten Replicator 5.2.1
Finished status command...

4.0. Testing:

  • Now both master and slave are in sync. I am going to insert a few records into the notes_testing table on the MySQL server.
insert into notes_testing values(1,'Mydbops ',NULL,NULL,13,45);

insert into notes_testing values(2,'MySQL DBA',NULL,NULL,1,2);
  • These records were inserted on the master server. At the same time, I checked in Oracle whether these records had been replicated.
SQL> select * from notes_testing;

ID   NOTE        CREATED_AT   UPDATED_AT   PERSON_ID   ADDED_BY
---  ----------  ----------   ----------   ---------   --------
1    Mydbops                                      13         45
2    MySQL DBA                                     1          2

5.0. Troubleshooting:

While reconfiguring the Tungsten Replicator I got the below error and replication was not synced. A sample error is shown here:

NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : -1
clusterName            : slave
currentEventId         : NONE
currentTimeMillis      : 1536839670076
dataServerHost         : 172.17.4.106
extensions             : 
host                   : 172.17.4.106
latestEpochNumber      : -1
masterConnectUri       : thl://mysql-stg:2112/
masterListenUri        : null
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : Event extraction failed
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: Client handshake failure: Client response validation failed: Master log does not contain requested transaction: master source ID=mysql-stg client source ID=172.17.4.106 requested seqno=3 client epoch number=0 master min seqno=0 master max seqno=0
pipelineSource         : UNKNOWN
relativeLatency        : -1.0
resourcePrecedence     : 99
rmiPort                : 10002
role                   : slave

The reason is that some transactions on the slave from a previous installation were not cleared properly.

Solution :

To clear old transactions from the slave:

[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl -service slave reset -all

Do you really want to delete replication service slave completely? [yes/NO] yes
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl online
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status
Processing status command...
NAME                     VALUE
----                     -----
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
autoRecoveryEnabled    : false
autoRecoveryTotal      : 0
channels               : 1
clusterName            : slave
currentEventId         : NONE
currentTimeMillis      : 1536839732221
dataServerHost         : 172.17.4.106
extensions             : 
host                   : 172.17.4.106
latestEpochNumber      : -1
masterConnectUri       : thl://mysql-stg:2112/
masterListenUri        : null
maximumStoredSeqNo     : -1
minimumStoredSeqNo     : -1
offlineRequests        : NONE
pendingError           : NONE
pendingErrorCode       : NONE
pendingErrorEventId    : NONE
pendingErrorSeqno      : -1
pendingExceptionMessage: NONE
pipelineSource         : thl://mysql-stg:2112/
relativeLatency        : -1.0
resourcePrecedence     : 99
rmiPort                : 10002
role                   : slave
seqnoType              : java.lang.Long
serviceName            : slave
serviceType            : local
simpleServiceName      : slave
siteName               : default
sourceId               : 172.17.4.106
state                  : ONLINE
timeInStateSeconds     : 4.476
timezone               : GMT
transitioningTo        : 
uptimeSeconds          : 77.996
useSSLConnection       : false
version                : Tungsten Replicator 5.2.1
Finished status command...

6.0. Conclusion:

Tungsten Replicator is a great tool when it comes to replicating data between heterogeneous data sources. Once we understand how it works, it is easy to configure and it performs efficiently.

Image Courtesy : Photo by Karl JK Hedin on Unsplash

Shutdown and Restart Statements


There are various ways to shut down MySQL. The traditional cross-platform method is to use the shutdown command in the mysqladmin client. One drawback is that it requires shell access; another is that it cannot start MySQL again automatically. There are platform-specific options that can perform a restart, such as using systemctl on Linux or installing MySQL as a service on Microsoft Windows. What I will look at here, though, is the built-in support for stopping and restarting MySQL using SQL statements.

Photo by Michael Mroczek on Unsplash

MySQL 5.7 added the SHUTDOWN statement, which allows you to shut down MySQL using the MySQL command-line client or MySQL Shell. The command is straightforward to use:

The SHUTDOWN command available in MySQL 5.7 and later.
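
The original post shows this as a screenshot; a plain-text equivalent of such a session looks like:

mysql> SHUTDOWN;
Query OK, 0 rows affected (0.00 sec)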

You will need the SHUTDOWN privilege to use the statement – the same privilege required to use mysqladmin to shut down MySQL. There is one gotcha to be aware of with the SHUTDOWN statement: it only works with the old (traditional) MySQL protocol. If you attempt to use it when connected to MySQL using the new X Protocol, you get the error ERROR: 3130: Command not supported by pluggable protocols, as shown in the next example:

Executing SHUTDOWN when connected through the X Protocol causes error 3130.
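
A text rendition of the screenshot (the prompt is shown generically for a session connected over the X Protocol):

SQL> SHUTDOWN;
ERROR: 3130: Command not supported by pluggable protocols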

The RESTART statement, on the other hand, works through both protocols and also requires the SHUTDOWN privilege:

The RESTART command available in MySQL 8.0.
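
A text rendition of the screenshot:

mysql> RESTART;
Query OK, 0 rows affected (0.00 sec)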

For the restart to work, MySQL must have been started in the presence of a “monitoring service”. This is the default on Microsoft Windows (to disable the monitoring service, start MySQL with --no-monitor). On Linux the monitoring service can, for example, be systemd or mysqld_safe.

An example of where the RESTART statement comes in handy is MySQL Shell’s AdminAPI for administering a MySQL InnoDB Cluster. When connected to MySQL Server 8.0, MySQL Shell can use the new SET PERSIST syntax to make the required configuration changes and then use the RESTART statement to restart the instance so that non-dynamic configuration changes take effect.

The SHUTDOWN and RESTART statements may not be the most important changes in MySQL 5.7 and 8.0, but they can be handy to know about in some cases.


MySQL and Memory: a love story (part 2)


We saw in the previous post that MySQL likes memory. We also saw how to perform operating system checks and make some configuration changes for swap and NUMA.

Today, we will check what MySQL server can tell us about its memory usage.

Introduced in MySQL 5.7 and enabled by default in MySQL 8.0, the Performance_Schema’s memory instrumentation allows us to have a better overview of what MySQL is allocating and why.

Let’s check on our MySQL server using SYS:
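
The post illustrates this with a screenshot; the underlying query is against sys.memory_global_total (the value below is purely illustrative):

mysql> SELECT * FROM sys.memory_global_total;
+-----------------+
| total_allocated |
+-----------------+
| 618.57 MiB      |
+-----------------+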

Pay attention that there is a bug related to how InnoDB Buffer Pool statistics are accounted for in Performance_Schema. This is fixed in 8.0.13.

SYS schema provides 5 tables to get memory allocation information:

+-----------------------------------+
| Tables_in_sys (memory%)           |
+-----------------------------------+
| memory_by_host_by_current_bytes   |
| memory_by_thread_by_current_bytes |
| memory_by_user_by_current_bytes   |
| memory_global_by_current_bytes    |
| memory_global_total               |
+-----------------------------------+

It’s possible to get an overview by “code area“:

SELECT SUBSTRING_INDEX(event_name,'/',2) AS code_area, 
       sys.format_bytes(SUM(current_alloc)) AS current_alloc 
FROM sys.x$memory_global_by_current_bytes 
GROUP BY SUBSTRING_INDEX(event_name,'/',2) 
ORDER BY SUM(current_alloc) DESC;
+---------------------------+---------------+
| code_area                 | current_alloc |
+---------------------------+---------------+
| memory/innodb             | 333.47 MiB    |
| memory/performance_schema | 276.40 MiB    |
| memory/sql                | 28.54 MiB     |
| memory/mysys              | 8.96 MiB      |
| memory/temptable          | 7.00 MiB      |
| memory/mysqld_openssl     | 208.16 KiB    |
| memory/mysqlx             | 31.35 KiB     |
| memory/myisam             | 696 bytes     |
| memory/vio                | 624 bytes     |
| memory/csv                | 88 bytes      |
| memory/blackhole          | 88 bytes      |
+---------------------------+---------------+

Buffer Pool

When using InnoDB, one of the most important components is the InnoDB Buffer Pool. Every time an operation happens on a table (read or write), the page where the records (and indexes) are located is loaded into the Buffer Pool.

This means that if the data you read and write the most has its pages in the Buffer Pool, performance will be better than if you have to read pages from disk. Also don’t forget that when there are no more free pages, older pages must be evicted and, if they were modified, synchronized back to disk (checkpointing). So when all the data you need is present in the Buffer Pool, we say that the working set fits in memory. You can have a data set of 3TB (with a lot of historical data that is never queried) but a working set of several GB or even less.

Since MySQL 8.0, if you have a dedicated server for MySQL, you can let MySQL configure the size of the Buffer Pool for you by setting innodb_dedicated_server to ON.
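
A minimal my.cnf sketch for that option (assuming the host really runs nothing but MySQL):

# host dedicated to MySQL 8.0+: the buffer pool, log file size and
# flush method are derived automatically from the available RAM
[mysqld]
innodb_dedicated_server = ON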

It’s possible to verify how much the InnoDB Buffer Pool is filled with data using the Performance_Schema:

mysql> SELECT CONCAT(FORMAT(A.num * 100.0 / B.num,2),"%") BufferPoolFullPct FROM
	(SELECT variable_value num FROM performance_schema.global_status
	WHERE variable_name = 'Innodb_buffer_pool_pages_data') A,
	(SELECT variable_value num FROM performance_schema.global_status
	WHERE variable_name = 'Innodb_buffer_pool_pages_total') B;
+-------------------+
| BufferPoolFullPct |
+-------------------+
| 27.73%            |
+-------------------+

As you can see, for the moment the working set on this server fits in memory, as there are still plenty of free pages in the InnoDB Buffer Pool.

Of course it’s also possible to see the memory allocation for the Buffer Pool using the following query:

mysql> select * from memory_global_by_current_bytes 
where event_name like 'memory/innodb_buf_buf_pool';
*************************** 1. row ***************************
       event_name: memory/innodb/buf_buf_pool
    current_count: 2
    current_alloc: 262.12 MiB
current_avg_alloc: 131.06 MiB
       high_count: 2
       high_alloc: 262.12 MiB
   high_avg_alloc: 131.06 MiB

If you want to know which schemas or tables are present in the Buffer Pool, please query one of these SYS schema tables:

  • innodb_buffer_stats_by_schema
  • innodb_buffer_stats_by_table

You can also compare the number of pages requested from the Buffer Pool (Innodb_buffer_pool_read_requests) with the number that had to be read from disk (Innodb_buffer_pool_reads) to know whether your working set fits in memory. In this example I check the requests over 1 minute on my database server:

show global status like 'innodb_buffer_pool_read%s';
select sleep(60);
show global status like 'innodb_buffer_pool_read%s';
+----------------------------------+---------+
| Variable_name                    | Value   |
+----------------------------------+---------+
| Innodb_buffer_pool_read_requests | 2459014 |
| Innodb_buffer_pool_reads         | 3550    |
+----------------------------------+---------+
2 rows in set (0.0026 sec)
+-----------+
| sleep(60) |
+-----------+
|         0 |
+-----------+
1 row in set (1 min 0.0006 sec)
+----------------------------------+---------+
| Variable_name                    | Value   |
+----------------------------------+---------+
| Innodb_buffer_pool_read_requests | 2465390 |
| Innodb_buffer_pool_reads         | 3550    |
+----------------------------------+---------+
2 rows in set (0.5880 sec)

We can see that over those 60 seconds there were 6,376 page read requests (2,465,390 - 2,459,014), and all of them were served from the Buffer Pool, since Innodb_buffer_pool_reads did not increase. Great!

You can find similar information in the output of SHOW ENGINE INNODB STATUS on the Buffer pool hit rate line; 1000 / 1000 is the number you want to see there. If you permanently see a lower number, then you should consider reducing your working set or increasing your Buffer Pool.
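
For reference, the relevant excerpt of SHOW ENGINE INNODB STATUS looks like this (numbers illustrative):

----------------------
BUFFER POOL AND MEMORY
----------------------
...
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000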

Better indexes (and no duplicate ones) can help reduce the working set. In the sys schema, you can find 2 tables that can help you target the tables with non-optimal indexes (schema_redundant_indexes & schema_unused_indexes). Better queries can also help reduce the working set. You can fetch candidates from these two other tables: schema_tables_with_full_table_scans & statements_with_full_table_scans

Temporary Tables

Of course, temporary tables also use memory.

You can track their creation in the global status:

select * from performance_schema.global_status 
where variable_name like '%tmp%tables';
+-------------------------+----------------+
| VARIABLE_NAME           | VARIABLE_VALUE |
+-------------------------+----------------+
| Created_tmp_disk_tables | 0              |
| Created_tmp_tables      | 4903           |
+-------------------------+----------------+

You can see that some temporary tables were created in memory. You can find the statements creating them by querying the statements_with_temp_tables table.
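
For example (a sketch; the column names come from the sys schema):

select query, memory_tmp_tables, disk_tmp_tables
from sys.statements_with_temp_tables
order by disk_tmp_tables desc, memory_tmp_tables desc
limit 5;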

We can monitor the amount of RAM allocated for temporary tables, as it is reported under the event memory/temptable/physical_ram:

select * from memory_global_by_current_bytes 
where event_name like '%temp%'\G
*************************** 1. row ***************************
       event_name: memory/temptable/physical_ram
    current_count: 7
    current_alloc: 7.00 MiB
current_avg_alloc: 1.00 MiB
       high_count: 10
       high_alloc: 10.00 MiB
   high_avg_alloc: 1.00 MiB

To limit the size of RAM used for one temporary table (the number of temp tables is not limited), there are some configuration variables that can be used. Of course it depends on the temporary table engine used. Before MySQL 8.0 only the MEMORY engine was available; in 8.0 the new TempTable engine is available and used by default.

With the MEMORY engine, the maximum size of an in-memory table is limited by the lower of these two variables: max_heap_table_size and tmp_table_size.

Once the size is exceeded (or the table contains incompatible field types), the temporary table goes to disk.

For the TempTable engine, which allows VARCHAR columns, VARBINARY columns, and other binary large object type columns (supported as of MySQL 8.0.13), the variable temptable_max_ram limits the size of the table.
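
A quick way to inspect both limits:

-- MEMORY engine: the lower of these two values applies
show global variables where variable_name in ('max_heap_table_size','tmp_table_size');

-- TempTable engine (default in 8.0): global RAM limit before spilling to disk
select @@global.temptable_max_ram;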

The Engine used for internal temporary tables is defined in these variables:

select * from performance_schema.global_variables 
where variable_name like 'internal_tmp%';
+----------------------------------+----------------+
| VARIABLE_NAME                    | VARIABLE_VALUE |
+----------------------------------+----------------+
| internal_tmp_disk_storage_engine | InnoDB         |
| internal_tmp_mem_storage_engine  | TempTable      |
+----------------------------------+----------------+

A small tip: from the OS, it’s possible to see on-disk temporary table files that have already been deleted (but are still held open) using lsof:

# lsof -p $(pidof mysqld) | grep -i del
mysqld  17275 mysql    5u   REG                8,1      4183   131170 /tmp/ibQdvz8e (deleted)
mysqld  17275 mysql    6u   REG                8,1         0   131180 /tmp/ibO15nJp (deleted)
mysqld  17275 mysql    7u   REG                8,1         0   131181 /tmp/ibqDRckA (deleted)
mysqld  17275 mysql    8u   REG                8,1         0   131182 /tmp/ibcb3V3Z (deleted)
mysqld  17275 mysql   12u   REG                8,1         0   131183 /tmp/ibzlww6g (deleted)

We can see that only one temp table has some records as it was a bit larger than 4k.

Buffers

And finally, there are caches and buffers that use some amount of memory. Some are global and easy to identify, like binlog_cache_size. What is important to know is how much memory a single user can consume, and whether max_user_connections should be reduced.

For example, every session will have its own read_buffer_size, read_rnd_buffer_size, sort_buffer_size, join_buffer_size, thread_stack, max_allowed_packet, net_buffer_length, ...

The following statement is also very interesting to get an idea of the allocated memory per user. That’s also why I encourage having a different user per application.

select * from memory_by_user_by_current_bytes where user in('lefred','root')\G
*************************** 1. row ***************************
              user: lefred
current_count_used: 19402
 current_allocated: 69.05 MiB
 current_avg_alloc: 3.64 KiB
 current_max_alloc: 42.06 MiB
   total_allocated: 3.78 GiB
*************************** 2. row ***************************
              user: root
current_count_used: 649
 current_allocated: 2.78 MiB
 current_avg_alloc: 4.38 KiB
 current_max_alloc: 1.10 MiB
   total_allocated: 244.42 MiB

I hope you now understand a bit more about how MySQL handles memory, why it’s not recommended to oversize session buffers, and that a new engine (TempTable) is available for internal temporary tables.

Don’t hesitate to share your tips too 😉

CRITICAL UPDATE for Percona XtraDB Cluster users: 5.7.23-31.31.2 Is Now Available

To resolve a critical regression, Percona announces the release of Percona XtraDB Cluster 5.7.23-31.31.2 on October 2, 2018. Binaries are available from the downloads section or from our software repositories.

This release resolves a critical regression in the upstream wsrep library and supersedes 5.7.23-31.31.

Percona XtraDB Cluster 5.7.23-31.31.2 is now the current release, based on the following:

All Percona software is open-source and free.

Fixed Bugs

  • #2254: A cluster conflict could cause a crash in Percona XtraDB Cluster 5.7.23 if autocommit=off.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

The post CRITICAL UPDATE for Percona XtraDB Cluster users: 5.7.23-31.31.2 Is Now Available appeared first on Percona Database Performance Blog.

Shinguz: MariaDB/MySQL Environment MyEnv 2.0.1 has been released


FromDual has the pleasure to announce the release of the new version 2.0.1 of its popular MariaDB, Galera Cluster and MySQL multi-instance environment MyEnv.

The new MyEnv can be downloaded here. How to install MyEnv is described in the MyEnv Installation Guide.

In the inconceivable case that you find a bug in the MyEnv please report it to the FromDual bug tracker.

Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

Upgrade from 1.1.x to 2.0

Please look at the MyEnv 2.0.0 Release Notes.

Upgrade from 2.0.0 to 2.0.1


shell> cd ${HOME}/product
shell> tar xf /download/myenv-2.0.1.tar.gz
shell> rm -f myenv
shell> ln -s myenv-2.0.1 myenv

Plug-ins

If you are using plug-ins for showMyEnvStatus create all the links in the new directory structure:

shell> cd ${HOME}/product/myenv
shell> ln -s ../../utl/oem_agent.php plg/showMyEnvStatus/

Upgrade of the instance directory structure

From MyEnv 1.0 to 2.0 the directory structure of instances has fundamentally changed. Nevertheless MyEnv 2.0 works fine with MyEnv 1.0 directory structures.

Changes in MyEnv 2.0.1

MyEnv

  • CloudLinux was added as supported distribution.
  • Introduced different brackets for up () and down [] MariaDB/MySQL Instances in up output.
  • Script setMyEnv.sh added to set environment for instance (e.g. via ssh).
  • MyEnv should not complain any more when a default my.cnf with include/includedir directives is used.
  • A missing instancedir configuration variable in myenv.conf now triggers a complaint. This can be a left-over from a 1.x to 2.y migration.
  • OpenSuSE Leap 42.3 support added.

MyEnv Installer

  • Instance names containing a dot '.' are not allowed any more.
  • A basedir without bin/mysqld is stripped from the installer overview.

MyEnv Utilities

  • Utilities cluster_conflict.php, cluster_conflict.sh, galera_monitor.sh, haproxy_maint.sh, group_replication_monitor.sh for Galera and Group Replication Cluster added.

For subscriptions of commercial use of MyEnv please get in contact with us.

MySQL User Camp, Bangalore


We are happy to announce that another "MySQL User Camp" will be held in Bangalore, India. Please find the details below:

  • Date: Thursday, October 11, 2018
  • Time: 3:00-5:30pm
  • Venue: OC001, Block1, B wing, Kalyani Magnum Infotech Park, J.P. Nagar, 7th Phase Bangalore – 76
  • Agenda: 
    • Listen to Tomas Ulin (VP Engineering, MySQL) talking about how MySQL 8.0 combines best of both worlds: SQL and NoSQL.
    • An engaging session by MySQL developer demonstrating “InnoDB Cluster in action”.
    • A presentation by MySQL Developers on MySQL Document Store.
  • Please send an email to tinku.ajit@oracle.com for registration. Registration is free (FCFS: first come, first served)

Explore the opportunity to get to know more about MySQL by networking with other MySQLers!

Introducing Agent-Based Database Monitoring with ClusterControl 1.7


We are excited to announce the 1.7 release of ClusterControl - the only management system you’ll ever need to take control of your open source database infrastructure!

ClusterControl 1.7 introduces exciting new agent-based monitoring features for MySQL, Galera Cluster, PostgreSQL & ProxySQL, security and cloud scaling features ... and more!

Release Highlights

Monitoring & Alerting

  • Agent-based monitoring with Prometheus
  • New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

Security & Compliance

  • Enable/disable Audit Logging on your MariaDB databases
  • Enable policy-based monitoring and logging of connection and query activity

Deployment & Scaling

  • Automatically launch cloud instances and add nodes to your cloud deployments

Additional Highlights

  • Support for MariaDB v10.3

View the ClusterControl ChangeLog for all the details!

ClusterControl
Single Console for Your Entire Database Infrastructure
Find out what else is new in ClusterControl

View Release Details and Resources

Release Details

Monitoring & Alerting

Agent-based monitoring with Prometheus

ClusterControl was originally designed to address modern, highly distributed database setups based on replication or clustering. It provides a systems view of all the components of a distributed cluster, including load balancers, and maintains a logical topology view of the cluster.

So far we’d gone the agentless monitoring route with ClusterControl, and although we love the simplicity of not having to install or manage agents on the monitored database hosts, an agent-based approach can provide higher resolution of monitoring data and has certain advantages in terms of security.

With that in mind, we’re happy to introduce agent-based monitoring as a new feature added in ClusterControl 1.7!

It makes use of Prometheus, a full monitoring and trending system that includes built-in and active scraping and storing of metrics based on time series data. One Prometheus server can be used to monitor multiple clusters. ClusterControl takes care of installing and maintaining Prometheus as well as exporters on the monitored hosts.

Users can now enable their database clusters to use Prometheus exporters to collect metrics on their nodes and hosts, thus avoiding excessive SSH activity for monitoring and metrics collection and using SSH connectivity only for management operations.

Monitoring & Alerting

New performance dashboards for MySQL, Galera Cluster, PostgreSQL & ProxySQL

ClusterControl users now have access to a set of new dashboards that have Prometheus as the data source with its flexible query language and multi-dimensional data model, where time series data is identified by metric name and key/value pairs. This allows for greater accuracy and customization options while monitoring your database clusters.

The new dashboards include:

  • Cross Server Graphs
  • System Overview
  • MySQL Overview, Replication, Performance Schema & InnoDB Metrics
  • Galera Cluster Overview & Graphs
  • PostgreSQL Overview
  • ProxySQL Overview

Security & Compliance

Audit Log for MariaDB

Continuous auditing is an imperative task for monitoring your database environment. By auditing your database, you can achieve accountability for actions taken or content accessed. Moreover, the audit may cover critical system components, such as the ones associated with financial data, to support regulations like SOX or the EU GDPR. Usually, this is achieved by logging information about database operations to an external log file.

With ClusterControl 1.7, users can now enable a plugin that will log all of their MariaDB database connections or queries to a file for further review; this release also introduces support for MariaDB version 10.3.
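
Under the hood this relies on MariaDB's Audit Plugin (server_audit); a manual equivalent, as a sketch rather than ClusterControl's exact steps, looks like:

-- enable the MariaDB audit plugin and log connections and queries
INSTALL SONAME 'server_audit';
SET GLOBAL server_audit_events = 'CONNECT,QUERY';
SET GLOBAL server_audit_logging = ON;  -- writes to server_audit.log in the datadir by default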

Additional New Functionalities

View the ClusterControl ChangeLog for all the details!

Download ClusterControl today!

Happy Clustering!
