This presentation was made at the LSPE event in Bangalore (India), held at Walmart Labs on 10-03-2018. It focuses on how we have harnessed the power of Ansible at Mydbops.
With MySQL 8.0, the error logging subsystem has been redesigned to use the new component architecture.
Thanks to this redesign, log events can now be filtered, and the output can be sent to multiple destinations in different formats (such as JSON). All of that is controlled by system variables.
This makes a log event suitable raw material for processing by more modern, automated systems such as Filebeat, Kibana, and so on.
Let’s check the default configuration:
mysql> select * from global_variables where VARIABLE_NAME like 'log_error_%';
+---------------------+----------------------------------------+
| VARIABLE_NAME       | VARIABLE_VALUE                         |
+---------------------+----------------------------------------+
| log_error_services  | log_filter_internal; log_sink_internal |
| log_error_verbosity | 2                                      |
+---------------------+----------------------------------------+
This means that log events follow this flow: first through log_filter_internal (the built-in filter component), then through log_sink_internal (the built-in log writer component).
To enable a log component, you need to use the INSTALL COMPONENT command and set the log_error_services global variable as desired. To disable it, use UNINSTALL COMPONENT.
Currently the available log components are in lib/plugins:
To specify a new output format, you need to use a log writer component (sink). So let’s try to use one.
To load a component, you need its URN. This is ‘file://’ plus the filename without the .so extension. For example, to load the JSON writer component, you enable it like this:
mysql> INSTALL COMPONENT 'file://component_log_sink_json';
Query OK, 0 rows affected (0.14 sec)

mysql> SET GLOBAL log_error_services = 'log_filter_internal; log_sink_internal; log_sink_json';
Query OK, 0 rows affected (0.01 sec)

mysql> select * from global_variables where VARIABLE_NAME like 'log_error_%';
+---------------------+-------------------------------------------------------+
| VARIABLE_NAME       | VARIABLE_VALUE                                        |
+---------------------+-------------------------------------------------------+
| log_error_services  | log_filter_internal; log_sink_internal; log_sink_json |
| log_error_verbosity | 2                                                     |
+---------------------+-------------------------------------------------------+
2 rows in set (0.00 sec)
Now if I generate an entry, the error appears in the standard error log file and also in a new JSON file, named after the error log (as specified in the log_error variable) with a number and the .json extension. More info is available in the manual.
Let’s have a look at the new entry generated by loading the group_replication plugin in this sandbox instance.
in the traditional error log:
2018-03-13T09:13:45.846708Z 24 [ERROR] [MY-011596] [Repl] Plugin group_replication reported: 'binlog_checksum should be NONE for Group Replication'
2018-03-13T09:13:45.853491Z 24 [ERROR] [MY-011660] [Repl] Plugin group_replication reported: 'Unable to start Group Replication on boot'
and in the json error log:
{
  "prio": 1,
  "err_code": 11596,
  "subsystem": "Repl",
  "component": "plugin:group_replication",
  "SQL_state": "HY000",
  "source_file": "plugin.cc",
  "function": "check_if_server_properly_configured",
  "msg": "Plugin group_replication reported: 'binlog_checksum should be NONE for Group Replication'",
  "time": "2018-03-13T09:13:45.846708Z",
  "thread": 24,
  "err_symbol": "ER_GRP_RPL_BINLOG_CHECKSUM_SET",
  "label": "Error"
}
{
  "prio": 1,
  "err_code": 11660,
  "subsystem": "Repl",
  "component": "plugin:group_replication",
  "SQL_state": "HY000",
  "source_file": "plugin.cc",
  "function": "plugin_group_replication_init",
  "msg": "Plugin group_replication reported: 'Unable to start Group Replication on boot'",
  "time": "2018-03-13T09:13:45.853491Z",
  "thread": 24,
  "err_symbol": "ER_GRP_RPL_FAILED_TO_START_ON_BOOT",
  "label": "Error"
}
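Because each entry is now a self-contained JSON document, it becomes easy to post-process the error log with standard tooling before shipping it to systems like Filebeat or Kibana. A minimal Python sketch (the field names are taken from the sample entries above; the helper functions are illustrative, not part of MySQL):

```python
import json

def parse_error_log(text):
    """Parse a log_sink_json error log: a stream of JSON objects."""
    decoder = json.JSONDecoder()
    entries, idx = [], 0
    text = text.strip()
    while idx < len(text):
        obj, idx = decoder.raw_decode(text, idx)  # read one JSON object
        entries.append(obj)
        while idx < len(text) and text[idx].isspace():
            idx += 1                              # skip whitespace between objects
    return entries

def errors_for_subsystem(entries, subsystem):
    """Keep only 'Error'-labelled entries from a given subsystem."""
    return [e for e in entries
            if e.get("label") == "Error" and e.get("subsystem") == subsystem]

sample = '''
{ "prio": 1, "err_code": 11596, "subsystem": "Repl", "label": "Error",
  "msg": "binlog_checksum should be NONE for Group Replication" }
{ "prio": 3, "err_code": 10096, "subsystem": "Server", "label": "Note",
  "msg": "just a note" }
'''
repl_errors = errors_for_subsystem(parse_error_log(sample), "Repl")
print(len(repl_errors), repl_errors[0]["err_code"])  # → 1 11596
```

With the traditional free-text format, the same filtering would require fragile regular expressions.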
The new error log service gives you the possibility to use components to filter the events.
The default built-in filter, log_filter_internal, filters events only by their priority; you can control that using the global variable log_error_verbosity (default is 2). But there is another component available that allows you to filter on rules that you define: log_filter_dragnet. Let’s try to set up this one:
mysql> INSTALL COMPONENT 'file://component_log_filter_dragnet';
mysql> SET GLOBAL log_error_services = 'log_filter_dragnet; log_sink_internal; log_sink_json';
mysql> SELECT * from global_variables where VARIABLE_NAME like 'log_error_ser%';
+--------------------+------------------------------------------------------+
| VARIABLE_NAME      | VARIABLE_VALUE                                       |
+--------------------+------------------------------------------------------+
| log_error_services | log_filter_dragnet; log_sink_internal; log_sink_json |
+--------------------+------------------------------------------------------+
and we can already check the current dragnet rules:
mysql> select * from global_variables where VARIABLE_NAME like 'dragnet%'\G
*************************** 1. row ***************************
 VARIABLE_NAME: dragnet.log_error_filter_rules
VARIABLE_VALUE: IF prio>=INFORMATION THEN drop. IF EXISTS source_line THEN unset source_line.
1 row in set (0.30 sec)
You can find much more information about the dragnet rule language in the manual.
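To get a feel for what those default rules mean (drop anything at INFORMATION priority or lower, and strip the source_line field from what remains), here is a toy Python evaluator of just those two rules. This is only an illustration of the rule semantics, not the actual log_filter_dragnet implementation:

```python
# Toy model of the two default dragnet rules:
#   IF prio>=INFORMATION THEN drop.
#   IF EXISTS source_line THEN unset source_line.
# Priority values: lower is more severe (ERROR=1, WARNING=2, INFORMATION=3).
ERROR, WARNING, INFORMATION = 1, 2, 3

def apply_default_rules(event):
    """Return the filtered event dict, or None if the event is dropped."""
    if event.get("prio", ERROR) >= INFORMATION:
        return None                      # rule 1: drop low-priority events
    event = dict(event)                  # don't mutate the caller's dict
    event.pop("source_line", None)       # rule 2: unset source_line if present
    return event

events = [
    {"prio": ERROR, "msg": "disk full", "source_line": 123},
    {"prio": INFORMATION, "msg": "startup note"},
]
filtered = [apply_default_rules(e) for e in events]
kept = [e for e in filtered if e is not None]
print(kept)  # → [{'prio': 1, 'msg': 'disk full'}]
```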
As you can see, the new error logging service is much more powerful than what was available prior to MySQL 8.0. It’s yet another example of what the new component architecture makes possible in 8.0.
And for those who want even more, or something very specific, don’t forget that you can now modify error logging directly by creating your own components.
Please join Percona’s Principal Support Engineer, Sveta Smirnova, as she presents Basic External MySQL Troubleshooting Tools on March 15, 2018 at 10:00 am PDT (UTC-7) / 1:00 pm EDT (UTC-4).
In my troubleshooting webinar series, I normally like to discuss the built-in instruments available via the SQL interface. While they are effective and help you understand what is going on, external tools are also designed to make the life of a database administrator easier.
In this webinar, I will discuss the external tools, toolkits and graphical instruments most valued by Support teams and customers. I will show the main advantages of these tools, and provide examples on how to effectively use them.
I will cover Percona Toolkit, MySQL Utilities, MySQL Sandbox, Percona Monitoring and Management (PMM) and a few other instruments.
Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues and bugs, finding patterns that can quickly solve typical issues, and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona, Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group at MySQL AB, then Sun, then Oracle. She is the author of the book “MySQL Troubleshooting” and of the JSON UDF functions for MySQL.
In this blog post, we’ll look at how to navigate some of the complexities of multi-source GTID replication.
GTID replication is often a real challenge for DBAs, especially when multi-source GTID replication is involved. A while back, I came across a really interesting customer environment with shards where multi-master, multi-source, multi-threaded MySQL 5.6 MIXED replication was active. This is a highly complex environment that has both pros and cons, introducing risks as a trade-off for specific customer requirements.
This is the set up of part of this environment:
I started looking into this setup when a statement broke replication between db1 and db10: the statement was executed on a schema that was not present on db10.
This also meant that changes originating from db1 were no longer pushed down to db100 via db10, since we had stopped the replication thread (for the db1 channel).
On the other hand, replication was not stopped on db2, because the schema in question was present there. Replication between db2 and db20, however, was broken as well, because the schema was not present on db20.
In order to fix db1->db10 replication, four GTID sets were injected on db10.
Here are some interesting blog posts regarding how to handle/fix GTID replication issues:
After injecting the GTID sets, we started replication again and everything ran fine.
After that, we had to check the db2->db20 replication which, as I’ve already said, was broken as well. In this case, injecting only the first GTID transaction into db20, instead of all of those causing issues on db10, was enough!
You may wonder how this is possible, right? The answer is that the rest of them were replicated from db10 to db20, although the channel was not the same.
Another strange thing was that although the replication thread for the db2->db20 channel was stopped (broken), checking the slave status on db20 showed that Executed_Gtid_Set was moving for all channels, even though Retrieved_Gtid_Set for the broken one was not! So what was happening there?
This raised my curiosity, so I decided to do some further investigation and created scenarios regarding other strange things that could happen. An interesting one was about the replication filters. In our case, I thought “What would happen in the following scenario … ?”
Let’s say we write a row from db1 to db123.table789. This row is replicated to db10 (say, using channel1) and to db2 (say, using channel2). On channel1 we filter out the db123.% tables; on channel2 we don’t. db1 writes the row and the entry to its binary log. db2 writes the row after reading the entry from the binary log, subsequently writes the entry to its own binary log, and replicates this change to db20. This change is also replicated to db10. So now, on db10 (depending on which channel sees the GTID first), either it gets filtered on channel1 and written to db10’s binary log as just a start…commit with any actual DDL/DML removed, or, if it is read first on channel2 (db1->db2, then db20->db10), it is NOT filtered out and is executed instead. Is this correct? It definitely ISN’T!
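To make the GTID deduplication behaviour concrete, it can be modelled with a toy server that keeps a single executed-GTID set shared by all channels (the class, names and GTID values below are mine, purely for illustration):

```python
class ToyServer:
    """Minimal model of a multi-source slave: one executed-GTID set
    shared across all replication channels."""
    def __init__(self, name):
        self.name = name
        self.executed = set()   # GTIDs executed, regardless of channel
        self.applied = []       # (gtid, statement, channel) actually applied

    def receive(self, gtid, stmt, channel):
        """Apply a replicated transaction unless its GTID was already executed."""
        if gtid in self.executed:
            return False        # already executed via another channel: skipped
        self.executed.add(gtid)
        self.applied.append((gtid, stmt, channel))
        return True

db10 = ToyServer("db10")
# The same transaction reaches db10 twice, over two different channels:
first = db10.receive("uuid1:42", "INSERT ...", channel="ch_db1")
second = db10.receive("uuid1:42", "INSERT ...", channel="ch_db20")
print(first, second, len(db10.applied))  # → True False 1
```

Whichever channel delivers the transaction first wins; the second delivery is a no-op because the GTID is already in the executed set.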
You can find answers to the above questions in the points of interest listed below. Although it’s not really clear through the official documentation, this is what happens with GTID replication and multi-source GTID replication:
Executed_Gtid_Set is common for all channels. This means that regardless of the originating channel, when a GTID transaction is executed it is recorded in all channels’ Executed_Gtid_Set. Although this is logical (each database is unique, so if a transaction is going to affect a database it shouldn’t be tied to a single channel, regardless of the channel it uses), the documentation doesn’t provide much info around this.

Today, yet another blog post about improvements in MySQL 8.0, this time related to Performance_Schema. Before MySQL 8.0 it was not always easy to get an example of the queries you could find in Performance_Schema when looking for statement summaries. You had to join several tables (even from sys) to achieve this goal, as I explained in this post.
Now in MySQL 8.0, we have changed the table events_statements_summary_by_digest. This table now contains six extra columns:
- QUANTILE_95: stores the 95th percentile of the statement latency, in picoseconds.
- QUANTILE_99: stores the 99th percentile of the statement latency, in picoseconds.
- QUANTILE_999: stores the 99.9th percentile of the statement latency, in picoseconds.
- QUERY_SAMPLE_TEXT: captures a query sample that can be used with EXPLAIN to get a query plan.
- QUERY_SAMPLE_SEEN: stores the timestamp of the query sample.
- QUERY_SAMPLE_TIMER_WAIT: stores the query sample execution time.

FIRST_SEEN and LAST_SEEN have also been modified to use fractional seconds. The previous definition was:
  Field: LAST_SEEN
   Type: timestamp
   Null: NO
    Key:
Default: 0000-00-00 00:00:00
  Extra:
Now it is:
  Field: LAST_SEEN
   Type: timestamp(6)
   Null: NO
    Key:
Default: 0000-00-00 00:00:00.000000
  Extra:
The main goal is to capture a full example query like it was made in production with some key information about this query example and to make it easily accessible.
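Since the new quantile columns report latency in picoseconds, a common first step when consuming them is converting to milliseconds before comparing against a latency budget. A small sketch with hypothetical rows (there are 10^9 picoseconds in a millisecond):

```python
PICOS_PER_MS = 1_000_000_000  # 10^9 picoseconds per millisecond

def slow_digests(rows, slo_ms):
    """Return digest texts whose 95th-percentile latency exceeds slo_ms.
    rows mimics a few columns of events_statements_summary_by_digest."""
    return [r["DIGEST_TEXT"] for r in rows
            if r["QUANTILE_95"] / PICOS_PER_MS > slo_ms]

rows = [  # hypothetical values, in picoseconds
    {"DIGEST_TEXT": "SELECT * FROM orders WHERE id = ?",
     "QUANTILE_95": 2_000_000_000},      # 2 ms
    {"DIGEST_TEXT": "SELECT SLEEP(?)",
     "QUANTILE_95": 1_500_000_000_000},  # 1500 ms
]
print(slow_digests(rows, slo_ms=100))  # → ['SELECT SLEEP(?)']
```

The offending QUERY_SAMPLE_TEXT can then be fed straight into EXPLAIN, which is exactly the workflow this change was designed for.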
Proudly announcing the latest stable release, ProxySQL 1.4.7, as of 14th March 2018.
ProxySQL is a high performance, high availability, protocol-aware proxy for MySQL. It can be downloaded here, and is freely usable and accessible under the GPL license.
ProxySQL 1.4.7 includes a number of important improvements and bug fixes including:
- MyHGM_myconnpoll_destroy
- MyHGM_myconnpoll_get
- MyHGM_myconnpoll_get_ok
- MyHGM_myconnpoll_push
- MyHGM_myconnpoll_reset
- Queries_frontends_bytes_recv
- Queries_frontends_bytes_sent
- stats_mysql_global status variables using SELECT @@ syntax #1375
- monitor_read_only_max_timeout_count variable to allow multiple attempts on read only check #1206
- STMT_PREPARE_RESPONSE
- MyComQueryCmd not initialized could cause crash #1370
- SET NAMES ... COLLATE #1357
- main-bundle.min.css in web UI #1354
- SET commands #1373
- CHANGE USER #1393
- utf8_unicode_ci in MariaDB Client Connector C #1396
- RO=1 becomes RO=0 (similar to #1039)

A special thanks to all the people that report bugs: this makes each version of ProxySQL better than the previous one.
Please report any bugs or feature requests on the GitHub issue tracker.
The MariaDB project is pleased to announce the immediate availability of MariaDB Connector/J 2.2.3 and MariaDB Connector/J 1.7.3. See the release notes and changelogs for details and visit mariadb.com/downloads/connector to download.
Download MariaDB Connector/J 2.2.3
Release Notes Changelog About MariaDB Connector/J
Download MariaDB Connector/J 1.7.3
Release Notes Changelog About MariaDB Connector/J
If you’ve been delaying nominating someone or something for a MySQL Community Award, now is the time to submit it. You can submit it quickly via a google form, or go through the trouble of tweeting or emailing it.
After March 15, the Committee will begin voting.
As a reminder: there are categories for Community Contributor (a person), Application, and Corporate Contributor. This is a fantastic way to honor new and old community members for the work they do – often in their spare time, and to give thanks to the corporations that help the community.
The original post with more details is here: http://mysqlawards.org/mysql-community-awards-2018/
In this blog, I will provide answers to the Q & A for the Basic Internal Troubleshooting Tools for MySQL Server webinar.
First, I want to thank everybody for attending my February 15, 2018, webinar on troubleshooting tools for MySQL. The recording and slides for the webinar are available here. Below is the list of your questions that I was unable to answer fully during the webinar.
Q: How do we prevent the schema prefix from appearing in SHOW CREATE VIEW? This is causing issues with restoring on another server with a different DB. See the issue and a reproducible test case here: https://gist.github.com/anonymous/30cb138b598fec46be762789397796b6
A: I shortened the example in order to fit it in this blog:
mysql> create table t1(f1 int);
Query OK, 0 rows affected (3.47 sec)

mysql> create view v1 as select * from t1;
Query OK, 0 rows affected (0.21 sec)

mysql> show create view v1\G
*************************** 1. row ***************************
                View: v1
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `v1` AS select `t1`.`f1` AS `f1` from `t1`
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.00 sec)

mysql> select * from information_schema.views where table_schema='test'\G
*************************** 1. row ***************************
       TABLE_CATALOG: def
        TABLE_SCHEMA: test
          TABLE_NAME: v1
     VIEW_DEFINITION: select `test`.`t1`.`f1` AS `f1` from `test`.`t1`
        CHECK_OPTION: NONE
        IS_UPDATABLE: YES
             DEFINER: root@localhost
       SECURITY_TYPE: DEFINER
CHARACTER_SET_CLIENT: utf8
COLLATION_CONNECTION: utf8_general_ci
1 row in set (0.00 sec)
The issue you experienced happened because even if you created the view as SELECT foo FROM table1;, it is stored as SELECT foo FROM your_schema.table1;. You can see this if you query the *.frm file for the view:
sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat var/mysqld.1/data/test/v1.frm
TYPE=VIEW
query=select `test`.`t1`.`f1` AS `f1` from `test`.`t1`
md5=5840f59d1287385629fcb9b948e53d96
updatable=1
algorithm=0
definer_user=root
definer_host=localhost
suid=2
with_check_option=0
timestamp=2018-02-24 10:27:45
create-version=1
source=select * from t1
client_cs_name=utf8
connection_cl_name=utf8_general_ci
view_body_utf8=select `test`.`t1`.`f1` AS `f1` from `test`.`t1`
You cannot prevent the schema prefix from being stored. If you restore the view on a different server with a different database name, you should edit the view definition manually. If you already restored a view that points to a non-existent schema, just recreate it. A VIEW is metadata only and does not hold any data, so this operation is non-blocking and runs momentarily.
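If you have a dumped view definition to retarget at a new schema, a quick text rewrite of the backtick-quoted schema references can save some manual editing. A sketch (the function name is mine; always review the result before executing, since the old schema name could also appear inside string literals):

```python
import re

def retarget_view(create_stmt, old_schema, new_schema):
    """Replace backtick-quoted schema references, e.g. `test`. -> `prod`."""
    pattern = re.compile(r"`%s`\." % re.escape(old_schema))
    return pattern.sub("`%s`." % new_schema, create_stmt)

definition = "select `test`.`t1`.`f1` AS `f1` from `test`.`t1`"
print(retarget_view(definition, "test", "prod"))
# → select `prod`.`t1`.`f1` AS `f1` from `prod`.`t1`
```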
Q: What is thread/sql/compress_gtid_table in performance_schema.threads?
A: thread/sql/compress_gtid_table is the name of the instrument. You can read this and other instrument names as follows:

thread/ is a group of instruments. In this case, these are the instruments that are visible in the THREADS table.

thread/sql/ is the group of instruments that are part of the server kernel code. If you are not familiar with the MySQL source tree, download the source code tarball and check its content. The main components are:
- sql – the server kernel
- storage – where the storage engine code is located (storage/innobase is InnoDB code, storage/myisam is MyISAM code, and so on)
- vio – input-output functions
- mysys – code shared between all parts of the server
- client – the client library and utilities
- strings – functions to work with strings

This is not a full list. For more information, consult the MySQL Internals Manual.
thread/sql/compress_gtid_table is the name of the particular instrument.

Unfortunately, there is no link to the source code for instrumented threads in the THREADS table, but we can easily find them in the sql directory. The function compress_gtid_table is defined in sql/rpl_gtid_persist.cc, and we can check the comments to find out what it does:
/**
  The main function of the compression thread.
  - compress the gtid_executed table when get a compression signal.

  @param  p_thd    Thread requesting to compress the table

  @return
    @retval 0    OK. always, the compression thread will swallow any error
                 for going to wait for next compression signal until it is
                 terminated.
*/
extern "C" {
static void *compress_gtid_table(void *p_thd) {
You can also find the description of mysql.gtid_executed compression in the User Reference Manual.
You can follow the same actions to find out what other MySQL threads are doing.
Q: How does a novice on MySQL learn the core basics about MySQL. The documentation can be very vast which surpasses my understanding right now. Are there any good intro books you can recommend for a System Admin?
A: I learned MySQL a long time ago, and the book I can recommend was written for version 5.0: “MySQL 5.0 Certification Study Guide” by Paul DuBois, Stefan Hinz and Carsten Pedersen. The book is in two parts: one is devoted to SQL developers and explains how to run and tune queries; the second part is for DBAs and describes how to tune the MySQL server. I asked my colleagues to suggest more modern books for you, and this one is still on the list for many. It is in any case an awesome book for beginners; just note that MySQL has changed a lot since 5.0, so you will need to deepen your knowledge after you finish reading it.
Another book that was recommended is “MySQL” by Paul DuBois. It is written for beginners and has plenty of content. Paul DuBois has been working on (and continues to work on) the official MySQL documentation for many years, and knows MySQL in great detail.
Another book is “Murach’s MySQL” by Joel Murach, which is used as a course book in many colleges for “Introduction into Databases” type classes.
For System Administrators, you can read “Systems Performance: Enterprise and the Cloud” by Brendan Gregg. This book talks about how to tune operating systems for performance. This is one of the consistent tasks we have to do when administering MySQL. I also recommend that you study Brendan Gregg’s website, which is a great source of information for everyone who is interested in operating system performance tuning.
After you finish the books for novices, you can check out “High Performance MySQL, 3rd Edition” by Peter Zaitsev, Vadim Tkachenko, Baron Schwartz and “MySQL Troubleshooting” by Sveta Smirnova (yours truly =) ). These two books require at least basic MySQL knowledge, however.
Q: Does database migration go the same way? Do these tools work for migration as well?
A: The tools I discussed in this webinar are available for any version of MySQL/Percona/MariaDB server, and you may use them for migration. For example, it is always useful to compare configuration (SHOW GLOBAL VARIABLES) on both the “old” and “new” servers. It helps if you observe performance drops on the “new” server. Or you can check table definitions before and after migration. There are many more uses for these tools during the migration process.
Q: How can we take backup of a single schema from a MySQL AWS instance without affecting the performance of applications. An AWS RDS instance to be more clear. mysqldump we cannot use in RDS instance in the current scenario.
A: You can connect to your RDS instance with mysqldump from your local machine, exactly like your MySQL clients connect to it. Then you can collect a dump of a single database or table, or even specify the option --where to limit the resulting set to only a portion of the table. Note that by default mysqldump is blocking, but if you back up solely transactional tables (InnoDB, TokuDB, MyRocks) you can run mysqldump with the option --single-transaction, which starts the transaction at the beginning of the backup job.
Alternatively, you can use AWS Database Migration Service, which allows you to replicate your databases. Then you can take a backup of a single schema using whatever method you like.
Q: Why do some sites suggest to turn off information and performance schema? Is it important to keep it on or turn it off?
A: You cannot turn off Information Schema. It is always available.
Performance Schema in earlier versions (before 5.6.14) was resource-consuming, even if it was idle while enabled. These limitations were fixed a long time ago, and you don’t need to keep it off. At least unless you hit some new bug.
Q: How do we handle storage level threshold if a data file size grows and reaches max threshold when unnoticed? Can you please help on this question?
A: Do you mean what happens if the data file grows until the filesystem has no space left? In this case, clients receive the error "OS error code 28: No space left on device" until space is freed and mysqld can start functioning normally again. If it can still write to the error log file (for example, if it is located on a different disk), you will see messages about error 28 there too.
Q: What are the performance bottlenecks when enabling performance_schema. Is there any benchmark we can have?
A: Just enabling Performance Schema in version 5.6 and up does not cause any performance issues. With version 5.7, it can also start with almost zero allocated memory, so it won’t affect your other buffers. Performance Schema causes impact when you enable particular instruments, mostly those whose names start with events_waits_*. I performed benchmarks on the effects of particular Performance Schema instruments and published them in this post.
Q: Suggest us some tips about creating a real-time dashboards for the same as we have some replication environment? it would be great if you can help us here for building business level dashboards
A: This is a topic for yet another webinar or, better still, a tutorial. For starters, I recommend you check out the “MySQL Replication” dashboard in PMM and extend it using the metrics that you need.
Thanks for attending the webinar on internal troubleshooting tools for MySQL.
In this blog post, we’ll look at how to create PMM custom graphs and dashboards to track what you need to see in your database.
Percona Monitoring and Management (PMM)’s default set of graphs is pretty complete: it covers most of the stuff a DBA requires to fully visualize database servers. However, sometimes custom information is needed in graphical form, and you just feel your PMM deployment is missing a graph.
Recently, a customer request came in asking for a better understanding of a specific metric: table growth, or more specifically the daily table growth (in bytes) for the last 30 days.
The graph we came up with looks like this:
…which graphs the information that comes from this query:
increase(mysql_info_schema_table_size{instance="$host",component="data_length"}[1d])
But what does that query mean, and how do I create one myself? I’m glad you asked! Let’s go deep into the technical details!
Before creating any graph, we must ensure that we have the data we want to represent graphically. So, the first step is to ensure data collection.
This data is already collected by the Percona mysqld_exporter, as defined in the “Collector Flags” table from the GitHub repo: https://github.com/percona/mysqld_exporter/#collector-flags
Cool! Now we need a Prometheus query in order to get the relevant data. Luckily, the Prometheus documentation is very helpful and we came up with a query in no time.
What do we need for the query? In this case, it is a metric, a label and a time range. Every PMM deployment has access to the Prometheus console by adding “/prometheus” to the URL. The console is incredibly helpful when playing with queries. The console looks like this:
The time series values collected by the exporter are stored in the metrics inside of Prometheus. For our case, the metric name is called mysql_info_schema_table_size, which I figured out by using the Prometheus console “Expression” text input and its autocomplete feature. This shows you the options available as you’re writing. All the metrics collected by mysqld_export start with “mysql”.
Labels are different per metric, but they are intuitively named. We need the instance and component labels. Instance is the hostname and component is equivalent to the column name of a MySQL table. The component we need is “data_length”.
This is easy: since it is a daily value, the time frame is 1d. The time frame is not mandatory, but it is a parameter required by the function we are going to use to calculate the increase, which is called increase().
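Conceptually, increase() computes the growth of a counter over the window, compensating for counter resets. A simplified Python model of that calculation (the real Prometheus implementation also extrapolates to the window boundaries, so its numbers will not match this sketch exactly):

```python
def increase(samples):
    """Approximate Prometheus increase() over ordered counter samples.
    Any drop in value is treated as a counter reset (restart from 0)."""
    if len(samples) < 2:
        return 0.0
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            total += cur          # counter reset: count growth since 0
        else:
            total += cur - prev   # normal monotonic growth
        prev = cur
    return total

# data_length samples over one day, with a reset in the middle:
print(increase([100, 150, 170, 20, 60]))  # → 130.0
```

Reset handling matters here: mysql_info_schema_table_size can drop when a table is truncated or rebuilt, and a plain last-minus-first delta would go negative.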
That’s how we ended up with the query that feeds the metrics, which end up in here:
You will notice it’s using a variable: $host. We define that variable in the dashboard creation, explained below.
PMM best practice is to take a copy of the existing dashboard using Setting > Save as…, since edits to Percona-provided dashboards are not preserved during upgrades. In this example, we will start with an empty dashboard.
Adding a new dashboard is as easy as clicking the “New” button from the Grafana dropdown menu:
After that, you choose the type of element that you want on a new row, which is a Graph in this case:
We like to use variables for our graphs – changing which server we analyze, for example. To add variables to the dashboard, we need to head up to the Templating option and add the variables:
Make sure you put a meaningful name for your dashboard, and you’re all set! A good practice will be to export the JSON definition of your dashboard as a backup for future recovery, or to just share it with others.
The final dashboard is called “MySQL Table Size” and holds another graph showing the table size during the timeframe for the top ten biggest tables. It looks like this:
The top right of the screen has some drop down links, the ones that look like this:
You can add links on the “Link” tab of the dashboard settings:
In case you are wondering, the query for the “Table size” graph is:
topk(10,sort_desc(sum(mysql_info_schema_table_size{instance="$host",component=~".*length"}) by (schema, table)))
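In other words, that query sums every ".*length" component per (schema, table) pair and keeps the ten largest sums. The same shape, modelled in Python with made-up series data (a rough model of sum() by (schema, table) plus topk(), not Prometheus itself):

```python
from collections import defaultdict

def topk_table_sizes(series, k):
    """series: (schema, table, component, bytes) tuples, mimicking the
    mysql_info_schema_table_size metric with component=~".*length"."""
    totals = defaultdict(int)
    for schema, table, component, value in series:
        if component.endswith("length"):      # data_length, index_length, ...
            totals[(schema, table)] += value  # sum(...) by (schema, table)
    # topk(k, sort_desc(...)): largest k sums first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]

series = [
    ("shop", "orders", "data_length", 500),
    ("shop", "orders", "index_length", 200),
    ("shop", "users", "data_length", 100),
]
print(topk_table_sizes(series, k=10))
# → [(('shop', 'orders'), 700), (('shop', 'users'), 100)]
```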
So next time you want to enhance PMM and you know that there is data already inside Prometheus, but PMM lacks the visualization you want, just add it! Create a new graph and put it to your own custom dashboard!
This is our second blog in the ProxySQL series (Blog I: MySQL Replication Read-write Split up). It will cover how to integrate ProxySQL with MHA to handle failover of database servers.
We already have a Master-Slave replication setup behind ProxySQL from the previous blog [ProxySQL On MySQL Replication].
For this setup, we have added one more node for the MHA Manager, which will keep an eye on the master and slave status.
ProxySQL can be configured with MHA for a highly available setup with zero downtime.
MHA’s role in failover:
The MHA tool is used for failover. During failover, MHA promotes the most up-to-date slave (the slave with the most recent transactions) as the new master, applies the CHANGE MASTER command on the new slave, and changes the read_only flag on the new master and slave.
ProxySQL’s role in failover:
When a failover happens (due to a crash, or manually for any maintenance activity), ProxySQL detects the change (by checking the read_only flag), promotes the new master server’s IP into the writer hostgroup, and starts sending traffic to the new master.
Each row in the mysql_replication_hostgroups table in ProxySQL represents a pair of writer_hostgroup and reader_hostgroup.
ProxySQL monitors the value of read_only via mysql_server_read_only_log for all the servers. If read_only=1, the host is copied/moved to the reader_hostgroup, while if read_only=0, the host is copied/moved to the writer_hostgroup.
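That read_only-driven routing can be summarised in a few lines. A toy model of the hostgroup move (not ProxySQL’s actual implementation; the hostgroup ids and addresses are arbitrary examples):

```python
def route_host(host, read_only, writer_hg, reader_hg, hostgroups):
    """Place host into the writer or reader hostgroup based on read_only.
    hostgroups: dict mapping hostgroup id -> set of host addresses."""
    if read_only == 0:
        hostgroups[writer_hg].add(host)
        hostgroups[reader_hg].discard(host)
    else:
        hostgroups[reader_hg].add(host)
        hostgroups[writer_hg].discard(host)

hostgroups = {10: set(), 20: set()}   # 10 = writers, 20 = readers
route_host("172.17.0.1", 0, 10, 20, hostgroups)  # current master
route_host("172.17.0.2", 1, 10, 20, hostgroups)  # slave
# failover: the slave is promoted and MHA flips its read_only to 0
route_host("172.17.0.2", 0, 10, 20, hostgroups)
print(sorted(hostgroups[10]))  # → ['172.17.0.1', '172.17.0.2']
```

In a real failover the crashed master would also be shunned or observed with read_only=1 and moved out of the writer hostgroup; the sketch only shows the flag-to-hostgroup mapping.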
Installing MHA
If replication is classic binlog/position based, then install the MHA node package on all hosts involved (manager, master, slaves); for GTID-based replication, it has to be installed only on the manager node.
Install MHA node :
Install MHA on all DB nodes and MHA manager server. More information
apt-get -y install libdbd-mysql-perl
dpkg -i mha4mysql-node_0.56-0_all.deb
Install MHA manager :
Only install on MHA manager server.
#dependencies
apt-get install -y libdbi-perl libdbd-mysql-perl libconfig-tiny-perl liblog-dispatch-perl libparallel-forkmanager-perl libnet-amazon-ec2-perl
dpkg -i mha4mysql-manager_0.56-0_all.deb
Configuration changes :
Changes only on node5 (172.17.0.5), the MHA Manager:
Create directories :
mkdir -p /etc/mha/ /var/log/mha/
Config file :
cat /etc/mha/cluster1.conf
[server default]
# mysql user and password
user=root
password=xxx
# replication user password
repl_user=repl
repl_password=xxx
remote_workdir=/var/tmp
# working directory on the manager
manager_workdir=/var/log/mha/
# manager log file
manager_log=/var/log/mha/mha.log
ping_interval=15
# As we don't have to deal with VIPs here, disable master_ip_failover_script
#master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
master_binlog_dir=/data/log/
secondary_check_script=/etc/mha/mha_prod/failover_triggered.sh
report_script=/etc/mha/mha_prod/failover_report.sh
master_pid_file=/var/run/mysqld/mysqld.pid
ssh_user=root
log_level=debug
#set this to 0 if YOU ARE SURE THIS CAN"T BREAK YOUR REPLICATION
check_repl_filter=1
[server1]
hostname=172.17.0.1
port=3306
[server2]
hostname=172.17.0.2
port=3306
[server3]
hostname=172.17.0.3
port=3306
no_master=1
master_ip_failover: script used to switch the virtual IP address.
master_ip_online_change: script used in a switchover when the master is online or dead.
NOTE: Don’t forget to comment out the “FIX ME” lines in the above scripts.
Custom scripts: the scripts below are optional.
secondary_check_script : It is always good to double-check the availability of the master. More info
report_script : With this script we can configure alerts or an email when a failover completes. In Detail
Now run tests against the cluster using the two scripts below:
– masterha_check_ssh
– masterha_check_repl
Please note: if these checks fail, MHA will refuse to run any kind of failover.
root@MHA-Node:/etc/mha # masterha_check_ssh --conf=/etc/mha/cluster1.conf
-- truncated long output
[info] All SSH connection tests passed successfully.
root@MHA-Node# masterha_check_repl --conf=/etc/mha/cluster1.conf
172.17.0.1(172.17.0.1:3306) (current master)
+--172.17.0.2(172.17.0.2:3306)
+--172.17.0.3(172.17.0.3:3306)
-- truncated long output
MySQL Replication Health is OK.
To run a manual failover:
masterha_master_switch --master_state=alive --conf=/etc/mha/cluster1.conf --orig_master_is_new_slave [--new_master_host=]
new_master_host – an optional parameter if you want to select the new master. If we don’t specify a value, the most up-to-date slave is considered as the new master.
Automatic failover:
We need to run masterha_manager in the background to monitor the cluster status:
nohup masterha_manager --conf=/etc/mha/cluster1.conf < /dev/null > /var/log/mha/mha.log 2>&1 &
When an automatic failover happens (in case of a master crash), the logs look like:
tail -f /var/log/mha/mha.log
----- Failover Report -----
Master 172.17.0.1(172.17.0.1:3306) is down!
Started automated(non-interactive) failover.
172.17.0.1(172.17.0.1:3306)
Selected 172.17.0.2(172.17.0.2:3306) as a new master.
172.17.0.2(172.17.0.2:3306): OK: Applying all logs succeeded.
172.17.0.2(172.17.0.2:3306): OK: Activated master IP address.
172.17.0.3(172.17.0.3:3306): OK: Slave started, replicating from 172.17.0.2(172.17.0.2:3306)
172.17.0.2(172.17.0.2:3306): Resetting slave info succeeded.
Master failover to 172.17.0.2(172.17.0.2:3306) completed successfully.
We can also check the status of masterha_manager:
RUNNING:
root@mysql-monitoring /etc/mha # masterha_check_status --conf=/etc/mha/cluster1.conf
cluster1 (pid:15810) is running(0:PING_OK), master:172.17.0.1
NOT RUNNING:
root@mysql-monitoring /etc/mha # masterha_check_status --conf=/etc/mha/cluster1.conf
cluster1 is stopped(2:NOT_RUNNING).
Remember, the masterha_manager script stops working in two situations: after it completes a failover, and when it hits a fatal error (for example, repeated health-check or SSH failures). In either case it must be restarted to resume monitoring.
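One way to keep an eye on it is a small watchdog around masterha_check_status. This is only a sketch: the status line is stubbed with the sample output shown above, so it runs without a live manager; in real use you would capture the command's actual output.

```shell
# Stubbed status line; in real use:
#   status_line=$(masterha_check_status --conf=/etc/mha/cluster1.conf)
status_line="cluster1 (pid:15810) is running(0:PING_OK), master:172.17.0.1"

case "$status_line" in
  *PING_OK*)     verdict="manager healthy" ;;
  *NOT_RUNNING*) verdict="manager stopped - restart masterha_manager" ;;
  *)             verdict="unknown state - investigate" ;;
esac
echo "$verdict"
```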
Check backend status at ProxySQL:
The ProxySQL table below shows the current master and its slaves after failover, with their ONLINE status.
Admin > select hostgroup,srv_host,status,Queries,Bytes_data_sent,Latency_us from stats_mysql_connection_pool where hostgroup in (0,1);
+-----------+------------+----------+---------+-----------------+------------+
| hostgroup | srv_host | status | Queries | Bytes_data_sent | Latency_us |
+-----------+------------+----------+---------+-----------------+------------+
| 0 | 172.17.0.1 | ONLINE | 12349 | 76543232 | 144 |
| 1 | 172.17.0.2 | ONLINE | 22135 | 87654356 | 190 |
| 1 | 172.17.0.3 | ONLINE | 22969 | 85344235 | 110 |
| 1 | 172.17.0.1 | ONLINE | 1672 | 4534332 | 144 |
+-----------+------------+----------+---------+-----------------+------------+
Preserve relay logs and purge regularly:
As we have two slaves in this setup, MHA keeps relay logs around for recovering the other slaves (to ensure this, it disables relay_log_purge).
We need to periodically purge old relay logs, just like binary logs. MHA Node provides a command-line tool, purge_relay_logs, to do that.
purge_relay_logs removes relay logs without blocking the SQL thread. Relay logs need to be purged regularly (i.e. once per day, once per 6 hours, etc.), so purge_relay_logs should be invoked regularly on each slave server, at different times. It can be scheduled as a cron job too.
[root@mysql-slave1]$
cat /etc/cron.d/purge_relay_logs
#purge relay logs after every 5 hours
0 */5 * * * /usr/bin/purge_relay_logs --user=mha --password=PASSWORD --workdir=/data/archive_relay --disable_relay_log_purge >> /var/log/mha/purge_relay_logs.log 2>&1
The above job purges relay logs and sets relay_log_purge = 0 [OFF] to avoid automatic relay log purging.
More Details : https://github.com/yoshinorim/mha4mysql-manager/wiki/Requirements
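Since purge_relay_logs should fire at different times on each slave, one simple trick (an illustration, not part of MHA itself) is to derive the cron minute deterministically from the hostname, so the same cron template can be deployed everywhere without collisions:

```shell
# Derive a per-host minute (0-59) from the hostname; in real use:
#   host=$(hostname)
host="mysql-slave1"
minute=$(( $(printf '%s' "$host" | cksum | cut -d' ' -f1) % 60 ))
# Emit the cron line (same purge_relay_logs invocation as above):
echo "$minute */5 * * * /usr/bin/purge_relay_logs --user=mha --password=PASSWORD --workdir=/data/archive_relay --disable_relay_log_purge >> /var/log/mha/purge_relay_logs.log 2>&1"
```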
We can also use MySQL Utilities to perform the failover with ProxySQL. The main advantage of the MHA-ProxySQL integration is that it avoids the need for a VIP or re-defining DNS after an MHA failover; ProxySQL takes care of that.
Businesses have always wanted to derive insights from information to make reliable, smarter, real-time, fact-based decisions. As firms rely more on data and databases, information and data processing become the core of many business operations and decisions. The faith in the database is total: none of the day-to-day company services can run without the underlying database platforms. As a consequence, the demands on the scalability and performance of database software are more critical than ever. The principal benefits of a clustered database system are scalability and high availability. In this blog, we will compare Oracle RAC and Galera Cluster in the light of these two aspects. Real Application Clusters (RAC) is Oracle’s premium solution for clustering Oracle databases and provides high availability and scalability. Galera Cluster is the most popular clustering technology for MySQL and MariaDB.
Oracle RAC uses Oracle Clusterware software to bind multiple servers. Oracle Clusterware is a cluster management solution that is integrated with Oracle Database, but it can also be used with other services, not only the database. Oracle Clusterware is additional software installed on servers running the same operating system, which lets the servers be chained together to operate as if they were one server.
Oracle Clusterware watches the instances and automatically restarts one if a crash occurs. If your application is well designed, you may not experience any service interruption. Only a group of sessions (those connected to the failed instance) is affected by the failure. The blackout can be efficiently masked from the end user using advanced RAC features like Fast Application Notification and the Oracle client’s Fast Connection Failover. Oracle Clusterware controls node membership and prevents split-brain scenarios in which two or more instances attempt to control the database.
Galera Cluster is a synchronous active-active database clustering technology for MySQL and MariaDB. Galera Cluster differs from what is known as Oracle’s MySQL Cluster - NDB. MariaDB cluster is based on the multi-master replication plugin provided by Codership. Since version 5.5, the Galera plugin (wsrep API) is an integral part of MariaDB. Percona XtraDB Cluster (PXC) is also based on the Galera plugin. The Galera plugin architecture stands on three core layers: certification, replication, and group communication framework. Certification layer prepares the write-sets and does the certification checks on them, guaranteeing that they can be applied. Replication layer manages the replication protocol and provides the total ordering capability. Group Communication Framework implements a plugin architecture which allows other systems to connect via gcomm back-end schema.
To keep the state identical across the cluster, the wsrep API uses a Global Transaction ID. This unique identifier (GTID) is created and associated with each transaction committed on the database node. In Oracle RAC, the various database instances share access to resources ranging from data blocks in the buffer cache to enqueues. Access to the shared resources between RAC instances needs to be coordinated to avoid conflict. To organize shared access to these resources, the distributed cache maintains information such as the data block ID, which RAC instance holds the current version of the data block, and the lock mode in which each instance holds it.
Oracle RAC relies on a distributed disk architecture. The database files, control files and online redo logs need to be accessible to each node in the cluster. There are various ways to configure shared storage, including directly attached disks, Storage Area Networks (SAN), Network Attached Storage (NAS) and Oracle ASM. The two most popular are OCFS and ASM. Oracle Cluster File System (OCFS) is a shared file system designed specifically for Oracle RAC. OCFS eliminates the requirement that Oracle database files be connected to logical drives and enables all nodes to share a single Oracle Home. Oracle ASM is Oracle's recommended storage management solution and provides an alternative to conventional volume managers, file systems, and raw devices. Oracle ASM provides a virtualization layer between the database and storage. It treats multiple disks as a single disk group and lets you dynamically add or remove drives while keeping databases online.
There is no need to build sophisticated shared disk storage for Galera, as each node holds its own full copy of the data. However, it is good practice to make the storage reliable with battery-backed write caches.
Oracle Real Application Clusters has a shared cache architecture; it utilizes Oracle Grid Infrastructure to enable the sharing of server and storage resources. Communication between nodes is the critical aspect of cluster integrity. Each node must have at least two network adapters or network interface cards: one for the public network interface, and one for the interconnect. Each cluster node is connected to all other nodes via a private high-speed network, also known as the cluster interconnect.
The private network is typically formed with Gigabit Ethernet, but for high-volume environments, many vendors offer low-latency, high-bandwidth solutions designed for Oracle RAC. Linux also extends a means of bonding multiple physical NICs into a single virtual NIC to provide increased bandwidth and availability.
While the default approach to connecting Galera nodes is to use a single NIC per host, you can have more than one card. ClusterControl can assist you with such a setup. The main difference is the bandwidth requirement on the interconnect. Oracle RAC ships blocks of data between instances, so it places a heavier load on the interconnect compared to Galera write-sets (which consist of a list of operations).
With Redundant Interconnect Usage in RAC, you can define multiple interfaces to use for the private cluster network, without the need for bonding or other technologies. This functionality is available starting with Oracle Database 11gR2. If you use the Oracle Clusterware Redundant Interconnect feature, then you must use IPv4 addresses for the interfaces (UDP is the default).
To manage high availability, each cluster node is assigned a virtual IP address (VIP). In the event of node failure, the failed node's IP address can be reassigned to a surviving node to allow applications to continue reaching the database through the same IP address.
A sophisticated network setup is necessary for Oracle's Cache Fusion technology to couple the physical memory of each host into a single cache. Oracle Cache Fusion allows data stored in the cache of one Oracle instance to be accessed by any other instance by transporting it across the private network. It also protects data integrity and cache coherency by transmitting locking and supplementary synchronization information across cluster nodes.
On top of the described network setup, you can set a single database address for your application - the Single Client Access Name (SCAN). The primary purpose of SCAN is to provide ease of connection management. For instance, you can add new nodes to the cluster without changing your client connection string. This works because Oracle automatically distributes requests based on the SCAN IPs, which point to the underlying VIPs. SCAN listeners act as the bridge between clients and the underlying local listeners, which are VIP-dependent.
For Galera Cluster, the equivalent of SCAN would be adding a database proxy in front of the Galera nodes. The proxy would be a single point of contact for applications, it can blacklist failed nodes and route queries to healthy nodes. The proxy itself can be made redundant with Keepalived and Virtual IP.
The main difference between Oracle RAC and MySQL Galera Cluster is that Galera is a shared-nothing architecture. Instead of shared disks, Galera uses certification-based replication with group communication and transaction ordering to achieve synchronous replication. A database cluster should be able to survive the loss of a node, although this is achieved in different ways. In the case of Galera, the critical aspect is the number of nodes; Galera requires a quorum to stay operational. A three-node cluster can survive the crash of one node. With more nodes in your cluster, your availability will grow. Oracle RAC doesn't require a quorum to stay operational after a node crash, because of the access to distributed storage that keeps consistent information about cluster state. However, your data storage could be a potential point of failure in your high availability plan. While it's a reasonably straightforward task to spread Galera cluster nodes across geolocated data centers, it wouldn't be that easy with RAC. Oracle RAC requires additional high-end disk mirroring; however, basic RAID-like redundancy can be achieved inside an ASM diskgroup.
Disk Group Type | Supported Mirroring Levels | Default Mirroring Level |
---|---|---|
External redundancy | Unprotected (none) | Unprotected |
Normal redundancy | Two-way, three-way, unprotected (none) | Two-way |
High redundancy | Three-way | Three-way |
Flex redundancy | Two-way, three-way, unprotected (none) | Two-way (newly-created) |
Extended redundancy | Two-way, three-way, unprotected (none) | Two-way |
In a single-user database, a user can alter data without concern for other sessions modifying the same data at the same time. However, in a multi-user, multi-node environment this becomes more complicated. A multi-user database must provide the following:
Cluster instances require three main types of concurrency locking:
Oracle lets you choose the locking policy, either pessimistic or optimistic, depending on your requirements. To provide concurrency locking, RAC has two additional services: the Global Cache Service (GCS) and the Global Enqueue Service (GES). These two services cover the Cache Fusion process, resource transfers, and resource escalations among the instances. GES handles cache locks, dictionary locks, transaction locks and table locks. GCS maintains block modes and block transfers between the instances.
In Galera cluster, each node has its storage and buffers. When a transaction is started, database resources local to that node are involved. At commit, the operations that are part of that transaction are broadcasted as part of a write-set, to the rest of the group. Since all nodes have the same state, the write-set will either be successful on all nodes or it will fail on all nodes.
Galera Cluster uses optimistic concurrency control at the cluster level, which can result in transactions being aborted at COMMIT: the first commit wins. When such aborts occur at the cluster level, Galera Cluster returns a deadlock error. This may or may not impact your application architecture. Replicating a high number of rows in a single transaction impacts node responsiveness, although there are techniques to avoid such behavior.
Neither cluster requires particularly powerful hardware. A minimal Oracle RAC configuration would be satisfied by two servers with two CPUs, at least 1.5 GB of RAM, an amount of swap space equal to the amount of RAM, and two Gigabit Ethernet NICs. Galera's minimum configuration is three nodes (one of which can be an arbitrator, garbd), each with a 1 GHz single-core CPU, 512 MB of RAM and a 100 Mbps network card. While these are the minimums, we can safely say that in both cases you would probably want more resources for your production system.
Each node stores the software locally, so you need to set aside several gigabytes of storage. Oracle and Galera both have the ability to patch nodes individually by taking them down one at a time. This rolling patching avoids a complete application outage, as there are always database nodes available to handle traffic.
What is important to mention is that a production Galera cluster can easily run on VM’s or basic bare metal, while RAC would need investment in sophisticated shared storage and fiber communication.
Oracle Enterprise Manager is the favored approach for monitoring Oracle RAC and Oracle Clusterware. Oracle Enterprise Manager is Oracle's web-based unified management system for monitoring and administering your database environment. It's part of the Oracle Enterprise license and should be installed on a separate server. Cluster control, monitoring and management is done via a combination of crsctl and srvctl commands, which are part of the cluster binaries. Below you can find a couple of example commands.
Clusterware Resource Status Check:
crsctl status resource -t (or shorter: crsctl stat res -t)
Example:
$ crsctl stat res ora.test1.vip
NAME=ora.test1.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on test1
Check the status of the Oracle Clusterware stack:
crsctl check cluster
Example:
$ crsctl check cluster -all
*****************************************************************
node1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
*****************************************************************
node2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
Check the status of Oracle High Availability Services and the Oracle Clusterware stack on the local server:
crsctl check crs
Example:
$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
Stop Oracle High Availability Services on the local server.
crsctl stop has
Start Oracle High Availability Services on the local server.
crsctl start has
Displays the status of node applications:
srvctl status nodeapps
Displays the configuration information for all SCAN VIPs
srvctl config scan
Example:
srvctl config scan -scannumber 1
SCAN name: testscan, Network: 1
Subnet IPv4: 192.51.100.1/203.0.113.46/eth0, static
Subnet IPv6:
SCAN 1 IPv4 VIP: 192.51.100.195
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
The Cluster Verification Utility (CVU) performs system checks in preparation for installation, patch updates, or other system changes:
cluvfy comp ocr
Example:
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+DATA" available on all the nodes
NOTE:
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Verification of OCR integrity was successful.
Galera exposes node and cluster state through the wsrep API, which reports several statuses. There are currently 34 dedicated status variables that can be viewed with the SHOW STATUS statement.
mysql> SHOW STATUS LIKE 'wsrep_%';
wsrep_apply_oooe, wsrep_apply_oool, wsrep_cert_deps_distance, wsrep_cluster_conf_id, wsrep_cluster_size, wsrep_cluster_state_uuid, wsrep_cluster_status, wsrep_connected, wsrep_flow_control_paused, wsrep_flow_control_paused_ns, wsrep_flow_control_recv, wsrep_flow_control_sent, wsrep_gcomm_uuid, wsrep_last_committed, wsrep_local_bf_aborts, wsrep_local_cert_failures, wsrep_local_commits, wsrep_local_index, wsrep_local_recv_queue, wsrep_local_recv_queue_avg, wsrep_local_replays, wsrep_local_send_queue, wsrep_local_send_queue_avg, wsrep_local_state_uuid, wsrep_protocol_version, wsrep_provider_name, wsrep_provider_vendor, wsrep_provider_version, wsrep_ready, wsrep_received, wsrep_received_bytes, wsrep_replicated, wsrep_replicated_bytes, wsrep_thread_count
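For scripted monitoring, a handful of these variables is usually enough to decide whether a node is healthy. The sketch below parses a stubbed SHOW STATUS sample; in real use you would feed it the output of mysql -N -e "SHOW STATUS LIKE 'wsrep_%'":

```shell
# Stubbed, tab-separated SHOW STATUS sample (replace with real output).
status=$(printf 'wsrep_cluster_size\t3\nwsrep_cluster_status\tPrimary\nwsrep_ready\tON')

cluster_status=$(printf '%s\n' "$status" | awk '$1 == "wsrep_cluster_status" {print $2}')
ready=$(printf '%s\n' "$status" | awk '$1 == "wsrep_ready" {print $2}')

# A node is usable when it is in the Primary component and ready for queries.
if [ "$cluster_status" = "Primary" ] && [ "$ready" = "ON" ]; then
  health="node healthy"
else
  health="node NOT healthy"
fi
echo "$health"
```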
The administration of MySQL Galera Cluster is in many aspects very similar. There are just a few exceptions, like bootstrapping the cluster from the initial node or recovering nodes via SST or IST operations.
Bootstrapping cluster:
$ service mysql bootstrap # sysvinit
$ service mysql start --wsrep-new-cluster # sysvinit
$ galera_new_cluster # systemd
$ mysqld_safe --wsrep-new-cluster # command line
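Before bootstrapping, it is worth checking which node is safe to start from. Newer Galera versions record a safe_to_bootstrap flag in grastate.dat; the sketch below demonstrates the check against a synthetic file (on a real node the file typically lives in the datadir, e.g. /var/lib/mysql/grastate.dat):

```shell
# Synthetic grastate.dat for demonstration; on a real node, point $grastate
# at the actual file (typically /var/lib/mysql/grastate.dat).
grastate=$(mktemp)
printf 'safe_to_bootstrap: 1\n' > "$grastate"

if grep -q 'safe_to_bootstrap: 1' "$grastate"; then
  verdict="safe to bootstrap from this node"
else
  verdict="do NOT bootstrap here - compare seqno across nodes first"
fi
echo "$verdict"
rm -f "$grastate"
```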
The equivalent web-based, out-of-the-box solution to manage and monitor Galera Cluster is ClusterControl. It provides a web-based interface to deploy clusters, monitors key metrics, provides database advisors, and takes care of management tasks like backup and restore, automatic patching, traffic encryption and availability management.
Oracle provides SCAN technology, which we found missing in Galera Cluster. The benefit of SCAN is that the client's connection information does not need to change if you add or remove nodes or databases in the cluster. When using SCAN, the Oracle database randomly connects to one of the available SCAN listeners (typically three) in a round-robin fashion and balances the connections between them. Two kinds of load balancing can be configured: client-side, connect-time load balancing and server-side, run-time load balancing. Although there is nothing similar within Galera Cluster itself, the same functionality can be provided by additional software like ProxySQL, HAProxy or MaxScale, combined with Keepalived.
When it comes to application workload design for Galera Cluster, you should avoid conflicting updates on the same row, as it leads to deadlocks across the cluster. Avoid bulk inserts or updates, as these might be larger than the maximum allowed writeset. That might also cause cluster stalls.
Designing Oracle HA with RAC you need to keep in mind that RAC only protects against server failure, and you need to mirror the storage and have network redundancy. Modern web applications require access to location-independent data services, and because of RAC’s storage architecture limitations, it can be tricky to achieve. You also need to spend a notable amount of time to gain relevant knowledge to manage the environment; it is a long process. On the application workload side, there are some drawbacks. Distributing separated read or write operations on the same dataset is not optimal because latency is added by supplementary internode data exchange. Things like partitioning, sequence cache, and sorting operations should be reviewed before migrating to RAC.
According to the Oracle documentation, the maximum distance between two boxes connected in a point-to-point fashion and running synchronously can be only 10 km. Using specialized devices, this distance can be increased to 100 km.
Galera Cluster is well known for its multi-datacenter replication capabilities. It has rich support for WAN settings and can be configured for high network latency by taking Round-Trip Time (RTT) measurements between cluster nodes and adjusting the necessary parameters. The wsrep_provider_options parameter lets you configure settings like evs.suspect_timeout, evs.inactive_timeout, evs.join_retrans_period and many more.
Per Oracle note www.oracle.com/technetwork/database/options/.../rac-cloud-support-2843861.pdf no third-party cloud currently meets Oracle’s requirements regarding natively provided shared storage. “Native” in this context means that the cloud provider must support shared storage as part of their infrastructure as per Oracle’s support policy.
Thanks to its shared-nothing architecture, which is not tied to a sophisticated storage solution, Galera Cluster can be easily deployed in a cloud environment, which makes the cloud migration process more reliable.
Oracle licensing is a complex topic and would require a separate blog article. The cluster factor makes it even more difficult. The cost goes up as we have to add some options to license a complete RAC solution. Here we just want to highlight what to expect and where to find more information.
RAC is a feature of the Oracle Enterprise Edition license. The Oracle Enterprise license is split into two types: per named user and per processor. If you consider Enterprise Edition with a per-core license, then the single-core cost is 23,000 USD for RAC + 47,500 USD for Oracle DB EE, and you still need to add a ~22% support fee. We would like to refer to a great blog post on pricing at https://flashdba.com/2013/09/18/the-real-cost-of-oracle-rac/.
Flashdba calculated the price of a four-node Oracle RAC at 902,400 USD plus an additional 595,584 USD for three years of DB maintenance, and that does not include features like partitioning or the in-memory database; all that with a 60% Oracle discount.
Galera Cluster is an open source solution that anyone can run for free. Subscriptions are available for production implementations that require vendor support. A good TCO calculation can be found at https://severalnines.com/blog/database-tco-calculating-total-cost-ownership-mysql-management.
While there are significant differences in architecture, both clusters share the main principles and can achieve similar goals. Oracle's enterprise product comes with everything out of the box (and its price). With a cost in the range of over 1M USD as seen above, it is a high-end solution that many enterprises would not be able to afford. Galera Cluster can be described as a decent high availability solution for the masses. In certain cases, Galera may well be a very good alternative to Oracle RAC. One drawback is that you have to build your own stack, although that can be completely automated with ClusterControl. We’d love to hear your thoughts on this.
Thank you to everyone who joined us at our second annual MariaDB user conference, M|18, in New York City on February 26 and 27. DBAs, open source enthusiasts, engineers, executives and more from all over the world came together to explore and learn.
Couldn’t make the event or want to relive your favorite session?
Watch 40+ M|18 session recordings on demand.
Learning opportunities abounded at M|18, all while having fun. At the opening-night party and closing reception, attendees enjoyed food, drink and conversation – plus a little good-natured competition.
Thanks to the attendees and speakers, M|18 was trending on Twitter. Here are a few of our favorite conference tweets.
Had great time at conference. Helped to reinforce architectural changes needed in our company and how mariadb suite helps. #MARIADBM18
— Adam A. Lang (@AdamALang)
Have had a blast at #MARIADBM18 - lots learnt and even more to think about. Thanks #Mariadb
— Tudor Davies (@DerBroader71)
Thank you MariaDB for all of your hard work creating a great M18 conference! #MARIADBM18 pic.twitter.com/FqYxU8kNZO
— Lou Zircher (@ndp3188)
…and thank you to our sponsors for their generous support.
In the next few weeks, we’ll release the dates for our next MariaDB user conference, M|19. Be on the lookout for the announcement!
In this blog post, we’ll look at how you can verify query performance using ProxySQL.
In the previous blog post, I showed you how much information you can get from the “stats.stats_mysql_query_digest” table in ProxySQL. I also mentioned you could even collect and graph these metrics. I will show you that this is not just theory; it is possible.
These graphs can be very useful for understanding the impact of changes you make on query count or execution time.
I used our all-time favorite benchmark tool called Sysbench. I was running the following query:
UPDATE sbtest1 SET c=? WHERE k=?
There was no index on “k” when I started the test. During the test, I added an index. We expect to see some changes in the graphs.
I selected from the “stats.stats_mysql_query_digest” table into a file every second, then used Percona Monitoring and Management (PMM) to create graphs from the metrics. (I am going to write another blog post on how you can use PMM to create graphs from any kind of metrics.)
Without the index, the update was running only 2-3 times per second. After adding the index, it went up to 400-500 per second. We can see the results immediately on the graph.
Let’s see the average execution time:
Without the index, it took 600000-700000 microseconds, which is around 0.7s. After adding the index, it dropped to 0.01s. This is a big win, but most importantly we can see the effect on query response time and query count when we make changes to the schema, queries or configuration.
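The average plotted here is simply sum_time divided by count_star from stats_mysql_query_digest. The sketch below computes it from one captured sample row (the numbers are illustrative and roughly match the pre-index figures above):

```shell
# One tab-separated sample row: digest_text, count_star, sum_time (in µs).
sample=$(printf 'UPDATE sbtest1 SET c=? WHERE k=?\t100\t65000000')
# Average execution time = sum_time / count_star.
avg_us=$(printf '%s\n' "$sample" | awk -F'\t' '{printf "%d", $3 / $2}')
echo "avg execution time: ${avg_us} us"
```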
If you already have a ProxySQL server collecting and graphing these metrics, they could be quite useful when you are optimizing your queries. They can help make sure you are moving in the right direction with your tunings/modifications.
FromDual has the pleasure to announce the release of the new version 2.0.0 of its popular MySQL, Galera Cluster and MariaDB multi-instance environment MyEnv.
The new MyEnv can be downloaded here.
In the inconceivable case that you find a bug in the MyEnv please report it to the FromDual bug tracker.
Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.
# cd ${HOME}/product # tar xf /download/myenv-2.0.0.tar.gz # rm -f myenv # ln -s myenv-2.0.0 myenv
If you are using plug-ins for showMyEnvStatus, create all the links in the new directory structure:
cd ${HOME}/product/myenv
ln -s ../../utl/oem_agent.php plg/showMyEnvStatus/
From MyEnv v1 to v2 the directory structure of instances has fundamentally changed. Nevertheless MyEnv v2 works fine with MyEnv v1 directory structures.
Old structure
~/data/instance1/ibdata1 ~/data/instance1/ib_logfile? ~/data/instance1/my.cnf ~/data/instance1/error.log ~/data/instance1/mysql ~/data/instance1/test ~/data/instance1/general.log ~/data/instance1/slow.log ~/data/instance1/binlog.0000?? ~/data/instance2/...
New structure
~/database/instance1/binlog/binlog.0000?? ~/database/instance1/data/ibdata1 ~/database/instance1/data/ib_logfile? ~/database/instance1/data/mysql ~/database/instance1/data/test ~/database/instance1/etc/my.cnf ~/database/instance1/log/error.log ~/database/instance1/log/general.log ~/database/instance1/log/slow.log ~/database/instance1/tmp/ ~/database/instance2/...
But over time you possibly want to migrate the old structure to the new one. The following steps describe how you upgrade MyEnv instance structure v1 to v2:
mysql@chef:~ [mysql-57, 3320]> mypprod
mysql@chef:~ [mypprod, 3309]> stop
.. SUCCESS!
mysql@chef:~ [mypprod, 3309]> mkdir ~/database/mypprod
mysql@chef:~ [mypprod, 3309]> mkdir ~/database/mypprod/binlog ~/database/mypprod/data ~/database/mypprod/etc ~/database/mypprod/log ~/database/mypprod/tmp
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/binary-log.* ~/database/mypprod/binlog/
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/my.cnf ~/database/mypprod/etc/
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/error.log ~/database/mypprod/log/
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/slow.log ~/database/mypprod/log/
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/general.log ~/database/mypprod/log/
mysql@chef:~ [mypprod, 3309]> mv ~/data/mypprod/* ~/database/mypprod/data/
mysql@chef:~ [mypprod, 3309]> rmdir ~/data/mypprod

mysql@chef:~ [mypprod, 3309]> vi /etc/myenv/myenv.conf
- datadir = /home/mysql/data/mypprod
+ datadir = /home/mysql/database/mypprod/data
- my.cnf = /home/mysql/data/mypprod/my.cnf
+ my.cnf = /home/mysql/database/mypprod/etc/my.cnf
+ instancedir = /home/mysql/database/mypprod

mysql@chef:~ [mypprod, 3309]> source ~/.bash_profile
mysql@chef:~ [mypprod, 3309]> cde

mysql@chef:~/database/mypprod/etc [mypprod, 3309]> vi my.cnf
- log_bin = binary-log
+ log_bin = /home/mysql/database/mypprod/binlog/binary-log
- datadir = /home/mysql/data/mypprod
+ datadir = /home/mysql/database/mypprod/data
- tmpdir = /tmp
+ tmpdir = /home/mysql/database/mypprod/tmp
- log_error = error.log
+ log_error = /home/mysql/database/mypprod/log/error.log
- slow_query_log_file = slow.log
+ slow_query_log_file = /home/mysql/database/mypprod/log/slow.log
- general_log_file = general.log
+ general_log_file = /home/mysql/database/mypprod/log/general.log

mysql@chef:~/database/mypprod/etc [mypprod, 3309]> cdb

mysql@chef:~/database/mypprod/binlog [mypprod, 3309]> vi binary-log.index
- ./binary-log.000001
+ /home/mysql/database/mypprod/binlog/binary-log.000001

mysql@chef:~/database/mypprod/binlog [mypprod, 3309]> start
mysql@chef:~/database/mypprod/binlog [mypprod, 3309]> exit
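The directory moves in the transcript above all target the same v2 layout. As a minimal sketch (the base path and the instance name mypprod are taken from the example; the helper function name is made up for illustration), the target structure can be created in one step:

```shell
# Sketch only: recreate the MyEnv v2 instance directory layout.
# make_v2_layout is a hypothetical helper, not part of MyEnv itself.
make_v2_layout() {
  local base="$1" inst="$2"
  # binlog, data, etc, log and tmp live side by side under the instancedir
  mkdir -p "$base/database/$inst"/{binlog,data,etc,log,tmp}
}

scratch=$(mktemp -d)              # use a scratch dir so nothing real is touched
make_v2_layout "$scratch" mypprod
ls "$scratch/database/mypprod"    # binlog data etc log tmp
```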
- instancedir variable introduced, aliases adapted accordingly.
- aliases.conf and variables.conf made more user friendly.
- mysqladmin replaced by UNIX socket probing (up).
- my.cnf template updated (super_read_only, innodb_tmpdir, innodb_flush_log_at_trx_commit, MySQL Group Replication, crash-safe Replication, GTID, MySQL 8.0).
- Support for mysqld_safe and cgroups added.
- mysqlstat.php added.
- keepalived added.
- mysql-create-instance.sh and mysql-remove-instance.sh removed.
- insert_test.sh, insert_test.php and test table improved.

For subscriptions of commercial use of MyEnv please get in contact with us.
In the past five posts of the blog series, we covered deployment of clustering/replication (MySQL / Galera, MySQL Replication, MongoDB & PostgreSQL), management & monitoring of your existing databases and clusters, performance monitoring and health, how to make your setup highly available through HAProxy and MaxScale and in the last post, how to prepare yourself for disasters by scheduling backups.
Since ClusterControl 1.2.11, we made major enhancements to the database configuration manager. The new version allows changing of parameters on multiple database hosts at the same time and, if possible, changing their values at runtime.
We featured the new MySQL Configuration Management in a Tips & Tricks blog post, but this blog post will go more in depth and cover Configuration Management within ClusterControl for MySQL, PostgreSQL and MongoDB.
The configuration management interface can be found under Manage > Configurations. From here, you can view or change the configurations of your database nodes and other tools that ClusterControl manages. ClusterControl will import the latest configuration from all nodes and overwrite previous copies made. Currently there is no historical data kept.
If you’d rather edit the config files directly on the nodes, you can re-import the altered configuration by pressing the Import button.
And last but not least: you can create or edit configuration templates. These templates are used whenever you deploy new nodes in your cluster. Of course, any changes made to the templates will not be retroactively applied to nodes that were already deployed using these templates.
As previously mentioned, the MySQL configuration management got a complete overhaul in ClusterControl 1.2.11. The interface is now more intuitive. When you change a parameter, ClusterControl checks whether it actually exists, so a non-existent parameter cannot prevent MySQL from starting up.
From Manage > Configurations, you will find an overview of all config files used within the selected cluster, including load balancer nodes.
We use a tree structure to easily view hosts and their respective configuration files. At the bottom of the tree, you will find the configuration templates available for this cluster.
Suppose we need to change a simple parameter like the maximum number of allowed connections (max_connections), we can simply change this parameter at runtime.
First select the hosts to apply this change to.
Then select the section you want to change. In most cases, you will want to change the MYSQLD section. If you would like to change the default character set for MySQL, you will have to change that in both MYSQLD and client sections.
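As an illustration, a character-set change touching both sections might look like this in my.cnf (utf8mb4 and the collation are example values, not a recommendation by the tool):

```ini
[mysqld]
# server-side default character set and collation
character_set_server = utf8mb4
collation_server     = utf8mb4_general_ci

[client]
# default character set for client programs such as mysql and mysqldump
default-character-set = utf8mb4
```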
If necessary you can also create a new section by simply typing the new section name. This will create a new section in the my.cnf.
Once we change a parameter and set its new value by pressing “Proceed”, ClusterControl will check if the parameter exists for this version of MySQL. This is to prevent non-existent parameters from blocking the initialization of MySQL on the next restart.
When we press “proceed” for the max_connections change, we will receive a confirmation that it has been applied to the configuration and set at runtime using SET GLOBAL. A restart is not required as max_connections is a parameter we can change at runtime.
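In plain SQL terms, what ClusterControl applies here is roughly the following (the value 500 is just an example, and this is a sketch of the effect, not the tool's exact internals):

```sql
-- max_connections is a dynamic variable, so no restart is needed
SET GLOBAL max_connections = 500;

-- verify the value the running server now uses
SHOW GLOBAL VARIABLES LIKE 'max_connections';
```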
Now suppose we want to change the bufferpool size; this requires a restart of MySQL before it takes effect:
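In that case the change only lands in the configuration file until the server is restarted. A minimal my.cnf sketch (the 8G value is purely an example):

```ini
[mysqld]
# Static in MySQL 5.6 and earlier: applied only after the next restart.
# (MySQL 5.7 and later can resize the InnoDB buffer pool online.)
innodb_buffer_pool_size = 8G
```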
And as expected the value has been changed in the configuration file, but a restart is required. You can do this by logging into the host manually and restarting the MySQL process. Another way to do this from ClusterControl is by using the Nodes dashboard.
You can perform a restart per node by selecting “Restart Node” and pressing the “Proceed” button.
When you select “Initial Start” on a Galera node, ClusterControl will empty the MySQL data directory and force a full copy this way. This is, obviously, unnecessary for a configuration change. Make sure you leave the “initial” checkbox unchecked in the confirmation dialog. This will stop and start MySQL on the host but depending on your workload and bufferpool size this could take a while as MySQL will start flushing the dirty pages from the InnoDB bufferpool to disk. These are the pages that have been modified in memory but not on disk.
For MySQL master-slave topologies you can’t just restart node by node. Unless downtime of the master is acceptable, you will have to apply the configuration changes to the slaves first and then promote a slave to become the new master.
You can go through the slaves one by one and execute a “Restart Node” on them.
After applying the changes to all slaves, promote a slave to become the new master:
After the slave has become the new master, you can shutdown and restart the old master node to apply the change.
Now that we have applied the change directly on the database, as well as the configuration file, it will take until the next configuration import to see the change reflected in the configuration stored in ClusterControl. If you are less patient, you can schedule an immediate configuration import by pressing the “Import” button.
For PostgreSQL, the Configuration Management works a bit differently from the MySQL Configuration Management. In general, you have the same functionality here: change the configuration, import configurations for all nodes and define/alter templates.
The difference here is that you can immediately change the whole configuration file and write this configuration back to the database node.
If the changes made require a restart, a “Restart” button will appear that allows you to restart the node to apply the changes.
The MongoDB Configuration Management works similar to the MySQL Configuration Management: you can change the configuration, import configurations for all nodes, change parameters and alter templates.
Changing the configuration is pretty straightforward, using the Change Parameter dialog (as described in the "Changing Parameters" section):
Once changed, you can see the post-modification action proposed by ClusterControl in the "Config Change Log" dialog:
You can then proceed to restart the respective MongoDB nodes, one node at a time, to load the changes.
In this blog post we learned how to manage, alter and template your configurations in ClusterControl. Changing the templates can save you a lot of time, especially when you have deployed only one node in your topology so far. As the template will be used for new nodes, this saves you from altering all configurations afterwards. However, for MySQL and MongoDB based nodes, changing the configuration on all nodes has also become trivial thanks to the new Configuration Management interface.
As a reminder, we recently covered in the same series deployment of clustering/replication (MySQL / Galera, MySQL Replication, MongoDB & PostgreSQL), management & monitoring of your existing databases and clusters, performance monitoring and health, how to make your setup highly available through HAProxy and MaxScale and in the last post, how to prepare yourself for disasters by scheduling backups.
The results of StackOverflow’s developer survey are in, and we can now declare the most popular databases of 2018.
Without further ado, let’s look into the results:
So what can we learn from these results?
To summarize, RDBMS in general and MySQL in particular are still very popular among tech companies and programmers. NoSQL databases are probably not here to replace them, but rather to address different requirements.
Disclaimer: charts and data copyrights are of StackOverflow. EverSQL is in no way associated with StackOverflow.