
Press Release: Severalnines boosts US healthcare provider’s IT operations


Accountable Health Inc uses ClusterControl to outcompete larger rivals

Stockholm, Sweden and anywhere else in the world - 20/07/2016 - Severalnines, the provider of database infrastructure management software, today announced its latest customer, Accountable Health Inc (AHI). This move comes at a time when technology is disrupting healthcare globally with the introduction of self-service medicine kiosks and virtual medical consultations.

AHI is a US-based company which helps pharmaceutical and healthcare firms to enhance their business and technical performance. Subsidiaries of AHI, such as Connect Health Solutions and Accountable Health Solutions, help employers build health and wellness programmes to facilitate a stronger return on investment in employees. Severalnines’ ClusterControl enables AHI to remedy database issues affecting business performance.

This is the second time the IT team at AHI has chosen Severalnines’ ClusterControl over rivals such as Oracle, Microsoft and Rackspace to provide database infrastructure. With the acquisition of Connect Health Solutions, AHI learnt that the existing database infrastructure was inadequate: it suffered severe data overloads and could not handle the query volumes the business required, so client portals crashed regularly and employees waited hours for the server to upload claims documents. AHI estimated the previous IT set-up was losing the business thousands of dollars each day in lost productivity.

To compete in the crowded US healthcare market, AHI needed to achieve high database uptime, availability and reliability for all businesses across its portfolio. Having successfully deployed ClusterControl in the past, AHI deployed the database management platform again to improve technology performance and customer satisfaction. Other solutions were ruled out due to technical complexity and prohibitive costs.

It took 10 days to fully deploy ClusterControl and migrate to a clustered database setup. Severalnines assisted AHI with the migration. AHI can now reach Severalnines’ database experts with a single phone call, unlike the tiered support systems offered by the large software vendors.

ClusterControl is now the database management platform for all wholly-owned subsidiaries of AHI, which currently deploy clusters on commodity off-the-shelf hardware. The ease of deployment and management along with competitive pricing meant AHI could be agile in its growth strategy and compete with US healthcare rivals such as Trizetto, Optum and Cognizant Healthcare Consulting.

Greg Sarrica, Director of IT development at AHI, said: “Using ClusterControl was an absolute no-brainer for me. AHI looked for an alternative to Oracle and IBM, which could match our demands and with our budget. We wanted to give our clients frictionless access to their healthcare information without portals crashing and potentially losing their personal data. Now we have a solution that allows us to be agile when competing in the fast-moving US healthcare market.”

Vinay Joosery, Severalnines CEO said: “The security and availability of data is indicative of the performance of healthcare providers. The US healthcare industry is so compact that smaller businesses need to punch harder than their bigger rivals to gain market share. We are happy to be working with AHI and help the team there deliver fast and accurate customer service.”

About Severalnines

Severalnines provides automation and management software for database clusters. We help companies deploy their databases in any environment, and manage all operational aspects to achieve high-scale availability.

Severalnines' products are used by developers and administrators of all skill levels to provide the full 'deploy, manage, monitor, scale' database cycle, thus freeing them from the complexity and learning curves that are typically associated with highly available database clusters. The company has enabled over 8,000 deployments to date via its popular online database configurator. Its customers include BT, Orange, Cisco, CNRS, Technicolour, AVG, Ping Identity and Paytrail. Severalnines is a private company headquartered in Stockholm, Sweden with offices in Singapore and Tokyo, Japan. To see who is using Severalnines today, visit http://www.severalnines.com/company.

About Accountable Health Solutions

Accountable Health Solutions was founded in 1992 as the Molloy Wellness Company and acquired by the Principal Financial Group in 2004. Ownership was later transferred to Accountable Health, Inc., and the company was renamed Accountable Health Solutions in 2013.

Accountable Health Solutions offers comprehensive health and wellness programs to employers and health plan clients. Accountable Health combines smart technology, healthcare and behavior change expertise to deliver solutions that improve health, increase efficiencies and reduce costs in the delivery of healthcare. The company's product suite ranges from traditional wellness products to health improvement programs. Accountable Health Solutions is an industry leader with more than 20 years in the health and wellness industry and a 97% client retention rate. More at accountablehealthsolutions.com.

Press contact:

Positive Marketing
Steven de Waal/Camilla Nilsson
severalnines@positivemarketing.com
0203 637 0647/0643



The Value of Database Support


In this post, I’ll discuss how database support is good for your enterprise.

Years ago when I worked for the MySQL Support organization at the original MySQL AB, we spoke about MySQL Support as insurance and focused on a value proposition similar to that of car insurance. You must purchase car insurance before the incident happens, or insurance won’t cover the damage. In fact, most places around the world require automobile insurance. Similarly, many organizations that leverage production-use technology have their own “insurance” by means of 24/7 support.

In my opinion, this is a very one-sided view that does not capture the full value (and ROI) that a database support contract with Percona provides. With a Percona support contract, you are assured that your database environment (virtual, physical, or in the cloud) is fully covered – whether it’s one server or many.

Increasingly – especially with the introduction of cloud-based database environments – database servers are being spun up and torn down on a day-to-day basis. However briefly these databases exist, they need support. One of the challenges modern businesses face is providing support for a changing database infrastructure, while still maintaining a viable cost structure.

Let’s look at the different dimensions of value offered by Percona Support based on the different support cases we have received throughout the years.

Reduce and Prevent Downtime

If your database goes down, the time to recover will be significantly shorter with a support agreement than without it. The cost of downtime varies widely between organizations. A recent IBM-sponsored Ponemon Institute research study found that the average data security breach could cost upwards of $7.01M (in the United States).

With our clients, we’ve found preventing even one significant downtime event a year justifies support costs. Even when the client’s in-house team is very experienced, our help is often invaluable as we are exposed to a great variety of incidents from hundreds of companies. It is much more likely we have encountered the same incident before and have a solution ready. Helping to recover from downtime quickly is a reactive part of support – you can realize even more value by proactively working with support to get advice on your HA options as well as ensure that you’re following the best database backup and security practices.

Better Security

Having a database support contract by itself is not enough to prevent all security incidents. Databases are only one of the component attack vectors, and it takes a lot of everyday work to stay secure. There is nothing that can guarantee complete security. Database support, however, can be a priceless resource for your security team. It can apply security and compliance practices to your database environment and demonstrate how to avoid typical mistakes.

The cost of data breaches can be phenomenal, and they can damage business reputations much more than downtime or performance issues. Costs vary depending on company size and market; different studies estimate average direct costs ranging from $1.6M to $7.01M. Everyone agrees that leaving rising security risks and costs unchecked is a recipe for disaster.

Fix Database Software Bugs

While you might have great DBAs on your team who are comfortable with best practices and downtime recovery, most likely you do not have a development team comfortable with fixing bugs in the database kernel or supporting tools. Getting up-to-date software fixes reduces downtime. It also brings other benefits, such as keeping development and operations teams efficient and avoiding complex workarounds.

Reduce Resources

We deal with a large number of performance-related questions. When we address such problems, we provide a better user experience, save costs, and minimize environmental impact by reducing resource use.

Savings vary depending on your application scale and how optimized the environment is already. In the best cases, our support team helped customers make applications more than 10x more efficient. In most cases, we can help make things at least 30% more efficient. If you’re spending $100K or more on your database environment, this benefit alone makes a support agreement well worth it.

Efficient Developers

You cannot overstate the importance of development efficiency. Too often customers don’t give their developers support access, even though developers are critical to realizing an application’s full value. Developers make database decisions all the time: schema design, query writing, stored procedures, triggers, sharding, document storage, foreign keys. Without a database support contract, developers often have to resort to “Google University” to find an answer – and often end up with inapplicable, outdated or simply wrong information, followed by time-consuming trial and error.

With the help of a Percona Support team, developers can learn proven practices that apply to their specific situation. This saves a lot of time and gets better applications to market faster. Even with a single US-based developer intensively working within the database environment, a support agreement might justify the cost based on increased developer efficiency alone. Larger development teams simply cannot afford to not have support.

Efficient Operations

Your operations staff (DBAs, DevOps, Sysadmins) are in the same boat – if your database environment is significant, chances are you are always looking for ways to save time, make operations more efficient and reduce mistakes. Our support team can provide you with specific actionable advice for your challenges.

Chances are we have seen environments similar to yours and know which software, approaches and practices work well (and which do not). This knowledge helps prevent and reduce downtime. It also helps with team efficiency. Percona Support’s help allows you to handle operations with a smaller team, or address issues with a less experienced staff.

Better Applications

Percona Support access not only helps developers be more productive, it also results in better application quality, because best practices are followed for the application’s database interface design, schema, queries and so on. The Percona team has supported many applications over many years. We often think about problems before you might think about them, such as:

  • “How will this design play with replication or sharding?”
  • “Will it scale with large amounts of users or data?”
  • “How flexible is such a design as the application inevitably evolves over the years?”

While a better application is hard to quantify, it really is quite important.

Faster Time to Market

Yet another benefit that comes from developers having access to a database support team is faster time-to-market. For many agile applications, being able to launch new features faster is even more important than cost savings – this is how businesses succeed against the competition. At Percona, we love helping businesses succeed.

Conclusion

As you see, there are a lot of ways Percona Support can contribute to the success of your business. Support is much more than “insurance” that you should consider purchasing for compliance reasons. Percona Support provides a great return on investment. It allows you to minimize risks and costs while delivering the highest quality applications or services. Our flexible plans can cover your database environment, even if it is an ever-changing one, while still allowing you to plan your operations costs.



Planets9s - Watch our webinar replays for the MySQL, MongoDB and PostgreSQL DBA


Welcome to this week’s Planets9s, covering all the latest resources and technologies we create around automation and management of open source database infrastructures.

Watch our webinar replays for the MySQL, MongoDB and PostgreSQL DBA

Whether you’re interested in open source datastores such as MySQL, MariaDB, Percona Server, MongoDB or PostgreSQL; load balancers such as HAProxy, MaxScale or ProxySQL; whether you’re in DB Ops or DevOps; or looking to automate and manage your databases… chances are that we have a relevant webinar replay for you. And we have just introduced a new search feature for our webinar replays, which makes it easier and quicker to find the webinar replay you’re looking for.

Search for a webinar replay

Severalnines boosts US health care provider’s IT operations

This week we were delighted to announce that US health care provider Accountable Health Inc. uses our flagship product ClusterControl to outcompete its larger rivals. To quote Greg Sarrica, Director of IT development at AHI: “Using ClusterControl was an absolute no-brainer for me. AHI looked for an alternative to Oracle and IBM, which could match our demands and with our budget. We wanted to give our clients frictionless access to their healthcare information without portals crashing and potentially losing their personal data. Now we have a solution that allows us to be agile when competing in the fast-moving US healthcare market.”

Read the press release

ClusterControl Tips & Tricks: Best practices for database backups

Backups are one of the most important things to take care of when managing databases. It is said there are two types of people: those who back up their data and those who will. In this new blog post in the Tips & Tricks series, we discuss good practices around backups and show you how you can build a reliable backup system using ClusterControl.

Read the blog

That’s it for this week! Feel free to share these resources with your colleagues and follow us on our social media channels.

Have a good end of the week,

Jean-Jérôme Schmidt
Planets9s Editor
Severalnines AB



FieldTop - find columns that might overflow soon


Intro

When a database has been in use for a long time and contains user-generated data, some values may outgrow the data types that were chosen to store them.

This blog post will describe a tool called fieldtop that checks for overflows and underflows in all tables and databases stored in a MySQL server.

Use-case

Gnoosic.com has been running for over 10 years. A user first tells Gnoosic artists they like. Then the user is presented with other artists he might like and he can pick which ones he likes and which ones he doesn't. As time goes by, Gnoosic learns more about what users like and stores those preferences in a database.

Among other information, Gnoosic stores the popularity of different bands as an integer column. Over time, the popularity field for the entry about Pink Floyd reached 1.3 billion, while the maximum allowed for the signed INT data type is about 2.1 billion (more precisely, 2,147,483,647). Because of this, a collaboration was started to develop a tool that checks for situations like these so they can be avoided.
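
As a quick back-of-the-envelope check (not part of fieldtop itself, and using the approximate values mentioned above), the remaining headroom in such a column can be computed directly:

# Rough headroom check for an INT popularity counter (illustrative values).
int_max = 2 ** 31 - 1       # 2,147,483,647, the ceiling of a signed MySQL INT
popularity = 1300000000     # approximate Pink Floyd popularity mentioned above

print(int_max)                                  # 2147483647
print(round(100.0 * popularity / int_max, 1))   # ~60.5 -> the column is already ~60% full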

Running the tool uncovered a few other columns in different tables of the Gnoosic database that required attention, and since the tool was built to support decisions about MySQL database schemas, it was open-sourced under the MIT license.

In addition to the main use-case, there is a dual use-case. If the tool shows that the maximum values stored in a column are a long way from reaching the maximum, and if the data type allows (for text data types), the length of these columns can be fine-tuned for more efficient use of disk space.

What data types are applicable

While MySQL supports many data types, the tool is applicable to numeric and text columns, whose stored values can be compared against the limits of their data types.

This tool is not applicable to:

  • BIT fields
  • columns storing UUID values
  • columns storing IP addresses
  • columns storing hash values (MD5, SHA1 or other ones)

The inner workings of fieldtop

The tool gets the data types of each column from information_schema, then it computes how close those values are to the maximum allowed values for their respective data types.

The following query is used to fetch information from information_schema:

SELECT
    b.COLUMN_NAME,
    b.COLUMN_TYPE,
    b.DATA_TYPE,
    b.signed,
    a.TABLE_NAME,
    a.TABLE_SCHEMA
FROM (
    -- get all tables
    SELECT
    TABLE_NAME, TABLE_SCHEMA
    FROM information_schema.tables
    WHERE 
    TABLE_TYPE IN ('BASE TABLE', 'VIEW') AND
    TABLE_SCHEMA NOT IN ('mysql', 'performance_schema')
) a
JOIN (
    -- get information about columns types
    SELECT
    TABLE_NAME,
    COLUMN_NAME,
    COLUMN_TYPE,
    TABLE_SCHEMA,
    DATA_TYPE,
    (!(LOWER(COLUMN_TYPE) REGEXP '.*unsigned.*')) AS signed
    FROM information_schema.columns
) b ON a.TABLE_NAME = b.TABLE_NAME AND a.TABLE_SCHEMA = b.TABLE_SCHEMA
ORDER BY a.TABLE_SCHEMA DESC;

The results of this query are fetched by a PHP program [1] that analyzes them and computes, for each column, how close the stored values are to the maximum allowed by the column's data type.
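
To make the idea concrete, here is a rough Python sketch of the same computation (the real fieldtop is a PHP program; the connection settings and the restriction to signed integer columns below are illustrative assumptions):

# Rough Python sketch of fieldtop's idea (illustrative only, not the real tool).
import mysql.connector

# Upper bounds of MySQL signed integer types (unsigned columns ignored for brevity).
INT_MAX = {
    'tinyint': 127, 'smallint': 32767, 'mediumint': 8388607,
    'int': 2147483647, 'bigint': 9223372036854775807,
}

conn = mysql.connector.connect(host='localhost', user='root', password='secret')
cur = conn.cursor()
cur.execute("""
    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
    FROM information_schema.columns
    WHERE DATA_TYPE IN ('tinyint', 'smallint', 'mediumint', 'int', 'bigint')
      AND TABLE_SCHEMA NOT IN ('mysql', 'performance_schema', 'information_schema')
""")
for schema, table, column, data_type in cur.fetchall():
    # Find the largest value currently stored in the column.
    max_cur = conn.cursor()
    max_cur.execute("SELECT MAX(`%s`) FROM `%s`.`%s`" % (column, schema, table))
    (max_value,) = max_cur.fetchone()
    max_cur.close()
    if max_value is not None:
        pct = 100.0 * max_value / INT_MAX[data_type]
        print("%s.%s.%s: %.1f%% of %s max" % (schema, table, column, pct, data_type))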

Demo

Below is a short demo that showcases the tool described in the post:

Conclusion

We've covered a database tool that can be used to proactively prevent column overflows.

Footnotes:

[1] It's possible to have all the logic inside a stored procedure, in a similar way to this SO thread, but this tool covers more situations and a flexible language (PHP) was required to describe and handle them all.



Creating Migrations with Liquibase


Liquibase

Liquibase is a versioning tool for databases. It is currently at version 3.5 and is installed as a JAR. It has been on the market since 2006 and recently celebrated its 10th anniversary. Its feature list includes:

  • Code branching and merging
  • Multiple database types
  • Supports XML, YAML, JSON and SQL formats
  • Supports context-dependent logic
  • Generate Database change documentation
  • Generate Database "diffs"
  • Run through your build process, embedded in your application or on demand
  • Automatically generate SQL scripts for DBA code review
  • Does not require a live database connection

Why do you need it?

Some frameworks, like Eloquent and Doctrine, come with built-in solutions out of the box. There is nothing wrong with using those when you have only one DB per project, but when you have multiple systems, it starts to get complicated.

Since Liquibase works as a versioning tool, you can branch it just like normal code on GitHub and merge as needed. You have contexts, which means changes can be applied to specific environments only, and tagging capabilities so you can roll back.

Rollback is a tricky thing: you can either do an automatic rollback or define a rollback script, which is useful when dealing with MySQL, for instance, where DDL changes are NOT transactional.

Guidelines for changelogs and migrations

  • MUST be written using the JSON format. Exceptions are changes/legacy/base.xml and changes/legacy/base_procedures_triggers.sql.
  • MUST NOT be edited. If a new column is to be added, a new migration file must be created and the file MUST be added AFTER the last run transaction.

Branching

There could be 3 main branches:

  • production (master)
  • staging
  • testing

Steps:

  1. Create your changelog branch;
  2. Merge into testing;
  3. When the feature is ready for staging, merge into staging;
  4. When the feature is ready, merge into production.

Example: (Liquibase branching diagram)

Rules:

  • testing, staging and production DO NOT merge amongst themselves in any capacity;
  • DO NOT rebase the main branches;
  • Custom branch MUST be deleted after merged into production.

The downside of this approach is that the state of the branches diverges. The current process is to compare the branches from time to time and manually check the diffs for unplanned discrepancies.

Procedures for converting a legacy database to Liquibase migrations

Some projects are complete monoliths, with more than one application connecting to the same database. This is not a good practice; if you do that, I recommend treating the database source as its own repository instead of keeping it together with your application.

Writing migrations

This is a way I found to keep the structure reasonably organised. Suggestions are welcome.

1. Create the property file

Should be in the root of the project and be named liquibase.properties:

driver: com.mysql.jdbc.Driver
classpath: /usr/share/java/mysql-connector-java.jar:/usr/share/java/snakeyaml.jar
url: jdbc:mysql://localhost:3306/mydb
username: root
password: 123

The JAR files in the classpath can be manually downloaded or installed through the server's package manager.

2. Create the migration file

You can choose between different formats; I chose to use JSON. In this instance I will be running this SQL:

Which will translate to this:

Is it verbose? Yes, completely, but then you have a tool that shows you what the SQL will look like and lets you manage the rollbacks.

Save the file as:

.
  /changes
    - changelog.json
    - create_mydb_users.json

Where changelog.json looks like this:

For each new change you add it to the end of the databaseChangeLog array.

Run it

To run it, simply do:

$ liquibase --changeLogFile=changes/changelog.json migrate

Don't worry if you run it twice; each change is only applied once.

The next post will cover how to add a legacy DB to Liquibase.



Comparison of database encryption methods (for data at rest)




I recently came across a project where we had to evaluate different techniques for encrypting PII data at rest. The database is MySQL Community 5.6 on Red Hat Enterprise Linux. We had to encrypt (mask) customers' PII. For now the data is hosted in a local cloud, but we may move to a third-party cloud like Amazon in the future.

We are talking about two threats, internal and external. Internal: the support team accessing the database for fixes and reporting (on the slave), and also the DBA or Linux root user, who have special privileges, so PII needs to be masked from them. External: mainly hackers, and Amazon cloud admins if we move to their cloud environment. In the end we decided to have the application layer do the encryption/decryption. Here are the major factors that led to the decision:




Encryption Type

#  | Criterion                          | File system Encryption                      | Database level (TDE)   | Application level        | Column level privilege (with views)
1  | Who is responsible                 | OS                                          | MySQL EE               | Application              | DBA
2  | who can access data                | MySQL user(s)                               | MySQL users            | application              | Application, root, DBA
3  | protects data from                 | stolen disk, hackers                        | file system hackers    | everything               | non Admin MySQL users
4  | does not protect from              | DBA, OPS                                    | DBA, root user, OPS    | -                        | DBA, root, access during changes
5  | what can be encrypted              | all required file systems                   | database file system   | required fields          | required fields
6  | performance penalty                | high                                        | low                    | very low                 | nothing
7  | protection strength                | weak                                        | strong                 | very strong              | medium
8  | application change required        | No                                          | No                     | Yes                      | No
9  | Is backup encrypted                | depends on the method (e.g. sqldump is not) | depends on the method  | yes                      | No
10 | protects from internal threat      | no                                          | no                     | yes                      | yes
11 | protects from external threat      | yes                                         | yes                    | yes                      | depends
12 | duration to encrypt existing data  | long time                                   | long time              | depends which all fields | no time






OPS: support and dev teams having MySQL connectivity to the database.



Column level privilege: create views that exclude PII data for support folks; this can also be a separate schema containing only the views.
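
For illustration, here is a minimal sketch of the application-level approach we chose, using the Python cryptography package's Fernet recipe; the key handling and column names are hypothetical:

# Minimal sketch of application-level field encryption (hypothetical example).
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # in practice, load the key from a key-management system
f = Fernet(key)

ssn_plain = b"123-45-6789"
ssn_token = f.encrypt(ssn_plain)   # ciphertext goes into the PII column, e.g. customers.ssn_enc

# The application decrypts on read; DBAs and root only ever see the ciphertext.
assert f.decrypt(ssn_token) == ssn_plain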





This may not be very detailed, so let me know if you have any questions and I'll try my best to answer them.

Praji


Develop By Example – Document Store: working with collections using Node.js


In the previous blog post we explained how to create schemas and collections. In this one we are going to explain how to work with collections: adding, updating and deleting documents.

The following code demonstrates how to add a single document to an existing collection:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var newDoc = { name: 'Test Name', description: 'Test Description' };

  coll.add(newDoc).execute().then(function (added) {
    console.log('Document(s) added: '
                + added.getAffectedItemsCount());
    session.close();
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

In the previous code, first we get the objects that represent the schema (schema) and the collection (coll) that we want to work with. A JSON object (newDoc) is created and passed as a parameter to the coll object’s add method. Calling the execute method adds the document to the collection. Once the execute method has finished, we receive an object (added) that contains information about the document added. To verify that the document was added, we can call the added object’s getAffectedItemsCount method, which will return how many documents were added.

But, what if you want to add multiple documents?

You can do it with almost no changes in your code. The following code adds two documents at the same time:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var newDoc = { name: 'Test Name', description: 'Test Description' };
  var newDoc2 = { name: 'Test Name 2', description: 'Test Description 2' };

  coll.add(newDoc , newDoc2).execute().then(function (added) {
    console.log('Document(s) added: '
                + added.getAffectedItemsCount());
    session.close();
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

As you can see, the previous code is almost identical to the first example. We just add an extra line to declare the new document (newDoc2) and pass it as an additional parameter to the coll object’s add method. At the end we call the added object’s getAffectedItemsCount method to verify that we added the two documents.

Now we know how to add multiple documents to a collection by declaring them in multiple variables and passing them as parameters, but we can also do the same using an Array object. In the following code example we create an array object and use it to add new documents to a collection.

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var newDocs = [{ name: 'Test Name', description: 'Test Description' }, 
                 { name: 'Test Name 2', description: 'Test Description 2' }];

  coll.add(newDocs).execute().then(function (added) {
    console.log('Document(s) added: '
                + added.getAffectedItemsCount());
    session.close();
  })
  .catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
  });
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

The previous code is almost identical to the first example; the difference is that we pass an array object as a parameter instead of a JSON object. The rest of the code is the same. This could be useful if you receive an array of objects from the client or load the data from a JSON file: you just pass the whole array to upload it to the collection.

Updating a field in a document is also very easy to do. The following code is an example of how to do it:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var query = "$._id == 'f0a743ac-a052-d615-1f8c-ef65ebc4'";

  coll.modify(query)
  .set('$.name', 'New Name')
  .set('$.description', 'New Description')
  .execute()
  .then(function (updated) {
    console.log('Document(s) updated: '
                + updated.getAffectedItemsCount());
    session.close();
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

In the previous code, first we get the objects that represent the schema (schema) and the collection (coll) we want to work with. Then we declare the query variable, which contains the where clause for our update. Next, we call the coll object’s modify method, which receives the query variable as a parameter. Chained to the modify method is the set method, which receives a pair of objects; the first one is the field to update and the second one the new value to be set. As we did in our previous examples, we call the execute method to perform the requested action. When the execute method finishes we receive an object (updated) with information about the update. To know how many documents were updated we call the updated object’s getAffectedItemsCount method.

Now that we know how to add and update documents in a collection, we are going to explain how to remove them. The following code demonstrates it.

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var query = "$._id == 'f0a743ac-a052-d615-1f8c-ef65ebc4'";

  coll.remove(query).execute().then(function (deleted) {
    console.log('Document(s) deleted: ' 
                + deleted.getAffectedItemsCount());
    session.close();
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

The previous code defines the objects that represent the schema (schema) and the collection (coll) we want to work with. Then we define the query variable again to contain the where clause for our operation, in this case the removal. To remove a document we call the coll object’s remove method followed by the execute method. Once the execute method is completed, we receive an object (deleted) with information about the operation that has finished. By calling the deleted object’s getAffectedItemsCount method, we know how many documents were removed from the collection.

Now we are going to see how to get documents from a collection. In the following example, we are retrieving the document that matches the _id we want:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var query = "$._id == 'f0a743ac-a052-d615-1f8c-ef65ebc4'";

  coll.find(query).execute(function (doc) {
    console.log(doc);
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
  session.close();
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

The previous code defines the objects that represent the schema (schema) and the collection (coll) we want to work with. Then the query variable is defined with the where clause. Next we call the coll object’s find method followed by execute to perform the query. When the execute method completes, we receive the document that matches our search criteria and send it to the console to view it.

But, what if we want all the records from a collection? Well that is simple; we just need to remove the search criteria from the find method. The updated code would look like the following:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');

  coll.find().execute(function (doc) {
    console.log(doc);
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
  session.close();
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

Now we know how to search for a specific document and how to get all the documents from a collection. What if we want to get just a certain number of documents that match query criteria? The next example shows the code to do it:

var mysqlx = require('mysqlx');
mysqlx.getSession({
  host: 'host',
  port: '33060',
  dbUser: 'root',
  dbPassword: 'my pass'
}).then(function (session) {
  var schema = session.getSchema('mySchema');
  var coll = schema.getCollection('myColl');
  var query = "$.name like '%Test%'";

  coll.find(query).limit(3).execute(function (doc) {
    console.log(doc);
  })
  .catch(function (err) {
    console.log(err.message);
    console.log(err.stack);
  });
  session.close();
}).catch(function (err) {
  console.log(err.message);
  console.log(err.stack);
});

The previous code looks very similar to the example that returns one document with a specific _id; the difference here is that our query performs a like comparison and we add a call to the coll object’s limit method. Note that the query statement is case sensitive; this means that if we have documents with ‘test’ in the ‘name’ field, those documents will not be returned because we are searching for ‘Test’ names.

See you in the next blog post.



Avoiding MySQL ERROR 1784 when replicating from 5.7 to 5.6


Recently I upgraded some MySQL databases from 5.6 to 5.7, but -- for boring reasons unique to my environment -- I had to leave one replica on version 5.6. I knew there was a chance that the 5.7 -> 5.6 replication wouldn't work, but I decided to try it out to see if (and why) it would fail. Once I upgraded the master, replication failed, so I checked the error log on the replica and found these messages:

[ERROR] Slave I/O: Found a Gtid_log_event or Previous_gtids_log_event when @@GLOBAL.GTID_MODE = OFF. Error_code: 1784
[ERROR] Slave I/O: Relay log write failure: could not queue event from master, Error_code: 1595

The error surprised me a little bit since I'm not using GTIDs in that replication topology. I asked around a bit, and Kenny Gryp hypothesized that I might be experiencing MySQL bug #74683, which was fixed in version 5.6.23. Since my replica was on 5.6.22, I decided to do an incremental 5.6 upgrade to see if that resolved the issue. I upgraded to 5.6.31 and replication started working.

YMMV, and there are bound to be 5.7 -> 5.6.31 replication scenarios that don't work, but this was a simple fix for me. In hindsight it makes sense that replicating from a newer major version to an older one is more likely to work when the replica runs a recent minor version, so if I have to do this in the future I'll make sure the replica is running the latest minor version before upgrading the master.

P.S. Thanks Kenny!



How I became a Data Engineer


Pardon me if my memory fails a bit; I am trying to recount the events as they happened, but memory is not my strongest suit.

I think I had a bit of a different path than most PHP software developers. My first job was at a company that had their own CRM, and they wanted a web version of some parts of the system. At that time, I had only “experience” with ASP and Microsoft Access (I know, tough). They wanted it to be in PHP; the difference, I think, came when they said they wanted the integration to run directly against the database. The web app would write directly into the DB. The CRM system was written using Delphi and Firebird. So I learned PHP, and my first database wasn’t MySQL (I don’t count MS Access as a DB). After that I got a job where MySQL was used; it was a bit weird at the time learning that MyISAM (I was really fresh on MySQL and didn’t know about storage engines and such) didn’t have foreign keys, for instance.

After that I got a job at a huge multinational that had a project to migrate every Excel spreadsheet to a PHP program. VBA was heavily used there, with entire programs running on it; what they didn't tell us was that it was cheaper for them to have a whole team of PHP developers building an internal system than to have the features built into their ERP. But for “security” reasons, no business logic could live inside the PHP code, so I had to write tons of stored procedures. They also had integrations with MS SQL Server; the workflow system used it together with a PHP tool called Scriptcase.

Another job was with another multinational, where I had to write a program that read from various sources and stored the data in a DB for later reporting; that was a rough sketch of an ETL (Extract, Transform, Load), but at the time I wasn’t really well versed in data warehouse techniques. For that job I used PostgreSQL. At the same company we later built Magento white-label stores for our clients (other big companies), which had to integrate with our ERP; differently from my first job, the integration went through a Service Bus written in Java, and the ERP used Oracle as its DB.

I could go on and on about every job I had. When I say my path was a bit different, it's because most of my jobs were at corporate companies, and that put me in contact with many flavours of relational databases, and NoSQL too (MongoDB, Dynamo and Cassandra being the main ones).

In the end, that kind of exposure made me the “DB” person among my fellow engineers, and anything considered more than CRUD would fall into my lap. And I was happy with it; I noticed that other developers didn't much like doing database tasks. One of my employers noticed my interest and how quickly I became familiar with the schema and our data architecture. Our main relational database had millions and millions of records, terabytes of data, and they created a Data Team to make it more efficient. We would create workers to sync data into Elasticsearch, for instance. Officially I was still a PHP developer, but I was mainly curating the DB and writing workers in NodeJS (we needed the async, and indexing time was crucial for the business).

That’s when I discovered that I wanted to work with data. I didn’t have any idea what kind of title that would be; I knew I didn’t want to be a DBA, as too much infrastructure is involved and it didn’t sound fun to me. Once the opportunity to work officially as a Data Engineer appeared, I grabbed it with both hands and I knew it was right.



Working with CloudFlare DNS in python


Last week I wrote about a DNS discovery feature in Etcd. As a step in the whole process, we need to create DNS records in the twindb.com zone. CloudFlare provides a rich API to work with it. We wrapped it into a Python module, twindb_cloudflare, and open-sourced it.

In this post I will show how to use the twindb_cloudflare module.

CloudFlare API credentials

First of all you need to get credentials to work with CloudFlare. Visit https://www.cloudflare.com/a/account/my-account and get “API Key”.

CLOUDFLARE_EMAIL = "aleks@twindb.com"
CLOUDFLARE_AUTH_KEY = "dbb4a7ae063347a306e9ad8c5bda58a7a3cfa"

Installing twindb_cloudflare

The module is available in PyPi. You can install it with pip:

$ pip install twindb_cloudflare
Collecting twindb_cloudflare
  Using cached twindb_cloudflare-0.1.1-py2.py3-none-any.whl
Installing collected packages: twindb-cloudflare
Successfully installed twindb-cloudflare-0.1.1

Creating A record

To create an A record:

import socket
import time
from twindb_cloudflare.twindb_cloudflare import CloudFlare, CloudFlareException

CLOUDFLARE_EMAIL = "aleks@twindb.com"
CLOUDFLARE_AUTH_KEY = "dbb4a7ae063347a306e9ad8c5bda58a7a3cfa"

cf = CloudFlare(CLOUDFLARE_EMAIL, CLOUDFLARE_AUTH_KEY)

try:
    cf.create_dns_record('blogtest.twindb.com', 'twindb.com', '10.10.10.10')
    # The new record isn't available right away
    wait_until = time.time() + 600
    while time.time() < wait_until:
        try:
            ip = socket.gethostbyname('blogtest.twindb.com')
            print(ip)
            exit(0)
        except socket.gaierror:
            time.sleep(1)
    print("New record isn't available after 600 seconds")
    exit(-1)
except CloudFlareException as err:
    print(err)
    exit(-1)

The script runs for a minute or two:

[16:41:51 aleks@Aleksandrs-MacBook-Pro mydns]$ python create.py
10.10.10.10
[16:43:59 aleks@Aleksandrs-MacBook-Pro mydns]$

Updating A record

Updating a record is pretty straightforward as well. In real applications you have to take into account that a DNS response may be cached, so CloudFlare changes aren’t visible immediately.

import socket
import time
from twindb_cloudflare.twindb_cloudflare import CloudFlare, CloudFlareException

CLOUDFLARE_EMAIL = "aleks@twindb.com"
CLOUDFLARE_AUTH_KEY = "dbb4a7ae063347a306e9ad8c5bda58a7a3cfa"

cf = CloudFlare(CLOUDFLARE_EMAIL, CLOUDFLARE_AUTH_KEY)

try:
    cf.update_dns_record('blogtest.twindb.com', 'twindb.com', '10.20.20.20')
    # The new record isn't available right away
    wait_until = time.time() + 600
    while time.time() < wait_until:
        try:
            ip = socket.gethostbyname('blogtest.twindb.com')
            if ip == '10.20.20.20':
                print(ip)
                exit(0)
            else:
                time.sleep(1)
        except socket.gaierror:
            time.sleep(1)
    print("New record isn't updated after 600 seconds")
    exit(-1)
except CloudFlareException as err:
    print(err)
    exit(-1)

The change is visible after a minute

[16:54:02 aleks@Aleksandrs-MacBook-Pro mydns]$ python myupdate.py
10.20.20.20
[16:55:03 aleks@Aleksandrs-MacBook-Pro mydns]$

Deleting A record

And finally, let’s delete the record we’ve created. Here I won’t wait until the change has propagated; I’ll assume that if there was no exception, we are good.

from twindb_cloudflare.twindb_cloudflare import CloudFlare, CloudFlareException

CLOUDFLARE_EMAIL = "aleks@twindb.com"
CLOUDFLARE_AUTH_KEY = "dbb4a7ae063347a306e9ad8c5bda58a7a3cfa"

cf = CloudFlare(CLOUDFLARE_EMAIL, CLOUDFLARE_AUTH_KEY)

try:
    cf.delete_dns_record('blogtest.twindb.com', 'twindb.com')
except CloudFlareException as err:
    print(err)
    exit(-1)

Final notes

The CloudFlare API provides many more actions than the module implements. However, I wanted to start with something small and incrementally implement more features as they are needed.

I encourage you to file bugs and feature requests on https://github.com/twindb/twindb_cloudflare/issues.
Also, pull requests are welcome.

The post Working with CloudFlare DNS in python appeared first on Backup and Data Recovery for MySQL.



Speeding up protocol decoders in python


Decoding binary protocols in python

Decoding binary protocols like the MySQL Client/Server Protocol or MySQL's new X Protocol involves taking a sequence of bytes and turning them into integers.

In Python the usual workhorse for this task is struct.unpack().

It takes a sequence of bytes and a format-string and returns a tuple of decoded values.

In the case of the MySQL Client/Server protocol the integers are (mostly) little-endian, unsigned and we can use:

format description
<B integer, little endian, unsigned, 1 byte
<H integer, little endian, unsigned, 2 bytes
<L integer, little endian, unsigned, 4 bytes
<Q integer, little endian, unsigned, 8 bytes
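
For example (a quick throwaway illustration, separate from the module below), a 2-byte little-endian length field decodes like this:

import struct

payload = b"\x2a\x00"                    # two bytes off the wire, little endian
print(struct.unpack("<H", payload)[0])   # -> 42
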
# unpack_int_le_1.py

import struct


def unpack_int_le(payload, l):
    if l == 0:
        return 0
    elif l == 1:
        return struct.unpack_from("<B", payload)[0]
    elif l == 2:
        return struct.unpack_from("<H", payload)[0]
    elif l == 4:
        return struct.unpack_from("<L", payload)[0]
    elif l == 8:
        return struct.unpack_from("<Q", payload)[0]
    else:
        # no native mapping for that byte-length,
        # fallback to shift+or
        v = 0
        sh = 0

        for d in struct.unpack_from("%dB" % l, payload):
            v |= d << sh
            sh += 8

        return v

The code gets benchmarked with 'timeit', which runs the setup once and the test function repeatedly (1,000,000 times by default):

# benchmark.py

import timeit
import tabulate

from collections import OrderedDict

setup = r"""
from unpack_int_le_1 import unpack_int_le

test_str = b'\x00\x00\x00\x00\x00\x00\x00\x00'
"""


timings = OrderedDict()

for n in (1, 2, 3, 4, 8):
    key = str(n)

    timings[key] = timeit.timeit(
        r"unpack_int_le(test_str, %d)" % (n, ),
        setup=setup)


print(tabulate.tabulate(
    timings.items(),
    tablefmt="rst",
    headers=("byte length",
             "(s)",
             )))

shows us:

byte length time (s)
1 0.466556
2 0.422064
3 1.17804
4 0.439113
8 0.448069

As the MySQL Client/Server protocol also has a 3-byte unsigned integer, which doesn't have a mapping in struct's format strings, the fallback is taken, which is quite slow.

Optimizing 3 byte integers

There is a nice trick to speed it up:

uint3_le = struct.unpack("<L", payload[:3] + b"\x00")[0]
# unpack_int_le_2.py

import struct


def unpack_int_le(payload, l):
    if l == 0:
        return 0
    elif l == 1:
        return struct.unpack_from("<B", payload)[0]
    elif l == 2:
        return struct.unpack_from("<H", payload)[0]
    elif l == 3 and hasattr(payload, "__add__"):
        return struct.unpack_from("<L", payload[:l] + b"\x00")[0]
    elif l == 4:
        return struct.unpack_from("<L", payload)[0]
    elif l == 8:
        return struct.unpack_from("<Q", payload)[0]
    else:
        # no native mapping for that byte-length, fallback to
        # shift+or
        v = 0
        sh = 0

        for d in struct.unpack_from("%dB" % l, payload):
            v |= d << sh
            sh += 8

        return v

Note

the check for __add__ handles the case of memoryviews, which can't be extended like that.
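
A quick illustration of why that check is needed (a hedged aside, not part of the module):

# bytes objects support concatenation, memoryviews do not
print(hasattr(b"\x2a\x00\x00", "__add__"))              # True
print(hasattr(memoryview(b"\x2a\x00\x00"), "__add__"))  # False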

byte length time (s)
1 0.479504
2 0.430233
3 0.650036
4 0.456068
8 0.481057

The tiny bit of extra work to slice and extend the string wins over doing all the work in pure Python.

Naive cython

cython is a python to C converter which (with the help of a C compiler) builds native python modules.

You can install it via pip:

$ pip install cython

and generate a native python module with:

$ cythonize ./unpack_int_le_3.py && \
  gcc -shared -pthread -fPIC -fwrapv -O3 -Wall \
    -fno-strict-aliasing -I/usr/include/python2.7 \
    -o unpack_int_le_3.so unpack_int_le_3.c

Without any changes to the Python code we gain between 20% and 40%:

byte length time (s)
1 0.36877
2 0.314847
3 0.427502
4 0.322987
8 0.334282

Static typed python with annotations

cython can do better though with static typing.

Looking at the generated unpack_int_le_3.c one can see that most of the time is spent in the abstractions (object, buffer-interface, ...) for each byte extracted.

cython can remove all that if we give it some hints:

# unpack_int_le_4.py

import cython


@cython.locals(payload=cython.p_uchar,
               l=cython.ulong,
               v=cython.ulong,
               sh=cython.ulong,
               n=cython.ulong)
def _unpack_int_le(payload, l):
    v = sh = 0

    for n in range(l):
        v |= payload[n] << sh
        sh += 8

    return v


@cython.locals(l=cython.ulong,
               charp=cython.p_uchar)
def unpack_int_le(payload, l):
    if l > len(payload):
        raise IndexError()

    if isinstance(payload, (bytes, bytearray)):
        charp = payload

        return _unpack_int_le(charp, l)
    else:
        raise TypeError(type(payload))

See also: http://docs.cython.org/en/latest/src/tutorial/pure.html

With these hints given the _unpack_int_le() from above gets translated to:

__pyx_v_v = 0;
__pyx_v_sh = 0;

__pyx_t_1 = __pyx_v_l;
for (__pyx_t_2 = 0; __pyx_t_2 < __pyx_t_1; __pyx_t_2+=1) {
  __pyx_v_n = __pyx_t_2;

  __pyx_v_v = (__pyx_v_v | ((__pyx_v_payload[__pyx_v_n]) << __pyx_v_sh));
  __pyx_v_sh = (__pyx_v_sh + 8);
}

return PyInt_from_unsigned_long(__pyx_v_v)

This gains another 50%:

byte length time (s)
1 0.224607
2 0.157248
3 0.158877
4 0.163004
8 0.165769

While the code is pure Python with some annotations, it sadly comes with some overhead in the generated C code.

Static typed cython

cython has two more ways of specifying the static types:

  • move the static types into a .pxd file
  • write a .pyx file directly

In our case the result would be the same which is why I go with the first approach:

# unpack_int_le_5.pxd

import cython


@cython.locals(v=cython.ulong,
               sh=cython.ulong,
               n=cython.ulong)
cdef unsigned long _unpack_int_le(unsigned char *payload, unsigned long l)

@cython.locals(l=cython.ulong,
               charp=cython.p_uchar)
cpdef unpack_int_le(payload, unsigned long l)
# unpack_int_le_5.py

def _unpack_int_le(payload, l):
    v = sh = n = 0

    for n in range(l):
        v |= payload[n] << sh
        sh += 8

    return v


def unpack_int_le(payload, l):
    if l > len(payload):
        raise IndexError()

    if isinstance(payload, (bytes, bytearray)):
        charp = payload

        return _unpack_int_le(charp, l)
    else:
        raise TypeError(type(payload))
$ cythonize ./unpack_int_le_5.py && \
  gcc -shared -pthread -fPIC -fwrapv \
    -O3 -Wall -fno-strict-aliasing \
    -I/usr/include/python2.7 \
    -o unpack_int_le_5.so unpack_int_le_5.c
byte length time (s)
1 0.084414
2 0.0818369
3 0.083369
4 0.084229
8 0.0980709

One can see that the longer byte-length leads to slightly higher runtime.

Just cast it

One last step and we are at the final form of the optimizations, which mirrors what one would have written in C directly.

If we are running on a little-endian platform, we can skip the shift-or loop and just cast the byte sequence into the right integer type directly.

Note

to show how the code looks in a pyx file, the previous py and pxd files have been merged into a pyx.

# unpack_int_le_6.pyx

import sys


cdef unsigned int is_le
is_le = sys.byteorder == "little"


cdef unsigned long _unpack_int_le(const unsigned char *payload, unsigned long l):
  cdef unsigned long *u32p
  cdef unsigned short *u16p
  cdef unsigned long long *u64p
  cdef unsigned long v = 0
  cdef unsigned long sh = 0
  cdef unsigned long n = 0

  if is_le:
      if l == 1:
          return payload[0]
      elif l == 2:
          u16p = <unsigned short *>payload
          return u16p[0]
      elif l == 4:
          u32p = <unsigned long *>payload
          return u32p[0]
      elif l == 8:
          u64p = <unsigned long long *>payload
          return u64p[0]

  v = sh = n = 0

  for n in range(l):
      v |= payload[n] << sh
      sh += 8

  return v


cpdef unsigned long unpack_int_le(payload, unsigned long l):
    cdef unsigned char *charp
    if l > len(payload):
        raise IndexError()

    if isinstance(payload, (bytes, bytearray)):
        charp = payload

        return _unpack_int_le(charp, l)
    else:
        raise TypeError(type(payload))
$ cythonize ./unpack_int_le_6.pyx && \
  gcc -shared -pthread -fPIC -fwrapv \
    -O3 -Wall -fno-strict-aliasing \
    -I/usr/include/python2.7 \
    -o unpack_int_le_6.so unpack_int_le_6.c
byte length time (s)
1 0.0812111
2 0.0792191
3 0.082288
4 0.079355
8 0.0791218

Note

the 3-byte case hits the loop again and is slightly slower.

Conclusion

Between the first attempt with struct.unpack() and our last with cython we have quite a speed difference:

byte length time (s) time (s) speedup
1 0.466556 0.0812111 5.75x
2 0.422064 0.0792191 5.34x
3 1.17804 0.082288 14.36x
4 0.439113 0.079355 5.56x
8 0.448069 0.0791218 5.67x

We got a 5x speedup and didn't have to write a single line of Python C-API code.



MySQL on Docker: Single Host Networking for MySQL Containers


Networking is critical in MySQL; it is a fundamental resource that manages access to the server from client applications and other replication peers. The behaviour of a containerized MySQL service is determined by how the MySQL image is spawned with the “docker run” command. With Docker single-host networking, a MySQL container can be run in an isolated environment (only reachable by containers in the same network), in an open environment (where the MySQL service is totally exposed to the outside world), or with no network at all.

In the previous two blog posts, we covered the basics of running MySQL in a container and how to build a custom MySQL image. In today’s post, we are going to cover the basics of how Docker handles single-host networking and how MySQL containers can leverage that.

3 Types of Networks

By default, Docker creates 3 networks on the machine host upon installation:

$ docker network ls
NETWORK ID          NAME                DRIVER
1a54de857c50        host                host
1421a175401a        bridge              bridge
62bf0f8a1267        none                null

Each network driver has its own characteristic, explained in the next sections.

Host Network

The host network adds a container on the machine host’s network stack. You may imagine containers running in this network are connecting to the same network interface as the machine host. It has the following characteristics:

  • Container’s network interfaces will be identical with the machine host.
  • Only one host network per machine host. You can’t create more.
  • You have to explicitly specify “--net=host” in the “docker run” command line to assign a container to this network.
  • Container linking, “--link mysql-container:mysql” is not supported.
  • Port mapping, “-p 3307:3306” is not supported.

Let’s create a container on the host network with “--net=host”:

$ docker run \
--name=mysql-host \
--net=host \
-e MYSQL_ROOT_PASSWORD=mypassword \
-v /storage/mysql-host/datadir:/var/lib/mysql \
-d mysql

When we look into the container’s network interface, the network configuration inside the container is identical to the machine host:

[machine-host]$ docker exec -it mysql-host /bin/bash
[container-host]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:fa:f6:30 brd ff:ff:ff:ff:ff:ff
    inet 192.168.55.166/24 brd 192.168.55.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fefa:f630/64 scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:93:50:ee:c8 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:93ff:fe50:eec8/64 scope link

In this setup, the container does not need any forwarding rules in iptables since it’s already attached to the same network as the host. Hence, port mapping using option “-p” is not supported and Docker will not manage the firewall rules of containers that run in this type of network.

If you look at the listening ports on the host machine, port 3306 is listening as it should:

[machine-host]$ netstat -tulpn | grep 3306
tcp6       0      0 :::3306                 :::*                    LISTEN      25336/mysqld

Having a MySQL container running on the Docker host network is similar to having a standard MySQL server installed on the host machine. This is only helpful if you want to dedicate the host machine as a MySQL server, but have it managed by Docker instead.
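
For example, since the container listens directly on the host’s network interfaces, a client on another machine would simply connect to the host’s IP address on port 3306 (the IP below is from the example above):

$ mysql -uroot -pmypassword -h 192.168.55.166 -P 3306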

Now, our container architecture can be illustrated like this:

Containers created on host network are reachable by containers created inside the default docker0 and user-defined bridge.

Bridge network

Bridging allows multiple networks to communicate independently while staying separated on the same physical host. You can think of it as another internal network inside the host machine. Only containers in the same network, plus the host machine itself, can reach each other. If the host machine can reach the outside world, so can the containers.

There are two types of bridge networks:

  1. Default bridge (docker0)
  2. User-defined bridge

Default bridge (docker0)

The default bridge network, docker0, is automatically created by Docker upon installation. You can verify this by using the “ifconfig” or “ip a” command. The default subnet is 172.17.0.0/16 (with the gateway on 172.17.0.1), and you can change this inside /etc/default/docker (Debian) or /etc/sysconfig/docker (RedHat). Refer to the Docker documentation if you would like to change this.
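
As an illustration, on a Debian-based host one way to change the subnet is to pass the “--bip” option to the Docker daemon and restart it (the subnet below is just an example):

# /etc/default/docker
DOCKER_OPTS="--bip=10.10.0.1/24"

$ sudo service docker restart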

Let’s jump into an example. Basically, if you don’t explicitly specify “--net” parameter in the “docker run” command, Docker will create the container under the default docker0 network:

$ docker run \
--name=mysql-bridge \
-p 3307:3306 \
-e MYSQL_ROOT_PASSWORD=mypassword \
-v /storage/mysql-bridge/datadir:/var/lib/mysql \
-d mysql

And when we look at the container’s network interface, Docker creates one network interface, eth0 (excluding localhost):

[machine-host]$ docker exec -it mysql-bridge /bin/bash
[container-host]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:2/64 scope link
       valid_lft forever preferred_lft forever

By default, Docker utilises iptables to manage packet forwarding to the bridge network. Each outgoing connection will appear to originate from one of the host machine’s own IP addresses. The following is the machine’s NAT chains after the above container was started:

[machine-host]$ iptables -L -n -t nat
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0
MASQUERADE  tcp  --  172.17.0.2           172.17.0.2           tcp dpt:3306

Chain DOCKER (2 references)
target     prot opt source               destination
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:3307 to:172.17.0.2:3306

The above rules allow port 3307 to be exposed on the machine host based on the port mapping option “-p 3307:3306” in the “docker run” command line. If we look at the netstat output on the host, we can see MySQL is listening on port 3307, owned by the docker-proxy process:

[machine-host]$ netstat -tulpn | grep 3307
tcp6       0      0 :::3307                 :::*                    LISTEN      4150/docker-proxy

At this point, our container setup can be illustrated below:

The default bridge network supports the use of port mapping and container linking to allow communication between containers in the docker0 network. If you would like to link another container, you can use the “--link” option in the “docker run” command line. Docker documentation provides extensive details on how the container linking works by exposing environment variables and auto-configured host mapping through /etc/hosts file.
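
As a rough sketch (the client container name below is made up), linking a second container to the MySQL container above and connecting through the link alias could look like this:

$ docker run \
--name=client1 \
--link mysql-bridge:mysql \
-it mysql \
mysql -h mysql -uroot -pmypassword -e "SELECT 1"

The “--link” option adds a host entry for the alias “mysql” inside the client container, pointing at the linked container’s IP address.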

User-defined bridge

Docker allows us to create a custom bridge network, a.k.a. a user-defined bridge network (you can also create a user-defined overlay network, but we are going to cover that in the next blog post). It behaves exactly like the docker0 network, where each container in the network can immediately communicate with other containers in the network. The network itself isolates the containers from external networks.

The big advantage of this network is that all containers can resolve each other’s container names. Consider the following network:

[machine-host]$ docker network create mysql-network

Then, create five MySQL containers on the user-defined network:

[machine-host]$ for i in {1..5}; do docker run --name=mysql$i --net=mysql-network -e MYSQL_ROOT_PASSWORD=mypassword -d mysql; done

Now, log in to one of the containers (mysql3):

[machine-host]$ docker exec -it mysql3 /bin/bash

We can then ping all containers in the network without ever linking them:

[mysql3-container]$ for i in {1..5}; do ping -c 1 mysql$i ; done
PING mysql1 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: icmp_seq=0 ttl=64 time=0.151 ms
--- mysql1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.151/0.151/0.151/0.000 ms
PING mysql2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: icmp_seq=0 ttl=64 time=0.138 ms
--- mysql2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.138/0.138/0.138/0.000 ms
PING mysql3 (172.18.0.4): 56 data bytes
64 bytes from 172.18.0.4: icmp_seq=0 ttl=64 time=0.087 ms
--- mysql3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.087/0.087/0.087/0.000 ms
PING mysql4 (172.18.0.5): 56 data bytes
64 bytes from 172.18.0.5: icmp_seq=0 ttl=64 time=0.353 ms
--- mysql4 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.353/0.353/0.353/0.000 ms
PING mysql5 (172.18.0.6): 56 data bytes
64 bytes from 172.18.0.6: icmp_seq=0 ttl=64 time=0.135 ms
--- mysql5 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.135/0.135/0.135/0.000 ms

If we look into the resolver setting, we can see Docker configures an embedded DNS server:

[mysql3-container]$ cat /etc/resolv.conf
search localdomain
nameserver 127.0.0.11
options ndots:0

The embedded DNS server maintains the mapping between a container name and its IP address on the network the container is connected to, which in this case is mysql-network. This feature facilitates node discovery in the network and is extremely useful in building a cluster of MySQL containers using MySQL clustering technologies like MySQL replication, Galera Cluster or MySQL Cluster.
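
To illustrate, you can point the mysql client at a peer container by name from inside the network (using the containers created above):

[machine-host]$ docker exec -it mysql3 mysql -h mysql1 -uroot -pmypassword -e "SELECT @@hostname"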

At this point, our container setup can be illustrated as the following:

Default vs User-defined Bridge

The following summarises the major differences between these two networks:

  • Network deployment: the default bridge (docker0) is created by Docker upon installation, while a user-defined bridge is created by the user.
  • Container deployment: containers land on the default bridge unless you explicitly specify “--net=[network-name]” in the “docker run” command to place them on a user-defined bridge.
  • Container linking: the default bridge allows you to link multiple containers together and send connection information from one to another using “--link [container-name]:[service-name]”; when containers are linked, information about the source container can be sent to the recipient container. Linking is not supported on user-defined bridges.
  • Port mapping: supported on both, e.g. by using “-p 3307:3306”.
  • Name resolver: not supported on the default bridge (unless you link the containers). On a user-defined bridge, all containers can resolve each other’s container names to IP addresses; Docker versions earlier than 1.10 use /etc/hosts, while 1.10 and later use the embedded DNS server.
  • Packet forwarding: yes on both, via iptables.
  • Example usage for MySQL: standalone MySQL on the default bridge; MySQL replication, Galera Cluster or MySQL Cluster (setups involving more than one MySQL container) on a user-defined bridge.

No network

We can also create a container without any network attached to it by specifying “--net=none” in the “docker run” command. The container is then only accessible through an interactive shell, and no additional network interface will be configured inside it.

Consider the following:

[machine-host]$ docker run --name=mysql0 --net=none -e MYSQL_ROOT_PASSWORD=mypassword -d mysql

By looking at the container’s network interface, only localhost interface is available:

[machine-host]$ docker exec -it mysql0 /bin/bash
[mysql0-container]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

A container in the none network cannot join any other network. Nevertheless, the MySQL container is still running and you can access it directly from a shell using the mysql client command line through localhost or the socket:

[mysql0-container]$ mysql -uroot -pmypassword -h127.0.0.1 -P3306
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.13 MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Example use cases for running a MySQL container in this network are MySQL backup verification (testing the restoration process of a backup created with, e.g., Percona XtraBackup) or testing queries on different versions of MySQL server.
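
For instance, a backup verification run might mount a restored datadir into an isolated container and inspect it locally. This is only a sketch; the paths, credentials and table name below are illustrative:

$ docker run \
--name=mysql-verify \
--net=none \
-v /backups/restored-datadir:/var/lib/mysql \
-d mysql
$ docker exec -it mysql-verify mysql -uroot -p -e "CHECKSUM TABLE mydb.mytable"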

At this point, our containers setup can be illustrated as the following:

This concludes today’s blog. In the next blog post, we are going to look into multiple host networking (using overlay networks) together with Docker Swarm, an orchestration tool to manage containers on multiple machine hosts.



Percona XtraBackup 2.4.4 is now available

Percona announces the GA release of Percona XtraBackup 2.4.4 on July 25th, 2016. You can download it from our download site and from apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.
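
As a quick refresher (not specific to this release; the user and paths below are illustrative), a basic backup-and-prepare cycle with the XtraBackup 2.4 binary looks roughly like this:

$ xtrabackup --backup --user=backup --password=secret --target-dir=/data/backups/full
$ xtrabackup --prepare --target-dir=/data/backups/full

The prepared directory can then be copied back into an empty datadir to restore.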

New Features:

  • Percona XtraBackup has been rebased on MySQL 5.7.13.

Bugs Fixed:

  • Percona XtraBackup reported the difference in the actual size of the system tablespace and the size which was stored in the tablespace header. This check is now skipped for tablespaces with autoextend support. Bug fixed #1550322.
  • Because Percona Server 5.5 and MySQL 5.6 store the LSN offset for large log files at different places inside the redo log header, Percona XtraBackup was trying to guess which offset is better to use by trying to read from each one and compare the log block numbers and assert lsn_chosen == 1 when both LSNs looked correct, but they were different. Fixed by improving the server detection. Bug fixed #1568009.
  • Percona XtraBackup didn’t correctly detect when tables were both compressed and encrypted. Bug fixed #1582130.
  • Percona XtraBackup would crash if the keyring file was empty. Bug fixed #1590351.
  • Backup couldn’t be prepared when the size in cache didn’t match the physical size. Bug fixed #1604299.
  • Free Software Foundation address in copyright notices was outdated. Bug fixed #1222777.
  • Backup process would fail if the datadir specified on the command line was not the same as the one reported by the server. Percona XtraBackup now allows the datadir from my.cnf to override the one from SHOW VARIABLES. xtrabackup prints a warning that they don’t match, but continues. Bug fixed #1526467.
  • With the upstream change of the maximum page size from 16K to 64K, the size of the incremental buffer became 1GB, which raised the RAM requirement for preparing a backup to 1GB even though there is no need to allocate such a large buffer for smaller pages. Bug fixed #1582456.
  • Backup process would fail on MariaDB Galera cluster operating in GTID mode if binary logs were in non-standard directory. Bug fixed #1517629.

Other bugs fixed: #1583717, #1583954, and #1599397.

Release notes with all the bugfixes for Percona XtraBackup 2.4.4 are available in our online documentation. Please report any bugs to the launchpad bug tracker.



SQL injection in the MySQL server! (of the proxy kind)

[this is a repost of my http://shardquery.com blog post, because it did not syndicate to planet.mysql.com]

As work on WarpSQL (Shard-Query 3) progresses, it has outgrown MySQL proxy.  MySQL proxy is a very useful tool, but it requires Lua scripting, and it is an external daemon that needs to be maintained.  The MySQL proxy module for Shard-Query works well, but to make WarpSQL into a real distributed transaction coordinator, moving the proxy logic inside of the server makes more sense.

The main benefit of MySQL proxy is that it allows a script to "inject" queries between the client and server, intercepting the results and possibly sending back new results to the client.  I would like similar functionality, but inside of the server.

For example, I would like to implement new SHOW commands, and these commands do not need to be implemented as actual MySQL SHOW commands under the covers.

For example, for this blog post I made a new example command called "SHOW PASSWORD"

Example "injection" which adds SHOW PASSWORD functionality to the server

mysql> select user();
+----------------+
| user()         |
+----------------+
| root@localhost |
+----------------+
1 row in set (0.00 sec)

-- THIS COMMAND DOES NOT EXIST
mysql> show password;
+-------------------------------------------+
| password_hash                             |
+-------------------------------------------+
| *00A51F3F48415C7D4E8908980D443C29C69B60C9 |
+-------------------------------------------+
1 row in set (0.00 sec)


Important - This isn't a MySQL proxy plugin.  There is C++ code in the SERVER to answer that query, but it isn't the normal SHOW command code.  This "plugin" (I put it in quotes because my plan is for a pluggable interface but it isn't added to the server yet) doesn't access the mysql.user table using normal internal access methods. It runs actual SQL inside of the server, on the same THD as the client connection, in the same transaction as the client connection, to get the answer!

Problem #1 - Running SQL in the server

The MySQL C client API doesn't have any methods for connecting to the server from inside of the server, except to connect to the normally available socket interfaces, authenticate, and then issue queries like a normal client.  While it is perfectly possible to connect to the server as a client in this manner, it is sub-optimal for a number of reasons.  First, it requires a second connection to the server; second, it requires that you authenticate again (which requires you to have the user's password); and lastly, any work done in the second connection is not party to transactional changes in the first, and vice versa.

The problem is communication between the client and server, which uses a mechanism called VIO.  There was work done a long time ago for external stored procedures that would have alleviated this problem by implementing an in-server VIO layer and making the parser re-entrant, but it never made it into the main server.  That work was done on MySQL 5.1, though.

It is possible to run queries without using VIO, though.  You simply can't get results back, except to know whether the query succeeded or not.  This makes it perfectly acceptable for any command that doesn't need a resultset, basically anything other than SELECT.  There is a loophole, however, in that any changes made to the THD stay made to that THD.  Thus, if the SQL executed sets any user variables, those variables are of course visible after query execution.

Solution  - encapsulate arbitrary SQL resultsets through a user variable

Since user variables are visible after query execution, the goal is to get the complete results of a query into a user variable, so that the resultset can be accessed from the server.  To accomplish this, first a method to get the results into the variable must be established, and then a data format for communication that is amenable to that method has to be decided upon, so that the resultset can be accessed conveniently.

With a little elbow grease, MySQL can convert any SELECT statement into a CSV resultset.  To do so, the following are used:


  1. SELECT ... INTO @user_variable

  2. A subquery in the FROM clause (for the original query)

  3. CONCAT, REPLACE, IFNULL, GROUP_CONCAT (to encode the resultset data)

Here is the SQL that the SHOW PASSWORD command uses to get the correct password:

select authentication_string as pw,
       user 
  from mysql.user 
 where concat(user,'@',host) = USER() 
    or user = USER() 
LIMIT 1

Here is the "injected" SQL that the database generates to encapsulate the SQL resultset as CSV:

select 
  group_concat( 
    concat('"',
           IFNULL(REPLACE(REPLACE(`pw`,'"','\\"'),"\n","\\n"),"\N"),
           '"|"',
           IFNULL(REPLACE(REPLACE(`user`,'"','\\"'),"\n","\\n"),"\N"),
           '"'
    ) 
  separator "\n"
  ) 
from 
  ( select authentication_string as pw,
           user 
      from mysql.user 
      where concat(user,'@',host) = USER() 
        OR user = USER() 
    LIMIT 1
  ) the_query 
into @sql_resultset ;
Query OK, 1 row affected (0.00 sec)

Here is the actual encapsulated resultset.  If there were more than one row, the rows would be newline-separated.

mysql> select @sql_resultset;
+----------------+
| @sql_resultset |
+----------------+
| ""|"root"      |
+----------------+
1 row in set (0.00 sec)

Injecting SQL in the server

With the ability to encapsulate resultsets as CSV in user variables, it is possible to create a cursor over the resultset data and access it in the server.  The MySQL 5.7 pre-parse rewrite plugins, however, still run inside the parser.  The THD is not "clean" with respect to being able to run a second query, and the parser is not re-entrant.  Because I want to run (perhaps many) queries between the time a user enters a query and the time the server actually answers it (perhaps with a different query than the user entered!), the MySQL 5.7 pre-parse rewrite plugin infrastructure doesn't work for me.

I modified the server instead, so that there is a hook in do_command() for query injections.  I called it, conveniently, query_injection_point(), and the goal is to make it a new plugin type, but I haven't written that code yet.  Here is the current signature of query_injection_point():


bool query_injection_point(
  THD* thd, COM_DATA *com_data, enum enum_server_command command,
  COM_DATA* new_com_data, enum enum_server_command* new_command );

It has essentially the same signature as dispatch_command(), but it provides the ability to replace the command, or keep it as is.  It returns true when the command has been replaced.

Because it is not yet pluggable, here is the code that I placed in the injection point:


/* TODO: make this pluggable */
bool query_injection_point(THD* thd, COM_DATA *com_data, enum enum_server_command command,
  COM_DATA* new_com_data, enum enum_server_command* new_command)
{
  /* example rewrite rule for SHOW PASSWORD */
  if(command != COM_QUERY)
  { return false; }

  /* convert query to upper case */
  std::locale loc;
  std::string old_query(com_data->com_query.query, com_data->com_query.length);
  for(unsigned int i = 0; i < com_data->com_query.length; ++i) {
    old_query[i] = std::toupper(old_query[i], loc);
  }

  if(old_query == "SHOW PASSWORD")
  {
    std::string new_query;
    SQLClient conn(thd);
    SQLCursor* stmt;
    SQLRow* row;

    /* run the internal query and read the first row of the resultset */
    if(conn.query("pw,user",
      "select authentication_string as pw,user from mysql.user " \
      "where concat(user,'@',host) = USER() or user = USER() LIMIT 1", &stmt))
    {
      if(stmt != NULL)
      {
        if((row = stmt->next()))
        {
          new_query = "SELECT '" + row->at(0) + "' as password_hash";
        }
      } else
      {
        return false;
      }
    } else {
      return false;
    }

    /* replace the command sent to the server */
    if(new_query != "")
    {
      Protocol_classic *protocol= thd->get_protocol_classic();
      protocol->create_command(
        new_com_data, COM_QUERY,
        (uchar *) strdup(new_query.c_str()),
        new_query.length()
      );
      *new_command = COM_QUERY;
    } else {
      if(stmt) delete stmt;
      return false;
    }
    if(stmt) delete stmt;
    return true;
  }

  /* don't replace command */
  return false;
}

SQLClient

You will notice that the code accesses the mysql.user table using SQL, via the SQLClient, SQLCursor, and SQLRow objects.  These objects wrap the work of encapsulating the SQL into a CSV resultset and actually accessing the result set.  The interface is very simple, as you can see from the example.  You create an SQLClient for a THD (one that is NOT already running a query!) and then you simply run queries and access the results.

The SQLClient uses a stored procedure to methodically encapsulate the SQL into CSV and then provides objects to access and iterate over the data that is buffered in the user variable.  Because MySQL 5.7 comes with the sys schema, I placed the stored procedure into it, as there is no other available default database that allows the creation of stored procedures.  I called it sys.sql_client().

Because the resultset is stored as text data, the SQLRow object returns all column values as std::string.

What's next?

I need to add a proper plugin type for "SQL injection plugins".  Then I need to work on a plugin for parallel queries.  Most of the work for that is already done, actually, at least to get it into an alpha quality state.  There is still quite a bit of work to be done though.

You can find the code in the internal_client branch of my fork of MySQL 5.7:

http://github.com/greenlion/warpsql-server



Netflix Billing Migration to AWS - Part II


This is a continuation in the series on Netflix Billing migration to the Cloud. An overview of the migration project was published earlier here. This post details the technical journey for the Billing applications and datastores as they were moved from the Data Center to AWS Cloud.

As you might have read in earlier Netflix Cloud Migration blogs, all of the Netflix streaming infrastructure is now completely run in the Cloud. At the rate Netflix was growing, especially with the imminent Netflix Everywhere launch, we knew we had to move Billing to the Cloud sooner rather than later, or else our existing legacy systems would not be able to scale.

There was no doubt that it would be a monumental task: moving highly sensitive applications and critical databases without disrupting the business, while at the same time continuing to build new business functionality and features.

A few key responsibilities and challenges for Billing:

  • The Billing team is responsible for the financially critical data in the company. The data we generate on a daily basis for subscription charges, gift cards, credits, chargebacks, etc. is rolled up to finance and is reported into the Netflix accounting. We have stringent SLAs on our daily processing to ensure that the revenue gets booked correctly for each day. We cannot tolerate delays in processing pipelines.
  • Billing has zero tolerance for data loss.
  • For most parts, the existing data was structured with a relational model and necessitates use of transactions to ensure an all-or-nothing behavior. In other words we need to be ACID for some operations. But we also had use-cases where we needed to be highly available across regions with minimal replication latencies.
  • Billing integrates with the DVD business of the company, which has a different architecture than the Streaming component, adding to the integration complexity.
  • The Billing team also provides data to support Netflix Customer Service agents to answer any member billing issues or questions. This necessitates providing Customer Support with a comprehensive view of the data.

The way the Billing systems were set up when we started this project is shown below.

[Figure: Billing architecture at the start of the project (Canvas 1.png)]

  • 2 Oracle databases in the Data Center - one storing the customer subscription information and the other storing the invoice/payment data.
  • Multiple REST-based applications - serving calls from the www.netflix.com and Customer Support applications. These were essentially doing the CRUD operations.
  • 3 Batch applications:
    • Subscription Renewal - a daily job that looks through the customer base to determine the customers to be billed that day and the amount to be billed, by looking at their subscription plans, discounts, etc.
    • Order & Payment Processor - a series of batch jobs that create an invoice to charge the customer to be renewed, and process the invoice through the various stages of the invoice lifecycle.
    • Revenue Reporting - a daily job that looks through billing data and generates reports for the Netflix Finance team to consume.
  • One Billing Proxy application (in the Cloud) - used to route calls from the rest of the Netflix applications in the Cloud to the Data Center.
  • Weblogic queues with legacy formats used for communication between processes.

The goal was to move all of this to the Cloud and not have any billing applications or databases in the Data Center. All this without disrupting the business operations. We had a long way to go!
The Plan

We came up with a 3-step plan to do it:
  • Act I - Launch new countries directly in the Cloud on the billing side while syncing the data back to the Data Center for legacy batch applications to continue to work.
  • Act II - Model the user-facing data, which could live with eventual consistency and does not need to be ACID, to persist to Cassandra (Cassandra gave us the ability to perform writes in one region and make it available in the other regions with very low latency. It also gives us high-availability across regions).
  • Act III - Finally move the SQL databases to the Cloud.
In each step and for each country migration, learn from it, iterate and improve on it to make it better.
Act I – Redirect new countries to the Cloud and sync data to the Data Center
Netflix was going to launch in 6 new countries soon. We decided to take it as a challenge to launch these countries partly in the Cloud on the billing side. What that meant was that the user-facing data and applications would be in the Cloud, but we would still need to sync data back to the Data Center so that some of our batch applications, which would continue to run in the Data Center for the time being, could work without disruption. The customer-facing data for these new countries would be served out of the Cloud, while the batch processing would still run out of the Data Center. That was the first step.
We ported all the APIs from the 2 user-facing applications to a Cloud based application that we wrote using Spring Boot and Spring Integration. With Spring Boot, we were able to quickly jump-start building a new application, as it provided the infrastructure and plumbing we needed to stand it up out of the box and let us focus on the business logic. With Spring Integration we were able to write once and reuse a lot of the workflow style code. Also with headers and header-based routing support that it provided, we were able to implement a pub-sub model within the application to put a message in a channel and have all consumers consume it with independent tuning for each consumer. We were now able to handle the API calls for members in the 6 new countries in any AWS region with the data stored in Cassandra. This enabled Billing to be up for these countries even if an entire AWS region went down – the first time we were able to see the power of being on the Cloud!

We deployed our application on EC2 instances in AWS in multiple regions. We added a redirection layer in our existing Cloud proxy application to switch billing calls for users in the new countries to go to the new billing APIs in the Cloud and billing calls for the users in the existing countries to continue to go to the old billing APIs in the Data Center. We opened direct connectivity from one of the AWS regions to the existing Oracle databases in the Data Center and wrote an application to sync the data from Cassandra via SQS in the 3 regions back to this region. We used SQS queues and Dead Letter Queues (DLQs) to move the data between regions and process failures.
New country launches usually mean a bump in the member base. We knew we had to move our Subscription Renewal application from the Data Center to the Cloud so that we didn’t put the load on the Data Center one. So for these 6 new countries in the Cloud, we wrote a crawler that went through all the customers in Cassandra daily and came up with the members who were to be charged that day. This all-row-iterator approach would work for now for these countries, but we knew it wouldn’t hold ground when we migrated the other countries, and especially the US data (which had the majority of our members at that time), to the Cloud. But we went ahead with it for now to test the waters. This would be the only batch application that we would run from the Cloud in this stage.
We had chosen Cassandra as our data store to be able to write from any region and due to the fast replication of the writes it provides across regions. We defined a data model where we used the customerId as the key for the row and created a set of composite Cassandra columns to enable the relational aspect of the data. The picture below depicts the relationship between these entities and how we represented them in a single column family in Cassandra. Designing them to be a part of a single column family helped us achieve transactional support for these related entities.


We designed our application logic such that we read once at the beginning of any operation, updated objects in memory and persisted it to a single column family at the end of the operation. Reading from Cassandra or writing to it in the middle of the operation was deemed an anti-pattern.  We wrote our own custom ORM using Astyanax (a Netflix grown and open-sourced Cassandra client) to be able to read/write the domain objects from/to Cassandra.

We launched in the new countries in the Cloud with this approach and after a couple of initial minor issues and bug fixes, we stabilized on it. So far so good!
The Billing system architecture at the end of Act I was as shown below:
[Figure: Billing architecture at the end of Act I (Canvas 2.png)]
Act II – Move all applications and migrate existing countries to the cloud
With Act I done successfully, we started focusing on moving the rest of the apps to the Cloud without moving the databases. Most of the business logic resides in the batch applications, which had matured over years and that meant digging into the code for every condition and spending time to rewrite it. We could not simply forklift these to the Cloud as is. We used this opportunity to remove dead code where we could, break out functional parts into their own smaller applications and restructure existing code to scale. These legacy applications were coded to read from config files on disk on startup and use other static resources like reading messages from Weblogic queues -  all anti-patterns in the Cloud due to the ephemeral nature of the instances. So we had to re-implement those modules to make the applications Cloud-ready. We had to change some APIs to follow an async pattern to allow moving the messages through the queues to the region where we had now opened a secure connection to the Data Center.
The Cloud Database Engineering (CDE) team setup a multi node Cassandra cluster for our data needs. We knew that the all row Cassandra iterator Renewal solution that we had implemented for renewing customers from earlier 6 countries would not scale once we moved the entire Netflix member billing data to Cassandra. So we designed a system to use Aegisthus to pull the data from Cassandra SSTables and convert it to JSON formatted rows that were staged out to S3 buckets. We then wrote Pig scripts to run mapreduce on the massive dataset everyday to fetch customer list to renew and charge for that day. We also wrote Sqoop jobs to pull data from Cassandra and Oracle and write to Hive in a queryable format which enabled us to join these two datasets in Hive for faster troubleshooting.

To enable DVD servers to talk to us in the Cloud, we setup load balancer endpoints (with SSL client certification) for DVD to route calls to us through the Cloud proxy, which for now would pipe the call back to the Data Center, until we migrated US. Once US data migration was done, we would sever the Cloud to Data Center communication link.

To validate this huge data migration, we wrote a comparator tool to compare and validate the data that was migrated to the Cloud, with the existing data in the Data Center. We ran the comparator in an iterative format, where we were able to identify any bugs in the migration, fix them, clear out the data and re-run. As the runs became clearer and devoid of issues, it increased our confidence in the data migration. We were excited to start with the migration of the countries. We chose a country with a small Netflix member base as the first country and migrated it to the Cloud with the following steps:

  • Disable the non-GET APIs for the country under migration. (This would not impact members, but would delay any updates to subscriptions in billing.)
  • Use Sqoop jobs to get the data from Oracle to S3 and Hive.
  • Transform it to the Cassandra format using Pig.
  • Insert the records for all members for that country into Cassandra.
  • Enable the non-GET APIs to now serve data from the Cloud for the country that was migrated.

After validating that everything looked good, we moved to the next country. We then ramped up to migrate set of similar countries together. The last country that we migrated was US, as it held most of our member base and also had the DVD subscriptions. With that, all of the customer-facing data for Netflix members was now being served through the Cloud. This was a big milestone for us!
After Act II, we were looking like this:
[Figure: Billing architecture at the end of Act II (Canvas 3.png)]
Act III  – Good bye Data Center!
Now the only (and most important) thing remaining in the Data Center was the Oracle database. The dataset that remained in Oracle was highly relational, and we did not feel it was a good idea to model it into a NoSQL-esque paradigm. It was not possible to structure this data as a single column family as we had done with the customer-facing subscription data. So we evaluated Oracle and Aurora RDS as possible options. Licensing costs for Oracle as a Cloud database, and Aurora still being in Beta, didn’t help make the case for either of them.

While the Billing team was busy in the first two acts, our Cloud Database Engineering team was working on creating the infrastructure to migrate billing data to MySQL instances on EC2. By the time we started Act III, the database infrastructure pieces were ready, thanks to their help. We had to convert our batch application code base to be MySQL-compliant since some of the applications used plain jdbc without any ORM. We also got rid of a lot of the legacy pl-sql code and rewrote that logic in the application, stripping off dead code when possible.
Our database architecture now consists of a MySQL master database deployed on EC2 instances in one of the AWS regions. We have a Disaster Recovery DB that gets replicated from the master and will be promoted to master if the master goes down. And we have slaves in the other AWS regions for read only access to applications.
Our Billing Systems, now completely in the Cloud, look like this:
[Figure: Billing systems fully in the Cloud (Canvas 4.png)]
Needless to say, we learned a lot from this huge project. We wrote a few tools along the way to help us debug/troubleshoot and improve developer productivity. We got rid of old and dead code, cleaned up some of the functionality and improved it wherever possible. We received support from many other engineering teams within Netflix. Engineers from Cloud Database Engineering, Subscriber and Account Engineering, Payments Engineering and Messaging Engineering worked with us on this initiative for anywhere between 2 weeks and a couple of months. The great thing about the Netflix culture is that everyone has one goal in mind: to deliver a great experience for our members all over the world. If that means helping the Billing solution move to the Cloud, then everyone is ready to do that irrespective of team boundaries!
The road ahead …
With Billing in the Cloud, the Netflix streaming infrastructure now runs completely in the Cloud. We can scale any Netflix service on demand, do predictive scaling based on usage patterns, do single-click deployments using Spinnaker, and have consistent deployment architectures between various Netflix applications. The Billing infrastructure can now make use of all the Netflix platform libraries and frameworks for monitoring and tooling support in the Cloud. Today we support billing for over 81 million Netflix members in 190+ countries. We generate and churn through terabytes of data every day to accomplish billing events. Our road ahead includes rearchitecting membership workflows for global scale and business challenges. As part of our new architecture, we will be redefining our services to scale natively in the Cloud. With the global launch, we have an opportunity to learn and redefine Billing and Payment methods in newer markets and integrate with many global partners and local payment processors in the regions. We are looking forward to architecting more functionality and scaling out further.

If you’d like to design and implement large-scale distributed systems for critical data and build automation/tooling for testing them, we have a couple of positions open and would love to talk to you! Check out the positions here:



The Uber Engineering Tech Stack, Part II: The Edge and Beyond

Why Uber Engineering Switched from Postgres to MySQL

Testing Samsung storage in tpcc-mysql benchmark of Percona Server

This blog post will detail the results of Samsung storage in the tpcc-mysql benchmark using Percona Server.

I had an opportunity to test different Samsung storage devices under tpcc-mysql benchmark powered by Percona Server 5.7. You can find a summary with details here https://github.com/Percona-Lab-results/201607-tpcc-samsung-storage/blob/master/summary-tpcc-samsung.md

I have in my possession:

  • Samsung 850 Pro, 2TB: This is a SATA device and is positioned as consumer-oriented, something that you would use in a high-end user desktop. As of this post, I estimate the price of this device as around $430/TB.
  • Samsung SM863, 1.92TB: this device is also SATA, and positioned for server usage. The current price is about $600/TB.
  • Samsung PM1725, 800GB: This is an NVMe device, in a 2.5″ form factor, but it requires a connection to a PCIe slot, which I had to allocate in my server. The device is high-end, oriented for server-side and demanding workloads. The current price is about $1300/TB.

I am going to use 1000 warehouses in the tpcc-mysql benchmark, which corresponds roughly to a data size of 100GB.
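
For context, loading and running tpcc-mysql at this scale typically looks something like the following (a sketch; the host, credentials, concurrency and run length are illustrative):

$ ./tpcc_load -h 127.0.0.1 -d tpcc1000 -u root -p mypassword -w 1000
$ ./tpcc_start -h 127.0.0.1 -P 3306 -d tpcc1000 -u root -p mypassword -w 1000 -c 32 -r 300 -l 3600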

This benchmark varies the innodb_buffer_pool_size from 5GB to 115GB. With a 5GB buffer pool size, only a very small portion of the data fits into memory, which results in intensive foreground IO reads and intensive background IO writes. With 115GB, almost all data fits into memory, which results in very few (or almost zero) IO reads and moderate background IO writes.

Buffer pool sizes in the middle of the interval produce a mix of IO reads and writes between these extremes. For example, we can see the read-to-write ratio on the chart below (obtained for the PM1725 device) with different buffer pool sizes:

[Chart: read and write IOPS vs. buffer pool size on the PM1725]

We can see that for the 5GB buffer pool size we have 56,000 read IOPS and 32,000 write IOPS. For 115GB, the reads are minimal at about 300 IOPS, and the background writes are at the 20,000 IOPS level. Reads gradually decline with increasing buffer pool size.

The charts are generated with the Percona Monitoring and Management tools.

Results

Let’s review the results. The first chart shows measurements taken every second, allowing us to see the trends and stalls.

[Chart: NOTPM over time, one-second resolution]

If we take averages, the results are:

[Chart: average NOTPM per buffer pool size]

In table form (the results are in new order transactions per minute (NOTPM)):

bp, GB   pm1725     sam850    sam863     pm1725/sam863  pm1725/sam850
5        42427.57   1931.54   14709.69   2.88           21.97
15       78991.67   2750.85   31655.18   2.50           28.72
25       108077.56  5156.72   56777.82   1.90           20.96
35       122582.17  8986.15   93828.48   1.31           13.64
45       127828.82  12136.51  123979.99  1.03           10.53
55       130724.59  19547.81  127971.30  1.02           6.69
65       131901.38  27653.94  131020.07  1.01           4.77
75       133184.70  38210.94  131410.40  1.01           3.49
85       133058.50  39669.90  131657.16  1.01           3.35
95       133553.49  39519.18  132882.29  1.01           3.38
105      134021.26  39631.03  132126.29  1.01           3.38
115      134037.09  39469.34  132683.55  1.01           3.40

Conclusion

The Samsung 850 obviously can’t keep up with the more advanced SM863 and PM1725. The PM1725 shows the greatest benefit with smaller buffer pool sizes. With large amounts of memory, there is practically no difference from the SM863. The reason is that with big buffer pool sizes, MySQL does not push the IO subsystem hard enough to use all of the PM1725’s performance.

For reference, the my.cnf file is:

[mysqld]
datadir=/var/lib/mysql
socket=/tmp/mysql.sock
ssl=0
symbolic-links=0
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
# general
thread_cache_size=2000
table_open_cache = 200000
table_open_cache_instances=64
back_log=1500
query_cache_type=0
max_connections=4000
# files
innodb_file_per_table
innodb_log_file_size=15G
innodb_log_files_in_group=2
innodb_open_files=4000
innodb_io_capacity=10000
loose-innodb_io_capacity_max=12000
innodb_lru_scan_depth=1024
innodb_page_cleaners=32
# buffers
innodb_buffer_pool_size= 200G
innodb_buffer_pool_instances=8
innodb_log_buffer_size=64M
# tune
innodb_doublewrite= 1
innodb_support_xa=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit= 1
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
join_buffer_size=32K
sort_buffer_size=32K
innodb_use_native_aio=0
innodb_stats_persistent = 1
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 8
innodb_purge_threads=4
innodb_adaptive_hash_index=0
innodb_change_buffering=none
loose-innodb-log_checksum-algorithm=crc32
loose-innodb-checksum-algorithm=strict_crc32
loose-innodb_sched_priority_cleaner=39
loose-metadata_locks_hash_instances=256



New Webinar Trilogy: The MySQL Query Tuning Deep-Dive


Following our popular webinar on MySQL database performance tuning, we’re excited to introduce a new webinar trilogy dedicated to MySQL query tuning.

This is an in-depth look into the ins and outs of optimising MySQL queries conducted by Krzysztof Książek, Senior Support Engineer at Severalnines.

When done right, tuning MySQL queries and indexes can significantly increase the performance of your application as well as decrease response times. This is why we’ll be covering this complex topic over the course of three webinars of 60 minutes each.

Dates

Part 1: Query tuning process and tools

Tuesday, August 30th
Register

Part 2: Indexing and EXPLAIN - deep dive

Tuesday, September 27th
Register

Part 3: Working with the optimizer and SQL tuning

Tuesday, October 25th
Register

Agenda

Part 1: Query tuning process and tools

  • Query tuning process
    • Build
    • Collect
    • Analyze
    • Tune
    • Test
  • Tools
    • tcpdump
    • pt-query-digest

Part 2: Indexing and EXPLAIN - deep dive

  • How B-Tree indexes are built
  • Indexes - MyISAM vs. InnoDB
  • Different index types
    • B-Tree
    • Fulltext
    • Hash
  • Indexing gotchas
  • EXPLAIN walkthrough - query execution plan

Part 3: Working with the optimizer and SQL tuning

  • Optimizer
    • How execution plans are calculated
    • InnoDB statistics
  • Hinting the optimizer
    • Index hints
    • JOIN order modifications
    • Tweakable optimizations
  • Optimizing SQL

Speaker

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience in managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. He’s the main author of the Severalnines blog and webinar series: Become a MySQL DBA.



Monitoring MongoDB with Nagios

In this blog, we’ll discuss monitoring MongoDB with Nagios.

There is a significant amount of talk around graphing MongoDB metrics using things like Prometheus, Datadog, New Relic, and Ops Manager from MongoDB Inc. However, I haven’t noticed a lot of talk around “What MongoDB alerts should I be setting up?”

While building out Percona’s remote DBA service for MongoDB, I looked at Prometheus’s AlertManager. After reviewing it, I’m not sure it’s quite ready to be used exclusively. We needed to decide quickly whether there were better Nagios checks on the market, or whether I needed to write my own.

In the end, we settled on a hybrid approach. There are some good frameworks, but we needed to create or tweak some of the things needed for an “SEV 1”- or “SEV 2”-type issue (which are most important to me). One of the most common problems for operations, DevOps and DBA teams, and most of engineering, is alert spam. As such, I wanted to be very careful to only alert on the things pointing to immediate danger or current outages. As a result, we have now added pmp-check-mongo.py to the GitHub repository for the Percona Monitoring Plugins. Since we use Grafana and Prometheus for metrics and graphing, there are no accompanying Cacti information templates. In the future, we’ll need to decide how this will change PMP over time. In the meantime, we wanted to make the tool available now and worry about some of those issues later on.

As part of this push, I want to give you some real world examples of how you might use this tool. There are many options available to you, and Nagios is still a bit green in regards to making those options as user-friendly as our tools are.

Usage: pmp-check-mongo.py [options]
Options:
  -h, --help                         show this help message and exit
  -H HOST, --host=HOST               The hostname you want to connect to
  -P PORT, --port=PORT               The port mongodb is running on
  -u USER, --user=USER               The username you want to login as
  -p PASSWD, --password=PASSWD       The password you want to use for that user
  -W WARNING, --warning=WARNING      The warning threshold you want to set
  -C CRITICAL, --critical=CRITICAL   The critical threshold you want to set
  -A ACTION, --action=ACTION         The action you want to take. Valid choices are
                                     (check_connections, check_election, check_lock_pct,
                                     check_repl_lag, check_flushing, check_total_indexes,
                                     check_balance, check_queues, check_cannary_test,
                                     check_have_primary, check_oplog, check_index_ratio,
                                     check_connect) Default: check_connect
  -s SSL, --ssl=SSL                  Connect using SSL
  -r REPLICASET, --replicaset=REPLICASET    Connect to replicaset
  -c COLLECTION, --collection=COLLECTION    Specify the collection in check_cannary_test
  -d DATABASE, --database=DATABASE          Specify the database in check_cannary_test
  -q QUERY, --query=QUERY                   Specify the query, only used in check_cannary_test
  --statusfile=STATUS_FILENAME      File to current store state data in for delta checks
  --backup-statusfile=STATUS_FILENAME_BACKUP    File to previous store state data in for delta checks
  --max-stale=MAX_STALE             Age of status file to make new checks (seconds)

There seems to be a huge amount going on here, but let’s break it down into a few categories:

  • Connection options
  • Actions
  • Action options
  • Status options

Hopefully, this takes some of the scariness out of the script above.

Connection options
  • Host / Port Number
    • Pretty simple, this is just the host you want to connect to and what TCP port it is listening on.
  • Username and Password
    • Like with Host/Port, this is some of your normal and typical Mongo connection field options. If you do not set both the username and password, the system will assume auth was disabled.
  • SSL
    • This is mostly around the old SSL support in Mongo clients (which was a boolean). This tool needs updating to support the more modern SSL connection options. Use this as a “deprecated” feature that might not work on newer versions.
  • ReplicaSet
    • Very particular option that is only used for a few checks and verifies that the connection uses a replicaset connection. Using this option lets the tool automatically find a primary node for you, and is helpful to some checks specifically around replication and high availability (HA):
      • check_election
      • check_repl_lag
      • check_cannary_test
      • check_have_primary
      • check_oplog
Actions and what they mean
  • check_connections
    • This parameter refers to memory usage, but beyond that you need to know if your typical connections suddenly double. This indicates something unexpected happened in the application or database and caused everything to reconnect. It often takes up to 10 minutes for those old connections to go away.
  • check_election
    • This uses the status file options we will cover in a minute, but it checks to see if the primary from the last check differs from the currently found primary. If so, it alerts. This check should have a threshold of only one before it alarms (as an alert means an HA event occurred).
  • check_lock_pct
    • MMAP only, this engine has a write lock on the whole collection/database depending on the version. This is a crucial metric to determine if MMAP writes are blocking reads, meaning you need to scale the DB layer in some way.
  • check_repl_lag
    • Checks the replication stream to understand how far a given node lags behind the primary. To accomplish this, it uses a fake record in the test DB to cause a write. Without this, a read-only system would look artificially lagged, as no new oplog entries would get created.
  • check_flushing
    • A common issue with MongoDB is very long flush times, causing a system halt. This is caused by your disk subsystem not keeping up, and then the DB having to wait on flushing to make sure writes get correctly journaled.
  • check_total_indexes
    • The more indexes you have, the more the planner has to work to determine which index is a good fit. This increases the risk that the recovery of a failure will take a long time. This is due to the way a restore builds indexes and how MongoDB can only make one index at a time.
  • check_balance
    • While MongoDB should keep things in balance across a cluster, many things can happen: jumbo chunks, a disabled balancer, the balancer constantly attempting to move the same chunk but failing, and even adding/removing shards. This alert is for these cases, as an imbalance means some records might get served faster than others. It is purely based on the chunk count that the MongoDB balancer also uses, which is not necessarily the same as disk usage.
  • check_queues
    • No matter what engine you have selected, a backlog of sustained reads or writes indicates your DB layer is unable to keep up with demand. It is important in these cases to send an alert if the rate is maintained. You might notice this is also in our Prometheus exporter for graphics as both trending and alerting are necessary to watch in a MongoDB system.
  • check_cannary_test
    • This is a typical query for the database and then used to set critical/warning levels based on the latency of the returned query. While not as accurate as full synthetic transactions, queries through the application are good to measure response time expectations and SLAs.
  • check_have_primary
    • If we had an HA event but failed to get back up quickly, it’s important to know if a new primary is causing writes to error on the system. This check simply determines if the replica set has a primary, which means it can handle reads and writes.
  • check_oplog
    • This check is all about how much oplog history you have, which is much like measuring how much history you have in MySQL binlogs. The reason this is important is that when recovering from a backup and performing a point-in-time recovery, you can use the current oplog if the oldest timestamp in the oplog is newer than the backup timestamp. As a result, the threshold is normally three times the backup interval you use, to guarantee that you have plenty of time to find the newest backup and then do the recovery.
  • check_index_ratio
    • This is an older metric that modern MongoDB versions will not find useful, but in the past, it was a good way to understand the percentage of queries not handled by an index.
  • check_connect
    • A very basic check to ensure it can connect (and optionally login) to MongoDB and verify the server is working.
Status File options

These options rarely need to be changed but are present in case you want to store the status on an SHM mount point to avoid actual disk writes.

  • statusfile
    • This is where a copy of the current rs.status, serverStatus and other command data is stored.
  • backup-statusfile
    • Like statusfile, but the status file is moved here when a new check is done. These two objects can then be compared to find the delta between two checkpoints.
  • max-stale
    • This is the maximum age for which an old status file is still valid. Deltas older than this aren’t allowed; this protects the system from wrong assumptions when a status file is hours or days old.

If you have any questions about how to use these parameters, feel free to let us know. In the code, there is also a defaults dictionary for most of these options, so in many cases setting the warning and critical levels is not needed.
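
For illustration, here is roughly how a couple of these checks might be invoked from the command line or wired into a Nagios command definition (the host, replica set name and thresholds below are made up):

$ ./pmp-check-mongo.py -H db1.example.com -P 27017 -A check_connect
$ ./pmp-check-mongo.py -H db1.example.com -P 27017 -r rs0 -A check_repl_lag -W 300 -C 600
$ ./pmp-check-mongo.py -H db1.example.com -P 27017 -A check_queues -W 50 -C 200

As usual for Nagios plugins, the warning (-W) and critical (-C) thresholds determine the exit code that Nagios turns into an alert.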

