A Case for Write Optimizations in MySQL

As a storage engine developer, I am excited for MySQL 5.6. Looking at http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html, there has been plenty of work done to improve the performance of reads in MySQL for all storage engines (provided they take advantage of the new APIs).

What would be great to add is API improvements to increase the performance of writes, and more specifically, updates. For many applications that perform updates, such as applications that do click counting or impression counting, there are significant opportunities for improving write performance.

Take the following example of click counting (or impression counting). You have a website and want to save the number of times links on your website have been clicked. Your table may look something like:

create table num_clicks( link_id int, num_clicks int);

To update the number of clicks, you do something like:

insert into num_clicks (LINK_ID, 1) on duplicate key update set num_clicks=num_clicks+1;

With MySQL as it currently works, this is slower than it needs to be, as I explained here. At a high level, the reason is that MySQL forces the storage engine to check in the table if a value exists for LINK_ID. If a row is returned, MySQL performs the increment away from the storage engine, and passes a new row to the storage engine for an update. The check incurs a disk seek, which is very costly in terms of latency. Disks can do only hundreds of seeks per second. Furthermore, NoSQL solutions based on B-trees are similarly limited and can’t be significantly accelerated because updates incur disk I/O.

However, with some changes to MySQL, a storage engine can take advantage of this knowledge to improve its algorithms. All that’s needed is for the storage engine to know that the user wants to perform an insert or to perform this particular update, as opposed to getting individual handler calls of write_row, index_read, and update_row (which is the current design). Hence, what’s needed is a way for the storage engine layer to be able to apply updates on its own.

This change can help all storage engines. Although I am not an expert in MySQL Cluster, I imagine reducing these individual handler calls also helps MySQL Cluster avoid network hops to retrieve information. For in-memory databases, performance may increase due to reducing the number of calls made by the handler. InnoDB can potentially use its insertion buffer to store the “insert … on duplicate key update” operation, thereby giving the operation the same boost insertions into secondary keys get. For TokuDB, we estimate that these types of updates aided by this additional information could run much faster. In future posts, I will expand on how we think TokuDB can do this.

PlanetMySQL Voting: Vote UP / Vote DOWN

A Case for Write Optimizations in MySQL

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112