Channel: Planet MySQL
Find and remove duplicate indexes

Having duplicate keys in our schemas can hurt the performance of our database:

    • They slow down the optimizer phase because MySQL needs to examine more query plans.
    • The storage engine needs to maintain, calculate and update more index statistics.
    • DML and even read queries can be slower because MySQL needs to update and fetch more data into the Buffer Pool for the same load.
    • Our data needs more disk space, so our backups will be bigger and slower.

    In this post I’m going to explain the different types of duplicate indexes and how to find and remove them.


    Duplicate keys on the same column

    This is the easiest one. You can create multiple indexes on the same column and MySQL won’t complain. Let’s see this example:

    mysql> alter table t add index(name);
    mysql> alter table t add index(name);
    mysql> alter table t add index(name);

    mysql> show create table t\G
    [...]
    KEY `name` (`name`),
    KEY `name_2` (`name`),
    KEY `name_3` (`name`)
    [...]

    MySQL detects that the ‘name’ index already exists, so it creates each new index with a number appended to its name. This type of duplicate is easy to find and easy to avoid. How? Just specify the index name, and MySQL will refuse to create a second index with the same name:

    mysql> alter table t add index key_for_first_name(name);
    Query OK, 0 rows affected (0.01 sec)
    mysql> alter table t add index key_for_first_name(name);
    ERROR 1061 (42000): Duplicate key name 'key_for_first_name'

    Using custom index names is a good practice: it prevents accidental duplicates and gives the indexes more meaningful names that make them easier to identify.
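For indexes that already exist, one way to spot exact duplicates is to compare the ordered column lists stored in information_schema.statistics. The query below is only a sketch: the schema name 'test' and the aliases a, b and cols are assumptions made for illustration, and it ignores prefix lengths (sub_part) and index types, so treat the result as a list of candidates to review rather than a list of indexes to drop.

```sql
-- Build the ordered column list of every index, then pair up indexes
-- on the same table whose column lists are identical.
SELECT a.table_name,
       a.index_name AS index_a,
       b.index_name AS index_b,
       a.cols
FROM (
    SELECT table_schema, table_name, index_name,
           GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
    FROM information_schema.statistics
    WHERE table_schema = 'test'            -- assumed schema name
    GROUP BY table_schema, table_name, index_name
) AS a
JOIN (
    SELECT table_schema, table_name, index_name,
           GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
    FROM information_schema.statistics
    WHERE table_schema = 'test'            -- assumed schema name
    GROUP BY table_schema, table_name, index_name
) AS b
  ON  a.table_schema = b.table_schema
  AND a.table_name   = b.table_name
  AND a.cols         = b.cols
  AND a.index_name   < b.index_name;       -- report each pair only once
```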


    Redundant keys on composite indexes

    Let’s start with an example:

    mysql> show create table t;
    [...]
    KEY `key_name` (`name`),
    KEY `key_name_2` (`name`,`age`)

    The redundant index is the one on the ‘name’ column. To benefit from a composite index MySQL doesn’t need to use all of its columns; the leftmost prefix is enough. For example, an index on columns (A,B,C) can be used to satisfy the combinations (A), (A,B) and (A,B,C), but not (B) or (B,C).

    Therefore, in the previous example the index ‘key_name’ is redundant and ‘key_name_2’ is enough in most cases. Sometimes a redundant key makes sense, for example when the full index is much longer than the short one, so that keeping both the long and the short index can be good for query execution.
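The leftmost-prefix rule can also be checked against the data dictionary. The following sketch lists every index whose column list is a strict leftmost prefix of another index on the same table. The schema name 'test' and the aliases s, l and cols are assumptions for illustration. The query does not look at uniqueness, prefix lengths (sub_part) or index types, so a PRIMARY KEY or UNIQUE index reported as the shorter one still enforces a constraint and should not be dropped on this basis alone.

```sql
-- s = candidate "shorter" index, l = "longer" index on the same table.
SELECT s.table_name,
       s.index_name AS shorter_index,
       s.cols       AS shorter_cols,
       l.index_name AS longer_index,
       l.cols       AS longer_cols
FROM (
    SELECT table_schema, table_name, index_name,
           GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
    FROM information_schema.statistics
    WHERE table_schema = 'test'            -- assumed schema name
    GROUP BY table_schema, table_name, index_name
) AS s
JOIN (
    SELECT table_schema, table_name, index_name,
           GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
    FROM information_schema.statistics
    WHERE table_schema = 'test'            -- assumed schema name
    GROUP BY table_schema, table_name, index_name
) AS l
  ON  s.table_schema = l.table_schema
  AND s.table_name   = l.table_name
  AND s.index_name  <> l.index_name
  -- The longer list must start with the shorter list followed by a comma,
  -- which also excludes indexes with identical column lists.
  AND LEFT(l.cols, CHAR_LENGTH(s.cols) + 1) = CONCAT(s.cols, ',');
```

The comparison uses LEFT() instead of LIKE because an underscore in a column name would act as a wildcard in a LIKE pattern.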


    Redundant suffixes on clustered index

    InnoDB uses a clustered index, which means that every secondary key also contains the primary key columns. Let’s see the following example:

    mysql> show create table t;
    [...]
    PRIMARY KEY (`i`),
    KEY `key_name` (`name`,`i`)

    In this example the index ‘key_name’ explicitly includes the primary key column, so that last column ‘i’ is usually unnecessary in the secondary key because InnoDB already stores it there. I said “usually” because there are some cases where the redundant column can be useful:

    SELECT * FROM t WHERE name='kahxailo' AND i > 100000;

    With an index on (name) the execution plan is Using intersect(name,PRIMARY); Using where, and with an index on (name,i) the execution plan changes to Using where. There is also a big difference in the number of Handler_read_next requests: with the (name,i) index less data needs to be read.

    It’s also worth mentioning that some people tend to create a UNIQUE(i) key on top of the primary key because they think that a column defined as PRIMARY KEY also has to be defined as UNIQUE to ensure there are no duplicates. That’s not necessary because a PRIMARY KEY is already UNIQUE.
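One way to reproduce this kind of comparison is to reset the session status counters, run the query, and then read the Handler_read% counters. This is only a sketch: it assumes the table t and the literal values from the example above, and FLUSH STATUS requires the RELOAD privilege.

```sql
-- Reset the session status counters so the numbers below reflect only this query.
FLUSH STATUS;

SELECT * FROM t WHERE name = 'kahxailo' AND i > 100000;

-- Show how many rows were read through each handler access method.
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Check which index (or index merge) the optimizer chose.
EXPLAIN SELECT * FROM t WHERE name = 'kahxailo' AND i > 100000;
```

Running it once with each index definition and comparing the counters shows whether the extra column pays off for your own data.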


    How can I find all those keys?

    There is a tool in Percona Toolkit that can help you find all those keys in your schema. Its name is pt-duplicate-key-checker and it can find the three types of keys explained above. Let’s try it:

    mysql> show create table t;
    [...]
    PRIMARY KEY (`i`),
    KEY `name` (`name`,`i`),
    KEY `name_2` (`name`),
    KEY `name_3` (`name`),
    KEY `name_4` (`name`,`age`),
    KEY `age` (`age`,`name`,`i`)

    I run the tool on the ‘test’ database:

    root@debian:~# pt-duplicate-key-checker --database=test
    # ########################################################################
    # test.t
    # ########################################################################

    # name_2 is a left-prefix of name_4
    [...]
    # To remove this duplicate index, execute:
    ALTER TABLE `test`.`t` DROP INDEX `name_2`;

    # name_3 is a left-prefix of name_4
    [...]
    # To remove this duplicate index, execute:
    ALTER TABLE `test`.`t` DROP INDEX `name_3`;

    # Key name ends with a prefix of the clustered index
    [...]
    # To shorten this duplicate clustered index, execute:
    ALTER TABLE `test`.`t` DROP INDEX `name`, ADD INDEX `name` (`name`);

    # Key age ends with a prefix of the clustered index
    [...]
    # To shorten this duplicate clustered index, execute:
    ALTER TABLE `test`.`t` DROP INDEX `age`, ADD INDEX `age` (`age`,`name`);

    The tool tells us why each key is considered a duplicate and gives us the SQL command to solve the problem.


    Conclusion

    Indexes are good for our queries, but too many indexes, or duplicate or redundant ones, can hurt performance. Checking the schemas periodically to catch these duplicates helps us maintain good overall performance. Before removing indexes in production, test the change in a testing environment and check the performance of your queries with tools like pt-upgrade. As we saw in the previous sections, redundant keys can sometimes help us improve the execution time and decrease the amount of data read.

