A few days ago, one of our interns at @mydbops reached out to me with a SQL query. According to the execution plan, the query scans only one row, yet it does not seem to perform optimally.
Below is the SQL query and its explain plan (MySQL 5.7).
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
Execution plan and table structure
explain select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: user_roles
   partitions: NULL
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 1
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

show create table user_roles\G
*************************** 1. row ***************************
       Table: user_roles
Create Table: CREATE TABLE `user_roles` (
  `role_id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Role ID',
  `role` varchar(255) DEFAULT NULL COMMENT 'Role',
  `username` varchar(255) DEFAULT NULL COMMENT 'Username',
  PRIMARY KEY (`role_id`)
) ENGINE=InnoDB;
The execution plan estimates that the query scans just one row (rows: 1). Per the MySQL Reference Manual, the rows column indicates the number of rows MySQL believes it must examine to execute the query.
Let us review the slow query log file too
# Query_time: 0.261564  Lock_time: 0.000181  Rows_sent: 1  Rows_examined: 486271  Rows_affected: 0
# Bytes_sent: 174
SET timestamp=1584903045;
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
The slow query log shows a very high Rows_examined count. Unlike the rows column in the execution plan, Rows_examined is the actual number of rows read, and here it equals the total count of records in the table. So the query is actually doing a FULL TABLE SCAN (the PRIMARY key is InnoDB's clustered index, so a full scan of it reads the whole table), even though the optimiser chose the primary key, to satisfy the ORDER BY clause, as the optimal index.
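One way to confirm the full scan on MySQL 5.7 (which has no EXPLAIN ANALYZE) is to compare the session Handler counters before and after running the query. A minimal sketch:

FLUSH STATUS;  -- reset the session status counters
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler_read%';
-- A Handler_read_prev close to the table row count (~486k here) confirms the
-- backward scan of the whole PRIMARY key that the DESC order triggers.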
Let us validate the query cost via EXPLAIN with FORMAT=JSON.
explain format=JSON select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1\G
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "65861.40"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "user_roles",
        "access_type": "index",
        "key": "PRIMARY",
        "used_key_parts": [
          "role_id"
        ],
        "key_length": "4",
        "rows_examined_per_scan": 1,
        "rows_produced_per_join": 32434,
        "filtered": "10.00",
        "cost_info": {
          "read_cost": "59374.56",
          "eval_cost": "6486.84",
          "prefix_cost": "65861.40",
          "data_read_per_join": "16M"
        },
        "used_columns": [
          "role_id",
          "role",
          "username"
        ],
        "attached_condition": "(`user_roles`.`username` = '9977223389')"
      }
    }
  }
}
1 row in set, 1 warning (0.00 sec)
Though the access type is index and the plan claims a single row will be examined via the PRIMARY key, the query cost (65861.40) is huge. The execution plan is misleading in this case: because of the LIMIT 1, the optimiser assumes a row matching the WHERE clause will turn up almost immediately while walking the primary key in descending order, so it reports rows: 1, but in reality it may have to scan the entire table before finding a match.
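To see why the optimiser made this choice, the optimizer trace available in MySQL 5.7 can be enabled for the session. A quick sketch (in the trace JSON, look for the section where access paths are reconsidered for index ordering, which records the LIMIT-driven switch to the PRIMARY key):

SET SESSION optimizer_trace = 'enabled=on';
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
SELECT TRACE FROM information_schema.OPTIMIZER_TRACE\G
SET SESSION optimizer_trace = 'enabled=off';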
A better index for this query is a composite index on (username, role_id): the equality on username comes first, and role_id serves the ORDER BY. Appending the role column makes it a covering index as well, so the whole query can be answered from the index alone, as shown below.
create index idx_combo on user_roles(username,role_id,role);
Query OK, 0 rows affected (2.80 sec)
Records: 0  Duplicates: 0  Warnings: 0

explain select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
| id | select_type | table      | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | user_roles | NULL       | ref  | idx_combo     | idx_combo | 258     | const |    1 |   100.00 | Using where; Using index |
+----+-------------+------------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

## Format = JSON

EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1.20"
    },
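Re-running the Handler counter check from earlier should now show a handful of reads instead of hundreds of thousands (exact counters may vary with the data):

FLUSH STATUS;
select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler_read%';
-- Expect Handler_read_key around 1 (a single dive into idx_combo)
-- rather than ~486k rows walked on the PRIMARY key.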
The query now performs optimally, and the cost drops from 65861.40 to 1.20. The optimiser's row estimate can be misleading when an ORDER BY on the primary key is combined with LIMIT. Now our intern understands how such a query can be analysed, optimised, and dug into further. Hopefully this behaviour will be improved in upcoming releases.
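As a side note (this goes beyond the MySQL 5.7 setup used above): MySQL 8.0.18 and later ship EXPLAIN ANALYZE, which actually executes the query and reports real row counts next to the estimates, making this kind of mismatch immediately visible:

EXPLAIN ANALYZE select username, role from user_roles where username= '9977223389' ORDER BY role_id DESC LIMIT 1;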