In the recent MariaDB DBA advanced training class the question came up if MariaDB can make use of an index when searching for NULL
values... And to be honest I was not sure any more. So instead of reading boring documentation I did some little tests:
Search for NULL
First I started with a little test data set. Some of you might already know it:
CREATE TABLE null_test ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , data VARCHAR(32) DEFAULT NULL , ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP() ); INSERT INTO null_test VALUES (NULL, 'Some data to show if null works', NULL); INSERT INTO null_test SELECT NULL, 'Some data to show if null works', NULL FROM null_test; ... up to 1 Mio rows
Then I modified the data according to my needs to see if the MariaDB Optimizer can make use of the index:
-- Set 0.1% of the rows to NULL UPDATE null_test SET data = NULL WHERE ID % 1000 = 0; ALTER TABLE null_test ADD INDEX (data); ANALYZE TABLE null_test;
and finally I run the test (MariaDB 10.3.11):
EXPLAIN EXTENDED SELECT * FROM null_test WHERE data IS NULL; +------+-------------+-----------+------+---------------+------+---------+-------+------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-----------+------+---------------+------+---------+-------+------+----------+-----------------------+ | 1 | SIMPLE | null_test | ref | data | data | 35 | const | 1047 | 100.00 | Using index condition | +------+-------------+-----------+------+---------------+------+---------+-------+------+----------+-----------------------+
We can clearly see that the MariaDB Optimizer considers and uses the index and its estimation of about 1047 rows is quite appropriate.
Unfortunately the optimizer chooses the completely wrong strategy (3 times slower) for the opposite query:
EXPLAIN EXTENDED SELECT * FROM null_test WHERE data = 'Some data to show if null works'; +------+-------------+-----------+------+---------------+------+---------+-------+--------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+-----------+------+---------------+------+---------+-------+--------+----------+-----------------------+ | 1 | SIMPLE | null_test | ref | data | data | 35 | const | 522351 | 100.00 | Using index condition | +------+-------------+-----------+------+---------------+------+---------+-------+--------+----------+-----------------------+
Search for NOT NULL
Now let us try to test the opposite problem:
CREATE TABLE anti_null_test ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , data VARCHAR(32) DEFAULT NULL , ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP() ); INSERT INTO anti_null_test VALUES (NULL, 'Some data to show if null works', NULL); INSERT INTO anti_null_test SELECT NULL, 'Some data to show if null works', NULL FROM anti_null_test; ... up to 1 Mio rows
Then I modified the data as well but this time in the opposite direction:
-- Set 99.9% of the rows to NULL UPDATE anti_null_test SET data = NULL WHERE ID % 1000 != 0; ALTER TABLE anti_null_test ADD INDEX (data); ANALYZE TABLE anti_null_test;
and then we have to test again the query:
EXPLAIN EXTENDED SELECT * FROM anti_null_test WHERE data IS NOT NULL; +------+-------------+----------------+-------+---------------+------+---------+------+------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+----------------+-------+---------------+------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | anti_null_test | range | data | data | 35 | NULL | 1047 | 100.00 | Using index condition | +------+-------------+----------------+-------+---------------+------+---------+------+------+----------+-----------------------+
Also in this case the MariaDB Optimizer considers and uses the index and produces a quite fast Query Execution Plan.
Also in this case the optimizer behaves wrong for the opposite query:
EXPLAIN EXTENDED SELECT * FROM anti_null_test WHERE data IS NULL; +------+-------------+----------------+------+---------------+------+---------+-------+--------+----------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +------+-------------+----------------+------+---------------+------+---------+-------+--------+----------+-----------------------+ | 1 | SIMPLE | anti_null_test | ref | data | data | 35 | const | 523506 | 100.00 | Using index condition | +------+-------------+----------------+------+---------------+------+---------+-------+--------+----------+-----------------------+