In my post yesterday, I shared a little-known trick for sorting NULLs last when using ORDER BY ASC.
To summarize briefly, NULLs are treated as lower than any non-NULL value when used in ORDER BY, so they come first in ascending order. However, sometimes you do not want that behavior, and you need the NULLs listed last, even though you want your numbers in ascending order.
So a query like the following returns the NULLs first (expected behavior):
SELECT * FROM t1 ORDER BY col1 ASC, col2 ASC;
+--------+------+
| col1   | col2 |
+--------+------+
| apple  | NULL |
| apple  |    5 |
| apple  |   10 |
| banana | NULL |
| banana |    5 |
| banana |   10 |
+--------+------+
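If you want to try this yourself, something like the following should reproduce the table. Note that the table definition is my assumption here (a short string column and a nullable INT column); only the table name, column names, and row values come from the output above.

```sql
-- Assumed schema: the column types are a guess that fits the output shown.
CREATE TABLE t1 (
  col1 VARCHAR(10) NOT NULL,
  col2 INT NULL
);

-- Rows inserted in scrambled order so the ORDER BY is doing the work.
INSERT INTO t1 (col1, col2) VALUES
  ('banana', 10),
  ('apple', NULL),
  ('banana', NULL),
  ('apple', 5),
  ('banana', 5),
  ('apple', 10);
```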
The trick I mentioned in my post is to rewrite the query like:
SELECT * FROM t1 ORDER BY col1 ASC, -col2 DESC;
The difference is that we added a minus sign (-) in front of the column we want sorted and changed ASC to DESC.
Now this query returns what we’re looking for:
+--------+------+
| col1   | col2 |
+--------+------+
| apple  |    5 |
| apple  |   10 |
| apple  | NULL |
| banana |    5 |
| banana |   10 |
| banana | NULL |
+--------+------+
I could not really find this behavior documented at the time, and thus did some more digging to find out if this is intended behavior and if it should continue to work in the future (i.e., can we rely on it).
The answer is yes to both: it is intended, and it will continue to work.
Now, why does this work this way?
- It is known that when sorting in ascending (ASC) order, NULLs are listed first, and when sorting in descending (DESC) order, they are listed last.
- It is known that a minus sign (-) preceding the column, followed by DESC, is the same as ASC (essentially the inverse). This is because if a > b, then -a < -b.
- And the last piece of the puzzle is that NULL == -NULL, i.e., negating NULL still yields NULL.
So since -NULL == NULL, and we are now using DESC, the NULLs will be last. The remaining INT values will be in ASC order, since we effectively converted them to negative values and then changed the order to DESC, which puts those INTs back in ASC order.
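One way to see this for yourself is to select the negated value alongside the original column. This is just a sketch against the same t1 table; the sort_key alias is a name I made up for illustration.

```sql
-- Negating NULL still gives NULL, so this should return 1.
SELECT -NULL IS NULL;

-- Expose the value the ORDER BY actually sorts on.
SELECT col1, col2, -col2 AS sort_key
FROM t1
ORDER BY col1 ASC, sort_key DESC;
-- Within each col1 group, sort_key runs -5, -10, NULL:
-- descending on the negated values, with NULL at the end,
-- which lines col2 up as 5, 10, NULL.
```

Keep in mind this relies on the column being numeric, so that the unary minus simply flips the sign.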
Hope this helps clarify for anyone out there interested.