There are several ways to use less SSD for an OLTP workload: choose a database engine that has less space amplification, store less data, or move the cold data elsewhere. The first approach is a plan while the others are goals. A plan is something you can implement. A goal requires a plan to get done.
This matters when you want to decrease the cost of storage for a database workload, but not everyone needs to do that. The first approach assumes your DBMS supports an LSM with leveled compaction and compression (MariaDB and Percona Server include MyRocks; ScyllaDB and Cassandra are also options). The second approach, store less data, assumes you can get everyone to agree to remove data, and that is a hard conversation.
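To make the first approach concrete, here is a minimal sketch using the RocksDB C++ API (the engine behind MyRocks): open a database with leveled compaction and compression enabled, which is the combination that keeps space amplification low. The path and the specific codecs are placeholders I picked for illustration, not recommendations from this post.

    // Sketch: leveled compaction plus compression, the setup the first
    // approach depends on. Values here are illustrative, not tuning advice.
    #include <cassert>
    #include <rocksdb/db.h>
    #include <rocksdb/options.h>

    int main() {
      rocksdb::Options options;
      options.create_if_missing = true;
      // Leveled compaction keeps space amplification low
      // (roughly 1.1x with the default fanout).
      options.compaction_style = rocksdb::kCompactionStyleLevel;
      // Compress all levels, with a stronger codec at the bottommost
      // level, which holds most of the data.
      options.compression = rocksdb::kLZ4Compression;
      options.bottommost_compression = rocksdb::kZSTD;

      rocksdb::DB* db = nullptr;
      rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/example_db", &db);
      assert(s.ok());
      delete db;
      return 0;
    }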
The third approach, move the cold data elsewhere, is a wonderful goal. I wonder how often that goal is achieved. To implement it you must find data that won't (well, almost never) be read or written again and then move it to less expensive storage. I assume this has been more successful when implemented in storage than in a DBMS. The obvious example is an OS page cache, but there are also tiered storage servers. An LSM already classifies data as hot vs cold for writes (data closer to the L0 was written more recently), but that might not imply anything about read likelihood.
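As a small illustration of what the LSM already gives you, this sketch (again assuming the RocksDB C++ API) lists how much data sits in each level. Lower-numbered levels hold more recently written data, so per-level sizes give a rough write-recency profile, but they say nothing about whether the data in the max level will ever be read again.

    // Sketch: print per-level file counts and sizes. Lower levels are
    // hotter for writes; this says nothing about read likelihood.
    #include <cstdio>
    #include <rocksdb/db.h>
    #include <rocksdb/metadata.h>

    void PrintLevelSizes(rocksdb::DB* db) {
      rocksdb::ColumnFamilyMetaData meta;
      db->GetColumnFamilyMetaData(&meta);
      for (const auto& level : meta.levels) {
        std::printf("L%d: %zu files, %llu bytes\n",
                    level.level, level.files.size(),
                    static_cast<unsigned long long>(level.size));
      }
    }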
I have read wonderful research papers that show how a DBMS might do this but I am still wary. Assume that data can be described by read and write likelihood attributes -- {read N, read forever} X {write N, write forever}. The goal is then to find data that is read N, write N for which N has been reached. You frequently won't know the value of N, so monitoring will be needed to identify data that is rarely read or written, along with more CPU, RAM and IO to perform that monitoring. This is easier to do when the granularity of hot vs cold is per table, but that is rare in my experience. I assume the common case is a table with a mix of hot and cold data.
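A minimal sketch of that monitoring, assuming it lives in the application or the DBMS rather than in any particular product: record the last read and write time per key range, then treat ranges untouched within some window as cold candidates. The window and the per-range granularity are made-up knobs, and a real system would also need to bound the CPU, RAM and IO this tracking costs.

    // Sketch: classify key ranges as cold candidates when neither a read
    // nor a write has happened within a configurable window.
    #include <chrono>
    #include <string>
    #include <unordered_map>
    #include <vector>

    using Clock = std::chrono::steady_clock;

    struct AccessInfo {
      Clock::time_point last_read{};
      Clock::time_point last_write{};
    };

    class AccessTracker {
     public:
      void OnRead(const std::string& range) { info_[range].last_read = Clock::now(); }
      void OnWrite(const std::string& range) { info_[range].last_write = Clock::now(); }

      // Ranges not read or written within `window` are cold candidates.
      std::vector<std::string> ColdCandidates(Clock::duration window) const {
        std::vector<std::string> cold;
        const auto cutoff = Clock::now() - window;
        for (const auto& [range, info] : info_) {
          if (info.last_read < cutoff && info.last_write < cutoff) {
            cold.push_back(range);
          }
        }
        return cold;
      }

     private:
      std::unordered_map<std::string, AccessInfo> info_;
    };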
Don't forget that it is a lousy experience when cold data becomes hot again.
This post was inspired by something I read on a wonderful blog -- Blocks and Files (no snark, blog is great). My summary of the post is that SSD endurance isn't a problem: just evict cold data to cold storage. Just remember that is a goal, not a plan.