Doing It Wrong with MySQL / in the cloud

This is a rant. It’s a strong one mainly because of what I’ve been dealing with over the past few days. But it’s still just a rant, so I’m going to exaggerate 😛. I’ve been with my employer for several years, and a lot of the stuff I’m writing about isn’t really new. I think most of it has been like this for years. I didn’t really pay attention to this stuff in the past because I didn’t have to. I didn’t want to either[1], but now our team is handling ops and I have to care 😐.

Doing it wrong with MySQL

We use async, bidirectional replication. It’s a nightmare.

Using async replication means you don’t care about replicas, basically by definition. If the master isn’t waiting for replicas, then they can fall arbitrarily behind, crash, fail, whatever, and the master will happily keep going. You can use hacks like delaying binlog commits on the master to improve parallelization on the replicas and keep them from falling behind, but that’s just pretending to care about replicas. I don’t really understand how you could use async replication if you actually care about your replicas. I’ve already spent many hours trying to figure out why replicas were falling behind, and all of that work could’ve been avoided if we stopped using async replication.
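To put a number on "arbitrarily behind", here is a toy model in Python. It is only a sketch: the function name `replica_backlog` and the rates are made up for illustration, and it ignores everything about how MySQL actually ships and applies binlog events. The only point is that when the master never waits, any gap between commit rate and apply rate grows without bound.

```python
def replica_backlog(commit_rate, apply_rate, seconds):
    """Return how many events the replica still has to apply after `seconds`.

    commit_rate: events per second the master commits (hypothetical number)
    apply_rate:  events per second the replica can apply (hypothetical number)
    """
    backlog = 0
    for _ in range(seconds):
        # The master commits without waiting for the replica.
        backlog += commit_rate
        # The replica applies as much of the backlog as it can this second.
        backlog -= min(backlog, apply_rate)
    return backlog


# A replica that is 20% slower than the master is 12,000 events behind
# after one minute, and nothing on the master side ever pushes back.
print(replica_backlog(1000, 800, 60))  # 12000
```

If the replica keeps up (`apply_rate >= commit_rate`), the backlog stays at zero. The moment it doesn't, the master has no idea and no reason to care.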

Using async replication with bidirectional replication and a proxy means you don’t care about consistency. Suppose you have a proxy sending data to A, and B is replicating from A. Let’s say B is behind by a few minutes. Then suppose A fails and the proxy starts sending connections to B. At that moment, B is simultaneously catching up with “old” transactions from A and accepting new writes. You could have scenarios where current writes are overwritten by older ones. Good luck with consistency.
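Here is the same scenario as a minimal Python sketch. The `Node` class, the `relay_log` queue and the `balance` key are all hypothetical; this is a toy model of "apply events in the order received", not MySQL's real apply logic, and it only shows the A-to-B direction that matters for the failover.

```python
from collections import deque


class Node:
    """Toy database node: a dict plus a queue of unapplied replicated events."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.relay_log = deque()  # events received from the peer, not yet applied

    def write(self, key, value):
        """Client write: apply locally and return the event to replicate."""
        self.data[key] = value
        return (key, value)

    def apply_next(self):
        """Apply the oldest pending replicated event, if there is one."""
        if self.relay_log:
            key, value = self.relay_log.popleft()
            self.data[key] = value


a = Node("A")
b = Node("B")

# The proxy sends a write to A. B receives the event but is lagging,
# so the event sits in B's relay log unapplied.
b.relay_log.append(a.write("balance", 100))

# A fails. The proxy starts sending connections to B, and a newer write lands.
b.write("balance", 250)

# B keeps catching up on "old" transactions from A.
b.apply_next()

print(b.data["balance"])  # 100
```

The client that wrote 250 got a success response, and the value is gone anyway.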

Doing it wrong in the cloud

This tweet basically sums it up.

In 2010, I started a web hosting company. After a year or so I started hosting VMs. I used to move customers’ VMs around for various reasons. I didn’t have block storage (which would’ve been so nice), so this required a bunch of copying. Copying GBs of VMs around is not fun. It’s a slow and painful process, and it’s even more painful when you consider my customers couldn’t use their VMs until it was done. VM disks were stored locally on each server as LVM logical volumes. I got really good at LVM. After I shut down my hosting business a couple of years ago, I was so glad that I didn’t have to deal with that sysadmin stuff anymore. I can code! I can design and build systems! I can automate! I can use all these fancy services on AWS!

But here I am, over 6 years later, spending days on manual rsync, copying TBs of data between instances. 😡

I think if you’re using LVM and rsync in the cloud in 2017, you’re doing it wrong. I don’t care what you’re actually doing, because I’m almost positive there are better alternatives.


Anyway. At least I get the chance to Do It Right when I work on personal projects. It’s just really frustrating to Do It Wrong the next morning, especially because that work actually matters.

Footnotes:

  1. I went from thinking about consensus and distributed systems to async MySQL replication. You wouldn’t want to either.
