Referring to Jeremy Cole's post on swapstorming under NUMA hardware, I'll note something potentially new.
While I've seen this "brick wall swapstorming" a few times before and since that post, I just saw some new OS installs that don't do this by default, and using numactl to change the defaults there is actually harmful to system interactivity.
In the brick-wall cases, a box with two NUMA zones of ~30G each, running a mysqld (or memcached) with 45G of RAM, would end up with 30G in memory and 15G in swap. Ugly.
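To make that arithmetic concrete, here's a rough sketch. It's a deliberately simplified model, not how the kernel actually decides anything: it assumes the process is pinned to a single node's memory in the brick-wall case, and that interleaving spreads pages perfectly evenly. The function names are mine, purely for illustration.

```python
def brick_wall_split(process_gb, node_gb):
    """Simplified model: the process fills one node, the rest goes to swap.

    Returns (resident_gb, swapped_gb).
    """
    resident = min(process_gb, node_gb)
    return resident, process_gb - resident


def interleaved_split(process_gb, node_count):
    """Simplified model: pages are spread evenly across all nodes."""
    return [process_gb / node_count] * node_count


# The case described above: 45G process, ~30G per node, two nodes.
resident_gb, swapped_gb = brick_wall_split(45, 30)
per_node_gb = interleaved_split(45, 2)
```

With those numbers the first call gives 30G resident and 15G swapped, while the interleaved case puts 22.5G on each node, comfortably under the ~30G per zone.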
In this case, I'm getting a little bit in swap, but a relatively even node distribution.
Here's a box with no numactl tuning:
N0        :  7068733 ( 26.97 GB)
N1        :  7120258 ( 27.16 GB)
active    : 13355529 ( 50.95 GB)
anon      : 14187441 ( 54.12 GB)
dirty     : 14185099 ( 54.11 GB)
mapmax    :      265 (  0.00 GB)
mapped    :     1580 (  0.01 GB)
swapcache :     2350 (  0.01 GB)
Similar hardware, same OS/kernel, running under numactl --interleave=all:
N0        :  6778742 ( 25.86 GB)
N1        :  6313382 ( 24.08 GB)
active    : 12395957 ( 47.29 GB)
anon      : 13090566 ( 49.94 GB)
dirty     : 13090566 ( 49.94 GB)
mapmax    :      255 (  0.00 GB)
mapped    :     1588 (  0.01 GB)
... just a touch in swap on the first guy. I'm going to wait a few days before declaring victory or defeat, though, since I did see the first guy dump nearly a whole gig into swap once, and I wasn't able to confirm whether the swapped memory was MySQL's.
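For anyone who wants to produce per-node totals like the ones above on their own box, here's a minimal sketch that sums the key=value counters in /proc/<pid>/numa_maps. It's not the script I used for the numbers above. It assumes 4 KiB pages (which matches the GB figures shown) and no huge pages, and the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, no huge pages in the mapping


def summarize_numa_maps(lines):
    """Sum the numeric key=value counters across all numa_maps lines.

    mapmax is a per-mapping maximum, so keep the largest value
    instead of summing it.
    """
    totals = {}
    for line in lines:
        for token in line.split():
            key, sep, value = token.partition("=")
            if not sep or not value.isdigit():
                continue  # skips addresses, policies, file=/path, etc.
            pages = int(value)
            if key == "mapmax":
                totals[key] = max(totals.get(key, 0), pages)
            else:
                totals[key] = totals.get(key, 0) + pages
    return totals


def pages_to_gb(pages, page_size=PAGE_SIZE):
    """Convert a page count to GB (2**30 bytes)."""
    return pages * page_size / 2**30


def print_summary(pid):
    """Print one line per counter for the given pid."""
    with open(f"/proc/{pid}/numa_maps") as handle:
        totals = summarize_numa_maps(handle)
    for key in sorted(totals):
        print(f"{key:<10}: {totals[key]:>10} ({pages_to_gb(totals[key]):6.2f} GB)")
```

Call print_summary() with the pid of mysqld; you'll generally need to be root or the owner of the process to read its numa_maps.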
The side note here is that my numactl-modified node is exhibiting some extreme latency in interactive use. It appears that anything that needs to fork gets a half-second delay. MySQL seems to be running fine, though.
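If you want to put a number on that fork delay rather than eyeballing it from a shell, something like this rough sketch would do. It just times a fork, an immediate child exit, and the wait, a few times over. The function name and the sample count are arbitrary, and it only measures the calling process, so run it under the same numactl policy you're testing.

```python
import os
import time


def time_forks(count=5):
    """Return wall-clock seconds for each fork + exit + wait round trip."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately without running cleanup handlers.
            os._exit(0)
        os.waitpid(pid, 0)
        latencies.append(time.perf_counter() - start)
    return latencies
```

On a healthy box each round trip should be on the order of a millisecond or so; values near half a second would line up with what I'm seeing.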
I haven't investigated at all how NUMA distribution has changed in recent kernels (though I know it's been steadily improving over the years). Unfortunately, every other box I've used which *has* the problem runs a redhat/centos5 kernel, which is ancient to an extreme.
In this case it's Debian squeeze with its default 2.6.32 kernel. Has anyone tried a recent Ubuntu or redhat6 yet to see if the NUMA/swap issues are better there?