Internal tests from a leading industry vendor have shown that fixes applied to servers running Linux or Windows Server aren’t as detrimental as initially thought, with many use cases seeing no impact at all.
The Meltdown and Spectre vulnerabilities, first documented in January, seemed like a nightmare for virtualized systems, but that fear appears to be overblown. There are a lot of qualifiers, starting with what you are doing and what generation of processor you are using.
The tests were done on servers running Xeons of the Haswell-EP (released in 2014), Broadwell-EP (released in 2016), and Skylake-EP (released in 2017) generations. Haswell and Broadwell were the same microarchitecture, with minor tweaks. The big change there was that Broadwell was a die shrink. Skylake, though, was a whole new architecture, and as it turns out, that made the difference.
Meltdown and Spectre most negatively impact virtual environments with lots of transitions, apps spending a lot of time in privileged mode, apps with a high number of system calls and interrupts, or a larger number of user/kernel privilege changes.
And as it turns out, benchmarks show that with the patches applied to mitigate all of the vulnerabilities, general compute, integer, floating point, Linpack, streaming, server side apps, and energy efficiency suffer negligible to no impact. That’s because they are not making a lot of kernel calls and run in the user space for the majority of the time.
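To make the user-space versus kernel distinction concrete, here is a minimal Python sketch that splits a workload's CPU time into user and system portions. It is not the vendor's benchmark: the function names `cpu_split`, `compute_heavy`, and `syscall_heavy` and the loop sizes are illustrative assumptions, and the exact numbers will vary by machine.

```python
import os


def cpu_split(work):
    """Run work() and return (user_seconds, system_seconds) of CPU time it used."""
    before = os.times()
    work()
    after = os.times()
    return after.user - before.user, after.system - before.system


def compute_heavy():
    # Pure arithmetic: stays in user space almost the whole time.
    total = 0
    for i in range(500_000):
        total += i * i
    return total


def syscall_heavy():
    # One write() system call per iteration: many user/kernel transitions.
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        for _ in range(50_000):
            os.write(fd, b"x")
    finally:
        os.close(fd)


if __name__ == "__main__":
    for name, work in (("compute", compute_heavy), ("syscalls", syscall_heavy)):
        user, system = cpu_split(work)
        print(f"{name}: user={user:.3f}s system={system:.3f}s")
```

A workload whose system share is small relative to its user share is the kind the tests found to be largely untouched by the patches.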
Also, apps that use DPDK, a set of libraries for fast packet processing, saw no impact. DPDK is I/O-intensive, but it moves packets in user space and bypasses the kernel, so the patched user/kernel transitions rarely come into play.
“The perception was anything with a lot of I/O activity would get impacted real bad,” said a source who asked to not be identified. “But not really. DPDK is a very I/O-intensive workload moving packets, but you bypass the kernel. The whole reason DPDK has high performance is it minimizes time spent in the kernel.”
In web serving runtimes, there was about a 10 percent impact. That’s because requests for a web page go to the kernel and that kernel transition is adding overhead.
Where the patches do affect performance
It’s storage-related activity where performance gets creamed, but only partially and only in certain circumstances. First, Skylake is unaffected thanks to its redesigned microarchitecture.
The Haswell and Broadwell servers, though, took a 30 percent performance hit, but only if they used IBRS (Indirect Branch Restricted Speculation), a patch provided by Intel that provoked Linux majordomo Linus Torvalds to go on one of his legendary blue streaks. If they used Retpoline, developed by Google, the performance impact was 1 to 2 percent.
Google is the firm that found Meltdown and Spectre and had several months to work on its patches, as opposed to the rushed IBRS. Much of the Linux world has rallied behind Retpoline, even though that means needing a new kernel and perhaps having to recompile apps.
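For readers who want to see which mitigation their own Linux server is using, recent kernels report it through sysfs files under `/sys/devices/system/cpu/vulnerabilities`. The following is a minimal sketch, assuming that directory is present and readable; the helper name `read_mitigations` is hypothetical, and the wording of each status line depends on the kernel build.

```python
from pathlib import Path

# Directory where recent Linux kernels expose per-vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: status_line}; empty if the directory is absent."""
    report = {}
    if not vuln_dir.is_dir():
        return report
    for entry in sorted(vuln_dir.iterdir()):
        try:
            report[entry.name] = entry.read_text().strip()
        except OSError:
            # Some entries may not be readable by the current user.
            report[entry.name] = "unreadable"
    return report


if __name__ == "__main__":
    for name, status in read_mitigations().items():
        print(f"{name}: {status}")
```

On kernels that provide it, the `spectre_v2` entry is the one that typically indicates whether a Retpoline-based or IBRS-based mitigation is active.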
There was another mitigating factor. Skylake saw a 10 percent drop in performance on FIO storage benchmarks when using 64KB blocks, while Broadwell and Haswell fell 20 percent. But everyone took it on the chin badly when using 4KB blocks: Skylake took a 32 percent performance hit, while Broadwell and Haswell plunged 60 percent.
The reason is pretty clear. Reading in a 1GB file in 4KB blocks means 16 times as many blocks to be read and moved about as with 64KB blocks. Each block means an interrupt, and with it a transition into the patched kernel, thus slowing things down.
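The block arithmetic is easy to check. This short Python sketch counts how many reads a 1GB file needs at each block size, assuming binary units (1GB = 1024^3 bytes, 1KB = 1024 bytes); the helper name `blocks_needed` is illustrative and not part of the FIO benchmark.

```python
KIB = 1024
GIB = 1024 ** 3


def blocks_needed(file_bytes, block_bytes):
    """Number of block reads needed to cover the file, rounding up."""
    return -(-file_bytes // block_bytes)  # ceiling division


small_blocks = blocks_needed(GIB, 4 * KIB)    # reads at 4KB per block
large_blocks = blocks_needed(GIB, 64 * KIB)   # reads at 64KB per block

if __name__ == "__main__":
    print(f"4KB blocks:  {small_blocks}")
    print(f"64KB blocks: {large_blocks}")
    print(f"ratio:       {small_blocks // large_blocks}x")
```

That works out to 262,144 reads at 4KB against 16,384 at 64KB, so the smaller block size pays whatever per-read overhead the patches add 16 times as often.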
Of course, not every bit of data can be read in 64KB or larger blocks. Database transactions might require smaller blocks, which will impact performance.
Also, it should be noted that this benchmark was conducted in an artificial scenario that is unlikely to occur in production. The vendor ran the CPU at 100 percent utilization for the entire test, not something that will happen in the real world very often.
Another revelation that’s just downright funny is that reading data from traditional spinning hard drives has far less impact than reading from a SATA SSD, and SATA in turn has less impact than an NVMe drive. The reason is that spinning drives are slower, so there is just less data passing through the CPU than on an SSD, and NVMe is designed to be much more parallel in data transfer than SATA. So, a faster SSD suffers more. But that is more than offset by the fact that NVMe drives are ridiculously fast, much faster than SATA SSDs.
What Linux and Microsoft users are doing
At this point, the Linux industry seems to have settled on Retpoline, and all of the major Linux distributions have updated to support it. Microsoft hasn’t indicated support one way or another. Given it comes from arch-nemesis Google, it would kill Microsoft to have to support it, but stranger things have happened.
The industry vendor said patches at this point are solid and there are not likely to be any major new optimizations. Intel plans to release new products this year that will address Meltdown and Spectre in silicon, but it has not provided any further details.