As details of the Meltdown and Spectre vulnerabilities have become clearer, a number of statements have been published by the affected vendors. Canonical has issued advisories and updates on fixes and mitigations, the latest of which includes a first round of Spectre mitigations. However, most of these statements focus on the mechanics of applying fixes and on damage control, not on explaining what the problems are, how the mitigations work, and how they may affect you.
Because the vulnerabilities and their fixes are CPU-dependent and involve a major performance-security tradeoff, understanding their general model is important to every system administrator and developer. In the spirit of Ubuntu, which is known for providing easily accessible technology — we are Linux for Human Beings, after all — this post attempts to provide an accessible description of the impact of these vulnerabilities and their mitigations.
The essence of both vulnerabilities is that a program running on a computer can read memory it is not supposed to. This is not an arbitrary code execution issue; rather, a malicious program can trick the CPU into exposing memory that the program wouldn’t otherwise have access to. In addition, because of the way the CPU is tricked, the exposed memory can only be retrieved relatively slowly. In summary:
Although an attacker can’t directly use Meltdown or Spectre to read disks or change anything on a system — for instance, to modify a password or write a file — the problem is that passwords, keys and other secrets are usually stored unencrypted in memory. Once stolen, these can then be used to escalate privileges, which can then be used to modify the system or obtain sensitive information.
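To make the "read but only slowly" point concrete, here is a toy simulation of a cache-timing side channel. It is a conceptual sketch only: it touches no real hardware, and `ToyCpu`, `speculative_read` and `is_fast` are hypothetical names invented for illustration. The model assumes that a rejected read still leaves a footprint in the cache, and that an attacker can tell cached slots from uncached ones by timing.

```python
"""Toy model of a cache-timing side channel (simulation only, no hardware access)."""

PROBE_SLOTS = 256  # one probe slot per possible byte value


class ToyCpu:
    """Hypothetical CPU: holds a protected secret and a cache of touched slots."""

    def __init__(self, secret: bytes):
        self._secret = secret        # memory the attacker may not read directly
        self.cached_slots = set()    # cache state, which is NOT rolled back

    def flush_cache(self) -> None:
        self.cached_slots.clear()

    def speculative_read(self, index: int) -> None:
        """Model a read whose result is discarded but whose cache effect remains."""
        value = self._secret[index]
        self.cached_slots.add(value)  # the side effect survives the rollback
        # Nothing is returned: architecturally, the read never happened.

    def is_fast(self, slot: int) -> bool:
        """Stand-in for timing a memory access: cached slots count as fast."""
        return slot in self.cached_slots


def leak_byte(cpu: ToyCpu, index: int) -> int:
    """Recover one byte: flush, trigger the read, then probe every slot."""
    cpu.flush_cache()
    cpu.speculative_read(index)
    for slot in range(PROBE_SLOTS):
        if cpu.is_fast(slot):
            return slot
    raise RuntimeError("no cached slot found")


def leak_secret(cpu: ToyCpu, length: int) -> bytes:
    """Recover the secret one byte at a time."""
    return bytes(leak_byte(cpu, i) for i in range(length))
```

Every recovered byte costs a cache flush plus up to 256 probes, which hints at why retrieval is slow. Real attacks also have to deal with timing noise, which this sketch ignores.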
Proof of concept code exploiting these vulnerabilities has been published that demonstrates:
Other exploits that follow the patterns above almost certainly will emerge over time.
These issues affect practically every computer in server and end-user contexts, but due to the nature of the possible attack vectors, different use cases are exposed to different degrees. We have therefore provided the following risk grading:
In order to understand how mitigations are being implemented, it’s important to grasp what the underlying issues are, which in turn requires an understanding of operating system and processor concepts.
There are two good “kindergarten-class” analogies that may serve as useful introductions, which we have nicknamed the Helpful Grandson Analogy and the Book Voyeur Analogy. A quick read through William Cohen’s excellent post on Red Hat’s developer blog will also provide basic knowledge on CPU pipelines and cache behavior, and an optional hour-long read of Dan Luu’s branch prediction treatise will complete the necessary background for those unfamiliar with that aspect of processor design.
And that’s where our explanation starts. Summarizing the root cause:
Meltdown and Spectre are caused by slightly different processor design choices, and expose system memory in different ways:
Neither Meltdown nor Spectre can be directly addressed by CPU manufacturers in existing hardware: they are a consequence of fundamental hardware design which cannot be modified in the field. Mitigations involve a combination of software and CPU microcode updates to hypervisors and guests, which we will discuss next.
Mitigations are specific to each attack, and have different performance implications. Existing mitigations are summarized in the following table:
It’s worth calling out that some of the mitigations for Spectre are not yet fully mature, which implies multiple iterations will be implemented and rolled out before the situation is fully addressed.
In order to benefit from the mitigations being provided, the following must hold true:
Ubuntu provides security updates free of charge to all Ubuntu users. Updates will automatically install all necessary code and — where available — CPU microcode. Ubuntu will also, by default, enable all protections that are stable and safely implemented.
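One way to see which protections a running system reports is to read the kernel's vulnerability status files. The sketch below assumes the running kernel exposes the sysfs directory `/sys/devices/system/cpu/vulnerabilities`; kernels without that interface simply produce an empty result. The function name `read_mitigation_status` is our own choice for illustration.

```python
from pathlib import Path

# Assumption: the kernel exposes per-vulnerability status files here.
SYSFS_VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir: Path = SYSFS_VULN_DIR) -> dict:
    """Return {vulnerability name: status string} as reported by the kernel.

    Returns an empty dict when the directory does not exist, which is
    expected on kernels that do not provide this interface.
    """
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if not entry.is_file():
            continue
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    report = read_mitigation_status()
    if not report:
        print("No vulnerability status exposed by this kernel.")
    for name, state in report.items():
        print(f"{name}: {state}")
```

On a patched system the entries of interest here are typically `meltdown`, `spectre_v1` and `spectre_v2`. The exact status strings vary by kernel version, so treat them as informational rather than as a stable format.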
However, not all vulnerabilities identified have protections available covering their full extent, as outlined in the Current Status section of our Knowledge Base entry on the vulnerabilities. To further complicate matters, the protections that are available have significant performance impact for a number of workloads. The next section will discuss this in more detail.
Performance impact from the mitigations has been a top concern, and at the highest level, the answer is that performance regressions are entirely workload-dependent; the slowdowns we have directly observed vary from 0 to 50%. To put the complexity of producing useful performance impact data in perspective, let’s review the different ways in which mitigations can apply to a test scenario:
The possible permutations make communicating through benchmarks incredibly difficult, and recognizing that, we will focus on practical advice first. The information in this section assumes all protections published by Ubuntu for Spectre and Meltdown have been enabled.
For Ubuntu Desktop users, including users running official flavors derived from Ubuntu, here is a summary of the impact of mitigations being applied:
For server workloads, impact is described in more detail in the following table.
Where necessary, offsetting performance impact will involve a combination of scaling out workloads, increasing compute power (by choosing a larger cloud instance type, for instance) or selectively disabling mitigations in contexts where the tradeoffs justify it. We are maintaining a Mitigation Controls page which describes the relevant knobs available on Ubuntu.
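As a rough illustration of the scale-out option, the arithmetic can be sketched as follows. This assumes a slowdown is expressed as a fractional loss of per-instance throughput and that the workload scales linearly across instances. Neither assumption is guaranteed for a given application, and `instances_needed` is a hypothetical helper.

```python
import math


def instances_needed(current_instances: int, throughput_loss: float) -> int:
    """Instances required to keep total throughput after a per-instance loss.

    throughput_loss is a fraction in [0, 1): 0.2 means each instance now
    delivers 80% of its previous throughput. Assumes linear scale-out.
    """
    if current_instances < 0:
        raise ValueError("current_instances must be non-negative")
    if not 0.0 <= throughput_loss < 1.0:
        raise ValueError("throughput_loss must be in the range [0, 1)")
    # Each instance delivers (1 - loss) of its old capacity; round up.
    return math.ceil(current_instances / (1.0 - throughput_loss))
```

Under these assumptions, a 20% loss on a 10-instance deployment calls for 13 instances, and a 50% loss doubles the fleet. This is why measuring the actual impact on your own workload matters before committing to extra capacity.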
Canonical is in the process of finalizing a set of performance runs across private and cloud environments and applications. We aim to publish our performance findings, based on our internal experience applying mitigations, by February 12th.
For one preliminary datapoint, on our pre-Skylake build farm, the currently published Meltdown and Spectre mitigations for Ubuntu (including pre-release Intel microcode) show us that:
In the meantime, we’re providing a Published Application Performance summary page which collects per-application performance descriptions published by third parties. That page will be updated as new information gets published and can assist administrators in evaluating trade-offs in risk and performance to determine their own mitigation plans.
We appreciate this is frustrating to many administrators aiming to frame decisions with hard data, and we ourselves have been faced with the same issue internally. But the immature nature of the mitigations, made worse by an evolving understanding of which mitigation strategies should be active in what contexts (kernel/userspace, hypervisor/guest, and specific code paths), makes the topic anything but straightforward.
We have built this document with the intention of putting forward a practical framework to support decision making in this unusual situation. The evolving nature of the industry’s collective understanding of the vulnerabilities has led to an excess of public information, in part incomplete and in part contradictory, and we have selected here a set of links that are coherent, well-written and expand on what we have presented above:
We will issue updates to this post and additional information as the situation evolves. We encourage Ubuntu users who seek more information to contact an Ubuntu Advantage support representative for an in-depth discussion relative to your use cases.