Microarchitectural Security of AWS Firecracker VMM: Analysis of Firecracker's Containment Systems

cover
13 Jun 2024

Authors:

(1) Zane Weissman, Worcester Polytechnic Institute Worcester, MA, USA {[email protected]};

(2) Thomas Eisenbarth, University of Lübeck Lübeck, S-H, Germany {[email protected]};

(3) Thore Tiemann, University of Lübeck Lübeck, S-H, Germany {[email protected]};

(4) Berk Sunar, Worcester Polytechnic Institute Worcester, MA, USA {[email protected]}.

4. ANALYSIS OF FIRECRACKER’S CONTAINMENT SYSTEMS

Figure 2 shows the containment offered by Firecracker, as presented by AWS. In this section, we analyze each depicted component and their defenses against and vulnerabilities to microarchitectural attacks

Figure 3: In the user-to-user threat model, we assume that a malicious cloud service tenant attempts to ex-filtrate information from another tenant. We assume the attacker to have control over the app and runtime of its VM while the guest kernel is provided by the CSP.

Figure 4: In the user-to-host threat model, the malicious tenant aims for information ex-filtration from the host system, e. g. the virtual machine manager or the host kernel. The attacker has control over the runtime and app in its virtual machine while the guest kernel is provided by the CSP.

KVM. Linux kernel-based virtual machine (KVM) is the hypervisor implemented in modern Linux kernels and therefore part of the Linux code base. It virtualizes the supervisor and user modes of the underlying hardware, manages context switches between VMs, and handles most VM-exit reasons unless they are related to I/O operations. Besides these architectural isolation mechanisms, KVM also implements mitigations against Spectre attacks on a VM-exit to protect the host OS or hypervisor from malicious guests. Firecracker heavily relies on KVM as its hypervisor. However, since KVM is part of the Linux source code and developed by the Linux community, we define KVM to not be a part of Firecracker. Therefore, countermeasures against microarchitectural attacks that are implemented in KVM cannot be attributed to Firecracker’s containment system.

Metadata, device, and I/O services. The metadata, device, and I/O services are the parts of the Firecracker VMM and API that interact directly with a VM, collecting and managing metrics and providing connectivity. AWS touts the simplicity of these interfaces (for a reduced attack surface) and that they are written from scratch for Firecracker in Rust, a programming language known for its security features [9]. However, Rust most notably provides in-process protection against invalid and out-of-bounds memory accesses, but microarchitectural attacks like cache attacks, Spectre, and MDS can leak information between processes rather than directly hijacking a victim’s process.

Another notable difference between Firecracker and many other VMMs is that all of these services are run within the same host process as the VM itself, albeit in another thread. While the virtualization of memory addresses within the VM provides some obfuscation between the guest’s code and the I/O services, some Spectre attacks work specifically within a single process. Intraprocess attacks may pose less of a threat to real world systems, however, since two guests running on the same hardware each have their own copy of these essential services.

Jailer barrier. In the event that the API or VMM are compromised, the jailer provides one last barrier of defense around a Firecracker instance. It protects the host system’s files and resources with namespaces and control groups (cgroups), respectively [7]. Microarchitectural attacks do not directly threaten files, which are by definition outside of the microarchitectural state. Cgroups allow a system administrator to assign processes to groups and then allocate, constrain, and monitor system resource usage on a per-group basis [17]. It is plausible that limitations applied with cgroups could impede an attacker’s ability to carry out certain microarchitectural attacks. For example, memory limitations might make it difficult to carry out eviction-based cache attacks, or CPU time limitations could prevent an attacker from making effective use of a CPU denial-of-service tool like HyperDegrade [2] which can slow down a victim process, simplifying the timing of a microarchitectural side-channel exfiltration or injection. In practice, Firecracker is not distributed with any particular cgroup rules [7]; in fact, it is specifically designed for the efficient operation of many Firecracker VMs under the default Linux resource allocation [6].

None of the isolation and containment systems in Firecracker seem to directly protect against user-to-user or user-to-host attacks. Therefore, we proceeded to test various microarchitectural attack proof of concepts inside and outside of Firecracker VMs.

This paper is available on arxiv under CC BY-NC-ND 4.0 DEED license.