In early 2015, researchers unveiled Rowhammer, a cutting-edge hack that exploits physical weaknesses in the silicon of certain types of memory chips to change the data they store. In the 42 months since then, an enhancement known as error-correcting code (or ECC), available in high-end chips, was believed to be an absolute defense against potentially catastrophic bitflips that change 0s to 1s and vice versa.
A study published Wednesday has shattered that assumption.
Dubbed ECCploit, the new Rowhammer attack bypasses ECC protections built into many widely used models of DDR3 chips. The exploit is the product of more than a year of painstaking research that used syringe needles to inject faults into chips and supercooled chips to observe how they responded when bits flipped. The resulting insights, along with some advanced math, allowed researchers in Vrije Universiteit Amsterdam’s VUSec group to demonstrate that one of the key defenses against Rowhammer is insufficient.
A special event
Importantly, the researchers have not demonstrated that ECCploit works against ECC in DDR4 chips, a newer type of memory chip favored by high-end cloud services. They also have not shown that ECCploit can penetrate hypervisors or secondary Rowhammer defenses. Regardless, the bypass of ECC is an important milestone that suggests the Rowhammer threat continues to evolve and cannot be easily discounted.
“It was thought that ECC provides strong protection against Rowhammer attacks,” Kaveh Razavi, one of the VUSec researchers who developed the exploit, told Ars. “ECCploit shows for the first time that it is possible to mount practical Rowhammer attacks on vulnerable ECC DRAM.”
In a research paper, the researchers wrote:
Rowhammer has become a major threat to computer systems, from the smallest mobile devices to the largest clouds, but until now devices with high-end memory with error-correcting code (ECC) have been free from such attacks. This is due to the formidable challenge of reverse-engineering ECC functions and, more importantly, to the narrow margin within which attackers must operate: they must flip enough bits to bypass the error correction function, but flipping the wrong number of bits can crash the system. For this reason, many believed that Rowhammer on ECC memory, even if plausible in theory, is simply impractical. This paper shows this to be false: while harder, Rowhammer attacks are still a realistic threat even to modern ECC-equipped systems. This is especially worrying, because all other existing defenses have already been proven insecure. Given the prevalence of Rowhammer vulnerabilities across many systems, we urgently need better defenses against these attacks.
To recap, DDR memory is laid out in an array of rows and columns that are allocated in large blocks to various applications and operating system resources. To protect the integrity and security of the entire system, each allocated chunk of memory is kept in a “sandbox” that can be accessed only by a given application or OS process.
As the physical dimensions of chips shrink over time, there is less space between each DRAM cell. The tight quarters threaten this security model because they make it much harder to prevent a cell assigned to one application or process from interacting electrically with neighboring cells assigned to another application or process.
Rowhammer exploits this physical weakness by rapidly accessing (or “hammering”) one or more carefully selected rows in a vulnerable DIMM. By reading one or more “aggressor” rows of memory thousands of times per second, the exploit can flip one or more bits in a “victim” row. When done with precision, Rowhammer can flip bits in ways that have important consequences for security, for example, by allowing an untrusted application to gain full administrative rights, break out of security sandboxes or virtual machine hypervisors, or root devices running vulnerable DIMMs.
ECC: Some restrictions apply
ECC works by using what are known as memory words to store redundant control bits next to the data bits inside DIMMs. CPUs use these words to quickly detect and repair bits that have flipped. ECC was originally designed to protect against a naturally occurring phenomenon in which cosmic rays flip bits in DIMMs. After Rowhammer appeared, the importance of ECC grew when it proved to be the most effective defense.
But some limitations apply. ECC generally adds only enough redundancy to repair a single bitflip in a 64-bit word. When two bitflips occur in a word, the underlying program or process crashes. When three bitflips occur in the right places, ECC can be completely bypassed.
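The one, two, and three bitflip cases can be illustrated with a toy code. The sketch below is an assumption-laden stand-in, not the code used by any real memory controller: it implements a textbook extended Hamming (8,4) code protecting 4 data bits, whereas ECC DIMMs protect 64-bit words with controller-specific codes. The function names `encode`, `decode`, and `flip` are invented for this example. In this tiny code every three-bit error is silently miscorrected; in real ECC only some three-bit patterns slip through, which is why the flips must land "in the right places."

```python
from functools import reduce
from operator import xor

# Toy extended Hamming (8,4) code: index 0 is an overall parity bit,
# indices 1, 2 and 4 are Hamming parity bits, indices 3, 5, 6, 7 hold data.
DATA_POSITIONS = (3, 5, 6, 7)


def encode(data):
    """Encode 4 data bits into an 8-bit codeword (list of 0/1)."""
    word = [0] * 8
    for pos, bit in zip(DATA_POSITIONS, data):
        word[pos] = bit
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = reduce(xor, word[1:])  # makes the total number of 1s even
    return word


def decode(word):
    """Return (status, data) the way a single-correct/double-detect decoder would."""
    word = list(word)
    # The syndrome is the XOR of the indices of all set bits in positions 1..7.
    syndrome = reduce(xor, (pos for pos in range(1, 8) if word[pos]), 0)
    parity_odd = reduce(xor, word)
    if syndrome == 0 and not parity_odd:
        status = "clean"
    elif parity_odd:
        # Odd parity: the decoder assumes exactly one flip and "repairs" it.
        # A syndrome of 0 here points at the overall parity bit itself.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but nonzero syndrome: two flips, cannot be repaired.
        status = "uncorrectable"
    return status, [word[pos] for pos in DATA_POSITIONS]


def flip(word, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(word)
    for pos in positions:
        out[pos] ^= 1
    return out


data = [1, 0, 1, 1]
stored = encode(data)
one_flip = decode(flip(stored, [5]))
two_flips = decode(flip(stored, [5, 6]))
three_flips = decode(flip(stored, [1, 2, 4]))
```

With one flip the decoder reports a correction and returns the original data. With two it reports an uncorrectable error, the case that crashes a process. With three it again reports a routine correction but hands back different data, which is the silent corruption an attacker is after.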
Until now, there has been little public information about how ECC works. VUSec researchers spent months reverse-engineering the process, in part by using syringe needles to inject faults into the chips and by subjecting the chips to a cold-boot attack. By extracting data stored in the supercooled chips as they experienced errors, the researchers were able to learn how computer memory controllers process ECC control bits.
Here’s a video of the researchers using the cold-boot technique:
Cold-Boot attack for reverse-engineering error-correcting code (ECC).
And here’s a video of the syringes inducing faults:
Memory bus error injection with two syringe needles.
The researchers eventually discovered a timing side channel. By carefully measuring the amount of time it takes to carry out certain operations, the researchers were able to deduce granular details about the bitflips occurring in the silicon. In a blog post, the researchers wrote:
Armed with this knowledge, we went on to show that ECC only slows down the Rowhammer attack and is not enough to stop it. Intuitively, the method is fairly straightforward. Remember that we need three bitflips, while avoiding a situation in which only two bitflips occur. The first thing we discovered was a technique to ensure that, at most, one specific bitflip occurs in a memory word. The trick is simple: we make sure that all the bits in the location we hammer and the bits in the location we want to attack are the same, except one. If the bits in the same position in the two locations are the same, no bitflip will occur. If they are different, the bit may flip. So we are free to try to flip first bit 1, then bit 2, then bit 3, etc. At first sight, that seems pointless. After all, ECC will simply correct that single bitflip and it will look like nothing happened.
A timing trick
As the saying goes: one flip is no flip. However, this is not entirely true. What we found is that we can tell that a bit has been corrected by means of a timing side channel. In short: it takes measurably longer to read from a memory location where a bitflip needs to be corrected than it takes to read from an address where no correction is needed. Therefore, we can try each bit in turn until we find a word in which we can flip three bits that are vulnerable. The final step is then to make all three bits in the two locations different and hammer one last time, to flip all three bits in one go: mission accomplished.
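The search the researchers describe can be sketched as a pure simulation. Nothing below touches real memory or reproduces the VUSec code: the `SimulatedWord` class, the latency constants, and the set of vulnerable bits are all invented for illustration. The model also assumes that any three simultaneous flips escape ECC and ignores the direction in which a cell flips, both of which are simplifications.

```python
WORD_BITS = 64
BASE_NS = 100        # made-up latency of an ordinary read
CORRECTION_NS = 400  # made-up extra latency when ECC repairs a bit


class SimulatedWord:
    """Toy model of one ECC-protected 64-bit victim word (not real hardware)."""

    def __init__(self, vulnerable_bits):
        self.vulnerable = set(vulnerable_bits)

    def hammer_and_time(self, victim, aggressor):
        """Hammer with the aggressor pattern, then return (value_read, latency_ns)."""
        diff = victim ^ aggressor
        differing = {b for b in range(WORD_BITS) if (diff >> b) & 1}
        # Only bits that differ between the rows AND are vulnerable can flip.
        flipped = differing & self.vulnerable
        if not flipped:
            return victim, BASE_NS
        if len(flipped) == 1:
            # ECC repairs the single flip, but the read is slower.
            return victim, BASE_NS + CORRECTION_NS
        if len(flipped) == 2:
            raise RuntimeError("uncorrectable ECC error (simulated crash)")
        # Assumption: three or more flips go unnoticed by ECC in this toy model.
        corrupted = victim
        for b in flipped:
            corrupted ^= 1 << b
        return corrupted, BASE_NS


def find_vulnerable_bits(word, victim, needed=3):
    """Probe one bit at a time and use read latency to spot corrected flips."""
    threshold = BASE_NS + CORRECTION_NS // 2
    found = []
    for bit in range(WORD_BITS):
        aggressor = victim ^ (1 << bit)  # identical to the victim except one bit
        _, latency = word.hammer_and_time(victim, aggressor)
        if latency > threshold:
            found.append(bit)
            if len(found) == needed:
                break
    return found


def ecc_bypass(word, victim):
    """Flip three known-vulnerable bits in one go, or return None if too few exist."""
    bits = find_vulnerable_bits(word, victim)
    if len(bits) < 3:
        return None
    aggressor = victim
    for b in bits:
        aggressor ^= 1 << b
    value, _ = word.hammer_and_time(victim, aggressor)
    return value


target = SimulatedWord({5, 17, 40})
located = find_vulnerable_bits(target, victim=0)
result = ecc_bypass(target, victim=0)
```

Because each probe differs from the victim in exactly one bit, the simulated attacker never triggers the two-flip crash while searching, and only commits to the three-bit pattern once all three positions are known.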
There is no imminent danger
The researchers tested ECCploit on four hardware platforms, including:
- AMD Opteron 6376 Bulldozer (15h)
- Intel Xeon E3-1270 v3 Haswell
- Intel Xeon E5-2650 v1 Sandy Bridge
- Intel Xeon E5-2620 v1 Sandy Bridge
The researchers said they tested “several memory modules from different manufacturers” and confirmed that a significant number of Rowhammer bitflips occurred in a type of DIMM tested by a different group of researchers. VUSec researchers declined to identify the DIMM manufacturers.
As previously noted, ECCploit targets DDR3 DIMMs (although in fairness, the researchers say they believe some of the telltale side channel is present in DDR4). There is also no indication that ECCploit works reliably against commonly used endpoints in cloud environments such as AWS or Microsoft Azure.
In a statement, a Microsoft official wrote: “We are constantly monitoring and testing the security of our services against Rowhammer attacks, including extreme attack scenarios beyond realistic scenarios. This testing includes the techniques described in this paper, which are not a threat to our services.” The statement did not elaborate. Amazon officials did not respond to an email seeking comment for this post.
The upshot: while ECCploit represents a significant advance that could (a) leave some servers vulnerable or (b) open systems to future attacks, there is no indication ECCploit currently poses a threat to large cloud providers.
“Overall, this is amazing work that will help hardware manufacturers improve their defenses against this class of attacks, but we don’t (yet) have direct evidence of any widespread vulnerability on public cloud providers,” Kenn White, an independent researcher who specializes in cloud security, told Ars. “I don’t want to come across as a bratty guy in the balcony, because this was painstaking work that took hundreds of hours to pull off. But unless you can show a real exploit, you’re in the realm of endpoints and home hardware.”