Meltdown and Spectre created something of a meltdown in the cloud computing world. By extension, the flaws found in the processors at the heart of much of the world’s computing infrastructure have directly or indirectly affected today’s Internet-based services. That’s especially true for one variant of the Spectre vulnerability, revealed unexpectedly early by Google on January 3, because this particular vulnerability could allow malware running in one user’s virtual machine or other “sandboxed” environment to read data from another, or from the host server itself.
In June 2017, Intel learned of these threats from researchers who kept the information under wraps so that hardware and software vendors could work on fixes. But while companies like Amazon, Google, and Microsoft were notified early because of their “Tier 1” status, smaller infrastructure companies and data center operators were left in the dark until the news broke on January 3. This sent many organizations scrambling immediately: they had no warning of the exploits before proof-of-concept code for them was already public.
Tory Kulick, director of operations and security at the hosting company Linode, describes the situation as chaos. “How could something this big show up like this without any proper warning? We feel out of the loop, like, ‘What are we missing? These POCs (proofs of concept for the exploits) are out there now?’ Everything is going through my mind.”
“When this thing broke, no one heard a peep from Intel or from anyone else directly,” Zachary Smith, CEO of the hosting service Packet, told Ars. “All we could see is what was on the Google blog about how to exploit this thing. So we were all scrambling. The big guys, Google, Amazon, and Microsoft, had at least 60 days of time to prepare, and we had negative time to prepare.”
Even the groups behind some operating system distributions, including the developers of the BSD distributions, were unaware of the flaws until Google published its Project Zero blog post. “Only Tier-1 companies received advance information, and that is not responsible disclosure – it is selective disclosure,” Theo de Raadt, leader of the OpenBSD project, said in an interview with ITWire. “Everyone below Tier-1 has just gotten screwed.”
The nature and timing of Google’s disclosure, driven at least in part by independent discoveries of the vulnerabilities, have made the response even more chaotic and painful for cloud hosts and users. Firmware updates carrying processor microcode fixes have been released and, in some cases, later recalled. Some applications have taken a big performance hit. And no one is really sure how all the software changes and firmware patching will affect cloud services as they roll out.
So to cut through the confusion, these companies did something novel: they decided to work together. A group of second-tier service providers has banded together to share accurate information about patches from multiple vendors, metrics on their effectiveness, and best practices for rolling them out. In the past week, this ad hoc consortium (a group of fewer than 25 companies communicating over a simple Slack channel) has attracted several high-profile members, including Netflix and Amazon Web Services. And this informal grouping has even allowed the researchers originally behind the Spectre/Meltdown discovery to interact directly with the companies involved.
“Probably one of the best things that came out of the whole ordeal was this cloud hosting partnership,” Linode’s Kulick said. “Sharing links and things like that are very important.”
And Kulick, like others in the group, hopes this event will lead to more frequent collaboration across the industry, giving small organizations and large cloud customers a seat at the table for future security issues of this magnitude.
“Our company has grown,” Smith said. “We’re not a ragtag group of people running small hosting racks and putting some websites online. We run important parts of people’s lives on our infrastructure for them, and it’s going to be a problem if we don’t find a way to come together.”
“Thank God this was not a state actor,” Smith added.
A dumpster fire started
As the world shook off the ravages of New Year’s Eve, another headache was taking shape in the chatter on Slack channels at Packet, a “bare metal” hosting company based in New York.
“Monday night and Wednesday, some of the AMD activity and comments on Kernel.org started showing up in our internal Slack channels,” said Smith. (Kernel.org is where contributors push new updates to parts of the Linux kernel.) “We host Kernel.org, so we watch it closely as well. Everyone was like, ‘Something’s going on.’”
There had been a long discussion in the Kernel.org change logs, dating back to May 2017, about a new feature called KAISER (“Kernel Address Isolation to have Side-channels Efficiently Removed”). This feature was prompted by long-standing concerns about the potential for attacks such as Meltdown and by the theoretical evidence underlying Spectre. Commits for KAISER began about a month before Meltdown and Spectre were reported to Intel, so work was already underway to try to mitigate the threat of these classes of attacks. By the time Packet and others started paying attention, kernel updates related to KAISER had been arriving with increasing frequency, and with more subtle indications of a potential exploit, as the year went on.
“I think people were seeing things through the commits and they were starting to piece it together,” Kulick said.
A comment attached to a Linux kernel patch by AMD’s Tom Lendacky on December 27 really raised awareness, angering executives at many of the companies that knew about the vulnerabilities. The comment essentially spelled out AMD’s position at that point on the bugs: the company believed that its processors were not subject to the types of attacks that the kernel’s page table isolation feature protects against. AMD also stated that its microarchitecture does not allow memory references, including speculative references, to access higher-privileged data when running in a lesser-privileged mode when that access would result in a page fault.
Of course, AMD’s architecture would later turn out not to be as immune to side-channel attacks as Lendacky had suggested.
“AMD didn’t help with their snarky kernel comment,” said Smith, who suggested the comment may have played a role in Google’s early release of information on Spectre and Meltdown. Even if it did, however, other researchers had started to discover the flaws behind Spectre and Meltdown independently: researcher Anders Fogh wrote publicly in late July of last year about what would later be defined as Meltdown.
Whatever prompted the high-profile early disclosure, Jann Horn of Google’s Project Zero security research group published details of Meltdown and Spectre on January 3, about a week before the originally planned disclosure of the vulnerabilities. At that point, according to Smith, “you know, all hell broke loose.”
Kulick said he thought Google’s announcement caused problems, but “even if it had been revealed on the ninth as planned, we would all have been in a bad position. It would have been a different thing if there had been lead time.”
Given how dependent all kinds of applications have become on cloud services, it’s telling that no one at Intel, Red Hat, AMD, or Google appears to have thought about anyone outside the top tier of hardware and software makers.
“The Tier 2 providers that are represented in this small task force collectively control hundreds of thousands, if not millions, of servers,” Smith said. “But each of us is smaller… Google didn’t think to call Packet. Intel didn’t think to call Packet, and they certainly didn’t call OVH or DigitalOcean. And yet we matter as much from a customer perspective, because our customers need a lot more help.”
Once the details came out, communications from Intel, AMD, and other hardware vendors about Spectre and Meltdown were (and have continued to be) spotty. Even today, there is no central communication channel for everyone involved. “My view is that (Intel’s communications with customers) are going through different teams depending on the regions,” Kulick said. “They’ve been getting hit pretty hard, so there are delays in communication.”
“Intel is behind the eight ball,” Smith said. He suggests Intel is too consumed with its public relations problem and is not focused on talking with customers like him. “I’ve encouraged (Intel) … I’m asking their data center team to do some kind of online communication to answer questions. We need to have some open communication. Not all of it will be good, but we have to work together; people have to listen. And I think our community wants to help. We just need to have more of an open dialogue.”
The communication problem has not been helped by the absence of any kind of established channel for communication. “Frankly, this shows the state of the public cloud industry,” Smith said. “We don’t have really good working groups. So where, if it’s Red Hat or it’s Intel or it’s Supermicro, do you go, under some kind of common code, to work with everyone around a security issue? There is no such place.”