PCI (GPU) passthrough hardening: option ROM edition #1087
Comments
The MSI boards already have global Option ROM disable, but couldn't convince them to make them per-Slot granularity. It is either fully on, fully off, or GPU Option ROMs only.
You want to enable Secure Boot, then tell it to skip checking boot loaders so that it validates Option ROMs only. So, Point 8 here: #929
And how can you possibly do that? As far as I know, VBIOS flashing works by using vendor tools that tell the GPU to use its internal I2C/SPI/whatever controller to flash the ROM. If you passed the card through, these vendor tools would work as in a bare-metal environment, and I don't see how you are going to block that.
Sure, you can try asking one of the third party vendors to put a Flash Write disable jumper, which I recall having seen in a few historical PC Motherboards when Flash ROM was first introduced. But unless you're a high bidder that can ask for a few thousand custom cards it is not gonna happen, so I won't bother with it. |
This may be enough, if there are no other devices needing Option ROM. In this particular case we are talking about a laptop, so the customization is limited (yes, I know you can still attach almost any PCIe device, but it's much less common in practice).
Yes, exactly.
Well, I want to enable it for Option ROM even if you need it disabled for the OS.
Yes, exactly, that's the problem.
That doesn't help much if the internal flash can still be modified, even if it is not loaded by that VM. The Option ROM could still be changed and would be used by firmware on the next reboot (unless you do a similar trick in firmware to side-load the Option ROM?).
Yes, it would technically be a better solution (as it's more comprehensive than just Option ROM), but it is not feasible at this scale. |
I don't see that like a problem you can actually fix.
Not if you disable loading Option ROMs. It may also be possible to hash the Option ROM (Point 7 of my writeup) so that you know it didn't change. And yeah, putting an Option ROM in Firmware and loading it for that device instead of its own one, QEMU style, should also be possible. |
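For reference, the "hash the Option ROM so you know it didn't change" idea can be tried from a running Linux system today. This is only a rough sketch: it uses the sysfs `rom` file that Linux exposes per PCI device, the device path in the comment is a made-up example, and the plain SHA-256 over the whole ROM blob is my assumption for illustration. It is not necessarily the digest firmware would record (UEFI Secure Boot uses an Authenticode-style hash of the PE image, not a flat file hash).

```python
import hashlib
from pathlib import Path


def read_option_rom(device_dir: str) -> bytes:
    """Read the expansion ROM Linux exposes for one PCI device (needs root).

    device_dir is a sysfs directory such as
    /sys/bus/pci/devices/0000:01:00.0 (hypothetical address).
    """
    rom = Path(device_dir) / "rom"
    rom.write_text("1")  # writing "1" enables reads of the ROM file
    try:
        return rom.read_bytes()
    finally:
        rom.write_text("0")  # disable ROM reads again


def looks_like_pci_rom(image: bytes) -> bool:
    """PCI expansion ROM images start with the 0x55 0xAA signature."""
    return len(image) >= 2 and image[0] == 0x55 and image[1] == 0xAA


def rom_digest(image: bytes) -> str:
    """Flat SHA-256 over the ROM bytes (an assumption, see the note above)."""
    return hashlib.sha256(image).hexdigest()
```

Comparing `rom_digest(read_option_rom(...))` before and after a VM had the device would at least show whether the ROM contents visible to the host changed. It says nothing about other firmware stored on the card.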
What I care about is for a reflashed GPU (which we already established is hard to prevent in the first place) to not be able to attack the host. There are many ideas for how to achieve that - in the issue description, comments, and the other issue. |
The only problem I see here is that it is not validated. We would need grants to enable the DMA attacking tool in the automation process. We have capable hardware. That could confirm in every release that DMA protection is correctly applied.
OptionROMs are typically signed by
This is an exciting part. Do you have any examples of such issues? Because of that, OCP requested a standard update mechanism for GPU firmware, and a document was created.
This probably was already requested and at least partially implemented. Please check #139
And for complex modern devices, that can be the core issue.
To prevent soft-bricking, one would imagine that boot firmware would detect that fact and warn the user or even not allow the user to self-soft-brick.
This and many other improvements could be employed in UEFI Secure Boot. I already have a ton of requirements in that space. I will explore our options as part of my training campaign in 2025. It may not be hard to implement that at least partially.
TPM measurement + event log? There are also UEFI variables dedicated to exposing firmware capabilities to OS like OsIndicationsSupported.
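To make the `OsIndicationsSupported` suggestion concrete, here is a small sketch of how an OS-side tool could decode that variable on Linux through efivarfs. The bit names follow the UEFI specification's `EFI_OS_INDICATIONS_*` definitions; the helper name and the idea of using this for Option ROM policy are assumptions. As far as I can tell, none of the currently defined bits describe Option ROM verification, so a dedicated variable or a new bit would be needed.

```python
import struct
from pathlib import Path

EFI_GLOBAL_VARIABLE = "8be4df61-93ca-11d2-aa0d-00e098032b8c"
OS_INDICATIONS_SUPPORTED = (
    Path("/sys/firmware/efi/efivars")
    / f"OsIndicationsSupported-{EFI_GLOBAL_VARIABLE}"
)

# Bit definitions from the UEFI specification (EFI_OS_INDICATIONS_*).
FLAGS = {
    0x01: "BOOT_TO_FW_UI",
    0x02: "TIMESTAMP_REVOCATION",
    0x04: "FILE_CAPSULE_DELIVERY_SUPPORTED",
    0x08: "FMP_CAPSULE_SUPPORTED",
    0x10: "CAPSULE_RESULT_VAR_SUPPORTED",
    0x20: "START_OS_RECOVERY",
    0x40: "START_PLATFORM_RECOVERY",
    0x80: "JSON_CONFIG_DATA_REFRESH",
}


def decode_os_indications(raw: bytes) -> list[str]:
    """Decode an efivarfs read of OsIndicationsSupported.

    efivarfs prepends 4 bytes of variable attributes to the payload,
    so the 64-bit value starts at offset 4.
    """
    if len(raw) < 12:
        raise ValueError("variable too short")
    (value,) = struct.unpack_from("<Q", raw, 4)
    return [name for bit, name in FLAGS.items() if value & bit]


if __name__ == "__main__" and OS_INDICATIONS_SUPPORTED.exists():
    print(decode_os_indications(OS_INDICATIONS_SUPPORTED.read_bytes()))
```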
I guess we should employ guidance from here and expose things in ACPI DMAR table, some information already should be there, but the point is there is no validation of that.
There are better directions than this. Relying on some custom coreboot files exposed will create technical debt, and appropriate mechanisms already exist in the UEFI world. We should ask what to do with non-UEFI builds. Still, I think we should get back to the question of what the standard behavior OSes use for such capability is, and standard most likely will mean what Windows uses for that. Also, checking the Linux approach would be useful.
@zirblazer It is hard to read your write-up. It should be split, TBH. Every point is separate (it could be linked for better context).
@marmarek I don't think it is possible to make boot firmware responsible for controlling peripheral updates when those peripherals have their closed-source verification mechanism. We cannot handle all possible mocking of buses in the system without affecting correct operation. Unless we reach SPDM and device authentication for the whole system, the feature is unlikely to be implemented.
Getting updates only from reasonably trustworthy sources with known paths for escalation, e.g., LVFS, can be done, but that does not prevent malicious actors from gaining privileges in the system and abusing those to deliver the wrong firmware to peripherals if those allow unauthenticated updates. That is on the peripheral vendor to provide the correct update mechanism or on the open-source firmware community to deliver support for a transparent mechanism.
The best thing we can do is to look for best practices regarding peripheral firmware updates, test that on given hardware, and provide advice on what hardware is recommended now. Even together, we do not have enough resources to solve that problem.
P.S. Maybe this is good discussion for December DUG? |
Sure, but then you have issues with blob redistribution 😫 To prevent softbricks but still ensuring oprom integrity, I was thinking about something along these lines:
and these db and dbx would need to be independent of secureboot, ideally with an option to use secureboot or these separate oprom db / dbx |
Generally this looks like a good plan. I have just one concern:
If that optionrom was malicious, then since it got loaded it could modify the firmware to avoid the warning. Or it could display something else here, including a different hash (than the one that actually got computed and later added to db/dbx). |
And one more thing: I'd like to see from the OS level if the current optionrom for a given device is included in db/dbx. This way, I can see if the trusted hash was recorded before connecting the dGPU to an untrusted VM for the first time (and if not - ask the user to reboot first, to record trusted hash before potentially having it reflashed by untrusted VM). |
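A rough idea of what such an OS-level check could look like, assuming the trusted hashes end up in a standard UEFI signature database: the sketch below walks the `EFI_SIGNATURE_LIST` structures of a `db`-style variable and collects the SHA-256 entries. The structure layout and GUIDs are from the UEFI specification. The efivarfs path in the comment applies to the regular Secure Boot `db`; whether a separate Option ROM database would use the same format is an assumption, and so is the way the digest of the current Option ROM would be obtained for comparison.

```python
import struct
import uuid

# EFI_CERT_SHA256_GUID from the UEFI specification.
EFI_CERT_SHA256_GUID = uuid.UUID("c1c41626-504c-4092-aca9-41f936934328")
LIST_HEADER_SIZE = 28  # GUID (16) + three UINT32 size fields


def sha256_entries(var_data: bytes) -> set[bytes]:
    """Collect SHA-256 digests from concatenated EFI_SIGNATURE_LISTs.

    var_data is the variable payload, e.g. the contents of
    /sys/firmware/efi/efivars/db-d719b2cb-3d3a-4596-a3bc-dad00e67656f
    with the 4-byte efivarfs attribute prefix already removed.
    """
    digests: set[bytes] = set()
    offset = 0
    while offset + LIST_HEADER_SIZE <= len(var_data):
        sig_type = uuid.UUID(bytes_le=var_data[offset:offset + 16])
        list_size, header_size, sig_size = struct.unpack_from(
            "<III", var_data, offset + 16
        )
        if list_size < LIST_HEADER_SIZE or offset + list_size > len(var_data):
            raise ValueError("malformed EFI_SIGNATURE_LIST")
        # Each entry is a 16-byte SignatureOwner GUID followed by the digest.
        if sig_type == EFI_CERT_SHA256_GUID and sig_size == 16 + 32:
            pos = offset + LIST_HEADER_SIZE + header_size
            end = offset + list_size
            while pos + sig_size <= end:
                digests.add(var_data[pos + 16:pos + sig_size])
                pos += sig_size
        offset += list_size
    return digests


def is_enrolled(digest: bytes, var_data: bytes) -> bool:
    """True if the given 32-byte digest is present in the database."""
    return digest in sha256_entries(var_data)
```

With something like this, the OS could refuse (or warn about) attaching the dGPU to an untrusted VM until the digest of the current Option ROM shows up in the database.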
It all looks well, but from vboot autopsy, we know that warnings mostly scare users rather than do anything good. The whole thing should be rather optional at build time for security-conscious people.
Why independent? You will have to create another verification mechanism alongside Secure Boot. Also, how will you bypass secureboot over your own verification mechanism? Or will it be just another layer on top of Secure Boot? Other relevant requests: #929 (point 7) |
I'm not sure if independent db/dbx is needed, but I'd like to have independent options for this - for example to enable option rom verification, while still allowing to boot any kernel. |
I've had a longer think about this: it makes sense to reuse secureboot for this, but add a mode for option rom verification only. Users would enroll their GPUs while SB is in setup mode, then set SB to user mode to enable enforcement. If another console (serial or GOP) is available, then we can defer loading and have a popup asking whether to load. EFI has EFI_DEFERRED_IMAGE_LOAD_PROTOCOL exactly for this, so that part is easy-ish. With regards to brick prevention (GOP or GPU changes while oprom verification is enabled, no other consoles are available), I'm not sure if there's any way to do this securely:
The safest option would be to deny execution always, but have an external way to reset settings (e.g. CMOS reset). |
This can be made super easy if you only support platforms that have a static trusted GPU, namely, any Intel or AMD Processor with an integrated GPU, or platforms with a BMC that also provides its own GPU. This feature is completely unviable on platforms where you need video output working to present an interface for the user to authorize Option ROM in the first place.
I think only nVidia GeForce 2xxx series had a builtin XHCI Controller, it was removed on the next generation and I don't recall Radeons implementing this. |
My friend's 6800XT has USB-C with USB 3 and DP, 7000 series also have this
Fair enough. I was mostly thinking about MSI users with K-series CPUs, for example |
You're right, Radeons began implementing an XHCI Controller too. Visible in lspci and all that.
K no, F series. K includes IGP, except the KF ones. Fixable by avoiding purchasing any F series. And, as a matter of fact, for Coreboot purposes I always recommended to buy Processor with IGP because dGPU compatibility was never perfect anyways, so you were already risking it. |
IMHO it's okay tradeoff to support the most strict option only if there is always trusted iGPU. That's also why I'm asking to have those settings be visible from the OS - so I can inform the user if GPU passthrough is safe or not, and inform about associated risks. |
@marmarek Checking such feasibility on the OS side would be great, since firmware releases are way slower than software releases. Quite some people would love to see a Qubes-certified laptop with NVIDIA dGPU. |
Related comment: due to the fact that we can no longer get new stock of the NVIDIA variants of the V54 and V56 Series, it will be financially unfeasible to release a Heads firmware version for those NVIDIA variants. The last deliveries will take place within two months from now and we expect to have enough stock for over one year. |
With integrated GPU and ASPEED BMC graphics you always have a native initialization of that GPU in coreboot, either by FSP for Intel iGPU we have to trust anyways or libgfxinit, or coreboot native code (ASPEED BMC) . So integrated GPUs are really out of scope here. |
@miczyg1 I think the point here is that with native gfx init in coreboot, the initialization code can't be as easily reflashed as an oprom, so there's always a "trusted" GPU driver in the system, which invalidates the entire soft brick problem. |
The problem you're addressing (if any)
Using GPU (or any PCI device for that matter) passthrough with a less trusted VM may allow it to reflash firmware of such device. Just after reboot (during firmware and OS startup) such device is not isolated in a VM and may try to compromise the whole host. This can be done in at least two ways:
Theoretically, reflashing malicious firmware should not be possible due to (at least) the signature check done by the GPU firmware update mechanism, but history shows this check sometimes turns out to be buggy, ineffective, or in some cases even non-existent.
Describe the solution you'd like
I see two solutions:
In either case, there needs to be a mechanism for the OS to verify if the mechanism was enabled to inform the user if passthrough is safe for a given device. And similarly, OS needs to be informed if early boot DMA was enabled. Maybe there is some ACPI table that can be used to pass this info to the OS? Or maybe OS can inspect coreboot config (cbfs?) to check if the option is enabled?
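On the ACPI question, one existing candidate is the DMAR table, whose header carries a flags byte. The sketch below reads it on Linux. The offsets and flag names follow the Intel VT-d specification's DMAR layout (36-byte ACPI header, host address width at offset 36, flags at offset 37). Whether `DMA_CTRL_PLATFORM_OPT_IN_FLAG` is a sufficient signal for the early-boot DMA case described here is an assumption that would need checking against the spec and against what the firmware actually does.

```python
from pathlib import Path

DMAR_TABLE = Path("/sys/firmware/acpi/tables/DMAR")  # readable by root

# Flag bits of the DMAR table header (Intel VT-d specification).
INTR_REMAP = 0x01
X2APIC_OPT_OUT = 0x02
DMA_CTRL_PLATFORM_OPT_IN_FLAG = 0x04

FLAGS_OFFSET = 37      # after the 36-byte ACPI header and 1-byte address width
MIN_TABLE_SIZE = 48    # header + width + flags + 10 reserved bytes


def dmar_flags(table: bytes) -> int:
    """Return the flags byte of a raw DMAR table."""
    if len(table) < MIN_TABLE_SIZE or table[:4] != b"DMAR":
        raise ValueError("not a DMAR table")
    return table[FLAGS_OFFSET]


def platform_opted_in(table: bytes) -> bool:
    """True if firmware set DMA_CTRL_PLATFORM_OPT_IN_FLAG."""
    return bool(dmar_flags(table) & DMA_CTRL_PLATFORM_OPT_IN_FLAG)


if __name__ == "__main__" and DMAR_TABLE.exists():
    print("platform opt-in:", platform_opted_in(DMAR_TABLE.read_bytes()))
```

This only covers the DMA half of the question. The Option ROM policy would still need its own reporting channel.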
Where is the value to a user, and who might that user be?
Use GPU passthrough with reduced risk of compromising the whole system.
Describe alternatives you've considered
An alternative solution could be reliably blocking the VM from reflashing dGPU firmware, and ensuring that device reset on reboot works reliably too. In other words - ensure that all VM-controlled state is discarded on reboot.
I think this solution would require changes to the board design, and thus be significantly harder to make in practice.
Additional context
We consider making a feature like this mandatory for allowing Qubes OS certification of systems with dGPU. Without such feature, we don't consider dGPU passthrough safe enough to certify such system, and thus it doesn't make much sense for users to buy systems like this if dGPU would be allowed only in dom0, as it would be mostly wasted.
This is especially relevant for V5x models with nvidia.