-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add instruction decoding and emulating support for IOIO instructions #391
Conversation
1bc8213
to
a16cdf3
Compare
Please rebase on main, thanks! |
SVSM fails to boot in debug mode (RELEASE=0) starting from commit 8188603 "kernel/cpu/vc: Simplify IOIO instruction handling in #VC handler":
However, this boots fine (and test in SVSMs pass as well) on AMD for release builds. |
Rebased and force updated. |
Thanks for testing this on AMD machine! I think this issue is related with the stage2.bin size which is increased quite a bit in DEBUG version by using the instruction decoder crate to emulate the in/out instructions, and results in the heap area becomes smaller. So in the latest update, that commit is refactored only for the SVSM #VC handler and leaving stage#2 #VC handler as before. The DEBUG version worked fine from my side on TDX machine because I made some hack to increase the stage2 boot memory larger than 640K, and the reason I increased this because stage2.bin size did become larger, which I thought was caused by introducing TDX specific code. But seems using the instruction decoder to emulate in/out instructions in stage2 also increased the size. Thanks again for the testing! Please let me know if this issue can still be observed. |
The #VC handler is panicking during one test-in-svsm:
I'm investigating this :) |
Thank you for investigating this. This error represents svsm kernel is failed to translate a virtual address with checking permissions in the page table. So what I can image is that something might be wrong in the translate_addr_region() or the check_access_rights() functions. Please let me know if I can provide any help on this. |
Good catch, I also doubted it was a binary size issue, it already happened in the past. I'm not sure if we'll need INS/OUTS in stage 2, so enabling this only in the SVSM #VC handler makes sense to me. However, I think we'll need to really discuss this stage2 binary size issue with Jörg once he's back since this is a recurrent problem now |
Some hint from my side, according to the reported error code 9, |
c06e73b
to
73304d6
Compare
In in-svsm testing, it hangs now before jumping to the kernel main:
I suspect a kvm panic that doesn't show on my screen. This doesn't happen in a regular (non in-svsm testing) SVSM boot |
A possible reason may still be the stage2 binary size. If the size of stage2 is too large, the remaining memory for heap in stage2 might not be large enough to load svsm kernel. So, is it possible to try with "make test-in-svsm RELEASE=1"? Suppose the release version should give a smaller stage2 binary. I also want to do some investigation about the in-svsm test failure, but seems I didn't encounter the hang issue. From my side, there is a panic error which is reasonable as the in-svsm test expects to run on SEV, not TDX.
Maybe it is because TDX doesn't have #VC handler and didn't use insn_decode crate for stage2 (although SEV stage2 #VC handle doesn't use emulation API provided by insn_decode crate, but it is still using the decoding API), thus no significant impact to stage2 binary size. So probably it is worth to have a try with release version on SEV to see if there is any difference? |
So this bug indeed doesn't show up with |
By looking into the compiling log of running "make test-in-svsm RELEASE=0", seems cargo still applies the parameter
Thanks! Please let me know if I can provide any help, e.g., making any debugging patches |
A few more hints here: |
Thanks so much for the investigation! On TDX machine, the Based on this, if I hard-code |
Hi @p4zuu, I noticed that PR #432 has been merged. When you have a moment, could you please retest this PR to check if the boot hang issue, as mentioned in the comment, has been resolved? Additionally, this PR has been updated to address the test-in-svsm failure on SEV. While I was able to replicate the failure manually on a TDX machine and test the fix, I haven't had the opportunity to verify it on an actual SEV machine. Could you assist by running the test-in-svsm on SEV to confirm whether the issue is indeed fixed? Thank you for your help! |
Yes, correct.
The data should be in the .rodata section of the svsm kernel ELF. The virtual address of the data is
This decoded virtual address is translated to physical address As my knowledge on TDX, the svsm kernel ELF is mapped starting from physical address For the
The test_port_io_string_16_get_last with striping off SEV functions can run successfully on TDX. Not sure if the failure on SEV is still related with the private/shared bit, as this part is different between SEV and TDX. Another suspect is about the data value in the address
So could you please help to dump the .rodata section to see if the value in address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pointed out the bug in my review. As a general note I question the need for re-mapping already mapped memory for the sake of instruction emulation in a VC/VE handler. For these use-cases the code can just access the virtual addresses directly using a pointer guard like GuestPtr
.
The consideration is to make this instruction decoder general for both the SVSM kernel and the guest (SEV-SNP or TDP). If the instruction is from guest, the decoded linear address is mapped in guest's page table and cannot be directly used by the svsm kernel. From another side, the guest memory region represented by the linear address may cross pages, which may not be contiguous. Based on these, translate the guest memory region to a set of physical addresses, and then re-map. Just realized another thing, the guest memory is not statically mapped in svsm kernel's page table, right? Then seems we need to do the re-map anyway because guest memory doesn't have an accessible virtual address in the svsm kernel. |
I understand and agree with the general need for these abstractions. You are right that for decoding and emulating instructions from the guest they are absolutely necessary. But for emulation in VE/VC handlers there can be a much simpler abstraction which accesses the virtual addresses directly (protected by an exception guard).
Yes, guest-memory accesses always need the re-map. Note that this also means guest page-table traversals. |
Just a quick question, will there be VE/VC caused by SVSM user mode in future? The reason I ask because for the VE/VC generated by the user mode, it looks like we still need to do the user page-table traversal. And as the user mode memory is not guaranteed to be contiguous, the user mode virtual address may cross pages, so also need a re-mapping. As currently there is no VE/VC caused by SVSM user mode, as well as no need to decode/emulate for guest, I can remove the page-table traversal and remapping code from this PR, and leverage the |
User-mode will cause VE/VC exceptions, but there is no need to support IOIO/MMIO from user-space for now. If that ever changes the user-mode virtual addresses can still be accessed directly with a guard, there just need to be some extra checks (e.g. whether the decoded instruction only targets user memory and whether the exception reason matches the instruction).
You can leave it there for now, it will be needed for decoding guest instructions in the not too distant future. Just use another abstraction for VE/VC handlers. |
Sure. Thanks for the suggestion! I will refactor for the VE/VC handlers. |
Hi @joergroedel, this PR has been rebased and updated with adding a simpler abstraction for VC/VE handler. The PTE attribute checking for the page table traversal and the guard for memory re-mapping are left there, and the virtual address translation in PageTable are removed from this PR. The considerations are:
So the virtual address translation code in PageTable is removed in this update. Please let me know if you have any concern. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few new comments, otherwise this looks good to me :)
Add instruction decoding logic to support for the outs instruction. The emulation of outs is achieved by reading data from the decoded memory location and writing it to an IO port. The InsnMachineCtx has been expanded with new methods to facilitate reading and writing CPU registers, accessing CPU flags, determining the current privilege level, handling IO out operations, and mapping memory via linear addresses. Additionally, the TestCtx has been augmented to provide read_reg and cpl methods, ensuring compatibility with fuzz testing. Introduce new trait InsnMachineMem to support memory access. This trait includes the read_integer() method, which is essential for reading data from memory when emulating the outs instruction. Signed-off-by: Chuanxiao Dong <[email protected]>
Add instruction decoding logic to support for the ins instruction. The emulation of ins is achieved by reading data from an IO port and then writing it to the decoded memory location. Extend InsnMachineCtx to include a new method to manage IO in operations. Additionally, the InsnMachineMem trait has been extended with a method write_integer for writing integer data to memory, which is crucial for the accurate emulation of the ins instruction. Signed-off-by: Chuanxiao Dong <[email protected]>
DecodeInsnCtx has been enhanced to support the emulation of in and out instructions. The in instruction is emulated by fetching data from the IO port and writing it to the EAX/AX/AL register, while the out instruction is emulated by reading data from the EAX/AX/AL register and sending it to the IO port. Prior to emulation, the IO permission bitmap is consulted to ensure proper access rights. Signed-off-by: Chuanxiao Dong <[email protected]>
Enhance the TestCtx to implement the required methods of InsnMachineCtx. Also introduce TestMem structure which implements InsnMachineMem trait. With these, extend the test to support testing decoding of ins and outs instructions, as well as emulation for ins/outs/in/out instructions. These new tests ensure support for byte, word and doubleword IO port size. The instruction fuzz is also enhanced to test the emulate interface. Signed-off-by: Chuanxiao Dong <[email protected]>
X86ExceptionContext captures the general purpose registers, flags and CS values when exception occured, which can be used to implement the InsnMachineCtx trait method read/write_reg, read_flags and read_cpl. Signed-off-by: Chuanxiao Dong <[email protected]>
Whether an access is permitted by a translation is determined by the access rights specified by the paging-structure entries controlling the translation: paging-mode modifiers in CR0, CR4, and the IA32_EFER MSR; EFLAGS.AC; the supervisor/user mode access. Add PTWalkAttr to include all these factors and implement function check_access_rights() to validate the access rights of a pte. A Ok containing the translated physical address and a bool flag representing if the translated PTEntry is leaf or not will be returned if it passes the validation, otherwise returns Err with the corresponding page fault error code flags. The access rights validation will be used when traversals a page-table to translate a virtual address. Signed-off-by: Chuanxiao Dong <[email protected]>
Introduce MemMappingGuard to represent a temporary memory mapping. It contains a PerCPUPageMappingGuard, which is extended to remap multiple non-contigous 4k pages to a contigous virtual address space. The MemMappingGuard provides read and write that facilitate a specific data type access within the remapped memory using a specified offset. When a MemMappingGuard instance is dropped, the PerCPUPageMappingGuard is also dropped which will unmap the associated memory range. Additionally, the InsnMachineMem trait has been implemented for MemMappingGuard, enabling it to support memory access operations required by the instruction decoder. This is essential for the emulation of specific instructions. Signed-off-by: Chuanxiao Dong <[email protected]>
To handle the instruction emulation for the #VC/#VE happens in the COCONUT-SVSM kernel, implementing the map_linear_addr method for X86ExceptionContext to convert the linear address to the COCONUT-SVSM kernel virtual address by using GuestPtr, which implements the InsnMachineMem. Signed-off-by: Chuanxiao Dong <[email protected]>
NativeIOPort leverages the default implementations of outl and inl provided by the IOPort trait, which utilize the outl/inl assembly instructions. Conversely, SVSMIOPort provides custom implementations of outl and inl that interact with the GHCB interfaces. Signed-off-by: Chuanxiao Dong <[email protected]>
The ioio_perm() is implemented by returning false to indicate that the user mode is not allowed to access any IO port. The ioio_in() and ioio_out() are implemented by calling the relevant in/out interfaces provided by the console IO of the SVSM_PLATFORM. Signed-off-by: Chuanxiao Dong <[email protected]>
The DecodedInsnCtx support emulating IOIO instructions, including in/out/ins/outs, which can be utilized by the SVSM #VC handler to support handling all the IOIO instructions. Signed-off-by: Chuanxiao Dong <[email protected]>
With the #VC handler now supporting ins/outs instructions, the test case test_port_io_string_16_get_last() can be enabled. The test case is also extended to test rep insw by reading data from the test port and writing to the memory. Signed-off-by: Chuanxiao Dong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now.
Add the instruction decoding and emulating support for ins/outs instructions. To support emulating, the InsnMachineCtx has been extended with more methods for the instruction decoder to get CPU state. And implemented these new methods for the X86ExceptionContext.
Beside provides more CPU states, it also provides the memory accessing ability to the instruction decoder via the new trait InsnMachineMem. The PageTable has been extended to support translating virtual addresses to the corresponding physical addresses, with the accessing rights been checked. And the physical addresses are temporarily mapped in the page table with contiguous virtual addresses via a PerCPUPageMappingGuard, which is wrapped in the new structure MemMappingGuard. This new structure implements the trait InsnMachineMem and is used by the instruction decoder to access memory.
Besides emulating ins/outs instruction, in/out instruction can also be emulated. The test cases have been enhanced to cover the emulation for ins/outs/in/out instructions with byte, word and double word size.
In the end, the SEV #VC handler has been simplified by directly using the emulation interface to handle IOIO instructions. The test_in_svsm test_port_io_string_16_get_last() now is enabled. Has set up a TDX machine which can boot the SVSM in TD, so performed similar test as what test_port_io_string_16_get_last() did with TD, but didn't have a chance to test on with SEV due to lack of AMD machine. Appreciate it if someone can help to try this PR on an AMD machine.