Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kernel: run with interrupts enabled when possible #485

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

msft-jlange
Copy link
Contributor

On platforms that support SVSM interrupt delivery correctly (all platforms except SNP without Restricted Injection), the kernel should generally run with interrupts enabled so that interrupts directed at the SVSM can be handled as quickly as possible.

kernel/src/cpu/idt/common.rs Show resolved Hide resolved
kernel/src/cpu/idt/common.rs Outdated Show resolved Hide resolved
kernel/src/cpu/idt/svsm.rs Show resolved Hide resolved
kernel/src/cpu/irq_state.rs Show resolved Hide resolved
kernel/src/platform/tdp.rs Show resolved Hide resolved
kernel/src/sev/ghcb.rs Show resolved Hide resolved
@joergroedel
Copy link
Member

The changes break test-in-svsm in my setup, I need to investigate before merging.

@stefano-garzarella
Copy link
Contributor

The changes break test-in-svsm in my setup, I need to investigate before merging.

Yeah, also on my env:

[SVSM] Launching request-processing task on CPU 0
[SVSM] running 141 tests
[SVSM] test cpu::irq_state::tests::irq_enable_disable ...
[SVSM] ERROR: Panic: CPU[0] panicked at kernel/src/cpu/irq_state.rs:228:9:
assertion failed: irqs_disabled()
[SVSM] ---BACKTRACE---:
[SVSM]   [0xffffff8000112845]
[SVSM]   [0xffffff80000cecb3]
[SVSM]   [0xffffff8000014201]
[SVSM]   [0xffffff8000025021]
[SVSM]   [0xffffff80000ae256]
[SVSM]   [0xffffff800007a795]
[SVSM]   [0xffffff80000377e0]
[SVSM]   [0xffffff800006b52b]
[SVSM]   [0xffffff800006b540]
[SVSM] ---END---

Maybe with commit 78b0a92 we need to adjust tests in kernel/src/cpu/irq_state.rs since we don't start with irq disabled anymore. (I didn't follow the details, just guessing)

@msft-jlange
Copy link
Contributor Author

The changes break test-in-svsm in my setup, I need to investigate before merging.

I've modified the tests to understand the interrupt configuration, so hopefully everything will work now.

@stefano-garzarella
Copy link
Contributor

The changes break test-in-svsm in my setup, I need to investigate before merging.

I've modified the tests to understand the interrupt configuration, so hopefully everything will work now.

Some other tests failing in kernel/src/locking/rwlock.rs :

[SVSM] test locking::rwlock::tests::rw_lock_irq_safe ...
[SVSM] ERROR: Panic: CPU[0] panicked at kernel/src/locking/rwlock.rs:368:9:
assertion failed: irqs_disabled()
[SVSM] ---BACKTRACE---:
[SVSM]   [0xffffff8000112855]
[SVSM]   [0xffffff8000010b46]
[SVSM]   [0xffffff80000210a1]
[SVSM]   [0xffffff8000024871]
[SVSM]   [0xffffff80000ae256]
[SVSM]   [0xffffff800007a795]
[SVSM]   [0xffffff80000377e0]
[SVSM]   [0xffffff800006b52b]
[SVSM]   [0xffffff800006b540]
[SVSM] ---END---
qemu-system-x86_64: terminating on signal 2

But applying the same changes you did also there, everything is fine:

diff --git a/kernel/src/locking/rwlock.rs b/kernel/src/locking/rwlock.rs
index 9d81b7d..f48fa55 100644
--- a/kernel/src/locking/rwlock.rs
+++ b/kernel/src/locking/rwlock.rs
@@ -330,11 +330,12 @@ mod tests {
     #[cfg_attr(not(test_in_svsm), ignore = "Can only be run inside guest")]
     fn rw_lock_irq_unsafe() {
         use crate::cpu::irq_state::{raw_irqs_disable, raw_irqs_enable};
-        use crate::cpu::{irqs_disabled, irqs_enabled};
+        use crate::cpu::{irqs_enabled};
         use crate::locking::*;
 
-        assert!(irqs_disabled());
         unsafe {
+            let was_enabled = irqs_enabled();
+
             raw_irqs_enable();
             let lock = RWLock::new(0);
 
@@ -355,6 +356,10 @@ mod tests {
             // IRQs must still be enabled
             assert!(irqs_enabled());
             raw_irqs_disable();
+
+            if !was_enabled {
+                raw_irqs_disable();
+            }
         }
     }
 
@@ -365,8 +370,9 @@ mod tests {
         use crate::cpu::{irqs_disabled, irqs_enabled};
         use crate::locking::*;
 
-        assert!(irqs_disabled());
         unsafe {
+            let was_enabled = irqs_enabled();
+
             raw_irqs_enable();
             let lock = RWLockIrqSafe::new(0);
 
@@ -389,6 +395,10 @@ mod tests {
             // IRQs must still be enabled
             assert!(irqs_enabled());
             raw_irqs_disable();
+
+            if !was_enabled {
+                raw_irqs_disable();
+            }
         }
     }
 }

BTW, in order not to break the bisection, I think it's better to change the tests before commit 78b0a92 or along with it.

@joergroedel joergroedel added needs-rebase The PR needs to be rebased to the latest upstream branch in-review PR is under active review and not yet approved labels Oct 23, 2024
@msft-jlange
Copy link
Contributor Author

Some other tests failing in kernel/src/locking/rwlock.rs :

Thanks for identifying the problem!

BTW, in order not to break the bisection, I think it's better to change the tests before commit 78b0a92 or along with it.

Fixed.

@stefano-garzarella
Copy link
Contributor

Another issue :-( I guess because we merged new stuff in main:

[SVSM] ERROR: Panic: CPU[0] panicked at kernel/src/locking/spinlock.rs:245:9:
assertion failed: irqs_disabled()

There are some tests in kernel/src/locking/spinlock.rs with the same assumption (irq disabled) that we need to fix.

nit: maybe better to update the title of commit 05082b0 since we are fixing tests not only in cpu/irq_state

Various interrupt-related tests previously assumed that the test
environment would always execute with interrupts disabled.  They must
actually tolerate any interrupt configuration within the kernel
environment.

Signed-off-by: Jon Lange <[email protected]>
Enable the use of interrupts whenever the kernel is executing outside of
an IRQ guard.  This is not safe on SNP systems that do not use
Restricted Injection but is safe in all other configurations.

Signed-off-by: Jon Lange <[email protected]>
Some platforms permit the injection of arbitrary interrupt vectors.
System calls from user mode are implemented as software interrupts, and
handling an injected interrupt as a system call would be a security
issue.  The system call handler must determine whether it was invoked in
an unsafe way.

Signed-off-by: Jon Lange <[email protected]>
Issuing a GHCB call via VMGEXIT requires setting and potentially reading
the GHCB MSR.  Since GHCB calls can be issued by interrupt routines,
interrupts must be disabled during each set-exit-read sequence to
prevent possible tearing due to preemption during such a sequence.

Signed-off-by: Jon Lange <[email protected]>
@msft-jlange
Copy link
Contributor Author

There are some tests in kernel/src/locking/spinlock.rs with the same assumption (irq disabled) that we need to fix.

Thanks for checking! Hopefully I've got them all now.

nit: maybe better to update the title of commit 05082b0 since we are fixing tests not only in cpu/irq_state

There's no obviously good name because there isn't a clear path here, so hopefully the new title of the commit is good enough.

@stefano-garzarella
Copy link
Contributor

There are some tests in kernel/src/locking/spinlock.rs with the same assumption (irq disabled) that we need to fix.

Thanks for checking! Hopefully I've got them all now.

Yeah, no failures running make test-in-svsm on my env.

nit: maybe better to update the title of commit 05082b0 since we are fixing tests not only in cpu/irq_state

There's no obviously good name because there isn't a clear path here, so hopefully the new title of the commit is good enough.

Yep, I think it's fine ;-)

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in-review PR is under active review and not yet approved needs-rebase The PR needs to be rebased to the latest upstream branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants