Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some fixes for the #HV handler #466

Merged
merged 3 commits into from
Oct 15, 2024
Merged

Conversation

Freax13
Copy link
Contributor

@Freax13 Freax13 commented Sep 25, 2024

This PR contains some fixes for the #HV handler.

There's still one remaining problem which is that the IF flag is never ever enabled (except in some tests) and so some parts of the #HV handler are effectively dead code. The reason for that is that irqs_disable remembers the state of the IF when it's called, and if the IF flag is not set, irqs_enable will not set IF. This doesn't seem like intended behavior, but I'm unsure where the best place to set the IF flag would be.

Cc @msft-jlange Could you please test these changes in your environment that has restricted injection support?

@joergroedel joergroedel added the wait-for-review PR needs for approval by reviewers label Sep 26, 2024
Copy link
Contributor

@msft-jlange msft-jlange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a closer look, I don't understand why these changes are needed at all. The address held by HV_DOORBELL_ADDR simply indicates where in the PerCpu structure the pointer to the #HV doorbell page is located. It is not a promise that the current CPU has a doorbell page. In fact it cannot, because in a multi-processor configuration, the APs start running in an environment in which HV_DOORBELL_ADDR is non-null but the local doorbell page has not yet been set up. For this reason, the #HV handling code looks not only at HV_DOORBELL_ADDR to decide whether to process, but also at whether the pointer fetched from that address is non-null. Consequently, I see no fault with the existing code, and no reason to change it.

@Freax13
Copy link
Contributor Author

Freax13 commented Oct 1, 2024

I rebased onto main to resolve some conflicts.

Taking a closer look, I don't understand why these changes are needed at all. The address held by HV_DOORBELL_ADDR simply indicates where in the PerCpu structure the pointer to the #HV doorbell page is located. It is not a promise that the current CPU has a doorbell page. In fact it cannot, because in a multi-processor configuration, the APs start running in an environment in which HV_DOORBELL_ADDR is non-null but the local doorbell page has not yet been set up. For this reason, the #HV handling code looks not only at HV_DOORBELL_ADDR to decide whether to process, but also at whether the pointer fetched from that address is non-null. Consequently, I see no fault with the existing code, and no reason to change it.

Previously HV_DOORBELL_ADDR wasn't properly initialized, it was always 0 on all CPUs at all times. This was because setup_hv_doorbell was called after idt_init, so idt_init couldn't properly setup HV_DOORBELL_ADDR.

@msft-jlange
Copy link
Contributor

Previously HV_DOORBELL_ADDR wasn't properly initialized, it was always 0 on all CPUs at all times. This was because setup_hv_doorbell was called after idt_init, so idt_init couldn't properly setup HV_DOORBELL_ADDR.

I missed that before. In that case, a change is definitely needed. However, I think we still have a subtle issue with the sequence. Once the #HV doorbell page is registered with the hypervisor, it is possible to start receiving events, but if the #HV handler is not prepared to consume them, then they will remain pending indefinitely because NoFurtherSignal will remain set and the hypervisor will not send additional #HV notifications to force processing. The best way to handle this will to ensure that HV_DOORBELL_ADDR is set before the #HV page is registered with the hypervisor. I think that just means moving the call to init_doorbell_addr() so it comes before the call to this_cpu().configure_hv_doorbell().

@Freax13 Freax13 force-pushed the fix/hv branch 3 times, most recently from 15743d7 to a71f499 Compare October 2, 2024 08:40
@Freax13
Copy link
Contributor Author

Freax13 commented Oct 2, 2024

Previously HV_DOORBELL_ADDR wasn't properly initialized, it was always 0 on all CPUs at all times. This was because setup_hv_doorbell was called after idt_init, so idt_init couldn't properly setup HV_DOORBELL_ADDR.

I missed that before. In that case, a change is definitely needed. However, I think we still have a subtle issue with the sequence. Once the #HV doorbell page is registered with the hypervisor, it is possible to start receiving events, but if the #HV handler is not prepared to consume them, then they will remain pending indefinitely because NoFurtherSignal will remain set and the hypervisor will not send additional #HV notifications to force processing. The best way to handle this will to ensure that HV_DOORBELL_ADDR is set before the #HV page is registered with the hypervisor. I think that just means moving the call to init_doorbell_addr() so it comes before the call to this_cpu().configure_hv_doorbell().

Yeah, that makes sense. I restored the previous call graph and changed PerCpu::hv_doorbell_addr to return the address to the hv_doorbell field even if it hasn't been initialized yet. For that to work, I had to change the OnceCell<&'static HVDoorbell> to Cell<Option<&'static HVDoorbell>> because OnceCell doesn't have memory layout guarantees, and Cell does.

The doorbell page is not yet setup by the time idt_init is called, so
HV_DOORBELL_ADDR was always initialized to 0. Instead setup
HV_DOORBELL_ADDR immediatly after setting up the doorbell page.

Signed-off-by: Tom Dohrmann <[email protected]>
The code in the handle_as_hv block expects to be able to fall through
to default_return. The introduction of return_new_task interrupted that
flow. Move return_new_task to another location.

Fixes: 853439e ("kernel/task: Implement hook to run in new tasks")
Signed-off-by: Tom Dohrmann <[email protected]>
Just some typos I noticed while reading the comments.

Signed-off-by: Tom Dohrmann <[email protected]>
@joergroedel joergroedel added in-review PR is under active review and not yet approved and removed wait-for-review PR needs for approval by reviewers labels Oct 2, 2024
Copy link
Collaborator

@roy-hopkins roy-hopkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested this on a 6.11 host that includes restricted injection support and multi-VMPL kvm support. See here: https://github.com/roy-hopkins/linux/tree/svsm_vmpl_vmsa_restinj

All works ok.

@joergroedel joergroedel merged commit c1c9110 into coconut-svsm:main Oct 15, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in-review PR is under active review and not yet approved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants