Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM64 SMP IRQ stack bug modify request #13652

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GUIDINGLI
Copy link
Contributor

@GUIDINGLI GUIDINGLI commented Sep 26, 2024

Summary

ARM64 SMP IRQ stack bug modify request

In previous PR:
#13520

We have fix a SMP bug in ARM64, will cause by don't use interrupt stack in SMP mode.
The rootcause you can find in the SMP description:

config SMP
    bool "Symmetric Multi-Processing (SMP)"
    default n
    depends on ARCH_HAVE_MULTICPU
    depends on ARCH_HAVE_TESTSET
    depends on ARCH_INTERRUPTSTACK != 0 
    select SPINLOCK
    select IRQCOUNT
    ---help---
        Enables support for Symmetric Multi-Processing (SMP) on a multi-CPU
        platform.

        N.B. SMP mode requires the use of ARCH_INTERRUPTSTACK:

        CPU0 thread0  ->  IRQ enter -> add thread0 to block list -> IRQ leave(crash)
                                                                ||
                                                                /\
                                                               /  \
        CPU1 thread1  ->  block_task -> take thread0 from block_list -> run thread0

        CPU0 IRQ handler use thread0's stack, but thread0 may switch to CPU1, that 
        will caused IRQ handler stack corruption.

So we must use the IRQ stack in svc/IRQ ISR entry: arm64_sync_exc arm64_irq_handler

But this modification has conflict with the kernel mode which provide by @pussuw @jlaitine

Then we revert the changes emergency in :
#13590

But this bug still exist in current code.

There are two method to resolve this:

  1. REVERT YOUR KERNEL MODE RELATED CODE IN ARM64,
    and I will fix this bug in my method, then you can apply your kernel mode changes.
  2. FIX THE SMP BUG IN YOUR METHOD !

Another thing, about the FPU,
The removable of FPU trap reason:

  1. The FPU trap logic is not stable in some SMP situation (HERE haven't the detail analysis)
  2. Make the logic Complex and different to read.
  3. The most important reason: it has almost no benefit for the performance, because the compiler will optimize the code with float/double instructions when use the option -O3 (always be). Even the printf/syslog will use use the FPU, so the FPU is used too often and almost each context switch.
    Or, you must use -mgeneral-regs-only to limit the FPU registers usage, that will harmful for the speed.
  4. How about use the FPU registers in the IRQ/SVC ISR, maybe there are hidden bugs ?

So, suggest remove the FPU trap, save/restore in every context switch, similar with ARMv7-a/r

Impact

None

Testing

None

@GUIDINGLI GUIDINGLI mentioned this pull request Sep 26, 2024
@github-actions github-actions bot added Area: OS Components OS Components issues Size: XS The size of the change in this PR is very small labels Sep 26, 2024
@anchao
Copy link
Contributor

anchao commented Sep 27, 2024

@GUIDINGLI please send this change request to issue list, this is not a pull request for review

@anchao anchao marked this pull request as draft September 27, 2024 00:26
@GUIDINGLI
Copy link
Contributor Author

I going to revert the arm64 patch use this PR if they are not replay for a long time

@GUIDINGLI GUIDINGLI marked this pull request as ready for review September 27, 2024 02:26
@xiaoxiang781216
Copy link
Contributor

@pussuw could you look at this problem?

@pussuw
Copy link
Contributor

pussuw commented Sep 27, 2024

@pussuw could you look at this problem?

I already responded to #13520
I'll take a closer look today or on Monday

@xiaoxiang781216 xiaoxiang781216 marked this pull request as draft September 29, 2024 01:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: OS Components OS Components issues Size: XS The size of the change in this PR is very small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants