Skip to content

Commit

Permalink
drm/xe: Fix tlb invalidation when wedging
Browse files Browse the repository at this point in the history
[ Upstream commit 9ab4981 ]

If GuC fails to load, the driver wedges, but in the process it tries to
do stuff that may not be initialized yet. This moves the
xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says,
it's a software-only initialization and should had been named with the
_early() suffix.

Move it to be called by xe_gt_init_early(), so the locks and seqno are
initialized, avoiding a NULL ptr deref when wedging:

	xe 0000:03:00.0: [drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
	xe 0000:03:00.0: [drm] *ERROR* GT0: firmware signature verification failed
	xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
	...
	BUG: kernel NULL pointer dereference, address: 0000000000000000
	#PF: supervisor read access in kernel mode
	#PF: error_code(0x0000) - not-present page
	PGD 0 P4D 0
	Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
	CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G     U  W          6.13.0-rc4-xe+ #3
	Tainted: [U]=USER, [W]=WARN
	Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022
	RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe]

This can be easily triggered by poking the GuC binary to force a
signature failure. There will still be an extra message,

	xe 0000:03:00.0: [drm] *ERROR* GT0: GuC mmio request 0x4100: no reply 0x4100

but that's better than a NULL ptr deref.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956
Fixes: c9474b7 ("drm/xe: Wedge the entire device")
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
(cherry picked from commit 5001ef3)
Signed-off-by: Thomas Hellström <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
  • Loading branch information
lucasdemarchi authored and gregkh committed Jan 17, 2025
1 parent 53a5681 commit 09b94dd
Show file tree
Hide file tree
Showing 3 changed files with 8 additions and 7 deletions.
8 changes: 4 additions & 4 deletions drivers/gpu/drm/xe/xe_gt.c
Original file line number Diff line number Diff line change
Expand Up @@ -386,6 +386,10 @@ int xe_gt_init_early(struct xe_gt *gt)
xe_force_wake_init_gt(gt, gt_to_fw(gt));
spin_lock_init(&gt->global_invl_lock);

err = xe_gt_tlb_invalidation_init_early(gt);
if (err)
return err;

return 0;
}

Expand Down Expand Up @@ -585,10 +589,6 @@ int xe_gt_init(struct xe_gt *gt)
xe_hw_fence_irq_init(&gt->fence_irq[i]);
}

err = xe_gt_tlb_invalidation_init(gt);
if (err)
return err;

err = xe_gt_pagefault_init(gt);
if (err)
return err;
Expand Down
4 changes: 2 additions & 2 deletions drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
Original file line number Diff line number Diff line change
Expand Up @@ -106,15 +106,15 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work)
}

/**
* xe_gt_tlb_invalidation_init - Initialize GT TLB invalidation state
* xe_gt_tlb_invalidation_init_early - Initialize GT TLB invalidation state
* @gt: graphics tile
*
* Initialize GT TLB invalidation state, purely software initialization, should
* be called once during driver load.
*
* Return: 0 on success, negative error code on error.
*/
int xe_gt_tlb_invalidation_init(struct xe_gt *gt)
int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt)
{
gt->tlb_invalidation.seqno = 1;
INIT_LIST_HEAD(&gt->tlb_invalidation.pending_fences);
Expand Down
3 changes: 2 additions & 1 deletion drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,8 @@ struct xe_gt;
struct xe_guc;
struct xe_vma;

int xe_gt_tlb_invalidation_init(struct xe_gt *gt);
int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt);

void xe_gt_tlb_invalidation_reset(struct xe_gt *gt);
int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt);
int xe_gt_tlb_invalidation_vma(struct xe_gt *gt,
Expand Down

0 comments on commit 09b94dd

Please sign in to comment.