Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix file system browser hang when enrolling MOK from disk #622

Merged
merged 3 commits into from
Jan 21, 2025

Conversation

miczyg1
Copy link
Contributor

@miczyg1 miczyg1 commented Dec 16, 2023

The loop retrieving the SimpleFS volume labels and names may skip some volumes if either HandleProtocol or OpenVolume or GetInfo fails. Those skipped volumes would have uninitialized pointers to their names in the respective entries indices. This would lead to accessing random memory in console_select, because count_lines would not catch the holes with non-existing entries.

On affected platforms the result is a hang of the MokManager while trying to enroll a key from disk. The issue has been triggered on a TianoCore EDK2 UEFIPayload based firmware for x86 platforms with additional filesystem drivers: ExFAT, NTFS, EXT2 and EXT4.

@vathpela
Copy link
Contributor

vathpela commented Feb 7, 2024

This looks fairly reasonable to me, but what I don't understand is this: what are the failures? How are they triggered, and under what conditions?

@miczyg1
Copy link
Contributor Author

miczyg1 commented Feb 8, 2024

The FS drivers mentioned in my first comment did not necessarily implement GetInfo method. This resulted in certain volumes to be skipped. As entries array was not zero-pool allocation, the code executed later thought there are valid volumes/FSes in every entries index, which was not always true. Given entry would be initialized to NULL only if entries[i] = AllocatePool((StrLen(name) + 2) * sizeof(CHAR16)); failed and returned NULL (breaking the loop additionally). But if not, the later code could access some random memory and the system would simply hang. That's the first problem.

The second problem is the multiple continue directives in the loop. With many FSes in the system it becomes quite likely to skip some and create holes in the entries array, but only fake holes because these skipped FSes could still become a random pointer in entries array. But if for some reason we have hit if (!entries[i]) due to being unable to allocate pool for a volume name, we could prematurely terminate the loop and the entries array limiting the possibility to browse all FSes - third problem (purely theoretical, not that we could do anything more when we cannot allocate memory anymore).

tl;dr
Let's imagine 5 FSes (that's what I have been testing on):

  1. Some ESP, SimpleFS protocol present, openVolume works, GetInfo implemented.
  2. EXT4, SimpleFS protocol present, openVolume works, GetInfo NOT implemented.
  3. EXT4, SimpleFS protocol present, openVolume works, GetInfo NOT implemented.
  4. NTFS, SimpleFS protocol present, openVolume works, GetInfo NOT implemented.
  5. Another ESP, SimpleFS protocol present, openVolume works, GetInfo implemented.

Before my PR the result would be:

  1. Added to entries[0].
  2. SKIPPED/NOT added to entries[1], hits continue and entries[1] stays uninitialized (random).
  3. SKIPPED/NOT added to entries[2], hits continue and entries[2] stays uninitialized (random).
  4. SKIPPED/NOT added to entries[3], hits continue and entries[3] stays uninitialized (random).
  5. Added to entries[4]

entires[5] = NULL; termination after the loop. Such array gets into console_select and later hangs on accessing random pointer in entries[1], because count_lines checks for NULL pointer termination and returns the length of 5.

@miczyg1
Copy link
Contributor Author

miczyg1 commented Feb 8, 2024

Basically I have made the code more robust accounting for minimal FS driver implementations and avoiding possible access to random memory. Also the entries array will contain only valid FSes not indexed by the SimpleFS handles in the buffer, but rather incrementally indexed by successfully parsed volumes/FSes.

@desowin
Copy link

desowin commented Feb 20, 2024

I have encountered this exact issue on PRO Z790-P WIFI (MS-7E06) running Dasharo (coreboot+UEFI) v0.9.1 with disk where the only non-encrypted partition is EFI System Partition.

@miczyg1
Copy link
Contributor Author

miczyg1 commented Mar 25, 2024

@vathpela have I made it more clear possibly?

@miczyg1
Copy link
Contributor Author

miczyg1 commented Apr 25, 2024

@vathpela ping

@miczyg1
Copy link
Contributor Author

miczyg1 commented May 28, 2024

Anything I can do to help this PR land @vathpela ?

@pietrushnic
Copy link

@miczyg1, maybe we can bring this topic to the Linux Plumbers Conference?

@miczyg1
Copy link
Contributor Author

miczyg1 commented Oct 2, 2024

@miczyg1, maybe we can bring this topic to the Linux Plumbers Conference?

Have you brought it?

@pietrushnic
Copy link

No. @aronowski, what do you think about this problem?

@vathpela vathpela added this to the shim 16 milestone Nov 12, 2024
The loop retrieving the SimpleFS volume labels and names may
skip some volumes if either HandleProtocol or OpenVolume or
GetInfo fails. Those skipped volumes would have uninitialized
pointers to their names in the respective entries indices. This
would lead to accessing random memory in console_select, because
count_lines would not catch the holes with non-existing entries.

On affected platforms the result is a hang of the MokManager while
trying to enroll a key from disk. The issue has been triggered on
a TianoCore EDK2 UEFIPayload based firmware for x86 platforms with
additional filesystem drivers: ExFAT, NTFS, EXT2 and EXT4.

Use AllocateZeroPool to ensure entries array will be initialized
with NULL pointers. Handling the non-existing entries will be
added in subsequent commits.

Signed-off-by: Michał Żygowski <[email protected]>
In case GetInfo of volume root fails, it is still possible
to form a volume name from the DevicePath. Do not skip given
SimpleFS volume handle and try to form a name from DevicePath.
That way we do not lose some filesystems from file browser.

This change already fixes the problem of a hanging platform
when trying to enroll a key from disk. However, there is still
a chance of having a non-contiguous array of entries, which
will be fixed in next commit.

Signed-off-by: Michał Żygowski <[email protected]>
If HandleProtocol or OpenVolume fails, the entries array will become
non-contiguous, i.e. will have NULL pointers between valid volume
names in the array. Because of that count_lines may return a lower
number of entries than expected. As a result one may not browse all
valid filesystems in the file explorer.

Add a second index variable that will increment only on successfully
created filesystem entries. As a result, count_lines should return
proper length and there won't be any lost partitions or accesses to
invalid entries.

Signed-off-by: Michał Żygowski <[email protected]>
@vathpela vathpela merged commit d6076cb into rhboot:main Jan 21, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants