PANIC at zfs_znode.c with zfs 2.1 and 2.2 #16607

Open
Lorac opened this issue Oct 4, 2024 · 9 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments


Lorac commented Oct 4, 2024

System information

Type                  Version/Name
Distribution Name     Debian
Distribution Version  12 Bookworm
Kernel Version        6.10.6+bpo-amd64
Architecture          x86_64
OpenZFS Version       zfs-2.2.6-1.bpo12+2
zfs-kmod Version      zfs-kmod-2.2.6-1.bpo12+2

Describe the problem you're observing

I am experiencing a kernel panic when navigating into a ZFS dataset or copying data to it. The issue persists even after upgrading: initially I was running ZFS 2.1 with kernel 6.1 when the problem occurred, and I have since upgraded both ZFS and the kernel to the latest versions available in the bookworm backports, but the issue remains.

Describe how to reproduce the problem

The kernel panic started when I used the "z" plugin in oh-my-zsh to navigate directories within a ZFS dataset. The panic also occurs when trying to rsync a directory from an ext4 filesystem to a ZFS dataset.

Include any warning/errors/backtraces from the system logs

[  515.074670] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
[  515.074675] PANIC at zfs_znode.c:345:zfs_znode_sa_init()
[  515.074676] Showing stack for process 4648
[  515.074678] CPU: 14 PID: 4648 Comm: ncdu Tainted: P           OE      6.10.6+bpo-amd64 #1  Debian 6.10.6-1~bpo12+1
[  515.074680] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B550M Phantom Gaming 4, BIOS P1.70 06/23/2021
[  515.074682] Call Trace:
[  515.074684]  <TASK>
[  515.074687]  dump_stack_lvl+0x64/0x80
[  515.074696]  spl_panic+0x100/0x120 [spl]
[  515.074705]  ? dmu_buf_set_user_ie+0x61/0x80 [zfs]
[  515.074837]  zfs_znode_sa_init+0xcc/0xe0 [zfs]
[  515.074942]  zfs_znode_alloc+0x16e/0x7c0 [zfs]
[  515.075042]  ? spl_kmem_alloc+0xf0/0x120 [spl]
[  515.075051]  zfs_zget+0x24d/0x290 [zfs]
[  515.075149]  zfs_dirent_lock+0x4f5/0x6d0 [zfs]
[  515.075250]  zfs_dirlook+0xb6/0x2f0 [zfs]
[  515.075348]  ? zfs_zaccess+0x278/0x460 [zfs]
[  515.075448]  zfs_lookup+0x264/0x410 [zfs]
[  515.075545]  zpl_lookup+0xed/0x270 [zfs]
[  515.075643]  ? generic_permission+0x39/0x240
[  515.075646]  __lookup_slow+0x83/0x130
[  515.075648]  walk_component+0xdb/0x150
[  515.075650]  path_lookupat+0x67/0x190
[  515.075653]  filename_lookup+0xde/0x1d0
[  515.075656]  vfs_statx+0x92/0x1c0
[  515.075658]  vfs_fstatat+0x85/0xb0
[  515.075660]  __do_sys_newfstatat+0x3f/0x80
[  515.075662]  do_syscall_64+0x82/0x190
[  515.075666]  ? syscall_exit_to_user_mode+0x77/0x210
[  515.075668]  ? do_syscall_64+0x8e/0x190
[  515.075669]  ? exc_page_fault+0x76/0x170
[  515.075671]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  515.075674] RIP: 0033:0x7f1e77ad675a
[  515.075676] Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 0b 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 90 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 07 31 c0 c3 0f 1f 40 00 48 8b 15 71 a6 0d 00 f7
[  515.075678] RSP: 002b:00007ffc35ee74b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[  515.075680] RAX: ffffffffffffffda RBX: 000055a6dfec0650 RCX: 00007f1e77ad675a
[  515.075681] RDX: 000055a6aefa1240 RSI: 000055a6f73a73cd RDI: 00000000ffffff9c
[  515.075682] RBP: 000055a6aefa1240 R08: 000055a6e2fbbf30 R09: 006f666e49626544
[  515.075682] R10: 0000000000000100 R11: 0000000000000246 R12: 000055a6e2fbbf30
[  515.075683] R13: 000055a6f73a73cd R14: 000055a6aef9b227 R15: 00007f1e77c86020
[  515.075685]  </TASK>
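
For context on why a single failed call takes the whole call path down: the SPL's VERIFY macro panics the kernel (via spl_panic, visible in the trace above) whenever its condition is false. A simplified illustration of that behaviour, not the actual SPL source:

/*
 * Simplified illustration of SPL's VERIFY behaviour (not the actual source):
 * if the condition is false, spl_panic() is called and the thread never
 * returns, which is what produces the stack dump above.
 */
#define VERIFY(cond)						\
	do {							\
		if (!(cond))					\
			spl_panic(__FILE__, __FUNCTION__, __LINE__,	\
			    "VERIFY(" #cond ") failed\n");	\
	} while (0)
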
Lorac added the Type: Defect (Incorrect behavior, e.g. crash, hang) label on Oct 4, 2024

Lorac commented Oct 5, 2024

@snajpa it looks like you deleted your comment, but I'll post the properties of my pool anyway; maybe it will help. It looks like project_quota is active.

NAME  PROPERTY                       VALUE                          SOURCE
a     size                           884G                           -
a     capacity                       85%                            -
a     altroot                        -                              default
a     health                         ONLINE                         -
a     guid                           8296607161116845554            -
a     version                        -                              default
a     bootfs                         -                              default
a     delegation                     on                             default
a     autoreplace                    off                            default
a     cachefile                      -                              default
a     failmode                       wait                           default
a     listsnapshots                  on                             local
a     autoexpand                     off                            default
a     dedupratio                     1.00x                          -
a     free                           130G                           -
a     allocated                      754G                           -
a     readonly                       off                            -
a     ashift                         12                             local
a     comment                        -                              default
a     expandsize                     -                              -
a     freeing                        0                              -
a     fragmentation                  64%                            -
a     leaked                         0                              -
a     multihost                      off                            default
a     checkpoint                     -                              -
a     load_guid                      14949411101328869406           -
a     autotrim                       off                            default
a     compatibility                  off                            default
a     bcloneused                     0                              -
a     bclonesaved                    0                              -
a     bcloneratio                    1.00x                          -
a     feature@async_destroy          enabled                        local
a     feature@empty_bpobj            active                         local
a     feature@lz4_compress           active                         local
a     feature@multi_vdev_crash_dump  enabled                        local
a     feature@spacemap_histogram     active                         local
a     feature@enabled_txg            active                         local
a     feature@hole_birth             active                         local
a     feature@extensible_dataset     active                         local
a     feature@embedded_data          active                         local
a     feature@bookmarks              enabled                        local
a     feature@filesystem_limits      enabled                        local
a     feature@large_blocks           enabled                        local
a     feature@large_dnode            enabled                        local
a     feature@sha512                 enabled                        local
a     feature@skein                  enabled                        local
a     feature@edonr                  enabled                        local
a     feature@userobj_accounting     active                         local
a     feature@encryption             enabled                        local
a     feature@project_quota          active                         local
a     feature@device_removal         enabled                        local
a     feature@obsolete_counts        enabled                        local
a     feature@zpool_checkpoint       enabled                        local
a     feature@spacemap_v2            active                         local
a     feature@allocation_classes     enabled                        local
a     feature@resilver_defer         enabled                        local
a     feature@bookmark_v2            enabled                        local
a     feature@redaction_bookmarks    enabled                        local
a     feature@redacted_datasets      enabled                        local
a     feature@bookmark_written       enabled                        local
a     feature@log_spacemap           active                         local
a     feature@livelist               enabled                        local
a     feature@device_rebuild         enabled                        local
a     feature@zstd_compress          active                         local
a     feature@draid                  disabled                       local
a     feature@zilsaxattr             disabled                       local
a     feature@head_errlog            disabled                       local
a     feature@blake3                 disabled                       local
a     feature@block_cloning          disabled                       local
a     feature@vdev_zaps_v2           disabled                       local


snajpa commented Oct 5, 2024

Yeah, sorry. I was reading through the code and my fingers were faster than my brain; I thought I saw something related to project quota, but I ruled it out after a bit :-D


snajpa commented Oct 8, 2024

I think this could fit what I'm trying to fix in #16625, but that fix won't help in this case, where there's a znode without its SA initialized that is somehow still linked into a directory.

I tried to figure out a way to call zpl_xattr_security_init for the znode, but I'd really need the parent directory's inode, and I can't get to that from zfs_znode_sa_init. Perhaps some way could be found by using thread-specific data to mark what we're looking for in zpl_lookup. If I'm right, and if the file in question turns out to be really important for you, we can work on that later.
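
A minimal sketch of what that thread-specific-data idea could look like, assuming the SPL's tsd_create()/tsd_set()/tsd_get() API; the key and helper names here are hypothetical and this is not existing OpenZFS code:

#include <sys/tsd.h>
#include <linux/fs.h>

/* Hypothetical TSD key, created once (e.g. at module load). */
static uint_t zfs_lookup_parent_tsd;

void
zfs_lookup_tsd_init(void)
{
	tsd_create(&zfs_lookup_parent_tsd, NULL);
}

/* zpl_lookup() would record the parent directory inode here ... */
static inline void
zfs_lookup_set_parent(struct inode *dir)
{
	(void) tsd_set(zfs_lookup_parent_tsd, dir);
}

/* ... so zfs_znode_sa_init() could retrieve it if the SA turns out missing. */
static inline struct inode *
zfs_lookup_get_parent(void)
{
	return ((struct inode *)tsd_get(zfs_lookup_parent_tsd));
}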

For now, please take a snapshot of the affected dataset and then try the patch below, to see if it lets you move forward without a panic: it should evict the inode and return -EIO on access to that file.

If I'm right, the file was created with O_TMPFILE and then power was lost, or the kernel crashed at the wrong moment; otherwise I can't explain how we end up with a znode without any SA at all.

The patch (it should be applied to current master, 75dda92):

diff --git a/module/os/linux/zfs/zfs_znode_os.c b/module/os/linux/zfs/zfs_znode_os.c
index f13edf95b..025f2482a 100644
--- a/module/os/linux/zfs/zfs_znode_os.c
+++ b/module/os/linux/zfs/zfs_znode_os.c
@@ -323,7 +323,7 @@ zfs_cmpldev(uint64_t dev)
        return (dev);
 }
 
-static void
+static int
 zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
     dmu_buf_t *db, dmu_object_type_t obj_type, sa_handle_t *sa_hdl)
 {
@@ -334,8 +334,11 @@ zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
        ASSERT(zp->z_sa_hdl == NULL);
        ASSERT(zp->z_acl_cached == NULL);
        if (sa_hdl == NULL) {
-               VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp,
-                   SA_HDL_SHARED, &zp->z_sa_hdl));
+               if (0 != sa_handle_get_from_db(zfsvfs->z_os, db, zp,
+                   SA_HDL_SHARED, &zp->z_sa_hdl)) {
+                       zfs_dbgmsg("sa_handle_get_from_db failed");
+                       return (1);
+               }
        } else {
                zp->z_sa_hdl = sa_hdl;
                sa_set_userp(sa_hdl, zp);
@@ -344,6 +347,7 @@ zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
        zp->z_is_sa = (obj_type == DMU_OT_SA) ? B_TRUE : B_FALSE;
 
        mutex_exit(&zp->z_lock);
+       return (0);
 }
 
 void
@@ -538,7 +542,11 @@ zfs_znode_alloc(zfsvfs_t *zfsvfs, dmu_buf_t *db, int blksz,
        zp->z_sync_writes_cnt = 0;
        zp->z_async_writes_cnt = 0;
 
-       zfs_znode_sa_init(zfsvfs, zp, db, obj_type, hdl);
+       int fail = zfs_znode_sa_init(zfsvfs, zp, db, obj_type, hdl);
+       if (fail) {
+               iput(ip);
+               return (NULL); /* zfs_znode_alloc() returns znode_t *, so signal failure with NULL */
+       }
 
        SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_MODE(zfsvfs), NULL, &mode, 8);
        SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_GEN(zfsvfs), NULL, &tmp_gen, 8);


Lorac commented Oct 9, 2024

I initially tried deleting the corrupted file and directory; when that failed, I destroyed and recreated the two affected ZFS datasets, which resolved my issue. The datasets weren't critical (mostly build directories), but I needed them to continue working. The first time I encountered the error was from a directory which can't be a

At the time, I didn’t realize I should have taken a snapshot, assuming a corrupted file or directory wouldn’t be included in a snapshot. I am relatively new to ZFS.

The problem first appeared when deleting the directory: a specific file caused ZFS to panic. I opened a second SSH connection to my workstation and retried, but got an EIO error instead of the panic. That attempt left rm hung on a CPU and produced 15 GB of logs.

At this point, I'm not sure the patch will make a difference, as I've already recreated the datasets. I don't know whether the files I was trying to delete were created with O_TMPFILE, since they had been there for a while. They are part of a build system that typically doesn't create temporary files of the kind I was trying to delete. The second dataset could perhaps have involved O_TMPFILE, as it came from a C++ build with -pipe.

The drive they’re on has 68,000 power-on hours (SSD). I need help analyzing the smartctl output; you likely have more expertise interpreting that data than I do.

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       1
  5 Reallocate_NAND_Blk_Cnt 0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       68383
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       198
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
173 Ave_Block-Erase_Count   0x0032   018   018   000    Old_age   Always       -       1642
174 Unexpect_Power_Loss_Ct  0x0032   100   100   000    Old_age   Always       -       100
180 Unused_Reserve_NAND_Blk 0x0033   000   000   000    Pre-fail  Always       -       5589
183 SATA_Interfac_Downshift 0x0032   100   100   000    Old_age   Always       -       0
184 Error_Correction_Count  0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   076   048   000    Old_age   Always       -       24 (Min/Max 17/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_ECC_Cnt 0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       1
202 Percent_Lifetime_Remain 0x0030   018   018   001    Old_age   Offline      -       82
206 Write_Error_Rate        0x000e   100   100   000    Old_age   Always       -       0
210 Success_RAIN_Recov_Cnt  0x0032   100   100   000    Old_age   Always       -       0
246 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       278295250114
247 Host_Program_Page_Count 0x0032   100   100   000    Old_age   Always       -       8696884170
248 FTL_Program_Page_Count  0x0032   100   100   000    Old_age   Always       -       46386914208


Lorac commented Oct 9, 2024

I actually got the panic again on another, more critical dataset. It's already in my snapshots, so I'll try the patch.

This happened during a scan with ncdu.


snajpa commented Oct 10, 2024

Curious: is the xattr property on the affected dataset =sa or =on?


Lorac commented Oct 10, 2024

xattr is set to on on the affected dataset.


Lorac commented Nov 6, 2024

@snajpa so it turns out I had a bad memory stick; I swapped in new sticks and now there are no more problems. So it's not a ZFS bug per se, I guess?


snajpa commented Nov 15, 2024

If you can't reproduce it after a HW change (and no SW upgrade), then that must have been it :)
