kernel panic from intel_gt_sysfs_get_drvdata after sysctl -a (INVARIANTS kernel, master branch 6.1) #280

Open
emaste opened this issue Jan 22, 2024 · 4 comments
Labels: bug (Something isn't working), i915 (i915 related problems)

Comments

@emaste
Member

emaste commented Jan 22, 2024

Describe the bug
panic in strncmp() from intel_gt_sysfs_get_drvdata() upon sysctl -a


Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address	= 0x0
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80fbf610
stack pointer	        = 0x28:0xfffffe0155d3eb40
frame pointer	        = 0x28:0xfffffe0155d3eb40
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 3758 (sysctl)
rdi: 0000000000000000 rsi: ffffffff83dc3f1d rdx: 0000000000000001
rcx: 0000000000000000  r8: 0000000000000000  r9: 0000000000010000
rax: ffffffff813456e0 rbx: fffffe01418f6e88 rbp: fffffe0155d3eb40
r10: 0000000000000001 r11: ffffffff83d38f80 r12: 0000000000000013
r13: fffffe0155d3ecc0 r14: ffffffff83df5950 r15: fffffe01418f6e88
trap number		= 12
panic: page fault
cpuid = 6
time = 1705955573
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0155d3e810
vpanic() at vpanic+0x132/frame 0xfffffe0155d3e940
panic() at panic+0x43/frame 0xfffffe0155d3e9a0
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe0155d3ea00
trap_pfault() at trap_pfault+0xae/frame 0xfffffe0155d3ea70
calltrap() at calltrap+0x8/frame 0xfffffe0155d3ea70
--- trap 0xc, rip = 0xffffffff80fbf610, rsp = 0xfffffe0155d3eb40, rbp = 0xfffffe0155d3eb40 ---
strncmp() at strncmp+0x10/frame 0xfffffe0155d3eb40
intel_gt_sysfs_get_drvdata() at intel_gt_sysfs_get_drvdata+0x1e/frame 0xfffffe0155d3eb60
throttle_reason_bool_show() at throttle_reason_bool_show+0x15/frame 0xfffffe0155d3eb80
sysctl_handle_attr() at sysctl_handle_attr+0x73/frame 0xfffffe0155d3ebd0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0xa2/frame 0xfffffe0155d3ec20
sysctl_root() at sysctl_root+0x22e/frame 0xfffffe0155d3eca0
userland_sysctl() at userland_sysctl+0x184/frame 0xfffffe0155d3ed50
sys___sysctl() at sys___sysctl+0x5c/frame 0xfffffe0155d3ee00
amd64_syscall() at amd64_syscall+0x15e/frame 0xfffffe0155d3ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0155d3ef30
--- syscall (202, FreeBSD ELF64, __sysctl), rip = 0x157fae7cf4fa, rsp = 0x157fadb1b868, rbp = 0x157fadb1b8a0 ---

FreeBSD version
Paste the output of uname -aKU

PCI Info

hostb0@pci0:0:0:0:	class=0x060000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a14 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = '11th Gen Core Processor Host Bridge/DRAM Registers'
    class      = bridge
    subclass   = HOST-PCI
vgapci0@pci0:0:2:0:	class=0x030000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a49 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'TigerLake-LP GT2 [Iris Xe Graphics]'
    class      = display
    subclass   = VGA
none0@pci0:0:4:0:	class=0x118000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a03 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'TigerLake-LP Dynamic Tuning Processor Participant'
    class      = dasp
pcib1@pci0:0:6:0:	class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x9a09 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = '11th Gen Core Processor PCIe Controller'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:7:0:	class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x9a23 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
pcib3@pci0:0:7:1:	class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x9a25 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
pcib4@pci0:0:7:2:	class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x9a27 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
pcib5@pci0:0:7:3:	class=0x060400 rev=0x01 hdr=0x01 vendor=0x8086 device=0x9a29 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
none1@pci0:0:8:0:	class=0x088000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a11 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'GNA Scoring Accelerator module'
    class      = base peripheral
none2@pci0:0:10:0:	class=0x118000 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a0d subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tigerlake Telemetry Aggregator Driver'
    class      = dasp
xhci0@pci0:0:13:0:	class=0x0c0330 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a13 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 USB Controller'
    class      = serial bus
    subclass   = USB
none3@pci0:0:13:2:	class=0x0c0340 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a1b subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 NHI'
    class      = serial bus
    subclass   = USB
none4@pci0:0:13:3:	class=0x0c0340 rev=0x01 hdr=0x00 vendor=0x8086 device=0x9a1d subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Thunderbolt 4 NHI'
    class      = serial bus
    subclass   = USB
none5@pci0:0:18:0:	class=0x070000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0fc subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Integrated Sensor Hub'
    class      = simple comms
    subclass   = UART
xhci1@pci0:0:20:0:	class=0x0c0330 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0ed subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller'
    class      = serial bus
    subclass   = USB
none6@pci0:0:20:2:	class=0x050000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0ef subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Shared SRAM'
    class      = memory
    subclass   = RAM
ig4iic0@pci0:0:21:0:	class=0x0c8000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0e8 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Serial IO I2C Controller'
    class      = serial bus
ig4iic1@pci0:0:21:1:	class=0x0c8000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0e9 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Serial IO I2C Controller'
    class      = serial bus
ig4iic2@pci0:0:21:3:	class=0x0c8000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0eb subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Serial IO I2C Controller'
    class      = serial bus
none7@pci0:0:22:0:	class=0x078000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0e0 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Management Engine Interface'
    class      = simple comms
pcib6@pci0:0:29:0:	class=0x060400 rev=0x20 hdr=0x01 vendor=0x8086 device=0xa0b1 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP PCI Express Root Port'
    class      = bridge
    subclass   = PCI-PCI
isab0@pci0:0:31:0:	class=0x060100 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa082 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP LPC Controller'
    class      = bridge
    subclass   = PCI-ISA
hdac0@pci0:0:31:3:	class=0x040380 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0c8 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP Smart Sound Technology Audio Controller'
    class      = multimedia
    subclass   = HDA
ichsmb0@pci0:0:31:4:	class=0x0c0500 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0a3 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP SMBus Controller'
    class      = serial bus
    subclass   = SMBus
none8@pci0:0:31:5:	class=0x0c8000 rev=0x20 hdr=0x00 vendor=0x8086 device=0xa0a4 subvendor=0xf111 subdevice=0x0001
    vendor     = 'Intel Corporation'
    device     = 'Tiger Lake-LP SPI Controller'
    class      = serial bus
nvme0@pci0:1:0:0:	class=0x010802 rev=0x01 hdr=0x00 vendor=0x15b7 device=0x5011 subvendor=0x15b7 subdevice=0x5011
    vendor     = 'Sandisk Corp'
    device     = 'WD PC SN810 / Black SN850 NVMe SSD'
    class      = mass storage
    subclass   = NVM
iwlwifi0@pci0:170:0:0:	class=0x028000 rev=0x1a hdr=0x00 vendor=0x8086 device=0x2725 subvendor=0x8086 subdevice=0x0024
    vendor     = 'Intel Corporation'
    device     = 'Wi-Fi 6 AX210/AX211/AX411 160MHz'
    class      = network

DRM KMOD version
1af4c68 from git

To Reproduce

  1. kldload drm-kmod
  2. sysctl -a
@evadot added the bug and i915 labels on Jan 23, 2024
@evadot
Contributor

evadot commented Jan 23, 2024

This has been reported by bapt@ and dumbbell@ too but I don't have a recent enough Intel machine to reproduce.
I guess the culprit is the strncmp at https://github.com/freebsd/drm-kmod/blob/master/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c#L20, which means that we probably have something wrong with our kobj somewhere.
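
For illustration only: the trap frame above shows rdi (the first integer argument register on amd64) as 0 and a fault address of 0x0, which is consistent with strncmp() being handed a NULL string. Below is a minimal userland sketch of a defensive prefix check. `struct fake_kobject` and `name_has_gt_prefix` are hypothetical names invented for this example, not drm-kmod or LinuxKPI code, and a guard like this would only mask the symptom rather than explain why the name is unset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kobject; only the name field matters here. */
struct fake_kobject {
	const char *name;	/* may be NULL if the object was never named */
};

/*
 * Return true when the object's name starts with "gt".
 * The NULL checks are the point of the sketch: without them,
 * strncmp() would read through a NULL pointer.
 */
static bool
name_has_gt_prefix(const struct fake_kobject *kobj)
{
	if (kobj == NULL || kobj->name == NULL)
		return (false);
	return (strncmp(kobj->name, "gt", 2) == 0);
}
```

With this guard, an unnamed object is treated as "not a gt object" instead of faulting. Whether that fallback is the right behaviour for the real driver is an open question.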

@emaste
Member Author

emaste commented Jan 30, 2024

That panic is not reproducible with throttle_reason_attrs stubbed out:

#if 0
        if (GRAPHICS_VER(gt->i915) >= 11) {
                ret = sysfs_create_files(kobj, throttle_reason_attrs);
                if (ret)
                        drm_warn(&gt->i915->drm,
                                 "failed to create gt%u throttle sysfs files (%pe)",
                                 gt->info.id, ERR_PTR(ret));
        }
#endif

However I then get a hang when invoking sysctl -a.

@emaste changed the title from "kernel panic from intel_gt_sysfs_get_drvdata after sysctl -a" to "kernel panic from intel_gt_sysfs_get_drvdata after sysctl -a (INVARIANTS kernel)" on Jan 30, 2024
@emaste
Member Author

emaste commented Jan 30, 2024

sysctl -a also fails on my daily driver laptop. Same kernel/drm-kmod version as above, hardware is:

vgapci0@pci0:0:2:0:     class=0x030000 rev=0x02 hdr=0x00 vendor=0x8086 device=0x3ea0 subvendor=0x17aa subdevice=0x2292
    vendor     = 'Intel Corporation'
    device     = 'WhiskeyLake-U GT2 [UHD Graphics 620]'
    class      = display
    subclass   = VGA

I updated the title to indicate that this is with GENERIC (so INVARIANTS is enabled). I don't recall whether the failure on this machine was a panic or a hang.

@emaste
Member Author

emaste commented Feb 1, 2024

This appears to be resolved with #283

@emaste changed the title from "kernel panic from intel_gt_sysfs_get_drvdata after sysctl -a (INVARIANTS kernel)" to "kernel panic from intel_gt_sysfs_get_drvdata after sysctl -a (INVARIANTS kernel, master branch 6.1)" on Feb 1, 2024
emaste added a commit to emaste/drm-kmod that referenced this issue Oct 25, 2024