5.6 crashes
Both 5.6-release and 5.6-stable give kernel panics on my amd64 system. They seem to be network-related.
An example of a crash of -release (shortly after logging into a virtual console):
Code:
# pkg_add vim
kernel: type 692267296 trap, code=0
Stopped at 0x6f43204149444956:panic: uvm_fault: fault on non-pagable map(0xffffffff81d7bb60, 0xffff80000015b000)
Stopped at Debugger+0x9: leave
ddb{0}> trace
Debugger() at Debugger+0x9
panic() at panic+0xfe
uvm_fault() at uvm_fault+0xcc4
trap() at trap+0x62f
--- trap (number 6) ---
(null)() at 0xffff80000015ba80
db_get_value() at db_get_value+0x34
db_disasm() at db_disasm+0x42
db_trap() at db_trap+0x90
kdb_trap() at kdb_trap+0xf0
end of kernel
end trace frame: 0x96ae10b000000001, count: -9
ddb{0}> ps
PID PPID PGRP UID S FLAGS WAIT COMMAND
2047 23251 6419 0 3 0x83 poll ftp
23251 6419 6419 0 3 0x8b pause sh
6419 21029 6419 0 3 0x83 piperd perl
18846 1 22182 35 3 0x90 poll xconsole
30272 1 22182 0 3 0x80 netio xconsole
383 27898 383 0 3 0x80 poll xdm
15980 21909 21909H( .e 3 0x

Code:
kernel: protection fault trap, code=0
Faulted in DDB; continuing...

Most often the crash happens already while the kernel is loading or during init. An example of a crash during init:
Code:
... DHCPACK from 192.168.0.11 (00:a0:24:f0:fb:11)
kernel: double fault trap, code=0
Stopped at 0:

Code:
starting network
DHCPREQUEST on sk0 to 255.255.255.255
DHCPACK from 192.168.0.11 (00:40:24:f0:fb:11)
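A side note on the first panic above, offered as a reading aid and not something the original poster did: amd64 is little-endian, so a bogus instruction pointer can be decoded byte by byte to see whether it looks like data. The sketch below does this for the `Stopped at` address and the odd trap `type` number from that panic. The helper name `as_ascii` is hypothetical.

```python
def as_ascii(value: int, width: int) -> str:
    """Render an integer's little-endian bytes as text, '.' for non-printables."""
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the first panic in this thread.
stopped_at = 0x6f43204149444956  # "Stopped at" address, 8 bytes
trap_type = 692267296            # "type ... trap" number, 4 bytes

print(repr(as_ascii(stopped_at, 8)))  # 'VIDIA Co'
print(repr(as_ascii(trap_type, 4)))   # ' )C)'
```

Both values decode to printable text, and the first looks like the tail of the word NVIDIA followed by "Co". That would be consistent with the kernel picking up text where it expected an address or a trap number, but by itself it does not say whether software or hardware is to blame.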
|
|||
Read http://www.openbsd.org/faq/faq2.html#Bugs and report the relevant info (inline, not as an attachment) to the OpenBSD misc mailing list.
|
|||
The full dmesg of bsd.sp is here:
Code:
OpenBSD 5.6 (GENERIC) #310: Fri Aug 8 00:14:24 MDT 2014
    deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 2129526784 (2030MB)
avail mem = 2064158720 (1968MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.2 @ 0xf0000 (43 entries)
bios0: vendor Phoenix Technologies, LTD version "6.00 PG" date 03/29/2006
bios0: DFI Corp,LTD LP NF4 Series
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP MCFG APIC
acpi0: wakeup devices HUB0(S5) XVR0(S5) XVR1(S5) XVR2(S5) XVR3(S5) USB0(S3) USB2(S3) MMAC(S5) MMCI(S5) UAR1(S5) PS2M(S4) PS2K(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+, 2411.39 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW,LAHF,CMPLEG
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 200MHz
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 11, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (HUB0)
acpicpu0 at acpi0
acpitz0 at acpi0: critical temperature is 69 degC
acpibtn0 at acpi0: PWRB
pci0 at mainbus0 bus 0
"NVIDIA nForce4 DDR" rev 0xa3 at pci0 dev 0 function 0 not configured
pcib0 at pci0 dev 1 function 0 "NVIDIA nForce4 ISA" rev 0xa3
nviic0 at pci0 dev 1 function 1 "NVIDIA nForce4 SMBus" rev 0xa2
iic0 at nviic0
spdmem0 at iic0 addr 0x50: 1GB DDR SDRAM non-parity PC3200CL3.0
spdmem1 at iic0 addr 0x51: 1GB DDR SDRAM non-parity PC3200CL3.0
iic1 at nviic0
iic1: addr 0x4e 00=02 02=28 03=28 04=2a 12=be 13=0e 20=01 28=83 29=12 2a=12 2b=28 words 00=0200 01=0028 02=2828 03=282a 04=2a00 05=0000 06=0000 07=0000
ohci0 at pci0 dev 2 function 0 "NVIDIA nForce4 USB" rev 0xa2: apic 2 int 20, version 1.0, legacy support
ehci0 at pci0 dev 2 function 1 "NVIDIA nForce4 USB" rev 0xa3: apic 2 int 20
ehci0: timed out waiting for BIOS
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "NVIDIA EHCI root hub" rev 2.00/1.00 addr 1
pciide0 at pci0 dev 6 function 0 "NVIDIA nForce4 IDE" rev 0xa2: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <HL-DT-ST, CDRW/DVD GCC4482, E107> ATAPI 5/cdrom removable
atapiscsi1 at pciide0 channel 0 drive 1
scsibus2 at atapiscsi1: 2 targets
cd1 at scsibus2 targ 0 lun 0: <LG, CD-ROM CRD-8522B, 2.03> ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
cd1(pciide0:0:1): using PIO mode 4, DMA mode 2
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 7 function 0 "NVIDIA nForce4 SATA" rev 0xa3: DMA
pciide1: using apic 2 int 20 for native-PCI interrupt
pciide2 at pci0 dev 8 function 0 "NVIDIA nForce4 SATA" rev 0xa3: DMA
pciide2: using apic 2 int 20 for native-PCI interrupt
wd0 at pciide2 channel 0 drive 0: <Maxtor 6B120M0>
wd0: 16-sector PIO, LBA, 117246MB, 240121728 sectors
wd0(pciide2:0:0): using PIO mode 4, Ultra-DMA mode 6
ppb0 at pci0 dev 9 function 0 "NVIDIA nForce4" rev 0xa2
pci1 at ppb0 bus 1
"NVIDIA Vanta" rev 0x15 at pci1 dev 6 function 0 not configured
emu0 at pci1 dev 7 function 0 "Creative Labs SoundBlaster Live" rev 0x06: apic 2 int 3
ac97: codec id 0x54524123 (TriTech Microelectronics TR28602)
audio0 at emu0
"Creative Labs PCI Gameport Joystick" rev 0x06 at pci1 dev 7 function 1 not configured
"VIA VT6306 FireWire" rev 0x80 at pci1 dev 9 function 0 not configured
skc0 at pci1 dev 10 function 0 "Marvell Yukon 88E8001/8003/8010" rev 0x13, Yukon Lite (0x9): apic 2 int 5
sk0 at skc0 port A: address 00:01:29:fc:35:59
eephy0 at sk0 phy 0: 88E1011 Gigabit PHY, rev. 5
nfe0 at pci0 dev 10 function 0 "NVIDIA CK804 LAN" rev 0xa3: apic 2 int 20, address 00:01:29:fc:34:f1
ciphy0 at nfe0 phy 1: CS8201 10/100/1000TX PHY, rev. 3
ppb1 at pci0 dev 11 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci2 at ppb1 bus 2
ppb2 at pci0 dev 12 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci3 at ppb2 bus 3
ppb3 at pci0 dev 13 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci4 at ppb3 bus 4
ppb4 at pci0 dev 14 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci5 at ppb4 bus 5
vga1 at pci5 dev 0 function 0 "NVIDIA GeForce 6800 GT" rev 0xa2
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pchb0 at pci0 dev 24 function 0 "AMD AMD64 0Fh HyperTransport" rev 0x00
pchb1 at pci0 dev 24 function 1 "AMD AMD64 0Fh Address Map" rev 0x00
pchb2 at pci0 dev 24 function 2 "AMD AMD64 0Fh DRAM Cfg" rev 0x00
kate0 at pci0 dev 24 function 3 "AMD AMD64 0Fh Misc Cfg" rev 0x00
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
it0 at isa0 port 0x2e/2: IT8712F rev 7, EC port 0x290
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 "NVIDIA OHCI root hub" rev 1.00/1.00 addr 1
uhidev0 at uhub1 port 2 configuration 1 interface 0 "Logitech USB-PS/2 Optical Mouse" rev 2.00/21.00 addr 2
uhidev0: iclass 3/1
ums0 at uhidev0: 8 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
root on wd0a (106e6dba438758b0.a) swap on wd0b dump on wd0b

Is the enclosed dmesg output (from the stable bsd.sp) useful at all?
|
|||
apparently fixed
edit: The system crashed shortly after this post, and again after a reboot; the problem still exists.
Thank you for the welcome, jggimi, and thank you all for the suggestions. The problem appears to be solved with the GENERIC.MP kernel that I compiled and installed today (after updating to the newest sources from the CVS repository). I have done a "stress" test (from the pre-compiled packages), scp'ed a big file, did a "pkg_add -u", some webbrowsing and am now compiling the userspace. No crash. With the previous kernel my system crashed during, or shortly after, init.

If I understand the evolution of the OpenBSD system correctly, there have been three changes in the code since my previous multiprocessor kernel, namely errata 012, 013 and 014. As 012 has something to do with attacks (and I am on a private subnet that should be secure), and 014 is a security patch for X and previously my system crashed often before X started, I conclude that reliability patch 013, fixing hangs with the virtio device, has solved it. But does my system use a VirtIO device? I don't know.

For completeness my dmesg, this time of the patched MP kernel:
Code:
OpenBSD 5.6-stable (GENERIC.MP) #1: Wed Dec 24 15:10:55 CET 2014
    root@gluon.instanton:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2129526784 (2030MB)
avail mem = 2064109568 (1968MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.2 @ 0xf0000 (43 entries)
bios0: vendor Phoenix Technologies, LTD version "6.00 PG" date 03/29/2006
bios0: DFI Corp,LTD LP NF4 Series
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP MCFG APIC
acpi0: wakeup devices HUB0(S5) XVR0(S5) XVR1(S5) XVR2(S5) XVR3(S5) USB0(S3) USB2(S3) MMAC(S5) MMCI(S5) UAR1(S5) PS2M(S4) PS2K(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+, 2411.40 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW,LAHF,CMPLEG
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Athlon(tm) 64 X2 Dual Core Processor 4600+, 2411.11 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,NXE,MMXX,FFXSR,LONG,3DNOW2,3DNOW,LAHF,CMPLEG
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 32 4KB entries fully associative, 8 4MB entries fully associative
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 11, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (HUB0)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpitz0 at acpi0: critical temperature is 69 degC
acpibtn0 at acpi0: PWRB
pci0 at mainbus0 bus 0
"NVIDIA nForce4 DDR" rev 0xa3 at pci0 dev 0 function 0 not configured
pcib0 at pci0 dev 1 function 0 "NVIDIA nForce4 ISA" rev 0xa3
nviic0 at pci0 dev 1 function 1 "NVIDIA nForce4 SMBus" rev 0xa2
iic0 at nviic0
spdmem0 at iic0 addr 0x50: 1GB DDR SDRAM non-parity PC3200CL3.0
spdmem1 at iic0 addr 0x51: 1GB DDR SDRAM non-parity PC3200CL3.0
iic1 at nviic0
iic1: addr 0x4e 00=02 02=28 03=28 04=2a 12=be 13=0e 20=01 28=83 29=12 2a=12 2b=28 words 00=0200 01=0028 02=2828 03=282a 04=2a00 05=0000 06=0000 07=0000
ohci0 at pci0 dev 2 function 0 "NVIDIA nForce4 USB" rev 0xa2: apic 2 int 20, version 1.0, legacy support
ehci0 at pci0 dev 2 function 1 "NVIDIA nForce4 USB" rev 0xa3: apic 2 int 20
ehci0: timed out waiting for BIOS
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "NVIDIA EHCI root hub" rev 2.00/1.00 addr 1
pciide0 at pci0 dev 6 function 0 "NVIDIA nForce4 IDE" rev 0xa2: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <HL-DT-ST, CDRW/DVD GCC4482, E107> ATAPI 5/cdrom removable
atapiscsi1 at pciide0 channel 0 drive 1
scsibus2 at atapiscsi1: 2 targets
cd1 at scsibus2 targ 0 lun 0: <LG, CD-ROM CRD-8522B, 2.03> ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
cd1(pciide0:0:1): using PIO mode 4, DMA mode 2
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 7 function 0 "NVIDIA nForce4 SATA" rev 0xa3: DMA
pciide1: using apic 2 int 20 for native-PCI interrupt
pciide2 at pci0 dev 8 function 0 "NVIDIA nForce4 SATA" rev 0xa3: DMA
pciide2: using apic 2 int 20 for native-PCI interrupt
wd0 at pciide2 channel 0 drive 0: <Maxtor 6B120M0>
wd0: 16-sector PIO, LBA, 117246MB, 240121728 sectors
wd0(pciide2:0:0): using PIO mode 4, Ultra-DMA mode 6
ppb0 at pci0 dev 9 function 0 "NVIDIA nForce4" rev 0xa2
pci1 at ppb0 bus 1
"NVIDIA Vanta" rev 0x15 at pci1 dev 6 function 0 not configured
emu0 at pci1 dev 7 function 0 "Creative Labs SoundBlaster Live" rev 0x06: apic 2 int 3
ac97: codec id 0x54524123 (TriTech Microelectronics TR28602)
audio0 at emu0
"Creative Labs PCI Gameport Joystick" rev 0x06 at pci1 dev 7 function 1 not configured
"VIA VT6306 FireWire" rev 0x80 at pci1 dev 9 function 0 not configured
skc0 at pci1 dev 10 function 0 "Marvell Yukon 88E8001/8003/8010" rev 0x13, Yukon Lite (0x9): apic 2 int 5
sk0 at skc0 port A: address 00:01:29:fc:35:59
eephy0 at sk0 phy 0: 88E1011 Gigabit PHY, rev. 5
nfe0 at pci0 dev 10 function 0 "NVIDIA CK804 LAN" rev 0xa3: apic 2 int 20, address 00:01:29:fc:34:f1
ciphy0 at nfe0 phy 1: CS8201 10/100/1000TX PHY, rev. 3
ppb1 at pci0 dev 11 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci2 at ppb1 bus 2
ppb2 at pci0 dev 12 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci3 at ppb2 bus 3
ppb3 at pci0 dev 13 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci4 at ppb3 bus 4
ppb4 at pci0 dev 14 function 0 "NVIDIA nForce4 PCIE" rev 0xa3
pci5 at ppb4 bus 5
vga1 at pci5 dev 0 function 0 "NVIDIA GeForce 6800 GT" rev 0xa2
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pchb0 at pci0 dev 24 function 0 "AMD AMD64 0Fh HyperTransport" rev 0x00
pchb1 at pci0 dev 24 function 1 "AMD AMD64 0Fh Address Map" rev 0x00
pchb2 at pci0 dev 24 function 2 "AMD AMD64 0Fh DRAM Cfg" rev 0x00
kate0 at pci0 dev 24 function 3 "AMD AMD64 0Fh Misc Cfg" rev 0x00
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
it0 at isa0 port 0x2e/2: IT8712F rev 7, EC port 0x290
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 "NVIDIA OHCI root hub" rev 1.00/1.00 addr 1
uhidev0 at uhub1 port 2 configuration 1 interface 0 "Logitech USB-PS/2 Optical Mouse" rev 2.00/21.00 addr 2
uhidev0: iclass 3/1
ums0 at uhidev0: 8 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
root on wd0a (106e6dba438758b0.a) swap on wd0b dump on wd0b
uid 0 on /usr: file system full

Last edited by hulten; 24th December 2014 at 04:39 PM. Reason: errors
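On the question of whether this machine uses a VirtIO device: VirtIO devices are paravirtualized devices presented to guests by a hypervisor, and on OpenBSD they attach under names such as `virtio0`, with children like `vio0` or `vioblk0`. A minimal sketch for checking a saved dmesg follows. The driver list is an assumption for illustration and may be incomplete, and `virtio_lines` is a hypothetical helper name.

```python
import re

# Driver names are an assumption for illustration; see virtio(4) for the real list.
VIRTIO_DRIVERS = ("virtio", "vio", "vioblk", "viomb", "viornd")


def virtio_lines(dmesg_text: str) -> list[str]:
    """Return dmesg lines whose first word is a VirtIO driver instance, e.g. 'vio0'."""
    pattern = re.compile(r"^(?:%s)\d+\b" % "|".join(VIRTIO_DRIVERS))
    return [line for line in dmesg_text.splitlines() if pattern.match(line)]


# A few lines taken from the dmesg in this post.
sample = """\
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
"""
print(virtio_lines(sample))
```

No line in the dmesg posted above starts with one of these names; every device shown attaches to the physical nForce4 board (`vscsi0` is a different driver). If that reading is right, the virtio fix is unlikely to be what changed the behaviour.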
|
||||
I couldn't tell you why the problems have apparently resolved. Should it happen again, perhaps a backtrace of the system core dump will help isolate the issue. See crash(8) for sysctl settings you may want to deploy on the remote platform, as well as debugging guidance.
|
|
|||
The crashes are happening again. I was probably just lucky that I could use my system for several hours instead of about a minute (which is usually the case).
First some standard bug report information:
Code:
... DHCPACK from 192.168.0.11 (00:a):24:f0:fb:11)
kernel: type 692267296 trap, code=1
Stopped at 0x6f43204140444056:panic: attempt to execute user adé s 0xb in sup _ É( % ä > _ ...
panic: attempt to execute user address 0xb in supervisor mode Pß
Stopped at Debugger+0x9: leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS, TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCUDING THAT INFORMATION!
ddb{0}> trace
Debugger() at Debugger+0x9
panic() at panic+0xfe
trap() at trap+0x85d
--- trap (number 6) ---
end of kernel
end trace frame: 0x176e176f17691774, count: -3
0xb:
ddb{0}> machine ddbcpu 1
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
IF RUNNING SMP, USE 'mach ddbcpu <#>' AND 'trace' ON OTHER PROCESSORS, TOO.
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCUDING THAT INFORMATION!
ddb{1}> trace
Debugger() at Debugger+0x9
x86_ipi_handler at x86_ipi_handler+0x64
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x1b
--- interrupt ---
Bad frame pointer: 0xffff8000212e2f10
end trace frame: 0xffff8000212e2f10, count: -3
cpu_idle_cycle+0x13:
ddb{1}> show panic
attempt to execute user address 0xb in supervisor mode
ddb{1}> ps
PID PPID PGRP UID S FLAGS WAIT COMMAND
16610 1 16610 0 3 0x80 poll dhclient
15412 18760 18760 77 3 0x13 biowait dhclient
18760 1 18760 0 3 0x8b pause sh
1024 0 0 0 3 0x14200 aiodoned aiodoned
5077 0 0 0 3 0x14200 syncer update
19163 0 0 0 3 0x14200 cleaner cleaner
10500 0 0 0 3 0x14200 reaper reaper
14850 0 0 0 3 0x14200 pgdaemon pagedaemon
7097 0 0 0 3 0x14200 bored crypto
29987 0 0 0 3 0x14200 pftm pfpurge
98 0 0 0 3 0x14200 bored sensors
27672 0 0 0 3 0x14200 usbtsk usbtask
28808 0 0 0 3 0x14200 usbatsk usbatsk
31061 0 0 0 3 0x40014200 acpi0 acpi0
*16000 0 0 0 7 0x40014200 idle1
15840 0 0 0 3 0x14200 bored systqmp
28757 0 0 0 3 0x14200 bored systq
19413 0 0 0 3 0x14200 bored syswq
11589 0 0 0 3 0x14200 idle0
1 0 0 0 3 0x14200 wait init
0 0 0 0 3 0x14200 scheduler swapper
ddb {1}>

Code:
... DHCPACK from 192.168.0.11 (00:a):24:f0:fb:11)
kernel: type 692267296 trap, code=1
Stopped at 0x6f43204140444056: kernel: protection fault trap, code=0
Stopped at db_read_bytes+0x22: movzbl 0(%rdi,%rcx,1)c%eax
ddb{0}> boot dump
°( ( πë s...ä `\ ( itok: want -1 have 2
splassert: aä twaitok: ~ ... t: assertwaitok: want -1 have 2
...
kernel: type 269 trap, code=0
Faulted in DDB; continuing...
ddb{0}> boot sync
Faulted in DDB
...

I have local access to this machine now. The machine ran/runs without notable problems with several versions of Debian GNU/Linux.

Last edited by hulten; 25th December 2014 at 10:13 PM. Reason: quick additions
|
|||
While some members here may have a suggestion, recognize that this site is not officially affiliated with the OpenBSD project proper. Users wanting to submit formal bug reports should study the information found at the following link for the submission protocol:
http://www.openbsd.org/report.html
Completeness is considered a good thing. You will make friends by providing a thorough explanation.
|
||||
The x86_ipi_handler mentioned in the trace is in sys/arch/amd64/amd64/ipi.c and that module has not been changed since 5.6-release. I agree with ocicat that this is best reported to the Project via its misc@ mailing list.
|
|||
Before I report this to misc@ or bugs@, I would like to exclude the following potential hardware failure.
I have run Memtest86 (both v1.65 from the BIOS, and v4.20 through "boot memtest"). In these tests, test 5 [Block move, 64 moves] gives many errors, as does test 7 from v4.20. I have read on several internet fora that test 5 creates a lot of heat, which would make this test untrustworthy. Any ideas about this? My first guess is bad memory, and I should try to exclude or confirm this first.
|
||||
My own experience with memtest86 and memtest86+ is that I have never seen false positives. I have seen plenty of false negatives, since they do not catch every problem; they cannot prove hardware is good; they only help identify bad hardware.
I've never had heat problems during the running of these tests. If I wanted to test power supplies and heat management, I'd run stress testers instead. You are more likely to have a memory problem than a heat problem, based upon your results... of course, a heat problem could manifest as a memory problem.
Last edited by jggimi; 26th December 2014 at 12:44 PM. Reason: nothing is ever definitive .. :)
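The point about false negatives can be made concrete with a little arithmetic. This is a sketch only, assuming each memtest pass independently catches a given fault with probability `p_detect`. The value 0.2 is made up for illustration and is not a measured figure, and `chance_fault_missed` is a hypothetical name.

```python
def chance_fault_missed(p_detect: float, passes: int) -> float:
    """Probability that every one of `passes` independent passes misses the fault."""
    return (1.0 - p_detect) ** passes


# Illustrative only: a fault that a single pass catches 20% of the time.
for passes in (1, 5, 20):
    print(passes, round(chance_fault_missed(0.2, passes), 4))
```

Under these assumptions, more clean passes shrink the chance that a fault is hiding but never bring it to zero, while a single reproducible error is already good evidence that something is wrong.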
|
|||
it must be the memory
Thanks, jggimi. I will try to fix the memory problem. Also, none of the temperature sensors that are displayed in the BIOS show very high temperatures (nothing higher than 55°C), so what I read on those other fora on unreliable memtests was probably nonsense.
Still, it is unexpected, though not impossible, that I found the problem when running OpenBSD. I do not remember experiencing problems under GNU/Linux. |
|
|||
First I cleaned my computer with a vacuum cleaner (and connected a case fan that should have been connected before).
With Memtest86 I was able to identify one of my two DIMMs as bad. The memory test does not give problems with the other DIMM, so I assume this one is okay (though it may be a false negative). I have done so many tests that my results are statistically rigorous.

With presumably my only good DIMM, I have done stress tests with stress(1) that show problems in both OpenBSD and GNU/Linux (the systems crash). But again, Memtest86 gives no problems with my last hardware configuration (only one DIMM).

OpenBSD:
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info [21433] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd
kernel: protection fault trap, code=0
Stopped at pmap_page_remove+0x75: movq 0(%rbx),%rax
ddb{0}> trace
pmap_page_remove() at pmap_page_remove+0x75
uvm_anfree() at uvm_anfree+0xbe
amap_wipeout() at amap_wipeout+0xb9
uvm_unmap_detach() at uvmj_unmap_detach+0x52
sys_munmap() at sys_munmap+0x14b
syscall() at syscall+0x297
--- syscall (number 73) ---
end of kernel
end trace frame: 0x49a122e000, count: -6
0x4989e097ea:
ddb{0}>

GNU/Linux:
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info: [1897] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd
[ 344.076011] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
[ 360.156009] BUG: soft lockup - CPU#1 stuck for 22s! [cupsd:687]
[ 344.076010] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
[ 360.156008] BUG: soft lockup - CPU#1 stuck for 22s! [cupsd:687]
...

None of the following worked: Ctrl+C, NumLock, switching to another virtual console (or the graphical one). This worked: SysRq+U,S,B (it initiated umount and sync, and the system rebooted).

Then I decided to strip my computer of everything (e.g. the audio card) not needed for the stress tests. Again stress(1) under GNU/Linux, now in single-user mode (so no cupsd, e.g.):
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info [669] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd

Code:
WARNING: CPU: 1 PID: 100 at /build/linux-CMiYW9/linux-3.16.7-ckt2/kernel/watchdog.c:265 watchdog_overflow_callback+0x98/0xc0()
Watchdog detected hard LOCKUP on cpu 1
Modules linked in: ...
CPU: 1 PID: 100 Comm: kworker/1:1H Tainted: G D W 3.16.0-4-amd64 #1 Debian 3.16.7-ckt2-1
Hardware name: /LP NF4 Series, BIOS 6.00 PG 03/29/2006
...

Apparently I have another hardware problem.
|
|||
In re-reading this thread, you mention testing with both 5.6-release & 5.6-stable, & it appears you have spent a considerable time thus far, & it does appear this needs to be brought to the attention of the project developers. However, OpenBSD 5.6 is several months old, & the source code under scrutiny by the developers today is nearing six months past the time 5.6 was tagged in CVS. Formally reporting with 5.6-release or 5.6-stable as test cases, while useful, is not as important as testing with a recent snapshot of -current.
I would suggest the following action:
Last edited by ocicat; 28th December 2014 at 12:06 AM. Reason: grammar |
|
||||
ADDED: Capacitor plague |
|
|||
The idea of checking the capacitors is a good one; I hadn't thought of it yet. I looked carefully at all capacitors, but they all look like nice cylinders without any bulging or leakage.
I will install a recent OpenBSD snapshot and try to reproduce and report the problem. |
|
|||
I wanted to install the latest snapshot, so I went to http://ftp.nluug.nl/pub/OpenBSD/snapshots/amd64 and dd'ed install56.fs to a USB stick. However, it gives an "ERR M"; probably my BIOS cannot handle it. The USB stick boots fine in my newer computers. I also have problems booting CDs on which I burned install56.iso (even though they, again, boot fine in my other computers).
Is it fine to boot from the official installation medium (5.6 release), do a clean install and select the software sets over HTTP, pointing to the snapshot (same URL as above)? Or should I expect the installation software (5.6 release) to be incompatible with newer software sets (snapshot 2014-12-27)?
|
||||
I remember a certain mainboard (Asus or Intel, don't quite remember) that always gave errors at the same address when PXE was enabled. I also remember having a certain model of HP workstation (don't remember which model) that gave a whole bunch of errors with memtest86; HP's own memory test ran fine, and so did the system ... I'm pretty sure the memory was fine. |