Thread: 5.6 crashes
27th December 2014
hulten

First I cleaned my computer with a vacuum cleaner (and connected a case fan that should have been connected before).

With Memtest86 I was able to identify one of my two DIMMs as bad. The memory test reports no errors with the other DIMM, so I assume that one is okay (though it may be a false negative). I have not run enough tests for my results to be statistically rigorous.
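
The false-negative worry can be made concrete with a little arithmetic. This is only a sketch under assumptions that are not from my tests: that each Memtest86 pass independently catches an intermittent fault with some fixed probability (the 10% below is a made-up number), in which case the chance that every pass misses the fault is (1 - p)^n. The function name is hypothetical.

```python
def miss_probability(p_detect: float, passes: int) -> float:
    """Chance that `passes` independent test passes all miss an intermittent fault.

    p_detect is the (assumed) probability that a single pass catches the fault.
    """
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    if passes < 0:
        raise ValueError("passes must be non-negative")
    return (1.0 - p_detect) ** passes


if __name__ == "__main__":
    # 0.10 is a purely illustrative per-pass detection rate, not a measured value.
    for passes in (1, 5, 20):
        print(passes, round(miss_probability(0.10, passes), 3))
```

With a low per-pass detection rate, a handful of clean passes says little about whether the DIMM is really good.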

With what is presumably my only good DIMM, I ran stress tests with stress(1), and they reveal problems under both OpenBSD and GNU/Linux (the systems crash). But again, Memtest86 reports no problems with this last hardware configuration (only one DIMM).

OpenBSD:
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info [21433] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd
kernel: protection fault trap, code=0
Stopped at      pmap_page_remove+0x75:  movq    0(%rbx),%rax
ddb{0}> trace
pmap_page_remove() at pmap_page_remove+0x75
uvm_anfree() at uvm_anfree+0xbe
amap_wipeout() at amap_wipeout+0xb9
uvm_unmap_detach() at uvm_unmap_detach+0x52
sys_munmap() at sys_munmap+0x14b
syscall() at syscall+0x297
--- syscall (number 73) ---
end of kernel
end trace frame: 0x49a122e000, count: -6
0x4989e097ea:
ddb{0}>
GNU/Linux, with the same stress test as above:
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info: [1897] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd
[  344.076011] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
[  360.156009] BUG: soft lockup - CPU#1 stuck for 22s! [cupsd:687]
[  344.076010] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
[  360.156008] BUG: soft lockup - CPU#1 stuck for 22s! [cupsd:687]
...
The computer still answers ping (ICMP), but I cannot log in over ssh.
None of the following worked: Ctrl+C, NumLock, switching to another virtual console (or to the graphical one).
This worked: SysRq+U, S, B (it remounted the filesystems read-only, synced, and rebooted the system).
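
For longer logs, the soft lockup messages can be tallied per CPU and process. Here is a minimal sketch, assuming the message format is exactly the one shown in the output above; the regular expression and the function name are my own, not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches lines such as:
# [  344.076011] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(log_text: str) -> Counter:
    """Count soft lockup reports per (CPU number, process name)."""
    counts = Counter()
    for match in LOCKUP_RE.finditer(log_text):
        counts[(int(match.group("cpu")), match.group("comm"))] += 1
    return counts


# Two lines copied from the kernel output in this post.
SAMPLE = """\
[  344.076011] BUG: soft lockup - CPU#0 stuck for 22s! [stress:1899]
[  360.156009] BUG: soft lockup - CPU#1 stuck for 22s! [cupsd:687]
"""

if __name__ == "__main__":
    for (cpu, comm), n in sorted(summarize_lockups(SAMPLE).items()):
        print(f"CPU#{cpu} {comm}: {n}")
```

If both CPUs and unrelated processes (here stress and cupsd) show up, that points away from a single misbehaving program.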

Then I decided to strip my computer of everything (e.g. the audio card) not needed for the stress tests.
Again stress(1) under GNU/Linux, this time in single-user mode (so, for example, no cupsd):
Code:
# stress --cpu 2 --io 2 --vm 2 --timeout 5m
stress: info [669] dispatching hogs: 2 cpu, 2 io, 2 vm, 0 hdd
Crash! The CPU#0 backtrace has scrolled off the screen and is no longer readable, so I am left with the backtrace of CPU#1:
Code:
WARNING: CPU: 1 PID: 100 at /build/linux-CMiYW9/linux-3.16.7-ckt2/kernel/watchdog.c:265 watchdog_overflow_callback+0x98/0xc0()
Watchdog detected hard LOCKUP on cpu 1
Modules linked in: ...
CPU: 1 PID: 100 Comm: kworker/1:1H Tainted: G       D W     3.16.0-4-amd64 #1 Debian 3.16.7-ckt2-1
Hardware name:    /LP NF4 Series, BIOS 6.00 PG 03/29/2006
...
The computer crashed completely: this time SysRq was not responsive.

Apparently I have another hardware problem.