DaemonForums  
#1
24th February 2021
bashrules
Aspiring Unix Greybeard

Join Date: Mar 2010
Location: Here
Posts: 80
Computer Crashes

Hi all,

My desktop (an i3-2120 from 2012) has been running NetBSD 9 for the last 11 months. It runs 24/7 and crashes about once a month. This is a serious problem because I rely on it being remotely accessible.

The computer previously belonged to someone else, and I'm not aware of any hardware problems.

After the crash:
  • the power LED is still on and the fan continues to spin. This makes me believe it's not a hardware problem.
  • the monitor does not detect a signal from the graphics card. Here I'm not sure what that means. If the kernel crashes, what is supposed to be on the screen? Who is driving the graphics card/monitor?
  • pings are not answered.
  • there was no power outage - all other computers in the household are still running.
  • the BIOS is set to restore the previous power state after a power failure, i.e. if the machine was on, it is turned on again. Since NetBSD did not boot again, power was never actually lost.
  • it was not a graceful shutdown. When NetBSD came up after a manual power-cycle, filesystem checks ran (well, the journal was replayed).

I follow the netbsd-9 branch from NetBSD CVS, which I believe is the stable branch. The kernel config is the default (GENERIC).

NetBSD XXX 9.1_STABLE NetBSD 9.1_STABLE (GENERIC) #2: Sun Jan 3 11:19:52 PST 2021 root@XXX:/usr/obj/sys/arch/amd64/compile/GENERIC amd64

The last crash must have happened in the early morning of Feb 24 (I power-cycled the computer at 9:00am; the last cron job activity was at 3:00am). The log files are "blank" for the early morning. /var/log/messages:
Code:
Feb 21 13:16:04 XXX /netbsd: [ 3620751.3847045] kern error: [drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/intel_fifo_underrun.c:230)cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
Feb 21 13:16:04 XXX /netbsd: [ 3620751.3847045] kern error: [drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/intel_fifo_underrun.c:381)intel_pch_fifo_underrun_irq_handler] *ERROR* PCH transcoder A FIFO underrun
Feb 21 14:33:25 XXX /netbsd: [ 3625392.1028982] kern error: [drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/intel_fifo_underrun.c:230)cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
Feb 21 14:33:25 XXX /netbsd: [ 3625392.1028982] kern error: [drm:(/usr/src/sys/external/bsd/drm2/dist/drm/i915/intel_fifo_underrun.c:381)intel_pch_fifo_underrun_irq_handler] *ERROR* PCH transcoder A FIFO underrun
Feb 24 09:03:22 XXX syslogd[184]: restart
Feb 24 09:03:22 XXX /netbsd: [   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
No other log files have timestamps around that time (well, besides cron.log).
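
The reasoning above (the last log entry before the syslogd restart bounds the time at which the machine stopped) can be sketched as a small script. This is only an illustration under stated assumptions: parse_stamp and crash_windows are hypothetical helper names, the sample lines are abbreviated from the log above, and the year has to be supplied by hand because syslog timestamps omit it. As far as I know, syslogd also logs "restart" when it is signalled to reload (for example after log rotation), so each hit is a candidate to check against the boot banner, not proof of a reboot.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp (15 characters).

    The year is an assumption passed in by the caller; the log does not record it.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def crash_windows(lines, year):
    """Return (last_entry_before_restart, restart_time) pairs.

    The machine must have stopped somewhere inside each returned interval,
    provided the restart line really marks a boot.
    """
    windows = []
    previous = None
    for line in lines:
        try:
            stamp = parse_stamp(line, year)
        except ValueError:
            continue  # skip lines without a leading timestamp
        is_restart = "syslogd[" in line and line.rstrip().endswith("restart")
        if is_restart and previous is not None:
            windows.append((previous, stamp))
        previous = stamp
    return windows


# Sample lines abbreviated from the /var/log/messages excerpt above.
sample = [
    "Feb 21 14:33:25 XXX /netbsd: [ 3625392.1028982] kern error: PCH transcoder A FIFO underrun",
    "Feb 24 09:03:22 XXX syslogd[184]: restart",
]

if __name__ == "__main__":
    for last_seen, restarted in crash_windows(sample, 2021):
        print(f"stopped between {last_seen} and {restarted} ({restarted - last_seen})")
```

Passing in the lines of other logs as well (cron.log showed activity until 3:00am) would narrow the window further than /var/log/messages alone.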

What do you suspect, hardware problem or kernel crash?

Is there a knob to keep the kernel in a debugger upon a crash?

What can I do to root-cause the crash?
#2
2nd March 2021
ohmpr
Port Guard

Join Date: Feb 2021
Posts: 20

I don't know anything about NetBSD, but I would suspect a kernel panic caused by a memory leak or by memory going bad. To diagnose it, I would test each suspected cause in turn and rule them out by process of elimination. IDK, just a guess.
#3
3rd March 2021
gogh20100
New User

Join Date: Oct 2019
Posts: 1

2 suggestions:

- A core dump should have been created. You can load it with gdb to see where the kernel crashed.

- /var/log/messages reports an error in the i915 video kernel driver. You may want to configure X to use the wsfb driver (instead of the automatically selected intel driver) to test this hypothesis: if it is correct, your machine should no longer crash.

If this is the correct diagnosis, you should open a PR including the error messages in /var/log/messages, the output of uname -a, your full dmesg and the information obtained from the core dump.
#4
3rd March 2021
ohmpr
Port Guard

Join Date: Feb 2021
Posts: 20

Quote:
- /var/log/messages reports an error in the i915 video kernel driver. You may want to configure X to use the wsfb driver (instead of the automatically selected intel driver) to test this hypothesis: if it is correct, your machine should no longer crash.
Yes, I ignored this because it was right there in the posted log without being mentioned, so I assumed it had already been ruled out as the source of the issue. I probably shouldn't have, since it says "kern error" and that sounds like something that could cause a kernel panic.
#5
27th March 2021
bashrules
Aspiring Unix Greybeard

Join Date: Mar 2010
Location: Here
Posts: 80

Thank you for getting back to me.

Quote:
Originally Posted by ohmpr
I don't know anything about NetBSD, but I would suspect a kernel panic caused by a memory leak or by memory going bad. To diagnose it, I would test each suspected cause in turn and rule them out by process of elimination. IDK, just a guess.
I ran memtest overnight. The memory is good.

A kernel panic should create a core-file. There is none.
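
For reference, checking for a kernel crash dump can be sketched as below. The assumptions are the stock NetBSD defaults as I understand them: savecore(8) runs at boot, copies a dump from the dump device into /var/crash, and names the files netbsd.N and netbsd.N.core (with .gz appended when compression is on). The directory, the name pattern, and the helper name find_kernel_dumps are all assumptions to verify against the local rc.conf, not something confirmed in this thread.

```python
import re
from pathlib import Path

# Assumed savecore naming: netbsd.<N>.core, optionally gzip-compressed.
DUMP_NAME = re.compile(r"^netbsd\.(\d+)\.core(\.gz)?$")


def find_kernel_dumps(crash_dir="/var/crash"):
    """Return kernel dump files in crash_dir, ordered by dump number.

    An empty list means either no dump was saved or the directory is absent.
    """
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []
    dumps = []
    for entry in directory.iterdir():
        match = DUMP_NAME.match(entry.name)
        if match and entry.is_file():
            dumps.append((int(match.group(1)), entry))
    return [path for _, path in sorted(dumps)]


if __name__ == "__main__":
    found = find_kernel_dumps()
    if not found:
        print("no kernel dumps found")
    for path in found:
        print(path)
```

An empty result fits a hard hang as well as a panic whose dump could not be written or saved, so by itself it does not tell the two apart.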



Quote:
Originally Posted by gogh20100
2 suggestions:

- A core dump should have been created. You can load it with gdb to see where the kernel crashed.

- /var/log/messages reports an error in the i915 video kernel driver. You may want to configure X to use the wsfb driver (instead of the automatically selected intel driver) to test this hypothesis: if it is correct, your machine should no longer crash.

If this is the correct diagnosis, you should open a PR including the error messages in /var/log/messages, the output of uname -a, your full dmesg and the information obtained from the core dump.
There is no core file. Do you happen to know how core files are created, i.e. under which circumstances they are or are not created?

I doubt that these i915 kernel logs are responsible for this crash 3 days later.
#6
2nd April 2021
Sehnsucht94
Real Name: Paolo Vincenzo Olivo
Package Pilot

Join Date: Oct 2017
Location: Rome
Posts: 169

Quote:
Originally Posted by bashrules
There is no core file. Do you happen to know how core files are created, i.e. under which circumstances they are or are not created?

I doubt that these i915 kernel logs are responsible for this crash 3 days later.
It's covered in the wiki
__________________
“My house will have two legs and my dreams will have no borders”
Tags
crash, debugging, netbsd