|||
Constant Kernel Panics, Corrupted file system reboots over and over
I am having some serious issues with what appear to be drive failures. However, this is a new drive. When I reinstall the system, it works perfectly fine for a few days, and then crashes with the error below. I have run smartctl -t long and no errors come up. After this crash, it rebooted twice and then came back up. I unmounted the partition, ran fsck on it, and it seemed to be working fine, for a few minutes. Then it crashed again, and when the computer rebooted it couldn't find a valid boot partition. Luckily, rebooting again into single-user mode and then running fsck on all drives brings it back up, temporarily.
Does anyone know what is going on here? From what I am reading, this is an issue that has been known since 2004 but no one has bothered to fix. Something about softupdates being broken. System specs are FreeBSD 7.0 running on an Intel 1.66 GHz dual core with 2 GB of DDR2, an Intel (not sure of the model) motherboard with a JMicron JMB363 controller. The log pasted below is the latest crash. It also occurs when I put in ANY disk that isn't a member of an array. I have pasted in the logs from those crashes as well. ad8, ad10 and ad12 are all brand new and have no issues in any other computer. I am beginning to panic. ATM there seems to be a path to bring the computer back up, but as time goes on, sooner or later it just isn't going to work. Any help you guys could give me would really be appreciated.
Thanks
~m
Most current crash:
Jul 11 22:46:45 octavian kernel: ad8: FAILURE - device detached
Crash 1:
Jul 7 08:39:52 octavian fsck: /dev/ar0s1d: 3694 files, 352639422 used, 89402533 free (2053 frags, 11175060 blocks, 0.0% fragmentation)
Last edited by pseudonym; 13th July 2008 at 02:02 AM. Reason: Wrong MB in hardware description.. sorry.. have so many I lose track =P |
|
|||
Either the drive or the cable. The latter is easy enough to swap out. For the former, I'd suggest getting all your data off, and writing some character (like a zero) to the entire disk using dd. That will force bad sector relocation. Then reinstall.
I had to do that a few years ago when I had a bad cable for a while. After getting the bad sectors mapped out, it has worked fine. Then again, the disk might be toast, in which case you will have to replace it. BTW, what does the bad sector count show in smartctl? If it is growing, you have troubles.
Correction: not the bad sector info, but the error rate.
Last edited by DrJ; 13th July 2008 at 04:03 PM. |
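The zero-fill suggested above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested procedure: `zero_fill` is a hypothetical helper name, `/dev/ad8` is only the device named in this thread, and the run destroys everything on the target, so it belongs after the backup. Writing every sector gives the drive firmware a chance to remap the ones that fail.

```shell
#!/bin/sh
# Sketch: overwrite a target with zeros so the drive can remap bad sectors.
# DESTRUCTIVE: every byte on the target is lost.
# zero_fill is a hypothetical helper name, not a FreeBSD tool.
zero_fill() {
    target=$1      # e.g. /dev/ad8 for the whole disk (assumed from the thread)
    count=${2:-}   # optional number of 1 MiB blocks; empty = run until the device is full
    # The block size is spelled out in bytes so it does not depend on
    # which size suffixes a particular dd implementation accepts.
    if [ -n "$count" ]; then
        dd if=/dev/zero of="$target" bs=1048576 count="$count"
    else
        dd if=/dev/zero of="$target" bs=1048576
    fi
}

# Harmless rehearsal against a scratch file instead of a real disk:
#   zero_fill /tmp/scratch.img 4
```

Pointing the helper at a scratch file with a block count is a cheap way to check the invocation before aiming it at a real disk.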
|
|||
The drive is fine. It has been low level formatted and I have run the Maxtor diagnostics on it with no errors, no bad sectors.. nothing. I have the same issues with two brand new Seagate 160s I threw in to back up my data with.
Smartctl -t long shows no errors at all. No bad sectors. I am using the GENERIC kernel. Not sure how to use UFS without softupdates. Nor have I run tunefs. Not sure how really, or what param I want to tune. Never had to do so before.
Here is the full disk info:
ar0: 3 250GB SATAII Seagate RAID5 JM363
ar1: 3 320GB SATAII Seagate RAID0 JM363
ad8: Maxtor 750GB IDE
ad10 and ad12: 160GB SATAII Seagate (unplugged and disabled ATM)
At first I thought it was the drive as well, but I have the issue with ad10 and ad12 as well. I have also tried connecting up ad8 using an IDE to SATA bridge and I get the same errors. At first I thought it was the bridge device that was fried. But I get the same thing when I connect up using a brand new IDE cable.
*** Just tried turning off softupdates.
[root@octavian /home/pseudonym]# tunefs -n 'disable' /dev/ad8s1d
tunefs: soft updates cleared
tunefs: /dev/ad8s1d: failed to write superblock
Last edited by pseudonym; 13th July 2008 at 08:47 PM. |
|
|||
That test is rather cursory. You should look at smartctl -a for the device and look at the uncorrectable error rate, and the non-media errors.
Still, I doubt that that many drives are hosed. Something else must be up. |
|
|||
Here is what I get when I do a smartctl -a
Code:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   084   069   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   096   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       99
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  7 Seek_Error_Rate         0x000f   045   045   030    Pre-fail  Always       -       9629385353742
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1492
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       153
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Temperature_Celsius     0x0022   054   040   045    Old_age   Always   In_the_past 774111278
194 Temperature_Celsius     0x0022   046   060   000    Old_age   Always       -       46 (Lifetime Min/Max 0/23)
195 Hardware_ECC_Recovered  0x001a   071   055   000    Old_age   Always       -       51827
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged
=(. |
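For ATA drives, the rows usually worth a second look in this kind of attribute table are the sector counters: Reallocated_Sector_Ct (5), Current_Pending_Sector (197) and Offline_Uncorrectable (198). Here is a small sketch that pulls out the non-zero ones, assuming the ten-column layout of the attribute table. `flag_sectors` is a hypothetical helper name, and the sample rows are copied from the output posted in this thread.

```shell
#!/bin/sh
# flag_sectors (hypothetical helper): reads smartctl attribute rows on stdin
# and prints NAME=RAW for the sector-health attributes whose raw value is non-zero.
# Assumes column 1 is the attribute ID, column 2 the name, column 10 the raw value.
flag_sectors() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { if ($10 + 0 > 0) print $2 "=" $10 }'
}

# Sample rows taken from the smartctl output in this thread.
flag_sectors <<'EOF'
  1 Raw_Read_Error_Rate     0x000f   084   069   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
EOF
```

With the sample rows this reports all three sector counters as non-zero, which is worth weighing against the "no bad sectors" reading. On a live system the input would come from `smartctl -A /dev/ad8` instead of the here-document.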
|
|||
The format (and some of the numbers) depend strongly on the disk you use. Here's what mine displays (with lots of edits):
Code:
Device: SEAGATE ST336753LW
...
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   25504257        0         0  25504257   28315309       9329.822       20979
write:         0        0         0         0          0       3033.998           0

Non-medium error count:    15235
|
|
|||
This is very odd.. I am not seeing any of that, even when I try it on one of my seagate drives....
Here is the entire output of smartctl -t long /dev/ad8 Code:
[root@octavian /home/pseudonym]# smartctl -a /dev/ad8
smartctl version 5.37 [i386-portbld-freebsd7.0] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10 family
Device Model:     ST3750640A
Serial Number:    5QD3L4N0
Firmware Version: 3.AAE
User Capacity:    750,156,374,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Jul 13 03:08:39 2008 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 202) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   084   069   006    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   096   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       99
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  7 Seek_Error_Rate         0x000f   045   045   030    Pre-fail  Always       -       9629385411627
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1492
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       153
187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always       -       0
190 Temperature_Celsius     0x0022   052   040   045    Old_age   Always   In_the_past 807665712
194 Temperature_Celsius     0x0022   048   060   000    Old_age   Always       -       48 (Lifetime Min/Max 0/23)
195 Hardware_ECC_Recovered  0x001a   060   055   000    Old_age   Always       -       70299244
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                         Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)       90%        1428             -
# 2  Extended offline    Self-test routine in progress  90%        1492             -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Last edited by pseudonym; 13th July 2008 at 09:17 PM. Reason: drive WAS a seagate, not maxtor |
|
|||
Figured I would also throw in the dmesg as well.
Code:
[root@octavian /home/pseudonym]# dmesg Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 7.0-RELEASE #0: Sun Feb 24 19:59:52 UTC 2008 root@logan.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz (1872.27-MHz 686-class CPU) Origin = "GenuineIntel" Id = 0x6f2 Stepping = 2 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0xe3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM> AMD Features=0x20100000<NX,LM> AMD Features2=0x1<LAHF> Cores per package: 2 real memory = 2137153536 (2038 MB) avail memory = 2081665024 (1985 MB) ACPI APIC Table: <INTEL DQ965GF > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0: Changing APIC ID to 2 ioapic0 <Version 2.0> irqs 0-23 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) hptrr: HPT RocketRAID controller driver v1.1 (Feb 24 2008 19:59:27) acpi0: <INTEL DQ965GF> on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 cpu0: <ACPI CPU> on acpi0 est0: <Enhanced SpeedStep Frequency Control> on cpu0 p4tcc0: <CPU Frequency Thermal Control> on cpu0 cpu1: <ACPI CPU> on acpi0 est1: <Enhanced SpeedStep Frequency Control> on cpu1 p4tcc1: <CPU Frequency Thermal Control> on cpu1 acpi_button0: <Sleep Button> on acpi0 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0 pci0: <ACPI PCI bus> on pcib0 vgapci0: <VGA-compatible display> port 0x3438-0x343f mem 0x90200000-0x902fffff,0x80000000-0x8fffffff irq 16 at device 
2.0 on pci0 agp0: <Intel Q965 SVGA controller> on vgapci0 agp0: detected 7676k stolen memory agp0: aperture size is 256M pci0: <simple comms> at device 3.0 (no driver attached) atapci0: <Intel ATA controller> port 0x3430-0x3437,0x344c-0x344f,0x3428-0x342f,0x3448-0x344b,0x3400-0x340f irq 18 at device 3.2 on pci0 atapci0: [ITHREAD] ata2: <ATA channel 0> on atapci0 ata2: [ITHREAD] ata3: <ATA channel 1> on atapci0 ata3: [ITHREAD] pci0: <simple comms, UART> at device 3.3 (no driver attached) em0: <Intel(R) PRO/1000 Network Connection Version - 6.7.3> port 0x30e0-0x30ff mem 0x90300000-0x9031ffff,0x90324000-0x90324fff irq 20 at device 25.0 on pci0 em0: Using MSI interrupt em0: Ethernet address: 00:19:d1:04:50:52 em0: [FILTER] uhci0: <UHCI (generic) USB controller> port 0x30c0-0x30df irq 16 at device 26.0 on pci0 uhci0: [GIANT-LOCKED] uhci0: [ITHREAD] usb0: <UHCI (generic) USB controller> on uhci0 usb0: USB revision 1.0 uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0 uhub0: 2 ports with 2 removable, self powered uhci1: <UHCI (generic) USB controller> port 0x30a0-0x30bf irq 21 at device 26.1 on pci0 uhci1: [GIANT-LOCKED] uhci1: [ITHREAD] usb1: <UHCI (generic) USB controller> on uhci1 usb1: USB revision 1.0 uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1 uhub1: 2 ports with 2 removable, self powered ehci0: <EHCI (generic) USB 2.0 controller> mem 0x90326c00-0x90326fff irq 18 at device 26.7 on pci0 ehci0: [GIANT-LOCKED] ehci0: [ITHREAD] usb2: EHCI version 1.0 usb2: companion controllers, 2 ports each: usb0 usb1 usb2: <EHCI (generic) USB 2.0 controller> on ehci0 usb2: USB revision 2.0 uhub2: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb2 uhub2: 4 ports with 4 removable, self powered pci0: <multimedia> at device 27.0 (no driver attached) pcib1: <ACPI PCI-PCI bridge> at device 28.0 on pci0 pci1: <ACPI PCI bus> on pcib1 pcib2: <ACPI PCI-PCI bridge> at device 28.1 on pci0 pci2: <ACPI PCI bus> on pcib2 atapci1: <Marvell 
88SX6101 UDMA133 controller> port 0x2018-0x201f,0x2024-0x2027,0x2010-0x2017,0x2020-0x2023,0x2000-0x200f mem 0x90100000-0x901001ff irq 17 at device 0.0 on pci2 atapci1: [ITHREAD] ata4: <ATA channel 0> on atapci1 ata4: [ITHREAD] pcib3: <ACPI PCI-PCI bridge> at device 28.2 on pci0 pci3: <ACPI PCI bus> on pcib3 pcib4: <ACPI PCI-PCI bridge> at device 28.3 on pci0 pci4: <ACPI PCI bus> on pcib4 pcib5: <ACPI PCI-PCI bridge> at device 28.4 on pci0 pci5: <ACPI PCI bus> on pcib5 uhci2: <UHCI (generic) USB controller> port 0x3080-0x309f irq 23 at device 29.0 on pci0 uhci2: [GIANT-LOCKED] uhci2: [ITHREAD] usb3: <UHCI (generic) USB controller> on uhci2 usb3: USB revision 1.0 uhub3: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3 uhub3: 2 ports with 2 removable, self powered uhci3: <UHCI (generic) USB controller> port 0x3060-0x307f irq 19 at device 29.1 on pci0 uhci3: [GIANT-LOCKED] uhci3: [ITHREAD] usb4: <UHCI (generic) USB controller> on uhci3 usb4: USB revision 1.0 uhub4: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb4 uhub4: 2 ports with 2 removable, self powered uhci4: <UHCI (generic) USB controller> port 0x3040-0x305f irq 18 at device 29.2 on pci0 uhci4: [GIANT-LOCKED] uhci4: [ITHREAD] usb5: <UHCI (generic) USB controller> on uhci4 usb5: USB revision 1.0 uhub5: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb5 uhub5: 2 ports with 2 removable, self powered ehci1: <EHCI (generic) USB 2.0 controller> mem 0x90326800-0x90326bff irq 23 at device 29.7 on pci0 ehci1: [GIANT-LOCKED] ehci1: [ITHREAD] usb6: EHCI version 1.0 usb6: companion controllers, 2 ports each: usb3 usb4 usb5 usb6: <EHCI (generic) USB 2.0 controller> on ehci1 usb6: USB revision 2.0 uhub6: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb6 uhub6: 6 ports with 6 removable, self powered pcib6: <ACPI PCI-PCI bridge> at device 30.0 on pci0 pci6: <ACPI PCI bus> on pcib6 atapci2: <SiI SiI 3512 SATA150 controller> port 
0x1018-0x101f,0x1024-0x1027,0x1010-0x1017,0x1020-0x1023,0x1000-0x100f mem 0x90004800-0x900049ff irq 21 at device 0.0 on pci6 atapci2: [ITHREAD] ata5: <ATA channel 0> on atapci2 ata5: [ITHREAD] ata6: <ATA channel 1> on atapci2 ata6: [ITHREAD] fwohci0: <Texas Instruments TSB43AB22/A> mem 0x90004000-0x900047ff,0x90000000-0x90003fff irq 19 at device 3.0 on pci6 fwohci0: [FILTER] fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:02:b3:00:37:13:44:37 fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: <IEEE1394(FireWire) bus> on fwohci0 dcons_crom0: <dcons configuration ROM> on firewire0 dcons_crom0: bus_addr 0x12dc000 fwe0: <Ethernet over FireWire> on firewire0 if_fwe0: Fake Ethernet address: 02:02:b3:13:44:37 fwe0: Ethernet address: 02:02:b3:13:44:37 fwip0: <IP over FireWire> on firewire0 fwip0: Firewire address: 00:02:b3:00:37:13:44:37 @ 0xfffe00000000, S400, maxrec 2048 sbp0: <SBP-2/SCSI over FireWire> on firewire0 fwohci0: Initiate bus reset fwohci0: BUS reset fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode isab0: <PCI-ISA bridge> at device 31.0 on pci0 isa0: <ISA bus> on isab0 atapci3: <Intel ICH8 SATA300 controller> port 0x3418-0x341f,0x3444-0x3447,0x3410-0x3417,0x3440-0x3443,0x3020-0x303f mem 0x90326000-0x903267ff irq 19 at device 31.2 on pci0 atapci3: [ITHREAD] atapci3: AHCI called from vendor specific driver atapci3: AHCI Version 01.10 controller with 6 ports detected ata7: <ATA channel 0> on atapci3 ata7: [ITHREAD] ata8: <ATA channel 1> on atapci3 ata8: [ITHREAD] ata9: <ATA channel 2> on atapci3 ata9: [ITHREAD] ata10: <ATA channel 3> on atapci3 ata10: [ITHREAD] ata11: <ATA channel 4> on atapci3 ata11: [ITHREAD] ata12: <ATA channel 5> on atapci3 ata12: [ITHREAD] pci0: <serial bus, SMBus> at device 31.3 (no driver attached) sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 sio0: type 16550A sio0: [FILTER] pmtimer0 on isa0 ata0 at port 
0x1f0-0x1f7,0x3f6 irq 14 on isa0 ata0: [ITHREAD] ata1 at port 0x170-0x177,0x376 irq 15 on isa0 ata1: [ITHREAD] atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0 ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode ppc0: FIFO with 16/16/8 bytes threshold ppbus0: <Parallel port bus> on ppc0 ppbus0: [ITHREAD] plip0: <PLIP network interface> on ppbus0 lpt0: <Printer> on ppbus0 lpt0: Interrupt-driven port ppi0: <Parallel I/O> on ppbus0 ppc0: [GIANT-LOCKED] ppc0: [ITHREAD] sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ukbd0: <CHESEN USB Keyboard, class 0/0, rev 1.10/1.10, addr 2> on uhub1 kbd2 at ukbd0 uhid0: <CHESEN USB Keyboard, class 0/0, rev 1.10/1.10, addr 2> on uhub1 Timecounters tick every 1.000 msec hptrr: no controller detected.firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me) firewire0: bus manager 0 (me) ad8: 715404MB <Seagate ST3750640A 3.AAE> at ata4-master UDMA100 ad14: 305245MB <Seagate ST3320620AS 3.AAE> at ata7-master SATA150 ad16: 238475MB <Seagate ST3250410AS 3.AAC> at ata8-master SATA150 ad18: 238475MB <Seagate ST3250410AS 3.AAC> at ata9-master SATA150 ad20: 305245MB <Seagate ST3320620AS 3.AAC> at ata10-master SATA150 ad22: 238475MB <Seagate ST3250410AS 3.AAC> at ata11-master SATA150 ad24: 305245MB <Seagate ST3320620AS 3.AAE> at ata12-master SATA150 ar0: 915729MB <Intel MatrixRAID RAID0 (stripe 128 KB)> status: READY ar0: disk0 READY using ad14 at ata7-master ar0: disk1 READY using ad20 at ata10-master ar0: disk2 READY using ad24 at ata12-master ar1: 476944MB <Intel MatrixRAID RAID5 (stripe 128 KB)> status: READY ar1: disk0 READY using ad16 at ata8-master ar1: disk1 READY using ad18 
at ata9-master ar1: disk2 READY using ad22 at ata11-master SMP: AP CPU #1 Launched! Trying to mount root from ufs:/dev/ar0s1a WARNING: /archive/babylon was not properly dismounted |
|
|||
Oops! Sorry.. thought the 750 was a Maxtor. Guess it is also a Seagate =)
|
|
|||
Smartctl output has always been more useful for SCSI drives. What you have is typical of IDEs (and probably SATAs); if you don't get the error rates most of the information is not that helpful. Temperatures are, I suppose, but I doubt that this is temperature related.
But do try the -a flag. It may not give you any different information, but it might. |
|
|||
Is this IDE or SATA? What controller are you using (you have a few)?
|
|
|||
That makes sense. I am not using my SCSI stuff right now (have a 9 bay external SCSI tower that I am going to be converting to a SATA RAID box.. damn thing sounds like a jet!). I have a whole mess of 36gig ULTRA320 drives. Having quite a few issues with the Adaptec 2100s controllers I have so they aren't in use. Besides, don't need the speed as this is mostly my torrent/fileserver (also running SWARM so that is why it is a beefy system).
Any idea why I am getting a "failed to write superblock" error when I try to do a tunefs -n disable? It doesn't say to run fsck /dev/ad8s1d like most of the errors I see when I look online. The output above is with the -a flag =(. |
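One possible explanation, offered as an assumption rather than a diagnosis: tunefs cannot rewrite the superblock of a file system that is mounted read-write, so the partition has to be unmounted (or downgraded to read-only) first. A failing disk could produce the same message. The sketch below shows that sequence. `/mnt/data` is a hypothetical mount point, and `tunefs_offline` is a hypothetical wrapper that only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# tunefs_offline (hypothetical wrapper): unmount, disable soft updates, remount.
# With DRY_RUN unset or 1 it only echoes the commands it would run.
tunefs_offline() {
    dev=$1   # partition to tune, e.g. /dev/ad8s1d (from the thread)
    mnt=$2   # its mount point; /mnt/data below is a made-up example
    run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }
    run umount "$mnt" &&
    run tunefs -n disable "$dev" &&
    run mount "$dev" "$mnt"
}

# Dry run: prints the three commands in order without touching the disk.
tunefs_offline /dev/ad8s1d /mnt/data
```

Keeping the default as a dry run makes it easy to read the sequence over before letting it touch a disk that is already misbehaving.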
|
|||
The 750 is IDE, the rest are SATA. They are all hooked up to a JMicron JMB363 onboard RAID controller. It has 6 SATAII ports and, of course, two IDE connectors. I have another SATA controller in there, but it isn't in use ATM. That is what the two new 160s are hooked up to. But they are unplugged.
|
|
|||
I *think* the JMicron device was fixed, but it was broken for a long time. I'd check the lists to see if that is so. I don't keep up much with IDE or SATA devices, since all my FreeBSD disks are SCSI. So I can't help you much with that.
Do you have another IDE controller lying around to try? While it shouldn't matter, you might want to disconnect all your disks other than the one you are having trouble with. Just removing the power connection would be good enough. |
|
|||
Thanks for all the help BTW. I really appreciate it. I really wish I could disconnect all the drives, but root is on ar0. Do you think that could be it? I actually picked FreeBSD because it supported the JMicron device (I am an OpenBSD guy myself; Theo is pretty much my hero ever since I went down to Calgary to hear him speak =)).
If I could, I would back up all my data right now and try again. However, when you have almost 2TB of data, it gets a little hard to find a place to put it! LOL. =). Do you think that having root on ar0 might be the issue? I was thinking of doing a reinstall and putting the system on one of the 160s.... |
|
|||
But this sure sounds like a driver bug, or a subtle interaction of things on your system. For me, the OS either worked or it didn't. Except when I had bad cables (one was on an IDE CD on the JMicron, which I did get to work). |
|
|||
The RocketRAID is definitely not supported. It is the secondary SATA controller I threw in for the two 160s, and I am just using it for the SATA ports. Nothing is connected to it ATM.
I was thinking that it might be because that controller is on the PCI bus rather than PCI-e and is flooding the PCI bus. In fact, that is my big concern: am I flooding the IDE bus and getting read errors? If so, any idea how to slow things down? It does sound like a driver issue to me as well. Guess I might just have to wait for an updated version... |
|
|||
Okay. I think I have found a bad block! W00t! Can't believe I am saying that LOL. Still having issues with ad10 and ad12, but hopefully I can fix this. Here is what I am getting:
1 Extended offline Completed: read failure 30% 1495 980569737
980569737 is the LBA_of_first_error. So now I have to figure out how to mark that block bad. Any ideas? |
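A common approach for a single unreadable sector is to overwrite it so the drive can reallocate it. The sketch below only does the arithmetic and prints the command; it does not run it. It assumes 512-byte logical sectors and that the LBA in the self-test log is relative to the whole disk (/dev/ad8), not to a slice. `lba_to_bytes` and `rewrite_cmd` are hypothetical helper names. Whatever file owns that sector loses those 512 bytes, and the file system should be unmounted before any raw write.

```shell
#!/bin/sh
# lba_to_bytes (hypothetical helper): byte offset of a sector on the raw disk.
# The sector size defaults to 512 bytes, which is an assumption about this drive.
lba_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# rewrite_cmd (hypothetical helper): prints, but does not execute, a dd command
# that would overwrite exactly one sector at the given LBA with zeros.
rewrite_cmd() {
    printf 'dd if=/dev/zero of=%s bs=512 seek=%s count=1\n' "$1" "$2"
}

lba_to_bytes 980569737          # byte offset of the failing sector
rewrite_cmd /dev/ad8 980569737  # command to review before running by hand
```

Printing the command instead of executing it leaves a moment to double-check the device name and LBA, since a wrong `seek=` value silently zeroes a healthy sector.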