DaemonForums  

Old 4th December 2009
Carpetsmoker Carpetsmoker is online now
Real Name: Martin
Old man from scene 24
 
Join Date: Apr 2008
Location: Eindhoven, Netherlands
Posts: 2,051
Thanked 198 Times in 156 Posts
Default

Quote:
after the tech guy at the little hole-in-the-wall computer store gave me a lecture about always using the manufacturer's test program because it will give him an error code he needs for his return to the mfg
You don't need those for either Seagate or WD. You do need them for HP, but I always give them error code 7, which means the drive is so badly damaged that the test can't run.
__________________
UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.
Old 4th December 2009
AncientDragonfly AncientDragonfly is offline
not quite new user
 
Join Date: Apr 2009
Posts: 25
Thanked 0 Times in 0 Posts
Default

Quote:
Originally Posted by Carpetsmoker View Post
You don't need those for either Seagate or WD. You do need them for HP, but I always give them error code 7, which means the drive is so badly damaged that the test can't run.
LOL! I'll remember that!
Old 29th December 2009
AncientDragonfly

Got an update, and a few more questions.

Update first:
After I got the replacement disk, I first ran the MHDD tools to see what a good hard drive should look like. Here's what the SMART test looked like on December 4:



I got busy reinstalling everything, and soon had a system that wasn't locking up.

This morning I got up and was doing my usual morning net things when a phone call came in. I returned to a black screen, with no response to mouse, keyboard, CTRL-ALT-F1, or anything other than the power button. Of course, the first thing I thought of was to find out what the SMART data said:



In not quite a month, it has acquired 998 reallocated sectors, with an enormous increase in read errors and seek errors!

Around noon today, I bought a replacement disk, this time a Western Digital Caviar 1 TB, and followed this procedure to move the installed OS and programs to the new disk: http://www.bsdguides.org/guides/free..._harddrive.php, which I was relieved to be able to complete without the disk freezing, and I was overjoyed to be able to boot onto the new drive after I finished.

The Seagate 1 TB will go back to the store tomorrow, so before I started the Seagate test program, I ran the MHDD program again for the SMART test, out of curiosity. It had 1026 reallocated sectors. That is 28 new ones just over the course of today!

I am currently running Seagate's SeaTools long test, but the short test showed the drive as passing. Needless to say, I don't have much faith in Seagate at this point, but at least I'll be able (thanks to this forum) to tell the guy at the store what to look at instead.

Now my new questions are these:
One of my transferred files ended up being copied as a 0-byte file when it had been about 21k. Fortunately, I had a copy elsewhere, but it occurs to me that other files may be missing pieces too, due to the errors on the drive they were dumped from.

Is there something I can do now that will recompile everything with a few steps, replacing any missing pieces, or will that be a slow and individual process?

I have not done the security updates that came out on Dec 3 yet, and one of them requires rebuilding world. Will this cover recompiling and making complete the missing pieces, mentioned above, for the base system, leaving only the ports to be fixed?

Is there anything else I should be concerned about doing in order to deal with any issues from having copied the files using dump/restore from the bad HD to the new one?
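One cheap first check for the symptom described above (a file that arrived as 0 bytes) is to list every empty regular file on the new disk and review the list by hand. The following is only a minimal sketch in Python, not a tool from this thread; the function name `find_empty_files` is made up for illustration, and plenty of files are legitimately empty (lock files, placeholders), so a hit is a candidate to inspect, not proof of damage.

```python
import os
import stat
import sys


def find_empty_files(root):
    """Return a sorted list of regular files under root whose size is 0 bytes."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so links are never reported
                st = os.lstat(path)
            except OSError:
                # File vanished or is unreadable; skip it in this sketch
                continue
            if stat.S_ISREG(st.st_mode) and st.st_size == 0:
                empty.append(path)
    return sorted(empty)


if __name__ == "__main__":
    # Usage (hypothetical script name): python find_empty.py /usr/local /home
    for root in sys.argv[1:]:
        for path in find_empty_files(root):
            print(path)
```

This only catches files that were truncated to nothing; a file that lost a block in the middle keeps its size and needs a checksum comparison against a known-good copy.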
Old 29th December 2009
robbak robbak is offline
Real Name: Robert Backhaus
VPN Cryptographer
 
Join Date: May 2008
Location: North Queensland, Australia
Posts: 366
Thanked 40 Times in 39 Posts
Default

It may well be within the manufacturer's 'normal' range. Drive speeds and capacities are pushed to the point where errors are unavoidable and have to be recovered by the drive's internals.
If you know what a 'bathtub curve' is, you would understand that a large number of reallocated sectors in the first month could well be expected. Maybe not 1000, though - I have never done the type of investigation you have.
(Note that, back in the days of huge 10MB SCSI drives, a new drive would have a list of maybe 10 bad sectors printed on its label. The user had to enter those sectors into the software as part of the setup. A new drive will have some bad sectors, guaranteed. It's just that the drive now takes care of them, not the software.)
__________________
The only dumb question is a question not asked.
The only dumb answer is an answer not given.
Old 29th December 2009
Carpetsmoker

Very high read error rate, seek error rate, and hardware ECC recovered values are normal for Seagate drives. Reallocated sectors, however, are not, AFAIK.

Did you also run a surface scan (F4 key twice)?
Old 29th December 2009
AncientDragonfly

Here's the surface scan from December 4:



And here's the one from this morning:



While it was running, it hit patches where <150ms, <500ms, and >500ms results were clustered in one area, and the drive made more noise when it got to them.

Here's an error from the drive yesterday:



I was in an ssh session finding out how much space I would need to copy it all over to somewhere else. It was running du (I think) when it lost contact.

I am thinking that even though it didn't show any UNC errors, the longer access times are just too long for the time FBSD allows for the disk to be read. There were several times it froze yesterday as I tried to get my most important stuff off of it.

The thing is, I don't really care about the number of reallocated sectors and whatnot, I care about having a system that doesn't freeze up on me while I'm working, and that I can rely on to keep my data from day-to-day.

robbak, I don't remember the bad sector labels on the 10 MB SCSI drives, but I do remember thinking I had more space than I would ever need when my workplace bought me a 20 MB drive. That was when adding a new piece of hardware like a video card meant adding all kinds of stuff into autoexec.bat and config.sys manually to find the magic combination that worked. (I don't miss that at all!) And the floppy disks were actually floppy.
Old 29th December 2009
Carpetsmoker

The first MHDD results look fine. The second do not. For a new drive <150ms results are normal, <500ms are hmmish, and >500ms are definitely not OK.
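To make that rule of thumb concrete, here is a small Python sketch that sorts per-block access times into the three bands mentioned above. The thresholds (150 ms and 500 ms) come from the post; the function names, the band labels, and the way exact boundary values are handled are assumptions for illustration, and this is not how MHDD itself is implemented.

```python
from collections import Counter


def classify_access_time(ms):
    """Place one access time (milliseconds) in a band from the rule of thumb above."""
    if ms < 150:
        return "normal"
    if ms < 500:
        return "suspect"  # the "hmmish" band
    return "bad"


def summarize(times_ms):
    """Count how many measured access times fall in each band."""
    return Counter(classify_access_time(t) for t in times_ms)


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the scans in this thread
    sample = [5, 120, 150, 499, 500, 800]
    for band, count in sorted(summarize(sample).items()):
        print(band, count)
```

On a healthy new drive, everything should land in the first band; any count in the last band matches the "definitely not OK" case.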

The reason you don't see any UNC errors is that the bad sectors are already reallocated.
Now you also see why I recommended MHDD specifically: AFAIK MHDD is one of only two programs which display the time it takes to access sectors (the other being an internal test utility we have at work that someone once wrote).

The kernel panic could be a FreeBSD bug or some random glitch, but in combination with all the previous data I have little doubt that the "new" hard drive is broken.

This is not uncommon, by the way. The drive you received probably isn't new but a refurbished (repaired) drive; IIRC you can actually see that on the Seagate label (or at least the manufacturing date). There is a good reason why I test RMA drives.

Quote:
The thing is, I don't really care about the number of reallocated sectors and whatnot
You should. A reallocated sector can mean the data on that sector is lost. This could be anything from a random file in /tmp to /boot/kernel.
Also, it's my experience that once you have 1 reallocated sector, others start showing up soon (Not always, but often).

As a side note, there's smartmontools in ports, which can monitor SMART data and send an email when something goes wrong. It's very useful on servers for catching problems early and preventing unplanned downtime (as we say in Dutch, "voorkomen is beter dan genezen", meaning: preventing is better than curing).
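As a rough illustration of that kind of monitoring, the Python sketch below parses the attribute table that `smartctl -A` (part of smartmontools) prints and reports how much the reallocated sector count grew between two readings. The helper names and the device path in the usage comment are assumptions; smartmontools ships its own daemon for this job, so treat this as a way to see what is being watched, not as a replacement for it.

```python
import subprocess


def parse_smart_attributes(text):
    """Map attribute name to raw value from `smartctl -A` style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header line and banner text are skipped by this check
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def reallocated_growth(previous, current):
    """Return how many sectors were reallocated between two parsed readings."""
    key = "Reallocated_Sector_Ct"
    return current.get(key, 0) - previous.get(key, 0)


def read_attributes(device):
    """Run `smartctl -A` on device; needs smartmontools and usually root."""
    # smartctl uses its exit status as a bit mask, so a nonzero code is not
    # treated as a failure here
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)


# Usage (device name is an assumption, adjust for your system):
#   before = read_attributes("/dev/ad4")
#   ... some time later ...
#   after = read_attributes("/dev/ad4")
#   print(reallocated_growth(before, after))
```

With the two counts reported earlier in this thread (998 in the morning, 1026 later the same day), `reallocated_growth` would return 28.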

Anyway, any computer store worth its money will replace this drive, even if the official tool says the drive is OK. If they give you problems, you can also return the drive yourself on the Seagate website; this takes about a week, though, while the store may replace it immediately.
Old 31st December 2009
AncientDragonfly

Quote:
Originally Posted by Carpetsmoker View Post
The first MHDD results look fine. The second do not. For a new drive <150ms results are normal, <500ms are hmmish, and >500ms are definitely not OK.
The first one was from when I first got the (first) replacement drive home, before I did anything else with it (except for putting it in the case).

Quote:
This is not uncommon, by the way. The drive you received probably isn't new but a refurbished (repaired) drive; IIRC you can actually see that on the Seagate label (or at least the manufacturing date). There is a good reason why I test RMA drives.
I didn't see anything on it that indicated it might be a refurbished drive, but I had gotten a feeling when they gave it to me that it might be a good idea to test it first, probably because I didn't want to reinstall everything on another disk that might turn out to be bad. Then it went bad over the month, after I had reinstalled everything.

I guess I am actually pretty lucky because I've never had a disk be bad when I got it, or go bad soon afterwards; in fact, I have a 1 GB disk that is still in use occasionally.

Quote:
You should. A reallocated sector can mean the data on that sector is lost. This could be anything from a random file in /tmp to /boot/kernel.
Also, it's my experience that once you have 1 reallocated sector, others start showing up soon (Not always, but often).
Well, I will from now on! And at least this process has taught me what it takes to get the stuff from a dying drive onto a new one, and how to back up absolutely everything, instead of just my irreplaceable stuff.

Quote:
As a side note, there's smartmontools in ports, which can monitor SMART data and send an email when something goes wrong. It's very useful on servers for catching problems early and preventing unplanned downtime (as we say in Dutch, "voorkomen is beter dan genezen", meaning: preventing is better than curing).
smartmontools is compiled; it just needs setting up now. Thanks. We have a saying like that here too: "an ounce of prevention is worth a pound of cure."

Quote:
Anyway, any computer store worth its money will replace this drive, even if the official tool says the drive is OK. If they give you problems, you can also return the drive yourself on the Seagate website; this takes about a week, though, while the store may replace it immediately.
The store wanted to give me a Hitachi as a replacement, but since I had already replaced it, they gave me the money back. At first the tech guy grilled me about whether I had mounted it in the case with screws (hahaha!), then a lot of talking I didn't understand (Korean, I think) took place between him and the older woman who has been seeing me in the store periodically for years. Then he went and tested it. I explained to her that I had already bought a replacement because I needed to put the data somewhere instead of reinstalling and configuring everything for the 3rd time. She told me that out of the 100 they had gotten in and sold, I was the only one who had returned one. I figure the other buyers either don't know yet, or they just think Windows screwed up again. I guess all the disk-intensive compiling with FreeBSD is a good preliminary test.

Anyway, now that that's all settled, is there an easy one-step way to make sure everything on the new disk that was dumped from the bad disk is complete, and if not, to get it that way?
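Where a known-good copy of part of the data exists (for example a backup made before the disk started failing), comparing checksums is one way to find files that were silently damaged in the transfer. The Python sketch below is only an illustration under that assumption: `compare_trees` and `sha256_of` are hypothetical helper names, and the approach cannot say anything about files for which no reference copy exists.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_root, copy_root):
    """Compare copy_root against reference_root.

    Returns (missing, different): sorted relative paths that are absent from
    the copy, and paths whose contents differ from the reference.
    """
    missing, different = [], []
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        for name in filenames:
            ref_path = os.path.join(dirpath, name)
            # Skip broken symlinks and special files such as FIFOs
            if not os.path.isfile(ref_path):
                continue
            rel = os.path.relpath(ref_path, reference_root)
            copy_path = os.path.join(copy_root, rel)
            if not os.path.isfile(copy_path):
                missing.append(rel)
            elif sha256_of(ref_path) != sha256_of(copy_path):
                different.append(rel)
    return sorted(missing), sorted(different)


# Usage (both paths are placeholders):
#   missing, different = compare_trees("/backup/home", "/home")
```

Files that changed legitimately since the reference copy was made will also show up as different, so the output is a list to review, not a list of confirmed corruption.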
Content copyright © 2007-2010, the authors
Daemon image copyright ©1988, Marshall Kirk McKusick