DaemonForums  

Old 15th June 2011
jggimi jggimi is offline
More noise than signal
 
Join Date: May 2008
Location: USA
Posts: 7,975
Default

A little level setting:

In the Good Old Days™ many decades ago now, multi-platter fixed block architecture disk drives reported their actual number of sectors and platters to the operating system.
A series of sectors in a circle on a platter made up a single track, and as the heads moved together, the combined tracks from all of the platters at one head location made up a cylinder. If the OS wanted to read or write a particular sector, the heads moved with a "seek" operation, the drive selected the head of the platter containing the track of interest, and then waited for the sector to rotate around under the head. The "address" of that sector was noted two ways: by Cylinder/Head/Sector (or, if you prefer to think of it, Cylinder/Track/Sector), or by Logical Block Address.
Around twenty-five years ago, drive manufacturers began to have their drive electronics "lie" about the underlying architecture of cylinders and heads. They found that if they mapped the sectors the way they wanted to, they could provide their customers -- computer manufacturers who often also wrote OSes -- with adequate performance, simplified (or fully automatic) management of spare and bad sectors, and data location tuning that was either simplified or unnecessary.

Commonly these days, drive electronics report having 255 heads (or platters), and that is an obvious fallacy. The drive manufacturers decide what to do with internal sector placement, and those decisions are proprietary and closed.
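For readers who want to see how the two address forms relate, here is a minimal sketch of the conventional CHS-to-LBA translation. The 255 heads figure is the one mentioned above; the 63 sectors per track is an assumption (a commonly reported value), and neither says anything about the real platters.

```python
# Reported (not physical) geometry. 255 heads comes from the post above;
# 63 sectors per track is an assumed, commonly reported value.
HEADS_PER_CYLINDER = 255
SECTORS_PER_TRACK = 63


def chs_to_lba(cylinder, head, sector,
               heads=HEADS_PER_CYLINDER, spt=SECTORS_PER_TRACK):
    """Translate a CHS triple into a logical block address.

    CHS numbers sectors from 1, so cylinder 0 / head 0 / sector 1 is LBA 0.
    """
    if cylinder < 0 or not 0 <= head < heads or not 1 <= sector <= spt:
        raise ValueError("CHS address outside the reported geometry")
    return (cylinder * heads + head) * spt + (sector - 1)


def lba_to_chs(lba, heads=HEADS_PER_CYLINDER, spt=SECTORS_PER_TRACK):
    """Inverse translation: logical block address back to (C, H, S)."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    cylinder, remainder = divmod(lba, heads * spt)
    head, sector_index = divmod(remainder, spt)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    print(chs_to_lba(0, 0, 1))   # 0
    print(chs_to_lba(0, 1, 1))   # 63
    print(lba_to_chs(16065))     # (1, 0, 1)
```

The translation is pure arithmetic on the reported geometry, which is exactly why it reveals nothing about where a sector physically sits.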

Note also that with modern drives managing spare sectors without OS intervention (or knowledge), if the drive has replaced a failed sector with a spare, its LBA or CHS address will have nothing at all in common, physically, with adjacent sector addresses.

Bottom Line:
You cannot know the physical requirements for any particular I/O from the LBA or CHS address used by the OS. The time required may or may not include one or more seek operations, and there may or may not be one or more delays for rotation. While the drive manufacturers may have internal maps that minimize seeking or "Rotational Position Sensing (RPS) miss" platter delays for sequential I/Os, the OS and the admin have little or no insight into these occurrences.

Last edited by jggimi; 15th June 2011 at 03:02 PM. Reason: added a "bottom line" paragraph
Old 15th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default Need a day or two:

I won't be able to reply without checking first, but I will be reading. It doesn't take much to screw up: see my new setup now in progress, below. Sorry for too many words or out-of-place words. I'm trying not to rush things. That's how we screw up.

jggimi, that is mind blowing. Beastie, you are giving me a better clue to a lot of things but I don't agree with a few details you included. I'll get back with you shortly. Give me a minute. The point of direction must come first so that we all agree. That point has been very clear. jggimi, one thing for sure, no matter what they've done there is always a base-line to be found. UNIX is a hell of a tool and ASM can kick the C code they use out of the picture. Bottom line, we have tricks too. UNIX and LINUX and Assembler and C are tools by themselves and that is what they use, so they can't hide all from view. But if the hard-disk controller has got it, we're out of luck. They don't rule the controllers. The controller rules them, and people like REALTECH are not cashing in, I bet.

I better get started.


Quote:
The same question, when researched in terms of MIPS, got the answer "it's implementation-dependent". Well, yeah, obviously...same goes for hard drives.
First of all, thanks for that. So I plan not to keep up with the Joneses using many brand names of the same type of hardware. I already know that the latest and greatest storage device will be replaced by 2016 or tomorrow by IBM and SONY. I already made up my mind two years ago to dedicate all to AMD-64 because you can write 32- and 64-bit programs under one roof. Who cares about INTEL being .0050MHz faster anymore. It's a done deal! I might upgrade a day after 2016.

It took a few days but bashrules' post came to light for me after 12 years of wondering. That's too much. I bet millions of us thought the same. It should be called "bashrules" to refer to the real difference. I hope the idea is correct. If not, PROVE IT. It's just like an old-fashioned washing machine: it must spin from the center, pushing anything inside OUTWARDS. But since nothing is impossible, implementation could be a factor ... I'm not going to take any chances. I'll be sticking with one brand of hard drive for each type of computer also. So this is now a DONE DEAL too! For what I need space for, a six-pack of Seagate Barracuda 1-Terabyte drives, one for each AMD machine, will do me just fine until QUANTUM arrives.


The speed thing doesn't show all. It doesn't mean PARTITION-1 is dead. Overall it's no faster or slower than the other two partitions when you are actually RUNNING it. Maybe one of those slices is going to feel the impact, or even certain BSD functions. I plotted to get to this for years. And now I just got the answer. So, I'm not going to jump the gun and put the cart before the horse.

Just like 8GB RAM with legacy gives you only 7.5GB, well, a HDD has legacy too and this is the effect on PARTITION-1, but who would think this. It took 12 years to get here; a few more weeks is not going to kill me. The plan was to understand why, then fix it, trick it or whatever it takes. But if the below turns out to be what a setup should have been for computers in the first place, the PARTITION-1 (MBR - OS) COMBO can keep its legacy.


This is my new set-up in the working and why I waste my time:

Partition-1 300 GB - FreeBSD-9.0 |
1)
I'm dishing out my own legacy, going from 100GB to 300GB in size, to push Partition-3 beyond the OUTER MIDDLE LIMIT of the entire HDD so there will be no excuse not to get the maximum speed out of any OS and anything else, to see if this "OUTER is Faster" thing is true. So far it did prove to be TRUE and this setup is simply worth the effort and doesn't hurt anything.
2)
This partition will also be my storage area for zillions of UFS files that I never want to lose. I will also be mounting this partition from afar to study the raw FreeBSD layout using PcBSD on P-3. I always wanted to view what's in a FreeBSD directory using a desktop instead of a terminal. Now I can better figure out what I can remove to trim down this system as a server, removing things that it will NEVER use.
3)
I got tons of files I never use or only used once... Who said storage space should be ahead of the drive like we save to everyday, packing it with files that you may never use again but it is living in an area where the most speed for your system can be obtained.


Partition-2 100 GB - PcBSD-9.0 |
1)
This will be my area for testing newer versions of FreeBSD and PcBSD, mainly PcBSD's. FreeBSD has its own on Partition-1. It will also be here just in case Windows-8 has found a way not to work on Virtual-Box. I'm sure Dollar Bill knows more people use the FREE Virtual-Box, which can allow for many transfers of MS's new OS to non-paying parties. I will be prepared because I always pay in full. I need the real deal.
2)
To swap in saved versions that ARCH has dd'd for me, compressed and saved to files.


Partition-3 300 GB - PcBSD-8.2 |
1)
For mastering PcBSD and to learn more about FreeBSD on P1.
2)
To run Win-95 - Windows-8 and the rest if I want to in Virtual-Box.
3)
It will be the Ultimate Desktop, living at the Ultimate location. PERIOD
In a few days I will know which way is up for SURE. But I hope
to know the truth before then, or have some darn good guess about direction.
Until then, bashrules, RULE!


Extended-4 250 GB - with bootable partitions
1)
To run Arch-command-line version to maintain the entire system and to keep all other operating systems honest.
2)
To run Fedora-Gnome-3 giving Gnome-3 a chance to live on a real partition in my world. I don't like this kind of change but might as well get use to it.
3)
A partition for 14 years' worth of Windows Fat-32 files that will be used by all Windows versions from Virtual-Box on P-3
4)
A partition for ARCH to store the P1, P2 and P3 partitions compressed to files.


FREESPACE: 1GB
Something to attach a 2, 3 or 4 Terabyte drive if I ever need more space which I will not because it would only break the bashrules OUTER-LIMITS set for this HDD.


Who could ask for more?

I should be back on the ball by this coming Monday.

I'll post the new numbers for every partition, both ways (save and destroy). After that I'll be ready to do the rocket thing inside each partition, dealing with each of its slices.
.............
..............
...............
Give me a few days rocket357 so I can finish this set up.

Quote:
I too have always understood the inner portion
nilsgecko, what do you mean? Do you think the HDD starts reading at the innermost part of the HDD and also builds partitions up from that point? Or do you think it has to go to the outermost position of the HDD to pick up its very first orders... and from there do you think partitions are built downward from that point? Let me know if you don't understand the question so I can re-word it. Maybe use one of those charts to show me what you think. I think only a line or two may need to be changed to show and tell your way.

Better yet this is a question for all of you guys...
...

Last edited by sharris; 15th June 2011 at 05:46 PM.
Old 15th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

Quote:
Originally Posted by sharris
This indicates to me that the MBR is at the CENTER of the HDD living on top of a fool known as the hard-di*K controller
Quote:
Seriously? How did you reach that conclusion?
Exactly what part are you speaking of...the hard-di*K typo that I ran-away with or the use of the word CENTER and not INNER-MOST?


Quote:
It's on the periphery of one of the platters and nowhere else. It's a sector like any other. The only thing special about it is that the BIOS loads it after the POST.
I'm sorry but this tells me nada.


Quote:
("data saving begin at the first OUTER edge of the HDD"), on the periphery of the platter.
Could you show your source using this word (periphery) and HDD together?

................
................
Me:
Quote:
it indicated the MBR to be at the CENTER of the drive
YOU:
Quote:
And I presume it's wrong.
Take a look again.


Code:
I am the hole in your HDD
INNER-MOST:
^^^^^^^^^^^^^^^^^^^ TRACK 1 - claiming 300GB of sectors inside 3 TRACKS
MBR + PRIMARY 1 ---- i am the most inner with MBR in the FIRST-SECTOR!

^^^^^^^^^^^^^^^^^^^ TRACK 4 - claiming 100GB of sectors inside 1 TRACKS
I am- PRIMARY 2 ---- i am the one track most outer

^^^^^^^^^^^^^^^^^^^ TRACK 5 - claiming 300GB of sectors inside 3 TRACKS
I am- PRIMARY 3 ---- i am the the three tracks most outer

^^^^^^^^^^^^^^^^^^^ TRACK 8 - claiming 200GB of sectors inside 2 TRACKS
I am-EXTENDED 4 ---- i am the the most outer EVER --- so where is the MBR?

^^^^^^^^^^^^^^^^^^^
OUTER-MOST

If the MBR was on the most outer part of the HDD then PRIMARY 1 would be the most outer partition because they are connected, then 2, then 3, and then 4 would be the most inner, being the slowest based on the information given. A picture in the textbook that I mentioned shows a CD with a “single track spirals to edge of disc”. ... This thing starts at the MOST INNER section of the disc and spirals out to the outer edge of the disc. This is the only indication that can help to visualize anything. There is nothing that shows and tells that I have seen yet.

This tells me that the first sector of a HDD could be and should be at the MOST INNER section of the HDD. The FIRST-SECTOR:

You make a pie at the MOST INNER section. You make pizza at MOST INNER section, and spread-out.

Quote:
popularized by the IBM Personal Computer.[1] It consists of a sequence of 512 bytes located at the first sector of a data storage device such as a hard disk. MBRs are usually placed on storage devices intended for use with IBM PC-compatible systems.
But does anyone tell you where this FIRST-SECTOR starts from .... Noooo!


http://en.wikipedia.org/wiki/Master_boot_record


OK, when partitioning we claim a certain number of tracks so that the partition's data can use any group of its sectors. For those who claim that this is not true ... let's look at it the SECTOR way. We already know that once we start filling any partition with data it could be sent to any group of sectors as long as it stays within the real number of tracks it takes to meet the given size for each partition.

Code:
I am the hole in your HDD
and these are sectors below me

^^^^^^^^^^^^^^^^^^^ 
^...MBR...^  P2  ^  P3  ^  P4 ^  ^  Well I can't do much, MBR on my back

^....P1...^   P2  ^  P3  ^  P4 ^  ^  Darn there four of us in one cyl.

^....P1...^   P2  ^  P3  ^  P4 ^  ^  Yes this makes no since. 

^....P1...^   P2  ^  P3  ^  P4 ^  ^  Me three but I am the most outer .. all 4 of us
Old 15th June 2011
BSDfan666 BSDfan666 is offline
Real Name: N/A, this is the interweb.
Banned
 
Join Date: Apr 2008
Location: Ontario, Canada
Posts: 2,223
Default

The internal organization of drives is completely abstracted away from the user and operating system, it is instead addressed by logical sector/block numbers (LBA) from the beginning to the end of the disk.

Modern drives have sector remapping capabilities to deal with the possibility of bad sectors, so there is no guarantee that logical and physical mappings are 1:1.

CHS notation is obsolete, and you really have no idea about disk actual geometry anymore.. it doesn't matter.
Old 15th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

Quote:
beginning to the end of the disk
I also read a little about that and it doesn't worry me. There is a peep-hole. AGAIN, the question is "what constitutes the beginning to the end of the disk" in real English like this:

Quote:
Beastie wrote:
Outer: situated farther out; being away from a center.
Inner: situated farther in; being near a center.
Or do you really feel that has been taken away too? I don't think so. You can't turn a front door into a back-door and not leave any traces. The starting point will forever remain for any device and that is all I need to know.

Shoot your best shot.

I gave my final thought! BOTTOMS-UP; From the INNER MOST to OUTER MOST and I'm not going to bite my tongue until I peep the holes and I will.

Quote:
it doesn't matter
Bill Gates himself once said something like this "it doesn't matter, computers will never run more than a 64MB of RAM". It does matter. For now you seek a workaround, since we saw some truth about partitioning (P1) so far. Never bow down to the so-called unknown.

Thanks for your latest breakdown.
Old 15th June 2011
bashrules bashrules is offline
Aspiring Unix Greybeard
 
Join Date: Mar 2010
Location: Here
Posts: 80
Default

Quote:
Originally Posted by sharris View Post
Bill Gates himself once said something like this "it doesn't matter, computers will never run more than a 64MB of RAM".
He didn't say this.

Quote:
Originally Posted by BSDfan666 View Post
you really have no idea about disk actual geometry anymore.. it doesn't matter.
It should matter as you want to have swap on the fastest "part" of the disk.

Bash
Old 15th June 2011
BSDfan666 BSDfan666 is offline
Real Name: N/A, this is the interweb.
Banned
 
Join Date: Apr 2008
Location: Ontario, Canada
Posts: 2,223
Default

For the programmer, logically the start of the drive would be the first logical address.. addressed to the controller as LBA 0 (CHS 0/0/1), which contains the MBR on x86 systems.
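To make "the first logical sector" concrete, here is a minimal sketch that reads logical sector zero (LBA 0) from a device or image file and decodes the four primary partition slots of a classic MBR. The function names and the device path you pass in are placeholders; reading a raw device normally needs root, and GPT disks only carry a protective MBR here.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries start here
SIGNATURE_OFFSET = 510         # boot signature bytes 0x55 0xAA


def parse_mbr(sector):
    """Decode the four primary partition entries of a classic MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    if bytes(sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2]) != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")
    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * 16
        raw = bytes(sector[start:start + 16])
        # Layout: status, CHS start (3 bytes, skipped), type,
        # CHS end (3 bytes, skipped), first LBA, sector count.
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", raw)
        entries.append({
            "slot": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })
    return entries


def read_mbr(device_path):
    """Read logical sector 0 from a device or disk image (path is a placeholder)."""
    with open(device_path, "rb") as device:
        return parse_mbr(device.read(SECTOR_SIZE))
```

Every value it prints is a logical address. Nothing in the table says which platter, track, or radius a partition lives on.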

It does not matter where data is stored on the physical medium, drive and controller electronics abstract that all away.. you don't even know if you're using a rotating disk or a solid state device with a motor making sound effects.

There is a lot of proprietary vendor-specific stuff going in modern drives, the firmware is very complex.. only a high level representation is made available to operating systems and the kernel, tuning disk performance for drives is a thing of the past.. some newer drives use 4KB physical block sizes with the traditional 512 byte logical sectors, but that was an unfortunate alignment issue.
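The 4KB alignment issue is the one layout detail an admin can still check from the logical side. A rough sketch, assuming 512-byte logical sectors on 4096-byte physical blocks as described above; the starting LBAs in the example are only illustrative.

```python
LOGICAL_SECTOR = 512     # bytes per logical sector seen by the OS
PHYSICAL_SECTOR = 4096   # bytes per physical block on a 4KB-sector drive


def is_aligned(first_lba, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """True if a partition starting at first_lba begins on a physical block boundary."""
    return (first_lba * logical) % physical == 0


if __name__ == "__main__":
    for lba in (63, 64, 2048):
        print(lba, "aligned" if is_aligned(lba) else "misaligned")
```

A partition that starts misaligned makes every filesystem block straddle two physical blocks, so writes turn into read-modify-write cycles inside the drive.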

I'm sorry sharris, but your rants are really frustrating to follow.. and I find myself trying to avoid your threads due to your habit of jumping to strange conclusions.
Old 16th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

Quote:
He didn't say this.
If you read between the line it said "something like this". I forgot exactly what he said between 1993 - 1995 I believe. I will find it before this month is out and post it once and for all. But for now tell us exactly what he said. If you remember his exact statement, you have no right to call me a liar. You can only make the correction which is close as flies on sh*t.

Thanks for sharing


Quote:
It should matter as you want to have swap on the fastest "part" of the disk.
So you know this too but no confirmation to my original post.


Quote:
... I use to read that the beginning of a hard disk drive was the fastest section of the HDD and that the inner drive was the slowest because the arm has to work so much harder. I really don't see it that way ...
So only a guesting game for me down to post #21 with flames coming from left to right... Geez! Great job guys

Quote:
I'm sorry sharris, but your rants ....
... as you avoided the original question or whatever you want to call it because you had no answer.

Quote:
and I find myself trying to avoid your threads due to your habit of jumping to strange conclusions.
All I can do is guess. I said nothing wrong to anyone.
Old 16th June 2011
rocket357 rocket357 is offline
Real Name: Jonathon
Wannabe OpenBSD porter
 
Join Date: Jun 2010
Location: 127.0.0.1
Posts: 429
Default

Sharris, forget the noise and politics. Get a hard drive. Run zcav on it. Plot it with gnuplot.

Then tell me, where are the lowest logical addresses? They're on the fastest part of the disk, right? The highest logical addresses? The slowest, right?
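If zcav is not to hand, the same idea can be roughed out in a few lines: time a fixed-size read at evenly spaced logical offsets and compare the rates. This is only a sketch of the technique, not zcav itself. The function name is made up, the path you pass (a raw device or a large file) is a placeholder, seek-to-end is assumed to report the size (which may not hold for raw devices on every OS), and OS caching can distort the numbers.

```python
import os
import time


def time_zones(path, zones=10, chunk_bytes=1 << 20):
    """Time one read of chunk_bytes at evenly spaced offsets across path.

    Returns a list of (offset, bytes_per_second) tuples, lowest offset first.
    """
    if zones < 1:
        raise ValueError("need at least one zone")
    results = []
    with open(path, "rb", buffering=0) as disk:
        size = disk.seek(0, os.SEEK_END)
        if size < chunk_bytes:
            raise ValueError("target is smaller than one chunk")
        for zone in range(zones):
            # Spread the sample points from the first byte to the last full chunk.
            offset = (size - chunk_bytes) * zone // max(zones - 1, 1)
            offset -= offset % 512          # keep reads sector-aligned
            disk.seek(offset)
            started = time.perf_counter()
            data = disk.read(chunk_bytes)
            elapsed = time.perf_counter() - started
            rate = len(data) / elapsed if elapsed > 0 else float("inf")
            results.append((offset, rate))
    return results
```

On a rotating drive that follows the usual convention, the rates should trend downward as the offset grows; a single 1 MB sample per zone is noisy, so repeat runs before drawing conclusions.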

The outer edge of the disk is moving fastest with respect to the read head of the drive. It's just simple math: Say the drive is spinning at 7200 RPM. That's 120 revolutions per second. Let's make the math simple...let's say it's a 3.5" drive, and the disk platter itself is 3" in diameter. On the innermost track (let's say it's 1" from the center), the amount of data the read head gets the opportunity to "see" is:

distance/revolution = 2 * pi * radius

At 1" from the center (i.e. the innermost track), the distance the read head "moves" (with respect to the platter) is 2*1*3.14, or 6.28 inches per revolution. At 1.5" from the center (i.e. the outermost track), the distance the read head "moves" (with respect to the platter) is 2*1.5*3.14, or 9.42 inches per revolution. There's nothing magical about it...it's simple math.

Now, let's say each inch of platter contains 512 bytes of data (this is an intentional approximation...in reality the density changes based on where the track is within each "zone"...the start of a zone (outer edge) will be less data-dense and the end of a zone (inner edge) will be more data-dense...but each *zone* (i.e. group of tracks) has approximately the same "bytes per inch", so-to-speak). This means that the amount of data that can be read each second = (revolutions/sec)*(distance/revolution)*(bytes/inch). At 1.5" from center, this equates to 120*9.42*512 bytes = 578764.8 bytes per second, or approximately 0.55 MB/sec. At 1" from center, this equates to 385843.2, or approximately 0.37 MB/sec. Again, nothing magical about it...it's simple math...the OUTER tracks are faster than the INNER tracks. This happens because the speed the drive rotates at is constant (whereas a CD drive alters how fast it spins to compensate for this nonsense).
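The arithmetic above is easy to replay. A small sketch using the same deliberately simplified figures (7200 RPM, 512 bytes per inch, pi rounded to 3.14); the function name is made up for illustration.

```python
RPM = 7200
BYTES_PER_INCH = 512   # the simplified linear density used in the post


def bytes_per_second(radius_inches, rpm=RPM,
                     bytes_per_inch=BYTES_PER_INCH, pi=3.14):
    """Data passing under the head per second at a given track radius."""
    revolutions_per_second = rpm / 60
    inches_per_revolution = 2 * pi * radius_inches
    return revolutions_per_second * inches_per_revolution * bytes_per_inch


if __name__ == "__main__":
    for label, radius in (("inner", 1.0), ("outer", 1.5)):
        rate = bytes_per_second(radius)
        print(f"{label}: {rate:.1f} bytes/sec ({rate / 2**20:.2f} MB/sec)")
```

With constant rotation speed and constant density, throughput scales directly with radius, so the 1.5" track comes out 1.5 times faster than the 1" track whatever density figure is plugged in.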

It's true that the convention is not standard, but the argument is that the OUTERMOST edge of the platter is the BEGINNING of the disk, and the INNERMOST edge of the platter is the END of the disk. It's not standard, but everyone seems to do it that way. If you find a hard drive that defies this convention, please let me know!

I think a lot of the frustration in this thread is that you hang on to this idea that the innermost track is the start. Sure, it's easier to make a pizza from the inside out, but we aren't talking pizza here.
__________________
Linux/Network-Security Engineer by Profession. OpenBSD user by choice.

Last edited by rocket357; 16th June 2011 at 01:44 AM.
Old 16th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

Quote:
It's true that the convention is not standard, but the argument is that the OUTERMOST edge of the platter is the BEGINNING of the disk, and the INNERMOST edge of the platter is the END of the disk. It's not standard, but everyone seems to do it that way.
I've been with this for over a decade, religiously.


Quote:
I think a lot of the frustration is that you hang on to this idea that the innermost track is the start.
I hate to sound like a broken record but from day 1, the year when Partition Magic was invented, I thought the starting point was at the hard drive's most-outer position, with the MBR being on the first sector after you create the first partition. Now both are at the top most-outer position.

Every document says the most-outer position is the fastest partition. Suppose you dd something and it shows you the reverse (with the most inner partitions being the fastest). Would that make you wonder? Yes, because you are deeper into it than I have ever been, or else you learn fast. I know darn well the world can't be wrong so I got the notion to reverse the idea and when I saw

Quote:
Perhaps partition 2 is the outermost partition?
That would mean partition builds have to begin from the inner-most in order for that to be possible, so I went with that. Now the world is right again and I accept having been wrong for so long. Who am I against tons of documented facts... I bow down.

I love that idea and it makes all the sense, and I'm going to keep most parts of it on the back burner just in case, but since you have done the math I am there.

I still have not partitioned my HDD to do my final setup, so I can't do the zcav thing yet but I will soon. I do trust your math; some of it is in the textbook, but I can't count money well. I have always been weak at doing the math. If I knew how to do the math this thread would not be here.

Thanks, and do give me a little time so I can read the how-to for zcav, etc. One thing for sure, it makes no difference if it's bottoms-up or top-down ... I know enough not to run my main OS from Partition-1 on any HDD that proves to be slower. Others can make their own choices. I post my numbers and mostly talk myself up on solutions lately, anyway. So you will find my zcav numbers here. It will be back to business for me tomorrow. I felt burned out all day wondering how to explain why.


Quote:
Sure, it's easier to make a pizza from the inside out, but we aren't talking pizza here.
It was only another scenario. I do that all the time, all my life, to throw out some ideas. If it is not true I only hope that someone in the conversation will say so. Thank you for saying so.

P3 is the WINNER, regardless and it's not my fault!

Thanks again

Last edited by sharris; 16th June 2011 at 04:00 AM.
Old 16th June 2011
ocicat ocicat is offline
Administrator
 
Join Date: Apr 2008
Posts: 3,318
Default

Quote:
Originally Posted by sharris View Post
So only a guesting game for me down to post #21 with flames coming from left to right... Geez! Great job guys
Actually sharris, it is you who is beginning to cross the line by crying slander. Information provided to you by jggimi, rocket357, & BSDfan666 is based on years of experience & learning. Dismissing their statements simply because they don't match your view of the world doesn't speak highly of you or your willingness to deal with evidence contrary to your understanding. I have worked with some of these people for years. They know their stuff. They have shared it with you in good faith.
Quote:
If you read between the line it said "something like this".
Let me pass on some advice about Internet forum sites. Some of the people you are conversing with may be half a world away, & English may not be their first language. All anyone can do in a forum is work with the information provided. If you state one thing, but mean another, you have failed to communicate; no one else can be blamed. Interjecting unrelated jokes & other irrelevant comments might spice up conversation between you & friends in person, but it doesn't help in a discussion limited to an Internet forum site. It may have the opposite effect by simply confusing your audience. Actually, a number of people have mentioned this to you on more than one occasion in very non-combative responses.
Quote:
You can only make the correction which is close as flies on sh*t.
Resorting to four-letter language is not acceptable on this site. Period. This statement should be very clear.
Quote:
I said nothing wrong to anyone.
No one here has anything to gain by intentionally leading you astray either. rocket357 provided you the name of a recent PostgreSQL title focused on performance issues. I, too, have this title, & it is a solid reference on performance & tuning. Because many of your posts of late indicate you have a strong interest in performance tweaking, I highly suggest you get the book & study some grounded pronouncements on what real knobs you can turn to take fuller advantage of existing hardware. It doesn't matter whether you have any interest in databases or not. The book is that good on very fundamental points.
Old 16th June 2011
rocket357 rocket357 is offline
Real Name: Jonathon
Wannabe OpenBSD porter
 
Join Date: Jun 2010
Location: 127.0.0.1
Posts: 429
Default

I think my brain just segfaulted.
__________________
Linux/Network-Security Engineer by Profession. OpenBSD user by choice.
Old 16th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

I am sorry for my joke and the 4-letter word, and that will never happen again. As far as jggimi, rocket357 and BSDfan666 go, there were no bad vibes. As far as me not being academic-English perfect, no forum is so good that I can't speak up for myself if I want to remain a member. As far as the others who called me something that I AM NOT because I indicated that their comments were off topic again, and made it seem like I twisted their arms to make them respond to my thread ... now you want to tell me how wrong I have been. Both of these guys owe me an apology. Even in their country they know better, and both mom and dad would be spanking their booties right now. And that is not a joke. Anyway, I still think you are a wonderful and very wise person but I don't totally agree with you right now.

Quote:
I think my brain just segfaulted
I have been brain dead ever since. I actually got nothing done on a day that should have been filled with joy.
Old 16th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

I'm mostly WRONG I'm sorry guys.

BSDfan666,
The only thing I meant for you was about:

Quote:
it doesn't matter
Sorry about the rest. Re-reading threads sounds again like a few unhappy comments. I did not know who wanted to jump me next. No big deal. Nobody is perfect.

Pizza anyone
Old 16th June 2011
jggimi jggimi is offline
More noise than signal
 
Join Date: May 2008
Location: USA
Posts: 7,975
Default

Disk drive performance tuning in the deep dark ages -- another history lesson

Disk drives are mechanical devices. A disk I/O, consisting of one (or more) sector reads or writes, is typically measured in milliseconds. What takes so long?
  • Seek time -- the moving of the head assembly to the cylinder, the collection of tracks underneath the heads, where the (first) sector resides.
  • Rotational delay -- waiting for the (first) sector of interest to make its way under the head
  • Data transfer -- the time it takes to move data onto or off of the sector(s), into the drive electronics, for transmission down the bus/channel to the OS.
There are other possible I/O delays, such as delays caused by the I/O channel being busy with transfers for other devices, but the focus of this post is on an individual drive performance.
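The three components above simply add up. A back-of-the-envelope sketch: the average rotational delay is taken as half a revolution, and the seek time and transfer rate in the example are made-up illustrative figures, not measurements of any drive.

```python
def rotational_latency_ms(rpm):
    """Average rotational delay: the time for half a revolution."""
    return 0.5 * 60_000 / rpm


def transfer_ms(nbytes, bytes_per_second):
    """Time to move nbytes on or off the platter at a given media rate."""
    return 1000 * nbytes / bytes_per_second


def service_time_ms(seek_ms, rpm, nbytes, bytes_per_second):
    """Seek time + rotational delay + data transfer for one I/O."""
    return (seek_ms
            + rotational_latency_ms(rpm)
            + transfer_ms(nbytes, bytes_per_second))


if __name__ == "__main__":
    # Hypothetical drive: 9 ms average seek, 7200 RPM, 100 MB/sec media rate.
    print(f"{service_time_ms(9.0, 7200, 4096, 100e6):.2f} ms for a 4 KB read")
```

For small I/Os the mechanical terms dominate: in this example the transfer itself is a few hundredths of a millisecond, while seek and rotation account for the other 13 or so.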

Minimizing drive delay -- part one

At one time, the best practice for paging (swapping) spaces was neither to provision these data sets at the inner platter edge, nor the outer platter edge. Instead, the best performance was to set them at the center section of the platters. This minimized seek time. In addition, on those drives that were used for paging/swapping, best practice was to place lightly accessed data on the rest of those drives, or leave the space unallocated. Heavy use data was placed on other drives in the farm.
This may seem unfathomable in this day and age, where low-end machines are shipped with 160GB or 300GB drives, and drives of 1TB or more are common. I speak of a time when a large drive held less than 0.5 GB, and a computer might have a dozen or more such drives attached.
Minimizing drive delay -- part two

Sometimes, when reading (or writing) a series of sectors, the timing would be such that the next sector in the group would be missed. Perhaps the sequence of sectors moved to a different platter in the cylinder, or there was a speed differential between what the drive electronics could manage and the sectors speeding along on the spinning platter under the head. This "whoops, gotta wait now" rotational delay is known as an RPS Miss. RPS, if you bothered to read a prior post above, is Rotational Position Sensing.

On some drives, the sector number layout on the drive was non-sequential, so that the electronics could keep up with the rotation speed, or vice versa. This difference was known as interleave, and for many devices, this could be changed by the system administrator at disk format time. If so, this was set by a "low level format" done by the drive electronics, not by the OS. After the interleave was set, the drive could be formatted, data could be loaded on it, and performance may have changed, for better or for worse. You might have found a recommendation for your application and OS, but mostly, this was trial-and-error, test-and-retest to find an optimal setting.... if you could.

The value of altering an interleave setting was to minimize RPS Miss. And the reason that this was an operator-selectable setting was because I/O patterns are application dependent, and the application data layout across CHS maps was implementation dependent on disk drive geometry, OS, filesystem, etc.
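As an illustration of what an interleave setting did, here is a sketch that lays out logical sector numbers around one track for a given interleave factor. It shows the general idea only; the function name is invented and real controllers had their own layout rules.

```python
def interleave_layout(sectors_per_track, factor):
    """Return the logical sector number stored in each physical slot of a track.

    With factor 1 the layout is sequential; with factor N, consecutive
    logical sectors are placed N physical slots apart, giving the
    electronics N - 1 sector times to catch up between reads.
    """
    if sectors_per_track < 1 or factor < 1:
        raise ValueError("sectors_per_track and factor must be positive")
    layout = [None] * sectors_per_track
    position = 0
    for logical in range(sectors_per_track):
        # Skip forward past slots that are already taken.
        while layout[position] is not None:
            position = (position + 1) % sectors_per_track
        layout[position] = logical
        position = (position + factor) % sectors_per_track
    return layout


if __name__ == "__main__":
    print(interleave_layout(8, 1))   # [0, 1, 2, 3, 4, 5, 6, 7]
    print(interleave_layout(8, 2))   # [0, 4, 1, 5, 2, 6, 3, 7]
```

With a factor of 2, a sequential read of the whole track takes two revolutions, which still beats missing every sector and waiting a full revolution for each one.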

Minimizing drive delay -- part three

Along with doing their own geometry mappings with drive electronics, those disk drive manufacturers found that if they added read caches to their electronics, they could watch a pattern of I/Os to the drive, anticipate sequential read requests, and read those sectors into the RAM on the drive electronics before they were asked for. In the event the OS asked for the sectors, they could send those bytes back immediately at bus/channel speed. No seek. No rotational delay. No data transfer from the drive.
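A toy model shows why read-ahead pays off for sequential access and does nothing for random access. This is purely illustrative: the class, its prefetch depth, and its detection rule are invented for the example, and real firmware algorithms are proprietary and far more elaborate.

```python
class ReadAheadCache:
    """Toy drive cache: prefetch the next few LBAs once reads look sequential."""

    def __init__(self, prefetch=8):
        self.prefetch = prefetch
        self.cached = set()      # LBAs currently held in drive RAM (unbounded here)
        self.last_lba = None
        self.hits = 0
        self.misses = 0

    def read(self, lba):
        """Return True if the request is served from cache, False otherwise."""
        hit = lba in self.cached
        if hit:
            self.hits += 1
        else:
            self.misses += 1
        # A read that directly follows the previous one counts as a
        # sequential pattern, so fetch the following sectors in advance.
        if self.last_lba is not None and lba == self.last_lba + 1:
            self.cached.update(range(lba + 1, lba + 1 + self.prefetch))
        self.last_lba = lba
        return hit


if __name__ == "__main__":
    sequential = ReadAheadCache()
    for lba in range(20):
        sequential.read(lba)
    print(sequential.hits, sequential.misses)   # 18 2
```

Feed the same model a scattered sequence of addresses and every request is a miss, which is the sense in which measured speed depends on the access pattern as much as on where the data sits.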

-------

The OS is now divorced from drive geometry, as discussed in prior posts. Cache read-ahead is done automatically, and that process is a black box for the OS as well, and the algorithms differ between drive manufacturers, and perhaps model to model. RAID arrays further divorce a logical drive from physical I/O, and volume management tools (e.g. ZFS) obfuscate things further.

It is my opinion that "disk drive performance tuning" from intentional data layout on a modern, large physical drive is effectively impossible, and practically meaningless. They have hidden geometry. Hidden cache algorithms. There is so much data on one drive that we do not have consistent I/O patterns. Additional layers of volume management or storage array management make the divorce from physical location complete.

This is, of course, just my opinion. But don't let 35 years in IT, with nearly a decade of that as a senior manager for a storage systems vendor hold any sway. I could be wrong.
Old 17th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

I guess it really doesn't matter. If you change even a few bytes, things change. The only good thing is that if you never change the size, you just go with the best partition and use the others for storage or whatever. The results are always consistent within each partition at back-up and clean-up time. Funny, it only took a little over 8 hours to dd-zero the whole disk, but it took over 11 hours to dd out the last two partitions. I guess for a 1-T drive anything over 500MB falls into the disk geometry black-hole. Also, P1 is now faster than the original, and it is triple in size AND the first. Either GNU/DD is broke or Seagate HDD mathematics is insane. No wonder they broke the 3-T barrier. For the most part, at least we know where everything is, how things work, what to avoid, and what set-up may be best for you. I think I'll stick with the lucky original.

View it as ... 327 GB by LINUX or 300 GB by BSD

^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
(A) ALL OF DISK:

HD 1 000 204 886 016 bytes | 1 TB - 29358 second | 8 hour 09 minute | MB/s = 34.0

^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
(B) MOST of DISK DIVIDE by THREE:

P1 327 415 463 424 bytes | 300 GB -  3386 second | 0 hour 56 minute | MB/s = 96.7
P2 327 415 495 680 bytes | 300 GB - 21454 second | 5 hour 57 minute | MB/s = 15.3
P3 327 415 495 680 bytes | 300 GB - 21955 second | 6 hour 05 minute | MB/s = 14.9


^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
(C) DIVIDE FIRST HALF ONLY:

P1 107 479 701 504 bytes | 100 GB - 4383 second | 1 hour 13 minute | MB/s = 24.5
P2 107 479 733 760 bytes | 100 GB - 1140 second | 0 hour 19 minute | MB/s = 94.2
P3 322 432 265 920 bytes | 300 GB - 3493 second | 0 hour 58 minute | MB/s = 92.3

^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
An EXAMPLE when WRITING SMALLER PARTITIONS:
This makes all of the old rules correct again, plus better speed, and in order:

dd - ad4s1 -  42,952,379,904 -  40.0GB -  360s =  6 minute - 114.0MB/s
dd - ad4s2 -  85,904,824,320 -  80.0GB -  720s = 12 minute - 114.0MB/s
dd - ad4s3 - 214,762,060,800 - 200.0GB - 1900s = 30 minute -  93.0MB/s
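The elapsed times and transfer rates in charts (A) to (C) can be re-derived from the byte counts and elapsed seconds alone. The following is a minimal sketch in Python; the helper name `summarize` is hypothetical, and only the sizes and seconds are taken from the charts. The last digit can differ from the posted figures depending on rounding versus truncation and on whether decimal megabytes or binary mebibytes are meant.

```python
def summarize(size_bytes: int, seconds: int) -> str:
    """Format elapsed time and throughput for one dd run."""
    hours, rem = divmod(seconds, 3600)
    minutes = rem // 60                                  # leftover seconds dropped
    mb_per_s = size_bytes / seconds / 1_000_000          # decimal megabytes
    mib_per_s = size_bytes / seconds / (1024 * 1024)     # binary mebibytes
    return (f"{hours} h {minutes:02d} min, "
            f"{mb_per_s:.1f} MB/s ({mib_per_s:.1f} MiB/s)")


# (size in bytes, elapsed seconds) as reported by dd in the charts
runs = {
    "(A) whole disk": (1_000_204_886_016, 29358),
    "(B) P1": (327_415_463_424, 3386),
    "(B) P2": (327_415_495_680, 21454),
    "(B) P3": (327_415_495_680, 21955),
    "(C) P1": (107_479_701_504, 4383),
    "(C) P2": (107_479_733_760, 1140),
    "(C) P3": (322_432_265_920, 3493),
}

for label, (size, secs) in runs.items():
    print(f"{label}: {summarize(size, secs)}")
```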

.........
40GB as the server partition with a 6-minute restore time! I would use the rest for jails for this kind of set-up. I would use set-up (C) for a PC-BSD desktop and much more. With set-ups (A) and (B) you lose too much all the way around no matter what (I bet 50/50 is the key here). I wish I had tried that one before the final install I just completed. Dang!

Last edited by sharris; 18th June 2011 at 02:11 PM. Reason: Adding MB/s just re-done. It was not included before.
Old 17th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

Quote:
jggimi wrote: Instead, the best performance was to set them at the center section of the platters. This minimized seek time.
I think this is still in use. I would say from the top to the center and not a step beyond, especially for the SWAP on today's HDDs. It explains those numbers I just posted above. Notice that I only went a fraction over the dividing line with 100GB (the middle being 500GB on a 1000GB HDD). Dd'ing partition-2 seems to have brought performance darn near to a halt, bringing down the good zone with it. Having just now read jggimi's post, it's a match in more places than a few ... This is where you should place your lightly accessed data.

Great info jggimi
Old 17th June 2011
sharris sharris is offline
Package Pilot
 
Join Date: Jun 2010
Posts: 146
Default

PS: I had to tweak hard to find a number so that all partitions would be the same size, because I know of the MBR effect on the size of any P1 during partitioning set-up. It came to these numbers, I wrote them to disk, and this is the cfdisk output even after a re-boot -- I always triple check things:


sda1 ... boot ... Primary ... FreeBSD ... 327115.50
sda2 ... .... ... Primary ... FreeBSD ... 327115.50
sda3 ... .... ... Primary ... FreeBSD ... 327115.50


In the dd numbers chart, notice that P1 has changed its size on its own. Don't take my word for it... Try it. DEVICES ARE NOT SUPPOSED TO CHANGE THE USER-ACCEPTED INPUT FOR A GIVEN DEVICE ONCE INITIATED, especially after a clean boot. But as you see, it did it anyway. It makes you wonder how much of a 3-Terabyte HDD would have decent transfer speed. My guess is 30% or less. Now you get one real Terabyte of acceptable speed, and the rest is for hidden tricks that defy the odds, to secure the drive along with your data.


Quote:
The data transfer rate is higher in the outer cylinders compared to the inner ones because they contain more sectors. Period.
Quote:
... it's implementation-dependent
Quote:
While the drive manufacturers may have internal maps ...

Proof of resize:

P1 ... 327 415 463 424 bytes
P2 ... 327 415 495 680 bytes
P3 ... 327 415 495 680 bytes



That's 32,256 BYTES... that's more than the 512 bytes for the MBR and 512 bytes for a boot manager like GRUB. That leaves 31,232 bytes (about 31 kilobytes) ripped out of P1 and unaccounted for. Standard documentation says different, so that means this is not commonplace, or is it?
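Subtracting the two sizes listed under "Proof of resize" gives 32,256 bytes, which is exactly 63 sectors of 512 bytes. A minimal Python sketch of that arithmetic follows; the 512-byte sector size is an assumption about this drive. 63 sectors matches the one-track gap that MBR partitioning tools traditionally leave before the first partition when they assume 63 sectors per track, though whether that is what happened here is only a guess.

```python
SECTOR_BYTES = 512            # assumed logical sector size

p1 = 327_415_463_424          # P1 size reported by dd, in bytes
p2 = 327_415_495_680          # P2 (and P3) size reported by dd, in bytes

diff = p2 - p1
sectors, leftover = divmod(diff, SECTOR_BYTES)

print(f"difference: {diff} bytes")
print(f"that is {sectors} sectors of {SECTOR_BYTES} bytes, {leftover} bytes left over")
```

The same 32,256-byte gap shows up between P1 and P2 in set-up (C), which suggests a fixed offset at the start of the disk and not something that scales with partition size.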

IMO, implementation-dependent, magic maps or not, we have now got a "kind-of" jacked-up slow device with tons of disk space being created as the trade-off. If the Seagate 3-Terabyte HDD can give me one-full-Terabyte in the WARP-speed-zone, I'll buy a dozen of them today.

Last edited by sharris; 17th June 2011 at 09:41 PM.
Old 17th June 2011
rocket357's Avatar
rocket357 rocket357 is offline
Real Name: Jonathon
Wannabe OpenBSD porter
 
Join Date: Jun 2010
Location: 127.0.0.1
Posts: 429
Default

Quote:
Originally Posted by sharris View Post
If the Seagate 3-Terabyte HDD can give me one-full-Terabyte in the WARP-speed-zone, I'll buy a dozen of them today.
That's the case with some of the newer drives coming out: large capacity SATA drives can often outperform 15k SAS drives ****in sequential read tests**** at the beginning of the drive. As jggimi pointed out, all bets are off for random access patterns (where 15k SAS drives shine brilliantly, for the most part). In other words, using the first 1 TB of a 3 TB SATA drive for an FTP server that hosts very large files, like DVD images or such, would be good; using the same for an operating system installation would be a waste of time, since much of that will be randomly accessed data.

It's not uncommon for a drive capable of 125+ MB/sec sequential to only be able to push 0.5 MB/sec for heavily random read/write.
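A rough back-of-the-envelope model shows why the random figure is so low. The Python sketch below assumes every small random I/O pays an average seek plus half a rotation and ignores transfer time; the seek times, RPMs and 4 KiB request size are illustrative assumptions, not measurements from any specific drive.

```python
def random_throughput_mb_s(avg_seek_ms: float, rpm: int, io_bytes: int) -> float:
    """Estimate MB/s when every I/O pays a seek plus half a rotation."""
    half_rotation_ms = 0.5 * 60_000 / rpm          # average rotational latency
    service_ms = avg_seek_ms + half_rotation_ms    # transfer time ignored
    ios_per_second = 1000 / service_ms
    return ios_per_second * io_bytes / 1_000_000


# Assumed, typical-looking numbers for a 7200 rpm SATA and a 15k SAS drive
sata = random_throughput_mb_s(avg_seek_ms=8.5, rpm=7200, io_bytes=4096)
sas = random_throughput_mb_s(avg_seek_ms=3.5, rpm=15000, io_bytes=4096)

print(f"7200 rpm SATA, 4 KiB random: {sata:.2f} MB/s")
print(f"15k rpm SAS,   4 KiB random: {sas:.2f} MB/s")
```

Under these assumptions both drives land well under 1 MB/s for small random requests, with the 15k drive roughly twice as fast, which is in line with the sequential-versus-random gap described above.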
__________________
Linux/Network-Security Engineer by Profession. OpenBSD user by choice.

Last edited by rocket357; 17th June 2011 at 04:21 PM.
Old 18th June 2011
nilsgecko's Avatar
nilsgecko nilsgecko is offline
Port Guard
 
Join Date: Apr 2011
Location: Chicago, USA
Posts: 45
Default

Hi sharris,

I just started a new job this past week and have been working endlessly with no internet access (or time) for DaemonForums at work, so I'll write you later. This discussion seems to have run amok and I need to catch up on it!
Content copyright © 2007-2010, the authors
Daemon image copyright ©1988, Marshall Kirk McKusick