• Do you think the days of the hard drive are finally over?

    From Lynn McGuire@21:1/5 to Yousuf Khan on Wed May 13 19:35:03 2020
    On 5/13/2020 7:17 PM, Yousuf Khan wrote:
    So Seagate and other makers are getting ready to introduce 20 TB HDD's
    to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+
    hours to entirely fill such a drive with data at maximum speed! Is that
    too much time, no matter how much capacity you are getting? Is that
    basically unusable capacity? I know you can say that a drive that large
    would be filled over a number of years, and no one would be filling it
    all up in one go.

    But that's probably true in a home environment, but what about an
    enterprise environment? What if that drive were part of a RAID array,
    and one of those drives failed and needed to be replaced? In RAID
    parity, the entire drive has to be written to, because the parity is
    required on all drives at once. Imagine you start synchronizing a
    replacement drive like that, and it takes 30 hours to do that? That's a
    long enough time that it's conceivable another drive within that array
    would fail too, before it's had a chance to completely resync with the
    array. So sure, you can get that capacity with an HDD, but should you
    really be storing your data on something that slow? HDD's can't get much faster.

    Backblaze wrote an article about replacing failing drives:
    https://www.backblaze.com/blog/life-and-times-of-a-backblaze-hard-drive/

    Lynn

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Yousuf Khan@21:1/5 to All on Wed May 13 20:17:28 2020
    So Seagate and other makers are getting ready to introduce 20 TB HDD's
    to the market. According to Seagate, its fastest drives are capable of
    sustained 250 MB/s transfers (if you believe them). It would take 30+
    hours to entirely fill such a drive with data at maximum speed! Is that
    too much time, no matter how much capacity you are getting? Is that
    basically unusable capacity? I know you can say that a drive that large
    would be filled over a number of years, and no one would be filling it
    all up in one go.

    That's probably true in a home environment, but what about an
    enterprise environment? What if that drive were part of a RAID array,
    and one of those drives failed and needed to be replaced? In parity
    RAID, the entire replacement drive has to be written, because its
    contents are rebuilt from the data and parity on all the other drives.
    Imagine you start synchronizing a replacement drive like that, and it
    takes 30 hours to do that? That's a long enough time that it's
    conceivable another drive within that array would fail too, before
    the replacement has had a chance to completely resync with the array.
    So sure, you can get that capacity with an HDD, but should you really
    be storing your data on something that slow? HDD's can't get much
    faster.
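
    As a rough sanity check on the fill-time figure, here is a minimal
    sketch assuming a constant sequential write rate and decimal units
    (1 TB = 10**12 bytes, 1 MB = 10**6 bytes). The helper name fill_hours
    is made up for illustration. At the full 250 MB/s, 20 TB comes to
    about 22 hours; 30+ hours corresponds to an average rate of roughly
    185 MB/s or less, which may be closer to what a drive sustains across
    its whole surface.

```python
# Sketch: time to write an entire drive sequentially at a constant rate.
# Assumes decimal units, as drive vendors use; fill_hours is a hypothetical
# helper name, not part of any library.

def fill_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours needed to write capacity_tb terabytes at rate_mb_s MB/s."""
    seconds = (capacity_tb * 10**12) / (rate_mb_s * 10**6)
    return seconds / 3600


if __name__ == "__main__":
    # 250 MB/s is the quoted peak; the lower rates are assumed averages.
    for rate in (250, 185, 150):
        print(f"20 TB at {rate} MB/s: {fill_hours(20, rate):.1f} h")
```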

  • From Grant Taylor@21:1/5 to Yousuf Khan on Wed May 13 21:15:53 2020
    On 5/13/20 6:17 PM, Yousuf Khan wrote:
    So Seagate and other makers are getting ready to introduce 20 TB HDD's
    to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+
    hours to entirely fill such a drive with data at maximum speed! Is that
    too much time, no matter how much capacity you are getting?

    No, that is not too much time for some, if not many, use cases.

    Is that basically unusable capacity?

    Absolutely not.

    I know you can say that a drive that large would be filled over a
    number of years, and no one would be filling it all up in one go.

    There will be some people that will fill it in almost one go.

    But that's probably true in a home environment, but what about an
    enterprise environment?

    Some enterprises (think Backblaze) will fill drives in a few days. They
    have very specialized ways to write data to hundreds / thousands of
    drives. They write things to each drive to capacity and then move on to
    the next drive. (There is obviously redundancy elsewhere in the
    application stack.) As such, they fill the drives in what some would
    consider one go.

    What if that drive were part of a RAID array, and one of those drives
    failed and needed to be replaced? In RAID parity, the entire drive
    has to be written to, because the parity is required on all drives at
    once.

    It depends on what type of RAID technology is used. ZFS's RAID has the
    unique ability to only re-synchronize the amount of the drive that was
    used, not the entire drive.

    Aside: ZFS is very impressive.

    Imagine you start synchronizing a replacement drive like that, and
    it takes 30 hours to do that?

    And? This happens. I have a friend & colleague that's waiting on a
    RAID array to rebuild and has estimates of nearly 200 hours.

    That's a long enough time that it's conceivable another drive within
    that array would fail too, before it's had a chance to completely
    resync with the array.

    This is a real risk, and has been for 10–20 years. That's one of the
    reasons that RAID-6 and higher RAID levels are popular.
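
    To put a rough number on that risk, here is a minimal sketch that
    assumes independent drives with a constant failure rate derived from
    an annualized failure rate (AFR). The 2% AFR and the 7 surviving
    drives are assumed values for illustration only, and the model
    ignores correlated failures and unrecoverable read errors, so treat
    the result as a lower bound rather than a prediction.

```python
import math

HOURS_PER_YEAR = 8766  # average year length in hours (365.25 days)


def second_failure_probability(surviving_drives: int,
                               rebuild_hours: float,
                               afr: float) -> float:
    """Chance that at least one surviving drive fails during the rebuild.

    Assumes independent drives and a constant failure rate derived from
    the annualized failure rate (afr); both are simplifying assumptions.
    """
    # Convert the annual failure probability into an hourly hazard rate.
    hourly_rate = -math.log(1.0 - afr) / HOURS_PER_YEAR
    # Probability that no survivor fails, then take the complement.
    return 1.0 - math.exp(-surviving_drives * hourly_rate * rebuild_hours)


if __name__ == "__main__":
    # Hypothetical array: 7 surviving drives, assumed 2% AFR per drive.
    for hours in (30, 200):
        p = second_failure_probability(7, hours, 0.02)
        print(f"{hours:>3} h rebuild: {p:.3%}")
```

    With single parity, any second failure in that window loses the
    array; RAID-6 needs a third failure before data is lost.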

    So sure, you can get that capacity with an HDD, but should you
    really be storing your data on something that slow?

    Sure. Anything that only needs to access a subset of the content but
    wants a deep catalog is a perfect use for such a drive.

    HDD's can't get much faster.

    What is a HDD? Why does an SSD /not/ qualify as a HDD?

    What about the various holographic storage methods that IBM (and others)
    have experimented with over the last 30 years?

    There have been multiple times in the past that hard drive manufacturers
    have experimented with, and shipped to customers, drives that have
    multiple sets of heads for increased performance.

    Also, I'm quite certain that each and every time that someone has said
    that something can't get faster, it does.



    --
    Grant. . . .
    unix || die

  • From VanguardLH@21:1/5 to Yousuf Khan on Wed May 13 22:42:09 2020
    Yousuf Khan <bbbl67@spammenot.yahoo.com> wrote:

    So Seagate and other makers are getting ready to introduce 20 TB HDD's
    to the market. According to Seagate, its fastest drives are capable of sustained 250 MB/s transfers (if you believe them). It would take 30+
    hours to entirely fill such a drive with data at maximum speed! Is that
    too much time, no matter how much capacity you are getting? Is that
    basically unusable capacity? I know you can say that a drive that large
    would be filled over a number of years, and no one would be filling it
    all up in one go.

    But that's probably true in a home environment, but what about an
    enterprise environment? What if that drive were part of a RAID array,
    and one of those drives failed and needed to be replaced? In RAID
    parity, the entire drive has to be written to, because the parity is
    required on all drives at once. Imagine you start synchronizing a
    replacement drive like that, and it takes 30 hours to do that? That's a
    long enough time that it's conceivable another drive within that array
    would fail too, before it's had a chance to completely resync with the
    array. So sure, you can get that capacity with an HDD, but should you
    really be storing your data on something that slow? HDD's can't get much faster.

    HDDs are not only used by consumers that have 1 to 4 units in their
    computers. They are also used by datacenters that have THOUSANDS of
    them at their site, and then THOUSANDS more at other datacenters to
    provide for catastrophic physical disaster (flood, tsunami, earthquake,
    meteor, falling aircraft and space junk, terrorism, etc). Google has
    datacenters in 13 locations: N. and S. Carolina, Iowa, Georgia,
    Oklahoma, Oregon, Hong Kong, Singapore, Taiwan, Finland, Belgium,
    Ireland, and Chile. Through subsidiaries, they have datacenters
    elsewhere, too: Virginia, Atlanta GA (multiple), Netherlands (2
    locations), Hungary, and Poland.

    https://www.backblaze.com/blog/hard-drive-stats-for-2019/
    "As of December 31, 2019, Backblaze had 124,956 spinning hard drives."

    https://www.computerworld.com/article/3412222/the-10-biggest-data-centres-in-the-world.html

    Ranked by square footage. The Citadel (www.switch.com/the-citadel) is
    the largest. It's hard to get them to concretely expose their total
    storage capacity. The estimate for Google is 10 exabytes which, using
    your 20TB HDD example, would consume 500,000 HDDs. At Google, an HDD
    dies every few minutes due to the sheer number of drives they employ.

    Just because YOU don't have that much data to retain or archive doesn't
    mean no one else does. The average 4K movie consumes about 100GB. It
    would only take 200 movies to fill up a 20TB drive. According to
    AllFlicks, back in 2016 Netflix had 6,494 movies (and 1,609 TV shows).
    While Netflix discards movies after a while, I'm sure they've grown
    since then.
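
    The drive and movie counts above follow from plain division in
    decimal units. A small sketch, assuming 1 GB = 10**9 bytes and so on
    (units_needed is a hypothetical helper name, and the 10 EB figure is
    the estimate quoted above, not a confirmed number):

```python
# Decimal (SI) units, as used for drive capacities.
GB = 10**9
TB = 10**12
EB = 10**18


def units_needed(total_bytes: int, unit_bytes: int) -> int:
    """Smallest number of units of size unit_bytes that hold total_bytes."""
    return -(-total_bytes // unit_bytes)  # ceiling division on integers


if __name__ == "__main__":
    # 20 TB drives needed to hold the estimated 10 EB.
    print(units_needed(10 * EB, 20 * TB))
    # 100 GB movies needed to fill one 20 TB drive.
    print(units_needed(20 * TB, 100 * GB))
```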

    The more disks you have spinning, even when adding to a RAID config, the
    more fragile the setup becomes. Putting the same amount of data on fewer
    mechanical drives means less chance of physical failure.

  • From pedro1492@lycos.com@21:1/5 to All on Sat May 16 02:31:26 2020
    20 terrorbites would be an "archive" drive with shingles.
    The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

  • From Yousuf Khan@21:1/5 to pedro1492@lycos.com on Sat May 16 07:50:32 2020
    On 5/16/2020 5:31 AM, pedro1492@lycos.com wrote:
    20 terrorbites would be an "archive" drive with shingles.
    The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

    I think even 16 TB is way too large, shingles or not. It would still
    take nearly 18 hours.

    What would be a type of HDD that a system could handle practically now?
    I think perhaps the upper limit is 8 TB? That would take nearly 9 hours
    to fill. 6 TB would take 6.5 hours, 4 TB would take 4.5 hours.

    Yousuf Khan

  • From Mark Perkins@21:1/5 to bbbl67@spammenot.yahoo.com on Sat May 16 16:02:45 2020
    On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan
    <bbbl67@spammenot.yahoo.com> wrote:

    On 5/16/2020 5:31 AM, pedro1492@lycos.com wrote:
    20 terrorbites would be an "archive" drive with shingles.
    The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

    I haven't done a scandisk in quite a few years, and prior to that it was
    another few years since the previous one. It's not something I worry
    about, nor do I worry about how long it takes to fill a drive with data.
    My primary concerns are how many SATA ports and drive bays I have on
    hand. Those are the limiting factors.

    I think even 16 TB is way too large, shingles or not. It would still
    take nearly 18 hours.

    We all have different needs. My server has 16 SATA ports and 15 drive
    bays, so the OS lives on an SSD that lies on the floor of the case. The
    data drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB,
    formatted to 36.3TB. I use DriveBender to pool all of the drives into a
    single volume. Windows is happy with that. Since there are no SATA
    ports or drive bays available, upgrading for more storage means
    replacing one or more of the current drives. External drives aren't a
    serious long-term option.

    The PC that I'm typing on, which I consider my workstation, has 6 SATA
    ports native to the mobo, 3 NVMe sockets, and 10 drive bays. I use an
    NVMe drive for the OS and 4TB x3 plus 12TB x2 for data, giving me 36TB
    raw and 32.7TB formatted. I also use DriveBender here, so Windows sees
    a single 32.7TB volume. With one SATA port available (and 5 drive
    bays), I can expand the storage by adding one drive. Beyond that, since
    I'll be out of SATA ports and don't really want to use a PCIe SATA
    card, my next move would be to replace the 4TB drives with something
    bigger.
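
    The gap between raw and formatted capacity in both machines looks
    like it is mostly a units difference: drives are labeled in decimal
    TB, while Windows reports binary TiB but displays them as "TB". A
    minimal sketch, assuming filesystem overhead is negligible
    (reported_capacity is a hypothetical helper name):

```python
TB = 10**12   # decimal terabyte, used on drive labels
TIB = 2**40   # binary tebibyte, which Windows displays as "TB"


def reported_capacity(raw_tb: float) -> float:
    """Convert a raw decimal-TB capacity to binary TiB."""
    return raw_tb * TB / TIB


if __name__ == "__main__":
    server = 5 * 4 + 10 * 2        # 4TB x5 plus 2TB x10
    workstation = 3 * 4 + 2 * 12   # 4TB x3 plus 12TB x2
    for name, raw in (("server", server), ("workstation", workstation)):
        print(f"{name}: {raw} TB raw, about {reported_capacity(raw):.2f} TiB")
```

    That gives roughly 36.4 and 32.7, close to the 36.3TB and 32.7TB
    figures quoted above.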

    At the moment, I'm looking at 12TB and 14TB drives as possible system
    upgrades. The 16TB drives are still expensive, with most being north of
    $400 apiece.

    What would be a type of HDD that a system could handle practically now?
    I think perhaps the upper limit is 8 TB? That would take nearly 9 hours
    to fill. 6 TB would take 6.5 hours, 4 TB would take 4.5 hours.

    Mine are 36.3TB and 32.7TB. I've never filled a volume that size all at
    once.

  • From Yousuf Khan@21:1/5 to Mark Perkins on Sun May 17 22:06:35 2020
    On 5/16/2020 5:02 PM, Mark Perkins wrote:
    On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan
    <bbbl67@spammenot.yahoo.com> wrote:

    On 5/16/2020 5:31 AM, pedro1492@lycos.com wrote:
    20 terrorbites would be an "archive" drive with shingles.
    The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

    I haven't done a scandisk in quite a few years, and prior to that it was another few years since the previous one. It's not something I worry about, nor do I worry about how long it takes to fill a drive with data. My
    primary concerns are how many SATA ports and drive bays I have on hand.
    Those are the limiting factors.

    Well, nobody does Scandisks more than once in several years. I'm sure
    Pedro meant that as an extreme example, but not something that is
    unreasonable to expect to do occasionally.

    I think even 16 TB is way too large, shingles or not. It would still
    take nearly 18 hours.

    We all have different needs. My server has 16 SATA ports and 15 drive bays, so the OS lives on an SSD that lays on the floor of the case. The data
    drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB, formatted to 36.3TB. I use DriveBender to pool all of the drives into a single volume. Windows is happy with that. Since there are no SATA ports or drive bays available, upgrading for more storage means replacing one or more of the current drives. External drives aren't a serious long-term option.

    But the point is, neither are internal ones these days, it seems.
    Even assuming these are mainly used in enterprise settings, they
    would likely be part of a RAID array. Now if the RAID array is new and
    all of these drives were put in new as part of the initial setup,
    there's nothing to worry about, you fill it up to whatever level of data
    you have. Hopefully your array holds at least twice the amount of data
    that someone's old setup had, so it can keep growing before it too needs
    to be replaced or upgraded. Now as this array ages, it's reasonable to
    assume that one of the drives may die, and it would need to be replaced.
    By the time this event happens, this array is probably at least 80%
    full. Inserting a replacement drive into the array will require a
    massive amount of time to resync, even if it is a smart resync, doing
    only the blocks that actually have data on them.

    Now, looking up what Drive Bender is, it seems to be a virtual volume
    concatenator. So it's not really RAID: when an individual drive dies,
    only the data on it is lost, unless it is backed up. So even in that
    case, if one of these massive drives is part of your DB setup, replacing
    that drive will be a major pain in the butt even while restoring from
    backups. It really raises the question: how long are you willing to wait
    for a drive to get repopulated, knowing that while this is happening
    it's also going to be maxing out the rest of your system for however
    many hours the restore operation takes?

    My point is that I think people will only be willing to wait a few
    hours, perhaps 4 or 5 hours at most, before they say it's not worth it,
    in a home environment. In an enterprise environment, that tolerance may
    get extended out to 8 or 10 hours. So at some point, all of this
    capacity is useless, because it's impractical to manage with the current
    drive and interface speeds.
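
    Turning those time budgets back into capacities, here is a rough
    sketch assuming the drive sustains its full 250 MB/s for the whole
    write, using decimal units. The helper name max_capacity_tb is made
    up for illustration.

```python
def max_capacity_tb(hours: float, rate_mb_s: float) -> float:
    """Terabytes that can be written in `hours` at `rate_mb_s` MB/s."""
    return hours * 3600 * rate_mb_s * 10**6 / 10**12


if __name__ == "__main__":
    # Time budgets taken from the paragraph above; 250 MB/s is the
    # sustained rate quoted earlier in the thread.
    for label, hours in (("home, 5 h", 5), ("enterprise, 10 h", 10)):
        print(f"{label}: {max_capacity_tb(hours, 250):.1f} TB")
```

    Under those assumptions, a 5-hour budget covers about 4.5 TB and a
    10-hour budget about 9 TB.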

    If SSD's were cheaper per byte, then even SSD's running on a SATA
    interface would still be viable at the same capacities we see HDD's at
    right now. So 16 or 20 TB SSD's would be usable devices, but 16 or 20 TB
    HDD's aren't.

    Yousuf Khan

  • From Mark Perkins@21:1/5 to bbbl67@spammenot.yahoo.com on Mon May 18 17:11:06 2020
    On Sun, 17 May 2020 22:06:35 -0400, Yousuf Khan
    <bbbl67@spammenot.yahoo.com> wrote:

    On 5/16/2020 5:02 PM, Mark Perkins wrote:
    On Sat, 16 May 2020 07:50:32 -0400, Yousuf Khan
    <bbbl67@spammenot.yahoo.com> wrote:

    On 5/16/2020 5:31 AM, pedro1492@lycos.com wrote:
    20 terrorbites would be an "archive" drive with shingles.
    The sensible drives go up to 16 TB? Even that is going to take ages for a scandisk.

    I haven't done a scandisk in quite a few years, and prior to that it was
    another few years since the previous one. It's not something I worry about,
    nor do I worry about how long it takes to fill a drive with data. My
    primary concerns are how many SATA ports and drive bays I have on hand.
    Those are the limiting factors.

    Well, nobody does Scandisks more than once in several years. I'm sure
    Pedro meant that as an extreme example, but not something that is
    unreasonable to expect to do occasionally.

    I think even 16 TB is way too large, shingles or not. It would still
    take nearly 18 hours.

    We all have different needs. My server has 16 SATA ports and 15 drive bays,
    so the OS lives on an SSD that lays on the floor of the case. The data
    drives are 4TB x5 and 2TB x10, for a raw capacity of 40TB, formatted to
    36.3TB. I use DriveBender to pool all of the drives into a single volume.
    Windows is happy with that. Since there are no SATA ports or drive bays
    available, upgrading for more storage means replacing one or more of the
    current drives. External drives aren't a serious long-term option.

    But the point is, neither are internal ones these days, it seems.

    I don't follow what you're saying. To me, internal drives are the primary
    data storage option.

    Assuming even if these are mainly used in enterprise settings, they
    would likely be part of a RAID array. Now if the RAID array is new and
    all of these drives were put in new as part of the initial setup,
    <snip>

    No, I'm not assuming that (Enterprise and RAID) at all. I'm assuming use in
    the home market, and specifically the subset of the home market where
    people want to keep large amounts of data accessible. RAID is relatively
    rare in that setting, isn't it? I don't know anyone who uses it, but that
    doesn't mean much.

    Now, looking up what Drive Bender is, it seems to be a virtual volume
    concatenator. So it's not really a RAID, individual drives die and only
    the data on them are lost, unless they are backed up. So even in that
    case, if one of these massive drives is part of your DB setup, replacing
    that drive will be a major pain in the butt even while restoring from

    Restoring just the missing files is a major pain? Why does that have to be
    the case? FWIW, I haven't found that to be true. It's much faster than
    doing a full restore, for example.

    backups. It really begs the question how long are you willing to wait
    for a drive to get repopulated, knowing that while this is happening
    it's also going to be maxing out the rest of your system for the amount
    of hours that the restore operation is happening?

    If there's something you need right away, you prioritize that. Otherwise,
    let the restore run and do its thing. It's not like disk access brings a
    modern system to its knees, right? Performance-wise, you wouldn't even
    know it's happening. So in general, there's no significant waiting, and
    remember that failed drives are not an every day/week/month/year
    occurrence. Most drives last longer than I'm willing to use them,
    getting replaced when the data has outgrown their capacity.

    My point is that I think people will only be willing to wait a few
    hours, perhaps 4 or 5 hours at most, before they say it's not worth it,
    in a home environment.

    I don't follow that at all.

    In an enterprise environment, that tolerance may
    get extended out to 8 or 10 hours. So at some point, all of this
    capacity is useless, because it's impractical to manage with the current
    drive and interface speeds.

    ??? How often are you clearing and refilling an entire drive?

    If SSD's were cheaper per byte, then even SSD's running on a SATA
    interface would still be viable at the same capacities we see HDD's at
    right now. So a 16 or 20 TB SSD would be usable devices, but 16 or 20 TB
    HDD's aren't.

    That sounds like nonsense. If 100TB HDD's were available at a reasonable
    price and reasonably reliable, many people would find them to be perfectly
    usable. I'd love to replace all of my smaller drives with fewer larger
    drives and in fact that's exactly what I've been doing since the
    mid-1980's.
