• Doh! Raid10 in the dark ...

    From Adrian Caspersz@21:1/5 to All on Fri Feb 16 12:37:20 2024
    I have weekly scheduled backups and have successfully restored, I am in
    a good place - now need to grow back some hair.

    Lesson learned for a cheapskate Dell R620 home lab environment.

    Do not set up a RAID 10 system (stripe + mirror) and forget to keep an
    eye on the RAID monitoring [1] for errors

    - OR, specifically,

    forget to keep an *eye on the monitoring platform* to check that it is
    itself functioning [2]

    I had two drives grow bad sectors, so when I came to pull and replace
    one of them, I found that the other bad disk was its mirror partner in
    the same stripe group.
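
    The failure mode above can be sketched in a few lines. This is a
    simplified model, not a description of how a PERC controller works: it
    treats each disk as either healthy or failed (bad sectors are subtler
    than that), and the 4-disk layout and the names PAIRS, raid10_survives
    and fatal_two_disk_failures are assumptions made up for illustration.

```python
from itertools import combinations

# Hypothetical 4-disk layout: disks 0 and 1 mirror each other, as do 2 and 3.
PAIRS = [(0, 1), (2, 3)]


def raid10_survives(failed, mirror_pairs):
    """Return True if every mirror pair still has at least one healthy disk."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


def fatal_two_disk_failures(mirror_pairs):
    """List every two-disk failure combination that loses the array."""
    disks = sorted(d for pair in mirror_pairs for d in pair)
    return [combo for combo in combinations(disks, 2)
            if not raid10_survives(combo, mirror_pairs)]


if __name__ == "__main__":
    print(raid10_survives({0, 2}, PAIRS))   # one disk lost from each pair
    print(raid10_survives({0, 1}, PAIRS))   # both disks of one pair lost
    print(fatal_two_disk_failures(PAIRS))
```

    In this model a RAID 10 array tolerates two failures only when they
    land in different mirror pairs, which is why replacing the first
    suspect drive early matters.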

    Hmmm... An uncorrectable array, which I could have avoided had I known
    and replaced the first drive earlier [3].

    So downtime[4], check all disks and restore all VMs from backup :(

    Ho hmmmm... :)


    1 - Especially when using cheapo, dubious, 10-year-old 900 GB SAS drives
    at £9 each from eBay.

    2 - Out of the box, Proxmox sends mail alerts via SMTP - unfortunately
    outgoing SMTP from here is blocked courtesy of my ISP/Spamhaus, so a
    Gmail workaround is now implemented.

    3 - Hmmm, time to look at configuring a "hot spare" hard drive.

    4 - Yeah, not critical though. I'm lucky I don't do much infrastructure
    work like this for a job!
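
    Regarding footnote 2, one way to check that alert mail really leaves the
    host is to send a test message on a schedule and notice when it stops
    arriving. The sketch below is only an illustration and is not the
    workaround used in this post: the function names, addresses and host
    label are hypothetical, and the relay, port and credentials are
    whatever your own mail provider requires for authenticated submission.

```python
import smtplib
from email.message import EmailMessage


def build_test_alert(sender, recipient, host_label):
    """Build a small test message that identifies the sending host."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{host_label}] alert path test"
    msg.set_content("If you can read this, alert mail from this host gets out.")
    return msg


def send_via_submission(msg, relay, port, user, password):
    """Send msg through an authenticated relay, upgrading with STARTTLS."""
    with smtplib.SMTP(relay, port, timeout=30) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


# Example call (placeholders only, nothing is sent when this file is imported):
# msg = build_test_alert("pve@example.org", "me@example.org", "pve1")
# send_via_submission(msg, "smtp.example.org", 587, "user", "app-password")
```

    Run from a weekly timer, a missing test mail becomes the signal that
    the monitoring path itself has broken.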

    --
    Adrian C

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Adrian Caspersz@21:1/5 to Andy Burns on Fri Feb 16 14:10:19 2024
    On 16/02/2024 13:18, Andy Burns wrote:
    > Adrian Caspersz wrote:
    >
    >> forget to keep an eye on the monitoring platform for functioning
    >> itself [2]
    >
    > Don't PERC cards have a fault LED on the chassis?

    Yeah, probably for failed drives where the whole raid would have gone
    into degraded state.

    Instead, both drives in the mirrored pair were screaming SMART pre-fail
    messages that I could not hear.

    Otherwise the RAID & drives were working fine, silently correcting bad
    reads until I pulled out one of the drives; then PERC announced the bad
    blocks, closely followed by a Proxmox VM backup hang.
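
    SMART warnings like these can be polled from a script rather than
    waiting for a chassis LED. The sketch below assumes smartmontools is
    installed and only reads the overall health line from "smartctl -H";
    the function names are hypothetical. Drives behind a RAID controller
    usually need an extra device-type option (for example "-d megaraid,N"
    on many PERC cards), which is an assumption to verify on your own
    hardware.

```python
import re
import subprocess

# Matches the health line printed for ATA drives and for SCSI/SAS drives.
HEALTH_RE = re.compile(
    r"^(?:SMART overall-health self-assessment test result"
    r"|SMART Health Status):\s*(.+)$",
    re.MULTILINE,
)


def parse_health(smartctl_output):
    """Return the reported health string, or None if no health line is found."""
    match = HEALTH_RE.search(smartctl_output)
    return match.group(1).strip() if match else None


def is_healthy(status):
    """Treat only the two known-good strings as healthy; anything else is suspect."""
    return status in ("PASSED", "OK")


def check_device(device, extra_args=()):
    """Run 'smartctl -H' on one device and return its parsed health string."""
    result = subprocess.run(
        ["smartctl", "-H", *extra_args, device],
        capture_output=True, text=True, check=False,
    )
    return parse_health(result.stdout)


# Example (needs root and a real device; the path is a placeholder):
# status = check_device("/dev/sda")
# print(status, is_healthy(status))
```

    Treating anything other than a known-good string as a problem errs on
    the side of a false alarm rather than a silent pre-fail.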

    --
    Adrian C

  • From Andy Burns@21:1/5 to All on Fri Feb 16 13:18:40 2024
    Adrian Caspersz wrote:

    > forget to keep an eye on the monitoring platform for functioning
    > itself [2]

    Don't PERC cards have a fault LED on the chassis?

  • From Jeff Gaines@21:1/5 to Caspersz on Fri Feb 16 13:32:20 2024
    On 16/02/2024 in message <l39380Fhd8sU1@mid.individual.net> Adrian
    Caspersz wrote:

    > Do not set up a Raid 10 system (stripe + mirror) and forget to keep an eye
    > on monitoring the raid [1] for errors

    My NAS has 4 x 2 TB SSDs in RAID 10. In fact I only have 1.2 TB of data
    and wonder if I should use them as 4 individual drives - is that JBOD?
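
    The capacity side of that trade-off is simple arithmetic, sketched
    below. It assumes a conventional striped-mirrors RAID 10 layout and
    ignores filesystem overhead; the function names are made up for
    illustration. It says nothing about redundancy: independent drives
    offer more space but no mirror to fall back on.

```python
def raid10_usable_tb(disk_count, disk_tb):
    """Usable capacity of conventional RAID 10: half the raw total."""
    if disk_count < 4 or disk_count % 2:
        raise ValueError("expected an even number of disks, at least 4")
    return disk_count * disk_tb / 2


def independent_usable_tb(disk_count, disk_tb):
    """Usable capacity when every disk is used on its own, with no redundancy."""
    return disk_count * disk_tb


if __name__ == "__main__":
    data_tb = 1.2
    for label, usable in (
        ("RAID 10", raid10_usable_tb(4, 2)),
        ("independent", independent_usable_tb(4, 2)),
    ):
        print(f"{label}: {usable} TB usable, data fits: {data_tb <= usable}")
```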

    --
    Jeff Gaines Dorset UK
    We chose to do this not because it is easy but because we thought it would
    be easy.
