• Fwd: Also observing #988477

    From Elliott Mitchell@21:1/5 to All on Thu Jan 18 18:40:02 2024
    I should have Cc'd debian-kernel@lists.debian.org, but failed to do so.
    As such, I am now forwarding a copy. At the very least this involves the
    Linux MD-RAID1 functionality, but I am unsure whether this is a Linux
    kernel bug or a Xen bug.


    Forwarded:

    I am also observing #988477 occur. This machine has an AMD Zen 4
    processor. The first observation came after the motherboard/processor
    was swapped out; the older motherboard/processor was several
    generations old.

    The pattern which is emerging is Linux MD RAID1 plus a recent AMD
    processor which has full IOMMU functionality. The older machine was
    believed to have an IOMMU, but the BIOS wasn't creating the appropriate
    ACPI tables (IVRS) and thus Xen was unable to utilize it.

    This seems to be occurring with a small percentage of write operations.
    Subsequent read operations appear to be fine.

    I am not convinced this is a Xen bug. I suspect this is instead a bug
    in the Linux MD subsystem. In particular, the DMA interface may have
    been designed assuming only a single device would ever access any page,
    while the MD RAID1 driver is reusing the same page for both devices.

    IOMMU page release could be handled by marking the page unused in a
    device data structure, with the entry later removed by sweeping a table.
    In such a case, if the MD-RAID1 driver were to redirect the page to
    another device between these two steps, the entry for the subsequent
    device could be wiped out when trying to invalidate the entry for the
    prior device.
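
    The suspected sequence can be sketched as a toy model. This is purely
    illustrative and is not Linux or Xen code: the class ToyIommu, the
    function mirrored_write, the device names and the page-keyed table are
    all hypothetical, chosen only to show how a deferred sweep could remove
    a mapping that now belongs to the second mirror device.

```python
class ToyIommu:
    """Hypothetical IOMMU table keyed by page only, with deferred removal."""

    def __init__(self):
        self.table = {}    # page -> device currently allowed to DMA
        self.pending = []  # pages marked unused, not yet swept

    def map(self, page, device):
        self.table[page] = device

    def unmap(self, page):
        # Step 1: only mark the page unused; the entry stays for now.
        self.pending.append(page)

    def sweep(self):
        # Step 2: remove the entries of all pages marked unused.
        for page in self.pending:
            self.table.pop(page, None)
        self.pending.clear()

    def can_dma(self, page, device):
        return self.table.get(page) == device


def mirrored_write(iommu, page, sweep_between):
    """Write one page to two mirror halves, reusing the same page."""
    iommu.map(page, "disk0")
    iommu.unmap(page)            # first half done, release is deferred
    if sweep_between:
        iommu.sweep()            # sweep happens before the page is reused
    iommu.map(page, "disk1")     # same page redirected to second device
    if not sweep_between:
        iommu.sweep()            # late sweep hits the new mapping
    return iommu.can_dma(page, "disk1")


if __name__ == "__main__":
    print(mirrored_write(ToyIommu(), 0x1000, sweep_between=True))
    print(mirrored_write(ToyIommu(), 0x1000, sweep_between=False))
```

    In this model the second device keeps its mapping only when the sweep
    runs before the page is reused. Whether any real implementation behaves
    this way is exactly the open question.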


    Anyway, I'm also observing bug #988477. This could also be a kernel bug.
    So far no crashes or confirmed data loss have occurred, but sweeping the
    mirror does turn up small numbers of inconsistencies.
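
    For anyone wanting to compare numbers, a minimal sketch of reading the
    mismatch count that the MD driver exposes in sysfs after a check pass.
    The array name md0 and the helper names are assumptions for
    illustration; the sysfs files sync_action and mismatch_cnt are the
    standard MD interface. Requesting a check requires root.

```python
from pathlib import Path


def md_dir(array="md0", sysfs_root="/sys"):
    # Location of the MD attributes for one array (array name is assumed).
    return Path(sysfs_root) / "block" / array / "md"


def request_check(array="md0", sysfs_root="/sys"):
    # Ask the MD driver to start a read-only consistency check of the array.
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")


def read_mismatch_cnt(array="md0", sysfs_root="/sys"):
    # Mismatch count reported by the most recent check or repair pass.
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text().strip())
```

    The count is only meaningful once the check has finished, i.e. once
    sync_action reads back as idle.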


    --
    (\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/)
    \BS ( | ehem+sigmsg@m5p.com PGP 87145445 | ) /
    \_CS\ | _____ -O #include <stddisclaimer.h> O- _____ | / _/
    8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
