• C3600 kernel/64bit 4.* slow IO due to -mlong-calls

    From Helge Deller@21:1/5 to Carlo Pisani on Thu Mar 15 21:50:02 2018
    Hi Carlo,

    On 15.03.2018 16:36, Carlo Pisani wrote:
    > I am experiencing a very annoying behavior with my HPPA C3600: if I
    > compile the (linux) kernel with -mlong-calls then the IO (e.g. file
    > copy) becomes very slow, and the PCI becomes unstable (i.e. it crashes
    > the machine)
    >
    > kernel  gcc    binutils  with mlong  without mlong
    > 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    > 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    > 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s

    Interesting bad results!

    > these tests were performed with
    >
    > dd if=/dev/zero of=here bs=1k count=100000
    >
    > -mlong-calls is enabled in the kernel by "CONFIG_MLONGCALLS"

    I think nobody else noticed the bad performance due to CONFIG_MLONGCALLS yet. I've now started some testing if we can disable that option on the debian kernels...

    > the help-guide says "If you configure the kernel to include many
    > drivers built-in instead as modules, the kernel executable may become
    > too big, so that the linker will not be able to resolve some long
    > branches and fails to link your vmlinux kernel. In that case enabling
    > this option will help you to overcome this limit by using the
    > -mlong-calls compiler option. Usually you want to say N here, unless
    > you e.g. want to build a kernel which includes all necessary drivers
    > built-in and which can be used for TFTP booting without the need to
    > have an initrd ramdisk. Enabling this option will probably slow down
    > your kernel"
    >
    > I need -mlong-calls because I need to compile the kernel without kernel-modules

    Why?

    > all built-in, that makes the size of the kernel of
    > about 23Mbytes, thus without -mlong-calls the linker fails to "link"
    > objects
    >
    > let me know

    I'm not sure what kind of help you expect here?
    The only option I see is that you disable some options (modules) you don't
    need and thus reduce the kernel size. xfs, ipv6 and such are good candidates. Or use a 32-bit kernel?

    Helge

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John David Anglin@21:1/5 to Helge Deller on Thu Mar 15 21:50:02 2018
    On 2018-03-15 4:43 PM, Helge Deller wrote:
    > On 15.03.2018 16:36, Carlo Pisani wrote:
    >> I am experiencing a very annoying behavior with my HPPA C3600: if I
    >> compile the (linux) kernel with -mlong-calls then the IO (e.g. file
    >> copy) becomes very slow, and the PCI becomes unstable (i.e. it crashes
    >> the machine)
    >>
    >> kernel  gcc    binutils  with mlong  without mlong
    >> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >
    > Interesting bad results!

    I don't believe the instability mentioned is due to long calls. It's the following issue:
    https://www.spinics.net/lists/linux-parisc/msg01024.html

    --
    John David Anglin dave.anglin@bell.net

  • From Carlo Pisani@21:1/5 to All on Thu Mar 15 17:00:02 2018
    hi
    I am experiencing a very annoying behavior with my HPPA C3600: if I
    compile the (linux) kernel with -mlong-calls then the IO (e.g. file
    copy) becomes very slow, and the PCI becomes unstable (i.e. it crashes
    the machine)

    kernel  gcc    binutils  with mlong  without mlong
    4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s

    these tests were performed with

    dd if=/dev/zero of=here bs=1k count=100000

    -mlong-calls is enabled in the kernel by "CONFIG_MLONGCALLS"

    the help-guide says "If you configure the kernel to include many
    drivers built-in instead as modules, the kernel executable may become
    too big, so that the linker will not be able to resolve some long
    branches and fails to link your vmlinux kernel. In that case enabling
    this option will help you to overcome this limit by using the
    -mlong-calls compiler option. Usually you want to say N here, unless
    you e.g. want to build a kernel which includes all necessary drivers
    built-in and which can be used for TFTP booting without the need to
    have an initrd ramdisk. Enabling this option will probably slow down
    your kernel"

    I need -mlong-calls because I need to compile the kernel without kernel
    modules, with everything built-in. That makes the kernel about 23 Mbytes,
    so without -mlong-calls the linker fails to "link" the objects

    let me know
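
For anyone repeating the measurement, the dd run above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name `write_throughput`, the unbuffered writes, the final fsync, and MB meaning 10**6 bytes are choices made here, not part of the original test, so absolute numbers will not match dd exactly.

```python
import os
import time


def write_throughput(path, block_size=1024, count=100000, sync=True):
    """Write `count` zero-filled blocks of `block_size` bytes; return MB/s.

    Roughly mirrors `dd if=/dev/zero of=here bs=1k count=100000`.
    MB is taken as 10**6 bytes (an assumption).
    """
    block = bytes(block_size)
    start = time.perf_counter()
    # buffering=0 issues one write() per block, as dd does with bs=1k
    with open(path, "wb", buffering=0) as f:
        for _ in range(count):
            f.write(block)
        if sync:
            os.fsync(f.fileno())  # include the time to reach the disk
    elapsed = time.perf_counter() - start
    return (block_size * count) / elapsed / 1e6
```

Calling `write_throughput("here")` on kernels built with and without CONFIG_MLONGCALLS should give a comparable pair of figures; passing `sync=False` is closer to plain dd, which may largely measure the page cache.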

  • From John David Anglin@21:1/5 to Helge Deller on Thu Mar 15 22:00:02 2018
    On 2018-03-15 4:43 PM, Helge Deller wrote:
    >> kernel  gcc    binutils  with mlong  without mlong
    >> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >
    > Interesting bad results!

    It's hard to understand why the performance would deteriorate so much
    but I see essentially
    the same behavior.

    Dave

    --
    John David Anglin dave.anglin@bell.net

  • From Helge Deller@21:1/5 to All on Fri Mar 16 12:30:02 2018
    >>> kernel  gcc    binutils  with mlong  without mlong
    >>> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >>> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >>> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >>
    >> Interesting bad results!
    >
    > It's hard to understand why the performance would deteriorate so much
    > but I see essentially the same behavior.

    Speaking of debian kernel, it's nearly impossible to link a kernel without mlong-calls.

    Compiling without mlong-calls generates this (R_PARISC_PCREL22F):
             b,l external_func,%r2
             nop

    With -mlong-calls it is much more complex:
    .LC0:
             .dword  P%external_func
    .globl a
    a:
             addil LT'.LC0,%r27
             ldd RT'.LC0(%r1),%r28
             ldd 0(%r28),%r28
             ldd 16(%r28),%r2
             bve,l (%r2),%r2


    Since our kernel is running in the first 4GB of RAM (even on 64bit), couldn't we instead
    introduce a gcc option, e.g. "-mkernel-indirect-calls", which translates to:
             ldil    L%external_func, %r2        // R_PARISC_DIR21L
             ldo     R%external_func(%r2), %r2   // R_PARISC_DIR14R
             bve,l (%r2),%r2

    Does -mfast-indirect-calls have any effect at all?
    I haven't seen any difference when using this option.

    Thoughts?
    Helge
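
The ldil/ldo pair proposed above builds a 32-bit address from two pieces: L% selects the upper 21 bits and R% the lower 11 bits. The sketch below only illustrates that arithmetic; the helper name `split_l_r` is made up for this note, and the sample address is the `__sym_mfree+0x108` return address from the backtrace later in this thread.

```python
def split_l_r(addr):
    """Split a 32-bit address into its L% (left 21 bits) and R% (right 11 bits) parts."""
    if not 0 <= addr < (1 << 32):
        raise ValueError("address does not fit in 32 bits")
    left = addr & 0xFFFFF800   # what ldil L%sym would load
    right = addr & 0x000007FF  # what ldo R%sym(reg) would add
    return left, right


left, right = split_l_r(0x4051D6E8)
print(hex(left), hex(right))  # 0x4051d000 0x6e8
```

The two parts always add back up to the original address, which is why two instructions are enough as long as the target lies below 4 GB.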

  • From John David Anglin@21:1/5 to Helge Deller on Fri Mar 16 14:40:01 2018
    On 2018-03-16 7:25 AM, Helge Deller wrote:
    >>>> kernel  gcc    binutils  with mlong  without mlong
    >>>> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >>>> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >>>> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >>>
    >>> Interesting bad results!
    >>
    >> It's hard to understand why the performance would deteriorate so much
    >> but I see essentially the same behavior.
    >
    > Speaking of debian kernel, it's nearly impossible to link a kernel without mlong-calls.
    >
    > Compiling without mlong-calls generates this (R_PARISC_PCREL22F):
    >          b,l external_func,%r2
    >          nop

    On PA 2.0, this is a 22 bit pc-relative call that has a branch distance
    of 8 MB.  We have no stub support
    in the gnu 64-bit linker.  If we had stub support, this would be best solution.
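
The 8 MB figure follows from the width of the displacement field. As a rough check, assuming the displacement is a signed count of 4-byte instruction words (the function name below is illustrative only):

```python
MIB = 1 << 20


def branch_reach_bytes(disp_bits, insn_bytes=4):
    """Reach in one direction of a pc-relative branch with a signed word displacement."""
    return (1 << (disp_bits - 1)) * insn_bytes


reach_pa20 = branch_reach_bytes(22)   # 22-bit displacement on PA 2.0
reach_pa1x = branch_reach_bytes(17)   # 17-bit displacement on PA 1.x

kernel_size = 23 * MIB                # image size reported at the start of the thread
some_calls_out_of_reach = kernel_size > reach_pa20

print(reach_pa20 // MIB, reach_pa1x // 1024, some_calls_out_of_reach)
```

With a roughly 23 Mbyte image, caller and callee can be much further apart than 8 MB, so without linker stubs some direct calls may not be encodable.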

    In addition to the argument registers, the argument pointer needs to be
    loaded for each call.


    > With -mlong-calls it is much more complex:
    > .LC0:
    >          .dword  P%external_func
    > .globl a
    > a:
    >          addil LT'.LC0,%r27
    >          ldd RT'.LC0(%r1),%r28
    >          ldd 0(%r28),%r28
    >          ldd 16(%r28),%r2
    >          bve,l (%r2),%r2

    This is standard 64-bit indirect call.  It calls via a function
    descriptor.  It assumes the PIC register may change
    and the callee may be in a different space (i.e., 64-bit hpux runtime). 
    The bve instruction is specific to PA 2.0.
    In the kernel, we probably don't need the load of the new PIC register
    (omitted from the above).
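
The offsets in the quoted sequence can be read against the function descriptor layout. The sketch below assumes the four-doubleword layout that the Linux parisc headers describe as `Elf64_Fdesc` (two unused doublewords, then the entry address, then the global pointer); the Python class name is illustrative only.

```python
import ctypes


class Fdesc64(ctypes.Structure):
    """Assumed layout of a 64-bit PA-RISC function descriptor."""
    _fields_ = [
        ("dummy", ctypes.c_uint64 * 2),  # unused by the call sequence
        ("addr", ctypes.c_uint64),       # function entry point
        ("gp", ctypes.c_uint64),         # PIC register value for the callee
    ]


print(Fdesc64.addr.offset, Fdesc64.gp.offset, ctypes.sizeof(Fdesc64))  # 16 24 32
```

Under that assumption, `ldd 16(%r28),%r2` picks up the entry address, and the omitted PIC register load would presumably read the doubleword at offset 24.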



    > Since our kernel is running in the first 4GB of RAM (even on 64bit), couldn't we instead
    > introduce a gcc option, e.g. "-mkernel-indirect-calls", which translates to:
    >          ldil    L%external_func, %r2        // R_PARISC_DIR21L
    >          ldo     R%external_func(%r2), %r2   // R_PARISC_DIR14R
    >          bve,l (%r2),%r2

    Another option is to use ble (i.e., call sequence generated using -mfast-indirect-calls).  It yields the same length
    call sequence as your above sequence and it works on both PA 1.x and 2.0.

    The above sequence is not PIC.  What about modules?

    In the above three sequences, there is a delay slot after the branch
    which might be filled by the compiler with a
    useful instruction.

    > Does -mfast-indirect-calls has any effect at all?
    > I haven't seen any difference when using this option.

    At the moment, this option only applies to the 32-bit compiler.

    > Thoughts?

    I don't remember any huge increase in gcc build time with -mlong-calls. 
    Calls don't usually dominate performance.

    Dave

    --
    John David Anglin dave.anglin@bell.net

  • From Matt Turner@21:1/5 to Helge Deller on Fri Mar 16 18:30:01 2018
    On Fri, Mar 16, 2018 at 4:25 AM, Helge Deller <deller@gmx.de> wrote:
    >>>> kernel  gcc    binutils  with mlong  without mlong
    >>>> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >>>> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >>>> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >>>
    >>> Interesting bad results!
    >>
    >> It's hard to understand why the performance would deteriorate so much
    >> but I see essentially the same behavior.
    >
    > Speaking of debian kernel, it's nearly impossible to link a kernel without mlong-calls.

    This week I succeeded in building a stripped-down kernel without
    mlong-calls (in Gentoo), but was unable to get anything to link
    without mlong-calls when CONFIG_PARISC_PAGE_SIZE_16KB=y.

    With mlong-calls, I couldn't get a 16K page-size kernel to boot
    either. Is this a configuration anyone uses or tests? 16K pages are
    supposed to give better performance, so it'd be good if they worked.

  • From John David Anglin@21:1/5 to Matt Turner on Fri Mar 16 18:50:02 2018
    On 2018-03-16 1:29 PM, Matt Turner wrote:
    > This week I succeeded in building a stripped-down kernel without
    > mlong-calls (in Gentoo), but was unable to get anything to link
    > without mlong-calls when CONFIG_PARISC_PAGE_SIZE_16KB=y.

    I wouldn't recommend the above option.  It affects alignment of some
    things in kernel
    and as you found it makes the kernel bigger.  There are also some things
    in userspace
    that assume 4KB pages.

    Dave

    --
    John David Anglin dave.anglin@bell.net

  • From Matt Turner@21:1/5 to dave.anglin@bell.net on Fri Mar 16 19:00:02 2018
    On Fri, Mar 16, 2018 at 10:44 AM, John David Anglin
    <dave.anglin@bell.net> wrote:
    > On 2018-03-16 1:29 PM, Matt Turner wrote:
    >>
    >> This week I succeeded in building a stripped-down kernel without
    >> mlong-calls (in Gentoo), but was unable to get anything to link
    >> without mlong-calls when CONFIG_PARISC_PAGE_SIZE_16KB=y.
    >
    > I wouldn't recommend the above option. It affects alignment of some things in kernel
    > and as you found it makes the kernel bigger. There are also some things in userspace
    > that assume 4KB pages.

    I expect we're past the point where there are any significant
    obstacles to userspace support. Most platforms successfully support
    multiple page sizes these days.

  • From John David Anglin@21:1/5 to Matt Turner on Fri Mar 16 19:10:02 2018
    On 2018-03-16 1:58 PM, Matt Turner wrote:
    > On Fri, Mar 16, 2018 at 10:44 AM, John David Anglin
    > <dave.anglin@bell.net> wrote:
    >> On 2018-03-16 1:29 PM, Matt Turner wrote:
    >>> This week I succeeded in building a stripped-down kernel without
    >>> mlong-calls (in Gentoo), but was unable to get anything to link
    >>> without mlong-calls when CONFIG_PARISC_PAGE_SIZE_16KB=y.
    >>
    >> I wouldn't recommend the above option. It affects alignment of some things in kernel
    >> and as you found it makes the kernel bigger. There are also some things in userspace
    >> that assume 4KB pages.
    >
    > I expect we're past the point where there are any significant
    > obstacles to userspace support. Most platforms successfully support
    > multiple page sizes these days.

    But they don't have the problem we do with non equivalent aliasing. The
    data section starts on a
    page boundary on parisc and it doesn't overlap text.

    Dave

    --
    John David Anglin dave.anglin@bell.net

  • From Helge Deller@21:1/5 to John David Anglin on Fri Mar 16 21:00:01 2018
    On 16.03.2018 19:03, John David Anglin wrote:
    > On 2018-03-16 1:58 PM, Matt Turner wrote:
    >> On Fri, Mar 16, 2018 at 10:44 AM, John David Anglin
    >> <dave.anglin@bell.net> wrote:
    >>> On 2018-03-16 1:29 PM, Matt Turner wrote:
    >>>> This week I succeeded in building a stripped-down kernel without
    >>>> mlong-calls (in Gentoo), but was unable to get anything to link
    >>>> without mlong-calls when CONFIG_PARISC_PAGE_SIZE_16KB=y.
    >>>
    >>> I wouldn't recommend the above option.  It affects alignment of some things
    >>> in kernel
    >>> and as you found it makes the kernel bigger.  There are also some things in
    >>> userspace
    >>> that assume 4KB pages.
    >>
    >> I expect we're past the point where there are any significant
    >> obstacles to userspace support. Most platforms successfully support
    >> multiple page sizes these days.
    >
    > But they don't have the problem we do with non equivalent aliasing. The data section starts on a
    > page boundary on parisc and it doesn't overlap text.

    I'd be astonished, if anything other than 4kB page size is able to
    boot to a login prompt.
    For example, I know the parisc PCI-specific code (dino,lba,...) still
    depends on 4kb page sizes.
    I've left the CONFIG options in the source in the hope somebody
    will try to finish >4kb page support at some point.

    Helge

  • From John David Anglin@21:1/5 to John David Anglin on Sun Mar 18 14:40:02 2018
    On 2018-03-16 9:37 AM, John David Anglin wrote:
    > On 2018-03-16 7:25 AM, Helge Deller wrote:
    >>>>> kernel  gcc    binutils  with mlong  without mlong
    >>>>> 4.15.7  4.9.3  2.25.1    13.4 MB/s   27.0 MB/s
    >>>>> 4.15.7  6.4.0  2.25.1    13.4 MB/s   27.0 MB/s
    >>>>> 4.15.7  6.4.0  2.29.1    14.4 MB/s   25.0 MB/s
    >>>>
    >>>> Interesting bad results!
    >>>
    >>> It's hard to understand why the performance would deteriorate so much
    >>> but I see essentially the same behavior.

    The performance difference between with and without long calls is
    exaggerated by the I/O
    test used for the above results.  I see 22:05 and 21:39 hours for a gcc
    build and check with
    and without kernel long calls on c8000, respectively.
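
To put the two measurements side by side, here is a small worked calculation. It assumes that "22:05" and "21:39" are hours:minutes of wall-clock time; the dd figures are the first row of the table quoted above.

```python
def hhmm_to_minutes(text):
    """Convert an 'HH:MM' duration string to minutes."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


build_with = hhmm_to_minutes("22:05")     # gcc build and check, long calls
build_without = hhmm_to_minutes("21:39")  # gcc build and check, no long calls
build_slowdown = (build_with - build_without) / build_without

dd_with, dd_without = 13.4, 27.0          # MB/s from the dd test
dd_time_ratio = dd_without / dd_with      # same data takes this many times longer

print(f"build: +{build_slowdown:.1%}, dd: x{dd_time_ratio:.2f}")
```

So the compile workload loses about 2%, while the dd test takes about twice as long, which is the sense in which the I/O test exaggerates the difference.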

    I think the poor performance of long calls is primarily due to the loads
    which can trigger
    TLB misses.  This implies we should work to minimize the impact of TLB flushes.
    Flushing the whole TLB is quite detrimental to overall performance and
    it doesn't scale
    well to multiple CPUs.  On rp3440, a pdc instruction takes about 570
    cycles because of
    the broadcast to other CPUs.  So, we need to know whether a mapping is
    local and possibly
    the set of CPUs a mapping applies to.

    >> Speaking of debian kernel, it's nearly impossible to link a kernel
    >> without mlong-calls.
    >>
    >> Compiling without mlong-calls generates this (R_PARISC_PCREL22F):
    >>          b,l external_func,%r2
    >>          nop
    >
    > On PA 2.0, this is a 22 bit pc-relative call that has a branch
    > distance of 8 MB.  We have no stub support
    > in the gnu 64-bit linker.  If we had stub support, this would be best solution.
    >
    > In addition to the argument registers, the argument pointer needs to
    > be loaded for each call.
    >
    >> With -mlong-calls it is much more complex:
    >> .LC0:
    >>          .dword  P%external_func
    >> .globl a
    >> a:
    >>          addil LT'.LC0,%r27
    >>          ldd RT'.LC0(%r1),%r28
    >>          ldd 0(%r28),%r28
    >>          ldd 16(%r28),%r2
    >>          bve,l (%r2),%r2
    >
    > This is standard 64-bit indirect call.  It calls via a function descriptor.  It assumes the PIC register may change
    > and the callee may be in a different space (i.e., 64-bit hpux
    > runtime).  The bve instruction is specific to PA 2.0.
    > In the kernel, we probably don't need the load of the new PIC register (omitted from the above).

    We don't think we need function descriptors in the kernel.  They are
    only needed to load a new PIC register.
    So, we can load the function address directly from the linkage table.

            addil LT'external_function,%r27
            ldd RT'external_function(%r1),%r2
            bve,l (%r2),%r2
            Delay slot

    The above sequence is PIC.  It is the same length as the one suggested
    by Helge below and the
    linker could convert it to Helge's sequence when the call is not
    external to the main linux kernel.
    It does have one load that might trigger a TLB miss.

    I don't know enough about the call sequences used to call functions in
    external modules but
    it might be easier to do the relocation for the above.  It's probably
    already handled as the addil/ldd
    sequence should already load the address of external_function.

    It might also be possible to use a 32-bit PIC pc-relative sequence, but
    it is longer and 32-bit
    pc-relative relocations might not be supported.




    >> Since our kernel is running in the first 4GB of RAM (even on 64bit),
    >> couldn't we instead
    >> introduce a gcc option, e.g. "-mkernel-indirect-calls", which
    >> translates to:
    >>          ldil    L%external_func, %r2        // R_PARISC_DIR21L
    >>          ldo     R%external_func(%r2), %r2   // R_PARISC_DIR14R
    >>          bve,l (%r2),%r2
    >
    > Another option is to use ble (i.e., call sequence generated using -mfast-indirect-calls).  It yields the same length
    > call sequence as your above sequence and it works on both PA 1.x and 2.0.
    >
    > The above sequence is not PIC.  What about modules?
    >
    > In the above three sequences, there is a delay slot after the branch
    > which might be filled by the compiler with a
    > useful instruction.
    >
    >> Does -mfast-indirect-calls has any effect at all?
    >> I haven't seen any difference when using this option.
    >
    > At the moment, this option only applies to the 32-bit compiler.
    >
    >> Thoughts?
    >
    > I don't remember any huge increase in gcc build time with
    > -mlong-calls.  Calls don't usually dominate performance.

    Dave


    --
    John David Anglin dave.anglin@bell.net

  • From Carlo Pisani@21:1/5 to All on Sun Mar 18 17:20:02 2018
    hi
    good news (maybe)

    yesterday I was able to compile kernel 4.16.0-Fearless-Coyote-Experimental-c3600-64bit

    git clone git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git
    git branch parisc-4.16-2

    configured with

    - 64bit kernel
    - no SMP (since the C3600 has 1 CPU)
    - no preemption
    - no mlong-call
    - no built-in drivers, all drivers are kernel-modules

    this kernel was compiled with
    - hppa2.0-unknown-linux-gnu-{ gcc-v6.4.0, binutils-v2.29.1 }

    it has been running for a while and it seems stable, even on heavy I/O
    (I need 48h to confirm)
    with "decent" (not excellent, but acceptable) performance on a
    PCI_VIA_SATA controller

    # lspci | grep SATA
    01:05.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE/SATA
    Controller (rev 50)

    # lsmod
    Module Size Used by
    sata_via 16126 2
    libata 312324 1 sata_via

    # lsprettysize data.bin
    488 Mbyte data.bin
    # time cp data.bin data2.bin
    real 0m20.034s
    user 0m0.012s
    sys 0m9.227s

    roughly 500 Mbyte / 20 s, i.e. about 25 Mbyte/sec
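
Using the exact figures from the `time cp` output instead of the rounded ones, the calculation looks like this. It is only a worked example; the "Mbyte" unit is whatever `lsprettysize` reports, which is not specified here.

```python
size_mbyte = 488   # size of data.bin as reported above
real_s = 20.034    # wall-clock time of the copy
sys_s = 9.227      # time spent in the kernel

rate = size_mbyte / real_s   # effective copy rate in Mbyte/s
sys_share = sys_s / real_s   # fraction of the wall-clock time spent in the kernel

print(f"{rate:.1f} Mbyte/s, {sys_share:.0%} system time")
```

That gives a little over 24 Mbyte/s, in line with the "without mlong" column of the dd table, with close to half of the elapsed time accounted as system time.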

    it's currently under testing


    ---------------------- original message ----------

    flush_cache_range() may be called without context, which then triggers a BUG(). This patch by Dave Anglin adds code to correctly handle this case.

    parisc: Handle case where flush_cache_range is called with no context

    arch/parisc/kernel/cache.c | 41 ++++++++++++++++++++++++++++++++---------
    1 file changed, 32 insertions(+), 9 deletions(-)

  • From Carlo Pisani@21:1/5 to All on Mon Mar 19 18:50:01 2018
    kernel-4.16.2-64bit
    compiled with page size = 16KB
    crashes during boot

    someone asked me for confirmation: here is the log

    ---------------------------------------------------------

    <#> edit the numbered field
    'b' boot with this command line
    'r' restore command line
    'l' list dir
    'x' reset and reboot machine
    ? 0
    3/kernel-4.16.2-64bit-ps-16KB
    Current command line:
    3/kernel-4.16.2-64bit-ps-16KB root=/dev/sda4 console=ttyS0
    0: 3/kernel-4.16.2-64bit-ps-16KB
    1: root=/dev/sda4
    2: console=ttyS0

    <#> edit the numbered field
    'b' boot with this command line
    'r' restore command line
    'l' list dir
    'x' reset and reboot machine
    ? 0
    3/kernel-4.16.2-64bit-ps-16KB
    Current command line:
    3/kernel-4.16.2-64bit-ps-16KB root=/dev/sda4 console=ttyS0
    0: 3/kernel-4.16.2-64bit-ps-16KB
    1: root=/dev/sda4
    2: console=ttyS0

    <#> edit the numbered field
    'b' boot with this command line
    'r' restore command line
    'l' list dir
    'x' reset and reboot machine
    ? b

    Command line for kernel: 'root=/dev/sda4 console=ttyS0 palo_kernel=3/kernel-4.16.2-64bit-ps-16KB'
    Selected kernel: /kernel-4.16.2-64bit-ps-16KB from partition 3
    ELF64 executable
    Entry 00100000 first 00100000 n 5
    Segment 0 load 00100000 size 162096 mediaptr 0x1000
    Segment 1 load 00128000 size 6526320 mediaptr 0x2c000
    Segment 2 load 00764000 size 2551276 mediaptr 0x666000
    Segment 3 load 009d4000 size 879160 mediaptr 0x8d5000
    Segment 4 load 00aac000 size 412264 mediaptr 0x9ac000
    Branching to kernel entry point 0x00100000. If this is the last
    message you see, you may need to switch your console. This is
    a common symptom -- search the FAQ and mailing list at parisc-linux.org

    Linux version 4.16.0-Fearless-Coyote-Experimental-c3600-64bit
    (root@c3600) (gcc version 6.4.0 (Gentoo 6.4.0 p1.0)) #3 Mon
    Mar 19 16:53:18 CET 2018
    unwind_init: start = 0x4098350c, end = 0x409d2dec, entries = 20366
    FP[0] enabled: Rev 1 Model 16
    The 64-bit Kernel has started...
    Kernel default page size is 16 KB. Huge pages disabled.
    bootconsole [ttyB0] enabled
    Initialized PDC Console for debugging.
    Determining PDC firmware type: System Map.
    model 00005cf0 00000481 00000000 00000002 77a756e1 100000f0 00000008
    000000b2 000000b2
    random: fast init done
    vers 00000301
    CPUID vers 17 rev 11 (0x0000022b)
    capabilities 0x3
    model 9000/785/C3600
    Memory Ranges:
    0) Start 0x0000000000000000 End 0x00000000efffffff Size 3840 MB
    1) Start 0x0000000100000000 End 0x00000001ffffffff Size 4096 MB
    2) Start 0x00000010f0000000 End 0x00000010ffffffff Size 256 MB
    Total Memory: 8192 MB
    PDT: type PDT_PDC, size 50, entries 0, status 2, dbe_loc
    0xffffffffffffffff, good_mem 81 MB
    PDT: Firmware reports all memory OK.
    LCD display at fffffff0f05d0008,fffffff0f05d0000 registered
    Built 3 zonelists, mobility grouping on. Total pages: 522496
    Kernel command line: root=/dev/sda4 console=ttyS0 palo_kernel=3/kernel-4.16.2-64bit-ps-16KB
    Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes)
    Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes)
    Memory: 8333088K/8388608K available (6336K kernel code, 1962K rwdata,
    1392K rodata, 208K init, 416K bss, 55520K reserved,
    0K cma-reserved)
    NR_IRQS: 80
    sched_clock: 64 bits at 552MHz, resolution 1ns, wraps every 2199023255551ns
    Console: colour dummy device 160x64
    Calibrating delay loop... 1099.77 BogoMIPS (lpj=2199552)
    pid_max: default: 32768 minimum: 301
    Mount-cache hash table entries: 16384 (order: 3, 131072 bytes)
    Mountpoint-cache hash table entries: 16384 (order: 3, 131072 bytes)
    devtmpfs: initialized
    clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff,
    max_idle_ns: 7645041785100000 ns
    futex hash table entries: 256 (order: -2, 6144 bytes)
    xor: measuring software checksum speed
    8regs : 2628.000 MB/sec
    8regs_prefetch: 1964.000 MB/sec
    32regs : 2280.000 MB/sec
    32regs_prefetch: 2000.000 MB/sec
    xor: using function: 8regs (2628.000 MB/sec)
    NET: Registered protocol family 16
    Searching for devices...
    Found devices:
    1. Astro BC Runway Port at 0xfffffffffed00000 [10] { 12, 0x0, 0x582, 0x0000b }
    2. Elroy PCI Bridge at 0xfffffffffed30000 [10/0] { 13, 0x0, 0x782, 0x0000a }
    3. Elroy PCI Bridge at 0xfffffffffed32000 [10/1] { 13, 0x0, 0x782, 0x0000a }
    4. Elroy PCI Bridge at 0xfffffffffed38000 [10/4] { 13, 0x0, 0x782, 0x0000a }
    5. Elroy PCI Bridge at 0xfffffffffed3c000 [10/6] { 13, 0x0, 0x782, 0x0000a }
    6. Allegro W+ at 0xfffffffffffa0000 [32] { 0, 0x0, 0x5cf, 0x00004 }
    7. Memory at 0xfffffffffed10200 [49] { 1, 0x0, 0x09c, 0x00009 }
    Enabling regular chassis codes support v0.05
    CPU(s): 1 x PA8600 (PCX-W+) at 552.000000 MHz
    Cache flush threshold set to 600 KiB
    TLB flush threshold set to 224 KiB
    SBA found Astro 2.1 at 0xfffffffffed00000
    Elroy version TR4.0 (0x5) found at 0xfffffffffed30000
    LBA 10:0: PCI host bridge to bus 0000:00
    pci_bus 0000:00: root bus resource [io 0x0000-0x1fff]
    pci_bus 0000:00: root bus resource [mem
    0xfffffffff4000000-0xfffffffff47fffff] (bus address
    [0xf4000000-0xf47fffff])
    pci_bus 0000:00: root bus resource [bus 00]
    PCI: Enabled native mode for NS87415 (pif=0x8f)
    Elroy version TR4.0 (0x5) found at 0xfffffffffed32000
    LBA 10:1: PCI host bridge to bus 0000:01
    pci_bus 0000:01: root bus resource [io 0x12000-0x13fff] (bus address [0x2000-0x3fff])
    pci_bus 0000:01: root bus resource [mem
    0xfffffffff4800000-0xfffffffff4ffffff] (bus address
    [0xf4800000-0xf4ffffff])
    pci_bus 0000:01: root bus resource [bus 01]
    Elroy version TR4.0 (0x5) found at 0xfffffffffed38000
    LBA 10:4: PCI host bridge to bus 0000:02
    pci_bus 0000:02: root bus resource [io 0x28000-0x29fff] (bus address [0x8000-0x9fff])
    pci_bus 0000:02: root bus resource [mem
    0xfffffffff6000000-0xfffffffff67fffff] (bus address
    [0xf6000000-0xf67fffff])
    pci_bus 0000:02: root bus resource [bus 02]
    Elroy version TR4.0 (0x5) found at 0xfffffffffed3c000
    LBA 10:6: PCI host bridge to bus 0000:03
    pci_bus 0000:03: root bus resource [io 0x3c000-0x3dfff] (bus address [0xc000-0xdfff])
    pci_bus 0000:03: root bus resource [mem
    0xfffffffff7000000-0xfffffffff77fffff] (bus address
    [0xf7000000-0xf77fffff])
    pci_bus 0000:03: root bus resource [bus 03]
    powersw: Soft power switch at 0xfffffff0f0400804 enabled.
    raid6: int64x1 gen() 462 MB/s
    raid6: int64x1 xor() 235 MB/s
    raid6: int64x2 gen() 642 MB/s
    raid6: int64x2 xor() 314 MB/s
    raid6: int64x4 gen() 673 MB/s
    raid6: int64x4 xor() 313 MB/s
    raid6: int64x8 gen() 580 MB/s
    raid6: int64x8 xor() 324 MB/s
    raid6: using algorithm int64x4 gen() 673 MB/s
    raid6: .... xor() 313 MB/s, rmw enabled
    raid6: using intx1 recovery algorithm
    SCSI subsystem initialized
    usbcore: registered new interface driver usbfs
    usbcore: registered new interface driver hub
    usbcore: registered new device driver usb
    NET: Registered protocol family 2
    tcp_listen_portaddr_hash hash table entries: 4096 (order: 2, 65536 bytes)
    TCP established hash table entries: 65536 (order: 5, 524288 bytes)
    TCP bind hash table entries: 65536 (order: 5, 524288 bytes)
    TCP: Hash tables configured (established 65536 bind 65536)
    UDP hash table entries: 4096 (order: 3, 131072 bytes)
    UDP-Lite hash table entries: 4096 (order: 3, 131072 bytes)
    NET: Registered protocol family 1
    SuperIO: Found NS87560 Legacy I/O device at 0000:00:0e.1 (IRQ 19)
    SuperIO: Serial port 1 at 0x3f8
    SuperIO: Serial port 2 at 0x2f8
    SuperIO: Parallel port at 0x378
    SuperIO: Floppy controller at 0x3f0
    SuperIO: ACPI at 0x7e0
    SuperIO: USB regulator enabled
    clocksource: cr16: mask: 0xffffffffffffffff max_cycles: 0x7f4ee06e57, max_idle_ns: 440795224733 ns
    clocksource: Switched to clocksource cr16
    Enabling PDC chassis warnings support v0.05
    Performance monitoring counters enabled for Allegro W+
    Initialise system trusted keyrings
    workingset: timestamp_bits=59 max_order=19 bucket_order=0
    Key type asymmetric registered
    Asymmetric key parser 'x509' registered
    Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
    io scheduler noop registered
    io scheduler deadline registered
    io scheduler cfq registered (default)
    io scheduler mq-deadline registered
    io scheduler kyber registered
    io scheduler bfq registered
    PDC Stable Storage facility v0.30
    Serial: 8250/16550 driver, 2 ports, IRQ sharing enabled
    serial8250: ttyS0 at I/O 0x3f8 (irq = 3, base_baud = 115200) is a 16550A
    console [ttyS0] enabled
    bootconsole [ttyB0] disabled
    bootconsole [ttyB0] disabled
    serial8250: ttyS1 at I/O 0x2f8 (irq = 4, base_baud = 115200) is a 16550A
    loop: module loaded
    Uniform Multi-Platform E-IDE driver
    ns87415 0000:00:0e.0: IDE controller (0x100b:0x0002 rev 0x03)
    ns87415 0000:00:0e.0: 100% native mode on irq 7
    ide0: BM-DMA at 0x0a00-0x0a07
    ide1: BM-DMA at 0x0a08-0x0a0f
    ide0 at 0xf00-0xf07,0xe02 on irq 7
    ide1 at 0xd00-0xd07,0xb02 on irq 7
    ide-gd driver 1.18
    ide-cd driver 5.00
    Loading iSCSI transport class v2.0-870.
    iscsi: registered transport (tcp)
    sym0: <896> rev 0x7 at pci 0000:00:0f.0 irq 20
    sym0: PA-RISC Firmware, ID 7, Fast-40, SE, parity checking
    CACHE TEST FAILED: DMA error (dstat=0x81).
    sym0: CACHE INCORRECTLY CONFIGURED.
    sym0: giving up ...
    WARNING: CPU: 0 PID: 1 at ___free_dma_mem_cluster+0x138/0x140
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-Fearless-Coyote-Experimental-c3600-64bit #3

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001101111110000001110 Not tainted
    r00-03 000000ff0806fc0e 0000000040823f00 000000004051d6e8 000000012f0a0e70
    r04-07 00000000407eaf00 000000012f1d8220 000000012f1d8000 000000012f1d8000
    r08-11 0000000000000000 000000012f1dc000 000000012f647000 000000012f19e800
    r12-15 000000012f19e898 0000000000000000 0000000040b08c14 0000000040a85c08
    r16-19 000000000000101a 0000000040929fa8 00000000f0000174 000000012f3a0000
    r20-23 000000000000000a 0000000000004000 000000012f1d8170 000000000020c000
    r24-27 000000012f3a0000 000000012f3a0000 000000012f19e898 00000000407eaf00
    r28-31 0000000040a45608 000000012f0a0e40 000000012f0a0f00 000000000800000e
    sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004051d870 000000004051d874
    IIR: 03ffe01f ISR: 0000000010340380 IOR: 0000003c761d8060
    CPU: 0 CR30: 000000012f0a0000 CR31: 00000000ffffffff
    ORIG_R28: 0000000000000000
    IAOQ[0]: ___free_dma_mem_cluster+0x138/0x140
    IAOQ[1]: ___free_dma_mem_cluster+0x13c/0x140
    RP(r2): __sym_mfree+0x108/0x158
    Backtrace:
    [<000000004051d6e8>] __sym_mfree+0x108/0x158
    [<000000004051de28>] __sym_mfree_dma+0x60/0xb8
    [<000000004051d364>] sym_hcb_free+0x7c/0x250
    [<0000000040511ce0>] sym_free_resources+0x70/0xc8
    [<0000000040514278>] sym2_probe+0x678/0xaf8
    [<000000004044dd88>] pci_device_probe+0x108/0x190
    [<00000000404abdc0>] really_probe+0x330/0x400
    [<00000000404abfc8>] __driver_attach+0x138/0x140
    [<00000000404a8e68>] bus_for_each_dev+0x90/0xf8
    [<00000000404ab44c>] driver_attach+0x34/0x48
    [<00000000404aab78>] bus_add_driver+0x1a8/0x308
    [<00000000404accb0>] driver_register+0x90/0x180
    [<000000004044d3bc>] __pci_register_driver+0x54/0x68
    [<000000004011e55c>] 0x4011e55c
    [<0000000040134250>] do_one_initcall+0x70/0x1d8
    [<000000004010152c>] 0x4010152c

    ---[ end trace fba2d821ec98bbe6 ]---
    WARNING: CPU: 0 PID: 1 at ___free_dma_mem_cluster+0x138/0x140
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.16.0-Fearless-Coyote-Experimental-c3600-64bit #3

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001101111100100001110 Tainted: G W
    r00-03 000000ff0806f90e 0000000040823f00 000000004051d6e8 000000012f0a0dc0
    r04-07 00000000407eaf00 000000012f1d8200 000000012f1d8000 000000012f1d8000
    r08-11 0000000000000000 000000012f1dc000 000000012f647000 000000012f19e800
    r12-15 000000012f19e898 0000000000000000 0000000040b08c14 0000000040a85c08
    r16-19 000000000000101a 0000000040929fa8 00000000f0000174 000000012f1dc000
    r20-23 000000000000000a 0000000000004000 000000012f1d8170 0000000000208000
    r24-27 000000012f1dc000 000000012f1dc000 000000012f19e898 00000000407eaf00
    r28-31 0000000040a45608 000000012f0a0d90 000000012f0a0e50 000000000800000e
    sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004051d870 000000004051d874
    IIR: 03ffe01f ISR: 0000000010340380 IOR: 0000003c761d80d8
    CPU: 0 CR30: 000000012f0a0000 CR31: 00000000ffffffff
    ORIG_R28: 0000000000000000
    IAOQ[0]: ___free_dma_mem_cluster+0x138/0x140
    IAOQ[1]: ___free_dma_mem_cluster+0x13c/0x140
    RP(r2): __sym_mfree+0x108/0x158
    Backtrace:
    [<000000004051d6e8>] __sym_mfree+0x108/0x158
    [<000000004051de28>] __sym_mfree_dma+0x60/0xb8
    [<0000000040511d00>] sym_free_resources+0x90/0xc8
    [<0000000040514278>] sym2_probe+0x678/0xaf8
    [<000000004044dd88>] pci_device_probe+0x108/0x190
    [<00000000404abdc0>] really_probe+0x330/0x400
    [<00000000404abfc8>] __driver_attach+0x138/0x140
    [<00000000404a8e68>] bus_for_each_dev+0x90/0xf8
    [<00000000404ab44c>] driver_attach+0x34/0x48
    [<00000000404aab78>] bus_add_driver+0x1a8/0x308
    [<00000000404accb0>] driver_register+0x90/0x180
    [<000000004044d3bc>] __pci_register_driver+0x54/0x68
    [<000000004011e55c>] 0x4011e55c
    [<0000000040134250>] do_one_initcall+0x70/0x1d8
    [<000000004010152c>] 0x4010152c
    [<000000004014ce94>] kernel_init+0x24/0x1b0

    ---[ end trace fba2d821ec98bbe7 ]---
    sym0: <896> rev 0x7 at pci 0000:00:0f.1 irq 20
    sym0: PA-RISC Firmware, ID 7, Fast-40, LVD, parity checking
    CACHE TEST FAILED: DMA error (dstat=0x81).
    sym0: CACHE INCORRECTLY CONFIGURED.
    sym0: giving up ...
    WARNING: CPU: 0 PID: 1 at ___free_dma_mem_cluster+0x138/0x140
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.16.0-Fearless-Coyote-Experimental-c3600-64bit #3

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001101111101000001110 Tainted: G W
    r00-03 000000ff0806fa0e 0000000040823f00 000000004051d6e8 000000012f0a0e70
    r04-07 00000000407eaf00 000000012f59c220 000000012f59c000 000000012f59c000
    r08-11 0000000000000000 000000012f594000 000000012f647000 000000012f19e000
    r12-15 000000012f19e098 0000000000000000 0000000040b08c14 0000000040a85c08
    r16-19 000000000000101a 0000000040929fa8 00000000f0000174 000000012f0c4000
    r20-23 000000000000000a 0000000000004000 000000012f59c170 0000000000214000
    r24-27 000000012f0c4000 000000012f0c4000 000000012f19e098 00000000407eaf00
    r28-31 0000000040a45608 000000012f0a0e40 000000012f0a0f00 000000000800000e
    sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004051d870 000000004051d874
    IIR: 03ffe01f ISR: 0000000010340380 IOR: 0000003d6719c0a8
    CPU: 0 CR30: 000000012f0a0000 CR31: 00000000ffffffff
    ORIG_R28: 0000000000000000
    IAOQ[0]: ___free_dma_mem_cluster+0x138/0x140
    IAOQ[1]: ___free_dma_mem_cluster+0x13c/0x140
    RP(r2): __sym_mfree+0x108/0x158
    Backtrace:
    [<000000004051d6e8>] __sym_mfree+0x108/0x158
    [<000000004051de28>] __sym_mfree_dma+0x60/0xb8
    [<000000004051d364>] sym_hcb_free+0x7c/0x250
    [<0000000040511ce0>] sym_free_resources+0x70/0xc8
    [<0000000040514278>] sym2_probe+0x678/0xaf8
    [<000000004044dd88>] pci_device_probe+0x108/0x190
    [<00000000404abdc0>] really_probe+0x330/0x400
    [<00000000404abfc8>] __driver_attach+0x138/0x140
    [<00000000404a8e68>] bus_for_each_dev+0x90/0xf8
    [<00000000404ab44c>] driver_attach+0x34/0x48
    [<00000000404aab78>] bus_add_driver+0x1a8/0x308
    [<00000000404accb0>] driver_register+0x90/0x180
    [<000000004044d3bc>] __pci_register_driver+0x54/0x68
    [<000000004011e55c>] 0x4011e55c
    [<0000000040134250>] do_one_initcall+0x70/0x1d8
    [<000000004010152c>] 0x4010152c

    ---[ end trace fba2d821ec98bbe8 ]---
    WARNING: CPU: 0 PID: 1 at ___free_dma_mem_cluster+0x138/0x140
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.16.0-Fearless-Coyote-Experimental-c3600-64bit #3

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001101111100100001110 Tainted: G W
    r00-03 000000ff0806f90e 0000000040823f00 000000004051d6e8 000000012f0a0dc0
    r04-07 00000000407eaf00 000000012f59c200 000000012f59c000 000000012f59c000
    r08-11 0000000000000000 000000012f594000 000000012f647000 000000012f19e000
    r12-15 000000012f19e098 0000000000000000 0000000040b08c14 0000000040a85c08
    r16-19 000000000000101a 0000000040929fa8 00000000f0000174 000000012f594000
    r20-23 000000000000000a 0000000000004000 000000012f59c170 0000000000210000
    r24-27 000000012f594000 000000012f594000 000000012f19e098 00000000407eaf00
    r28-31 0000000040a45608 000000012f0a0d90 000000012f0a0e50 000000000800000e
    sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004051d870 000000004051d874
    IIR: 03ffe01f ISR: 0000000010340380 IOR: 0000003d6719c048
    CPU: 0 CR30: 000000012f0a0000 CR31: 00000000ffffffff
    ORIG_R28: 0000000000000000
    IAOQ[0]: ___free_dma_mem_cluster+0x138/0x140
    IAOQ[1]: ___free_dma_mem_cluster+0x13c/0x140
    RP(r2): __sym_mfree+0x108/0x158
    Backtrace:
    [<000000004051d6e8>] __sym_mfree+0x108/0x158
    [<000000004051de28>] __sym_mfree_dma+0x60/0xb8
    [<0000000040511d00>] sym_free_resources+0x90/0xc8
    [<0000000040514278>] sym2_probe+0x678/0xaf8
    [<000000004044dd88>] pci_device_probe+0x108/0x190
    [<00000000404abdc0>] really_probe+0x330/0x400
    [<00000000404abfc8>] __driver_attach+0x138/0x140
    [<00000000404a8e68>] bus_for_each_dev+0x90/0xf8
    [<00000000404ab44c>] driver_attach+0x34/0x48
    [<00000000404aab78>] bus_add_driver+0x1a8/0x308
    [<00000000404accb0>] driver_register+0x90/0x180
    [<000000004044d3bc>] __pci_register_driver+0x54/0x68
    [<000000004011e55c>] 0x4011e55c
    [<0000000040134250>] do_one_initcall+0x70/0x1d8
    [<000000004010152c>] 0x4010152c
    [<000000004014ce94>] kernel_init+0x24/0x1b0

    ---[ end trace fba2d821ec98bbe9 ]---
    st: Version 20160209, fixed bufsize 32768, s/g segs 256
    SCSI Media Changer driver v0.25
    Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
    Linux Tulip driver version 1.1.15-NAPI (Feb 27, 2007)
    tulip0: no phy info, aborting mtable build
    tulip0: MII transceiver #1 config 1000 status 782d advertising 01e1
    net eth0: Digital DS21142/43 Tulip rev 65 at MMIO 0xfffffffff4008000, 00:30:6e:1e:2c:17, IRQ 17
    ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
    ehci-pci: EHCI PCI platform driver
    ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
    ohci-pci: OHCI PCI platform driver
    ohci-pci 0000:00:0e.2: OHCI PCI host controller
    ohci-pci 0000:00:0e.2: new USB bus registered, assigned bus number 1
    ohci-pci 0000:00:0e.2: irq 1, io mem 0xfffffffff4007000
    usb usb1: New USB device found, idVendor=1d6b, idProduct=0001
    usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
    usb usb1: Product: OHCI PCI host controller
    usb usb1: Manufacturer: Linux 4.16.0-Fearless-Coyote-Experimental-c3600-64bit ohci_hcd
    usb usb1: SerialNumber: 0000:00:0e.2
    hub 1-0:1.0: USB hub found
    hub 1-0:1.0: 3 ports detected
    usbcore: registered new interface driver uas
    usbcore: registered new interface driver usb-storage
    usbcore: registered new interface driver ftdi_sio
    usbserial: USB Serial support registered for FTDI USB Serial Device
    rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
    usbcore: registered new interface driver usbhid
    usbhid: USB HID core driver
    pktgen: Packet Generator for packet performance testing. Version: 2.75
    ipip: IPv4 and MPLS over IPv4 tunneling driver
    Initializing XFRM netlink socket
    NET: Registered protocol family 17
    NET: Registered protocol family 15
    bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
    8021q: 802.1Q VLAN Support v1.8
    Key type dns_resolver registered
    Loading compiled-in X.509 certificates
    rtc-generic rtc-generic: setting system clock to 2018-03-19 17:30:04 UTC (1521480604)
    md: Waiting for all devices to be available before autodetect
    md: If you don't use raid, use raid=noautodetect
    md: Autodetecting RAID arrays.
    md: autorun ...
    md: ... autorun DONE.
    VFS: Cannot open root device "sda4" or unknown-block(0,0): error -6
    Please append a correct "root=" boot option; here are the available partitions:
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
    ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
