• Re: Kernel versions 6.x don't boot on Amiga 4000

    From Geert Uytterhoeven@21:1/5 to glaubitz@physik.fu-berlin.de on Tue Feb 21 16:00:01 2023
    Hi Adrian,

    On Tue, Feb 21, 2023 at 3:51 PM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
    I tested Debian's most recent m68k kernels from the 6.0.x and 6.1.x series and

    Thanks for testing!

    neither of these boot on my Amiga 4000/060. Both get stuck at the ABCDGHIJK message.

    Looks surprisingly similar to the issue reported by Stan.
    Do the mitigations given in https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_inJbktSNqQQfO1oxnJHZeoYcHg@mail.gmail.com
    help?

    FWIW, I noticed that the kernel image itself is already over 7 MB, not sure whether this is a problem.

    Depends on how much RAM you have ;-)

    Anyone else tried a recent kernel on their Amigas?

    I really should start booting on real Amiga hardware again...

    Gr{oetje,eeting}s,

    Geert

    --
    Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

    In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that.
    -- Linus Torvalds

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Paul Adrian Glaubitz@21:1/5 to All on Tue Feb 21 16:00:01 2023
    Hi!

    I tested Debian's most recent m68k kernels from the 6.0.x and 6.1.x series and neither of these boot on my Amiga 4000/060. Both get stuck at the ABCDGHIJK message.

    Will try earlier kernels until I find the one where the breakage was introduced.
    Currently known latest kernel to work is 5.10.5.

    FWIW, I noticed that the kernel image itself is already over 7 MB, not sure whether this is a problem.

    Anyone else tried a recent kernel on their Amigas?

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer
    `. `' Physicist
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From John Paul Adrian Glaubitz@21:1/5 to Geert Uytterhoeven on Tue Feb 21 17:00:01 2023
    Hi Geert!

    On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
    Looks surprisingly similar to the issue reported by Stan.
    Do the mitigations given in https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_inJbktSNqQQfO1oxnJHZeoYcHg@mail.gmail.com
    help?

    The kernel actually crashes with a backtrace:

    ABCDGHIJK
    [ 0.000000] Linux version 6.0.0-6-m68k (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for
    Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
    [ 0.000000] Enabling workaround for errata I14
    [ 0.000000] printk: bootconsole [debug0] enabled
    [ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA
    LISA ALICE_PAL ZORRO3
    [ 0.000000] initrd: 0ef0602c - 0f800000
    [ 0.000000] Zone ranges:
    [ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
    [ 0.000000] Normal empty
    [ 0.000000] Movable zone start for each node
    [ 0.000000] Early memory node ranges
    [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]
    [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
    [ 0.000000] Unable to handle kernel access at virtual address (ptrval)
    [ 0.000000] Oops: 00000000
    [ 0.000000] Modules linked in:
    [ 0.000000] PC: [<00201d3c>] memcmp+0x28/0x56
    [ 0.000000] SR: 2709 SP: (ptrval) a2: 004a5580
    [ 0.000000] d0: 00000003 d1: 00000001 d2: 00201d14 d3: 00000272
    [ 0.000000] d4: 00012750 d5: 08023ec0 a0: 0000000c a1: 0f7ffff4
    [ 0.000000] Process swapper (pid: 0, task=(ptrval))
    [ 0.000000] Frame format=4 fault addr=0f7ffff4 fslw=01051000
    [ 0.000000] Stack from 004a3fac:
    [ 0.000000] 00201d14 00000272 00374e40 0f7ffff4 0f800000 00534b22 0f7ffff4 0042e325
    [ 0.000000] 0000000c 0055c000 00000272 00012750 08023ec0 00012750 080dbf48 08001000
    [ 0.000000] 08001000 0f7ffff0 00553d9a 00000000 00533872
    [ 0.000000] Call Trace: [<00201d14>] memcmp+0x0/0x56
    [ 0.000000] [<00374e40>] _printk+0x0/0x18
    [ 0.000000] [<00534b22>] start_kernel+0x8a/0x5d6
    [ 0.000000] [<00012750>] LOGTBL+0x228/0x800
    [ 0.000000] [<00012750>] LOGTBL+0x228/0x800
    [ 0.000000] [<00533872>] _sinittext+0x872/0x11f8
    [ 0.000000]
    [ 0.000000] Code: b288 661e 4280 6030 2a49 284b 264c 224d <bb8c> 66ea 5988 7003 b088 65f0 224d 264c 60dc 4283 1631 1800 4282 1433 1800
    2003
    [ 0.000000] Disabling lock debugging due to kernel taint
    [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
    [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
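
Editorial sketch (not part of the original report): the `PC:` and `Code:` lines of the oops can be cross-checked mechanically. The Python below is a minimal illustration; the helper name `decode_oops` is made up, and it assumes the m68k oops convention that the word in angle brackets on the `Code:` line is the opcode at the faulting PC.

```python
import re

def decode_oops(pc_line, code_line):
    """Return (pc, symbol, offset, opcode) from an oops 'PC:' and 'Code:' line."""
    m = re.search(r"PC: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+", pc_line)
    pc = int(m.group(1), 16)
    symbol = m.group(2)
    offset = int(m.group(3), 16)
    # The bracketed word marks the instruction the CPU was executing.
    opcode = re.search(r"<([0-9a-f]{4})>", code_line).group(1)
    return pc, symbol, offset, opcode

pc_line = "PC: [<00201d3c>] memcmp+0x28/0x56"
code_line = "Code: b288 661e 4280 6030 2a49 284b 264c 224d <bb8c> 66ea 5988"

pc, symbol, offset, opcode = decode_oops(pc_line, code_line)
symbol_start = pc - offset  # where the function begins in the kernel image
print(f"{symbol} starts at {symbol_start:#x}; faulting opcode {opcode} at {pc:#x}")
```

The computed start address and opcode can then be matched against a disassembly of `memcmp` from the same kernel build.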

    FWIW, I noticed that the kernel image itself is already over 7 MB, not sure whether this is a problem.

    Depends on how much RAM you have ;-)

    128 MB.

    Anyone else tried a recent kernel on their Amigas?

    I really should start booting on real Amiga hardware again...

    You should ;-).

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer
    `. `' Physicist
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From John Paul Adrian Glaubitz@21:1/5 to Michael Schmitz on Tue Feb 21 22:50:01 2023
    Hi Michael!

    On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
    a1 is just  before the end of your RAM chunk. If that's a longword
    access, you'd fall over the edge :) Can you disassemble the code snippet
    (or memcmp()) so we can see what's happening?

    Here you go:

    00201d14 <memcmp>:
    201d14: 48e7 301c moveml %d2-%d3/%a3-%a5,%sp@-
    201d18: 226f 0018 moveal %sp@(24),%a1
    201d1c: 266f 001c moveal %sp@(28),%a3
    201d20: 206f 0020 moveal %sp@(32),%a0
    201d24: 7003 moveq #3,%d0
    201d26: b088 cmpl %a0,%d0
    201d28: 650a bcss 201d34 <memcmp+0x20>
    201d2a: 4281 clrl %d1
    201d2c: b288 cmpl %a0,%d1
    201d2e: 661e bnes 201d4e <memcmp+0x3a>
    201d30: 4280 clrl %d0
    201d32: 6030 bras 201d64 <memcmp+0x50>
    201d34: 2a49 moveal %a1,%a5
    201d36: 284b moveal %a3,%a4
    201d38: 264c moveal %a4,%a3
    201d3a: 224d moveal %a5,%a1
    201d3c: bb8c cmpml %a4@+,%a5@+
    201d3e: 66ea bnes 201d2a <memcmp+0x16>
    201d40: 5988 subql #4,%a0
    201d42: 7003 moveq #3,%d0
    201d44: b088 cmpl %a0,%d0
    201d46: 65f0 bcss 201d38 <memcmp+0x24>
    201d48: 224d moveal %a5,%a1
    201d4a: 264c moveal %a4,%a3
    201d4c: 60dc bras 201d2a <memcmp+0x16>
    201d4e: 4283 clrl %d3
    201d50: 1631 1800 moveb %a1@(0,%d1:l),%d3
    201d54: 4282 clrl %d2
    201d56: 1433 1800 moveb %a3@(0,%d1:l),%d2
    201d5a: 2003 movel %d3,%d0
    201d5c: 9082 subl %d2,%d0
    201d5e: 5281 addql #1,%d1
    201d60: b483 cmpl %d3,%d2
    201d62: 67c8 beqs 201d2c <memcmp+0x18>
    201d64: 4cdf 380c moveml %sp@+,%d2-%d3/%a3-%a5
    201d68: 4e75 rts
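
Editorial sketch (not from the thread): read as pseudocode, the listing above appears to compare longwords while more than three bytes remain, then fall back to a byte loop. The Python model below follows that reading. The function name `memcmp_model` is hypothetical, and the model only mirrors the control flow, not the hardware behaviour of `cmpml`.

```python
def memcmp_model(s1: bytes, s2: bytes, count: int) -> int:
    """Model of the disassembled routine: longword compares first, bytes after."""
    i = 0
    # Fast path (201d34..201d46): compare 4 bytes at a time while more than 3 remain.
    while count - i > 3:
        if s1[i:i + 4] != s2[i:i + 4]:
            break  # mismatch: re-examine this longword byte by byte
        i += 4
    # Slow path (201d2a..201d62): byte-wise compare of whatever is left.
    for j in range(i, count):
        diff = s1[j] - s2[j]
        if diff:
            return diff
    return 0

print(memcmp_model(b"abcdefgh", b"abcdefgz", 8))
```

With a count of 0xc, as in the register dump, this model performs three longword compares and never enters the byte loop unless a longword differs.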

    The kernel image is actually unstripped. Is there a config option for that?

    Do we want to keep symbols in a non-debug kernel?

    I do recall recent changes to the mm code, but that was for NOMMU. I
    wonder whether there was anything else that would introduce an implicit assumption about memory starting at 0x0 ...

    Sounds like a possible culprit.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer
    `. `' Physicist
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From Michael Schmitz@21:1/5 to John Paul Adrian Glaubitz on Tue Feb 21 22:20:01 2023
    Hi Adrian,

    On 22/02/23 04:53, John Paul Adrian Glaubitz wrote:
    Hi Geert!

    On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
    Looks surprisingly similar to the issue reported by Stan.
    Do the mitigations given in
    https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_inJbktSNqQQfO1oxnJHZeoYcHg@mail.gmail.com
    help?
    The kernel actually crashes with a backtrace:

    ABCDGHIJK
    [ 0.000000] Linux version 6.0.0-6-m68k (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for
    Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
    [ 0.000000] Enabling workaround for errata I14
    [ 0.000000] printk: bootconsole [debug0] enabled
    [ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA
    LISA ALICE_PAL ZORRO3
    [ 0.000000] initrd: 0ef0602c - 0f800000
    [ 0.000000] Zone ranges:
    [ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
    [ 0.000000] Normal empty
    [ 0.000000] Movable zone start for each node
    [ 0.000000] Early memory node ranges
    [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]

    In both your case and Kars', the memory does not start at 0x0. Kars
    finds all memory reserved on his HP.

    6.2rc8 boots fine on my 030 (memory starting at 0x0).
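
Editorial sketch (not from the thread): whether a machine's RAM starts at 0x0 can be read straight off the `node 0:` line of the boot log. The Python below is a minimal illustration with a made-up helper name, `parse_node_range`, applied to the line quoted above.

```python
import re

def parse_node_range(line):
    """Extract the inclusive (start, end) physical range from a '[mem a-b]' log line."""
    m = re.search(r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]", line)
    return int(m.group(1), 16), int(m.group(2), 16)

line = "[ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]"
start, end = parse_node_range(line)

size_mib = (end - start + 1) // (1024 * 1024)  # end address is inclusive
starts_at_zero = (start == 0)
print(f"node starts at {start:#x}, spans {size_mib} MiB, starts at zero: {starts_at_zero}")
```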

    [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
    [ 0.000000] Unable to handle kernel access at virtual address (ptrval)
    [ 0.000000] Oops: 00000000
    [ 0.000000] Modules linked in:
    [ 0.000000] PC: [<00201d3c>] memcmp+0x28/0x56
    [ 0.000000] SR: 2709 SP: (ptrval) a2: 004a5580
    [ 0.000000] d0: 00000003 d1: 00000001 d2: 00201d14 d3: 00000272
    [ 0.000000] d4: 00012750 d5: 08023ec0 a0: 0000000c a1: 0f7ffff4

    a1 is just before the end of your RAM chunk. If that's a longword
    access, you'd fall over the edge :) Can you disassemble the code snippet
    (or memcmp()) so we can see what's happening?

    I do recall recent changes to the mm code, but that was for NOMMU. I
    wonder whether there was anything else that would introduce an implicit assumption about memory starting at 0x0 ...

    [ 0.000000] Process swapper (pid: 0, task=(ptrval))
    [ 0.000000] Frame format=4 fault addr=0f7ffff4 fslw=01051000
    [ 0.000000] Stack from 004a3fac:
    [ 0.000000] 00201d14 00000272 00374e40 0f7ffff4 0f800000 00534b22 0f7ffff4 0042e325
    [ 0.000000] 0000000c 0055c000 00000272 00012750 08023ec0 00012750 080dbf48 08001000
    [ 0.000000] 08001000 0f7ffff0 00553d9a 00000000 00533872
    [ 0.000000] Call Trace: [<00201d14>] memcmp+0x0/0x56
    [ 0.000000] [<00374e40>] _printk+0x0/0x18
    [ 0.000000] [<00534b22>] start_kernel+0x8a/0x5d6
    [ 0.000000] [<00012750>] LOGTBL+0x228/0x800
    [ 0.000000] [<00012750>] LOGTBL+0x228/0x800
    [ 0.000000] [<00533872>] _sinittext+0x872/0x11f8
    [ 0.000000]
    [ 0.000000] Code: b288 661e 4280 6030 2a49 284b 264c 224d <bb8c> 66ea 5988 7003 b088 65f0 224d 264c 60dc 4283 1631 1800 4282 1433 1800
    2003
    [ 0.000000] Disabling lock debugging due to kernel taint
    [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
    [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

    FWIW, I noticed that the kernel image itself is already over 7 MB, not sure whether this is a problem.
    Depends on how much RAM you have ;-)
    128 MB.

    Anyone else tried a recent kernel on their Amigas?
    I really should start booting on real Amiga hardware again...
    You should ;-).

    Thirded :-)

    Cheers,

        Michael


    Adrian


  • From Michael Schmitz@21:1/5 to John Paul Adrian Glaubitz on Wed Feb 22 02:00:02 2023
    Hi Adrian,

    On 22/02/23 10:46, John Paul Adrian Glaubitz wrote:
    Hi Michael!

    On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
    a1 is just  before the end of your RAM chunk. If that's a longword

    Actually it isn't that close - if I read the stack correctly, we're
    comparing 0xc bytes starting at 0x0f7ffff4, which runs up to 0x0f7fffff.

    The post-increment of a5 to 0x0f800000 might cause a pre-fetch beyond
    end of memory - how does that get handled?
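
Editorial sketch (not from the thread): the address arithmetic in the two paragraphs above can be checked in a few lines of Python. The constants are copied from the boot log and register dump; the variable names are made up for illustration.

```python
NODE_START = 0x08000000  # from "node 0: [mem ...]" in the boot log
NODE_END = 0x0F7FFFFF    # last valid byte of the node (inclusive)

base = 0x0F7FFFF4        # a1 and "fault addr" in the register dump
count = 0xC              # a0 in the register dump

last_byte = base + count - 1    # final byte the compare should touch
a5_after_loop = base + count    # pointer value after the last post-increment

in_range = NODE_START <= base and last_byte <= NODE_END
print(f"last byte {last_byte:#x}, a5 ends at {a5_after_loop:#x}, in range: {in_range}")
```

On these numbers the compare itself stays inside the memory node; only the pointer value left in a5 afterwards lies one byte past the end.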

    access, you'd fall over the edge :) Can you disassemble the code snippet
    (or memcmp()) so we can see what's happening?
    Here you go:

    00201d14 <memcmp>:
    201d14: 48e7 301c moveml %d2-%d3/%a3-%a5,%sp@-
    201d18: 226f 0018 moveal %sp@(24),%a1
    201d1c: 266f 001c moveal %sp@(28),%a3
    201d20: 206f 0020 moveal %sp@(32),%a0
    201d24: 7003 moveq #3,%d0
    201d26: b088 cmpl %a0,%d0
    201d28: 650a bcss 201d34 <memcmp+0x20>
    201d2a: 4281 clrl %d1
    201d2c: b288 cmpl %a0,%d1
    201d2e: 661e bnes 201d4e <memcmp+0x3a>
    201d30: 4280 clrl %d0
    201d32: 6030 bras 201d64 <memcmp+0x50>
    201d34: 2a49 moveal %a1,%a5 <======= 0x0f7ffff4
    201d36: 284b moveal %a3,%a4
    201d38: 264c moveal %a4,%a3
    201d3a: 224d moveal %a5,%a1
    201d3c: bb8c cmpml %a4@+,%a5@+ <======= a5 will be 0x0f800000 after post-increment
    201d3e: 66ea bnes 201d2a <memcmp+0x16>
    201d40: 5988 subql #4,%a0
    201d42: 7003 moveq #3,%d0
    201d44: b088 cmpl %a0,%d0
    201d46: 65f0 bcss 201d38 <memcmp+0x24>
    201d48: 224d moveal %a5,%a1
    201d4a: 264c moveal %a4,%a3
    201d4c: 60dc bras 201d2a <memcmp+0x16>
    201d4e: 4283 clrl %d3
    201d50: 1631 1800 moveb %a1@(0,%d1:l),%d3
    201d54: 4282 clrl %d2
    201d56: 1433 1800 moveb %a3@(0,%d1:l),%d2
    201d5a: 2003 movel %d3,%d0
    201d5c: 9082 subl %d2,%d0
    201d5e: 5281 addql #1,%d1
    201d60: b483 cmpl %d3,%d2
    201d62: 67c8 beqs 201d2c <memcmp+0x18>
    201d64: 4cdf 380c moveml %sp@+,%d2-%d3/%a3-%a5
    201d68: 4e75 rts

    The kernel image is actually unstripped. Is there a config option for that?

    I'm sure the compressed kernel image is stripped but includes the kernel
    symbol table (see below). The symbol table is definitely good to have
    (otherwise you'd have to figure out what all the addresses on the stack
    mean from a separate symbol table).

    Do we want to keep symbols in a non-debug kernel?

    Definitely ...

    Cheers,

        Michael

    Output of objdump -h:

    vmlinux-6.2.0-rc8-atari-fpuemu-atafbfix+:     file format elf32-m68k

    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         0030169c  00001000  00001000  00001000  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
      1 __ex_table    00001ab0  003026a0  003026a0  003026a0  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      2 .rodata       000c81e8  00305000  00305000  00305000  2**4
                      CONTENTS, ALLOC, LOAD, DATA
      3 __ksymtab     00009a14  003cd1e8  003cd1e8  003cd1e8  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 __ksymtab_gpl 000057c0  003d6bfc  003d6bfc  003d6bfc  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      5 __ksymtab_strings 000166a3  003dc3bc  003dc3bc  003dc3bc  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      6 __param       000006cc  003f2a60  003f2a60  003f2a60  2**1
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      7 __modver      00000088  003f312c  003f312c  003f312c  2**1
                      CONTENTS, ALLOC, LOAD, DATA
      8 .notes        00000054  003f31b4  003f31b4  003f31b4  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      9 .data         00051a20  003f4000  003f4000  003f4000  2**4
                      CONTENTS, ALLOC, LOAD, DATA
     10 .bss          0002266c  00445a20  00445a20  00445a20  2**4
                      ALLOC
     11 .init.text    00017be0  00469000  00469000  00447000  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     12 .init.data    00004c1c  00480be0  00480be0  0045ebe0  2**2
                      CONTENTS, ALLOC, LOAD, DATA
     13 .m68k_fixup   00000480  004857fc  004857fc  004637fc  2**0
                      CONTENTS, ALLOC, LOAD, DATA
     14 .init_end     00000384  00485c7c  00485c7c  00463c7c  2**0
                      ALLOC
     15 .comment      0000002d  00000000  00000000  00463c7c  2**0
                      CONTENTS, READONLY
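
Editorial sketch (not from the thread): section sizes in `objdump -h` output like the listing above can be totalled with a short script. The Python below is a minimal illustration; `parse_sections` is a made-up helper, the sample holds only three of the sections shown, and the two-line layout (header row, then flags) is assumed.

```python
import re

# One section header row: index, name, size, VMA, LMA, file offset, alignment.
ROW = re.compile(
    r"^\s*\d+\s+(\S+)\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+[0-9a-f]{8}\s+[0-9a-f]{8}\s+2\*\*\d+"
)

def parse_sections(text):
    """Return a list of (name, size, vma, flags) tuples from `objdump -h` output."""
    sections = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            sections.append([m.group(1), int(m.group(2), 16), int(m.group(3), 16), []])
        elif sections and line.strip() and not sections[-1][3]:
            # The line after a header row carries that section's flags.
            sections[-1][3] = [f.strip() for f in line.split(",")]
    return [tuple(s) for s in sections]

sample = """\
  0 .text         0030169c  00001000  00001000  00001000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 10 .bss          0002266c  00445a20  00445a20  00445a20  2**4
                  ALLOC
 15 .comment      0000002d  00000000  00000000  00463c7c  2**0
                  CONTENTS, READONLY
"""

sections = parse_sections(sample)
loaded = sum(size for _, size, _, flags in sections if "LOAD" in flags)
print(f"{len(sections)} sections, {loaded:#x} bytes marked LOAD")
```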


    I do recall recent changes to the mm code, but that was for NOMMU. I
    wonder whether there was anything else that would introduce an implicit
    assumption about memory starting at 0x0 ...
    Sounds like a possible culprit.

    Adrian


  • From Michael Schmitz@21:1/5 to Michael Schmitz on Thu Feb 23 19:30:01 2023
    Correcting myself again...

    On 22/02/23 13:53, Michael Schmitz wrote:
    Hi Adrian,

    On 22/02/23 10:46, John Paul Adrian Glaubitz wrote:
    Hi Michael!

    On Wed, 2023-02-22 at 10:09 +1300, Michael Schmitz wrote:
    a1 is just  before the end of your RAM chunk. If that's a longword

    Actually it isn't that close - if I read the stack correctly, we're
    comparing 0xc bytes starting at 0x0f7ffff4, which runs up to 0x0f7fffff.

    The post-increment of a5 to 0x0f800000 might cause a pre-fetch beyond
    end of memory - how does that get handled?

    The stack frame format in this case (at least, going by the 68000 series
    PRM) seems to indicate it's not something to do with prefetch.

    Can you try Kars' recent patch? Maybe the old bug calculating the RAM
    end address only now got 'active' on your configuration due to more
    recent MM changes?

    Cheers,

        Michael


    access, you'd fall over the edge :) Can you disassemble the code snippet
    (or memcmp()) so we can see what's happening?
    Here you go:

    00201d14 <memcmp>:
       201d14:       48e7 301c       moveml %d2-%d3/%a3-%a5,%sp@-
       201d18:       226f 0018       moveal %sp@(24),%a1
       201d1c:       266f 001c       moveal %sp@(28),%a3
       201d20:       206f 0020       moveal %sp@(32),%a0
       201d24:       7003            moveq #3,%d0
       201d26:       b088            cmpl %a0,%d0
       201d28:       650a            bcss 201d34 <memcmp+0x20>
       201d2a:       4281            clrl %d1
       201d2c:       b288            cmpl %a0,%d1
       201d2e:       661e            bnes 201d4e <memcmp+0x3a>
       201d30:       4280            clrl %d0
       201d32:       6030            bras 201d64 <memcmp+0x50>
       201d34:       2a49            moveal %a1,%a5 <=======  0x0f7ffff4
       201d36:       284b            moveal %a3,%a4
       201d38:       264c            moveal %a4,%a3
       201d3a:       224d            moveal %a5,%a1
       201d3c:       bb8c            cmpml %a4@+,%a5@+ <=======  a5 will be 0x0f800000 after post-increment
       201d3e:       66ea            bnes 201d2a <memcmp+0x16>
       201d40:       5988            subql #4,%a0
       201d42:       7003            moveq #3,%d0
       201d44:       b088            cmpl %a0,%d0
       201d46:       65f0            bcss 201d38 <memcmp+0x24>
       201d48:       224d            moveal %a5,%a1
       201d4a:       264c            moveal %a4,%a3
       201d4c:       60dc            bras 201d2a <memcmp+0x16>
       201d4e:       4283            clrl %d3
       201d50:       1631 1800       moveb %a1@(0,%d1:l),%d3
       201d54:       4282            clrl %d2
       201d56:       1433 1800       moveb %a3@(0,%d1:l),%d2
       201d5a:       2003            movel %d3,%d0
       201d5c:       9082            subl %d2,%d0
       201d5e:       5281            addql #1,%d1
       201d60:       b483            cmpl %d3,%d2
       201d62:       67c8            beqs 201d2c <memcmp+0x18>
       201d64:       4cdf 380c       moveml %sp@+,%d2-%d3/%a3-%a5
       201d68:       4e75            rts

    The kernel image is actually unstripped. Is there a config option for
    that?
    I'm sure the compressed kernel image is stripped but includes the
    kernel symbol table (see below). The symbol table is definitely good
    to have (otherwise you'd have to figure what all the addresses on the
    stack mean from a separate symbol table).
    Do we want to keep symbols in a non-debug kernel?

    Definitely ...

    Cheers,

        Michael

    Output of objdump -h:

    vmlinux-6.2.0-rc8-atari-fpuemu-atafbfix+:     file format elf32-m68k

    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         0030169c  00001000  00001000  00001000  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
      1 __ex_table    00001ab0  003026a0  003026a0  003026a0  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      2 .rodata       000c81e8  00305000  00305000  00305000  2**4
                      CONTENTS, ALLOC, LOAD, DATA
      3 __ksymtab     00009a14  003cd1e8  003cd1e8  003cd1e8  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 __ksymtab_gpl 000057c0  003d6bfc  003d6bfc  003d6bfc  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      5 __ksymtab_strings 000166a3  003dc3bc  003dc3bc  003dc3bc  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      6 __param       000006cc  003f2a60  003f2a60  003f2a60  2**1
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      7 __modver      00000088  003f312c  003f312c  003f312c  2**1
                      CONTENTS, ALLOC, LOAD, DATA
      8 .notes        00000054  003f31b4  003f31b4  003f31b4  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      9 .data         00051a20  003f4000  003f4000  003f4000  2**4
                      CONTENTS, ALLOC, LOAD, DATA
     10 .bss          0002266c  00445a20  00445a20  00445a20  2**4
                      ALLOC
     11 .init.text    00017be0  00469000  00469000  00447000  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     12 .init.data    00004c1c  00480be0  00480be0  0045ebe0  2**2
                      CONTENTS, ALLOC, LOAD, DATA
     13 .m68k_fixup   00000480  004857fc  004857fc  004637fc  2**0
                      CONTENTS, ALLOC, LOAD, DATA
     14 .init_end     00000384  00485c7c  00485c7c  00463c7c  2**0
                      ALLOC
     15 .comment      0000002d  00000000  00000000  00463c7c  2**0
                      CONTENTS, READONLY


    I do recall recent changes to the mm code, but that was for NOMMU. I
    wonder whether there was anything else that would introduce an implicit
    assumption about memory starting at 0x0 ...
    Sounds like a possible culprit.

    Adrian


  • From Stephen Walsh@21:1/5 to All on Fri Feb 24 02:10:01 2023
    FYI:

    Just caught this while trying to recompile kernel 5.15.2 from kernel.org.
    Builds under debootstrap/sbuild and under qemu-system-m68k both produce this issue:


    CC mm/process_vm_access.o
    CC mm/page_alloc.o
    mm/page_alloc.c: In function ‘mem_init_print_info’:
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&__init_begin[0] <= &_sinittext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &__init_end[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &_sinittext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &_etext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__init_begin[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__init_begin[0] < &_edata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &__start_rodata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_etext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__start_rodata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_edata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    CC mm/init-mm.o
    CC mm/memblock.o
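
Editorial sketch (not from the thread): the long build log above boils down to a handful of comparisons that GCC suggests rewriting. The Python below pulls those suggestions out of the `note: use ... to compare the addresses` lines. The helper name `suggested_comparisons` is made up, and the sample holds only three lines copied from the log.

```python
import re

# GCC wraps the suggested expression in typographic quotes.
SUGGESTION_RE = re.compile(r"note: use ‘(.+?)’ to compare the addresses")

def suggested_comparisons(log):
    """Return GCC's suggested address comparisons in the order they appear."""
    return [m.group(1) for m in SUGGESTION_RE.finditer(log)]

log = """\
mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
mm/page_alloc.c:8163:27: note: use ‘&__init_begin[0] <= &_sinittext[0]’ to compare the addresses
mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &__init_end[0]’ to compare the addresses
"""

fixes = suggested_comparisons(log)
for fix in fixes:
    print(fix)
```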






    --
    Stephen - Vk3heg

  • From Stephen Walsh@21:1/5 to John Paul Adrian Glaubitz on Fri Feb 24 01:20:01 2023
    Hi Adrian,

    On Tue, 21 Feb 2023 15:50:52 +0100
    John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    Will try earlier kernels until I find the one where the breakage was introduced. Currently known latest kernel to work is 5.10.5.

    From my testing last year trying to boot my Amiga 3000, the break
    happens sometime after 5.15.0-2. (The last working kernel for me)

    I've not been able to successfully boot any later kernels since.

    All my attempts to compile my own kernel have failed.


    --
    Stephen - Vk3heg

  • From Michael Schmitz@21:1/5 to All on Fri Feb 24 04:10:01 2023
    Hi Stephen,

    That's apparently been corrected in later versions: commit ca831f29f8f25c97182e726429b38c0802200c8f (included from 5.17 onward).

    I doubt this would lead to different code generated.

    Which was the first broken version you tried? That would narrow down the
    search range considerably...
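
Editorial sketch (not from the thread): once one booting and one failing version are known, the search for the first broken one is a binary search. The Python below illustrates the idea. The candidate list and the `pretend_first_bad` value are hypothetical and exist only to exercise the function; on real hardware `boots` would mean building and booting that version.

```python
def first_bad(versions, boots):
    """Binary search over versions sorted oldest to newest.

    Assumes versions[0] boots and versions[-1] does not.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] boots, versions[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]

# Hypothetical candidates; the intermediate results are NOT known from the thread.
candidates = ["5.15", "5.16", "5.17", "5.18", "5.19", "6.0"]
pretend_first_bad = "5.18"  # made up purely to drive the example

def boots(version):
    return candidates.index(version) < candidates.index(pretend_first_bad)

print(first_bad(candidates, boots))
```

With six candidates this takes two boot tests instead of four. `git bisect` applies the same search to individual commits.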

    Cheers,

    Michael




    On 24.02.2023 at 14:09, Stephen Walsh wrote:
    FYI:

    Just caught this while trying to recompile kernel 5.15.2 from kernel.org.
    Builds under debootstrap/sbuild and under qemu-system-m68k both produce this issue:


    CC mm/process_vm_access.o
    CC mm/page_alloc.o
    mm/page_alloc.c: In function ‘mem_init_print_info’:
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&__init_begin[0] <= &_sinittext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &__init_end[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8167:9: note: in expansion of macro ‘adj_init_size’
    8167 | adj_init_size(__init_begin, __init_end, init_data_size,
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &_sinittext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&_sinittext[0] < &_etext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8169:9: note: in expansion of macro ‘adj_init_size’
    8169 | adj_init_size(_stext, _etext, codesize, _sinittext, init_code_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__init_begin[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__init_begin[0] < &_edata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8170:9: note: in expansion of macro ‘adj_init_size’
    8170 | adj_init_size(_sdata, _edata, datasize, __init_begin, init_data_size);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_stext[0] <= &__start_rodata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_etext[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8171:9: note: in expansion of macro ‘adj_init_size’
    8171 | adj_init_size(_stext, _etext, codesize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:27: note: use ‘&_sdata[0] <= &__start_rodata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^~
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: warning: comparison between two arrays [-Warray-compare]
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    mm/page_alloc.c:8163:41: note: use ‘&__start_rodata[0] < &_edata[0]’ to compare the addresses
    8163 | if (start <= pos && pos < end && size > adj) \
    | ^
    mm/page_alloc.c:8172:9: note: in expansion of macro ‘adj_init_size’
    8172 | adj_init_size(_sdata, _edata, datasize, __start_rodata, rosize);
    | ^~~~~~~~~~~~~
    CC mm/init-mm.o
    CC mm/memblock.o







  • From John Paul Adrian Glaubitz@21:1/5 to Michael Schmitz on Fri Feb 24 20:50:01 2023
    Hi Michael!

    On Sat, 2023-02-25 at 08:39 +1300, Michael Schmitz wrote:
    the only commits to hit arch/m68k/mm between 5.15 and now are:

    29f28f8b826d m68k: fix livelock in uaccess
    6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
    d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
    f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
    05d51e42df06 m68k: Introduce a virtual m68k machine
    c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
    36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
    0e25498f8cd4 exit: Add and use make_task_dead.
    376e3fdecb0d m68k: Enable memtest functionality
    952eea9b01e4 memblock: allow to specify flags with memblock_add_node()

    The first is a fix for the second so these should be tested together.
    None appear suspect to me.

    Running memtest could incur a boot delay but AFAIR that isn't enabled by default, and it isn't implicated in the panic log Adrian posted.

    I don't have time this weekend to bisect the issue, but I think I can start bisecting it on Sunday evening. I will give it a try on Amiga Forever.

    Adrian

    --
    .''`. John Paul Adrian Glaubitz
    : :' : Debian Developer
    `. `' Physicist
    `- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

  • From Michael Schmitz@21:1/5 to All on Fri Feb 24 20:50:01 2023
    Hi Stephen, Adrian

    the only commits to hit arch/m68k/mm between 5.15 and now are:

    29f28f8b826d m68k: fix livelock in uaccess
    6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
    d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
    f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
    05d51e42df06 m68k: Introduce a virtual m68k machine
    c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
    36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
    0e25498f8cd4 exit: Add and use make_task_dead.
    376e3fdecb0d m68k: Enable memtest functionality
    952eea9b01e4 memblock: allow to specify flags with memblock_add_node()

    The first is a fix for the second so these should be tested together.
    None appear suspect to me.

    Running memtest could incur a boot delay but AFAIR that isn't enabled by default, and it isn't implicated in the panic log Adrian posted.
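
    A list like the one above can be produced by restricting `git log` to a path. The sketch below only wraps that command; whether the list in this message was generated this way is not stated, and the repository location and the `v5.15` tag name are assumptions.

```python
import subprocess
from typing import List


def build_log_cmd(old: str, new: str, path: str) -> List[str]:
    """Build a `git log` command listing commits in old..new that touch path."""
    return ["git", "log", "--oneline", f"{old}..{new}", "--", path]


def list_commits(repo: str, old: str, new: str, path: str) -> List[str]:
    """Run the command inside `repo`; returns one 'hash subject' string per commit."""
    result = subprocess.run(
        build_log_cmd(old, new, path),
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


# Example call (needs a kernel checkout; the path is a placeholder):
# for line in list_commits("/path/to/linux", "v5.15", "HEAD", "arch/m68k/mm"):
#     print(line)
```

    Narrowing the range, for example to `v5.16..v5.17`, keeps the list short enough to review by hand.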

    Cheers,

    Michael

  • From Michael Schmitz@21:1/5 to All on Fri Feb 24 21:50:01 2023
    Hi Adrian,

    On 25.02.2023 at 08:49, John Paul Adrian Glaubitz wrote:
    Hi Michael!

    On Sat, 2023-02-25 at 08:39 +1300, Michael Schmitz wrote:
    the only commits to hit arch/m68k/mm between 5.15 and now are:

    29f28f8b826d m68k: fix livelock in uaccess
    6d0b92254510 m68k/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
    d92725256b4f mm: avoid unnecessary page fault retires on shared memory types
    f95a387cdeb3 m68k: coldfire: drop ISA_DMA_API support
    05d51e42df06 m68k: Introduce a virtual m68k machine
    c4d5b6eef258 m68k: mm: Remove check for VM_IO to fix deferred I/O
    36ef159f4408 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
    0e25498f8cd4 exit: Add and use make_task_dead.
    376e3fdecb0d m68k: Enable memtest functionality
    952eea9b01e4 memblock: allow to specify flags with memblock_add_node()

    The first is a fix for the second so these should be tested together.
    None appear suspect to me.

    Running memtest could incur a boot delay but AFAIR that isn't enabled by
    default, and it isn't implicated in the panic log Adrian posted.

    I don't have time this weekend to bisect the issue, but I think I can start bisecting it on Sunday evening. I will give it a try on Amiga Forever.

    I had hoped we could maybe narrow down the range to bisect by compile...
    As it stands, testing each Debian kernel image released since 5.15.2
    already requires a bisect approach so you indeed have your work cut out
    for you.
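
    Bisecting over released images is a binary search over an ordered version list. A minimal sketch follows, assuming the oldest entry boots, the newest fails, and everything after the first failure also fails. The version strings are ones mentioned in this thread; `simulated_boots` is a made-up stand-in for actually booting an image and is not a test result.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Return the earliest failing version.

    Assumes versions[0] boots, versions[-1] fails, and the list is ordered
    so that every version after the first failure fails as well.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Illustrative subset only; the real list of Debian images is much longer.
VERSIONS = ["5.10.5", "5.15.2", "5.17.3", "5.18.14", "6.0.12"]


def simulated_boots(version: str) -> bool:
    # Hypothetical outcome, used only to exercise the search.
    return version in {"5.10.5", "5.15.2"}


print(first_bad(VERSIONS, simulated_boots))
```

    Each step halves the remaining range, so N images need about log2(N) test boots.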

    Let me know what you find - the list of commits in mm/ is too huge to contemplate in its entirety but might be easier to digest from one
    release to another.

    Cheers,

    Michael



    Adrian


  • From Geert Uytterhoeven@21:1/5 to glaubitz@physik.fu-berlin.de on Sun Feb 26 12:10:01 2023
    Hi Adrian,

    On Tue, Feb 21, 2023 at 4:53 PM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
    On Tue, 2023-02-21 at 15:55 +0100, Geert Uytterhoeven wrote:
    Looks surprisingly similar to the issue reported by Stan.
    Do the mitigations given in https://lore.kernel.org/all/CAMuHMdUtkr2zvZiJfLXvs9d_inJbktSNqQQfO1oxnJHZeoYcHg@mail.gmail.com
    help?

    The kernel actually crashes with a backtrace:

    ABCDGHIJK
    [ 0.000000] Linux version 6.0.0-6-m68k (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #1 Debian 6.0.12-1 (2022-12-09)
    [ 0.000000] Enabling workaround for errata I14
    [ 0.000000] printk: bootconsole [debug0] enabled
    [ 0.000000] Amiga hardware found: [A4000] VIDEO BLITTER AUDIO FLOPPY A4000_IDE KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA LISA ALICE_PAL ZORRO3
    [ 0.000000] initrd: 0ef0602c - 0f800000
    [ 0.000000] Zone ranges:
    [ 0.000000] DMA [mem 0x0000000008000000-0x000000f7ffffffff]
    [ 0.000000] Normal empty
    [ 0.000000] Movable zone start for each node
    [ 0.000000] Early memory node ranges
    [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000f7fffff]
    [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000f7fffff]
    [ 0.000000] Unable to handle kernel access at virtual address (ptrval)

    I see the same issue on my A4000, bisecting...

    Gr{oetje,eeting}s,

    Geert

    --
    Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

    In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that.
    -- Linus Torvalds

  • From Stephen Walsh@21:1/5 to All on Sun Feb 26 12:50:01 2023
    Hi Michael,

    that's apparently been corrected in later versions. Commit ca831f29f8f25c97182e726429b38c0802200c8f (in from 5.17).

    I doubt this would lead to different code generated.

    Which was the first broken version you tried? That would narrow down
    the search range considerably...

    Version 5.16-02-m68k is the last working version for me on my A3000
    (reported back in September last year; only now have I been able to get
    back to it).

    I downloaded the kernel image debs from snapshot.debian.org.

    Versions 5.16-3 through 5.16-6 boot but fall back to the initramfs,
    saying they can't find the root file system. The hard disk is listed
    during the boot process, though.

    It's not the size of the initramfs but a kernel issue: I changed the
    initramfs settings from "most" to "dep", shrinking it in size, and
    still had kernel boot issues.

    Kernels from 5.17 up (5.17/5.19/6.x) fail and don't even start the heartbeat.

    Searching for SAVEKMSG magic...
    Found 2674 bytes at 0x001e0010

    [ 0.000000] Linux version 5.17.0-1-m68k (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.2.0-20) 11.2.0, GNU ld (GNU Binutils for Debian) 2.38) #1 Debian 5.17.3-1 (2022-04-18)
    [ 0.000000] printk: console [debug0] enabled
    [ 0.000000] Amiga hardware found: [A3000] VIDEO BLITTER AMBER_FF AUDIO FLOPPY A3000_SCSI KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA DENISE_HR AGNUS_HR_PAL MAGIC_REKICK ZORRO3
    [ 0.000000] initrd: 0f7f395d - 10000000
    [ 0.000000] Ignoring memory chunk at 0x7800000:0x800000 before the first chunk
    [ 0.000000] Fix your bootloader or use a memfile to make use of this area!
    [ 0.000000] Zone ranges:
    [ 0.000000] DMA [mem 0x0000000008000000-0x000000ffffffffff]
    [ 0.000000] Normal empty
    [ 0.000000] Movable zone start for each node
    [ 0.000000] Early memory node ranges
    [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000fffffff]
    [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000fffffff]
    [ 0.000000] Unable to handle kernel access at virtual address (ptrval)
    [ 0.000000] Oops: 00000000
    [ 0.000000] Modules linked in:
    [ 0.000000] PC: [<001edcac>] memcmp+0x2c/0x5c
    [ 0.000000] SR: 2700 SP: (ptrval) a2: 0047d530
    [ 0.000000] d0: 00408ab1 d1: 0ffffff8 d2: 001edc80 d3: 0000019e
    [ 0.000000] d4: 0804c588 d5: 0080c6a3 a0: 0000000c a1: 0ffffff4
    [ 0.000000] Process swapper (pid: 0, task=(ptrval))
    [ 0.000000] Frame format=7 eff addr=0047bfb8 ssw=0505 faddr=0ffffff4
    [ 0.000000] wb 1 stat/addr/data: 0005 0804c588 0080c6a3
    [ 0.000000] wb 2 stat/addr/data: 0005 0053c000 0000019e
    [ 0.000000] wb 3 stat/addr/data: 0005 0047bfb0 001edc80
    [ 0.000000] push data: 0080c6a3 00353a8a 08001000 08056094
    [ 0.000000] Stack from 0047bfb0:
    [ 0.000000] 001edc80 0000019e 00353a8a 00514b0e 0ffffff4 00408aad 0000000c 0053c000
    [ 0.000000] 0000019e 0804c588 0080c6a3 08050258 0000ffff 08062278 08001000 08056094
    [ 0.000000] 0ffffff0 005333b8 00000000 00513872
    [ 0.000000] Call Trace: [<001edc80>] memcmp+0x0/0x5c
    [ 0.000000] [<00353a8a>] _printk+0x0/0x18
    [ 0.000000] [<00514b0e>] start_kernel+0x86/0x5ca
    [ 0.000000] [<0000ffff>] sz_long+0x5/0x6
    [ 0.000000] [<00513872>] _sinittext+0x872/0x11f8
    [ 0.000000]
    [ 0.000000] Code: 4280 6036 2209 200b 2640 2241 5881 5880 <2411> b493 66e4 2241 2640 5988 7403 b488 65e6 60d6 4283 1631 1800 4282 1433 1800
    [ 0.000000] Disabling lock debugging due to kernel taint
    [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
    [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
    <<<<<<<<<<<<<<<<<<<<




    Searching for SAVEKMSG magic...
    Found 2632 bytes at 0x001e0010

    [ 0.000000] Linux version 5.18.0-3-m68k (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-4) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 Debian 5.18.14-1 (2022-07-23)
    [ 0.000000] printk: console [debug0] enabled
    [ 0.000000] Amiga hardware found: [A3000] VIDEO BLITTER AMBER_FF AUDIO FLOPPY A3000_SCSI KEYBOARD MOUSE SERIAL PARALLEL A3000_CLK CHIP_RAM PAULA DENISE_HR AGNUS_HR_PAL MAGIC_REKICK ZORRO3
    [ 0.000000] initrd: 0facd79f - 10000000
    [ 0.000000] Ignoring memory chunk at 0x7800000:0x800000 before the first chunk
    [ 0.000000] Fix your bootloader or use a memfile to make use of this area!
    [ 0.000000] Zone ranges:
    [ 0.000000] DMA [mem 0x0000000008000000-0x000000ffffffffff]
    [ 0.000000] Normal empty
    [ 0.000000] Movable zone start for each node
    [ 0.000000] Early memory node ranges
    [ 0.000000] node 0: [mem 0x0000000008000000-0x000000000fffffff]
    [ 0.000000] Initmem setup node 0 [mem 0x0000000008000000-0x000000000fffffff]
    [ 0.000000] Unable to handle kernel access at virtual address (ptrval)
    [ 0.000000] Oops: 00000000
    [ 0.000000] Modules linked in:
    [ 0.000000] PC: [<001ed3a8>] memcmp+0x2c/0x5c
    [ 0.000000] SR: 2700 SP: (ptrval) a2: 00481530
    [ 0.000000] d0: 0040af9d d1: 0ffffff8 d2: 001ed37c d3: 0000019e
    [ 0.000000] d4: 0806ba68 d5: 00532861 a0: 0000000c a1: 0ffffff4
    [ 0.000000] Process swapper (pid: 0, task=(ptrval))
    [ 0.000000] Frame format=7 eff addr=0047ffbc ssw=0505 faddr=0ffffff4
    [ 0.000000] wb 1 stat/addr/data: 0005 0806ba68 00532861
    [ 0.000000] wb 2 stat/addr/data: 0005 00536000 0000019e
    [ 0.000000] wb 3 stat/addr/data: 0005 0047ffb4 001ed37c
    [ 0.000000] push data: 00532861 00355728 08001000 0804dcc4
    [ 0.000000] Stack from 0047ffb4:
    [ 0.000000] 001ed37c 0000019e 00355728 0050fb0e 0ffffff4 0040af99 0000000c 00536000
    [ 0.000000] 0000019e 0806ba68 00532861 0806f738 08091288 08001000 0804dcc4 0ffffff0
    [ 0.000000] 0052e2b8 00000000 0050e872
    [ 0.000000] Call Trace: [<001ed37c>] memcmp+0x0/0x5c
    [ 0.000000] [<00355728>] _printk+0x0/0x18
    [ 0.000000] [<0050fb0e>] start_kernel+0x86/0x5a0
    [ 0.000000] [<0050e872>] _sinittext+0x872/0x11f8
    [ 0.000000]
    [ 0.000000] Code: 4280 6036 2209 200b 2640 2241 5881 5880 <2411> b493 66e4 2241 2640 5988 7403 b488 65e6 60d6 4283 1631 1800 4282 1433 1800
    [ 0.000000] Disabling lock debugging due to kernel taint
    [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
    [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
    <<<<<<<<<<<<<<<<<<<<
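
    Both dumps report the same fault address, so it may help to see where it sits relative to the other numbers in the log. The sketch below parses three lines copied from the 5.18.14 dump and compares the values purely as numbers. The log does not say whether the initrd values and the node range are in the same address space, so no such claim is made here; the function and key names are illustrative.

```python
import re

# Three lines copied from the 5.18.14 dump above.
LOG = """\
[ 0.000000] initrd: 0facd79f - 10000000
[ 0.000000] node 0: [mem 0x0000000008000000-0x000000000fffffff]
[ 0.000000] Frame format=7 eff addr=0047ffbc ssw=0505 faddr=0ffffff4
"""


def parse(log: str) -> dict:
    """Extract the initrd range, the node 0 range and the fault address."""
    initrd = re.search(r"initrd: ([0-9a-f]+) - ([0-9a-f]+)", log)
    node = re.search(r"node\s+0: \[mem 0x([0-9a-f]+)-0x([0-9a-f]+)\]", log)
    fault = re.search(r"faddr=([0-9a-f]+)", log)
    return {
        "initrd": (int(initrd.group(1), 16), int(initrd.group(2), 16)),
        "node0": (int(node.group(1), 16), int(node.group(2), 16)),
        "faddr": int(fault.group(1), 16),
    }


info = parse(LOG)
node_end_excl = info["node0"][1] + 1      # the printed node range is inclusive
gap = node_end_excl - info["faddr"]       # distance from the end of node 0
inside_initrd = info["initrd"][0] <= info["faddr"] < info["initrd"][1]

print(f"faddr is {gap} bytes below the end of node 0")
print(f"faddr within the printed initrd range: {inside_initrd}")
```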









    --
    Stephen - Vk3heg

  • From Geert Uytterhoeven@21:1/5 to geert@linux-m68k.org on Sun Feb 26 14:00:01 2023
    On Sun, Feb 26, 2023 at 12:02 PM Geert Uytterhoeven
    <geert@linux-m68k.org> wrote:
    I see the same issue on my A4000, bisecting...

    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes the
    issue.

    Gr{oetje,eeting}s,

    Geert

    --
    Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

    In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that.
    -- Linus Torvalds

  • From Michael Schmitz@21:1/5 to All on Mon Feb 27 03:10:01 2023
    Hi Geert, Stephen,

    On 27.02.2023 at 01:52, Geert Uytterhoeven wrote:
    On Sun, Feb 26, 2023 at 12:02 PM Geert Uytterhoeven
    <geert@linux-m68k.org> wrote:
    I see the same issue on my A4000, bisecting...

    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes the issue.

    Yes, I'm sorry to say that was the only likely candidate. Can't see why
    though - are Macs all configured to have RAM start at address zero, and possibly contiguous, Finn?

    Cheers,

    Michael



    Gr{oetje,eeting}s,

    Geert


  • From Michael Schmitz@21:1/5 to All on Mon Feb 27 03:20:01 2023
    Hi Geert,

    On 27.02.2023 at 01:52, Geert Uytterhoeven wrote:
    On Sun, Feb 26, 2023 at 12:02 PM Geert Uytterhoeven
    <geert@linux-m68k.org> wrote:
    I see the same issue on my A4000, bisecting...

    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes the
    issue.

    What about instead changing the piece of code that you identified as problematic in Kars' case to claim/map the last few bits as well (memblock_cap_size() to be precise)?

    I wonder whether Finn's memtest patch merely exposed another MM bug that
    we don't hit as easily (not without putting memory under a lot of pressure)?

    Cheers,

    Michael

    Gr{oetje,eeting}s,

    Geert


  • From Finn Thain@21:1/5 to Michael Schmitz on Mon Feb 27 07:10:01 2023
    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes
    the issue.

    Yes, I'm sorry to say that was the only likely candidate. Can't see why though - are Macs all configured to have RAM start at address zero, and possibly contiguous, Finn?


    I don't really understand your question. This was not a Mac patch. The
    issue seems to be about the locations initrd_start and initrd_end in
    relation to the various memory segments (?)

    This seems to be the same bug that was raised about 6 months ago... I had thought it was a bootloader bug but I'm out of my depth here.

    https://lists.debian.org/debian-68k/2022/09/msg00047.html
    https://lists.debian.org/debian-68k/2022/09/msg00051.html
    https://lists.debian.org/debian-68k/2022/09/msg00055.html

  • From Finn Thain@21:1/5 to Michael Schmitz on Mon Feb 27 07:50:01 2023
    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    I wonder whether Finn's memtest patch merely exposed another MM bug


    A kernel patch may be easier than a bootloader patch (even if this is a bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but
    with the side effect that use of memtest will clobber the initrd.

    The initrd and memtest features aren't usually needed together. At the
    time when I needed the memtest feature I did not have confidence in the hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
     		panic("No configuration setup");
     	}

    +	paging_init();
    +
     #ifdef CONFIG_BLK_DEV_INITRD
     	if (m68k_ramdisk.size) {
     		memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
    @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
     	}
     #endif

    -	paging_init();
    -
     #ifdef CONFIG_NATFEAT
     	nf_init();
     #endif

  • From Michael Schmitz@21:1/5 to All on Mon Feb 27 08:30:01 2023
    Hi Finn,

    On 27.02.2023 at 18:55, Finn Thain wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes
    the issue.

    Yes, I'm sorry to say that was the only likely candidate. Can't see why
    though - are Macs all configured to have RAM start at address zero, and
    possibly contiguous, Finn?


    I don't really understand your question. This was not a Mac patch. The
    issue seems to be about the locations initrd_start and initrd_end in
    relation to the various memory segments (?)

    I didn't realize that - thanks for pointing this out.


    This seems to be the same bug that was raised about 6 months ago... I had thought it was a bootloader bug but I'm out of my depth here.

    https://lists.debian.org/debian-68k/2022/09/msg00047.html
    https://lists.debian.org/debian-68k/2022/09/msg00051.html
    https://lists.debian.org/debian-68k/2022/09/msg00055.html

    I had forgotten all about that one... Thanks for jogging my memory!

    In this case though, the bug happens when the ramdisk is loaded in the
    lowest address memory chunk, at least at a lower address than the one
    the kernel runs from.

    The crashes in the above thread were all from boots where the initrd got
    loaded at the end of the memory chunk the kernel runs from.

    Time to try using copy_from_kernel_nofault() to copy the ramdisk into
    its final location? (just kidding)

    Cheers,

    Michael

  • From Geert Uytterhoeven@21:1/5 to fthain@linux-m68k.org on Mon Feb 27 09:30:01 2023
    Hi Finn,

    FTR, here is the diff of the dmesg between good and bad:

    +initrd: 07f61166 - 08000000

    This is wrong (note the 6 trailing zeros), as phys_to_virt() is not
    working correctly yet (module_fixup() is called from paging_init()).

    Zone ranges:
    DMA [mem 0x0000000007400000-0x0000007fffffffff]
    Normal empty
    Movable zone start for each node
    Early memory node ranges
    node 0: [mem 0x0000000007400000-0x0000000007ffffff]
    Initmem setup node 0 [mem 0x0000000007400000-0x0000000007ffffff]
    -initrd: 00b61166 - 00c00000

    This is correct (note the 5 trailing zeros).

    -pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
    -pcpu-alloc: [0] 0
    [...]
    +Unable to handle kernel access at virtual address (ptrval)
    +Oops: 00000000
    +Modules linked in:
    +PC: [<002c11be>] memcmp+0x2c/0x5c
    +SR: 2700 SP: (ptrval) a2: 003bd560
    +d0: 0035eb83 d1: 07fffff8 d2: 002c1192 d3: 000000e6
    +d4: 000684e8 d5: 00447000 a0: 0000000c a1: 07fffff4
    +Process swapper (pid: 0, task=(ptrval))
    +Frame format=7 eff addr=003bbfbc ssw=0505 faddr=07fffff4
    +wb 1 stat/addr/data: 0005 00447000 07401000
    +wb 2 stat/addr/data: 0005 000000e6 000684e8
    +wb 3 stat/addr/data: 0005 003bbfb4 002c1192
    +push data: 07401000 002c7d82 07401000 074a2cf4
    +Stack from 003bbfb4:
    +002c1192 000000e6 002c7d82 00428eda 07fffff4 0035eb7f 0000000c 00447000
    +000000e6 000684e8 00447000 07401000 074bec08 07401000 074a2cf4 07fffff0
    +00440406 00000000 00428322
    +Call Trace: [<002c1192>] memcmp+0x0/0x5c
    +[<002c7d82>] _printk+0x0/0x18
    +[<00428eda>] start_kernel+0x80/0x5b0
    +[<000684e8>] pcpu_alloc+0x88/0x3b4
    +[<00428322>] _sinittext+0x322/0x9b0
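
    The two initrd lines in the diff above differ by exactly the physical base of the RAM chunk (0x07400000), which fits the explanation that phys_to_virt() was used before its fixup. A minimal sketch of that arithmetic follows. The model_* names and the single-offset translation are assumptions for illustration, not the kernel's implementation. The sketch also shows that faddr=07fffff4 lies 12 bytes below the bogus end address 08000000; the length of 12 is simply read off that distance, and what the kernel was comparing there is not established by the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one offset separates physical and virtual
 * addresses. It stays 0 until model_paging_init() has run. */
static uint32_t model_memoffset;

/* Stand-in for the address fixup triggered from paging_init(). */
static void model_paging_init(uint32_t ram_base)
{
	model_memoffset = ram_base;
}

/* Stand-in for phys_to_virt(): an identity mapping before the fixup. */
static uint32_t model_phys_to_virt(uint32_t phys)
{
	return phys - model_memoffset;
}

/* Start address of the last 'len' bytes below 'end'. */
static uint32_t model_tail(uint32_t end, uint32_t len)
{
	return end - len;
}

/* Replays the numbers from the dmesg diff. */
static void model_replay_dmesg(void)
{
	/* "bad" boot: translation requested before the fixup */
	assert(model_phys_to_virt(0x07f61166u) == 0x07f61166u);

	model_paging_init(0x07400000u);

	/* "good" boot: translation requested after the fixup */
	assert(model_phys_to_virt(0x07f61166u) == 0x00b61166u);
	assert(model_phys_to_virt(0x08000000u) == 0x00c00000u);
}
```

    Calling model_replay_dmesg() reproduces both initrd lines, and model_tail(0x08000000, 12) yields 0x07fffff4, the faulting address in the oops.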

    On Mon, Feb 27, 2023 at 7:30 AM Finn Thain <fthain@linux-m68k.org> wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:
    I wonder whether Finn's memtest patch merely exposed another MM bug

    A kernel patch may be easier than a bootloader patch (even if this is a bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but
    with the side effect that use of memtest will clobber the initrd.

    Which we can avoid, by moving the ramdisk handling inside paging_init().

    The initrd and memtest features aren't usually needed together. At the
    time when I needed the memtest feature I did not have confidence in the hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    Presumably something in memblock_reserve() relies on having
    called paging_init() before?

    I'll do some more debugging later today...

    @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
    }
    #endif

    - paging_init();
    -
    #ifdef CONFIG_NATFEAT
    nf_init();
    #endif



    --
    Gr{oetje,eeting}s,

    Geert


  • From Finn Thain@21:1/5 to All on Mon Feb 27 09:20:01 2023
    On Mon, 27 Feb 2023, I wrote:

    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    I wonder whether Finn's memtest patch merely exposed another MM bug


    A kernel patch may be easier than a bootloader patch (even if this is a bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but
    with the side effect that use of memtest will clobber the initrd.


    Maybe that's for the best now that the initrd/initramfs has grown so
    large. That portion of memory is presently skipped by memtest, which means you'd have to disable the initrd to get good coverage from memtest anyway.
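
    To put a number on the skipped portion, here is a minimal sketch, assuming the memory test leaves out one reserved range and nothing else. The helper names are hypothetical; the addresses are the RAM chunk and the untranslated (physical) initrd range from Geert's dmesg diff earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of [start, end) that overlap the reserved range [rs, re). */
static uint32_t model_overlap(uint32_t start, uint32_t end,
			      uint32_t rs, uint32_t re)
{
	uint32_t lo = start > rs ? start : rs;
	uint32_t hi = end < re ? end : re;

	return hi > lo ? hi - lo : 0;
}

/* Bytes of the chunk still tested when the reserved range is skipped. */
static uint32_t model_tested_bytes(uint32_t start, uint32_t end,
				   uint32_t rs, uint32_t re)
{
	return (end - start) - model_overlap(start, end, rs, re);
}

/* Chunk 0x07400000-0x07ffffff, initrd 0x07f61166-0x08000000. */
static void model_replay_coverage(void)
{
	assert(model_overlap(0x07400000u, 0x08000000u,
			     0x07f61166u, 0x08000000u) == 0x0009ee9au);
	assert(model_tested_bytes(0x07400000u, 0x08000000u,
				  0x07f61166u, 0x08000000u) == 0x00b61166u);
}
```

    With those numbers about 0x9ee9a bytes of the 12 MiB chunk go untested; a larger initrd increases the skipped share accordingly.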

    The initrd and memtest features aren't usually needed together. At the
    time when I needed the memtest feature I did not have confidence in the hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
    @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
    }
    #endif

    - paging_init();
    -
    #ifdef CONFIG_NATFEAT
    nf_init();
    #endif



  • From Michael Schmitz@21:1/5 to All on Mon Feb 27 10:50:01 2023
    Hi Geert,

    Adding Mike Rapoport to the recipient list; he would know whether memblock_reserve() relies on paging_init() having run.

    Cheers,

    Michael

    On 27.02.2023 at 21:26, Geert Uytterhoeven wrote:
    Hi Finn,

    FTR, here is the diff of the dmesg between good and bad:

    +initrd: 07f61166 - 08000000

    This is wrong (note the 6 trailing zeros), as phys_to_virt() is not
    working correctly yet (module_fixup() is called from paging_init()).

    Zone ranges:
    DMA [mem 0x0000000007400000-0x0000007fffffffff]
    Normal empty
    Movable zone start for each node
    Early memory node ranges
    node 0: [mem 0x0000000007400000-0x0000000007ffffff]
    Initmem setup node 0 [mem 0x0000000007400000-0x0000000007ffffff]
    -initrd: 00b61166 - 00c00000

    This is correct (note the 5 trailing zeros).

    -pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
    -pcpu-alloc: [0] 0
    [...]
    +Unable to handle kernel access at virtual address (ptrval)
    +Oops: 00000000
    +Modules linked in:
    +PC: [<002c11be>] memcmp+0x2c/0x5c
    +SR: 2700 SP: (ptrval) a2: 003bd560
    +d0: 0035eb83 d1: 07fffff8 d2: 002c1192 d3: 000000e6
    +d4: 000684e8 d5: 00447000 a0: 0000000c a1: 07fffff4
    +Process swapper (pid: 0, task=(ptrval))
    +Frame format=7 eff addr=003bbfbc ssw=0505 faddr=07fffff4
    +wb 1 stat/addr/data: 0005 00447000 07401000
    +wb 2 stat/addr/data: 0005 000000e6 000684e8
    +wb 3 stat/addr/data: 0005 003bbfb4 002c1192
    +push data: 07401000 002c7d82 07401000 074a2cf4
    +Stack from 003bbfb4:
    +002c1192 000000e6 002c7d82 00428eda 07fffff4 0035eb7f 0000000c 00447000
    +000000e6 000684e8 00447000 07401000 074bec08 07401000 074a2cf4 07fffff0
    +00440406 00000000 00428322
    +Call Trace: [<002c1192>] memcmp+0x0/0x5c
    +[<002c7d82>] _printk+0x0/0x18
    +[<00428eda>] start_kernel+0x80/0x5b0
    +[<000684e8>] pcpu_alloc+0x88/0x3b4
    +[<00428322>] _sinittext+0x322/0x9b0

    On Mon, Feb 27, 2023 at 7:30 AM Finn Thain <fthain@linux-m68k.org> wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:
    I wonder whether Finn's memtest patch merely exposed another MM bug

    A kernel patch may be easier than a bootloader patch (even if this is a
    bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but
    with the side effect that use of memtest will clobber the initrd.

    Which we can avoid, by moving the ramdisk handling inside paging_init().

    The initrd and memtest features aren't usually needed together. At the
    time when I needed the memtest feature I did not have confidence in the
    hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    Presumably something in memblock_reserve() relies on having
    called paging_init() before?

    I'll do some more debugging later today...

    @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
    }
    #endif

    - paging_init();
    -
    #ifdef CONFIG_NATFEAT
    nf_init();
    #endif




  • From Eero Tamminen@21:1/5 to Michael Schmitz on Mon Feb 27 10:50:01 2023
    Hi,

    On 27.2.2023 9.19, Michael Schmitz wrote:
    On 27.02.2023 at 18:55, Finn Thain wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1.  Reverting that on top of latest fixes
    the issue.

    Yes, I'm sorry to say that was the only likely candidate. Can't see why
    though - are Macs all configured to have RAM start at address zero, and
    possibly contiguous, Finn?


    I don't really understand your question. This was not a Mac patch. The
    issue seems to be about the locations initrd_start and initrd_end in
    relation to the various memory segments (?)

    I didn't realize that - thanks for pointing this out.

    This seems to be the same bug that was raised about 6 months ago... I had
    thought it was a bootloader bug but I'm out of my depth here.

    https://lists.debian.org/debian-68k/2022/09/msg00047.html
    https://lists.debian.org/debian-68k/2022/09/msg00051.html
    https://lists.debian.org/debian-68k/2022/09/msg00055.html

    I had forgotten all about that one... Thanks for jogging my memory!

    In this case though, the bug happens when the ramdisk is loaded in the
    lowest address memory chunk, at least at a lower address than the one
    the kernel runs from.

    I'm wondering whether this old Atari side boot issue is related at all...

    When adding Linux bootinfo support to the Hatari emulator (from the Aranym
    emulator) a few years ago, I noticed that:
    "Linux barfs at ST-RAM memory range given after TT-RAM. However, if
    kernel is loaded to TT-RAM and ST-RAM range is given before TT-RAM
    range, kernel crashes."

    The only working config was Linux being loaded to ST-RAM, TT-RAM being
    given only after that in bootinfo, and the initrd ramdisk after the kernel.

    Based on mails in the archive, this seemed to have been a known Linux/Atari
    issue already in 2013.


    The crashes in the above thread were all from boots where the initrd got loaded at the end of the memory chunk the kernel runs from.

    Time to try using copy_from_kernel_nofault() to copy the ramdisk into
    its final location? (just kidding)


    - Eero

    PS. For people familiar only with Amiga terminology, ST-RAM = chip RAM,
    TT-RAM = fast RAM.

  • From Michael Schmitz@21:1/5 to All on Mon Feb 27 11:00:01 2023
    Eero,

    that issue (kernel running from TT-RAM) was fixed quite a few years ago
    (but maybe not in 2013), in the sense that ST-RAM could be used for
    drivers (SCSI, atafb). Using ST-RAM as normal VM should have been made a
    lot easier by changing to memblock, but AFAIR there are still some bits missing.

    RAM must be listed in bootinfo with the chunk holding the kernel first,
    _not_ in ascending address order, so that second option is expected to
    crash.
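
    The ordering rule can be sketched as a small check, assuming a bootinfo-like list of memory chunks. The struct, the function names and the example addresses (ST-RAM at 0, TT-RAM at 0x01000000, with made-up sizes) are illustrative assumptions, not the kernel's bootinfo structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory chunk descriptor: physical start and size. */
struct model_chunk {
	uint32_t addr;
	uint32_t size;
};

/* True if the physical address lies inside the chunk. */
static int model_chunk_holds(const struct model_chunk *c, uint32_t phys)
{
	return phys >= c->addr && phys - c->addr < c->size;
}

/* The rule from the text: the first listed chunk must hold the kernel. */
static int model_kernel_chunk_first(const struct model_chunk *chunks,
				    size_t n, uint32_t kernel_phys)
{
	return n > 0 && model_chunk_holds(&chunks[0], kernel_phys);
}

/* Kernel loaded to TT-RAM: TT-RAM has to be listed before ST-RAM. */
static void model_replay_atari(void)
{
	const struct model_chunk kernel_first[] = {
		{ 0x01000000u, 0x00400000u },	/* TT-RAM (holds the kernel) */
		{ 0x00000000u, 0x00e00000u },	/* ST-RAM */
	};
	const struct model_chunk ascending[] = {
		{ 0x00000000u, 0x00e00000u },	/* ST-RAM */
		{ 0x01000000u, 0x00400000u },	/* TT-RAM (holds the kernel) */
	};

	assert(model_kernel_chunk_first(kernel_first, 2, 0x01001000u));
	assert(!model_kernel_chunk_first(ascending, 2, 0x01001000u));
}
```

    The ascending-address list fails the check, which matches the second option described above.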

    This isn't related to the current issue as far as I can see.

    Cheers,

    Michael

    On 27.02.2023 at 22:41, Eero Tamminen wrote:
    Hi,

    On 27.2.2023 9.19, Michael Schmitz wrote:
    On 27.02.2023 at 18:55, Finn Thain wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:


    Bisected to commit 376e3fdecb0dcae2 ("m68k: Enable memtest
    functionality") in v5.17-rc1. Reverting that on top of latest fixes
    the issue.

    Yes, I'm sorry to say that was the only likely candidate. Can't see why
    though - are Macs all configured to have RAM start at address zero, and
    possibly contiguous, Finn?


    I don't really understand your question. This was not a Mac patch. The
    issue seems to be about the locations initrd_start and initrd_end in
    relation to the various memory segments (?)

    I didn't realize that - thanks for pointing this out.

    This seems to be the same bug that was raised about 6 months ago... I had
    thought it was a bootloader bug but I'm out of my depth here.

    https://lists.debian.org/debian-68k/2022/09/msg00047.html
    https://lists.debian.org/debian-68k/2022/09/msg00051.html
    https://lists.debian.org/debian-68k/2022/09/msg00055.html

    I had forgotten all about that one... Thanks for jogging my memory!

    In this case though, the bug happens when the ramdisk is loaded in the
    lowest address memory chunk, at least at a lower address than the one
    the kernel runs from.

    I'm wondering whether this old Atari side boot issue is related at all...

    When adding Linux bootinfo support to the Hatari emulator (from the Aranym
    emulator) a few years ago, I noticed that:
    "Linux barfs at ST-RAM memory range given after TT-RAM. However, if
    kernel is loaded to TT-RAM and ST-RAM range is given before TT-RAM
    range, kernel crashes."

    The only working config was Linux being loaded to ST-RAM, TT-RAM being
    given only after that in bootinfo, and the initrd ramdisk after the kernel.

    Based on mails in the archive, this seemed to have been a known Linux/Atari
    issue already in 2013.


    The crashes in the above thread were all from boots where the initrd
    got loaded at the end of the memory chunk the kernel runs from.

    Time to try using copy_from_kernel_nofault() to copy the ramdisk into
    its final location? (just kidding)


    - Eero

    PS. For people familiar only with Amiga terminology, ST-RAM = chip RAM, TT-RAM = fast RAM.

  • From Mike Rapoport@21:1/5 to Michael Schmitz on Mon Feb 27 13:10:01 2023
    Hi,

    On Mon, Feb 27, 2023 at 10:42:34PM +1300, Michael Schmitz wrote:
    Hi Geert,

    adding Mike Rapoport to the recipient list who would know whether memblock_reserve() relies on paging_init() having run.

    Cheers,

    Michael

    On 27.02.2023 at 21:26, Geert Uytterhoeven wrote:
    Hi Finn,

    FTR, here is the diff of the dmesg between good and bad:

    +initrd: 07f61166 - 08000000

    This is wrong (note the 6 trailing zeros), as phys_to_virt() is not
    working correctly yet (module_fixup() is called from paging_init()).

    Zone ranges:
    DMA [mem 0x0000000007400000-0x0000007fffffffff]
    Normal empty
    Movable zone start for each node
    Early memory node ranges
    node 0: [mem 0x0000000007400000-0x0000000007ffffff]
    Initmem setup node 0 [mem 0x0000000007400000-0x0000000007ffffff]
    -initrd: 00b61166 - 00c00000

    This is correct (note the 5 trailing zeros).

    -pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
    -pcpu-alloc: [0] 0
    [...]
    +Unable to handle kernel access at virtual address (ptrval)
    +Oops: 00000000
    +Modules linked in:
    +PC: [<002c11be>] memcmp+0x2c/0x5c
    +SR: 2700 SP: (ptrval) a2: 003bd560
    +d0: 0035eb83 d1: 07fffff8 d2: 002c1192 d3: 000000e6
    +d4: 000684e8 d5: 00447000 a0: 0000000c a1: 07fffff4
    +Process swapper (pid: 0, task=(ptrval))
    +Frame format=7 eff addr=003bbfbc ssw=0505 faddr=07fffff4
    +wb 1 stat/addr/data: 0005 00447000 07401000
    +wb 2 stat/addr/data: 0005 000000e6 000684e8
    +wb 3 stat/addr/data: 0005 003bbfb4 002c1192
    +push data: 07401000 002c7d82 07401000 074a2cf4
    +Stack from 003bbfb4:
    +002c1192 000000e6 002c7d82 00428eda 07fffff4 0035eb7f 0000000c 00447000
    +000000e6 000684e8 00447000 07401000 074bec08 07401000 074a2cf4 07fffff0
    +00440406 00000000 00428322
    +Call Trace: [<002c1192>] memcmp+0x0/0x5c
    +[<002c7d82>] _printk+0x0/0x18
    +[<00428eda>] start_kernel+0x80/0x5b0
    +[<000684e8>] pcpu_alloc+0x88/0x3b4
    +[<00428322>] _sinittext+0x322/0x9b0

    On Mon, Feb 27, 2023 at 7:30 AM Finn Thain <fthain@linux-m68k.org> wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:
    I wonder whether Finn's memtest patch merely exposed another MM bug

    A kernel patch may be easier than a bootloader patch (even if this is a bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but with the side effect that use of memtest will clobber the initrd.

    Which we can avoid, by moving the ramdisk handling inside paging_init().

    The initrd and memtest features aren't usually needed together. At the time when I needed the memtest feature I did not have confidence in the hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    Presumably something in memblock_reserve() relies on having
    called paging_init() before?

    memblock_reserve() does not rely on paging_init() as it operates on
    physical addresses and it does not care if memory was already registered.

    What does rely on paging_init() is phys_to_virt() in the line after memblock_reserve():

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);
    initrd_end = initrd_start + m68k_ramdisk.size;

    So to have both memtest and initrd we'd need something like

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    paging_init() {
            /* setup page tables and memblock */
            early_memtest();
    }

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    or

    paging_init(); /* without early_memtest() */

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    early_memtest();
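
    Both orderings can be tried out in a toy model. This is a minimal sketch under stated assumptions: a 64-byte "RAM" array, a single reserved range, a memtest that overwrites every unreserved byte, and a single-offset address translation. All model_* names are hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_RAM_BASE 0x07400000u /* physical base of the chunk (from the log) */
#define MODEL_RAM_SIZE 64u         /* toy size (hypothetical) */
#define MODEL_INITRD   0x5Au       /* byte pattern standing in for initrd data */

struct model_boot {
	uint8_t  ram[MODEL_RAM_SIZE];
	uint32_t memoffset;            /* 0 until model_paging_init() has run */
	uint32_t res_start, res_size;  /* a single reserved physical range */
};

/* Physical addresses only, so this may run before paging setup. */
static void model_reserve(struct model_boot *b, uint32_t phys, uint32_t size)
{
	b->res_start = phys;
	b->res_size = size;
}

/* Overwrites every byte that is not inside the reserved range. */
static void model_memtest(struct model_boot *b)
{
	for (uint32_t i = 0; i < MODEL_RAM_SIZE; i++) {
		uint32_t phys = MODEL_RAM_BASE + i;

		if (phys >= b->res_start && phys - b->res_start < b->res_size)
			continue;
		b->ram[i] = 0xAA;
	}
}

static void model_paging_init(struct model_boot *b, int with_memtest)
{
	b->memoffset = MODEL_RAM_BASE;
	if (with_memtest)
		model_memtest(b);
}

static uint32_t model_phys_to_virt(const struct model_boot *b, uint32_t phys)
{
	return phys - b->memoffset;
}

static void model_load_initrd(struct model_boot *b, uint32_t phys, uint32_t size)
{
	assert(phys >= MODEL_RAM_BASE &&
	       size <= MODEL_RAM_SIZE - (phys - MODEL_RAM_BASE));
	memset(b, 0, sizeof(*b));
	memset(&b->ram[phys - MODEL_RAM_BASE], MODEL_INITRD, size);
}

static int model_initrd_intact(const struct model_boot *b, uint32_t phys, uint32_t size)
{
	for (uint32_t i = 0; i < size; i++)
		if (b->ram[phys - MODEL_RAM_BASE + i] != MODEL_INITRD)
			return 0;
	return 1;
}

/* First ordering: reserve, paging_init() including memtest, translate. */
static int model_order_a(uint32_t phys, uint32_t size, uint32_t *virt)
{
	struct model_boot b;

	model_load_initrd(&b, phys, size);
	model_reserve(&b, phys, size);
	model_paging_init(&b, 1);
	*virt = model_phys_to_virt(&b, phys);
	return model_initrd_intact(&b, phys, size);
}

/* Second ordering: paging_init() without memtest, reserve, translate, memtest. */
static int model_order_b(uint32_t phys, uint32_t size, uint32_t *virt)
{
	struct model_boot b;

	model_load_initrd(&b, phys, size);
	model_paging_init(&b, 0);
	model_reserve(&b, phys, size);
	*virt = model_phys_to_virt(&b, phys);
	model_memtest(&b);
	return model_initrd_intact(&b, phys, size);
}

/* Partial revert: paging_init() including memtest runs before the reserve. */
static int model_order_revert(uint32_t phys, uint32_t size, uint32_t *virt)
{
	struct model_boot b;

	model_load_initrd(&b, phys, size);
	model_paging_init(&b, 1);
	model_reserve(&b, phys, size);
	*virt = model_phys_to_virt(&b, phys);
	return model_initrd_intact(&b, phys, size);
}
```

    In this model both proposed orderings return 1 (initrd intact) with a correctly translated start address, while the partial revert translates correctly but returns 0 because the memtest ran before the range was reserved.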


    I'll do some more debugging later today...

    @@ -335,8 +337,6 @@ void __init setup_arch(char **cmdline_p)
    }
    #endif

    - paging_init();
    -
    #ifdef CONFIG_NATFEAT
    nf_init();
    #endif




    --
    Sincerely yours,
    Mike.

  • From Mike Rapoport@21:1/5 to Geert Uytterhoeven on Mon Feb 27 14:00:01 2023
    Hi Geert,

    On Mon, Feb 27, 2023 at 01:31:23PM +0100, Geert Uytterhoeven wrote:
    Hi Mike,

    On Mon, Feb 27, 2023 at 12:34 PM Mike Rapoport <rppt@kernel.org> wrote:
    On Mon, Feb 27, 2023 at 10:42:34PM +1300, Michael Schmitz wrote:
    On 27.02.2023 at 21:26, Geert Uytterhoeven wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:
    I wonder whether Finn's memtest patch merely exposed another MM bug

    A kernel patch may be easier than a bootloader patch (even if this is a
    bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but
    with the side effect that use of memtest will clobber the initrd.

    Which we can avoid, by moving the ramdisk handling inside paging_init().

    The initrd and memtest features aren't usually needed together. At the
    time when I needed the memtest feature I did not have confidence in the
    hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    Presumably something in memblock_reserve() relies on having
    called paging_init() before?

    memblock_reserve() does not rely on paging_init() as it operates on physical addresses and it does not care if memory was already registered.

    What does rely on paging_init() is phys_to_virt() in the line after memblock_reserve():

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);
    initrd_end = initrd_start + m68k_ramdisk.size;

    So to have both memtest and initrd we'd need something like

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    paging_init() {
    /* setup page tables and memblock */
    early_memtest();
    }

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    or

    paging_init(); /* without early_memtest() */

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    early_memtest();

    Of course... /me bangs his head against the TFT for not having
    realized before the values saved into initrd_{start,end} are not just
    for printing in the pr_info() line...

    Happens to the best of us :)

    Gr{oetje,eeting}s,

    Geert


    --
    Sincerely yours,
    Mike.

  • From Geert Uytterhoeven@21:1/5 to rppt@kernel.org on Mon Feb 27 13:40:02 2023
    Hi Mike,

    On Mon, Feb 27, 2023 at 12:34 PM Mike Rapoport <rppt@kernel.org> wrote:
    On Mon, Feb 27, 2023 at 10:42:34PM +1300, Michael Schmitz wrote:
    On 27.02.2023 at 21:26, Geert Uytterhoeven wrote:
    FTR, here is the diff of the dmesg between good and bad:

    +initrd: 07f61166 - 08000000

    This is wrong (note the 6 trailing zeros), as phys_to_virt() is not working correctly yet (module_fixup() is called from paging_init()).

    Zone ranges:
    DMA [mem 0x0000000007400000-0x0000007fffffffff]
    Normal empty
    Movable zone start for each node
    Early memory node ranges
    node 0: [mem 0x0000000007400000-0x0000000007ffffff]
    Initmem setup node 0 [mem 0x0000000007400000-0x0000000007ffffff]
    -initrd: 00b61166 - 00c00000

    This is correct (note the 5 trailing zeros).

    -pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
    -pcpu-alloc: [0] 0
    [...]
    +Unable to handle kernel access at virtual address (ptrval)
    +Oops: 00000000
    +Modules linked in:
    +PC: [<002c11be>] memcmp+0x2c/0x5c
    +SR: 2700 SP: (ptrval) a2: 003bd560
    +d0: 0035eb83 d1: 07fffff8 d2: 002c1192 d3: 000000e6
    +d4: 000684e8 d5: 00447000 a0: 0000000c a1: 07fffff4
    +Process swapper (pid: 0, task=(ptrval))
    +Frame format=7 eff addr=003bbfbc ssw=0505 faddr=07fffff4
    +wb 1 stat/addr/data: 0005 00447000 07401000
    +wb 2 stat/addr/data: 0005 000000e6 000684e8
    +wb 3 stat/addr/data: 0005 003bbfb4 002c1192
    +push data: 07401000 002c7d82 07401000 074a2cf4
    +Stack from 003bbfb4:
    +002c1192 000000e6 002c7d82 00428eda 07fffff4 0035eb7f 0000000c 00447000
    +000000e6 000684e8 00447000 07401000 074bec08 07401000 074a2cf4 07fffff0
    +00440406 00000000 00428322
    +Call Trace: [<002c1192>] memcmp+0x0/0x5c
    +[<002c7d82>] _printk+0x0/0x18
    +[<00428eda>] start_kernel+0x80/0x5b0
    +[<000684e8>] pcpu_alloc+0x88/0x3b4
    +[<00428322>] _sinittext+0x322/0x9b0

    On Mon, Feb 27, 2023 at 7:30 AM Finn Thain <fthain@linux-m68k.org> wrote:
    On Mon, 27 Feb 2023, Michael Schmitz wrote:
    I wonder whether Finn's memtest patch merely exposed another MM bug

    A kernel patch may be easier than a bootloader patch (even if this is a bootloader bug) particularly if it affects multiple platforms.

    A partial revert of my patch (below) will probably avoid the issue, but with the side effect that use of memtest will clobber the initrd.

    Which we can avoid, by moving the ramdisk handling inside paging_init().

    The initrd and memtest features aren't usually needed together. At the time when I needed the memtest feature I did not have confidence in the hardware. An initrd wasn't very useful at that point.

    diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
    index 3a2bb2e8fdad..92f1b9268dff 100644
    --- a/arch/m68k/kernel/setup_mm.c
    +++ b/arch/m68k/kernel/setup_mm.c
    @@ -326,6 +326,8 @@ void __init setup_arch(char **cmdline_p)
    panic("No configuration setup");
    }

    + paging_init();
    +
    #ifdef CONFIG_BLK_DEV_INITRD
    if (m68k_ramdisk.size) {
    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    Presumably something in memblock_reserve() relies on having
    called paging_init() before?

    memblock_reserve() does not rely on paging_init() as it operates on
    physical addresses and it does not care if memory was already registered.

    What does rely on paging_init() is phys_to_virt() in the line after memblock_reserve():

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);
    initrd_end = initrd_start + m68k_ramdisk.size;

    So to have both memtest and initrd we'd need something like

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);

    paging_init() {
    /* setup page tables and memblock */
    early_memtest();
    }

    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    or

    paging_init(); /* without early_memtest() */

    memblock_reserve(m68k_ramdisk.addr, m68k_ramdisk.size);
    initrd_start = (unsigned long)phys_to_virt(m68k_ramdisk.addr);

    early_memtest();

    Of course... /me bangs his head against the TFT for not having
    realized before the values saved into initrd_{start,end} are not just
    for printing in the pr_info() line...

    Gr{oetje,eeting}s,

    Geert

