When booting modern kernels (4.x or 5.x) on a Mac IIci, the kernel sees
only half of installed memory if the built-in video is used. Using a
Nubus video card, all of the installed memory is seen. This may be a
Penguin issue, though it's not clear why the kernel is ignoring
available memory. I'm documenting it here in case anyone has an idea
how to fix it.
On Thu, Mar 11, 2021 at 11:39:41AM -0700, Stan Johnson wrote:
When booting modern kernels (4.x or 5.x) on a Mac IIci, the kernel sees
only half of installed memory if the built-in video is used. Using a
Nubus video card, all of the installed memory is seen. This may be a
Penguin issue, though it's not clear why the kernel is ignoring
available memory. I'm documenting it here in case anyone has an idea
how to fix it.
I'm not sure how to fix it, but I suspect this is changed due to the
way some of the memory management code now works. With built-in video
on an RBV model like the IIci, the video chip steals RAM out of the
first bank (at physical address 0). That means the first bank isn't
entirely usable for the kernel. With this in mind, it's both simpler
and faster to put the kernel into the second bank. However, newer
kernels can't cleanly handle adding the first bank (which has lower
physical addresses) to usable memory after the second bank. I'm
pretty sure that used to work. It's somewhat similar to the problem
between the fast and slow RAM on the Atari systems.
Brad Boyer
flar@allandria.com
I'm pretty sure this has never worked. Though there ought to be a way to
make use of RAM in bank 0 for video as long as the kernel is loaded at address 0 in that bank. Or if that's not possible, use the same trick as I
do on Atari - ioremap some of the bank 0 memory for use by video, and make sure the video driver uses a dedicated low memory pool allocator for that ioremapped RAM.
On Mon, Mar 15, 2021 at 08:28:37AM +1300, Michael Schmitz wrote:
I'm pretty sure this has never worked. Though there ought to be a way to
make use of RAM in bank 0 for video as long as the kernel is loaded at
address 0 in that bank. Or if that's not possible, use the same trick as I
do on Atari - ioremap some of the bank 0 memory for use by video, and make
sure the video driver uses a dedicated low memory pool allocator for that
ioremapped RAM.
OK, I wasn't sure if it used to work. I own both a IIci and IIsi, but I
never got Linux up and running on either. I bought both of them just
about when I stopped having as much time to work on such things.
I believe one of the headaches is that the address is non-programmable
and requires the framebuffer not just to be in bank A, but to start at address 0 which means we can't put the kernel at address 0. So we
instead put it at the beginning of bank B if RBV video is going to be
used. The RBV chip doesn't have a way to specify the buffer address
(the official Apple documentation is quite clear that RBV has no
control over the framebuffer location). The other chip that does that
part is also the main memory mapping logic for all onboard devices and
has a completely static mapping for this sort of thing as far as I know.
The documentation I've found never explicitly says the address 0 is hard-coded, but it never says it can be changed either.
A second headache is that we don't have real video drivers for anything
on Mac (other than valkyrie which is only found on one or two very late
m68k models and is a shared driver with some ppc models). The macfb
driver mostly just takes information passed in by Penguin and uses
that. It only has enough logic beyond that to change the color palettes
on some of the more common hardware.
On the Atari, is the rest of the low memory usable if the kernel isn't
there? I seem to recall it is only accessible by that special allocator
in that case, not as generic memory. I'm not sure of anything else we
have on a Mac that would be able to use that sort of thing.
Brad Boyer
flar@allandria.com
On 15/03/21 10:13 am, Brad Boyer wrote:
I believe one of the headaches is that the address is non-programmable
and requires the framebuffer not just to be in bank A, but to start at
address 0 which means we can't put the kernel at address 0. So we
instead put it at the beginning of bank B if RBV video is going to be
used. The RBV chip doesn't have a way to specify the buffer address
(the official Apple documentation is quite clear that RBV has no
control over the framebuffer location). The other chip that does that
part is also the main memory mapping logic for all onboard devices and
has a completely static mapping for this sort of thing as far as I know.
The documentation I've found never explicitly says the address 0 is
hard-coded, but it never says it can be changed either.
Only leaves the option of using a dedicated pool of ioremapped memory and
a special allocator. That is, unless someone manages to rewrite the m68k
MMU code to allow RAM chunks to be listed and mapped in any order. We now
use a memory model that should allow that, but the code that maps virtual
addresses to physical addresses and vice versa (without walking the page
tables) isn't there yet. It's been a while since I looked though ...
On the Atari, is the rest of the low memory usable if the kernel isn't
there? I seem to recall it is only accessible by that special allocator
in that case, not as generic memory. I'm not sure of anything else we
have on a Mac that would be able to use that sort of thing.
No, none of the RAM in the low memory chunk is mapped for use as normal
[...]
The only other use is for DMA buffers (floppy and SCSI controllers,
probably also DMA for the sound chip). DMA on Macs is weird (though can
probably address the entire memory, not just the low 24 address bits, so
the problem never arose on Mac as far as I recall).
On Thu, Mar 11, 2021 at 11:39:41AM -0700, Stan Johnson wrote:
When booting modern kernels (4.x or 5.x) on a Mac IIci, the kernel sees only half of installed memory if the built-in video is used. Using a
Nubus video card, all of the installed memory is seen. This may be a Penguin issue, though it's not clear why the kernel is ignoring
available memory. I'm documenting it here in case anyone has an idea
how to fix it.
I'm not sure how to fix it, but I suspect this is changed due to the
way some of the memory management code now works. With built-in video
on an RBV model like the IIci, the video chip steals RAM out of the
first bank (at physical address 0). That means the first bank isn't
entirely usable for the kernel. With this in mind, it's both simpler
and faster to put the kernel into the second bank. However, newer
kernels can't cleanly handle adding the first bank (which has lower
physical addresses) to usable memory after the second bank. I'm
pretty sure that used to work. It's somewhat similar to the problem
between the fast and slow RAM on the Atari systems.
Two of the main differences between m68k platforms and the ia32 PC
platform are that (a) physical RAM doesn't always start at address
zero,
On Mar 15 2021, Geert Uytterhoeven wrote:
Two of the main differences between m68k platforms and the ia32 PC
platform are that (a) physical RAM doesn't always start at address
zero,
That is shared with a lot of platforms.
Perhaps we could find some other use for the rest of the RAM in that
bank (a IIsi has a fixed 1MB there, but it was possible to put a lot
more than that in bank A on a IIci).
The issue appears not to be limited to built-in video. With 16 MiB in
Bank A (4 x 4 MiB), 64 MiB in Bank B (4 x 16 MiB), a RasterOps
ColorBoard 264 in the Nubus slot nearest the PDS slot (which contains a
32K cache card), and a Farallon Ethernet card in the middle Nubus slot,
Linux 4.14.167 sees only 64 MiB of memory, presumably all from Bank B.
Mac OS 7.5.5 and 8.1 see 80 MiB, as does NetBSD 9.1.
With 16 MiB in each bank, using a Mac II video card and an Asante 10/100 Ethernet card, Linux sees all 32 MiB (I didn't test this combination
with 80 MiB or 128 MiB).
Unfortunately, while the Asante 10/100 card works fine in Mac OS, I
couldn't find an appropriate Linux driver (I think it used to use an SMC driver, but I couldn't find one in modern kernels that works).
And the RasterOps card is a better video card than the Mac II (Toby)
video card, though I think the RasterOps card may be causing a "failed
to turn off interrupts, booting anyway" message in Penguin while booting Linux (possibly causing the memory issue?).
On a different IIci, with 64 MiB in each bank, using a Mac II video card
and a Farallon Ethermac II-C card, Linux sees all 128 MiB (with no
"failed to turn off interrupts..." message).
Maybe Mac OS reserves memory from Bank A for video unless the ROM
recognizes a known Apple video card (such as the Mac II video card)?
Does anyone know whether there's a way to have Penguin send a custom
list of free memory ranges to Linux?
If anyone wants to do any further testing, I'd be happy to help.
I didn't know that the IIsi limited video RAM to 1 MB. It's unsurprising though, considering the drawbacks of RAM based video.
This is getting complicated quickly, and some of my earlier conclusions
were wrong.
I have these two Mac IIci systems:
System A: 128 MB (64 MB in each bank), Mac II Video Card, Farallon
EtherMac II-C
System B: 80 MB (16 MB in Bank A, 64 MB in Bank B), Mac II Video Card, Farallon EtherMac II-C
When either System A or System B is using built-in video, Linux 4.14.167-mac-backport+ sees only the memory that is in Bank B.
When System B uses a RasterOps video card, only the memory in Bank B is
seen (even if the amount of memory in Banks A and B is the same). I'm
not able to get System B to see all memory except when using the Mac II
video card and with the same amount of memory in Banks A and B.
Using the 5.11.0-mac kernel with CONFIG_FLATMEM=y, System B crashes, but System A works. With 16 MB in both Banks A and B, System B doesn't
crash, either, and 5.11.0-mac sees all 32 MB (see attached serial
console log for System B, first boot crashes with 80 MB, second boot
works with 32 MB). So where 4.14.167-mac-backport+ saw only the memory
in Bank B, 5.11.0-mac crashes (I didn't try 5.11.0-mac without the CONFIG_FLATMEM=y option).
... It would be interesting to see the list of RAM segments that
Penguin generates on these machines (you can get Penguin to log them without starting the kernel).
Maybe Mac OS reserves memory from Bank A for video unless the ROM
recognizes a known Apple video card (such as the Mac II video card)?
Penguin on the 80 MB machine might work better if you swapped out the RasterOps board in favour of a Mac II video board... it's probably
worth trying.
Are these machines running the same version of Mac OS? Penguin has
some funky IIci video driver patching code that might be affected by
ROM version or MacOS version. ...
Yes, both are running Mac OS 7.5.5. The attached Penguin output is for 5.11.0-mac from System B; Penguin-1.txt is with 80 MB (crashes), Penguin-2.txt is with 32 MB (16 MB in each Bank) (works).
-Stan Johnson
Am I right in thinking that Linux only crashes when Penguin loads the
kernel into Bank A (i.e. Penguin says "The kernel will be located at
physical 0x00001000") and the kernel then goes and drops that segment
(i.e. Linux says "Ignoring memory chunk at 0x0:0x1000000 before the
first chunk")?
Thanks for collecting these logs. The Penguin logs show that rbv_boot is false, indicating that the on-board video is not in use, as described.
So I think the important question is, why does Penguin fail to sort the segments in this case? That is, why did Penguin produce this list:
Physical RAM: 80 MB
...
BI_MEMCHUNK[0].addr = 0x04000000
BI_MEMCHUNK[0].size = 0x04000000
BI_MEMCHUNK[1].addr = 0x00000000
BI_MEMCHUNK[1].size = 0x01000000
rather than a sorted list, something like the other example:
Physical RAM: 32 MB
...
BI_MEMCHUNK[0].addr = 0x00000000
BI_MEMCHUNK[0].size = 0x01000000
BI_MEMCHUNK[1].addr = 0x04000000
BI_MEMCHUNK[1].size = 0x01000000
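For illustration only, here is a minimal C sketch of what happens to the two lists above if the kernel keeps the first bootinfo chunk and ignores any later chunk that lies at a lower physical address, as the "Ignoring memory chunk ... before the first chunk" message suggests. The struct and function names are hypothetical; this is not the actual kernel code, just a model of the behaviour described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one bootinfo memory chunk. */
struct mem_chunk {
    unsigned long addr;
    unsigned long size;
};

/*
 * Keep chunk 0 and drop every later chunk that starts below it.
 * Returns the number of chunks that remain in the array.
 */
size_t drop_chunks_below_first(struct mem_chunk *c, size_t n)
{
    size_t i = 1;

    assert(n > 0);
    while (i < n) {
        if (c[i].addr < c[0].addr) {
            /* Close the gap left by the dropped chunk. */
            memmove(&c[i], &c[i + 1], (n - i - 1) * sizeof(*c));
            n--;
            continue;
        }
        i++;
    }
    return n;
}

/* Sum the sizes of the first n chunks. */
unsigned long total_size(const struct mem_chunk *c, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += c[i].size;
    return sum;
}
```

Fed the 80 MB list, this model keeps only the 64 MB chunk at 0x04000000, which would match the "Bank B only" result seen with 4.14. Fed the 32 MB list, both chunks survive.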
On Thu, 18 Mar 2021, Finn Thain wrote:
Am I right in thinking that Linux only crashes when Penguin loads the
kernel into Bank A (i.e. Penguin says "The kernel will be located at
physical 0x00001000") and the kernel then goes and drops that segment
(i.e. Linux says "Ignoring memory chunk at 0x0:0x1000000 before the
first chunk")?
After re-reading your message, I think I got that wrong -- you said that "Penguin-1.txt is with 80 MB (crashes)". So I don't have a good
explanation for the v5.11 crash.
Thanks for collecting these logs. The Penguin logs show that rbv_boot is
false, indicating that the on-board video is not in use, as described.
So I think the important question is, why does Penguin fail to sort the
segments in this case? That is, why did Penguin produce this list:
Physical RAM: 80 MB
...
BI_MEMCHUNK[0].addr = 0x04000000
BI_MEMCHUNK[0].size = 0x04000000
BI_MEMCHUNK[1].addr = 0x00000000
BI_MEMCHUNK[1].size = 0x01000000
rather than a sorted list, something like the other example:
Physical RAM: 32 MB
...
BI_MEMCHUNK[0].addr = 0x00000000
BI_MEMCHUNK[0].size = 0x01000000
BI_MEMCHUNK[1].addr = 0x04000000
BI_MEMCHUNK[1].size = 0x01000000
After looking at the Penguin source code, I understand how this happens. Penguin sorts the physical memory chunks by size, not by address, except on 68020, where it sorts by address.
If you move the 64 MB to bank A and the 16 MB to bank B, does that solve
the problem? (Please also try v5.10 before you make that change.)
AIUI, Penguin needs a large physically contiguous region, so it used the largest physical RAM chunk (which was bank B). But that alone doesn't
really justify the weird sort order.
Laurent, can you comment on this? In particular, does EMILE sort memory chunks the way Penguin does?
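To make the difference between the two orderings concrete, here is a small C sketch using qsort(). It is not Penguin's code; the struct, the function names and the tie-break on address for equal sizes are assumptions made only so that the example is deterministic.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one bootinfo memory chunk. */
struct mem_chunk {
    unsigned long addr;
    unsigned long size;
};

/* Lower physical addresses first. */
int cmp_addr_asc(const void *a, const void *b)
{
    const struct mem_chunk *x = a, *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Larger chunks first; equal sizes fall back to address order (assumption). */
int cmp_size_desc(const void *a, const void *b)
{
    const struct mem_chunk *x = a, *y = b;

    if (x->size != y->size)
        return x->size > y->size ? -1 : 1;
    return cmp_addr_asc(a, b);
}

/* Sort in place, by size if by_size is non-zero, else by address. */
void sort_chunks(struct mem_chunk *c, size_t n, int by_size)
{
    assert(c != NULL || n == 0);
    qsort(c, n, sizeof(*c), by_size ? cmp_size_desc : cmp_addr_asc);
}
```

With 16 MB in bank A and 64 MB in bank B, sorting by size puts the chunk at 0x04000000 first, as in the 80 MB log. Sorting by address puts the chunk at 0x00000000 first.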
Hi Finn,
On 19.03.21 at 12:49, Finn Thain wrote:
AIUI, Penguin needs a large physically contiguous region, so it used
the largest physical RAM chunk (which was bank B). But that alone
doesn't really justify the weird sort order.
If Penguin then loads the kernel in that same chunk, there really is no
other choice? (The kernel expects the memory chunk it runs from to be
listed first in the bootinfo struct).
I wonder about the 020 sorting scheme though - is there a hardware rule
that says the first chunk must be the largest on 020?
Cheers,
Michael
On Sat, 20 Mar 2021, Michael Schmitz wrote:
On 19.03.21 at 12:49, Finn Thain wrote:
AIUI, Penguin needs a large physically contiguous region, so it used
the largest physical RAM chunk (which was bank B). But that alone
doesn't really justify the weird sort order.
If Penguin then loads the kernel in that same chunk, there really is no
other choice? (The kernel expects the memory chunk it runs from to be
listed first in the bootinfo struct).
But finding the largest chunk and putting it first doesn't imply sorting
the whole list of bootinfo memory chunks by size. Moreover, the kernel now
apparently requires chunks to be sorted by physical address, not by size.
Why not sort chunks by physical address and omit any chunks prior to the largest one, to satisfy both requirements? Then ask users to re-arrange
RAM SIMMs such that bank A is the largest.
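A rough C sketch of that suggestion (sort by physical address, then leave out everything below the largest chunk) might look like the following. The names are hypothetical and this is not taken from Penguin; it only illustrates the rule as stated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one bootinfo memory chunk. */
struct mem_chunk {
    unsigned long addr;
    unsigned long size;
};

/* Lower physical addresses first. */
int cmp_addr(const void *a, const void *b)
{
    const struct mem_chunk *x = a, *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/*
 * Sort by address, then discard every chunk below the largest one, so
 * that the largest chunk comes first and the rest stay in address order.
 * Returns the number of chunks left in the array.
 */
size_t keep_from_largest(struct mem_chunk *c, size_t n)
{
    size_t i, largest = 0;

    assert(n > 0);
    qsort(c, n, sizeof(*c), cmp_addr);
    for (i = 1; i < n; i++) {
        /* Strict '>' keeps the lowest-addressed chunk among equals. */
        if (c[i].size > c[largest].size)
            largest = i;
    }
    memmove(c, c + largest, (n - largest) * sizeof(*c));
    return n - largest;
}
```

With 16 MB in bank A and 64 MB in bank B this still passes only bank B, which is why the suggestion comes with the request to put the largest SIMMs in bank A. With equal banks, or with the larger bank at the lower address, both chunks are kept in address order.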
I wonder about the 020 sorting scheme though - is there a hardware rule that says the first chunk must be the largest on 020?
The Penguin-19 source code says,
/* Hack for 020/68851. Kernel "head.S" does not handle
* 020 with > 1 memory segment and kernel not in first
* segment. Force kernel into first memory segment on
* these (020) machines.
*/
This "hack" first appeared in Penguin-14. The file Penguin.doc says,
Status: Changed 980301
New setting - "68020: Don't force kernel into bank A".
Normally the 020's requires the kernel to be placed in
bank A memory. The Penguin will force the kernel to be put
in that bank on these machines. This setting will ignore
the "forcing" and put the kernel in the bank with the
largest amount of memory available.
NOTE: Current kernels will fail if the kernel is not
forced into bank A. Emergency setting if all else fails.
Only available on 020's.
That suggests that the bank A requirement comes from a kernel limitation, perhaps stemming from a 68851 quirk (?).
Looking at MMU code in head.S in current kernels, mmu_map_tt() seems to contain the only special case for '020. But mmu_map_tt() is only used for Nubus slot space. So I'm none the wiser.
Perhaps we need to look at head.S from before 1998 to figure out what motivated Penguin's '020 special case and the option to disable it?
On 21/03/21 2:31 pm, Finn Thain wrote:
If Penguin then loads the kernel in that same chunk, there really is
no other choice? (The kernel expects the memory chunk it runs from
to be listed first in the bootinfo struct).
But finding the largest chunk and putting it first doesn't imply
sorting the whole list of bootinfo memory chunks by size. Moreover,
the kernel now apparently requires chunks to be sorted by physical
address, not by size.
I see. You are correct - the only constraint really is that the largest chunk (with the kernel in it) come first.
The remaining chunks do not have to be sorted by size but could appear
in any order.
I guess once the RAM chunk list has been sorted, it was most convenient
to use that sorted list directly for the bootinfo records.
Why not sort chunks by physical address and omit any chunks prior to
the largest one, to satisfy both requirements? Then ask users to re-arrange RAM SIMMs such that bank A is the largest.
Yes, that could be done. I don't think the kernel would mind any RAM
banks not passed in the bootinfo struct (to wit, IIRC amiboot has a 'memfile' option to allow exclusion of RAM chunks from bootinfo, to skip
RAM that's slow or unreliable).
A warning from Penguin with advice to rearrange the largest RAM into
bank A (if possible) would certainly be more visible to the user than
the one-line warning early in the kernel boot log.
Does any of this help with the problem of RBV Macs? Video RAM must start
at address 0x0, and reordering RAM to have the largest chunk at that
address would occupy the RBV video range and render RBV unusable? Can
you even rearrange RAM in these machines?
I wonder about the 020 sorting scheme though - is there a hardware
rule that says the first chunk must be the largest on 020?
The Penguin-19 source code says,
/* Hack for 020/68851. Kernel "head.S" does not handle
* 020 with > 1 memory segment and kernel not in first
* segment. Force kernel into first memory segment on
* these (020) machines.
*/
This "hack" first appeared in Penguin-14. The file Penguin.doc says,
Status: Changed 980301
New setting - "68020: Don't force kernel into bank A".
Normally the 020's requires the kernel to be placed in
bank A memory. The Penguin will force the kernel to be put
in that bank on these machines. This setting will ignore
the "forcing" and put the kernel in the bank with the
largest amount of memory available.
NOTE: Current kernels will fail if the kernel is not
forced into bank A. Emergency setting if all else fails.
Only available on 020's.
That suggests that the bank A requirement comes from a kernel limitation,
perhaps stemming from a 68851 quirk (?).
The 68020/68851 combo is functionally equivalent to the 68030 as far as
I recall. I don't think such a limitation exists in today's head.S.
Looking at MMU code in head.S in current kernels, mmu_map_tt() seems
to contain the only special case for '020. But mmu_map_tt() is only
used for Nubus slot space. So I'm none the wiser.
Yes, and what mmu_map_tt falls back to on 020 is the same code that gets
otherwise used on 030 and 020 alike. I can't see a reason why this hack
would still be necessary.
Perhaps we need to look at head.S from before 1998 to figure out what motivated Penguin's '020 special case and the option to disable it?
I know the head.S MMU code was completely rewritten around that time to accommodate changes needed for the Mac port. What we used before on
Atari and Amiga bears little to no relation to what we have now. My
guess is that 030 (has transparent translation register) and 020 (does
not have tt1) used distinct code paths before the rewrite, but share
much of the code now.
I haven't found a kernel source that old on my system so I can't verify
my recollection of this though. Geert has a git tree somewhere that
contains all the ancient history, might be worth checking.
Cheers,
Michael
I guess once the RAM chunk list has been sorted, it was most convenient
to use that sorted list directly for the bootinfo records.
A constraint that says the first chunk must be the largest one is
undesirable because if the largest chunk has higher address than some
other large chunk, the latter would become inaccessible.
A similar problem arises when you have only two chunks of equal size.
Sorting by size doesn't help and the bootloader could theoretically end
up putting the kernel in bank B, leaving bank A unavailable.
Why not sort chunks by physical address and omit any chunks prior to
the largest one, to satisfy both requirements? Then ask users to
re-arrange RAM SIMMs such that bank A is the largest.
Yes, that could be done. I don't think the kernel would mind any RAM
banks not passed in the bootinfo struct (to wit, IIRC amiboot has a
'memfile' option to allow exclusion of RAM chunks from bootinfo, to skip
RAM that's slow or unreliable).
Based on the commit that Geert cited, I'd be inclined to sort chunks by
physical address, find the lowest chunk having size >= 16 MB and put that
one and the higher ones into bootinfo. (Or failing that, find the one
having size >= 8 MB, or failing that, 4 MB.)
A full sort isn't really needed here but does offer some determinism. Are
there implications for mm data structures? E.g. memblock_add_node() is
called for each chunk, and if the chunks are in the "wrong" order, perhaps
that would affect mm algorithms (?)
Does any of this help with the problem of RBV Macs? Video RAM must start
at address 0x0, and reordering RAM to have the largest chunk at that
address would occupy the RBV video range and render RBV unusable? Can
you even rearrange RAM in these machines?
I've argued elsewhere in this thread that the bank A issue in RBV Macs
doesn't matter that much.
If on-board video is enabled, bank A is slowed down. That suggests to me
that bank A is probably not that useful for Linux and is probably
relatively small anyway. As Brad said, bank A is always 1 MB on a IIsi. So
I don't mind if Linux ignores it in this case.
Perhaps we need to look at head.S from before 1998 to figure out what
motivated Penguin's '020 special case and the option to disable it?
I know the head.S MMU code was completely rewritten around that time to
accommodate changes needed for the Mac port. What we used before on
Atari and Amiga bears little to no relation to what we have now. My
guess is that 030 (has transparent translation register) and 020 (does
not have tt1) used distinct code paths before the rewrite, but share
much of the code now.
I haven't found a kernel source that old on my system so I can't verify
my recollection of this though. Geert has a git tree somewhere that
contains all the ancient history, might be worth checking.
Maybe that rewrite happened around the "pre2.05" release...
https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/arch/m68k/kernel/head.S?h=2.0&id=37147b87dddfc389e97d99f078be5a0f1012ba74
Unfortunately, even in the oldest head.S at that link, it's not obvious to
me why 68020/68851 and 68030 would execute different code paths. It's easy
to believe that the Penguin hack for 68020/68851 was incomplete (should
have been extended to 68030).
But that's academic. If we change the bootloader to fix the issue that
Stan reported and if it remains backwards compatible with Linux v2.2.25
(or Debian 3) I'd be happy with that.
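The fallback rule proposed earlier in this message (lowest chunk of at least 16 MB, else 8 MB, else 4 MB) can be sketched in C roughly as follows. The names are hypothetical, the input is assumed to be already sorted by physical address, and nothing here is taken from the Penguin source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one bootinfo memory chunk. */
struct mem_chunk {
    unsigned long addr;
    unsigned long size;
};

/*
 * Given chunks sorted by ascending physical address, return the index of
 * the lowest chunk with size >= 16 MB; failing that >= 8 MB; failing
 * that >= 4 MB. That chunk and all higher ones would go into bootinfo.
 * Returns n if no chunk reaches even the smallest threshold.
 */
size_t pick_start_chunk(const struct mem_chunk *sorted, size_t n)
{
    static const unsigned long min_size[] = {
        16UL << 20, 8UL << 20, 4UL << 20
    };
    size_t t, i;

    assert(sorted != NULL || n == 0);
    for (t = 0; t < sizeof(min_size) / sizeof(min_size[0]); t++) {
        for (i = 0; i < n; i++) {
            if (sorted[i].size >= min_size[t])
                return i;
        }
    }
    return n;
}
```

For the 80 MB configuration (16 MB at 0x00000000, 64 MB at 0x04000000) this picks index 0, so both banks would be passed to the kernel in address order.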
I know the head.S MMU code was completely rewritten around that time
to accommodate changes needed for the Mac port. What we used before on Atari and Amiga bears little to no relation to what we have now. My
guess is that 030 (has transparent translation register) and 020 (does
not have tt1) used distinct code paths before the rewrite, but share
much of the code now.
But the 68020 does have early termination pages, which map (IIRC) 2 MiB
at once. In the early days, 2 MiB should have been fine to map the
kernel. As that's the only mechanism used by head.S, perhaps the real
reason for picking the largest chunk on Mac is that you cannot map contiguously using early termination pages a series of discontiguous 1
MiB banks?
Based on the commit that Geert cited, I'd be inclined to sort chunks by
physical address, find the lowest chunk having size >= 16 MB and put that
one and the higher ones into bootinfo. (Or failing that, find the one
having size >= 8 MB, or failing that, 4 MB.)
Do you know the size of the uncompressed kernel at that stage? That way,
you could skip all RAM banks smaller than that size (plus some margin for
the initial mappings, one 4k page for each 4 MB of chunk size plus required
number of pointer table pages) and use the first one (lowest address one)
that satisfies such a minimum size criterion.
In this way, you won't lose all of the smaller but still useful banks, just in case a user arranged the banks in ascending size order.
(Skipping the lower address banks isn't strictly required BTW - the kernel will warn and ignore them as it used to. No harm done.)
A full sort isn't really needed here but does offer some determinism. Are there implications for mm data structures? E.g. memblock_add_node() is
called for each chunk, and if the chunks are in the "wrong" order, perhaps that would affect mm algorithms (?)
There are no checks on order that I have seen. But the reason why memory with lower addresses than mapped by head.S can't be used still isn't clear to me. Best leave the rest of the chunk list in address order.
Anyway, surely this test for CPU_68020 in Source/mmu_support.c in Penguin
is bogus. This must have to do with some quirk of the Mac II logic board
and not the type of MMU or CPU.
So, why sort chunks by address on Mac II? Beats me. I guess I'll have to
dig out my Mac II and try EMILE, which doesn't have that hack.
As you said, there is no distinction made between 020 and 030 in that code, so the same hack should have been applied to 030. Maybe the MacII (were
there other 020 Macs?) was the only one with RAM banks A and B spaced closer than 32 MB?
On Tue, Mar 23, 2021 at 06:50:05PM +1100, Finn Thain wrote:
Anyway, surely this test for CPU_68020 in Source/mmu_support.c in
Penguin is bogus. This must have to do with some quirk of the Mac II
logic board and not the type of MMU or CPU.
So, why sort chunks by address on Mac II? Beats me. I guess I'll have
to dig out my Mac II and try EMILE, which doesn't have that hack.
Apparently there was a bug in the original Mac II ROM that couldn't
handle more than 4MB in bank A. That would mean that anyone who had more
than 8MB had the larger modules in bank B.
On 22/03/21 6:24 pm, Finn Thain wrote:
I guess once the RAM chunk list has been sorted, it was most
convenient to use that sorted list directly for the bootinfo
records.
A constraint that says the first chunk must be the largest one is undesirable because if the largest chunk has higher address than some other large chunk, the latter would become inaccessible.
A similar problem arises when you have only two chunks of equal size. Sorting by size doesn't help and the bootloader could theoretically
end up putting the kernel in bank B, leaving bank A unavailable.
Can't fault your reasoning there. The use case of multiple large chunks present (and only the largest one used unless banks are rearranged)
might have been rare back in '98.
Why not sort chunks by physical address and omit any chunks prior
to the largest one, to satisfy both requirements? Then ask users
to re-arrange RAM SIMMs such that bank A is the largest.
Yes, that could be done. I don't think the kernel would mind any RAM banks not passed in the bootinfo struct (to wit, IIRC amiboot has a 'memfile' option to allow exclusion of RAM chunks from bootinfo, to
skip RAM that's slow or unreliable).
Based on the commit that Geert cited, I'd be inclined to sort chunks
by physical address, find the lowest chunk having size >= 16 MB and
put that one and the higher ones into bootinfo. (Or failing that, find
the one having size >= 8 MB, or failing that, 4 MB.)
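The selection proposed above could be sketched roughly as follows. This is only an illustration, not Penguin code: struct mem_chunk, pick_first_chunk() and the MB macro are hypothetical names, and the real bootloader's chunk list may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MB (1024UL * 1024UL)

/* Hypothetical chunk record; Penguin's actual structures are not shown here. */
struct mem_chunk {
    unsigned long addr;   /* physical start address */
    unsigned long size;   /* size in bytes */
};

/* qsort comparator: ascending physical address. */
static int cmp_addr(const void *a, const void *b)
{
    const struct mem_chunk *x = a, *y = b;
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/*
 * Sort chunks by physical address, then return the index of the lowest
 * chunk of at least 16 MB, falling back to 8 MB and then 4 MB.
 * Chunks from that index upward would be passed in bootinfo.
 * Returns n if no chunk reaches 4 MB.
 */
size_t pick_first_chunk(struct mem_chunk *chunks, size_t n)
{
    static const unsigned long thresholds[] = { 16 * MB, 8 * MB, 4 * MB };
    size_t t, i;

    assert(chunks != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(chunks, n, sizeof(*chunks), cmp_addr);

    for (t = 0; t < sizeof(thresholds) / sizeof(thresholds[0]); t++)
        for (i = 0; i < n; i++)
            if (chunks[i].size >= thresholds[t])
                return i;

    return n;
}
```

With a 1 MB bank at the lowest address and a 16 MB bank above it, this returns the index of the 16 MB bank, so the small bank is simply left out of bootinfo.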
Do you know the size of the uncompressed kernel at that stage? That way,
you could skip all RAM banks smaller than that size (plus some margin
for the initial mappings, one 4k page for each 4 MB of chunk size plus required number of pointer table pages) and use the first one (lowest address one) that satisfies such a minimum size criterion.
In this way, you won't lose all of the smaller but still useful banks,
just in case a user arranged the banks in ascending size order.
(Skipping the lower address banks isn't strictly required BTW - the
kernel will warn and ignore them as it used to. No harm done.)
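As a rough sketch of that minimum-size criterion: the code below assumes the chunk list is already sorted by ascending address and that the bootloader knows the uncompressed kernel size. The names required_size() and first_usable_chunk(), and the ptr_table_pages parameter, are made up for illustration; the exact number of pointer table pages is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define MB            (1024UL * 1024UL)
#define PAGE_SIZE_4K  4096UL

/* Hypothetical chunk record, for illustration only. */
struct mem_chunk {
    unsigned long addr;   /* physical start address */
    unsigned long size;   /* size in bytes */
};

/*
 * Space a chunk must provide to hold the kernel: the uncompressed image,
 * one 4k page per 4 MB of chunk size for the initial mappings, and the
 * pointer table pages (count supplied by the caller, an assumption here).
 */
unsigned long required_size(unsigned long chunk_size,
                            unsigned long kernel_size,
                            unsigned long ptr_table_pages)
{
    unsigned long page_tables = (chunk_size + 4 * MB - 1) / (4 * MB);

    return kernel_size + (page_tables + ptr_table_pages) * PAGE_SIZE_4K;
}

/*
 * Return the index of the lowest-address chunk large enough for the
 * kernel, or n if none qualifies. chunks[] must be sorted by address.
 */
size_t first_usable_chunk(const struct mem_chunk *chunks, size_t n,
                          unsigned long kernel_size,
                          unsigned long ptr_table_pages)
{
    size_t i;

    assert(chunks != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (chunks[i].size >= required_size(chunks[i].size, kernel_size,
                                            ptr_table_pages))
            return i;

    return n;
}
```

Unlike a fixed 16/8/4 MB threshold, this keeps any lower bank that is big enough for the kernel, so banks arranged in ascending size order are not all thrown away.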
[...]
Does any of this help with the problem of RBV Macs? Video RAM must
start at address 0x0, and reordering RAM to have the largest chunk
at that address would occupy the RBV video range and render RBV unusable? Can you even rearrange RAM in these machines?
I've argued elsewhere in this thread that the bank A issue in RBV Macs doesn't matter that much.
If on-board video is enabled, bank A is slowed down. That suggests to
me that bank A is probably not that useful for Linux and is probably relatively small anyway. As Brad said, bank A is always 1 MB on a
IIsi. So I don't mind if Linux ignores it in this case.
Probably good cause to only use it for video RAM through a separate allocator and pool (if a user absolutely insists and someone writes a
patch that you would accept). We can't share it with the kernel anyway
at present (and with a fixed size of 1 MB, it would be useless for
modern kernels).
[...]
I think I've found the difference: commit 75ce89a86b88c2dd77a0d2697c1ecaf9c53016ce (the earliest I found) has
head.S set up a page descriptor entry at the pointer table level (i.e. 'early termination' descriptor). That maps 32 MB in one go, regardless
of size and alignment of that chunk, which would not have mattered for
Atari and Amiga (as far as I know, the RAM banks are far enough apart
for such mappings not to overlap) but might have caused trouble on Mac
if the RAM banks fall within 32 MB and the kernel runs from the second
bank (which then isn't 32 MB aligned).
Today's head.S uses early termination only if both size and
alignment match.
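To illustrate the size-and-alignment condition in C (head.S itself is assembly, so this is only a model of the check, with made-up names and example addresses):

```c
#include <stdbool.h>

/* Span covered by one early termination descriptor at the pointer
 * table level, per the 32 MB figure mentioned above. */
#define EARLY_TERM_SPAN (32UL * 1024UL * 1024UL)

/*
 * Hypothetical helper: an early termination descriptor is only safe
 * when the region starts on a 32 MB boundary and at least 32 MB of
 * it remain to be mapped. Otherwise ordinary page tables are needed.
 */
bool can_use_early_termination(unsigned long phys_addr,
                               unsigned long remaining)
{
    return (phys_addr % EARLY_TERM_SPAN) == 0 &&
           remaining >= EARLY_TERM_SPAN;
}
```

Under this check, a bank that starts on a 32 MB boundary but holds only 8 MB, or a bank that starts part-way into a 32 MB window, would be mapped with page tables rather than with a single descriptor. The old code described above mapped 32 MB unconditionally.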
As you said, there is no distinction made between 020 and 030 in that
code, so the same hack should have been applied to 030. Maybe the MacII (were there other 020 Macs?) was the only one with RAM banks A and B
spaced closer than 32 MB?
But that's academic. If we change the bootloader to fix the issue that Stan reported and if it remains backwards compatible with Linux
v2.2.25 (or Debian 3) I'd be happy with that.
I'm quite certain that today's head.S needs no more 020 hacks and ought
to work on MacII if you can fit enough RAM in bank B to hold the kernel.
But finding a MacII to test this on is more than a little academic
indeed.
Let's fix the meminfo chunk ordering and see whether that fixes Stan's issues. I have no doubt that as long as the long-standing constraint
about the first chunk holding the kernel isn't violated, old kernels
will continue to boot OK.
Cheers,
Michael