NetBSD/mac68k Work Entry #1


I started messing with NetBSD/mac68k in an attempt to make NetBSD reconfigure the djMEMC memory controller on Centris/Quadra (C/Q) 610, 650, and 800 machines to use 128MB SIMMs, as I did with the ROM hack. That's still a bit of a work in progress, but I've encountered a number of other issues in the process.

Booting

Using a Centris 650 with 4x128MB SIMMs and 8MB of soldered RAM, a stock ROM gives a usable 264MB of memory (half of each SIMM = 256MB, plus 8MB soldered = 264MB). Starting with this stock configuration, without a modified ROM or kernel or anything else, I found that the NetBSD/mac68k Booter uses an unsigned char to represent the number of megabytes of RAM in the system, which it then passes to the kernel (the kernel mostly doesn't use the value, but more on that later). An unsigned char can only represent up to 255MB, so it can't hold the amount of physical RAM recognized by MacOS. So the first step was to get the Booter to recognize and pass the proper amount of physical RAM in the machine, which was a fairly trivial conversion of the unsigned char to an unsigned short.
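To make the size problem concrete, here is a minimal sketch of what happens when 264MB is squeezed into an 8-bit count. The function names are hypothetical and this is not the Booter's actual code; it assumes the value is simply truncated on assignment, which is what C does for unsigned types.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from the Booter source. */

/* Old behaviour: megabyte count stored in 8 bits, so it wraps modulo 256. */
unsigned char ram_mb_as_uchar(uint32_t ram_bytes)
{
    return (unsigned char)(ram_bytes >> 20);
}

/* Fixed behaviour: 16 bits can hold up to 65535MB. */
unsigned short ram_mb_as_ushort(uint32_t ram_bytes)
{
    return (unsigned short)(ram_bytes >> 20);
}
```

Under that assumption, 264MB would be reported as 8MB by the 8-bit version and as 264MB by the 16-bit one.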

The Booter goes through some contortions in order to get the kernel loaded at physical RAM offset 0, which isn't entirely straightforward because that RAM is in use by MacOS. It first allocates memory for the kernel (and other information) from MacOS using NewPtr(). This memory is at an indeterminate location in physical RAM. It then copies another routine into this region after the kernel, installs a shutdown routine that calls this copied code, and shuts down the machine gracefully so all filesystems are unmounted properly, etc. When the copied code runs, it disables interrupts, copies the kernel from the previously allocated memory location to location 0, sets up some registers to pass information to the kernel, then jumps into the kernel. Unfortunately, after all that setup, it jumps into the kernel at the allocated memory location instead of the copied kernel at address 0. This kinda mostly works, but results in occasional hangs early in the kernel. So, fixed that too.
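The copy-then-jump step can be modeled in a few lines. This is only a sketch with hypothetical names (struct staged_kernel, relocate_kernel); the real Booter does this in a routine that runs with interrupts disabled and targets physical address 0, neither of which is modeled here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a kernel image staged in allocated memory. */
struct staged_kernel {
    const unsigned char *staging; /* where the allocation happened to land */
    size_t               size;    /* image size in bytes */
    size_t               entry;   /* entry point offset within the image */
};

/* Copy the image to its final location and return the address to jump to. */
unsigned char *relocate_kernel(unsigned char *dest, const struct staged_kernel *k)
{
    memcpy(dest, k->staging, k->size);
    /* The bug described above corresponds to returning
     * k->staging + k->entry here instead of the copied image. */
    return dest + k->entry;
}
```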

The Booter with my changes is available here: Booter2.0.0-src-bbraun2.cpt.hqx

Process' Segment Tables

Now with NetBSD booting on the machine with the same memory configuration supported by MacOS, I tried to write a test case to ensure the physical memory is actually available and used, as a baseline for my future mods to the memory configuration. Unfortunately, if you malloc() a largeish amount of memory (in this case a mere 164MB is sufficient), then touch every page, the machine will panic. Clearly this needed to be investigated before I embarked on trying to add more memory to the system.
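A test along these lines can be sketched as follows. This is an illustration rather than the exact test case used; the function name and the choice to write one byte per page are assumptions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate `bytes`, write one byte in every page, and return the number
 * of pages touched (0 if the allocation fails or page_size is 0). */
size_t touch_every_page(size_t bytes, size_t page_size)
{
    volatile unsigned char *buf;
    size_t touched = 0;

    if (page_size == 0)
        return 0;
    buf = malloc(bytes);
    if (buf == NULL)
        return 0;
    /* volatile keeps the compiler from discarding the writes. */
    for (size_t off = 0; off < bytes; off += page_size) {
        buf[off] = (unsigned char)touched;
        touched++;
    }
    free((void *)buf);
    return touched;
}
```

The case from the text would be touch_every_page(164UL * 1024 * 1024, 4096), which forces the kernel to back each of the 41984 pages with real memory.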

Some background on how the 68040 MMU works: the MMU uses a three-level table. The Level 1 table contains 128 entries, each managing a 32MB chunk of address space (128 * 32MB = 4GB) and pointing to the Level 2 table for that chunk. Each Level 2 table contains 128 entries, each managing 256KB of address space and pointing to the Level 3 table for that 256KB chunk. Each Level 3 table contains 64 entries, each managing 4KB of address space, aka the page size (the 68040 is capable of using 4K or 8K pages, but by default NetBSD uses 4K pages).
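The table sizes above imply how a 32-bit virtual address is split with 4K pages: 7 bits of Level 1 index, 7 bits of Level 2 index, 6 bits of Level 3 index, and a 12-bit page offset. A small sketch, with hypothetical struct and function names:

```c
#include <stdint.h>

/* Hypothetical names; the field widths follow from 128/128/64 entries
 * per table and 4K pages. */
struct va_parts {
    unsigned l1;     /* bits 31-25: index into the Level 1 table */
    unsigned l2;     /* bits 24-18: index into a Level 2 table */
    unsigned l3;     /* bits 17-12: index into a Level 3 table */
    unsigned offset; /* bits 11-0:  byte offset within the page */
};

struct va_parts split_va(uint32_t va)
{
    struct va_parts p;

    p.l1 = (va >> 25) & 0x7f;
    p.l2 = (va >> 18) & 0x7f;
    p.l3 = (va >> 12) & 0x3f;
    p.offset = va & 0xfff;
    return p;
}
```

For example, address 0x02041234 splits into Level 1 index 1, Level 2 index 1, Level 3 index 1, and offset 0x234.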

User processes are managed a little differently than the kernel address space, and this panic is from a user process. Each process gets its own set of tables as above, and each process gets 1 page (4K) allocated for this. 4K will store a grand total of one Level 1 table (128 entries * 4 bytes/entry = 512 bytes) and 7 Level 2 tables (each 128 entries * 4 bytes/entry = 512 bytes). Each of the 7 Level 2 tables manages 32MB of address space, for a grand total of 224MB (32MB * 7 tables) of virtual address space available to the process. If a user process exceeds 224MB of virtual address space, the kernel panics.
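The 224MB figure can be reproduced with a little arithmetic. The sketch below uses hypothetical names and assumes any extra pages would simply be filled with more Level 2 tables, which is an assumption about the layout and not something taken from the NetBSD source.

```c
#define L1_ENTRIES 128u /* entries in the Level 1 table */
#define DESC_BYTES 4u   /* bytes per table entry */
#define L2_SPAN_MB 32u  /* address space covered by one Level 2 table */

/* Address space (in MB) reachable when `pages` pages of `page_bytes`
 * each hold one Level 1 table plus as many Level 2 tables as fit. */
unsigned user_va_limit_mb(unsigned pages, unsigned page_bytes)
{
    unsigned table_bytes = L1_ENTRIES * DESC_BYTES; /* 512, same for L1 and L2 */
    unsigned tables = pages * page_bytes / table_bytes;
    unsigned l2_tables;

    if (tables == 0)
        return 0;
    l2_tables = tables - 1; /* one slot is taken by the Level 1 table */
    if (l2_tables > L1_ENTRIES)
        l2_tables = L1_ENTRIES; /* Level 1 can only point at 128 of them */
    return l2_tables * L2_SPAN_MB;
}
```

With one 4K page this gives 224MB; under the stated assumption, two pages would give 480MB.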

To give processes more address space, they need more memory available for a larger set of tables. NetBSD hasn't done this by default because the current allocation scheme cannot guarantee that anything more than 1 page will be physically contiguous in RAM. That matters because the MMU itself walks these tables using physical addresses, so they can't be logically contiguous but physically discontiguous. I'm still trying to understand the NetBSD UVM subsystem well enough to safely allocate more than 1 physically contiguous page.

However, I don't really see much downside to upping the number of pages allocated for the user tables. The tables are allocated from lowest to highest, so the only way the second page will be accessed is if the process tries to use more than 224MB of address space. If the pages are not physically contiguous, the MMU is likely looking at garbage and the process will crash. If they are physically contiguous, everything is great. The alternative is leaving things how they are, in which case the kernel just panics in this situation.

For my testing, I've upped the number of pages and haven't had a problem. Ultimately it would be good to figure out how to allocate more than one physically contiguous page. For now, this lets me test large memory use.

The next question is why the kernel panics when running out of address space rather than doing something more graceful, since this is just a user process after all. I initially attempted to change this behavior to kill the process instead of panicking. However, killing a process from the kernel is usually done by sending it a SIGILL or SIGSEGV, which then triggers a core dump. When the address space exhaustion occurred, UVM was in the middle of allocating a new page for the process and was holding several locks on the process' address mappings. The coredump code then tries to traverse the process' entire address space, grabbing those same locks along the way, which ultimately results in another kernel panic while trying to lock against ourselves. The obvious solution is to release the locks prior to killing the process this way. Unfortunately, those locks are taken several layers of abstraction up, with no clear way to find the locks, let alone tell which ones were taken, at the time of the problem.

RAM Controller Reconfiguration

The djMEMC memory controller manages 10 banks of RAM, with 2 banks per SIMM slot and 2 banks for the onboard soldered RAM. If your machine came with 4MB of soldered RAM, it's all in bank 0, and bank 1 is empty. As configured by the ROM, each bank has a maximum of 32MB. With 128MB SIMMs, which have 64MB per bank, only half of each bank is in use. The banks are then arranged to present one contiguous logical chunk of RAM to the system. So with 4MB of onboard RAM in bank 0, bank 1 will start immediately afterwards; there will be no gap. Bank 2 will start immediately after the last byte of physical RAM in bank 1, and so on. Although each bank is capable of having 64MB of space, each bank is sewn to the previous one to make the physical RAM contiguous on the system bus.
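The way the banks are packed back to back can be modeled with a running sum. This sketch only models the resulting address layout; the names are hypothetical and it says nothing about how the djMEMC registers are actually programmed.

```c
#define DJMEMC_BANKS 10 /* 2 soldered banks + 2 banks for each of 4 SIMM slots */

/* Given the configured size of each bank in MB, fill in the base address
 * (in MB) at which each bank starts, and return the total RAM in MB. */
unsigned bank_bases_mb(const unsigned size_mb[DJMEMC_BANKS],
                       unsigned base_mb[DJMEMC_BANKS])
{
    unsigned next = 0;

    for (int i = 0; i < DJMEMC_BANKS; i++) {
        base_mb[i] = next;  /* each bank starts where the previous one ended */
        next += size_mb[i];
    }
    return next;
}
```

For a machine with 8MB soldered (4MB in each of banks 0 and 1) and eight SIMM banks limited to 32MB each, this places bank 2 at 8MB, bank 3 at 40MB, and gives 264MB in total.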

When enabling 128MB SIMMs, you enable 64MB of RAM per bank. So when doing this after boot, you are essentially trying to insert an extra 32MB per bank in the middle of the address space. This will disrupt anything referencing RAM beyond bank 2 (banks 0 and 1 are soldered, so they don't change).
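To see which addresses move, the old and new layouts can be compared directly. The sketch below uses hypothetical names and assumes the existing contents of a bank stay at the same offset within that bank when it grows; that is a modeling assumption, not documented controller behavior.

```c
#include <stdint.h>

#define NBANKS 10

/* Map a physical address under the old bank sizes to the address the same
 * byte would have under the new bank sizes. Sizes are in MB. Returns
 * UINT32_MAX if the address is not inside any bank of the old layout. */
uint32_t remap_addr(uint32_t addr,
                    const uint32_t old_mb[NBANKS],
                    const uint32_t new_mb[NBANKS])
{
    uint32_t old_base = 0, new_base = 0;

    for (int i = 0; i < NBANKS; i++) {
        uint32_t old_len = old_mb[i] << 20;

        /* addr >= old_base holds here because earlier banks did not match. */
        if (addr - old_base < old_len)
            return new_base + (addr - old_base);
        old_base += old_len;
        new_base += new_mb[i] << 20;
    }
    return UINT32_MAX;
}
```

With 8MB soldered and the SIMM banks growing from 32MB to 64MB, addresses in banks 0 through 2 stay put, while the start of bank 3 moves from 40MB to 72MB and everything after it shifts further still.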

The djMEMC also supports interleaving the banks, so basically half of every word is stored in each bank, which speeds up memory accesses slightly. An interesting aside is that systems with 8MB of onboard RAM will be slightly faster than systems with 4MB of onboard RAM, because the 8MB is split between two banks and interleaved, whereas the 4MB systems just have 4MB in one bank with no interleaving possible (even if you add SIMMs). However, one consequence of interleaving is that you have to be very careful about touching the interleave settings while running from RAM. If interleaving changes while executing from RAM, the RAM will essentially be scrambled, and you've had the rug pulled out from under you. The ROM doesn't have this issue, since it is obviously executing from ROM.

All this makes attempts to reconfigure the memory controller while executing from the memory you're messing with a bit tricky.

Updated April 10 2013