If I understand correctly then the SA-1 can see the entire BW-RAM (mapping both 40-4F/60-6F spaces) but the SNES only sees half (the 40-4F portion)?
If so, it might explain why emulators have an issue where only 128KB of the possible 256KB of BW-RAM is usable. IIRC the SMW SA-1 patch author indicated that attempts to use the 128k-256k region in emulators just ended up as a mirror of the 0k-128k region, but I didn't read anything to indicate whether the author saw that issue from the SNES CPU bus perspective, the SA-1 perspective, or both.
I've read the official docs (book 2) and got the ROM banking figured out to get a full 8MB ROM map, but I'm still wrapping my head around the BW-RAM banking as it's more complicated. The SMW SA-1 authors said something about being able to map that area in bitwise patterns and stuff (unless I'm confusing the bitwise patterning stuff with the character conversion logic, since the author never clarified the difference).
I want the SNES CPU to have as much RAM available as possible, since I'm pretty much going to want it to live in WRAM and/or BW-RAM (meaning K and B are both in WRAM or BW-RAM) for extended periods.
My understanding is:
 1. The SNES CPU will slow down the SA-1 whenever they share buses
 2. The bulk of ROM accesses ought to be done by the SA-1 to utilize the character conversion logic (implying the majority of ROM graphics will be bitmaps versus characters)
 3. (1) and (2) suggest that other ROM data (such as audio) should be transferred by the SA-1 through a BW-RAM window set aside for the purpose
 4. Due to (1), whenever the SA-1 is going to transfer to the BW-RAM window the SNES CPU must have its home only in WRAM
 5. The SA-1 thread is 4x faster than the SNES CPU thread if the snag in (1) is avoided
 6. If the SNES CPU thread is also working (not perma-sleeping in WAIs) something between 4x-5x is possible
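To put rough numbers on the 4x and 4x-5x figures, here is a back-of-the-envelope sketch. It assumes the two CPUs never collide on a bus and takes the 4x ratio at face value. The function name and the "busy fraction" parameter are made up for illustration.

```python
def combined_speedup(sa1_ratio: float = 4.0, snes_busy_fraction: float = 0.0) -> float:
    """Total throughput relative to a lone SNES CPU, assuming zero bus conflicts.

    sa1_ratio          -- how much faster the SA-1 thread is (assumed 4x here)
    snes_busy_fraction -- share of time the SNES CPU does useful work
                          instead of sleeping in WAI (0.0 to 1.0)
    """
    if not 0.0 <= snes_busy_fraction <= 1.0:
        raise ValueError("snes_busy_fraction must be between 0.0 and 1.0")
    # The SA-1 contributes its full ratio; the SNES CPU adds its busy share.
    return sa1_ratio + snes_busy_fraction


for fraction in (0.0, 0.5, 1.0):
    print(fraction, combined_speedup(snes_busy_fraction=fraction))
```

Any bus sharing pulls the real figure below this, which is why the cooperative model matters.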
WRAM is slow and small, so the SNES CPU should not be taking residence there often (in this extreme case it probably makes sense to just JMP to a WAI opcode). I envision a system where the SNES CPU has a directing (script processing) role, but is commandeered by interrupts to access the hardware which only it has access to, such as the APU ports and PPU/DMA/HDMA registers. In this case the WRAM WAI sleep would cover the time the SA-1 spends pumping new code into BW-RAM for the SNES CPU (and/or DMA and/or HDMA) to operate on. You can think of it this way: the code window (script routine) has completed and the SNES CPU is WAIing for a new script to execute, which starts when the SA-1 has finished transferring the new script and wakes the SNES CPU up.
It has to be a highly cooperative model so that each processor avoids the other's buses in all but the worst cases (it may be possible that a huge piece of code that must be run by the SNES CPU would be more efficiently executed straight from ROM than as paged code chunks in BW-RAM, and then it would be the SA-1's turn for a catnap in WAIwai-land).
The format of the code window in BW-RAM might be set up like this: an optional prefix, then the code itself, then a suffix.
The prefix might be empty. If the SA-1 writes a prefix it would do stuff like set up A,X,Y,P,B,K,D,S correctly for the code block about to run. If a code block is just successive continuation from the previous one being written no prefix is necessary (it can be all code). When execution gets to the suffix section it initiates a request to the SA-1 for more data before becoming homeless and taking residence in a WRAM WAI shelter.
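Here is a rough sketch of how the SA-1 side might assemble one window image (prefix, code, suffix) before copying it into BW-RAM. The window size, the mailbox address and the WRAM wait-loop address are all made-up placeholders, and the prefix only loads D to keep it short. The opcode bytes are standard 65816 encodings.

```python
WINDOW_SIZE = 0x1000        # hypothetical: 4 KB code window in BW-RAM
MAILBOX = 0x40FFFF          # hypothetical: request byte the SA-1 watches
WRAM_WAIT_LOOP = 0x7E1F00   # hypothetical: WAI loop parked in WRAM


def long_addr(addr: int) -> bytes:
    """24-bit address in the little-endian order the 65816 expects."""
    return addr.to_bytes(3, "little")


def make_prefix(direct_page: int) -> bytes:
    """Optional prefix: this one only sets D; a real one might set more."""
    return bytes([
        0xC2, 0x30,                                          # REP #$30 (16-bit A/X/Y)
        0xA9, direct_page & 0xFF, (direct_page >> 8) & 0xFF, # LDA #direct_page
        0x5B,                                                # TCD
    ])


def make_suffix() -> bytes:
    """Suffix: flag a request for more code, then go sleep in WRAM."""
    return (
        bytes([0xE2, 0x20])                     # SEP #$20 (8-bit A)
        + bytes([0xA9, 0x01])                   # LDA #$01
        + bytes([0x8F]) + long_addr(MAILBOX)    # STA long MAILBOX
        + bytes([0x5C]) + long_addr(WRAM_WAIT_LOOP)  # JML WRAM_WAIT_LOOP
    )


def build_window(code: bytes, prefix: bytes = b"") -> bytes:
    """Concatenate prefix + code + suffix and check it fits the window."""
    image = prefix + code + make_suffix()
    if len(image) > WINDOW_SIZE:
        raise ValueError("code block does not fit in the window")
    return image


# First block gets a prefix; a straight continuation does not need one.
first = build_window(code=bytes([0xEA] * 4), prefix=make_prefix(0x0000))
continuation = build_window(code=bytes([0xEA] * 4))
print(len(first), len(continuation))
```

The four NOPs (`0xEA`) just stand in for real code. A mailbox byte is only one way to signal the SA-1; an IRQ would work too.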
This windowing algorithm basically implements a crude form of paging or allows the SNES CPU to run sequences of short routines.
Essentially a script processing thread...
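The hand-off sequence can be sketched as a toy simulation. It only models the order of events (who sleeps, who owns the BW-RAM window), not timing or real hardware. The function and the chunk names are invented for illustration.

```python
from collections import deque


def run_script(chunks):
    """Return the order of events for paging code chunks through one window."""
    pending = deque(chunks)  # chunks the SA-1 still has to page in
    log = []
    while pending:
        # SNES CPU is parked in WRAM, so the SA-1 has BW-RAM to itself.
        log.append("snes: WAI in WRAM")
        window = pending.popleft()
        log.append(f"sa1: copy {window} into BW-RAM window")
        # SA-1 wakes the SNES CPU, which now runs out of the window.
        log.append(f"snes: run {window}")
        # The suffix asks for more code and jumps back to WRAM.
        log.append("snes: suffix requests next chunk")
    log.append("snes: WAI in WRAM")  # nothing left to run
    return log


for event in run_script(["init_level", "spawn_enemies"]):
    print(event)
```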
Dang I wish MSU1 and SA-1 could be in the same cart and that all emulators supported it.
Did anyone also consider that the SA-1 gives you nice full 16-bit fixed point multiply/divide capability? Nice, more accurate math that's even faster than the SNES's own hardware MUL/DIV registers!
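As an illustration of what a 16-bit multiply buys you for fixed point, here is a sketch of an 8.8 multiply. It assumes the hardware hands back a signed 16x16 product that is 32 bits wide, which is my reading of the docs and not something verified on a cart. The helper names are made up.

```python
def to_s16(value: int) -> int:
    """Wrap an integer to a signed 16-bit value."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def mul_s16(a: int, b: int) -> int:
    """Signed 16x16 multiply; the product always fits in signed 32 bits."""
    return to_s16(a) * to_s16(b)


def fixed_mul_8_8(a: int, b: int) -> int:
    """Multiply two 8.8 fixed-point numbers, keeping an 8.8 result."""
    # The raw product is 16.16, so drop the low 8 fraction bits
    # and wrap back to 16 bits.
    return to_s16(mul_s16(a, b) >> 8)


# 1.5 is 0x0180 and 2.25 is 0x0240 in 8.8 format.
result = fixed_mul_8_8(0x0180, 0x0240)
print(hex(result & 0xFFFF), result / 256)
```

With only an 8x8 multiplier you would need four partial products and the adds to get the same result.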
The SA-1 gives a homebrew game (or intricate hack of an existing game such as SMW, FF6, ToP made to run 64mbit+SA-1) lots of nice goodies if its use can be mastered.
I really want to see some new life in this thread and maybe have these ideas expanded on.