Not sure how many of you saw my previous post about doubling the sprites, but we have now verified the technique on real hardware.
A bit of background:
The CPS1 controls where sprite RAM is located via the "OBJ RAM base" pointer at $800100. Normally, the CPS1 can show a maximum of 256 sprites per frame and per scanline. Since a sprite uses 4 words (8 bytes) to designate its position, tile id, and attributes, a full sprite buffer is 256 x 8 = $800 bytes, or 2 KB.
However, between the object and first scroll layer there is $8000 bytes available, or 16 times the space needed for a single buffer. Final Fight uses this extra RAM to double buffer the sprite data, alternating between buffers at $900000 and $904000. It does this because updating a single buffer results in artifacts on the real hardware: the outcome is unpredictable if sprite RAM is being updated at the same time it is being drawn.
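The numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the constant and function names are mine, and flipping buffers once per frame is my assumption about how the alternation works, not something taken from the game's code.

```python
# Sizes and addresses come from the text above; all names are illustrative.
SPRITES_PER_BUFFER = 256
WORDS_PER_SPRITE = 4
BYTES_PER_WORD = 2

# One full sprite buffer: 256 sprites * 4 words * 2 bytes.
BUFFER_SIZE = SPRITES_PER_BUFFER * WORDS_PER_SPRITE * BYTES_PER_WORD

# Space available between the object buffer and the first scroll layer.
OBJ_REGION_SIZE = 0x8000
BUFFERS_THAT_FIT = OBJ_REGION_SIZE // BUFFER_SIZE

# The two buffers Final Fight alternates between.
FF_BUFFERS = (0x900000, 0x904000)


def next_buffer(frame):
    """Pick a buffer for a frame number, assuming one flip per frame."""
    return FF_BUFFERS[frame % 2]


if __name__ == "__main__":
    print(hex(BUFFER_SIZE), BUFFERS_THAT_FIT)
    print([hex(next_buffer(f)) for f in range(4)])
```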
Final Fight's sprite drawing works on a priority system, with players coming first, then bosses, enemies, and so on. If the game detects that drawing an object would overflow the 256 sprite limit, it simply does not draw that object.
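To make the overflow rule concrete, here is a rough Python model of it. It is not Final Fight's actual code: the object names, sprite counts, and the function itself are made up for illustration, and I am assuming the list is already sorted by priority.

```python
SPRITE_LIMIT = 256  # per-frame limit described above


def build_sprite_list(objects, limit=SPRITE_LIMIT):
    """objects: list of (name, sprite_count), already in priority order.

    Returns (names_drawn, sprites_used). An object that would push the
    total past the limit is skipped entirely; later, smaller objects
    may still fit in the space that is left.
    """
    drawn = []
    used = 0
    for name, count in objects:
        if used + count > limit:
            continue  # would overflow the buffer: do not draw this object
        drawn.append(name)
        used += count
    return drawn, used


if __name__ == "__main__":
    # Hypothetical scene; the sprite counts are invented for the example.
    scene = [("player", 100), ("boss", 120), ("enemy", 40), ("item", 4)]
    print(build_sprite_list(scene))
```

With these made-up counts the enemy is dropped while the small item still fits, which is the same kind of behaviour as the flicker described here.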
In some of Final Fight's more action intensive scenes, some sprites will flicker or simply fail to draw. Due to the aforementioned priority system, items and other inanimate objects flicker first, but eventually enemies are affected too. Having too many enemies on screen results in the enemies towards the end of RAM overflowing, leaving only enough room for a few items, which each take up only a few sprites. By a rough estimate the overflow is only a few dozen extra sprites, so if the CPS1 can in fact draw 256 sprites per scanline, perhaps it would be possible to use two sprite buffers within a single frame and reduce or outright eliminate flickering.
We have entered uncharted territory. Information about the CPS1 is scarce, and only one-liners like "Simultaneously displayable: 256 (per scanline)" on Wikipedia gave any hope this might work. Looking at MAME's source code, it is clear that the only raster technique it supports is the rowscroll table that Street Fighter II uses to render a perspective effect on the ground. However, it turns out there is an additional interrupt available on the CPS1's 68k. Some games use it for input related reasons, but according to the MAME source it is not entirely known why. By hijacking MAME's hsync and vsync handling I was able to trigger this interrupt on a specific scanline and switch buffers. There was no apparent effect. Since MAME does not assume any hsync based raster effects could occur on the CPS1, its draw handling does not support them. After modifying MAME further I was finally able to generate a real hsync based effect, disabling the sprites on the second half of the screen.
Clearly this was just a prototype, and alas it might not have worked on the real hardware. I spent some time looking at schematics and planning my attack, soldered some lines to the hsync and vsync connections of my C board, and set out to try some hardware. Initial tests were simple. I disabled Final Fight's double buffering, hooked up my hsync generator, and manually clicked through the hsync lines with microswitches. I modified the code to switch which buffer was shown when the interrupt was generated, and after clicking to the correct scanline, sprites were disabled. Clicking more times re-enabled the sprites, and I decided it was time to really give it a go.
Hooking up the hsync/vsync lines to my interrupt generator resulted in an interesting but planned-for effect. Every half frame the game switched between a full buffer and an empty buffer, so only half the screen is drawn at a time. Since the human eye cannot perceive this and I do not have a camera fast enough to record individual scanlines, the apparent effect is transparency.
Here is some footage: https://www.youtube.com/watch?v=cxichPTzGAw
Clearly something good is happening, but how do we know it will work with multiple sprite buffers?
As mentioned earlier, Final Fight already uses two sprite buffers. Disabling one shows the artifact of memory being written as a sprite is drawn. This often appears as flickering on the sprite, but can also appear as the wrong sprite being shown, or a sprite being offset strangely. By using the interrupt generator to switch mid frame to the sprite buffer that is currently being written to, we can see this effect.
Notice the trashcan in the middle of the screen. It glitches ever so slightly because it is an item and has low priority when being drawn. However, in most cases the switching of buffers is virtually unnoticeable, meaning that switching sprite buffers mid frame is feasible. Since the sprite VRAM region is 16x the size of a single buffer, and we are currently using only two buffer regions, additional buffer room is available.
The final piece of the puzzle:
So this is all fine and well, but we still have an issue. Sorting sprites would be too much for a 12 MHz 68k, so we need an algorithm that lets us split sprites into buckets. By tracking the y position of all sprites drawn in an accumulator and dividing by the number of sprites shown, we can find the middle point of all sprites. Telling the interrupt generator that this is the line we wish to cause an interrupt on allows us to change buffers at the correct time. So now we need to create our sprite buffers. Using the middle point calculated from the last frame, we can leverage our knowledge that objects move around VERY little in the y position frame to frame to fill our buffers. Objects below that line go in one buffer, objects above it in another, and an object within one sprite's height of the mid-line goes in both. This has a computational complexity of O(n), as we are only doing a few additional comparisons and no additional looping. Very manageable for the 68k.
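Here is a rough Python sketch of that bucketing pass, just to show the logic in one place. It is not the 68k code: the function name, the tuple layout, and the 16 pixel sprite height are assumptions of mine, and I am assuming y grows downward so "above the line" means a smaller y.

```python
SPRITE_HEIGHT = 16  # assumed sprite height in pixels; illustrative only


def split_sprites(sprites, split_line, height=SPRITE_HEIGHT):
    """Split sprites into top and bottom buffers in a single pass.

    sprites:    list of (y, tile_id) tuples; y grows downward (assumed).
    split_line: mid-line computed from the previous frame.
    Returns (top_buffer, bottom_buffer, next_split_line).
    """
    top = []
    bottom = []
    y_sum = 0  # accumulator for this frame's y positions
    for sprite in sprites:
        y = sprite[0]
        y_sum += y
        if abs(y - split_line) < height:
            # Within one sprite height of the mid-line: goes in both.
            top.append(sprite)
            bottom.append(sprite)
        elif y < split_line:
            top.append(sprite)
        else:
            bottom.append(sprite)
    # Average y of this frame becomes the split line for the next frame.
    next_split = y_sum // len(sprites) if sprites else split_line
    return top, bottom, next_split


if __name__ == "__main__":
    frame = [(10, "a"), (100, "b"), (120, "c"), (200, "d")]
    print(split_sprites(frame, split_line=112))
```

Each sprite is visited once and costs a couple of comparisons plus one add, which is where the O(n) comes from. The returned split line would be handed to the interrupt generator for the following frame.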
So there it is, a technique to increase the sprite limit of the CPS1. It even has the potential to be done multiple times per frame to further increase the upper limit.