- implemented Bug_Fixes\Late_HDMA_Fix, which fixes the following 1-frame color corruption:
when it should look like:
So, there's a lot I can say about this bug. And it's going to get a bit technical.
You generally just experience it as a screen blink when first talking to an NPC merchant or when opening a treasure chest with money in it. It's driven me bonkers for years, as tracking it down was pretty tricky. How likely it is to happen depends on the emulator because it's very timing sensitive; I think SNES9X is where I saw it most and ZSNES least, with Mesen-S in between, where it's somewhat rare (though Mesen-S is the only one with robust enough debugging features to track it down). It's technically a vanilla code flaw, but various SoM hacks can make it more or less likely to occur. That screenshot is pretty much a worst-case scenario; usually it's a fair bit more subtle.
So, what causes it?
HDMA getting turned on mid-frame because stuff to do during vblank took too long.
SoM was designed so that if data uploads to video memory during vblank are running late, they can finish, and you'll just get a couple of black lines at the top of the screen for 1 frame, often unnoticeable due to overscan, that part of the screen already being black, etc. However, SoM does nothing to account for missing the HDMA initialization that only occurs at the start of scanline 0. If you look carefully at the first image compared to the second, you can see 1 missing scanline at the top of the first image.
SoM uses HDMA to change the first palette row just before 3/4ths the way down the screen so that textboxes and the HUD can be different colors from each other. In that space between the money display and the HUD, HDMA rewrites one color per scanline to replace colors 1 through 15 (color 0 is the background color and isn't changed). Due to how HDMA works, what it specifically does is write 0 to color 16 (unusable) and wait 127 scanlines, then write 0 to color 16 (unusable) and wait 32 scanlines, then write 1 color per scanline for the next 15 scanlines, then HDMA ends for that frame. Those first two writes don't do anything, but cause a specific number of scanlines to be skipped so that it can reach its intended start point for the 15 color write starting on scanline 158 (~10/16ths the way down the screen) and ending on scanline 173 (~11/16ths).
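The skip arithmetic is easy to check with a few lines of Python. This is only a simplified model: the table layout and names are made up for illustration (it is not SoM's actual table data), and it assumes scanlines are counted from 0 at the point HDMA initializes, so the exact start line can differ by one depending on the counting convention you use.

```python
# Simplified model of the HDMA table described above (hypothetical layout,
# not the game's real bytes). Each entry: (scanline_count, writes_every_line)
TABLE = [
    (127, False),  # dummy write to color 16, then wait 127 scanlines
    (32, False),   # dummy write to color 16, then wait 32 scanlines
    (15, True),    # one real color write per scanline (colors 1 through 15)
]

def color_write_lines(table):
    """Return the scanline offsets (from HDMA init) that get a real color write."""
    lines = []
    line = 0
    for count, every_line in table:
        if every_line:
            lines.extend(range(line, line + count))
        line += count
    return lines

lines = color_write_lines(TABLE)
print(len(lines), lines[0], lines[-1])  # 15 159 173
```

The two dummy entries exist purely to burn 127 + 32 = 159 scanlines before the 15 real writes begin.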
But if HDMA is turned on after scanline 0 has already occurred, it both misses that first "write and wait 127 scanlines" command AND it doesn't reinitialize its counters and pointers, so the data it's using is whatever the HDMA circuitry had at the end of the last frame's HDMA. And that stale data is what results in bonkers palette color replacements for 1 frame (the palette gets completely refreshed during the next vblank).
So what is that data that ends up causing bonkers colors?
Basically, what's left over gets interpreted as "write 1 color per scanline for the next 128 scanlines". The palette entry numbers to use (instead of a nice linear list of 1-15) are whatever 128 bytes start at memory address $7E05EF (00 00 22 51 E3 80 22 30 AB 00 05 21 AA 7F 74 96 and so on), and the colors to write to those palette entries are the first 256 bytes at the very beginning of RAM (frequently changing junk data, since the beginning of RAM is where a lot of temporary variables live). The result is seemingly random palette entries getting overwritten with basically random colors.
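To make that concrete, here is a rough Python sketch of how the stale data pairs up. The index bytes are the 16 quoted above; everything else is an assumption for illustration: the function name is made up, `fake_ram` is an invented stand-in for the junk at the start of RAM (the real contents change constantly), and the color words are assumed to be read low byte first.

```python
# The 16 index bytes quoted above from $7E05EF (the remaining 112 are not listed).
INDEX_BYTES = bytes.fromhex("00 00 22 51 E3 80 22 30 AB 00 05 21 AA 7F 74 96")

def stale_palette_writes(index_bytes, ram):
    """Pair each stale index byte with the next 2 bytes of RAM as a color word.

    Assumes 1 index byte and 2 color bytes are consumed per scanline,
    with the color stored low byte first.
    """
    writes = []
    for line, index in enumerate(index_bytes):
        lo, hi = ram[2 * line], ram[2 * line + 1]
        writes.append((line, index, lo | (hi << 8)))
    return writes

# Hypothetical junk data standing in for the temporary variables at the start of RAM.
fake_ram = bytes((i * 37 + 11) & 0xFF for i in range(32))

writes = stale_palette_writes(INDEX_BYTES, fake_ram)
for line, index, color in writes:
    print(f"line +{line}: palette entry {index:#04x} <- color {color:#06x}")
```

Note how the entry numbers jump all over the palette instead of walking through 1-15, which is why the damage looks random.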
So what was the fix?
When it's time to turn on HDMA at the end of SoM's NMI handler, carefully detect if we're beyond scanline 0, and if so, manually initialize the HDMA settings, subtract the current scanline number from 127, set the HDMA line counter to the result, and then turn on HDMA.
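The counter adjustment boils down to one subtraction. The sketch below is Python rather than the actual 65816 patch, and the function name and the error case are assumptions for illustration; in particular, what the real fix does if it is somehow already past scanline 126 isn't covered here, so this version simply refuses to guess.

```python
FIRST_WAIT = 127  # scanlines the first table entry normally waits

def late_hdma_line_counter(current_scanline, first_wait=FIRST_WAIT):
    """Line counter to load when the scanline-0 HDMA init was missed.

    Call this only when HDMA is being enabled after scanline 0 has started.
    """
    if not 0 < current_scanline < first_wait:
        # Hypothetical guard: outside the case this sketch models.
        raise ValueError("scanline outside the first wait window")
    return first_wait - current_scanline

# Enabled 3 scanlines late: 124 lines of the first wait are still left,
# so the rest of the table lines up exactly as it would in a normal frame.
print(late_hdma_line_counter(3))  # 124
```

With the shortened first wait, the 32-line wait and the 15 color writes land on the same scanlines as they do on a frame where HDMA started on time.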
I also worked in a bug fix for an edge case in vanilla code where HDMA wouldn't get turned on just after a text box closed, but where the palette hadn't been restored from text box colors back to HUD colors, which resulted in the HUD using text box colors for 1 or 2 frames (easy to miss depending on which text box frame you were using).
Edit: Of course there's a small oversight in Late_HDMA_Fix that makes the prologue log scene look wrong; it's fixed for next release.