Whoa, that's a lot to trace! Great job.
So if I'm reading this right, the patch skips over that code. Adding an A press -should- be as simple as splicing that code back in, once the right location is found.
And yes, I need to disassemble FastROM. I think part of the issue is DMA transfers overrunning V-blank, which I should probably start a separate topic for once I'm ready to address it.
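For anyone wanting to sanity-check a V-blank DMA budget, here's a rough back-of-the-envelope sketch in Python. The defaults are commonly quoted approximations (about 1364 master cycles per scanline, about 40 of those lost to DRAM refresh, 8 master cycles per DMA byte, roughly 38 V-blank lines on NTSC with 224 visible lines), not values measured from this game. Per-transfer setup overhead is ignored, and the function name is just something I made up for illustration.

```python
def vblank_dma_budget(vblank_lines=38, cycles_per_line=1364,
                      refresh_cycles=40, cycles_per_byte=8):
    """Estimate how many bytes DMA can move during one V-blank.

    All defaults are approximate, assumed figures for NTSC timing;
    adjust them for PAL or for overscan mode.
    """
    # Master cycles actually available to DMA on each scanline.
    usable_per_line = cycles_per_line - refresh_cycles
    # Total usable cycles across the whole V-blank period.
    total_cycles = vblank_lines * usable_per_line
    # Each transferred byte costs a fixed number of master cycles.
    return total_cycles // cycles_per_byte


def overrun_bytes(requested_bytes, **timing):
    """Return how many bytes a set of transfers exceeds the budget by (0 if it fits)."""
    return max(0, requested_bytes - vblank_dma_budget(**timing))


if __name__ == "__main__":
    print("approx. budget:", vblank_dma_budget(), "bytes")
    print("overrun for 8 KiB:", overrun_bytes(8 * 1024), "bytes")
```

If the total queued for one frame comes out above the estimate, the transfers would spill past V-blank, which is the kind of overrun I'm suspecting here.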
EDIT:
Looks like the start of that trace loads $09A3 and checks whether it's less than 2 (11 02 address), so that's somehow related.
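To spell out what I think that check does, here's a small Python model of it. It's only a sketch: I'm assuming an 8-bit load followed by an unsigned compare against 2 and a branch-if-less (the usual CMP then BCC pattern on the 65816), since I haven't confirmed the exact instructions in the trace. The function and the RAM buffer are hypothetical stand-ins.

```python
def value_is_below(ram, addr=0x09A3, operand=0x02):
    """Model 'load a byte, compare, branch if less' (assumed CMP + BCC).

    ram     -- a bytes-like stand-in for work RAM (hypothetical)
    addr    -- address being tested; $09A3 comes from the trace
    operand -- value compared against; 2 comes from the trace
    """
    a = ram[addr] & 0xFF      # 8-bit load into the accumulator (assumed width)
    carry = a >= operand      # CMP sets carry when A >= operand (unsigned)
    return not carry          # BCC branches when carry is clear, i.e. A < operand


if __name__ == "__main__":
    ram = bytearray(0x2000)   # fake RAM image, all zeroes
    for value in (0, 1, 2, 3):
        ram[0x09A3] = value
        print(value, "-> branch taken:", value_is_below(ram))
```

So under that assumption the branch would be taken for 0 and 1 and skipped for anything from 2 up.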