Moving the tangent conversation here, so as to not further derail redmagejoe's thread.
Wall of text warning! This is half responding to abw and half spitballing ideas.
*: FCEUX's CDL format leaves a couple of things to be desired
I actually hit a few stumbling blocks right out of the gate when thinking about how I was going to use the CDL for something more generalized. It's weird because it SEEMS like it should be enough to be able to produce a mostly-there disassembler, but yet it still kind of isn't.
Crystalis [snip] there's a STA $04 that crosses the bank boundary and screws everything up.
Little edge cases like this shouldn't be TOO terrible. In this situation I would say the disassembler should just split the instruction and insert it as binary ".byte" statements in separate banks, and leave it to the user to sort that detail out themselves. Maybe give a warning or something.
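To make the "split it into .byte statements" idea concrete, here is a rough sketch of what that check could look like. The function name, the return shape, and the 16 KB default bank size are all assumptions made up for illustration (a mapper with 8 KB banks would pass a different size).

```python
# Hypothetical helper, not taken from any existing disassembler.
BANK_SIZE = 0x4000  # assumed bank size; pass another value for other mappers


def split_at_bank_boundary(rom, offset, length, bank_size=BANK_SIZE):
    """Check an instruction of `length` bytes at ROM offset `offset`.

    Returns None if it fits inside its bank. Otherwise returns a tuple
    (line for this bank, line for the next bank, warning text).
    """
    bank_end = (offset // bank_size + 1) * bank_size
    if offset + length <= bank_end:
        return None  # fits in one bank, disassemble it normally

    head = rom[offset:bank_end]           # bytes that stay in this bank
    tail = rom[bank_end:offset + length]  # bytes that spill into the next one

    def as_bytes(chunk):
        return ".byte " + ", ".join("$%02X" % b for b in chunk)

    warning = "instruction at $%05X crosses a bank boundary" % offset
    return as_bytes(head), as_bytes(tail), warning
```

The warning string is what would end up in front of the user, so they know this spot needs a manual look.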
I don't think every problem can or should be solved by the disassembler. This is one I think firmly belongs in the user's hands.
ROM banks that get loaded into multiple RAM banks are also problematic.
This seems like a much bigger problem, but it might be solved with a better understanding and usage of ca65's segment system. TBH I don't fully understand it myself, and so I did something extremely generic and simple just to get FF1 up and running (1 bank = 1 segment), but you don't have to do it that way. I suspect there's a lot more flexibility there.
Assuming, of course, the parts of the PRG that are accessed in different slots don't overlap. If there is one label that needs to be $8000 when accessed from one area of code and $C000 when accessed from another.... I don't know HOW you'd solve that.
I'm not sure what the right way is to label things like LDA $8000,X in a $C000-$FFFF fixed bank where that legitimately refers to any of several banks potentially loaded into $8000.
My original plan for this was to defer it to the user. With FF1, the amount of crosstalk between banks was actually surprisingly small. In my original disassembly I think I just put Unk_8000 if I didn't know the bank and L05_8000 if I did. Then just did a Ctrl+F for "Unk" and manually replaced them with some simple code analysis until I covered them all.
But MMC1, and frankly FF1 in particular, is pretty simple. So that might not be a good solution for general usage.
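For reference, the Unk_8000 / L05_8000 naming scheme is trivial to generate. A minimal sketch, where the function name is made up and the two-digit hex bank number is just my guess at the format from the L05 example:

```python
def label_for(addr, bank=None):
    """Build a placeholder label for a CPU address.

    `bank` is the bank number if it is known, or None if it is not.
    """
    if bank is None:
        return "Unk_%04X" % addr          # bank unknown: easy to find with Ctrl+F
    return "L%02X_%04X" % (bank, addr)    # bank known: bank number goes in the name
```

Keeping the "Unk" prefix unique is the whole point: it turns the unresolved references into a searchable to-do list.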
Similarly, I'm not sure how to handle labelling bytes that get chopped up into a pointer, e.g. the high 3 bits select the bank and the low 5 bits select the N'th pointer in a pointer table in the selected bank, especially when there's extra math and conditional logic involved. Probably I'll just define those as a constant or something.
The final disassembly should probably look something like:
.byte (BankName << 5) | N
But automating that? Good luck. CDL isn't going to give you any insight into that.
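Just to pin down the arithmetic for the example layout described above (high 3 bits = bank, low 5 bits = pointer index, all in one byte), here is a small sketch. The function names are hypothetical, and a real game could easily use a different split.

```python
def pack(bank, n):
    """Combine a 3-bit bank number and a 5-bit pointer index into one byte."""
    if not (0 <= bank < 8 and 0 <= n < 32):
        raise ValueError("bank must fit in 3 bits and n in 5 bits")
    return (bank << 5) | n


def unpack(value):
    """Split a packed byte back into (bank, n)."""
    return value >> 5, value & 0x1F
```

The hard part is still recognizing that a given byte is packed this way in the first place, which this does nothing to solve.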
Sometimes code also gets logged as data
I would just default to code always when that bit is set. On my first look at the CDL format, I almost immediately decided I was going to ignore the Data bit entirely.
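A sketch of that "code wins" rule. The bit positions are my reading of the FCEUX CDL layout for PRG bytes (bit 0 = accessed as code, bit 1 = accessed as data), so double-check them against the FCEUX help file. The function name is made up.

```python
# Assumed FCEUX CDL flag bits for PRG-ROM bytes; verify against the FCEUX docs.
CDL_CODE = 0x01
CDL_DATA = 0x02


def classify(cdl_byte):
    """Classify one logged byte, letting the code bit override the data bit."""
    if cdl_byte & CDL_CODE:
        return "code"      # logged as code at least once: treat it as code
    if cdl_byte & CDL_DATA:
        return "data"
    return "unknown"       # never touched while logging
```

Note that this only looks at the code bit when it is set. The data bit only matters for bytes that were never executed.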
But as for code being located to RAM, I don't know if that even COULD be disassembled properly. At least not if there are any JMPs or JSRs to code in RAM. That seems like the biggest challenge of all the things listed so far. I think the user would have to separately disassemble that code and inject it manually into the source with a ".base" command or something.
As I type this, I find a bad trend of me saying "too hard? Pass it to the user!". That's not going to make for a very good tool. XD
Maybe a good medium is to create as much of the disassembly as possible, but make a log of things that were ambiguous and output that as well. That way the user has sort of a checklist of things they need to give a once-over. The things you mention are likely to happen in many games, but I can't imagine they happen very frequently in any one individual game.
Even a more standard problem like figuring out how to label JMP indirect -- which is something I actually do want the disassembler to handle -- just about every game does that, but they don't do it very often. I think FF1 only has like a half dozen indirect JMPs..... if that.
I really like knowing what's going on with the control flow when mucking about with ASM changes. Without that, it's too easy to look at a bunch of code and say "oh, I can make this little change and everything will be awesome" only to find out that some other code is branching into the middle of your changes and crashing the system.
It's a nice extra that would help with understanding the code, but as far as this above point -- eeehh. The label alone should suffice. If there are no other labels in the middle of the routine, you know nobody is jumping into it .. and if there's some code that IS jumping into it without a label, then your comments would have missed it anyway.
But frequently when commenting, one of the first things I'd ask myself is "where is this code being used", as that often gives some insight on what the code is doing. So it's still a really nice thing to have. =)
Do you have any idea what the breakdown of work for your FF1 disassembly was?
It was so long ago. I seem to remember one of the biggest early hurdles was figuring out how to set up ca65 reasonably (which I'm still not convinced I did very well). But the actual disassembler wasn't very difficult to make and didn't take very long.
BUT, it was custom made and I could manually tweak it for individual quirks of the game. The tool was very specifically a Final Fantasy 1 disassembler; it would not have been easy to generalize it. I kind of wish I still had the code for it, but it's long gone.
99.99% of the work, though, was analyzing the code and commenting it. Giving labels names, etc.
ANYWAY, today I was thinking about how I want to tackle this. And like you, I saw that CDL just wasn't enough. I wanted insight into putting labels in pointer tables and jump tables. And that's something CDL doesn't really give you any information about.
So my mind went back to the classic idea of "well why not just trace the code with a mini emulator and record any information you need along the way?"
I spent a little while kicking around ideas of how to do that.
My working plan so far:
- keep track of a minimal "state" when tracing the code that includes information about the contents of A, X, Y, as well as the last ~10 or so RAM accesses. The state includes:
  - Known state. If obtained from an immediate instruction, or an absolute read from ROM, we know for sure what this value is ("Fully known"). If we had a fully known value and then it got modified (but not destroyed) somehow -- like with an ADC to RAM, then it's "Partially known". Or if there's no way the state can be determined, it's "unknown".
  - Variable contents. If fully known, the actual value. If partially known, the last fully known value.
  - Offset. If fully/partially known, the ROM offset which this value came from.
  - Did it come from a LUT? LDA immediate has different implications than LDA Abs,X or LDA Ind,Y. The former means it might be a pointer that's hardcoded, and the latter means it's a pointer that might be in a lookup table. Keep track of whether or not the data was read with indexing.
As you run into STA/STX/STY, you just copy this state from A/X/Y and dump it into your "last 10 RAM locations" buffer. The idea is that the vast majority of the time, pointers are built immediately before they're dereferenced, so by tracking where the information came from, you'll know what code is referencing it when you come across some indirection.
I.e., when you hit "LDA ($10),Y", you can look to see if $10 and $11 are in your "recent RAM" buffer, and if so, you know exactly where the pointer came from, and thus can insert label names appropriately.
The reason for "partially" known is because ADC is commonly used as an alternative way to index. So if the partially known pointer is $9980 and it got some unknown RAM value added to it, the pointer table probably still starts at $9980, and that's still the label you'd want to use.
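Here is a very rough sketch of the state tracking described above, just for the A register. All of the class and method names are invented for illustration, indexed loads are left out, and a real tracer would need X/Y and a lot more opcodes.

```python
from collections import OrderedDict
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Known(Enum):
    FULL = "fully known"
    PARTIAL = "partially known"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Tracked:
    known: Known = Known.UNKNOWN
    value: Optional[int] = None       # actual value, or last fully known value
    rom_offset: Optional[int] = None  # ROM offset the value came from
    from_lut: bool = False            # True if it was read with indexing


class Tracer:
    RECENT = 10  # how many recent RAM writes to remember

    def __init__(self):
        self.a = Tracked()
        self.ram = OrderedDict()  # RAM address -> Tracked, oldest first

    def lda_imm(self, value, rom_offset):
        """LDA #imm: the value is fully known and hardcoded."""
        self.a = Tracked(Known.FULL, value, rom_offset, from_lut=False)

    def adc_unknown(self):
        """ADC with an unknown operand: keep the base value, downgrade it."""
        if self.a.known is not Known.UNKNOWN:
            self.a = replace(self.a, known=Known.PARTIAL)

    def sta(self, addr):
        """STA: copy A's state into the recent-RAM buffer."""
        self.ram.pop(addr, None)          # re-storing moves it to the newest slot
        self.ram[addr] = self.a
        while len(self.ram) > self.RECENT:
            self.ram.popitem(last=False)  # forget the oldest write

    def pointer_at(self, zp):
        """For something like LDA (zp),Y: return (address, lo, hi) or None."""
        lo, hi = self.ram.get(zp), self.ram.get(zp + 1)
        if lo is None or hi is None:
            return None
        if Known.UNKNOWN in (lo.known, hi.known):
            return None
        return (hi.value << 8) | lo.value, lo, hi


# Building the $9980 pointer from the example, with an ADC on the high byte.
# The ROM offsets here are made-up numbers.
t = Tracer()
t.lda_imm(0x80, 0x1C010)
t.sta(0x10)
t.lda_imm(0x99, 0x1C014)
t.adc_unknown()
t.sta(0x11)
addr, lo, hi = t.pointer_at(0x10)  # addr is 0x9980, hi is only partially known
```

The lo/hi entries carry the ROM offsets of the instructions that built the pointer, which is exactly where the label names would need to be inserted.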
The next issue is how to actually trace the code. First impression is just start at reset and fork at every branch. And maybe do several forks when you hit an indirect JMP. If you keep track of pointer tables as outlined above, you SHOULD be able to figure out where most jump tables are and find all the possible paths it can take (in typical cases, anyway).
But bankswitching complicates that. MOST bankswitches are done with known, immediate values. But every once in a while you'll get something that does calculations to determine what bank to swap to.
Interrupts are another problem. What if the vector points to a swappable region? What if the vector itself is IN a swappable region?
And even the basic premise isn't so straightforward. I figure if I hit an invalid opcode (or a BRK, probably), I can assume I'm no longer tracing code and can stop that fork. But how to end a fork apart from that? Code loops indefinitely, I have to stop at some point. It can't be as simple as stopping when you reach an offset I've already logged -- otherwise few forks will get beyond a simple JSR. Maybe stop when offset, current stack, and reg/ram state all match something done previously? That's a lot to keep track of for every instruction.
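To convince myself the "stop when the state matches" idea at least terminates, here is a toy sketch that keys the visited set on (offset, return stack) only. That is a deliberately smaller key than the full offset + stack + reg/RAM state mentioned above. The instruction model (a dict of abstract instruction indices, not real byte addresses), the depth cap, and all of the names are assumptions for illustration.

```python
MAX_DEPTH = 16  # assumed cap so recursive JSRs cannot grow the stack forever


def trace(program, entry):
    """Explore every fork of a toy program and return the set of reached pcs.

    program maps pc -> (kind, target), with kind one of
    "next", "branch", "jmp", "jsr", "rts", "stop".
    """
    seen = set()       # (pc, return stack) situations already explored
    reached = set()
    work = [(entry, ())]
    while work:
        pc, stack = work.pop()
        if (pc, stack) in seen or pc not in program:
            continue   # same situation as before, or we left known code
        seen.add((pc, stack))
        reached.add(pc)
        kind, target = program[pc]
        if kind == "next":
            work.append((pc + 1, stack))
        elif kind == "branch":
            work.append((pc + 1, stack))   # fork: branch not taken
            work.append((target, stack))   # fork: branch taken
        elif kind == "jmp":
            work.append((target, stack))
        elif kind == "jsr" and len(stack) < MAX_DEPTH:
            work.append((target, stack + (pc + 1,)))
        elif kind == "rts" and stack:
            work.append((stack[-1], stack[:-1]))
        # "stop" (invalid opcode or BRK) simply ends this fork
    return reached
```

Because the return stack is part of the key, a subroutine that was already traced from one caller still gets traced again from a second caller, so the trace makes it back to the code after the second JSR. Keying on the offset alone would stop dead at that point.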
It's pretty daunting.
But yeah that's kind of what I was doing today. Just mulling this over trying to figure out how to tackle it.