
Author Topic: NES CDL/"Smart" disassembler discussion  (Read 1306 times)

Disch

NES CDL/"Smart" disassembler discussion
« on: January 23, 2020, 01:00:37 am »
Moving the tangent conversation here, so as to not further derail redmagejoe's thread.

Wall of text warning!  This is half responding to abw and half spitballing ideas.


*: FCEUX's CDL format leaves a couple of things to be desired

I actually hit a few stumbling blocks right out of the gate when thinking about how I was going to use the CDL for something more generalized.  It's weird because it SEEMS like it should be enough to be able to produce a mostly-there disassembler, but yet it still kind of isn't.

Quote
Crystalis [snip] there's a STA $04 that crosses the bank boundary and screws everything up.

Little edge cases like this shouldn't be TOO terrible.  In this situation I would say the disassembler should just split the instruction and insert it as binary ".byte" statements in separate banks, and leave it to the user to sort that detail out themselves.  Maybe give a warning or something.

I don't think every problem can or should be solved by the disassembler.  This is one I think firmly belongs in the user's hands.
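
A minimal sketch of that split, assuming a flat PRG image (iNES header already stripped), 16 KB banks, and an instruction length supplied by the caller. `split_straddling_instruction` is a hypothetical helper name, not part of any existing tool:

```python
def split_straddling_instruction(prg, offset, length, bank_size=0x4000):
    """Return (head, tail) '.byte' lines if the instruction at `offset`
    crosses a bank boundary, or None if it fits inside one bank."""
    bank_end = (offset // bank_size + 1) * bank_size
    if offset + length <= bank_end:
        return None

    def as_bytes(chunk):
        return ".byte " + ", ".join(f"${b:02X}" for b in chunk)

    head = prg[offset:bank_end]           # stays at the end of this bank
    tail = prg[bank_end:offset + length]  # goes at the start of the next bank
    return as_bytes(head), as_bytes(tail)


# STA $04 (85 04) with the opcode in the last byte of the first bank.
prg = bytes(0x3FFF) + bytes([0x85, 0x04]) + bytes(0x3FFF)
print(split_straddling_instruction(prg, 0x3FFF, 2))
```

A real tool would probably also write a warning for each split so the user knows where to look.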

Quote
ROM banks that get loaded into multiple RAM banks are also problematic.

This seems like a much bigger problem, but it might be solved with a better understanding and usage of ca65's segment system.  TBH I don't fully understand it myself, and so I did something extremely generic and simple just to get FF1 up and running (1 bank = 1 segment), but you don't have to do it that way.  I suspect there's a lot more flexibility there.


Assuming, of course, the parts of the PRG that are accessed in different slots don't overlap.  If there is one label that needs to be $8000 when accessed from one area of code and $C000 when accessed from another.... I don't know HOW you'd solve that.

Quote
I'm not sure what the right way is to label things like LDA $8000,X in a $C000-$FFFF fixed bank where that legitimately refers to any of several banks potentially loaded into $8000.

My original plan for this was to defer it to the user.  With FF1, the amount of crosstalk between banks was actually surprisingly small.  In my original disassembly I think I just put Unk_8000 if I didn't know the bank and L05_8000 if I did.  Then just did a Ctrl+F for "Unk" and manually replaced them with some simple code analysis until I covered them all.

But MMC1, and frankly FF1 in particular, is pretty simple.  So that might not be a good solution for general usage.

Quote
Similarly, I'm not sure how to handle labelling bytes that get chopped up into a pointer, e.g. the high 3 bits select the bank and the low 5 bits select the N'th pointer in a pointer table in the selected bank, especially when there's extra math and conditional logic involved. Probably I'll just define those as a constant or something.

The final disassembly should probably look something like:
Code:
.word (BankName << 13) | N

But automating that?  Good luck.  CDL isn't going to give you any insight into that.
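
For reference, the byte layout abw describes (high 3 bits = bank, low 5 bits = table index) can be sketched as below. The function names are made up for illustration, and any extra math or conditional logic a particular game applies is not modelled:

```python
def unpack_pointer_byte(value):
    """Split a byte into (bank, index): high 3 bits and low 5 bits."""
    return (value >> 5) & 0x07, value & 0x1F


def pack_pointer_byte(bank, index):
    """Inverse of unpack_pointer_byte; rejects out-of-range fields."""
    if not (0 <= bank < 8 and 0 <= index < 32):
        raise ValueError("bank must fit in 3 bits and index in 5 bits")
    return (bank << 5) | index


bank, index = unpack_pointer_byte(0xA3)   # %101_00011
print(bank, index)
```

The `.word` expression above is the same idea applied to a 16-bit value, where a shift of 13 puts the bank in the top 3 bits.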

Quote
Sometimes code also gets logged as data

I would just default to code always when that bit is set.  My first look at the CDL format, I almost immediately decided I was going to ignore the Data bit entirely.
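
A rough sketch of that "code wins" rule. It assumes the PRG part of the CDL has one flag byte per ROM byte with bit 0 = accessed as code and bit 1 = accessed as data, which is my reading of the FCEUX format and worth checking against its documentation; the other bits are ignored here:

```python
from itertools import groupby

CDL_CODE = 0x01  # assumed: bit 0 = byte was executed as code
CDL_DATA = 0x02  # assumed: bit 1 = byte was read as data


def classify(flags):
    """Code takes priority whenever the code bit is set."""
    if flags & CDL_CODE:
        return "code"
    if flags & CDL_DATA:
        return "data"
    return "unknown"


def runs(cdl):
    """Collapse per-byte flags into (start, length, kind) runs."""
    out, pos = [], 0
    for kind, group in groupby(cdl, key=classify):
        n = len(list(group))
        out.append((pos, n, kind))
        pos += n
    return out


print(runs(bytes([0x01, 0x01, 0x03, 0x02, 0x00, 0x00])))
```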

But as for code being located to RAM, I don't know if that even COULD be disassembled properly.  At least not if there are any JMPs or JSRs to code in RAM.  That seems like the biggest challenge of all the things listed so far.  I think the user would have to separately disassemble that code and inject it manually into the source with a ".base" command or something.


As I type this, I find a bad trend of me saying "too hard?  Pass it to the user!".  That's not going to make for a very good tool.  XD

Maybe a good medium is to create as much of the disassembly as possible, but make a log of things that were ambiguous and output that as well.  That way the user has sort of a checklist of things they need to give a once-over.  The things you mention are likely to happen in many games, but I can't imagine they happen very frequently in any one individual game.

Even a more standard problem like figuring out how to label JMP indirect -- which is something I actually do want the disassembler to handle -- just about every game does that, but they don't do it very often.  I think FF1 only has like a half dozen indirect JMPs..... if that.


Quote
I really like knowing what's going on with the control flow when mucking about with ASM changes. Without that, it's too easy to look at a bunch of code and say "oh, I can make this little change and everything will be awesome" only to find out that some other code is branching into the middle of your changes and crashing the system :(.

It's a nice extra that would help with understanding the code, but as far as this above point -- eeehh.  The label alone should suffice.  If there are no other labels in the middle of the routine, you know nobody is jumping into it .. and if there's some code that IS jumping into it without a label, then your comments would have missed it anyway.

But frequently when commenting, one of the first things I'd ask myself is "where is this code being used", as that often gives some insight on what the code is doing.  So it's still a really nice thing to have.  =)


Quote
Do you have any idea what the breakdown of work for your FF1 disassembly was?

It was so long ago.  I seem to remember one of the biggest early hurdles was figuring out how to set up ca65 reasonably (which I'm still not convinced I did very well).  But the actual disassembler wasn't very difficult to make and didn't take very long.

BUT, it was custom made and I could manually tweak it for individual quirks of the game.  The tool was very specifically a Final Fantasy 1 disassembler, it would not have been easy to generalize it.  I kind of wish I still had the code for it, but it's long gone.

99.99% of the work, though, was analyzing the code and commenting it.  Giving labels names, etc.




ANYWAY, today I was thinking about how I want to tackle this.  And like you, I saw that CDL just wasn't enough.  I wanted insight into putting labels in pointer tables and jump tables.  And that's something CDL doesn't really give you any information about.

So my mind went back to the classic idea of "well why not just trace the code with a mini emulator and record any information you need along the way?"

I spent a little while kicking around ideas of how to do that.

My working plan so far:

- keep track of a minimal "state" when tracing the code that includes information about contents of A,X,Y, as well as the last ~10 or so RAM accesses.  The state includes:
  • Known state.  If obtained from an immediate instruction, or an absolute read from ROM, we know for sure what this value is ("Fully known").  If we had a fully known value and then it got modified (but not destroyed) somehow -- like with an ADC to RAM, then it's "Partially known".  Or if there's no way the state can be determined, it's "unknown".
  • Variable contents.  If fully known, the actual value.  If partially known, the last fully known value
  • Offset.  If fully/partially known, the ROM offset which this value came from
  • Did it come from a LUT?  LDA immediate has different implications than LDA Abs,X or LDA Ind,Y.  The former means it might be a pointer that's hardcoded, and the latter means it's a pointer that might be in a lookup table.  Keep track of whether or not the data was read with indexing.

As you run into STA/STX/STY, you just copy this state from A/X/Y and dump it into your "last 10 RAM locations" buffer.  The idea is that the vast majority of the time, pointers are built immediately before they're dereferenced, so by tracking where the information came from, you'll know what code is referencing it when you come across some indirection.

IE, when you hit "LDA ($10),Y", you can look to see if $10 and $11 are in your "recent RAM" buffer, and if so, you know exactly where the pointer came from, and thus can insert label names appropriately.


The reason for "partially" known is because ADC is commonly used as an alternative way to index.  So if the partially known pointer is $9980 and it got some unknown RAM value added to it, the pointer table probably still starts at $9980, and that's still the label you'd want to use.
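
To make the plan above concrete, here is a minimal sketch of the per-register state and the "recent RAM" buffer. Every class, field, and method name is hypothetical, only A is modelled, and only immediate loads, an ADC with an unknown operand, and STA are handled:

```python
from collections import OrderedDict
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Known(Enum):
    UNKNOWN = 0
    PARTIAL = 1
    FULL = 2


@dataclass(frozen=True)
class Value:
    known: Known = Known.UNKNOWN
    value: Optional[int] = None    # actual value, or last fully known value
    origin: Optional[int] = None   # ROM offset the value was read from
    from_lut: bool = False         # True if read with indexing (Abs,X / Ind,Y)


class TraceState:
    def __init__(self, ram_slots=10):
        self.a = Value()
        self.recent_ram = OrderedDict()   # address -> Value, oldest first
        self.ram_slots = ram_slots

    def lda_immediate(self, operand_offset, value):
        self.a = Value(Known.FULL, value, operand_offset, False)

    def adc_unknown(self):
        # Modified but not destroyed: keep the last fully known value.
        if self.a.known is not Known.UNKNOWN:
            self.a = Value(Known.PARTIAL, self.a.value,
                           self.a.origin, self.a.from_lut)

    def sta(self, addr):
        self.recent_ram.pop(addr, None)
        self.recent_ram[addr] = self.a
        while len(self.recent_ram) > self.ram_slots:
            self.recent_ram.popitem(last=False)   # forget the oldest write

    def pointer_at(self, zp):
        """Return (base_address, (lo_origin, hi_origin)) or None."""
        lo = self.recent_ram.get(zp)
        hi = self.recent_ram.get((zp + 1) & 0xFF)
        if lo is None or hi is None:
            return None
        if Known.UNKNOWN in (lo.known, hi.known):
            return None
        return (hi.value << 8) | lo.value, (lo.origin, hi.origin)


state = TraceState()
state.lda_immediate(0x0011, 0x80)   # LDA #$80 / STA $10
state.sta(0x10)
state.lda_immediate(0x0015, 0x99)   # LDA #$99 / ADC <unknown> / STA $11
state.adc_unknown()
state.sta(0x11)
print(state.pointer_at(0x10))       # what LDA ($10),Y would look up
```

The two origins returned by `pointer_at` are the operand offsets that would get rewritten to `#<label` and `#>label`.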



The next issue is how to actually trace the code.  First impression is just start at reset and fork at every branch.  And maybe do several forks when you hit an indirect JMP.  If you keep track of pointer tables as outlined above, you SHOULD be able to figure out where most jump tables are and find all the possible paths it can take (in typical cases, anyway).

But bankswitching complicates that.  MOST bankswitches are done with known, immediate values.  But every once in a while you'll get something that does calculations to determine what bank to swap to.

Interrupts are another problem.  What if the vector points to a swappable region?  What if the vector itself is IN a swappable region?

And even the basic premise isn't so straightforward.  I figure if I hit an invalid opcode (or a BRK, probably), I can assume I'm no longer tracing code and can stop that fork.  But how to end a fork apart from that?  Code loops indefinitely, I have to stop at some point.  It can't be as simple as stopping when you reach an offset I've already logged -- otherwise few forks will get beyond a simple JSR.  Maybe stop when offset, current stack, and reg/ram state all match something done previously?  That's a lot to keep track of for every instruction.

It's pretty daunting.
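
As a starting point, the fork-and-stop idea can be sketched with a worklist keyed on (address, return stack). This is a toy under heavy assumptions: a single fixed bank, a five-opcode table plus the relative branches, no register/RAM state in the key, and nothing is discarded when a fork dies. All names are illustrative:

```python
BRANCHES = {0x10, 0x30, 0x50, 0x70, 0x90, 0xB0, 0xD0, 0xF0}
JMP_ABS, JSR, RTS = 0x4C, 0x20, 0x60
# Illustrative subset: NOP, LDA #imm, JMP abs, JSR, RTS.
LENGTHS = {0xEA: 1, 0xA9: 2, JMP_ABS: 3, JSR: 3, RTS: 1}
MAX_STACK = 32  # arbitrary cap so recursion cannot grow the key forever


def trace(mem, entry, base=0x8000):
    """Return the set of instruction addresses reachable from `entry`."""
    seen, code = set(), set()
    work = [(entry, ())]                 # (pc, tuple of return addresses)
    while work:
        pc, stack = work.pop()
        while (pc, stack) not in seen:
            seen.add((pc, stack))
            idx = pc - base
            if not 0 <= idx < len(mem):
                break
            op = mem[idx]
            length = 2 if op in BRANCHES else LENGTHS.get(op)
            if length is None or idx + length > len(mem):
                break                    # BRK or unknown opcode: end this fork
            code.add(pc)
            if op in BRANCHES:
                rel = mem[idx + 1]
                rel -= 0x100 if rel >= 0x80 else 0
                work.append((pc + 2 + rel, stack))   # fork: branch taken
                pc += 2                              # this fork falls through
            elif op == JMP_ABS:
                pc = mem[idx + 1] | (mem[idx + 2] << 8)
            elif op == JSR:
                if len(stack) >= MAX_STACK:
                    break
                stack += (pc + 3,)       # store the resume address directly
                pc = mem[idx + 1] | (mem[idx + 2] << 8)
            elif op == RTS:
                if not stack:
                    break
                pc, stack = stack[-1], stack[:-1]
            else:
                pc += length
    return code


program = bytes([
    0xA9, 0x00,        # 8000: LDA #$00
    0xF0, 0x02,        # 8002: BEQ $8006
    0xEA,              # 8004: NOP
    0x60,              # 8005: RTS
    0x20, 0x04, 0x80,  # 8006: JSR $8004
    0x4C, 0x00, 0x80,  # 8009: JMP $8000
])
print(sorted(hex(a) for a in trace(program, 0x8000)))
```

Because the return stack is part of the key, the JSR into $8004 is still followed even though $8004 was already visited by the fall-through path, while the JMP back to $8000 ends the fork.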


But yeah that's kind of what I was doing today.  Just mulling this over trying to figure out how to tackle it.

never-obsolete

Re: NES CDL/"Smart" disassembler discussion
« Reply #1 on: January 23, 2020, 09:42:41 pm »
I've been working on a smart disassembler and ran into many of the same issues you two were having with FCEUX's CDL files. I ended up modifying Mesen to log more information. My disassembler targets asm6 and makes two passes over a ROM. The first builds a list of labels and the second actually spits out the assembly file.


Quote
Crystalis [snip] there's a STA $04 that crosses the bank boundary and screws everything up.
One of the files that is passed to the disassembler is a text file that lays out how the banks should be treated. You could have something like:

8 8 16 8 8 8 16 8 ...

If no file is passed in, then default to whatever the mapper defaults to and do as Disch suggested with the .byte directive.
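
A small sketch of reading such a layout line, assuming each number is a bank size in KB and the banks are laid out back to back from the start of PRG. Both assumptions and the function name are mine, not a description of the actual tool:

```python
def parse_bank_layout(text, prg_size):
    """Turn '8 8 16 ...' into a list of (prg_offset, size_in_bytes)."""
    banks, offset = [], 0
    for token in text.split():
        size = int(token) * 1024
        banks.append((offset, size))
        offset += size
    if offset != prg_size:
        raise ValueError(f"layout covers {offset} bytes, PRG is {prg_size}")
    return banks


print(parse_bank_layout("8 8 16", 32 * 1024))
```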


Quote
ROM banks that get loaded into multiple RAM banks are also problematic.
I've been testing with Kirby's Adventure and it does this with data. Luckily it doesn't try to access the data from mirrors so I've been able to get away with using .base to manipulate the address.

Code:
.org $8000
; data accessed from $8000-$9FFF

.base ($ & $1FFF) | $A000
; data accessed from $A000-$BFFF

.base ($ & $1FFF) | $8000
; back to $8000-$9FFF
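
The arithmetic in those .base expressions keeps the offset within the 8 KB bank and swaps in a different window address. A quick Python check of the same expression, with `rebase` as a made-up name:

```python
def rebase(pc, window_base):
    """Keep the low 13 bits (offset inside an 8 KB bank), change the window."""
    return (pc & 0x1FFF) | window_base


print(hex(rebase(0x8123, 0xA000)))
print(hex(rebase(0xA123, 0x8000)))
```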

Quote
I'm not sure what the right way is to label things like LDA $8000,X in a $C000-$FFFF fixed bank where that legitimately refers to any of several banks potentially loaded into $8000.
I have it so that anytime Mesen logs something as code, it also logs what banks are mapped into $8000, $A000, $C000, $E000. These are used on the first pass of the disassembler when it is building the label list. You end up with something like LDA label_0C_8000, X in the code and label_0C_8000: when bank $0C address $8000 is reached.
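
A sketch of how such a label could be derived on the first pass, assuming four 8 KB slots and a logged tuple of the banks mapped at $8000/$A000/$C000/$E000 when the instruction ran. The function name and the tuple format are illustrative:

```python
def label_for_operand(addr, mapped_banks):
    """Build 'label_BB_AAAA' from a CPU address and the logged bank mapping."""
    if not 0x8000 <= addr <= 0xFFFF:
        return None                       # not in PRG space
    slot = (addr - 0x8000) >> 13          # 0..3, one per 8 KB window
    return f"label_{mapped_banks[slot]:02X}_{addr:04X}"


print(label_for_operand(0x8000, (0x0C, 0x0D, 0x3E, 0x3F)))
```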


I haven't yet tried to tackle code executed from ram, indirect jumps, or pointer/jump tables.

For pointers to data, I was thinking of logging the value of Y when indirection is used to find the base address. Then try and find a matching entry in an address table. It gets difficult because tables can be linear, split, have math involved, or whatever and that complicates everything.
« Last Edit: January 23, 2020, 10:01:47 pm by never-obsolete »

Disch

Re: NES CDL/"Smart" disassembler discussion
« Reply #2 on: January 23, 2020, 10:13:41 pm »
Quote
One of the files that is passed to the disassembler is a text file that lays out how the banks should be treated.

This is a really good idea.  I think more detailed input here is key.

For my code-tracer approach, I started thinking about how the tracer is very likely going to miss chunks of code, and how the user might want to have a way to refine it or give it some guidance.  That way the user can create a disassembly, and when they spot some missed portions, the user can tweak a few settings and run it again to fill the gaps. 

Quote
I've been testing with Kirby's Adventure and it does this with data. Luckily it doesn't try to access the data from mirrors so I've been able to get away with using .base to manipulate the address.

Is that the actual output of the disassembler?  Or did you tweak that manually?  That's pretty impressive.

abw

Re: NES CDL/"Smart" disassembler discussion
« Reply #3 on: January 24, 2020, 12:11:01 am »
Quote
I actually hit a few stumbling blocks right out of the gate when thinking about how I was going to use the CDL for something more generalized.  It's weird because it SEEMS like it should be enough to be able to produce a mostly-there disassembler, but yet it still kind of isn't.
That's a pretty great way of putting it :P. It does let you get a fair way, but definitely not all the way.

Quote
Little edge cases like this shouldn't be TOO terrible.  In this situation I would say the disassembler should just split the instruction and insert it as binary ".byte" statements in separate banks, and leave it to the user to sort that detail out themselves.  Maybe give a warning or something.

I don't think every problem can or should be solved by the disassembler.  This is one I think firmly belongs in the user's hands.
When the ROM banks happen to be in the same order they'll get loaded into RAM, that can work. If it had been non-sequential banks, though... one idea I considered was attempting multiple disassemblies of each bank with different starting offsets, e.g. starting from bytes 0, 1, and 2 to cover carry-over from 1-, 2-, and 3-byte ops, and then disqualifying disassemblies that produced invalid opcodes or conflicted with the CDL. This was about the point where I started wishing the CDL included a bit for whether the code byte it logged was the opcode or an operand.

Quote
Assuming, of course, the parts of the PRG that are accessed in different slots don't overlap.  If there is one label that needs to be $8000 when accessed from one area of code and $C000 when accessed from another.... I don't know HOW you'd solve that.
Adjusting the label value based on the bank might not be so bad: you could have both label and label + $4000 point to the same line. But even before that, ROM banks that appear in multiple RAM banks make calls to $8000 ambiguous - they could be intra-bank calls if the bank is loaded in $8000, but calls to some other bank if the bank is loaded into $C000, so it's not clear which line the label(s) should go on in the first place.

Quote
The final disassembly should probably look something like:
Code:
.word (BankName << 13) | N
That's actually what I was looking for - clearly I haven't spent enough time with the ca65 manual yet :P.

Quote
My first look at the CDL format, I almost immediately decided I was going to ignore the Data bit entirely.
One thing the Data bit is good for is identifying data intermingled with code, e.g. when you have a JSR/BRK followed by some amount of data and the called routine manipulates the stack to read the data then updates the return address to skip over it. Another thing it's good for is confirming that bytes are not unused; I've seen some games with long enough runs of $00 that I would have assumed the space was unused, but the CDL says otherwise.

Quote
As I type this, I find a bad trend of me saying "too hard?  Pass it to the user!".  That's not going to make for a very good tool.  XD
I'm also doing that a lot, but it's not so bad when the user is me :P.

Quote
Maybe a good medium is to create as much of the disassembly as possible, but make a log of things that were ambiguous and output that as well.
One of the things I am logging is control flow operators that target bytes that aren't logged as code. That's helpful for filling in sections of the CDL that never got used (e.g. when the code for capping your gold at the maximum amount never executed because you spent the entire game being dirt poor). Investigating those can also lead to discovering unused code, which can sometimes be fun all on its own.

Quote
Even a more standard problem like figuring out how to label JMP indirect
I'm not 100% sure what you mean here. Assuming you have something (a constant?) you can use like a label for the (almost certainly) RAM address containing the address to jump to, you should be able to output "JMP (constant)" easily. For indirect stuff, I mentioned having transitive comments on labels; here's an example from FF2 showing the comment on data at $9100 being applied to its pointer (label in the reassemblable version) at $9C45/$9C46 and percolating up to the code that reads the pointer at $9C13 (I guess I should probably also apply it to $9C18; like I said, it's WIP):
Code:
; Minwu's initial stats
; indirect data load target (via $9C45)
0x001110|$00:$9100:03 00 E9 F6 CC FF FF FF AF 00 AF 00 67 00 67 00
0x001120|$00:$9110:0A 14 14 10 30 28 01 50 0A 00 85 00 41 00 2B 19
0x001130|$00:$9120:0A 14 14 10 30 28 00 00 00 05 01 14 02 41 00 06
0x001140|$00:$9130:D4 D5 D6 D7 D8 D9 DA DB DC DD DF E0 E1 E3 E4 E6
...
0x001C23|$00:$9C13:BD 45 9C LDA $9C45,X ; -> $00:$9100: Minwu's initial stats
0x001C26|$00:$9C16:85 80    STA $80   
0x001C28|$00:$9C18:BD 46 9C LDA $9C46,X
0x001C2B|$00:$9C1B:85 81    STA $81   
0x001C2D|$00:$9C1D:A0 3F    LDY #$3F   
; control flow target (from $9C25)
0x001C2F|$00:$9C1F:B1 80    LDA ($80),Y
...
; indexed data load target (from $9C13)
0x001C55|$00:$9C45:00
; indexed data load target (from $9C18)
0x001C56|$00:$9C46:   91 ; $00:$9100; Minwu's initial stats

Quote
It's a nice extra that would help with understanding the code, but as far as this above point -- eeehh.  The label alone should suffice.  If there are no other labels in the middle of the routine, you know nobody is jumping into it .. and if there's some code that IS jumping into it without a label, then your comments would have missed it anyway.

But frequently when commenting, one of the first things I'd ask myself is "where is this code being used", as that often gives some insight on what the code is doing.  So it's still a really nice thing to have.  =)
I started out by just adding an indicator that an address was the target of a control flow operator, and for simple cases or when everything is labelled, then sure, you can just search for the label. But I found myself wanting to be able to immediately see from where and from how many places something was referenced, and having added that, I find myself using it frequently. One use case is medium-to-large routines with lots of internal branching; if you know the start and end addresses of the routine and can see all the addresses calling each control flow target, you can quickly determine whether the routine is self-contained or shares code with other routines elsewhere. Reaching the same conclusion without a handy list of callers is still possible, but takes longer and is more annoying.

Some other nice-to-haves on my wish list:
  • Correctly handling labels that target the middle of multi-byte ops (e.g. branching into the A9 of BIT $00A9 to get LDA #$00)
  • Adjusting labels with built-in offsets, e.g. turning LDA $8000,X and LDA $8001,X into LDA label,X and LDA label+1,X when you know $8000 and $8001 are linked, or turning LDA label,X into LDA label-192,X when you know X starts counting at #$C0.
  • Programmatically determining the upper bound on data tables (indirect or indexed).

Quote
So my mind went back to the classic idea of "well why not just trace the code with a mini emulator and record any information you need along the way?"
I thought a bit about tracing too, and it is a sweet idea, but the more I thought about it, the longer the list of pitfalls grew, and I quickly decided that was a can of worms I did not want to open. Your previous emulator experience would help, but I'm not sure even that would be enough.
Just to poke a couple of holes:
  • Forking at every branch sounds fun, except sometimes branches are really not conditional (e.g. SEC BCS), and sometimes proving they aren't conditional takes a lot of work and/or higher reasoning.
  • Finding the upper bound on jump tables might not be obvious, and sometimes jump tables contain dead values, e.g. code checks for $0000 and decides not to jump.
  • You'll also need to know what mode the mapper is currently configured for, e.g. when the number, size, or position of swappable regions is dynamic.
  • What if the vector points to RAM?
  • Stack trickery means encountering invalid opcodes isn't necessarily the end of a path, and if you do hit an invalid opcode that isn't part of a path, you've already gone too far and some of your previous work is wrong.

Disch

Re: NES CDL/"Smart" disassembler discussion
« Reply #4 on: January 24, 2020, 02:32:09 am »
Quote
One thing the Data bit is good for is identifying data intermingled with code, e.g. when you have a JSR/BRK followed by some amount of data and the called routine manipulates the stack to read the data then updates the return address to skip over it.

But that data wouldn't be marked as code.  So the data bit being set doesn't really tell you anything that the code bit being clear doesn't.

Tracking unused space and anti-hacking/CRC self analysis stuff is really the only practical thing I can see the data bit being used for.  But I'm not particularly interested in either for my project.  With a complete disassembly that has been appropriately labelled, there is little to no confusion about what is or isn't free space.

Quote
I'm not 100% sure what you mean here. [with indirect JMP]

What I mean is, I want my disassembler to be smart/aware enough to be able to produce this kind of output with no manual intervention from the user:

Code:
    LDA #<JumpTable
    STA $10
    LDA #>JumpTable
    STA $11
    LDA ($10),Y
    STA $80
    INY
    LDA ($10),Y
    STA $81
    JMP ($0080)

JumpTable:
    .word LabelOne, LabelTwo, LabelThree

Note that not only is the jump table labeled, despite being accessed indirectly, but also the CONTENTS of the jump table are also labelled (of course the label names would not be so clear, they'd just be numeric).

My current plan of attack involving tracking a minimal state during the trace WOULD be able to handle this [admittedly somewhat trivial] scenario.

- contents of A are fully known because of the immediate load

- Therefore both $10 and $11 are fully known

- Therefore when you do the Indirect Y read, both bytes of the pointer are fully known -- meaning you know to put a label at that address, and the two preceding LDAs are directly referencing it and those LDAs can be using the label name.

- After the Ind,Y reads, contents of A are no longer known, but we know they were done with an indexed read from a LUT starting at JumpTable.

- That knowledge transfers to $80,$81 after the STAs

- Therefore when you hit the indirect JMP, you examine what you know about $80,$81 -- and while we don't know its exact value, we know the values came from a lookup table starting at 'JumpTable'.  Therefore, once we reach JumpTable, we know to start labelling it appropriately.


I'm glossing over some of the finer implementation details, but that's how I'm envisioning it.  It seems entirely plausible to do.



Quote
Correctly handling labels that target the middle of multi-byte ops (e.g. branching into the A9 of BIT $00A9 to get LDA #$00)

I know SMB does this, and I vomit in my mouth a little bit every time I think about it.  I know they were cramped for space but this just seems so vile to me.  XD

Quote
Adjusting labels with built-in offsets, e.g. turning LDA $8000,X and LDA $8001,X into LDA label,X and LDA label+1,X when you know $8000 and $8001 are linked, or turning LDA label,X into LDA label-192,X when you know X starts counting at #$C0.

I've also considered this.  The former should be easy to do with my current setup, but I'm not quite so sure about the latter.

Quote
Forking at every branch sounds fun, except sometimes branches are really not conditional (e.g. SEC BCS), and sometimes proving they aren't conditional takes a lot of work and/or higher reasoning.

Well this is [usually] trivial when you keep track of a minimal state.  As long as the flag was set with an instruction that has a guaranteed value (ie:  immediate read or SEC/CLC or whatever), you'll be able to tell whether or not this is the case.

And even in a case like a BCC following a BCS ... you know the BCC is nonconditional because when you fork the tracer, clearly one path has C known to be set and the other has it known to be clear.
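
A tiny sketch of that flag reasoning, limited to the carry flag and BCC/BCS. `carry` is True, False, or None for unknown, and the function name is hypothetical:

```python
BCC, BCS = 0x90, 0xB0


def fork_on_carry(opcode, carry):
    """Return the paths to follow as (path, carry_on_that_path) tuples."""
    if opcode not in (BCC, BCS):
        raise ValueError("only BCC/BCS are handled in this sketch")
    taken_when = (opcode == BCS)   # carry value that takes the branch
    if carry is None:
        # Unknown flag: fork, and each fork now knows the flag's value.
        return [("taken", taken_when), ("fallthrough", not taken_when)]
    if carry == taken_when:
        return [("taken", carry)]
    return [("fallthrough", carry)]


print(fork_on_carry(BCS, True))    # SEC / BCS: only one path
print(fork_on_carry(BCS, None))    # unknown carry: two forks
print(fork_on_carry(BCC, False))   # BCC on the BCS fall-through fork
```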

Quote
Finding the upper bound on jump tables might not be obvious, and sometimes jump tables contain dead values, e.g. code checks for $0000 and decides not to jump.

I see this as more of a theoretical problem than a real one.  A safe assumption is to end the jump table when you've reached something you've traced already (whether it be another code block, or another data table that you've labelled).  Yes, that won't catch every conceivable case, but it'll catch like 99% of them.
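
That stopping rule might look like the sketch below: read 16-bit little-endian entries until the next entry would overlap an address that is already known to start something else. The names and the flat `base` mapping are assumptions for illustration:

```python
def read_jump_table(rom, start, known_starts, base=0x8000):
    """Collect table entries from `start` until reaching a known address."""
    entries, addr = [], start
    while addr + 1 - base < len(rom):
        if addr != start and (addr in known_starts or addr + 1 in known_starts):
            break                      # ran into traced code or a labelled table
        lo, hi = rom[addr - base], rom[addr - base + 1]
        entries.append(lo | (hi << 8))
        addr += 2
    return entries


rom = bytes([0x06, 0x80, 0x07, 0x80, 0x08, 0x80,   # 8000: three pointers
             0xEA, 0xEA, 0x60])                    # 8006: already-traced code
print([hex(e) for e in read_jump_table(rom, 0x8000, {0x8006})])
```

In a real tracer each new entry would itself be queued for tracing, which can add to `known_starts` and shorten the table further.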

My goal here is not absolute perfection -- I don't think a perfect tracer is possible.  I'm shooting for "does as good a job as is reasonable".

Quote
You'll also need to know what mode the mapper is currently configured for, e.g. when the number, size, or position of swappable regions is dynamic.

In MOST cases, mapper state will be known at all times because the tracer can easily identify and handle mapper reg writes.  I can't imagine mode changes are done computationally very often.  In most games I've seen, the game sets the configuration stuff with an immediate value at bootup and then never changes it.

Bankswitching, likewise, is usually done with immediate values as well... at least for code jumps (which is what I'm interested in anyway).  Bankswaps are frequently done computationally, but usually only when there's a large chunk of data that spans several banks.  And in that case I probably don't have to care too much about which bank it's actually swapping to.

Where this COULD be a problem is with something you mentioned earlier about having the bank number as an extra byte in a 3-byte jump table.  But I don't think that's terribly common on the NES.

The BIG part where this would be a problem is when a game does a cross bank jump, stashing the current bank in ram somewhere, to have it be read back and swapped back to at a later time.  This seems like it would be much more common, and I don't really have a good solution for automating that, other than maybe forking on JSRs similar to how I fork on branches.  But that's a can of worms I still need to think about more.


Quote
What if the vector points to RAM?

SOL.

I do not see any way, either by CDL style tracking or by automated code tracing, that any code being relocated to RAM can be properly disassembled.  If you have ideas I'd love to hear them.


I'm probably going to fall back to user intervention on this one.  Give the user some way to specify portions that it knows are code, so any chunks that the tracer missed on its first run can be caught on a rerun.

Another problem here is what if the vector points to a swappable bank?  Or if the vector itself is in a swappable bank?  Or worse... both?  Do I run all possible combinations?

Quote
Stack trickery means encountering invalid opcodes isn't necessarily the end of a path,

Well if I'm keeping some basic information about the state while tracing, I know MOST of what is on the stack (JSR stuff I know, but PHA/PHP I probably don't).  And so simple stuff to tweak the stack to RTS somewhere different can probably be handled by the tracer, as long as it's nothing super crazy.

Final Fantasy 1 also does a weird thing where it will JSR to an infinite loop, because it knows when an NMI fires, the NMI handler will pop the interrupt info off the stack and RTS out of the interrupt, thus escaping the infinite loop.  This would definitely stump the tracer as I currently envision it -- which is pushing me more towards the "fork on JSR" idea.

Quote
and if you do hit an invalid opcode that isn't part of a path, you've already gone too far and some of your previous work is wrong.

I expect this to happen frequently -- and actually by design.  Particularly with just trying seemingly random pointers when trying to explore jump tables, I know most of them are going to go nowhere.  So I'm actually kind of relying on invalid opcodes tipping me off that I'm on a bad path, so I know I can end that path and discard all the data I accumulated while on it.





This is a fun conversation!!!  =D

never-obsolete

Re: NES CDL/"Smart" disassembler discussion
« Reply #5 on: January 24, 2020, 07:01:48 am »
Quote
Is that the actual output of the disassembler?  Or did you tweak that manually?  That's pretty impressive.
It's from the output, but with the data parts removed so it doesn't take up half a page.


Quote
Correctly handling labels that target the middle of multi-byte ops (e.g. branching into the A9 of BIT $00A9 to get LDA #$00)
As the disassembler reads the opcode, I check for BIT instructions, then check to see if the address of the operand is a target of a label in the label list. I then expand it like this:

Code:
label_3F_F025:
PHA
LDA zp_48
STA ram_0575
.db $24 ; BIT
label_3F_F02C:
PHA
LDA #$86
STA zp_38
STA $8000
PLA
STA zp_48
STA $8001
RTS

It could be extended to include other opcodes, I had just only ever seen this done with BIT.
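
The check described above can be sketched as follows, assuming a set of label addresses from the first pass. The function name is made up, and only the two BIT opcodes ($24 zero page, $2C absolute) are considered:

```python
BIT_LENGTHS = {0x24: 2, 0x2C: 3}   # BIT zp, BIT abs


def hidden_entry(addr, opcode, label_addrs):
    """If a label lands inside this BIT's operand, return that address so
    the opcode can be emitted as .db and decoding can resume at the label."""
    length = BIT_LENGTHS.get(opcode)
    if length is None:
        return None
    for operand_addr in range(addr + 1, addr + length):
        if operand_addr in label_addrs:
            return operand_addr
    return None


# Matches the listing above: $24 at $F02B hides the PHA at $F02C.
print(hex(hidden_entry(0xF02B, 0x24, {0xF025, 0xF02C})))
```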

Quote
Adjusting labels with built-in offsets, e.g. turning LDA $8000,X and LDA $8001,X into LDA label,X and LDA label+1,X when you know $8000 and $8001 are linked, or turning LDA label,X into LDA label-192,X when you know X starts counting at #$C0.
K.A. does this in a few spots. The subtraction ones are the only lines that have to currently be modified to assemble because the base address ends up in code, and the index is always greater than 2, so the effective address always ends up in the table immediately after the code. I haven't really thought about how I want to handle this yet.

Quote
Programmatically determining the upper bound on data tables (indirect or indexed).
This was why I had started logging the value of Y, so you can figure out where things begin and end. It looks like I would also need to log the value of X and what addressing mode it was accessed with.


I'm also not sure how to handle vector edge cases, because K.A. also points one of them (iirc IRQ??) into RAM.

abw

Re: NES CDL/"Smart" disassembler discussion
« Reply #6 on: January 25, 2020, 02:16:52 pm »
Quote
I ended up modifying Mesen to log more information.
What all extra information did you end up logging?

Quote
You end up with something like LDA label_0C_8000, X in the code and label_0C_8000: when bank $0C address $8000 is reached.
I guess my issue is that I'm just being silly. I've got one RAM address that refers to multiple ROM addresses and wanted to have every pointer target labelled and every label used, but really all I need is anything that will get assembled to $8000, so it doesn't matter which label I use. Probably I should still generate a PC = $8000 assertion on each of the targets, though.

Quote
For my code-tracer approach, I started thinking about how the tracer is very likely going to miss chunks of code, and how the user might want to have a way to refine it or give it some guidance.
This part I've got too - the user is able to specify CDL overrides in order to have unknown bytes parsed as code and/or data. You can also mark sections as CHR, PCM, or free, which doesn't really affect disassembly but could be used e.g. to decide to .incbin graphics data.
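
A small Python sketch of what applying such overrides could look like. The override tuple format is invented for illustration; the two flag values are the code and data bits of a PRG log byte as described in the FCEUX help.

```python
CODE, DATA = 0x01, 0x02  # low two bits of an FCEUX PRG log byte

def apply_overrides(cdl, overrides):
    """Return a copy of `cdl` with user-supplied (start, end, flag) ranges applied.

    `end` is exclusive. Only bytes that were never logged as code or data
    are changed, so real log information always wins over an override.
    """
    patched = bytearray(cdl)
    for start, end, flag in overrides:
        for i in range(start, end):
            if patched[i] & (CODE | DATA) == 0:
                patched[i] |= flag
    return bytes(patched)
```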

Quote
Tracking unused space and anti-hacking/CRC self analysis stuff is really the only practical thing I can see the data bit being used for.  But I'm not particularly interested in either for my project.  With a complete disassembly that has been appropriately labelled, there is little to no confusion about what is or isn't free space.
Yup, once you've got a complete, shiny, fully-labelled and fully-commented disassembly, there's probably no reason to care about any of the intermediate steps. On the other hand, if all you've got is a first pass and you need the user to do lots of analysis to provide additional info for a second pass, then arming the user with as much knowledge as you can provide (such as whether bytes that might look like a pointer were logged as data or not) is a good thing.

As a user, code that's also logged as data is something I want to know about, as it indicates something weird and probably non-obvious is going on. Tracking down the source of the reads often reveals interesting behaviour, be it intentional, like a checksum, or unintentional, like a bug.
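
Finding those spots in a log is straightforward. A hedged Python sketch (the function name is made up) that lists runs of PRG offsets with both the code bit and the data bit set:

```python
def code_read_as_data(cdl):
    """Return (start, end) runs of offsets logged as both code and data (end exclusive)."""
    runs, start = [], None
    for i, b in enumerate(cdl):
        both = (b & 0x03) == 0x03   # bit 0 = code, bit 1 = data
        if both and start is None:
            start = i
        elif not both and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(cdl)))
    return runs
```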

Quote
What I mean is, I want my disassembler to be smart/aware enough to be able to produce this kind of output with no manual intervention from the user:
Ah, okay. Even if you end up only being able to handle trivial cases, that is a fairly common pattern, so dealing with it automatically would be great. Even a heuristic approach could work reasonably well, though sometimes it's a long way between the pointer being written and dereferenced. If you can also programmatically determine (or have the user provide) bounds on the value of Y, that would help reduce false positives.

Quote
I know SMB does this, and I vomit in my mouth a little bit every time I think about it.  I know they were cramped for space but this just seems so vile to me.  XD
:D. My first exposure to that one was in the Battle of Olympus code, which has plenty of free space. I suppose I can admire the compactness of it at least.

Quote
Where this COULD be a problem is with something you mentioned earlier about having the bank number as an extra byte in a 3-byte jump table.  But I don't think that's terribly common on the NES.
I've actually written code like that before, though it was for data pointers rather than code pointers.

Quote
The BIG part where this would be a problem is when a game does a cross bank jump, stashing the current bank in ram somewhere, to have it be read back and swapped back to at a later time.  This seems like it would be much more common, and I don't really have a good solution for automating that, other than maybe forking on JSRs similar to how I fork on branches.  But that's a can of worms I still need to think about more.
Another thing I've sometimes seen is carefully aligned JSRs where the JSR in one bank calls some code that swaps out the old bank for a new bank and the RTS returns to code in the new bank. That's another one of those places where I have to give the original devs credit for cleverness despite hating what they were doing :P.

Quote
I do not see any way, either by CDL style tracking or by automated code tracing, that any code being relocated to RAM can be properly disassembled.  If you have ideas I'd love to hear them.
Actually, you might still be able to handle that via tracing. The examples I've seen of writing code to RAM are usually pretty simple, either writing immediate values or copying a block of bytes, both of which give you known state for the destination RAM addresses, at least temporarily and assuming you know the addresses that get copied.

Quote
Another problem here is what if the vector points to a swappable bank?  Or if the vector itself is in a swappable bank?  Or worse... both?  Do I run all possible combinations?
This is basically what I do for inter-bank references where the user hasn't provided any info about the target bank, checking the target address in each bank to see if it has the right type for the referencing op, i.e. is code for branches/jumps or is data for loads/computations, which is another place the Data bit comes in handy. If I'm lucky and only get one match, then I can conclude I've found the right bank, and if I get multiple matches, I add a comment about being a possible target and leave it for the user to figure out. Ironically, this is a case where having more complete CDL info actually makes generating a disassembly harder since it leads to more potential matches :P.
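
The bank check described above can be sketched in Python roughly like this. It is a simplification, not the actual tool's code: it assumes every bank is the same size and can be mapped at the slot containing the target address, and `candidate_banks` is a made-up name.

```python
def candidate_banks(cdl, bank_size, addr, want_code):
    """List banks whose byte at CPU address `addr` was logged with the wanted type.

    cdl       -- PRG part of the log, one byte per ROM byte
    bank_size -- size of one swappable bank in bytes
    want_code -- True for branch/jump targets, False for loads/computations
    """
    flag = 0x01 if want_code else 0x02   # code bit or data bit
    offset = addr % bank_size            # position of `addr` inside any bank
    banks = []
    for bank in range(len(cdl) // bank_size):
        if cdl[bank * bank_size + offset] & flag:
            banks.append(bank)
    return banks
```

A single-element result means the target bank is known; several elements means each one gets listed as a possible target for the user to sort out.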

Quote
Final Fantasy 1 also does a weird thing where it will JSR to an infinite loop, because it knows when an NMI fires, the NMI handler will pop the interrupt info off the stack and RTS out of the interrupt, thus escaping the infinite loop.  This would definitely stump the tracer as I currently envision it -- which is pushing me more towards the "fork on JSR" idea.
Hmm, yeah, I think I've seen similar things in "wait for vblank" routines.

Quote
This is a fun conversation!!!  =D
Agreed! And if we pool our assets/good ideas, maybe we can overcome some of these challenges too!

Quote
As the disassembler reads the opcode, I check for BIT instructions, then check to see if the address of the operand is a target of a label in the label list. I then expand it [to .db $24]
More silliness on my part - you've got a perfectly workable approach there. Once I got my disassembler up to a certain level of usefulness, I started focusing more on analyzing the specific games I was disassembling and less on generating the re-assembly.

Quote
I'm also not sure how to handle vector edge cases, because K.A. also points one of them (iirc IRQ??) in ram.
K.A. = Kirby's Adventure? Another example is Final Fantasy II, which has its NMI vector set to $0100 and sometimes writes new addresses to $0101-$0102.
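
A disassembler can at least detect this case up front. A minimal Python sketch (hypothetical helper name) that reads the three vectors from the end of the bank mapped at the top of CPU space and reports any that point below $8000, on the assumption that PRG ROM is only mapped at $8000 and above:

```python
import struct

def vectors_outside_rom(last_bank):
    """Return the names of interrupt vectors that point below $8000.

    The final six bytes of `last_bank` hold NMI ($FFFA), RESET ($FFFC)
    and IRQ/BRK ($FFFE) as little-endian 16-bit addresses.
    """
    nmi, reset, irq = struct.unpack("<3H", last_bank[-6:])
    vectors = (("NMI", nmi), ("RESET", reset), ("IRQ", irq))
    return [name for name, target in vectors if target < 0x8000]
```

A vector flagged this way cannot be followed statically, so the tracer would need the user to supply the possible handler addresses.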

Blunderpusse

  • Newbie
  • *
  • Posts: 1
    • View Profile
Re: NES CDL/"Smart" disassembler discussion
« Reply #7 on: January 27, 2020, 04:28:57 pm »
I have a question about the CDL files. I've been playing an unmodified Japanese version of Final Fantasy III in FCEUX, and started a new game with the Code/Data Logger running. Now on to the actual question: how do I work with/edit the CDL it created?

Disch

  • Hero Member
  • *****
  • Posts: 2770
  • NES Junkie
    • View Profile
Re: NES CDL/"Smart" disassembler discussion
« Reply #8 on: January 27, 2020, 10:29:24 pm »
I don't think there are many tools that'll let you view/edit a CDL file, but you can just use a hex editor (HxD is my hex editor of choice).

The CDL file format is explained here (see the middle section of this page):

http://www.fceux.com/web/help/fceux.html?CodeDataLogger.html
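
If a hex editor gets tedious, a few lines of script can also do the job. Here is a minimal Python sketch, assuming the layout from that page: one log byte per ROM byte, PRG first and then CHR, with bit 0 meaning "accessed as code" and bit 1 meaning "accessed as data" for PRG bytes. `summarize_cdl` is just an illustrative name.

```python
def summarize_cdl(cdl, prg_size):
    """Count how the PRG part of an FCEUX .cdl log was classified."""
    counts = {"code": 0, "data": 0, "both": 0, "unlogged": 0}
    names = {0x00: "unlogged", 0x01: "code", 0x02: "data", 0x03: "both"}
    for b in cdl[:prg_size]:
        counts[names[b & 0x03]] += 1   # only the low two bits matter here
    return counts
```

To use it, read the whole .cdl file in binary mode and pass its contents along with the PRG size in bytes, which is the number of 16 KB PRG banks from byte 4 of the iNES header times 16384.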