Pennywise's Guide to Debugging

Started by Pennywise, April 10, 2013, 04:06:33 PM


Pennywise

Hey folks, I had this idea to do a document that details the fundamentals of debugging and the stuff behind it. I came up with the idea today and finished the rough draft today. Feel free to give me feedback, but keep in mind I was lazy on the memory map stuff on purpose. Ideally I'd like to make this into an HTML file and PDF to cover the basics, but I need help doing that. Anyone? Pictures will be of benefit when everything's finalized.

Quote
Of all the topics to read up on in Romhacking, debugging doesn't seem to be covered all that much. Sure, there are plenty of documents for the basics and source code for various hacks and reverse-engineering, but not much for debugging. For those looking to take the next step, learning how to use a debugger is paramount to attaining that goal. That means not just learning the concepts of debugging, but knowing how to apply them to help hack a game.

Before I even begin explaining the debugging aspects, there are certain hardware details that must be known before anyone can learn debugging. Otherwise it's kinda like wandering through the dark with no clue where you're going. Since the NES is my area of expertise, we'll be using that as an example. Keep in mind that Romhacking and debugging concepts are more or less universal and can be applied to other systems.

What you need to know about the NES:

-Be able to differentiate between RAM and ROM addresses.
-ROM is pretty simple: that's the ROM file on your computer, and its addresses are the ones you are viewing in a hex editor.
-ROM is usually split into two parts: PRG-ROM and CHR-ROM. PRG stands for Program (Code) and CHR stands for Character (Graphics). Some games/mappers only have one ROM with the graphics lumped into the PRG-ROM.
-Now RAM refers to the memory of the NES, and there are three types of memory for the NES: CPU, PPU and Sprite. Each memory type has its own memory map, which basically tells the NES what to put where in memory.
-Here are the general memory maps for the CPU and PPU:

CPU
Internal RAM
$0000-$07FF
RAM Mirrors
$0800-$1FFF
Registers galore etc (not important for our purposes)
$2000-$5FFF
Cartridge RAM (Not all games have/use this)
$6000-$7FFF
PRG-RAM
$8000-$FFFF
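
As a rough illustration of the CPU map above, here is a minimal Python sketch that names the region a CPU address falls in and folds a RAM mirror back onto the real 2 KB of internal RAM. The function names and the short region labels are made up for this example; they are not part of any emulator.

```python
def cpu_region(addr: int) -> str:
    """Name the region of the CPU memory map that a 16-bit address falls in."""
    if not 0x0000 <= addr <= 0xFFFF:
        raise ValueError("CPU addresses are 16-bit")
    if addr < 0x0800:
        return "Internal RAM"
    if addr < 0x2000:
        return "RAM mirror"
    if addr < 0x6000:
        return "Registers"
    if addr < 0x8000:
        return "Cartridge RAM"
    return "PRG"  # the $8000-$FFFF area where the game's banks show up


def unmirror_ram(addr: int) -> int:
    """Fold an address in $0000-$1FFF back onto the real 2 KB of internal RAM."""
    if not 0x0000 <= addr < 0x2000:
        raise ValueError("only $0000-$1FFF is internal RAM or a mirror of it")
    return addr & 0x07FF


print(cpu_region(0x0704))          # Internal RAM
print(cpu_region(0xB0BE))          # PRG
print(hex(unmirror_ram(0x0F04)))   # 0x704
```

The mirror fold is just a bit mask because the 2 KB of RAM repeats every $0800 bytes up to $1FFF.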

PPU
Pattern Table #0
$0000-$0FFF
Pattern Table #1
$1000-$1FFF
Name Table #0
$2000-$23BF
Attribute Table #0
$23C0-$23FF
Background Palette
$3F00-$3F0F
Sprite Palette
$3F10-$3F1F

Now let me break down what some of these terms are.

First there's RAM and Cartridge RAM, which are basically the same thing. The only difference between the two is that Internal RAM is located within the NES and Cartridge RAM is located in the cartridge itself for extra RAM. The game stores various variables in this area: stuff like HP, Strength, enemy encounter rates and text positions. RAM is dynamic.

Then there's PRG-RAM. You must know the difference between PRG-ROM and -RAM. The NES cannot access all of the ROM banks at one time. It is limited to accessing 32 KB of ROM banks at a time, and that 32 KB goes straight into PRG-RAM. Bank sizes depend on the mapper and setup and vary like so:

32 KB (note: some PRG-RAM/ROM sizes are 32 KB, so the entire PRG-ROM can be accessed in one shot)
16 KB
8 KB

The last bank in the PRG-ROM is typically mapped to $C000 or $E000 in the PRG-RAM, depending on bank size. This is what is referred to as the fixed bank. It cannot be moved or swapped with another bank.

The Name Table is basically the grid of screen positions that graphics tiles appear on, and the Attribute Table details the colors (palettes) for those tiles.

Keep in mind that a game does not use/access ROM addresses. They do not exist. A game will only use RAM addresses. That is why something like text pointers don't always match up with their ROM addresses. It's not pointing to the ROM, it's pointing to wherever the text is stored in RAM.
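
To make the mismatch between hex editor offsets and the addresses the game uses a little more concrete, here is a small Python sketch. It assumes an iNES-format file with the usual 16-byte header and no trainer, and it assumes you already know the bank size and where that bank gets mapped in; `file_offset_to_cpu` and its defaults are hypothetical names chosen for this example.

```python
INES_HEADER_SIZE = 16  # an iNES file starts with a 16-byte header


def file_offset_to_cpu(offset: int, bank_size: int = 0x4000, slot_base: int = 0x8000):
    """Return (bank number, CPU address) for a byte at a given file offset.

    Assumes the bank containing the byte is mapped in at slot_base.
    """
    prg_offset = offset - INES_HEADER_SIZE
    if prg_offset < 0:
        raise ValueError("offset lies inside the header")
    bank, within_bank = divmod(prg_offset, bank_size)
    return bank, slot_base + within_bank


bank, cpu_addr = file_offset_to_cpu(0x4010)
print(bank, f"${cpu_addr:04X}")  # 1 $8000
```

A pointer stored in the game would hold the CPU address, not the file offset, which is why the two numbers rarely look alike.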

That should cover most of it, let's move onto actual debugging concepts now.

A debugger is mainly used to make the game break (stop) just before certain ASM instructions are executed. You set the conditions under which that happens in the debugger. We call them breakpoints; if you want to keep it simple, think of a breakpoint as a point at which to take a break. There are three main types of breakpoints:

Read - This is when we want to know when the game is accessing variables in RAM, text being accessed from PRG-RAM etc. Breaking when reading data and variables.
Write - This is mainly for when we want to know when variables, like player stats, are being written to RAM by the game. We don't usually use it for something like PRG-RAM because that's more like static data; RAM, on the other hand, is dynamic, aka ever changing.
Execute - This is when we want to know when the game is executing specific ASM instructions from the PRG-RAM.

Breakpoints can be used for the three types of memory the NES has. Want to know when/how a graphic tile appears on the screen, set a write breakpoint for the PPU address etc.
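
For anyone who thinks better in code, here is a toy Python model of the three breakpoint types. It is only a sketch of the idea and says nothing about how FCEUX implements breakpoints; the class, its method names and the address $0010 (standing in for some game variable) are all made up for illustration.

```python
class WatchedMemory:
    """Toy memory that records a hit whenever a breakpoint condition is met."""

    KINDS = ("read", "write", "execute")

    def __init__(self, size: int = 0x10000):
        self.data = bytearray(size)
        self.breakpoints = set()  # entries are (kind, address)
        self.hits = []            # breakpoints that fired, in order

    def add_breakpoint(self, kind: str, addr: int) -> None:
        if kind not in self.KINDS:
            raise ValueError(f"unknown breakpoint type: {kind}")
        self.breakpoints.add((kind, addr))

    def _check(self, kind: str, addr: int) -> None:
        if (kind, addr) in self.breakpoints:
            self.hits.append((kind, addr))

    def read(self, addr: int) -> int:
        self._check("read", addr)
        return self.data[addr]

    def write(self, addr: int, value: int) -> None:
        self._check("write", addr)
        self.data[addr] = value & 0xFF

    def fetch(self, addr: int) -> int:
        """Fetch an instruction byte, which is what an execute breakpoint watches."""
        self._check("execute", addr)
        return self.data[addr]


mem = WatchedMemory()
mem.add_breakpoint("write", 0x0010)  # hypothetical variable address
mem.write(0x0011, 5)                 # different address: no hit
mem.read(0x0010)                     # a read, not a write: no hit
mem.write(0x0010, 99)                # this one fires
print(mem.hits)                      # [('write', 16)]
```

A real debugger pauses the game at the hit instead of just recording it, but the matching of access type plus address is the same idea.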

Now onto the actual debugger.

What you want to do is load up the FCEUX emulator with a game of your choice. FCEUX, by the way, is one of the best emulators/debuggers I have ever used. In the top bar you'll see a bunch of options. Click Debug -> Debugger. Don't let it overwhelm you; the main thing you should notice is that on the left is the entire scrollable memory map of the CPU. To the right you'll see a bunch of buttons. The important ones are Run, which lets the game run normally, and Step Into, which goes through the ASM instructions one step at a time. Then there's the breakpoint list to the right of the buttons, with Add, Delete and Edit, which are all self-explanatory (add breakpoint, delete...). Click Add and it'll bring up the breakpoint menu, where you input the address you want to break on and the type of breakpoint.

Now let's try applying this to a game. Let's do Mad City, the Japanese version of Bayou Billy.

We are going to start a hypothetical translation for the game and the first step is to build a table and find the text in the ROM. Building a table is quick and easy by using the PPU Viewer in FCEUX. So let's fire up Mad City and get past the title screen. Pause the game when some text appears as Gordon holds Annabelle at knife point. You'll see the katakana characters on the right side of the PPU viewer starting at $70. So let's make a basic Japanese table from those values.

Once you've finished the basic table, let's try searching for that first bit of intro text by loading the table into the hex editor along with the game. Search for the hex values or use your hex editor's kana search feature, if it has one.

We're gonna focus on モラッユク because those dakuten that precede can make things more complicated for us.

What the hell! I'm not finding the text! It's compressed!

Ok, we gotta figure out what's going on here. Let's do some debugging. Make a savestate before the text appears in the intro.

What we're gonna do is see how the text gets written to the screen. That is the end of the process of the text being read from PRG-RAM and finally appearing on the screen. When you don't know where to start, start at the end and work your way back.

So let's get that text up again and open up the Name Table Viewer. You're gonna see a recreation of the screen, albeit with messed up graphics in parts. Pause/unpause so that you can see the text OK. You'll also see multiple screens; usually only the top half is used by the game, and the bottom half is mostly for mirroring. Move your cursor over the text we're interested in and there will be some information for it: the tile is $9A, the X/Y coordinates, and the PPU address is $20B0. We are going to set a write breakpoint for that address. So let's pause the game and load our savestate to reset the screen.
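
The relationship between a Name Table address and the tile position the viewer reports can be worked out by hand. Here is a short Python sketch of that arithmetic for Name Table #0, which is 32 tiles wide and 30 tiles tall with 8x8 pixel tiles; the function name is made up for this example.

```python
NAMETABLE_BASE = 0x2000  # start of Name Table #0 in PPU memory
TILES_PER_ROW = 32
TILE_ROWS = 30
TILE_SIZE = 8            # tiles are 8x8 pixels


def nametable_position(ppu_addr: int):
    """Return (tile column, tile row, pixel x, pixel y) for a Name Table #0 address."""
    offset = ppu_addr - NAMETABLE_BASE
    if not 0 <= offset < TILES_PER_ROW * TILE_ROWS:
        raise ValueError("address is not in the tile area of Name Table #0")
    row, col = divmod(offset, TILES_PER_ROW)
    return col, row, col * TILE_SIZE, row * TILE_SIZE


print(nametable_position(0x20B0))  # (16, 5, 128, 40)
```

So $20B0 is the 17th tile across on the 6th tile row (counting from zero: column 16, row 5).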

When you've done that, load up the debugger and add a PPU write breakpoint to $20B0. Before we unpause the game we're also going to use the Trace Logger. A Trace Logger is a tool that logs the code as it's executed; it's complementary to debugging because it keeps track of past code in case we need to look at it. So we're gonna log the last 10,000 instructions to the window and click Start Logging. Now unpause the game.
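
The "last 10,000 instructions" idea is basically a ring buffer: new lines push the oldest ones out. A minimal Python sketch of that concept follows; the `TraceLog` class and its line format are invented for illustration and are not FCEUX's actual log format.

```python
from collections import deque


class TraceLog:
    """Keep only the most recent executed instructions, like a trace logger window."""

    def __init__(self, max_lines: int = 10_000):
        # A deque with maxlen silently drops the oldest entry when full.
        self.lines = deque(maxlen=max_lines)

    def log(self, pc: int, text: str) -> None:
        self.lines.append(f"${pc:04X}: {text}")

    def last(self, count: int):
        """Return the most recent `count` lines, oldest first."""
        return list(self.lines)[-count:]


trace = TraceLog(max_lines=3)
for i in range(5):
    trace.log(0x8000 + i, "NOP")
print(trace.last(2))  # ['$8003: NOP', '$8004: NOP']
```

When a breakpoint hits, the tail of the buffer is the code that led up to it, which is exactly what we read back in the steps that follow.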

The game should stop just before that text gets written, and the debugger snaps. Keep in mind the goal of all this is to find out how $9A gets to $20B0 in the PPU. If you look at the trace logger, you will see this:

LDA $0700,Y @ $0704 = #$9A

This is where $9A came from. If your savestate is before the screen appears, you'll have to click Run in the debugger, as there is usually a blank tile loaded first to clear the screen etc.

Now we want to know how $9A got to $0704. So we're gonna delete our breakpoint, stop the trace logger, pause the game and reload our state. We're gonna add a new breakpoint now. Set a write breakpoint for $0704 to the CPU. Unpause the game.

The game is going to break a lot. We're going to continue to keep clicking run until we see

LDA $B0BE,Y @ $B0E0 = #$9A

in the trace logger.

So $9A comes from that, but how does that get $9A?

Well, data is being read from a table with a Y index. Y ($22) gets added onto the base of $B0BE and becomes $B0E0. Where does $22 come from?
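
The address math for an instruction like LDA $B0BE,Y is just base plus index, kept to 16 bits. A tiny Python sketch (the function name is made up):

```python
def absolute_indexed(base: int, index: int) -> int:
    """Effective address for an absolute,Y (or absolute,X) access on the 6502."""
    return (base + (index & 0xFF)) & 0xFFFF


addr = absolute_indexed(0xB0BE, 0x22)
print(f"${addr:04X}")  # $B0E0
```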

Well if we look up a bit we see this:

LDA ($00),Y @ $B1E4 = #$22

This appears to be the code that reads the text from the PRG-RAM: the text read routine. Maybe the text data is not the same as what appears in the PPU and hence requires an additional table to adjust for the difference. Let's try building a table from this table.
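
LDA ($00),Y works differently from the table read before it: the CPU first reads a 2-byte pointer (low byte first) from $00 and $01, then adds Y to it. Here is a Python sketch of that. The trace line only shows the final address $B1E4, so the pointer value $B1E0 and Y = 4 below are hypothetical values picked to land on that address, not numbers taken from the game.

```python
def indirect_indexed(memory, zp_addr: int, y: int) -> int:
    """Effective address for a 6502 (zp),Y access."""
    low = memory[zp_addr & 0xFF]
    high = memory[(zp_addr + 1) & 0xFF]  # the pointer stays inside the zero page
    pointer = low | (high << 8)
    return (pointer + (y & 0xFF)) & 0xFFFF


memory = bytearray(0x10000)
memory[0x00] = 0xE0    # hypothetical pointer low byte
memory[0x01] = 0xB1    # hypothetical pointer high byte
memory[0xB1E4] = 0x22  # the text byte seen in the trace

addr = indirect_indexed(memory, 0x00, 0x04)  # Y = 4 is also hypothetical
print(f"${addr:04X} = #${memory[addr]:02X}")  # $B1E4 = #$22
```

Because the pointer lives in RAM, the game can aim it at a different string each time, which is why this kind of instruction is a common sign of a text read routine.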

Fire up the hex editor in FCEUX and go to $B0BE. You should see 70, 71, 72... Those look like the actual tile values that are in the PPU. Let's make a table based on those values. We're gonna start the table values at 00 because that is the beginning of the table.
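
Here is a rough Python sketch of building the second table from the lookup table: each text byte is an index into the table at $B0BE, and the value found there is the tile that ends up in the PPU. The three lookup bytes and the kana assigned to tiles $70-$72 below are placeholder data for illustration only; the real values have to be read out of the game and the PPU Viewer. The `XX=char` output layout is assumed to be what your hex editor's table files use.

```python
def build_text_table(lookup, tile_to_char):
    """Map each text byte (an index into the lookup table) to a character.

    Tiles with no known character are shown as <XX> so gaps are easy to spot.
    """
    return {
        index: tile_to_char.get(tile, f"<{tile:02X}>")
        for index, tile in enumerate(lookup)
    }


def format_tbl(table) -> str:
    """Render the table as lines of the form XX=char."""
    return "\n".join(f"{byte:02X}={char}" for byte, char in sorted(table.items()))


# Placeholder data, NOT dumped from the game.
lookup = bytes([0x70, 0x71, 0x72])              # first bytes of the table at $B0BE
tile_to_char = {0x70: "ア", 0x71: "イ", 0x72: "ウ"}  # hypothetical tile-to-kana mapping

print(format_tbl(build_text_table(lookup, tile_to_char)))
```

With the placeholder data this prints 00=ア, 01=イ and 02=ウ on separate lines; the point is that the table's index, not the tile number, is what you search for in the ROM.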

Now let's try searching for that text. You should wind up at 171F4 in the ROM.

Hey we found the text! It's not compressed after all! Crazy Konami, just what the hell were they thinking! :D
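
As a sanity check on the numbers in this walkthrough, the trace address $B1E4 and the file offset 171F4 can be tied together with a bit of arithmetic. The Python sketch below assumes an iNES file with a 16-byte header and no trainer. The guide does not say which bank the text lives in or what the bank size is, so bank 5 with 16 KB banks is only one combination that happens to fit (bank 11 with 8 KB banks mapped at $A000 fits as well); the function name is made up.

```python
INES_HEADER_SIZE = 16  # an iNES file starts with a 16-byte header


def cpu_to_file_offset(cpu_addr: int, bank: int,
                       bank_size: int = 0x4000, slot_base: int = 0x8000) -> int:
    """File offset of the byte seen at cpu_addr when `bank` is mapped at slot_base."""
    within_bank = cpu_addr - slot_base
    if not 0 <= within_bank < bank_size:
        raise ValueError("address is outside the slot this bank is mapped into")
    return INES_HEADER_SIZE + bank * bank_size + within_bank


# Assumed: 16 KB banks, text in bank 5.
print(hex(cpu_to_file_offset(0xB1E4, bank=5)))  # 0x171f4
```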

KingMike

It is better to use the term "address space" instead of "RAM" to describe the memory area where ROM is mapped in. Calling it RAM implies you can write data to that space, when you cannot. That is an important distinction to understand early on. :)
"My watch says 30 chickens" Google, 2018

Drakon

Personally I think debuggers need to be a little easier to learn how to use haha. To a beginner they're extremely confusing.

Pennywise

I have experience with only three debuggers, FCEUX, BGB, and Mednafen. FCEUX is the pinnacle of what a debugger should strive for in my opinion. It is easy to use and is very powerful. BGB is not far behind either. Mednafen is keyboard driven and completely the opposite of the previous two. I am having trouble adjusting to it. For anyone looking to learn how to use a debugger, FCEUX is the way to go in my opinion.

Quote from: KingMike on April 10, 2013, 07:21:15 PM
It is better to use the term "address space" instead of "RAM" to describe the memory area where ROM is mapped in. Calling it RAM implies you can write data to that space, when you cannot. That is an important distinction to understand early on. :)

Regardless of the term, this is good information to have along with everything else. Are you talking about something like this?

Be able to differentiate between RAM and ROM addresses.

STARWIN

It is not trivial how detailed of a description you need to give about the system. It sounds a bit lazy, but if there is a good document explaining the basics, you could refer to that. Though that may increase the amount of people ignoring all of it.

I'd talk about CPU address space. Parts of it resolve to RAM, part of it resolves to registers galore, the rest resolves to cartridge. The cartridge may resolve its share of the addresses to external RAM or ROM.

PRG-RAM sounds bad/wrong. I don't think that adding that sentence makes it any better. PRG-RAM actually makes me think of the cartridge RAM. I don't know how unprofessional that makes me.

Don't be afraid of removing text (rewriting), if you want the guide to read well.

Quote
Keep in mind that a game does not use/access ROM addresses. They do not exist. A game will only use RAM addresses.

Related to earlier, I think this is bad/wrong. There are only CPU addresses (I'm not familiar with PPU). CPU opcodes (or opcodes with arguments?) go to the CPU and contain numbers interpreted as addresses in the CPU address space.

Quote
A debugger is mainly used to ...

I'd avoid "opinions" and merely present the debugger functionalities that are going to be presented. Or list them "all" and present some.

If address space is a difficult term to use, "memory" or "CPU memory" might ~work.

I'd have something like (not entirely sure when it stops actually, too lazy to think about instruction fetch cycles):
Quote
Read - Execution pauses when the current instruction loads from the given memory address.
Write - Execution pauses when the current instruction stores to the given memory address.
Execute - Execution pauses when the current instruction is in the given memory address.

It has been a while since I used a NES debugger. Is there a certain best version of FCEUX you refer to?

By the way, do you need that save state? Would a massive trace dump contain all the necessary information? Sometimes "keep clicking" is not very convenient.

It may be a bit difficult to give a generic guide to debugging. A set of tools that can be used in many ways. I guess many hackers here are interested in translations, if so, then this is an okay example. (it is a bit more complex than the basic ASM hacking I'm used to)

Pennywise

I'm not sure there is one document that details all the hardware stuff perfectly. Sure, there is yoshi's doc and the Nesdev wiki, but I do believe there is such a thing as information overload. My only aim is to present the hardware details etc that I think are important to know kinda like cliff notes I guess. This is basically my opinion on the subject and is my attempt to present everything in layman's terms so beginners can understand.

Unless someone tells me otherwise, I don't think PRG-RAM is a wrong term. I guess it can be a confusing term though. I dunno, it's the term I've always used and while I know what it means, somebody else may not.

Keep in mind, this current version is only the result of a few hours of writing. It will be revised later.

Yes, the savestate is important. Massive trace dumps in my opinion are sloppy. FCEUX's ability to trace as you debug is incredibly fluid. If it's not convenient, I don't know what is.

It would be pointless for me to do something that wasn't related to translating as that is not my expertise. I'm sticking to what I know and that's that.

STARWIN

Hmm.. I checked some main sources and cannot find PRG-RAM term used in this way. To my interpretation, everyone uses that term for cartridge RAM. i.e. the bank switching system is about indirection, not about copying blocks of ROM temporarily to some RAM chip.


STARWIN

You can of course wait for someone else to give their thoughts about this, but that simply (and ambiguously) refers to $6000-$7FFF as being PRG RAM and $8000- being PRG ROM. Or perhaps you are a bit tired/forgot what you exactly wrote in the draft?

Pikachumanson

Quote from: Pennywise on April 10, 2013, 09:32:34 PM
I have experience with only three debuggers, FCEUX, BGB, and Mednafen. FCEUX is the pinnacle of what a debugger should strive for in my opinion. It is easy to use and is very powerful. BGB is not far behind either. Mednafen is keyboard driven and completely the opposite of the previous two. I am having trouble adjusting to it. For anyone looking to learn how to use a debugger, FCEUX is the way to go in my opinion.

Regardless of the term, this is good information to have along with everything else. Are you talking about something like this?

Be able to differentiate between RAM and ROM addresses.

I might add that Meka has a pretty sweet debugger as well. It lets you view the VRAM, RAM and ROM separately.