
Topic: Planning to hack an as-of-yet unexplored GB game. Looking for advice/guidance.

yaco

Hello,
Title says it all really, but it never hurts to go into more detail.  :)

I just started hacking last week, so my experience is two translations (NES SMB2 and a GB puzzle game), and I'm currently working on a level hack for Super Mario Land (hex editor only, based off of Frank's Notes document). It's coming together well, and I seem to enjoy chiseling the bytes, but my noobish questions are as follows:
- How did these people map the ROM and find where the level data is and how it's organised?
- What are the prerequisites to level hacking a tool-less game? What are the tools/skills needed? (I have experience with Apple II 6502 and x86 ASM, and some C.)
- And is hex editing the only way?

I don't care how long it takes to learn the skills necessary or do the hack or how arduous the work may be. I'm ready to see it through.

Thank you in advance.  :)
« Last Edit: June 27, 2019, 07:27:08 pm by yaco »

FAST6191

This being a GB/GBC game, you are likely not going to get file names, extensions and such. They can make life far easier when you have a nice folder called lvl or something similar.

You can trace your way to level data (the linked guide is for graphics, but it shows what tracing sessions involve, albeit many tools will have graphical frontends if you want: https://www.romhacking.net/documents/361/ ). If it ends up governing where a given tile lands on screen (or even is being animated) then you can go up and up until you find what put it there and where it found what to do that from.

You can also go the other way and eliminate things. If you know what is text, music, graphics and the CPU code* you don't tend to have many other choices. At this point a bit of corruption or fiddling can be the thread you pull to unravel the whole thing. Memory editing can be useful here too -- a ROM might be megabytes in size but RAM tends to only house stuff the game cares about at the time, possibly while keeping everything else the same. Find it here and search the ROM for it, or trace it back up.
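As a rough sketch of the "find it here and search the ROM for it" step: assuming you have copied a run of bytes out of an emulator's memory viewer, something like the following lists every place that run appears in the ROM. The byte values and the stand-in ROM are made up for illustration; in practice `rom` would come from reading your own ROM file.

```python
def find_all(haystack: bytes, needle: bytes) -> list[int]:
    """Return every offset at which needle occurs in haystack (overlaps included)."""
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = haystack.find(needle, pos + 1)
    return hits


# Stand-in for the contents of a ROM file (made-up bytes).
rom = bytes([0x00, 0x12, 0x34, 0x56, 0xFF, 0x12, 0x34, 0x56, 0x00])
# Bytes noted down from RAM while the level was on screen (made-up values).
ram_chunk = bytes([0x12, 0x34, 0x56])

for offset in find_all(rom, ram_chunk):
    print(f"match at ROM offset 0x{offset:06X}")
```

No hits at all would suggest the data is compressed or built up at run time, in which case tracing back up is the remaining option.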

*for very simple games it might be buried in the CPU code much like very simple/low text games might bury it in there.

Hex editors are very crude tools. If you are using one then you had really better just be doing a very minor change, or something you can essentially do with a find and replace or copy and paste. For a level edit I would expect it mainly if you already know the format and just want to shuffle something a few rows over, or stick another thing in there.

As for how it is organised. You usually only have a handful of ways 2d (and many 3d) levels are organised. Much like game consoles don't usually have the grunt to do proper collision detection they cheat and have things do it for them. I like to think of them as layers -- one for blocks, one for pickups, one for enemies, one for hazards... though occasionally one or all of those will be merged together (it is not unreasonable at all as a game coder to have enemies and pickups in one "layer"). For 3d stuff then the graphics are usually separate from the thing that determines where the level actually is (I am sure we have all fallen through a level at some point), and further things like where the actual track is in the case of racing games, though as the dev will have a nice thing to just generate it at the same time this can make things a bit harder there.
Beyond that there are two main approaches for each method. One is exhaustive, where the level format paints each and every part of it, much like each pixel makes up a tile and you can't just do nothing even if you want it blank. The other will use some kind of coordinate system to tell it where a given type of asset needs to be relative to something (screen, level, sub level, camera...). Systems that support some measure of scrolling also tend not to have things be screen relative, and it can be any coordinate system the dev cared to cook up (and for 3d it gets worse -- https://www.youtube.com/watch?v=kpk2tdsPh0A ).
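To make the two approaches concrete, here is a minimal sketch that decodes the same tiny level from both styles. Both formats are invented for illustration (an 8x4 grid, and a list of x, y, tile triples); a real game will have its own layout, sizes and terminators.

```python
WIDTH, HEIGHT = 8, 4  # made-up level size, in tiles


def decode_exhaustive(data: bytes) -> list[list[int]]:
    """Exhaustive style: every cell is stored, row by row, blank or not."""
    return [list(data[row * WIDTH:(row + 1) * WIDTH]) for row in range(HEIGHT)]


def decode_coordinates(data: bytes) -> list[list[int]]:
    """Coordinate style: only placed things are stored, as (x, y, tile) triples."""
    grid = [[0] * WIDTH for _ in range(HEIGHT)]
    for i in range(0, len(data) - 2, 3):
        x, y, tile = data[i], data[i + 1], data[i + 2]
        grid[y][x] = tile
    return grid


# The same level in both hypothetical formats: tile 5 at (2, 1), tile 7 at (6, 3).
exhaustive = bytearray(WIDTH * HEIGHT)
exhaustive[1 * WIDTH + 2] = 5
exhaustive[3 * WIDTH + 6] = 7
coordinates = bytes([2, 1, 5, 6, 3, 7])

for row in decode_coordinates(coordinates):
    print(" ".join(f"{tile:02X}" for tile in row))
```

The exhaustive form costs 32 bytes here against 6 for the coordinate form, which is one reason sparse things like enemies and pickups often end up in the second style.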

Once I have an example of at least one of those I usually blank it as best I can and then build a list of what each value does/creates for that thing. Same idea as many use for making a table for text, or filling out the odd bits of punctuation, really.
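A minimal sketch of that "blank it, then list what each value does" routine, assuming you already know one offset that belongs to the level. The offset, the value range and the stand-in ROM are hypothetical; the idea is only to produce one patched copy per candidate value so each can be loaded and the result written down.

```python
def patched_variants(rom: bytes, offset: int, values) -> dict[int, bytes]:
    """Return {value: copy of rom with the byte at offset replaced by value}."""
    variants = {}
    for value in values:
        patched = bytearray(rom)
        patched[offset] = value
        variants[value] = bytes(patched)
    return variants


rom = bytes(32)      # stand-in for a real ROM image
level_byte = 0x10    # hypothetical offset of one level cell

notes = {}
for value, image in patched_variants(rom, level_byte, range(0x00, 0x04)).items():
    # In practice: write `image` out to its own file, load it in an emulator,
    # and replace the "?" with whatever showed up on screen.
    notes[value] = "?"
    print(f"0x{value:02X}: byte at 0x{level_byte:04X} is now 0x{image[level_byte]:02X}")
```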

Of note here: if a sequel on the same or a related system has a user made level editor, have a look; if a PC port happened, have a look. If the game has its own inbuilt editor then have a look there too.

Skill prerequisites. If you have those then you have more than enough to make a dent here in quite a few games. There might well be something out there that sees you have to learn something more to make a start with it, something someone else more versed in hacking might not, but I would say the vast majority are within your purview at present. Worst you will probably do is overload the game by putting too much on screen or in the level and running out of resources.
More generally, and for the random forum searcher that might find this at some point in the future, the idea of data representation needs to be on lock. Being able to do assembly as far as tracing can help, but if you somehow find where a piece of level data is housed then you can figure out much of what you care to know in sensible timeframes by brute force. Indeed, after finding it I still usually use brute force (read: trying every value and noting what it does) rather than playing with a debugger. If I know I have a given level I will obviously also look at what it has and try to match it to the thing I eventually see on screen**.
Savestates, as usual, can be a dangerous thing if you notice the level is in memory and want to try poking it there. By all means give it a go, but much like text you risk the game already having done all it needs to do with it by the time you come to prodding it.

**for many things I encourage using cheats to speed a process up but I have limited uses for them here -- if you find an inventory code for a common item like a potion you don't then have to play 400 hours to get a rare drop item if you can just increase the numbers. For level editing then some kind of cheat to change the player location can help build up a complete level image or put them in interesting locations to help with this seeing on screen business.

I have some worked examples for a DS game in http://www.romhacking.net/forum/index.php/topic,14708.0.html , though it is a game with a level editor I could abuse a bit to help out.

yaco

Thank you for your in-depth and informative reply.

Well, on games as compact as GB (only a few hundred KB at most), there isn't really a need to separate level data; it should be relatively easier to find it by process of elimination.

I've started on reading the document you linked to; turns out it focuses on GBA (nothing a quick visit to the GB pandocs won't fix) so adapting it to my needs shouldn't be too hard.

Speaking of elimination, are there any general heuristic techniques of identifying types of data?

More about corruption, my search seemed to turn up a program that integrates with the BizHawk emulator (which supports GB through Gambatte) called Real-Time Corruptor that targets common data addresses. Is something like this what you had in mind? (link:http://redscientist.com/rtc)

I've been thinking about implementing some cheats myself, to learn about tracing code and setting breakpoints in a ROM hacking context (I have some reverse engineering experience from crack-me challenges). I'm pretty sure that's the logical next step. What do you think?

About hex editing, I figured I'll just make a quick, one-level hack just to prepare for a worst-case scenario where I'd have to crawl through bytes to edit the levels of a tool-less game.

(final note: I did watch that video you linked. Interesting, if mind-bending stuff. One more reason to steer clear of 3D for the time being  :) )

tvtoon

Since you are doing GB, the first thing to note is the ROM banks, as almost every mapper uses the same scheme. So learn to mark your way every 16384 bytes as soon as possible.

With this logic in mind, learn also tilemaps and you are ready to explore the world! :D
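The bank arithmetic can be sketched as below, assuming the common scheme where bank 0 stays fixed at 0x0000-0x3FFF and every other 16384-byte bank is switched in at 0x4000-0x7FFF. The function names are made up for illustration, and some mappers have modes that bend this rule, so treat it as a starting point.

```python
BANK_SIZE = 0x4000  # 16384 bytes


def offset_to_bank_address(offset: int) -> tuple[int, int]:
    """Convert a ROM file offset to (bank number, CPU address)."""
    bank = offset // BANK_SIZE
    within = offset % BANK_SIZE
    # Bank 0 is seen at 0x0000-0x3FFF; any other bank appears at 0x4000-0x7FFF.
    address = within if bank == 0 else 0x4000 + within
    return bank, address


def bank_address_to_offset(bank: int, address: int) -> int:
    """Convert (bank number, CPU address) back to a ROM file offset."""
    if address < 0x4000:
        return address  # fixed bank 0 region, bank number is irrelevant
    return bank * BANK_SIZE + (address - 0x4000)


for offset in (0x0150, 0x4000, 0x8123):
    bank, address = offset_to_bank_address(offset)
    print(f"file 0x{offset:06X} -> bank {bank:02X}, address 0x{address:04X}")
```

This is mostly useful when a debugger shows a pointer such as 02:4123 and you want to know where to look in the hex editor, or the other way around.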

FAST6191

The layers model/abstraction is not a size thing as much as an ease of coding thing, especially if the original dev might have lacked the option to just hit recompile (C++ compilers + attendant libraries tending to be a bit thin on the ground, and big in the overhead, in the 80s and 90s for the GB/GBC's little Z80-alike).

Data heuristics. Depends on the system, but in the end many hackers will get a feel for things and develop softer ways of determining what something does. For text there is stuff like relative search, the fact that a space will usually come up every 7 or so characters, and other linguistic or computing* level things. People can usually detect RLE/LZ style compression, as things tend to start reasonably comprehensible and get worse as it goes on (some reckon they can do it for graphics as well but I have not really managed that one). For some systems you have BIOS functions that handle compression, so those are nice. Various formats (even if they are not separate files) might well have headers, magic stamps and certain quirks. For graphics I am pretty good at seeing what things should be if I am using the wrong tile width or wrong mode, or looking past rainbow colours from the default palette, when I am doing the "press page down a lot" routine with a tile editor. Same for a hex editor if I am trying to figure out the "line length" of a table and things make diagonal patterns or something. Back on compression, I will usually also notice if something can be compressed by virtue of repeating patterns. I don't know how many of those I would say go put some time into learning specifically. Indeed, other than the linguistic level text stuff, I would not put any time into specifically learning them if I still had other "hard science" aspects of hacking to learn; most are things that I found myself looking for/thinking about over the years.
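For the relative search mentioned above, a minimal sketch: it ignores what the actual byte values are and only matches the differences between neighbouring letters. It assumes the game stores a-z in alphabetical order in a single case, which is common but by no means guaranteed; the 0x80-based encoding in the demo is invented.

```python
def relative_search(data: bytes, word: str) -> list[int]:
    """Offsets where consecutive byte differences match those of word's letters."""
    if len(word) < 2:
        return []
    pattern = [ord(b) - ord(a) for a, b in zip(word, word[1:])]
    hits = []
    for start in range(len(data) - len(word) + 1):
        window = data[start:start + len(word)]
        deltas = [b - a for a, b in zip(window, window[1:])]
        if deltas == pattern:
            hits.append(start)
    return hits


# Made-up custom encoding where 'a' is stored as 0x80, 'b' as 0x81, and so on.
encoded = bytes(0x80 + ord(c) - ord("a") for c in "mario")
data = bytes([0x00, 0xFF]) + encoded + bytes([0x00])

for offset in relative_search(data, "mario"):
    guess_a = data[offset] - (ord("m") - ord("a"))
    print(f"hit at 0x{offset:04X}, so 'a' is probably 0x{guess_a:02X}")
```

Longer words with few repeated letters give fewer false hits, and one good hit is enough to start filling out a table.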

*for the GB/GBC it is all still pretty custom but for the DS you can bet I know the typical numbers involved for Shift JIS and EUC-JP encodings, and while my party trick of reading ASCII text in its hex form has waned somewhat in recent years I still know enough. On the flip side (pun intended), as big endian and little endian are a thing, I can usually force myself to do a fair amount of byte flipping in my head to account for endianness where necessary. I also don't do badly at shifting things (a logical shift might lose some precision but it does make for bigger numbers) or blanking bits in my head if the upper bits are used for something else (a 32 bit pointer can address gigabytes, rather more than the average file in a game, so sometimes the upper bits are used to indicate something like subdirectories or compression).
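The byte flipping and bit blanking can be sketched with Python's `struct` module. The pointer layout here (top bit as a "compressed" flag, remaining 31 bits as an offset) is a made-up example of the kind of thing meant, not any particular game's format.

```python
import struct


def read_both_ways(raw: bytes) -> tuple[int, int]:
    """Read the same four bytes as a little endian and a big endian 32 bit value."""
    little = struct.unpack("<I", raw)[0]
    big = struct.unpack(">I", raw)[0]
    return little, big


def split_pointer(value: int) -> tuple[int, int]:
    """Hypothetical layout: bit 31 is a flag, bits 0-30 are the offset."""
    flag = value >> 31            # logical shift, only the top bit survives
    offset = value & 0x7FFFFFFF   # blank the top bit, keep the rest
    return flag, offset


raw = bytes([0x34, 0x12, 0x00, 0x80])
little, big = read_both_ways(raw)
flag, offset = split_pointer(little)
print(f"little endian: 0x{little:08X}, big endian: 0x{big:08X}")
print(f"flag: {flag}, offset: 0x{offset:X}")
```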

Corruption. It really is as simple as it sounds. You find a tool to change the data (hex editor, patching program, dedicated corrupting program, https://xkcd.com/378/ ...) and change it. Assuming it does not crash, whatever is broken in the game when you fire it through an emulator or back on hardware (gets expensive if you have to burn discs but it has been done) is what deals with the data you changed. Sometimes you might copy in a series of known working data (so every enemy sprite is now one particular enemy, or every line of text is the same line...) or corrupt data to values within a range. For the GBA in some modes each colour component is 5 bits (15 in total) but the whole thing uses 16 bits because computing; having something other than 0 in the left over bit does not do well, so you might corrupt within a range or do an operation later to force it low. Corruption is considered quite crude by many these days but I am not going to let it die.
The program you linked is probably aimed more at people that like doing cartridge tilting ( https://www.youtube.com/watch?v=ZXTAK9BR_qk ) but if you fancy corrupting memory then I guess it could do something.
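A small sketch of range-limited corruption, plus the "force it low" pass for 16 bit little endian colours with an unused top bit. The function names, step size and ranges are arbitrary choices for illustration; a fixed seed is used so a given corruption can be reproduced once it does something interesting.

```python
import random


def corrupt(data: bytes, start: int, end: int, every: int = 8,
            low: int = 0, high: int = 255, seed: int = 0) -> bytes:
    """Overwrite every `every`-th byte in [start, end) with a value in [low, high]."""
    rng = random.Random(seed)
    out = bytearray(data)
    for pos in range(start, min(end, len(out)), every):
        out[pos] = rng.randint(low, high)
    return bytes(out)


def force_high_bit_low(data: bytes) -> bytes:
    """Treat data as little endian 16 bit values and clear bit 15 of each."""
    out = bytearray(data)
    for pos in range(1, len(out), 2):  # the high byte sits at the odd offset
        out[pos] &= 0x7F
    return bytes(out)


rom = bytes(64)  # stand-in for a real ROM image
mangled = corrupt(rom, start=16, end=32, every=4, low=1, high=3)
changed = [i for i in range(len(rom)) if rom[i] != mangled[i]]
print("changed offsets:", changed)
print(force_high_bit_low(bytes([0xFF, 0xFF, 0x34, 0x92])).hex())
```

Halving the corrupted range each time whatever you care about stops breaking narrows things down fairly quickly.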

As for the Mario 64 video it is not all that involved. That is just abusing a lot of different factors to do something a bit silly. Most of it is still model layer, addition layer, track layer and so forth, just in 3d space rather than 2d (oh and a lot of modern 2d games use 3d but view things flat so you never know, still might use 2d format styles though in those cases).

On the matter of cheats. I tend to find most people come to us knowing the basic find a cheat method ( https://web.archive.org/web/20080309104350/http://etk.scener.org/?op=tutorial is my preferred series of guides here). If you want to use that as a jumping off point to find stats for games, make simpler mechanical tweaks or create something like https://www.dragonflycave.com/mechanics/gen-i-capturing for your chosen game rather than tracking down graphics in a ROM to get familiar with it all then carry on.