Author Topic: Assembly hacking  (Read 12150 times)

Meijin

Assembly hacking
« on: May 14, 2012, 11:08:15 am »
I have always been told that you need to know assembly before attempting to hack anything more advanced. But there's a question I've always wanted to ask: how does knowing it help you achieve something? What does it actually do? Why can't I create Game Genie codes without knowing it? What is its true role in the hacking scene?

In short, I don't know how it supports us in hacking. Someone please give me a clue.

FAST6191

Re: Assembly hacking
« Reply #1 on: May 14, 2012, 12:35:38 pm »
If you know exactly how every atom in an item is fitted together you can predict exactly how it works and change how it works in a fundamental way.
Assembly is the rough computing equivalent: everything boils down to it (or at least it is an absolutely trivial abstraction from the 1 and 0 values we feed to processors). If you can find how the game knows to load something (which will come from an assembly instruction somewhere down the line) you can edit it. Many newer games abstract this further so the coders do not have to be troubled every time an artist wants to change something (an example would be plain pointers that are not encoded in the binary the CPU sees). That is why the rest of ROM hacking can happen without assembly, and/or works once someone has dived into the assembly and figured out how things work.

Game Genie codes for the most part were actual edits to the binary (which is to say the part that is the code that runs on the processor). Technically they just edited the ROM the console saw, so you could instead have edited graphics or something else in the ROM, but most of the time they edited the code the processor ran as that is the way you get all levels, infinite lives and so on. As you are just using a glorified patching method you need to know that the thing you are editing does something useful.
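To make the "glorified patching" idea concrete, here is a minimal Python sketch. It assumes a made-up four byte code fragment and a made-up patch address; only the two 6502 opcode values (DEC absolute and LDA absolute) are real. It is an illustration of the principle, not how any particular cheat device is implemented.

```python
# Toy model of a one-byte patch of the kind a cheat code applies.
# The ROM fragment and the target address are invented for illustration.

DEC_ABS = 0xCE  # 6502 opcode: DEC absolute (decrements a memory location)
LDA_ABS = 0xAD  # 6502 opcode: LDA absolute (only reads the location)


def apply_patch(rom: bytes, address: int, value: int) -> bytes:
    """Return a copy of rom with the byte at address replaced by value."""
    patched = bytearray(rom)
    patched[address] = value
    return bytes(patched)


# Hypothetical fragment: "DEC $075A" (lose a life) followed by RTS.
rom = bytes([DEC_ABS, 0x5A, 0x07, 0x60])

# Swap the decrement for a harmless load: the life counter never drops.
patched = apply_patch(rom, 0, LDA_ABS)
print(patched.hex())
```

The patch only works because someone knew that the byte at that address was the opcode of the "lose a life" instruction, which is exactly the knowledge assembly gives you.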

There is more to it, and although it is not strictly true that you need ASM to do more advanced stuff*, it is often easier in the end to use ASM techniques (which means knowing it) to edit things and/or initially reverse engineer them. The alternative is staring at the ROM and hoping something jumps out at you, or messing around with techniques that are in some ways easier but not so sure-fire (relative search tends not to do kanji, a tile viewer doesn't help when compression is used, a compression searcher can be defeated by custom compression, and a level editor is no use when everyone makes a custom format almost by definition).

*Although you can still use it, and in some cases it is still necessary for various reasons like encryption, modern games frequently use scripting languages like Python ( http://modiki.civfanatics.com/index.php/Civ4_Python ), Java (loads of things, but a nice early example would probably be Chrome from 2003) and Lua ( http://www.techknight.com/blog/2007/10/11/puzzle-quest-mods/ ). Unlike compiled C type languages (give or take C#), these can be converted between source code and what is actually run quite easily, if they need to be converted at all.

puzzledude

Re: Assembly hacking
« Reply #2 on: May 14, 2012, 01:02:02 pm »
Assembly is the most advanced of the hacking techniques. It is a series of text commands that are assembled into machine code to be inserted into a ROM. With it you can alter the game's actual code, or in other words, do new things with the game that the original couldn't do. But first you need to disassemble the game.

For instance, Zelda Alttp has recently been updated with a new disassembly, which means we now have the necessary RAM addresses. With this knowledge we can (for instance) make the Book of Mudora replenish green magic automatically, make a new HUD and menu, and make new rain or fog overlays that the original didn't have. Advanced assembly can also add new events to the game.

Using opcodes like LDA (load accumulator) we can load the accumulator with a value, such as LDA $7E0019 (in which case the accumulator is loaded with the value stored at RAM address $7E0019). STA (store accumulator) stores the contents of the accumulator to a RAM address, RTS returns from a subroutine call, etc.


Example (part of the asm of the rain overlay in Alttp):

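Rain:
   lda $7EE00E      ; load the Rain Activator flag
   bne .On          ; nonzero: go to the activation code (not shown in this excerpt)
   lda $7EE00D      ; load the Rain Deactivator flag
   bne .Off         ; nonzero: go to the deactivation code (not shown in this excerpt)
   bra Rain_End     ; neither flag is set: skip to the end

Rain_End:
   lda #$00         ; clear both flags
   sta $7EE00E
   sta $7EE00D
   lda $7EF3C5      ; load the value at $7EF3C5 into the accumulator before returning
   rtl              ; return from a long (JSL) subroutine call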

variables
7EE00E - Rain Activator
7EE00D - Rain Deactivator (overridden if rain is activated before the deactivation occurs)
7EE00C - Activates the overlay animations and nonimmediate rain things
7EE00F - Activates thunder and indoor ambience (can be set at the same time as the Rain Activator)
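
For anyone who reads Python more easily than 65816, the control flow of the excerpt can be modelled roughly as below. This is only a sketch: the `.On` and `.Off` code is not part of the excerpt, so the function just reports which path would be taken, and the dictionary standing in for RAM is an assumption of the example.

```python
# Rough Python model of the branch logic in the rain excerpt.
# RAM is modelled as a dict from address to byte value (an assumption).

RAIN_ACTIVATOR = 0x7EE00E
RAIN_DEACTIVATOR = 0x7EE00D


def rain(ram: dict) -> str:
    """Return the name of the path the routine would take."""
    if ram.get(RAIN_ACTIVATOR, 0) != 0:    # lda $7EE00E / bne .On
        return "on"
    if ram.get(RAIN_DEACTIVATOR, 0) != 0:  # lda $7EE00D / bne .Off
        return "off"
    # Rain_End: lda #$00 / sta $7EE00E / sta $7EE00D
    ram[RAIN_ACTIVATOR] = 0
    ram[RAIN_DEACTIVATOR] = 0
    return "end"


print(rain({RAIN_ACTIVATOR: 1}))
print(rain({RAIN_DEACTIVATOR: 1}))
print(rain({}))
```

Note that the activator is tested first, which matches the comment that the deactivator is overridden when both are set.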



Dr. Floppy

Re: Assembly hacking
« Reply #3 on: May 14, 2012, 07:02:31 pm »
ASM is the difference between you fearing the board and the board fearing you!


Meijin

Re: Assembly hacking
« Reply #4 on: May 15, 2012, 01:10:55 am »
Thanks for your replies, I roughly understand it now.

And what's the best document to start learning ASM for a newbie who has absolutely no programming experience, like me?

Dr. Floppy

Re: Assembly hacking
« Reply #5 on: May 15, 2012, 01:38:32 am »
Quote from Meijin: "And what's the best document to start learning ASM for a newbie who has absolutely no programming experience, like me?"

Assembly is all about creating instructions. Instructions are strings of two-digit numbers (in base sixteen). Each string/instruction consists of one, two or three such numbers depending upon the opcode. The opcode is the first number in the string.

While there are 256 possible two-digit hex values (00 to FF), only 151 represent valid NES opcodes. (And you regularly use only a few dozen of them in practice!)

So far so good?
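
If it helps to see the "opcode first, then zero to two more numbers" idea in action, here is a small Python sketch that splits a byte string into 6502 instructions. The opcode values and lengths in the table are real, but the table covers only five opcodes and the sample program is invented for the example.

```python
# (description, total length in bytes) for a handful of official 6502 opcodes.
OPCODES = {
    0xA9: ("LDA #imm", 2),  # opcode + one operand byte
    0xAD: ("LDA abs", 3),   # opcode + two address bytes
    0x8D: ("STA abs", 3),
    0xEA: ("NOP", 1),       # opcode only
    0x60: ("RTS", 1),
}


def split_instructions(code: bytes) -> list:
    """Walk the bytes, using each opcode to decide how long its instruction is."""
    result = []
    pc = 0
    while pc < len(code):
        name, length = OPCODES[code[pc]]
        result.append((name, code[pc:pc + length].hex()))
        pc += length
    return result


# Made-up program: LDA #$05 / STA $0200 / RTS
program = bytes([0xA9, 0x05, 0x8D, 0x00, 0x02, 0x60])
for name, raw in split_instructions(program):
    print(name, raw)
```

The first byte of each group decides how many bytes follow, which is why a disassembler has to start at a known instruction boundary.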


http://badderhacksnet.ipage.com/badderhacks/index.php?option=com_content&view=article&id=253:introduction-to-6502-assembly-nes-programming&catid=14:dr-floppy&Itemid=7

This tutorial reviews the whole "opcode" + "operand" = "instruction" thing, and it introduces you to your very first opcode!

Oh, yeah. This tutorial is probably NSFW in terms of language.


Pennywise

Re: Assembly hacking
« Reply #6 on: May 15, 2012, 01:39:34 am »
The best way to learn ASM is to learn how it applies to the system in question. Reading documents is great and all, but it doesn't do a whole lot of good if you don't know how to apply them. Learning a particular instruction set is best done by actually writing and modifying code. In other words, you learn the instruction set bit by bit as is necessary for whatever you're doing. Know how to use a debugger and the other hacking-related tools, which are lifelong companions to assembly. Bottom line is:

Try to read up on basic and intermediate hardware info such as the memory map, registers, etc.
Learn a particular assembly instruction set bit by bit to figure out the application of instructions.
Learn how to operate the tools of the trade.
Don't just change random things in a hex editor. Learn how to analyze the code and what it means.
Dig into a game. Start small in scope, but think about the big picture.

EarlJ

Re: Assembly hacking
« Reply #7 on: May 15, 2012, 02:50:33 am »
Once you get comfortable with some of the instruction set, it can also be helpful to look at a disassembly.

It's not the same as stepping through some unknown code, but before you go wading off the deep end on your own it might help you get a better handle on things to see someone else's explanation of what a given bit of code does. If it's well commented, try to see if you can follow what's going on without reading the comments.

As for what particular docs you would want to read... well, that sort of depends on what system you want to work with. General things apply to assembly broadly, but NES/SNES/Genesis/SMS all use different instruction sets, and if you're just starting out, dabbling in more than one could get confusing pretty quickly.

FAST6191

Re: Assembly hacking
« Reply #8 on: May 15, 2012, 05:59:32 am »
Yeah, as others said and I probably should have made clearer, every system is different, and that does mean system and not just CPU, as different memory mappings, interrupts, BIOS and similar calls, coprocessors and more exist. To this end, here are the extra paragraphs I threatened.

You have to handle the memory yourself for the most part, which is typically what gets every new assembly programmer (every big language outside assembly attempts to do away with having the programmer manage memory, at increasingly higher levels and for good reason). Unless you are doing a very big hack you will probably work directly with numeric addresses, although proper assemblers and disassemblers can define a shorthand/more human friendly name for an address and/or almost declare it as a variable. Even with that taken into account the differences between systems can be quite large: compare say a modern X86/X64 processor to a modern ARM and things are very different again (although in some ways this plays into the two big CPU philosophies* of RISC and CISC, and into the legacy junk x86 has to drag around with it, more than straight up philosophy). Having to learn all this is the price you pay for gaining the ability to control anything. More than most fields it is a great example of how every problem can be solved (within the limitations of the system) and how everything means something until you can prove otherwise, an idea that often takes new ROM hackers and computer scientists a bit of time to grasp.

*there are also architectures like modified Harvard architecture and Von Neumann but let us not get into that right now as very little in the console world uses anything resembling Harvard architecture.

Learn instructions little by little... certainly I would if I were to tackle X86, but even a simple processor might have some complex instructions. There is, however, a core group of a handful of instructions and concepts, and the vast majority of programs will rarely use anything else. To this end I would say learn the following (all are important, but I probably could have spent a little while longer on the ordering):

Basic debugging tools and how they work. A debugging emulator will typically have a break-on command (a breakpoint), and that is the most valuable one. Here you say: if this portion of memory is written to, stop emulation and tell me what caused it. This often leads immediately to the instruction or part of the ROM you want to look at. If compression is involved you will probably land on the portion of memory the data was copied to so as to be decompressed, but that just means you run it again from there and probably get where you want. Coincidentally you have just bypassed the compression and maybe figured out how it works if it is custom, making you very valuable to your team.
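
As a rough picture of what a break on write does inside a debugging emulator, here is a minimal Python sketch. The `WatchedMemory` class, the addresses and the program counter values are all invented for the example; real emulators pause execution rather than just logging.

```python
class WatchedMemory:
    """Toy RAM that records who wrote to a watched address."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.watch = set()  # addresses with a write breakpoint
        self.hits = []      # (pc, address, value) for each triggered break

    def break_on_write(self, address: int) -> None:
        self.watch.add(address)

    def write(self, address: int, value: int, pc: int) -> None:
        # pc is the address of the instruction doing the write.
        if address in self.watch:
            self.hits.append((pc, address, value))
        self.data[address] = value & 0xFF


mem = WatchedMemory(0x800)
mem.break_on_write(0x075A)       # hypothetical "lives" variable

mem.write(0x0010, 1, pc=0x8000)  # unrelated write, ignored
mem.write(0x075A, 2, pc=0x91D9)  # watched write, recorded

for pc, address, value in mem.hits:
    print(f"write of {value:#04x} to {address:#06x} by code at {pc:#06x}")
```

The interesting output is the `pc` value: it tells you which instruction in the ROM to go and read.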

The bootup sequence and if it is different (say for a modern console that has a menu) the code loading sequence for your console of choice.

What a register is and how many of them there are in the CPU you care about. Broadly speaking you will have general purpose ones and ones with a specific task (usually a pointer to the next instruction, a return address and maybe a flag register for sign, carry and other such things).

Learn the memory layout including where any IO is (controllers will tend to

Learn what DMA is (you can pipe everything through the CPU but that would be unbearably slow in many cases not to mention taking up valuable CPU time so most devices outside the simplest ones will be able to trigger a memory transfer).

Learn what interrupts are and which are available to you (you can check to see if something has happened every cycle, but it is better if the game knows to interrupt whatever it is doing and focus on the next thing when something happens).

Learn what the stack is. In short it is a step up from the registers (or a step down in speed) which can hold values that do not want to be written back to memory yet but do not need to be sitting taking up a register.
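
A minimal sketch of the idea in Python, assuming a made-up CPU with a single register `a`: the value is parked on the stack, the register is reused, and the value is brought back afterwards.

```python
class TinyCPU:
    """Hypothetical CPU with one register and a stack, for illustration only."""

    def __init__(self):
        self.a = 0
        self.stack = []

    def push_a(self) -> None:
        self.stack.append(self.a)  # save A on top of the stack

    def pop_a(self) -> None:
        self.a = self.stack.pop()  # restore A from the top of the stack


cpu = TinyCPU()
cpu.a = 0x42
cpu.push_a()     # park the value we still need
cpu.a = 0x07     # reuse the register for something else
scratch = cpu.a
cpu.pop_a()      # get the original value back
print(hex(cpu.a), hex(scratch))
```

On real hardware the stack lives in ordinary memory and a stack pointer register tracks the top, which is why it is slower than a register but needs no address of its own in your code.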

Learn what an operand is and what limitations operands have. ARM, for example, can only access memory through dedicated load and store instructions and can not use a memory location as an operand of other instructions, unlike X86 which can and frequently does. Generally speaking there are three classes of value: other registers (take the contents of R2 and add the contents of R1 to them), immediates (add 45h to the contents of R1, although you might then have to declare a destination as well) and memory locations, which we already talked about.

Learn how an instruction is written in your assembler of choice. Typically this is something like instruction, destination, source and immediate value, but the order of the operands can differ between assemblers and often does.

Appreciate any restrictions on the processor and its operating modes. X86 notably has overlapping registers: the 16 bit register is a subset of the 32 bit register, meaning that if you overwrite a 32 bit register the equivalent 16 bit one also gets overwritten.
ARM in the case of the GBA and DS has THUMB mode, which uses 16 bit instructions at the cost of having only some registers available to a lot of the instructions in it.
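
The X86 register overlap can be sketched in Python with a bit of masking. The `X86Reg` class below is purely illustrative; it models only EAX and its low 16 bit half AX, as they behave in 32 bit mode.

```python
class X86Reg:
    """Toy model of EAX and its low 16 bit view AX."""

    def __init__(self):
        self.eax = 0

    @property
    def ax(self) -> int:
        return self.eax & 0xFFFF  # AX is simply the low half of EAX

    @ax.setter
    def ax(self, value: int) -> None:
        # Writing AX replaces the low 16 bits and keeps the upper 16.
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)


r = X86Reg()
r.eax = 0x12345678
low = r.ax      # reading AX after writing EAX
r.ax = 0xBEEF   # writing AX changes part of EAX
print(hex(low), hex(r.eax))
```

There is only one piece of storage here, seen at two sizes, so code that "saves" a value in AX and then loads EAX has lost it.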

If you do not already know, learn what big and little endian mean.
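
A quick Python illustration of the difference, using only the standard `int.to_bytes` and `int.from_bytes` calls. The value 0x1234 is arbitrary. The 6502 family is little endian, which is why addresses in a hex editor look "backwards" there.

```python
value = 0x1234

little = value.to_bytes(2, "little")  # low byte stored first
big = value.to_bytes(2, "big")        # high byte stored first

print(little.hex(), big.hex())

# Reading the same two bytes with the wrong assumption gives a different number.
wrong = int.from_bytes(little, "big")
right = int.from_bytes(little, "little")
print(hex(wrong), hex(right))
```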

On instructions: you can learn all of them, but as mentioned a subset of key ones or their classes is better.

push and pop - push saves a register's value on the stack, freeing the register to do something else, and pop returns the value to it.

Mov - in most cases this copies a value, either an immediate from the instruction or the contents of a register, to another register (note that it is a copy as opposed to a move, which would imply the original location is reset to 0, which most mov instructions do not do).

NOP - an instruction that does nothing. If you have to overwrite an instruction or two (say a branch for an anti piracy check), NOP the thing and it will never have happened (granted it can get a bit more complex than that), and you will not have to redo the rest of the ROM because everything is now 32 bits or so out of line from the original.
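
Here is a small Python sketch of NOPing out a branch. It assumes 6502 code, where NOP is the single byte EA and BNE is D0 followed by a one byte offset; the code fragment itself and the helper name `nop_out` are invented for the example.

```python
NOP = 0xEA  # 6502 NOP, one byte
BNE = 0xD0  # 6502 BNE, followed by a one byte relative offset


def nop_out(code: bytes, start: int, length: int) -> bytes:
    """Overwrite length bytes at start with NOPs, keeping the total size."""
    patched = bytearray(code)
    patched[start:start + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical check: LDA $0300 / BNE +5 / RTS
code = bytes([0xAD, 0x00, 0x03, BNE, 0x05, 0x60])

# The branch is two bytes long, so it takes two NOPs to replace it.
patched = nop_out(code, 3, 2)
print(patched.hex())
```

Because the patched code is exactly as long as the original, every address after it stays where the rest of the game expects it to be.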

Classes of instruction
The ones already mentioned probably fall into the CPU/memory management class but they are vital.
Add. Does what it says but there might be extra ones to account for signed values or to add a chain of them or to add a register to a register along with an immediate value.

Subtract. See add above but replace with subtract.

Boolean logic and bitwise operations. The power of boolean logic is undeniable and as such most processors have abilities here (although you may be limited in the NOR, XNOR and NAND department, but that is easily worked around) and most will also have bitwise operations (shifts, rotates and maybe a flip).
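
As a sketch of the "easily worked around" part, here are 8 bit versions of NAND, a rotate and a flip built from the operations nearly every CPU does have. The function names are made up, and the rotate is a plain one; many real CPUs (the 6502 included) rotate through the carry flag instead.

```python
MASK = 0xFF  # keep results to 8 bits, like an 8 bit register


def nand8(a: int, b: int) -> int:
    """NAND built from AND followed by NOT."""
    return ~(a & b) & MASK


def rol8(value: int, count: int = 1) -> int:
    """Rotate left: bits that fall off the top come back in at the bottom."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & MASK


def flip8(value: int) -> int:
    """Invert every bit, done here as XOR with all ones."""
    return value ^ MASK


print(hex(nand8(0b1100, 0b1010)), hex(rol8(0b10000001)), hex(flip8(0x0F)))
```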

Multiply. More or less the same as the other maths but naturally you are more likely to exceed the register size with a multiplication so learn how it handles that and how to handle it.

Divide - very few consoles will have this in the main CPU (hell, the GBA and DS do not) but they will usually have a method to do it: a coprocessor, a BIOS call, or inbuilt log tables (remember 4/5 = ? can be written as log(4) - log(5) = log(?)). This comes right back around to the every system is different thing.

Branch and compare - you can run a program straight from start to finish, but all but the simplest programs become horrific if you try that, so assembly and processors allow you to jump to something and return later. Also in this class are the compare and branch instructions, which are the processor level manifestation of IF ELSE and of loops.
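
The log trick can be checked in a few lines of Python. This sketch uses floating point `math.log` and `math.exp` purely for illustration; a console would use precomputed fixed point lookup tables instead, which is an assumption about typical practice rather than a description of any particular system.

```python
import math


def divide_via_logs(a: float, b: float) -> float:
    """Compute a / b as exp(log(a) - log(b)); both inputs must be positive."""
    return math.exp(math.log(a) - math.log(b))


result = divide_via_logs(4, 5)
print(result, math.isclose(result, 4 / 5))
```

The appeal on old hardware is that a division turns into two table lookups, a subtraction and one more lookup, all of which are cheap.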

Memory load and memory store. Fairly vital.

Links for your chosen system and the processor(s) it uses can usually be found in the docs section of this site (it being the mission of the site to collect such information and all) along with any debugging emulators. You can usually also find a document from the processor manufacturer (ARM, Motorola, Intel and to a slightly lesser extent IBM pretty much have the processor markets sewn up, and especially as far as consoles go they are very open with their stuff).

Meijin

Re: Assembly hacking
« Reply #9 on: May 15, 2012, 06:45:45 am »
Wow, that's some massive lecturing there, FAST6191. I don't know if I am good enough to handle such a difficult language but I'll give it my best shot. I truly appreciate your help, and thank the others for sparing their precious time to help a tyro like me.

Again, thank all of you.

Btw, the systems I want to work with are NES and SNES.

FAST6191

Re: Assembly hacking
« Reply #10 on: May 15, 2012, 09:11:08 am »
Yeah, I probably dropped a bit too much in one go and took it closer to an extreme than I would like, although I do advocate working towards that. Even if you do learn it all, it does not mean you automagically gain the rest of computer science (I can mess around in various flavours of ASM but I would not call on me to program a new database language or a useful GUI), although those that do usually find themselves able to learn most other aspects of computing somewhat more easily than those that have not.

Still

You have an AR code that gives infinite lives and find the instruction that subtracts 1 when you lose a life- change it to an add instruction. Congratulations you just did assembly hacking.

You find the location of a tile in memory you want to edit but do not know where it is in the ROM. You set a breakpoint on your emulator and when that tile eventually gets written into memory you know where it came from and can point your tile editor at it. Congratulations you just did an assembly hack.

You find the function that sets each text character to be 16 pixels apart via an add 10h instruction and change it to 0Ch instead (this tends to trouble Japanese, but Roman alphabet languages are quite happy with thinner characters). Congratulations you have just done an assembly hack.

You discover a new custom sort of LZ compression, and looking at the function's assembly tells you what the flags mean (this can usually be determined by looking at the data, but not always) or the limits of the format. Even if you reprogram the decompression and compression functions in C or something afterwards, assembly skills still gave it to you (in practice reverse engineering work like this is probably the most common use beyond the basic breakpoint and working around anti piracy protection).

You might struggle to think up, and then put into practice, a method to subvert a font handling format and encoding scheme so that a game uses an 8 bit, variable width font with true line handling (gjyf and such), all while keeping the game running (these systems were frequently pushed to the limit). You might equally struggle to hand optimise a loop or fix a function from a game programmer that phoned it in. That does not mean you can not do some serious damage with some basic ASM skills.
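
To give a feel for what "reprogramming the decompression function" ends up looking like, here is a Python sketch of a decoder for a hypothetical LZ style format. The layout (a flag byte covering eight items, most significant bit first, where a set bit means a two byte back-reference and a clear bit means a literal) is invented for the example and only loosely resembles formats seen in real games; working out details like these from the assembly is exactly the job described above.

```python
def lz_decompress(data: bytes, out_size: int) -> bytes:
    """Decode a made-up LZ style stream into out_size bytes."""
    out = bytearray()
    pos = 0
    while len(out) < out_size:
        flags = data[pos]
        pos += 1
        for bit in range(7, -1, -1):  # most significant flag first
            if len(out) >= out_size:
                break
            if flags & (1 << bit):
                # Back-reference: 4 bit length, 12 bit distance.
                b0, b1 = data[pos], data[pos + 1]
                pos += 2
                length = (b0 >> 4) + 3
                distance = (((b0 & 0x0F) << 8) | b1) + 1
                for _ in range(length):
                    out.append(out[-distance])  # copy from earlier output
            else:
                out.append(data[pos])  # literal byte, copied as is
                pos += 1
    return bytes(out[:out_size])


# Flags 0x20: literal, literal, back-reference (length 4, distance 2).
packed = bytes([0x20, 0x61, 0x62, 0x10, 0x01])
print(lz_decompress(packed, 6))
```

The "+ 3" and "+ 1" offsets are the sort of limits of the format that are hard to guess from the data alone but obvious once you read the decoding loop.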

Dr. Floppy

Re: Assembly hacking
« Reply #12 on: May 16, 2012, 01:02:04 am »
Quote: "Aww, it requires a login."

Yes, I (like a jerk) forgot to specify its access level as "Public".

It should work now.

Meijin

Re: Assembly hacking
« Reply #13 on: May 16, 2012, 01:55:25 am »
It seems just talking will get me nowhere, but some video tutorials would be a big help. I've recently found a series of assembly teaching videos aimed at hackers, but I wonder if it's really useful for ROM hacking. Please have a look at it:

http://www.youtube.com/results?search_query=assembly+for+primers+hacker&oq=assem&aq=0p&aqi=p-p2g8&aql=&gs_l=youtube-psuggest.3.0.35i39l2j0l8.891.1793.0.3276.5.5.0.0.0.0.182.636.1j4.5.0...0.0.cTeiTn6M6-o

Does anyone have any idea?

FAST6191

Re: Assembly hacking
« Reply #14 on: May 16, 2012, 04:31:15 am »
Most of those appear to be X86, which to be fair is probably the most useful type of ASM right now, give or take programmable chip/FPGA type devices (although in some ways those are different as you are creating instructions as opposed to just using them). I love hacking videos ( http://www.youtube.com/user/ChRiStIaAn008 being responsible for more lost productivity than I care to think about at this stage) but I have yet to see any good ones that teach straight ASM that can stack up against the likes of http://homepage.mac.com/randyhyde/webster.cs.ucr.edu/index.html and http://stuff.pypt.lt/ggt80x86a/asm1.htm (smaller but sometimes nicer if you know some basics), much less teach it for hackers. This is not to say there are no good programming/computer science tutorial videos, ( http://www.youtube.com/view_play_list?p=6B940F08B9773B9F ) and ( http://www.youtube.com/playlist?list=PL4C4720A6F225E074&feature=plcp ) being amazing for it. At best the ASM ones I have seen teach you nothing that you probably have not read in this thread already: how an instruction is formatted, what the idea of an architecture is, and how to set up a debugger (usually GDB). That is all fine, but it will at best leave you with a bit of general knowledge and some skills in how to follow a method rather than how to think in such a way that you can solve problems. That is great if you are moving to a new system but not so good if you are in your position.

Also X86 is quite different to the stuff that dominates consoles once you get past the bare minimum: Motorola 68000 (Genesis), early Intel 8080/Z80 derivatives (Master System and Game Boy), the 6502 family (NES, and the related 65816 in the SNES), IBM PowerPC (GameCube, Wii and 360), MIPS (PS1, PS2 and PSP) and ARM (GBA and DS). Between the legacy stuff it carries and the fact that, unlike most of those, it is a CISC processor, it gets different. Now I have in the past pointed people looking to learn assembly for console hacking at those two x86 guides above, as they do teach how to think like an assembly programmer and it is possible to take what you learn from them and twist it to consoles (I certainly did), but those YouTube videos you linked are probably not the way to do it.

Meijin

Re: Assembly hacking
« Reply #15 on: May 17, 2012, 02:35:18 am »
Thanks, I will cast them aside and follow your lead.