Author Topic: Some stupid questions regarding ASM.  (Read 8329 times)

Vaccy

Some stupid questions regarding ASM.
« on: October 01, 2013, 01:30:37 pm »
These are probably very simple questions, and while I'm not working on anything at the moment, I've just been curious about a few things...

1. How does one compile ASM code?

I've seen tutorials where people use a language of choice (C, Python) and then compile it and add it to the game's code. What I'm wondering is, for one, how does one compile it into ASM? Is this an option in a standard compiler? Or are other programs required?

2. How to insert said code?

Once it's compiled and ready to go, how is this code inserted into a rom file? I know that free space in a rom can be used to add more code, but how is it inserted?

3. Does the rom file need to be updated for it to find the new ASM added?

What I mean is, once you've added code for, say, a VWF, wouldn't you also need to make the game look wherever you've put the new code in order to run the VWF?

4. Are there preferred languages to code in?

I have some old Java books lying around that I could try to read up on.

I'm sure I could think of more questions, but this should do for now. Sorry if I've covered something done before.

puzzledude

Re: Some stupid questions regarding ASM.
« Reply #1 on: October 01, 2013, 02:13:13 pm »
1. You assemble ASM code with an assembler (such as xkas).

Your question, "How does one compile it into ASM?", is somewhat strange. Once the ASM is written, it is assembled into "hex code", which I call "ASM to hex conversion" (this is what the assembler actually does; I use xkas for it). So xkas turns your ASM into a series of bytes (which you can view with a hex editor) that the CPU can understand.

However, this is not the same as compiling. By compiling I usually mean source code being compiled into an exe (which is then an independent program, rather than "just" a separate piece of code).



2. The code is inserted automatically (when you assemble, you give the assembler both the ASM file and the ROM file). But you cannot just add the code. You need to know where to "hook" it, usually referred to in slang as a "hijack". This is essentially where the name "romhacking" comes from. Finding that place takes extensive "tracking", usually called "tracing".

You trace the code to learn how it is executed (usually there is a pointer or a jump leading to a routine). If you overwrite part of it with a JMP or JSL to empty space, you "hook" or "hack" the original code so that it runs your new code (which is written into the previously empty space). The hook itself is usually written at the very start of the ASM file, so it is part of the same ASM.

Once you have the ASM inserted with xkas, you can compare the original with the hacked file to see which hex changes the ASM made. You can then save those bytes in a txt file and insert the same code at any time with a hex editor (an alternative way to do or repeat the same hack).
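The "compare the original with the hacked file" step can be sketched in a few lines of Python. This is only an illustration: `diff_runs` is a made-up helper name, and the 16-byte "ROMs" are tiny stand-ins built around the six hook bytes from the example later in the thread.

```python
def diff_runs(original: bytes, hacked: bytes):
    """Return (offset, old_bytes, new_bytes) for each run of changed bytes."""
    if len(original) != len(hacked):
        raise ValueError("this sketch only compares files of equal size")
    runs = []
    start = None  # offset where the current run of differences began
    for i, (a, b) in enumerate(zip(original, hacked)):
        if a != b and start is None:
            start = i
        elif a == b and start is not None:
            runs.append((start, original[start:i], hacked[start:i]))
            start = None
    if start is not None:  # a run that reaches the end of the file
        runs.append((start, original[start:], hacked[start:]))
    return runs


# Stand-in data: 4 filler bytes, the 6 bytes of interest, 6 filler bytes.
original = bytes.fromhex("FFFFFFFF" "A5009F00207F" "FFFFFFFFFFFF")
hacked = bytes.fromhex("FFFFFFFF" "EAEA2200FE0E" "FFFFFFFFFFFF")

for offset, old, new in diff_runs(original, hacked):
    print(f"{offset:06X}: {old.hex(' ').upper()} -> {new.hex(' ').upper()}")
```

The change shows up as two runs rather than one, because the fourth byte (00) happens to be identical in both versions. With real files you would read them with `open(path, "rb").read()` instead of the stand-in data.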



3. You don't need to update anything else. The game will know where to look, because the jump to the new routine is part of the ASM you assembled.



4. The typical one is C++.
« Last Edit: October 01, 2013, 02:18:56 pm by puzzledude »

Vaccy

Re: Some stupid questions regarding ASM.
« Reply #2 on: October 01, 2013, 02:49:24 pm »
Thanks for the info, this clears some things up for me.

About #1, I guess I was thinking there was a way to write something in a language such as Python, which is then used by another program to convert to ASM, and so forth. Seems that's not the case.

What confused me is this program written in python to decompress snes graphics in quintet games. https://github.com/Osteoclave/Game-Tools/blob/master/quintet_decomp.py


puzzledude

Re: Some stupid questions regarding ASM.
« Reply #3 on: October 01, 2013, 03:04:40 pm »
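What you see there is actually source code for a tool that runs on your PC, not code that goes into the game. When you run it (Python scripts are normally run by the Python interpreter rather than compiled into an exe), you will be able to decompress the gfx.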

I can also give you an example of ASM (learning by example is usually very effective).



ASM (for SNES):

org $01d87d
NOP
NOP
JSL $0EFE00

org $0efe00
CMP #$2727 ;check if block is moved
BEQ $0F ; if yes branch to after first RTL
LDA $7F2000,x
STA $7F5000
LDA $00 ; load the value stored at direct-page address $00 (original code)
STA $7F2000,x
RTL
LDA $7F5000 ; branch here if 27
STA $7F2000,x ; store intermediate instead of 27
RTL
---------------------


The ASM above, assembled:
EA EA 22 00 FE 0E  (auto-written at D87D= org $01d87d)

C9 27 27 F0 0F BF 00 20 7F 8F 00 50 7F A5 00 9F 00 20 7F 6B AF 00 50 7F 9F 00 20 7F 6B (auto-written at 77E00= org $0efe00)
---------------------
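To see why the operand of BEQ is 0F, here is a rough Python check of where the branch lands. It assumes the usual rule for 65xx relative branches (a signed 8-bit displacement counted from the instruction after the branch); `branch_target` is just an illustrative name, and bank wrap-around is ignored.

```python
def branch_target(branch_addr: int, offset_byte: int) -> int:
    """Target of a 2-byte relative branch (e.g. BEQ) located at branch_addr."""
    # The operand byte is a signed 8-bit displacement...
    displacement = offset_byte - 0x100 if offset_byte >= 0x80 else offset_byte
    # ...measured from the address of the instruction that follows the branch.
    return branch_addr + 2 + displacement


# The 29 assembled bytes of the new routine, as listed above.
routine = bytes.fromhex(
    "C9 27 27 F0 0F BF 00 20 7F 8F 00 50 7F A5 00 "
    "9F 00 20 7F 6B AF 00 50 7F 9F 00 20 7F 6B"
)
base = 0x0EFE00          # org of the routine
beq_addr = base + 3      # BEQ comes right after the 3-byte CMP
target = branch_target(beq_addr, routine[4])

print(f"BEQ at ${beq_addr:06X} lands on ${target:06X}")
print(f"byte there: {routine[target - base]:02X}")
```

The byte at the target is AF, the opcode of the LDA $7F5000 that follows the first RTL, which matches the comment on the BEQ line.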

Understanding the "conversion":
see this?
LDA $00
STA $7F2000,x
RTL

LDA $00 is opcode A5 followed by the operand byte 00, so A5 00. STA $7F2000,x is opcode 9F followed by the address with its bytes reversed (little-endian), so $7F2000 becomes 00 20 7F. RTL is 6B.

The code above is then A5 00 9F 00 20 7F 6B in hex (this is what the assembler does).
---------------------
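The same conversion can be written as a toy assembler in Python. This is a minimal sketch that knows only the three instructions in the snippet above; the table layout and the `encode` helper are invented for illustration and are nothing like a full assembler such as xkas.

```python
# Opcode bytes for just the addressing modes used in the three-line snippet.
OPCODES = {
    ("LDA", "dp"): 0xA5,       # LDA $nn        (direct page)
    ("STA", "long,x"): 0x9F,   # STA $nnnnnn,x  (absolute long, X-indexed)
    ("RTL", "implied"): 0x6B,  # RTL            (no operand)
}
OPERAND_SIZE = {"dp": 1, "long,x": 3, "implied": 0}


def encode(mnemonic: str, mode: str, operand: int = 0) -> bytes:
    """Emit the opcode byte followed by the operand in little-endian order."""
    opcode = OPCODES[(mnemonic, mode)]
    size = OPERAND_SIZE[mode]
    operand_bytes = operand.to_bytes(size, "little") if size else b""
    return bytes([opcode]) + operand_bytes


program = (
    encode("LDA", "dp", 0x00)
    + encode("STA", "long,x", 0x7F2000)
    + encode("RTL", "implied")
)
print(program.hex(" ").upper())
```

The "address reversed" step is the `"little"` argument: the low byte of $7F2000 is written first.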

Location:
the org $0efe00 is actually a SNES address, which is 77E00 as PC or hex address (this is the definition of the new location of the code to be written)

we jump here with this: JSL $0EFE00= 22 00 FE 0E  in hex

the original code to be hacked was traced, it is here:
D87D= org $01d87d.
---------------------
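The SNES-to-PC address conversion can be sketched like this. The thread does not say which memory map the game uses; the numbers quoted above are consistent with a LoROM game and a file without a copier header, so that is what this sketch assumes. The function names are made up for illustration.

```python
def lorom_to_pc(snes_addr: int, header: int = 0) -> int:
    """Convert a LoROM SNES address to a file offset (assumes LoROM mapping)."""
    bank = (snes_addr >> 16) & 0x7F
    offset = snes_addr & 0xFFFF
    if offset < 0x8000:
        raise ValueError("in LoROM, ROM is mapped at $8000-$FFFF of each bank")
    # Each bank contributes 0x8000 bytes of ROM.
    return bank * 0x8000 + (offset - 0x8000) + header


def pc_to_lorom(pc: int, header: int = 0) -> int:
    """Convert a file offset back to a LoROM SNES address."""
    pc -= header
    return ((pc // 0x8000) << 16) | 0x8000 | (pc % 0x8000)


for snes in (0x01D87D, 0x0EFE00):
    print(f"${snes:06X} -> {lorom_to_pc(snes):X}")
```

If the ROM file has a 512-byte copier header, passing `header=0x200` shifts every file offset accordingly.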

The original code was A5 00 9F 00 20 7F (located at D87D). We overwrote those six bytes with EA EA 22 00 FE 0E (two NOPs plus the JSL) and "hacked" the routine into C9 27 27 F0 0F BF 00 20 7F 8F 00 50 7F A5 00 9F 00 20 7F 6B AF 00 50 7F 9F 00 20 7F 6B (located at 77E00, previously empty space). The jump is only necessary because there is no room for the longer routine at the original location.
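Put together, the whole hack amounts to two writes into the ROM file. Here is a hedged Python sketch of doing that by hand, using the offsets and bytes quoted above. The `apply_hack` helper and the idea that free space is filled with 00 are assumptions (some ROMs pad with FF instead), and the "ROM" is an in-memory dummy so the example runs without any real file.

```python
HOOK_OFFSET = 0x0D87D
ROUTINE_OFFSET = 0x77E00
ORIGINAL = bytes.fromhex("A5 00 9F 00 20 7F")
HOOK = bytes.fromhex("EA EA 22 00 FE 0E")  # NOP, NOP, JSL $0EFE00
ROUTINE = bytes.fromhex(
    "C9 27 27 F0 0F BF 00 20 7F 8F 00 50 7F A5 00 "
    "9F 00 20 7F 6B AF 00 50 7F 9F 00 20 7F 6B"
)


def apply_hack(rom: bytearray, free_byte: int = 0x00) -> None:
    """Write the hook and the new routine, refusing to touch unexpected data."""
    if rom[HOOK_OFFSET:HOOK_OFFSET + len(ORIGINAL)] != ORIGINAL:
        raise ValueError("unexpected bytes at the hook site")
    target = rom[ROUTINE_OFFSET:ROUTINE_OFFSET + len(ROUTINE)]
    if len(target) != len(ROUTINE) or any(b != free_byte for b in target):
        raise ValueError("the space for the new routine is not empty")
    rom[HOOK_OFFSET:HOOK_OFFSET + len(HOOK)] = HOOK
    rom[ROUTINE_OFFSET:ROUTINE_OFFSET + len(ROUTINE)] = ROUTINE


# Dummy 512 KiB "ROM" of zeros with the original six bytes planted in it.
rom = bytearray(0x80000)
rom[HOOK_OFFSET:HOOK_OFFSET + len(ORIGINAL)] = ORIGINAL
apply_hack(rom)
print(rom[HOOK_OFFSET:HOOK_OFFSET + 6].hex(" ").upper())
```

Checking the original bytes first means that running the patch twice, or on the wrong ROM, fails loudly instead of corrupting the file.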


snarfblam

Re: Some stupid questions regarding ASM.
« Reply #4 on: October 01, 2013, 04:05:48 pm »
Quote
1. How does one compile ASM code?

I've seen tutorials where people use a language of choice (C, Python) and then compile it and add it to the game's code.

Although the two principles are very similar, ASM is assembled, and languages like C and Python are compiled. It sounds like what you're getting at are some languages and tools that compile to ASM which must then be assembled into machine language and then inserted into a game. For example, I know there is a topic on NesDev where somebody is writing a new 6502 language that is compiled to ASM and then assembled into a ROM. (From the perspective of the programmer that all happens in one step.) For a couple of the more complex hacking tasks I've undertaken, I've written custom utilities in C# that produce ASM.

BRPXQZME

Re: Some stupid questions regarding ASM.
« Reply #5 on: October 01, 2013, 09:10:15 pm »
Quote
About #1, I guess I was thinking there was a way to write something in a language such as Python, which is then used by another program to convert to ASM, and so forth. Seems that's not the case.
This would be the case for a large number of “somethings” if you wrote in a language capable of targeting the machine code level (there is such a thing as a “cross-compiler”, a compiler that targets a different system than the one it’s running on). C and C++ are two languages that are almost always compiled to machine code. But unless you are using some rather experimental packages, languages like Python or Java typically compile to some kind of bytecode, which must either be recompiled into machine code or be interpreted to run (in practice, it’s usually a mix of the two).
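If you want to see the bytecode point for yourself, the standard `dis` module will show what CPython compiles a function to. This is just a quick sketch: `double` is a throwaway example function, and the exact instructions printed differ between Python versions, so no particular output is shown here.

```python
import dis


def double(n):
    return n * 2


# The raw bytecode lives on the function's code object as plain bytes.
raw = double.__code__.co_code
print(raw.hex(" "))

# dis decodes those bytes into named virtual-machine instructions.
for ins in dis.get_instructions(double):
    print(ins.opname, ins.argrepr)
```

None of what gets printed is machine code for any real CPU. It is input for the Python virtual machine, which is why a script like this cannot be dropped into a ROM as-is.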

The thing is that the higher-level your language, the more likely it is you pay a price in terms of performance to take advantage of various tools those languages provide. Even a tool as simple as a C-style function call can get a little too heavy for some systems. So, good Python writing tends to have things in it like dictionaries, variable-length arrays, higher-order functions, and objects. These are very convenient abstractions for programming, but trying to make them work on, say, the NES would be a poor choice. Even if you could translate it to NES machine code, it wouldn’t run so well.

To answer your question #4: You should probably use a language that is easy to understand and to write, that makes it easy to accomplish the task at hand, and that other people can run if other people need to run it (and that other people can read if other people need to read it!).

For making a ROM hacking utility, there are many languages that will let you manipulate the binary data as you please, and will be around for years to come; in that case you can just pick one. For making code that goes into a hacked game, you’ll probably need assembly language or something that can hook up to it pretty closely (probably; it kind of depends on the system and, if relevant, the languages used in the game’s development).

Quote
The Tao gave birth to machine language. Machine language gave birth to the assembler.

The assembler gave birth to the compiler. Now there are ten thousand languages.

Each language has its purpose, however humble. Each language expresses the Yin and Yang of software. Each language has its place within the Tao.

But do not program in COBOL if you can avoid it.
we are in a horrible and deadly danger

Dr. Floppy

Re: Some stupid questions regarding ASM.
« Reply #6 on: October 02, 2013, 05:27:18 am »
I never did understand why people insist upon compiling ASM code. It's like performing cranial neurosurgery with cooking tongs whilst wearing oven mitts....  :banghead:

Vaccy

Re: Some stupid questions regarding ASM.
« Reply #7 on: October 03, 2013, 01:44:38 pm »
Quote
I never did understand why people insist upon compiling ASM code. It's like performing cranial neurosurgery with cooking tongs whilst wearing oven mitts....  :banghead:

Seeing as I get compiled and assembled mixed up, what do you mean?

Thanks everyone, I've been reading some good docs on ASM in SNES games that's been very helpful.

VicVergil

Re: Some stupid questions regarding ASM.
« Reply #8 on: October 03, 2013, 06:52:06 pm »
Compiled = for high-level languages (C++, Java, etc.): transforms a program written in that language into an executable (binary data).

Assembled = for low-level languages, i.e. assembly/ASM (6502, ARM, R3000, Z80, and many other processors): transforms a program written in the assembly language for that processor into binary data. Assembly uses far fewer, simpler operations (the ones supported by the processor) to do everything, no matter how complicated and tedious that is.

The code assembled is far more optimized (fewer lines, more efficient, especially on weak systems) than the same code compiled in high level languages.
Plus you'll need that precious space.

Bregalad

Re: Some stupid questions regarding ASM.
« Reply #9 on: October 04, 2013, 11:04:56 am »
Higher-level languages are generally compiled THEN assembled (C -> Assembly -> Binary), although this process is typically transparent nowadays for people who are not huge enthusiasts like we are.

And
Quote
The code assembled is far more optimized (fewer lines, more efficient, especially on weak systems) than the same code compiled in high level languages.
It really depends on the target machine and on the compiler. In some cases, compiled code might be just as good as, if not slightly better than, hand-written assembly.
This is especially the case for RISC processors, where a high-quality compiler can use registers just as well as a human could and can schedule instructions to minimize pipeline hazards, something that is extremely annoying to do by hand.

henke37

Re: Some stupid questions regarding ASM.
« Reply #10 on: October 04, 2013, 11:23:56 am »
No, most compilers skip the assembly stage and go directly to bytecode.

Bregalad

Re: Some stupid questions regarding ASM.
« Reply #11 on: October 05, 2013, 08:24:00 am »
I think it depends on which language you're talking about. I had C in mind; it looks like you have Java in mind.

snarfblam

Re: Some stupid questions regarding ASM.
« Reply #12 on: October 05, 2013, 12:40:19 pm »
Languages compiled to machine language typically go directly from program code to machine language during compilation, with no assembly step in between. Languages like Java, C#, and VB are closer to "compiled then assembled" in that they are compiled to an intermediate ASM-like bytecode.

Bregalad

Re: Some stupid questions regarding ASM.
« Reply #13 on: October 05, 2013, 02:23:16 pm »
Quote
Languages compiled to machine language typically go directly from program code to machine language during compilation, with no assembly step in between.
Huh, no, just no.
You might not see it by default, but it's still done internally. Just use the -S flag (with GCC, for example), and your output magically turns into assembly code.

snarfblam

Re: Some stupid questions regarding ASM.
« Reply #14 on: October 05, 2013, 04:16:58 pm »
Quote
Huh, no, just no.

How about yes and no.

I looked into it a little. Popular C/C++ compilers do use assembly as an intermediate step, and C and C++ are very popular languages (but not the only high level languages). I don't really see any indication that most other languages do this (although I didn't do too much digging). Of course, so many of the popular languages we use are JIT-compiled and/or use an intermediate form other than assembly (byte code, IL, p-code, etc.).

BRPXQZME

Re: Some stupid questions regarding ASM.
« Reply #15 on: October 05, 2013, 05:26:13 pm »
There’s also that exciting family of compilers based on LLVM.

henke37

Re: Some stupid questions regarding ASM.
« Reply #16 on: October 06, 2013, 04:55:02 am »
Being able to produce assembly as output != uses assembly internally.

Zoinkity

Re: Some stupid questions regarding ASM.
« Reply #17 on: October 06, 2013, 10:16:12 am »
Quote
I guess I was thinking there was a way to write something in a language such as Python, which is then used by another program to convert to ASM, and so forth.

You can always script a Python module to assemble ASM into a bytes/bytearray object.  Depending on the processor you're targeting you may need to look up ways to optimize the generated code.
To be honest, I usually just write directly to file with a hex editor. 


Also, there's a major difference between languages like C, C++, etc. and languages that run on a virtual machine, like Java and Python.  There's a level of virtualization; the difference is producing a program versus running code through a program.
(Granted your OS is also a virtual machine.)

You can also write in pure assembly.  I still break out Borland ASM every now and then.

STARWIN

Re: Some stupid questions regarding ASM.
« Reply #18 on: October 06, 2013, 10:33:15 am »
I'd say the simplest thing to do is to throw the words "assembled" and "compiled" into a garbage bin. Just think of it as a translation from one encoding in memory ("language A") to another encoding in memory ("language B"). What is encoded is a computation (program) for some machine, real or imaginary.

I feel that going either less or more philosophical from this will complicate the situation..

It is good to note there is no unique "ASM" language for a processor. Machine code (usually expressed as hex numbers) is (hopefully) unique to a given processor. "ASM" is most importantly defined by the program that translates "ASM" into machine code (an assembler). They just like to call it ASM when it has an "almost" 1-to-1 mapping between machine language opcodes and ASM commands (or whatever they want to be called). The intention being that ASM for processor X is "human-readable" machine code of processor X. It is of course convenient if many people use the same ASM language for processor X.
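The "almost 1-to-1 mapping" also works in the other direction. As a rough illustration, this Python sketch turns the 29 bytes from puzzledude's example back into readable text. The table covers only the opcodes that appear there, the output syntax is invented (exactly the point that each tool defines its own "ASM"), and the 3-byte CMP assumes the accumulator is in 16-bit mode.

```python
# opcode -> (text template, number of operand bytes)
TABLE = {
    0xC9: ("CMP #${:04X}", 2),   # immediate, assuming a 16-bit accumulator
    0xF0: ("BEQ +${:02X}", 1),   # relative branch, shown as a raw offset
    0xBF: ("LDA ${:06X},x", 3),
    0x8F: ("STA ${:06X}", 3),
    0xA5: ("LDA ${:02X}", 1),
    0x9F: ("STA ${:06X},x", 3),
    0xAF: ("LDA ${:06X}", 3),
    0x6B: ("RTL", 0),
}


def disassemble(code: bytes):
    """Translate machine code back into one text line per instruction."""
    lines = []
    i = 0
    while i < len(code):
        template, size = TABLE[code[i]]
        operand = int.from_bytes(code[i + 1:i + 1 + size], "little")
        lines.append(template.format(operand))
        i += 1 + size
    return lines


routine = bytes.fromhex(
    "C9 27 27 F0 0F BF 00 20 7F 8F 00 50 7F A5 00 "
    "9F 00 20 7F 6B AF 00 50 7F 9F 00 20 7F 6B"
)
for line in disassemble(routine):
    print(line)
```

A different table with different templates would print different "ASM" for exactly the same bytes, while the machine code stays the same.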

You can also write in pure machine code. (..or in the hex shown in your hex editor that has 1:1 mapping to whatever format it really is in..)

Bregalad

Re: Some stupid questions regarding ASM.
« Reply #19 on: October 16, 2013, 09:07:54 am »
In ARM there are many ways to write the same instruction:

add R1, R2, R3, lsl #0
add R1, R2, R3

or
add R1, R1, #-1
add R1, #-1
sub R1, R1, #1
etc...
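To check that these spellings really are the same instruction, you can build the 32-bit words by hand. The Python sketch below assumes the classic 32-bit ARM (A32) data-processing encoding with the "always" condition and the S bit clear; the post does not say which ARM variant or assembler is meant. The helper names are invented, and `add_imm` only imitates what an assembler that accepts a negative immediate would do, namely emit a SUB.

```python
ADD, SUB = 0b0100, 0b0010   # data-processing opcodes
LSL = 0b00                  # shift type
AL = 0b1110                 # condition: always


def dp_reg(opcode, rd, rn, rm, shift_type=LSL, shift_amount=0):
    """Encode 'op Rd, Rn, Rm, <shift> #amount' (register operand)."""
    return ((AL << 28) | (opcode << 21) | (rn << 16) | (rd << 12)
            | (shift_amount << 7) | (shift_type << 5) | rm)


def dp_imm(opcode, rd, rn, imm8):
    """Encode 'op Rd, Rn, #imm8' (unrotated 8-bit immediates only)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("this sketch only handles immediates 0..255")
    return (AL << 28) | (1 << 25) | (opcode << 21) | (rn << 16) | (rd << 12) | imm8


def add_imm(rd, rn, value):
    """Imitate an assembler that rewrites ADD with a negative value as SUB."""
    if value < 0:
        return dp_imm(SUB, rd, rn, -value)
    return dp_imm(ADD, rd, rn, value)


plain = dp_reg(ADD, 1, 2, 3)                    # add R1, R2, R3
shifted = dp_reg(ADD, 1, 2, 3, LSL, 0)          # add R1, R2, R3, lsl #0
print(f"{plain:08X} {shifted:08X}")
print(f"{add_imm(1, 1, -1):08X} {dp_imm(SUB, 1, 1, 1):08X}")
```

Both pairs come out as identical words, so the different spellings only exist in the ASM text, not in the machine code.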