

GBA ASM - MegaMan Zero Lives System

Started by PowerPanda, August 07, 2022, 05:31:56 PM



I am working on a revised version of MegaMan Zero 1 that operates as a bit of a halfway point between the tough-as-nails original and the child-friendly casual mode from the legacy collections. The goal is to make the game play a bit more like Zero 2, where it's hard, but fair. I've made changes to both weapon levelups and cyber elf levelups, but I'm stuck on making a common-sense lives system.

For those who have forgotten how ridiculous the lives system is in Zero 1, losing all of your lives on a mission will cause you to fail that mission, removing it permanently from the game. You will start back at HQ with 0 lives. So you go into your next mission, fall into one of the numerous instant-death traps that is there to trip up new players, fail that mission and lose it permanently, then start back at HQ with 0 lives. And repeat. You get 2 lives at the beginning of the game, and without finding 1-ups, THAT'S IT.

My goal is to refill the player's lives at the beginning of each mission, so that they always start with at least 2 lives. I've found the place within the code to do it, and I can explain the steps, but I'm really struggling with GBA ASM code. I don't think I ever appreciated how different it was from SNES. When I learned SNES ASM, the way that it finally "clicked" with me was that I described to Gray Shadows what I wanted to do, and then watched how he accomplished that. From there, I went on to develop some fairly extensive ASM hacks. I'm wondering if someone here might be willing to help me with GBA.

At the point in the code where it assigns the current mission to the player, I want to add in either a jump or a subroutine that checks to see if a player has fewer than 2 lives, and if so, sets their life counter to 2 lives.

Let's start at 080C112C where the current mission is written to RAM.
080C112C 6010 str r0,[r2]  ; writes current mission # to 202A574
<------ Insert jump here
(The following 2 lines may need to be relocated to make room for subroutine or jump)
080C112E 78E8 ldrb r0,[r5,3h]
080C1130 3001 adds r0,1h

There is a bank of free space right between 82CC420 & 82E0000, so let's set the address to 82CC420.

no$gba tells me that the following registers are free at this moment: r1, r7, r8, r9, r10, r11. I will use rA and rB as variables so that you can choose whichever register you want to work with. None of the remaining registers contain any of the values that I want.

The code should follow this logic:
1. Load Register: Life Counter RAM address (0202A5D0) to rA
2. Load Register: Actual Value from rA to rB (will be a value between 00-09)
3. Compare: rB to 1
4. If higher, branch to step 7
5. Write a fixed value of 2 into rB
6. Write the value in rB to the RAM Address stored in rA
7. Zero out rA and rB, just to be safe
8. (relocated code to make room for the jump)
9. Jump back to 080C113X
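
Before translating to Thumb, the numbered steps above can be modelled in a few lines of Python. This is only a sketch of the logic: `ram` is a hypothetical dict standing in for GBA memory, while the address and the floor of 2 lives come from the post.

```python
LIFE_COUNTER = 0x0202A5D0  # life counter address given in the post
MIN_LIVES = 2              # lives every mission should start with


def refill_lives(ram):
    """Model of steps 1-6: top the life counter up to MIN_LIVES."""
    lives = ram[LIFE_COUNTER]          # steps 1-2: load the current value
    if lives <= 1:                     # steps 3-4: skip when already above 1
        ram[LIFE_COUNTER] = MIN_LIVES  # steps 5-6: write back a fixed 2
    return ram[LIFE_COUNTER]


# Hypothetical RAM contents: a player who arrives with 0 lives.
ram = {LIFE_COUNTER: 0}
print(refill_lives(ram))
```

Steps 7-9 (clearing registers, the relocated code, and the jump back) have no equivalent here because they only exist to fit the patch into the original routine.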

Pre-emptive thank you to anyone who is willing to look at this. Also, if you're able to provide both the ASM and the corresponding Hex, that will allow me to better understand what's going on.


Here we're dealing with Thumb assembly, and one of the tricks is you'll need a constant pool.

First, the jump.  A single B can only move about +/- 2048 bytes.  That's not nearly far enough to make it all the way to 0x082CC420, so you can use a BL (which overwrites LR aka R14), which can reach about +/- 4 MB.  However, you'll need to make sure the prolog above 0x080C112C saves R14 (it'd be part of the push), since you didn't note it as free.

For your interest, BL is actually two 16-bit instructions.  The first one already overwrites LR with half your destination address, then the second one combines that with the other half and swaps with PC to jump.
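
To make the reach argument concrete, here is a rough Python check of which branch form can cover a given jump. The helper name `thumb_branch_reach` is made up for illustration; the ranges are the signed 11-bit (B) and 22-bit (BL) halfword offsets, measured from the instruction address + 4.

```python
def thumb_branch_reach(src, dst):
    """Return the byte offset and which Thumb branch forms can cover it."""
    offset = dst - (src + 4)  # Thumb reads PC as the instruction address + 4
    forms = []
    if -2048 <= offset <= 2046:          # B: signed 11-bit halfword offset
        forms.append("b")
    if -0x400000 <= offset <= 0x3FFFFE:  # BL: signed 22-bit halfword offset
        forms.append("bl")
    return offset, forms


offset, forms = thumb_branch_reach(0x080C112E, 0x082CC420)
print(hex(offset), forms)
```

For the jump in this thread the offset is a little over 2 MB, so only the BL form qualifies.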

After that, it's common in Thumb to use "literal pools."  These are literal bytes representing constants (like addresses) and are stored between functions.  You run an instruction to read it from the ROM to set the constant to that value.  A tool like armips can help you automatically write out constant pools.

So step 1 (which consumes 6 bytes) would use a PC-relative ldr, usually written like this:

ldr rA, =0x0202A5D0
; .pool (you'd do this at the end.)

Technically, it looks more like this without the prettiness:

ldr rA, [pc, 32] ; Assuming the db is 32 bytes later.

db 0xD0, 0xA5, 0x02, 0x02
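
If you want to see where the "32" and the four db bytes come from, this Python sketch works them out by hand. It assumes the hook starts at 0x082CC420, that rA is r1, and that the literal ends up at 0x082CC444; those are placeholders, and armips would pick the real pool location for you.

```python
import struct


def ldr_pc_relative(rd, instr_addr, literal_addr):
    """Encode Thumb `ldr rd, [pc, #imm]` for a literal at literal_addr."""
    base = (instr_addr + 4) & ~3  # PC reads as +4 and is word-aligned for this form
    offset = literal_addr - base
    if not 0 <= rd <= 7:
        raise ValueError("this form only takes low registers r0-r7")
    if offset < 0 or offset > 1020 or offset % 4:
        raise ValueError("literal must be word-aligned, 0..1020 bytes ahead")
    return 0x4800 | (rd << 8) | (offset // 4)


# Assumed layout: hook at 0x082CC420, rA = r1, literal 32 bytes past the base.
opcode = ldr_pc_relative(1, 0x082CC420, 0x082CC444)
literal = struct.pack("<I", 0x0202A5D0)  # little-endian, as stored in ROM

print(hex(opcode), literal.hex(" "))
```

The literal comes out byte-reversed relative to how the address is written, which is why the db line reads D0 A5 02 02.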

The rest are simpler:

; Load lives from life counter in rA.
ldrb rB, [rA, 0]
cmp rB, 1

; More than 1 life already, we're good.
bgt @@done

; Okay, need to overwrite with 2.
mov rB, 2
strb rB, [rA, 0]


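At this point you've got your basic logic.  Note that these Thumb forms only accept low registers (r0-r7), so of the registers you listed as free, only r1 and r7 can stand in for rA and rB.  Clearing is more or less straightforward, and it's also where the @@done label from the branch above belongs:

@@done:
mov rA, 0
mov rB, 0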

; Displaced this code.
ldrb r0, [r5, 3]
add r0, 1
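
Since you asked for the hex as well, here is a hand-rolled Python sketch that encodes just the instruction forms used above. It assumes rA = r1 and rB = r7, and the helper names are made up for illustration. The last two values can be checked against the 78E8 and 3001 shown in your original listing.

```python
def imm8_op(op, rd, imm):
    """mov/cmp/add/sub rd, #imm8 (op: 0=mov, 1=cmp, 2=add, 3=sub)."""
    if not (0 <= rd <= 7 and 0 <= imm <= 255):
        raise ValueError("needs a low register and an 8-bit immediate")
    return 0x2000 | (op << 11) | (rd << 8) | imm


def byte_mem(load, rd, rb, imm):
    """ldrb/strb rd, [rb, #imm5]."""
    if not (0 <= rd <= 7 and 0 <= rb <= 7 and 0 <= imm <= 31):
        raise ValueError("needs low registers and a 5-bit offset")
    return (0x7800 if load else 0x7000) | (imm << 6) | (rb << 3) | rd


def bgt(instructions_to_skip):
    """bgt over the given number of following 16-bit instructions."""
    # The offset counts halfwords from PC (= branch address + 4),
    # so skipping N instructions after the branch encodes N - 1.
    return 0xDC00 | ((instructions_to_skip - 1) & 0xFF)


RA, RB = 1, 7  # assumed choice: the two free low registers

body = [
    byte_mem(True, RB, RA, 0),   # ldrb r7, [r1, 0]
    imm8_op(1, RB, 1),           # cmp  r7, 1
    bgt(2),                      # bgt  @@done (skips the next two)
    imm8_op(0, RB, 2),           # mov  r7, 2
    byte_mem(False, RB, RA, 0),  # strb r7, [r1, 0]
    imm8_op(0, RA, 0),           # @@done: mov r1, 0
    imm8_op(0, RB, 0),           # mov  r7, 0
    byte_mem(True, 0, 5, 3),     # ldrb r0, [r5, 3]  (displaced code)
    imm8_op(2, 0, 1),            # add  r0, 1        (displaced code)
]
print(" ".join(f"{op:04X}" for op in body))
```

These are halfword values, the way the no$gba listing shows them. In a hex editor each pair is byte-swapped, so 780F appears as 0F 78.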

Going back depends on if R14 was saved, or if maybe you saved it in another reg (maybe displacing a third instruction.)  Let's assume LR/R14 was free for the taking, you could just BL again and not care about LR (remember this is two extra bytes):

bl 0x080C1132

After this is where you'd probably put your .pool statement.

You could also load 0x080C1132 as a constant from the literal pool, and mov pc, rA or similar.  That would also jump and wouldn't hurt LR.  So there are several options.

When I wrote the Tomato Adventure asm patches, I added comments.  It might help to read through a function or two to get more familiar with Thumb:



Thank you. I'll be digging into that hopefully tomorrow night!


If you are going for the more "get it done" approach then there are ways of pretty trivially hardpatching GBA cheats ( is probably the current hotness, that or go with the original GBAATM, skip GABSharky and older stuff there. if you needed a primer on GBA cheats, for the sake of the forum searchers then for a guide to making cheats and in this case is for the GBA).
The GBA is modern enough to have fancy IF ELSE arrangements and other such things in its cheats (see the enhacklopedia link). Find some memory value that reliably indicates you are back in the mission select and you could set lives to 2 (or whatever you think works for your purposes).
Not likely to be particularly elegant but also not going to be a complete mess either.


[Unknown], thank you for that detailed writeup. The thing that I could not make sense of when I looked at it before writing this post is unfortunately the part that I'm not following in your explanation either. I can figure out how to do all of the RAM checks and register manipulation, but I can't figure out how to do a bl. I'm seeing things in no$gba that list the hex as "F000FC2A", and that translates from hex to ASM as "Lxx,80C198Ah". The two don't seem to be remotely related to each other. What am I missing?

The r14 register is in use at this point in the code. It is "080C7C2B".


BL is typically represented as a "32 bit" instruction (in assemblers, disassemblers, etc.)  In reality, it's essentially two separate instructions that it's only sensible to pair with each other.

Any 16-bit instruction word with the top 4 bits set (i.e. 0xFxxx) is part of a BL.  The next lower bit determines if it's the upper destination bits (not set) or the lower destination bits (set.)  The final 11 bits are the partial *relative offset* to the PC.

So let's look at a practical case.  This is largely from memory, so apologies for any minor inaccuracy.

We want to encode this instruction:

0x080C112E ???? ????: bl 0x082CC420

First, we must calculate the relative address.  Importantly, this is the distance from PC+4, because of how the pipelining works on the CPU.  That means:

currentPC = 0x080C1132
destination = 0x082CC420
distance = destination - currentPC = 0x0020B2EE

Now armed with the relative offset we want to encode, it's time to split it into the two instructions.  Since we're encoding, let's be more specific about the function of these two instructions:

// (encoding & 0xF800) == 0xF000
void BL0() {
  // These are the lower 11 bits of the encoded instruction word.
  uint16_t offset = encoding & 0x07FF;
  // The upper bit here is the sign, so shift to two's complement 32-bit first.
  // This simply copies the top bit of the 11-bit field to all higher bits.
  int32_t signedOffset = (int32_t)((uint32_t)offset << 21) >> 21;

  // This is the upper half of the PC offset, but it's in instruction words so we also double it.
  LR = PC + (signedOffset << 11) * 2;
}

// (encoding & 0xF800) == 0xF800
void BL1() {
  // These are the lower 11 bits of the encoded instruction word.
  uint16_t offset = encoding & 0x07FF;
  // This is just added without sign extension.  Again, double for instruction words to bytes.
  uint32_t destination = LR + offset * 2;
  // Now we have to set PC to the next instruction, but PC again is already + 4.
  uint32_t returnAddress = PC - 2;

  // The actual LR register has the low bit set to indicate Thumb mode (for BX.)
  LR = returnAddress | 1;
  // Now we jump.
  PC = destination;
}

Basically, LR is used as a temporary to store the immediate bits, and then it combines everything and jumps in the second instruction.
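
The same two steps can be run in Python against the "F000FC2A" from your question. One assumption to flag: the thread never says where that BL lives, so 0x080C1132 below is inferred by working backwards from the 80C198A target that no$gba displayed.

```python
def decode_bl(addr, first, second):
    """Return (target, link) for a Thumb BL pair located at addr."""
    if first & 0xF800 != 0xF000 or second & 0xF800 != 0xF800:
        raise ValueError("not a BL pair")
    high = first & 0x07FF
    if high & 0x0400:               # sign-extend the 11-bit upper half
        high -= 0x0800
    lr = (addr + 4) + (high << 12)  # first half: LR = PC + (high << 12)
    target = (lr + ((second & 0x07FF) << 1)) & 0xFFFFFFFF
    link = (addr + 4) | 1           # return address with the Thumb bit set
    return target, link


# "F000FC2A" from the question; 0x080C1132 is an inferred location.
target, link = decode_bl(0x080C1132, 0xF000, 0xFC2A)
print(hex(target), hex(link))
```

So the hex and the disassembly are related after all: F000 carries an upper half of zero, and FC2A carries 0x42A halfwords (0x854 bytes) forward from the instruction address + 4.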

Our distance in bytes is 0x0020B2EE, but we want to see that in instruction words (16-bit Thumb), so we halve it:

0x0020B2EE / 2 = 0x00105977

Also, we need to fit this into a signed 22-bit integer (so 21 bits for a positive number.)  It barely fits.

Splitting the halved offset (0x00105977) up, we want to add:

(offset >> 11) = 0x020B
-> Step 1: LR = PC + (0x020B << 12)
(offset & 0x07FF) = 0x0177
-> Step 2: PC = LR + (0x0177 << 1)

Okay, now we know what we want to encode.  We combine the instruction patterns and we get:

0xF000 | 0x020B = 0xF20B
0xF800 | 0x0177 = 0xF977

0x080C112E F20B F977: bl 0x082CC420
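
That combination can be cross-checked with a short Python sketch. `encode_bl` is just an illustrative helper name; it follows the BL0/BL1 description above and has nothing to do with how armips does it internally.

```python
def encode_bl(addr, target):
    """Return the two halfwords of a Thumb BL at addr that jumps to target."""
    offset = target - (addr + 4)  # relative to PC, which reads as addr + 4
    if offset % 2 or not -0x400000 <= offset <= 0x3FFFFE:
        raise ValueError("target out of BL range or misaligned")
    halfwords = (offset >> 1) & 0x3FFFFF     # 22-bit two's complement, in halfwords
    first = 0xF000 | (halfwords >> 11)       # upper 11 bits
    second = 0xF800 | (halfwords & 0x07FF)   # lower 11 bits
    return first, second


first, second = encode_bl(0x080C112E, 0x082CC420)
rom_bytes = first.to_bytes(2, "little") + second.to_bytes(2, "little")
print(f"{first:04X} {second:04X}", rom_bytes.hex(" "))
```

The halfword pair is what a debugger listing shows. The byte string is what actually sits in the ROM, since each halfword is stored little-endian.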


That said, I can only recommend armips for this to shortcut manual math and small errors you might have to spend a lot of time debugging:

Code (jump.asm):
.gba
.thumb
.open "original.gba","output.gba",0x08000000

.org 0x080C112E
bl 0x082CC420

.close


This will copy original.gba to output.gba, write four bytes to create the BL, and do all the math (without silly human mistakes) for you.
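
For a feel of what that amounts to at the file level, here is a rough Python equivalent of the four-byte write. It is not a replacement for armips: the BL bytes are hardcoded for this one address pair, and `fake_rom` is a zero-filled stand-in so the sketch runs without the real file.

```python
ROM_BASE = 0x08000000  # GBA cartridge ROM is mapped here


def patch_rom(rom, address, data):
    """Return a copy of rom with data written at a ROM-mapped address."""
    offset = address - ROM_BASE
    if offset < 0 or offset + len(data) > len(rom):
        raise ValueError("address outside the ROM image")
    patched = bytearray(rom)
    patched[offset:offset + len(data)] = data
    return bytes(patched)


# bl 0x082CC420 placed at 0x080C112E encodes to F20B F977 (little-endian bytes).
BL_BYTES = bytes.fromhex("0bf277f9")

# Stand-in for the real file contents so the sketch runs on its own.
fake_rom = bytes(0x00200000)
patched = patch_rom(fake_rom, 0x080C112E, BL_BYTES)
print(patched[0xC112E:0xC1132].hex(" "))
```

With the real ROM you would read original.gba in binary mode, pass its contents in place of `fake_rom`, and write the result to output.gba.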

That covers the encoding itself, but you mentioned the r14 (LR) is in use.  Time to talk ABIs.  You might already know this part, just going for clarity.

This part is all about conventions, but it's easy to validate that something uses a common convention since you'll see a clear pattern in the code.

The typical Thumb ABI on GBA is:
- r0-r3 are arguments, return values, etc.
- r4-r7 are easy-access callee-saved.
- r8-r11 are limited-access callee-saved.
- r12 might be callee-saved, but is usually an assembler temporary for trampolines.
- r13 is SP.
- r14 is LR.
- r15 is PC.

You'll commonly see functions with this signature:

push r4-r7,r14
mov r7,r8
push r7


pop r7
mov r8,r7
pop r4-r7
pop r0
bx r0

This is the signature of a typical function that intends to use r4-r8.  It must preserve those regs on the stack, which it takes care of right away.  In this example it also preserves r14 which is where it needs to eventually return to.

In this case, r14 is "free."  It can be used and abused, because the value in it is only for calling additional functions.

The other major case is a "leaf" function (one that does not itself call any other functions):

push r4-r5


pop r4-r5
bx r14

In this case, only r4-r5 are used (but it could use r4-r8 just as well.)  It doesn't intend to call any functions, so it's unnecessary to save r14 since it'll never be overwritten.  In this case, r14 is not free.

If you're in this second case, you have a few general options to accomplish your long branch.

1. Change the prolog and epilog to save LR.

Cons:
- this consumes 4 bytes more of stack space.
- the epilog may need to get longer by 2 bytes, which means you must copy another instruction.

Pros:
- simple and consistent with ABI model.
- if you're sure only Thumb code calls this, you can switch to "pop r4-r5,r15" which /saves/ 2 bytes.

Mainly use this one if you understand all cases in which the function is called, and know it always comes from a Thumb caller and lacks stack space limits.

You just encode "push r4-r5,r14" and then add a "pop r0; bx r0" at the end, or directly pop r15.

2. Use a temporary to save LR.

Cons:
- need a free reg you won't touch.
- requires 2 bytes before your BL, at least, so you must copy another instruction.
- a bit unclear to read.

Pros:
- no stack space required.

In this case you'd do something like:

; At the hook site: preserve LR, we won't touch r1.
mov r1, r14
bl 0x082CC420

; At the end of the new code: restore LR, then jump back.
mov r14,r1
ldr r1,=@@somelabel
mov pc,r1


If you prefer, you could also do the return using:
ldr r1,=@@somelabel | 1
bx r1


The bx instruction goes into (or stays in) Thumb mode based on the lowest bit.  In contrast, mov retains the current processor mode.  It kinda doesn't matter, because both consume 8 bytes (2 * 2 for instructions + 4 for constant.)

You could also just "bx r14", and consume 2 more bytes in the caller to "mov r14,r1".  This would be much clearer, and keep the responsibility of saving r14 in the caller (where it belongs), but you have to copy a second instruction.  So depends what you have space for.

Didn't want to go into detail about the other options, but you could use two regs and that would save more bytes overall (no need for a constant literal.)  There are many ways to do this.



[Unknown], I just want to say, thank you so much for your detailed explanations, and for your help. I conceptually understand it. However, it looks like I have some studying to do before I really understand how to write GBA ASM, so I'm back-shelving this project while I finish up an update for FF6 Divergent Paths, and do some work for ShadowOne333 on adding some long-desired features to A Link to the Past Redux.