Topic: SNES Scripting and Code Compression

darkmoon2321

SNES Scripting and Code Compression
« on: April 22, 2016, 09:56:14 am »
I'm currently working on a fork of bsnes-plus that outputs a text file with a full disassembly of all executed SNES code, as well as descriptions for game script.  I designed it to work with Romance of the Three Kingdoms II, but I'm trying to modify it to make it more game independent.  It works nearly flawlessly with Rotk2, and I'm testing on Secret of Evermore now, but I have some questions.

First off, I've heard talk of both scripting and compressed code.  Is there really a difference?  I would imagine that the compression procedures used for graphics and such would be pretty terrible at compressing code.  Logically, scripting would be the most effective way to reduce the size of code.  By scripting, I mean a series of sequential ROM values, each of which determines the destination of one of the indirect jmp/jsr opcodes.  It's kind of like reading a series of indexes into a function pointer array, in the SNES way.  As far as I can tell, there isn't really very clear terminology for discussing the subject.  Games that use scripting run a lot like emulators, where each "opcode" is part of a totally new instruction set.  I've been referring to them as "script commands" to differentiate, although I'd love to know if there is better existing terminology to use.
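
As a rough illustration of the "indexes into a function pointer array" idea, here is a minimal Python sketch of a table-driven interpreter. The three commands, their opcode values, and the handler names are all invented for the example and do not come from any real game.

```python
# Hypothetical three-command bytecode; opcode values and handler names are
# invented for illustration only.

def cmd_end(vm, operands):
    vm["running"] = False

def cmd_push(vm, operands):
    vm["stack"].append(operands[0])

def cmd_add(vm, operands):
    b = vm["stack"].pop()
    a = vm["stack"].pop()
    vm["stack"].append((a + b) & 0xFF)

# opcode -> (handler, number of operand bytes). This plays the role of the
# pointer table that an indirect JMP/JSR would index into.
DISPATCH = {
    0x00: (cmd_end, 0),
    0x01: (cmd_push, 1),
    0x02: (cmd_add, 0),
}

def run(script):
    """Interpret a bytes object as script commands and return the final stack."""
    vm = {"stack": [], "running": True}
    pc = 0
    while vm["running"]:
        handler, n_operands = DISPATCH[script[pc]]
        operands = script[pc + 1 : pc + 1 + n_operands]
        pc += 1 + n_operands
        handler(vm, operands)
    return vm["stack"]
```

Calling `run(bytes([0x01, 2, 0x01, 3, 0x02, 0x00]))` pushes 2 and 3, adds them, and stops. Each script byte only selects a handler, which is why a dense script can be far smaller than the equivalent native code.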

Second, does anybody know of any examples of games that use scripting OR otherwise compressed code?  I know Rotk2, SoE, and some more Koei games use scripting, but I'm sure there are more.  Is there any game that uses a compression procedure other than scripting for code?

STARWIN

Re: SNES Scripting and Code Compression
« Reply #1 on: April 22, 2016, 10:51:42 am »
Quote from darkmoon2321:
I'm currently working on a fork of bsnes-plus that outputs a text file with a full disassembly of all executed SNES code, as well as descriptions for game script.

Any screenshots of what exactly it looks like? I don't really get the descriptions part.

Quote from darkmoon2321:
First off, I've heard talk of both scripting and compressed code.  Is there really a difference?

In my imaginary world scripting means that you have an engine there within the game engine, interpreting data as commands. Code compression I would call the case where the game has asm code compressed in the ROM and during runtime, decompresses it to RAM and executes from there.

I'm pretty sure there are a massive number of games that interpret data as commands at some level.  I would be almost more curious about a complex game that didn't do this.

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #2 on: April 22, 2016, 11:22:42 am »
Here's a screenshot of what the output looks like for Rotk2:

http://imgur.com/BwXnViL

I agree that most games do this to some extent, but some do a lot more than others.  Rotk2 is almost impossible to understand without interpreting the script, as 90% or more of the "code" is actually script.  Also, I am able to track the ROM source of code and script executed from WRAM right now; as you can see, the SNES address of the code is actually in WRAM.  However, the data that is interpreted as commands is exactly as it appears in ROM, just copied to RAM first.  Do you know of any actual cases where the code needs to be decompressed first?  Since most code does not contain a lot of repetitive data, it seems like compression algorithms would be inefficient at reducing code size.
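
One way to picture tracking the ROM source of bytes that run from WRAM is a shadow map that remembers, for every WRAM address, which ROM address the byte was copied from. The sketch below is only an assumption about how such bookkeeping could look; the class name, method names, and addresses are hypothetical and are not taken from the actual tool.

```python
class OriginTracker:
    """Shadow map: WRAM address -> ROM address the byte was copied from."""

    def __init__(self):
        self.origin = {}

    def copy_block(self, rom_src, wram_dst, length):
        # A verbatim ROM -> WRAM copy keeps a one-to-one mapping.
        for i in range(length):
            self.origin[wram_dst + i] = rom_src + i

    def write_computed(self, wram_addr):
        # A value produced by code is no longer a verbatim ROM byte,
        # so its origin is forgotten.
        self.origin.pop(wram_addr, None)

    def rom_source(self, wram_addr):
        # Returns None when the byte cannot be traced back to ROM.
        return self.origin.get(wram_addr)
```

With this in place, a disassembly line for code at a WRAM address can also print `rom_source(address)`. The one-to-one mapping only holds for verbatim copies; it breaks down once the data is decompressed on the way to RAM.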

AWJ

Re: SNES Scripting and Code Compression
« Reply #3 on: April 22, 2016, 12:08:12 pm »
Software-interpreted programming languages that are in binary form (as opposed to human-readable text) are typically referred to as bytecode.

One of the reasons for using a bytecode scripting language in a video game is precisely that it can be much more compact than native machine language. Complex games like RPGs and war simulations almost inevitably contain scripting languages, sometimes multiple ones in the same game. For example, character creation, the battle system, and some other game logic in Romancing SaGa are written in a BASIC-like bytecode that I've written a decompiler for. There's also a completely separate scripting language used for controlling story events, which I haven't looked at but which the person who translated the game into English wrote some tools for (like all the SaGa games, all the text in the game is directly embedded in the bytecode, so translating the game was impossible without deciphering the scripting language).

Super Robot Wars 4 (and probably the rest of the SNES SRW games) contains a scripting language the implementation of which appears to comprise more than half of all the 65816 machine code in the ROM, though I haven't looked at it in great depth yet.

Scripting languages are very game-specific; it's probably a better idea to write an "offline" decompiler and recompiler than to try to build support for script editing into an emulator. You'd end up needing to write a new version of the emulator for every game... Though that may be my background as an aging UNIX nerd talking; most people in the ROMhacking community seem to really love the "emulator as do-everything IDE" approach.

It looks like that RoTK2 "disassembly" is actually bytecode interspersed with native 65816 instructions (I see a "JSR $23E6" in the middle of it). Is that really what it is or am I misinterpreting the output? Reminds me of Final Fantasy Legend 3 if it is.

ETA: Re your question about compressed machine code, Tokimeki Memorial has its sound driver (i.e. the program that runs on the SPC700) compressed. The compression is the same hybrid RLE/LZ77-ish codec that's used for all the game's graphics. The compression probably isn't very efficient for code but I guess it was good enough, since SPC700 code has to be slowly copied into APU RAM anyway.
« Last Edit: April 22, 2016, 12:17:19 pm by AWJ »
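
To make "hybrid RLE/LZ77-ish" concrete, here is a minimal decompressor for a made-up format. The control-byte layout below is an assumption chosen purely for illustration and is not the codec Tokimeki Memorial actually uses.

```python
def decompress(data):
    """Decode a hypothetical RLE + LZ77 hybrid stream.

    Control byte c (layout invented for this sketch):
      c < 0x80         literal run: copy the next c + 1 bytes
      0x80 <= c < 0xC0 RLE: repeat the next byte (c & 0x3F) + 2 times
      c >= 0xC0        LZ: copy (c & 0x3F) + 3 bytes from `distance` bytes back,
                       where distance is the next byte (assumed to be >= 1)
    """
    out = bytearray()
    i = 0
    while i < len(data):
        c = data[i]
        i += 1
        if c < 0x80:
            n = c + 1
            out += data[i : i + n]
            i += n
        elif c < 0xC0:
            out += bytes([data[i]]) * ((c & 0x3F) + 2)
            i += 1
        else:
            distance = data[i]
            i += 1
            # Copy byte by byte so overlapping matches behave like LZ77.
            for _ in range((c & 0x3F) + 3):
                out.append(out[-distance])
    return bytes(out)
```

Machine code tends to have fewer long repeats than tile graphics, so the RLE and LZ branches fire less often and more of the stream ends up as literal runs. That fits the remark that this kind of codec is not very efficient for code.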

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #4 on: April 22, 2016, 12:56:43 pm »
Thanks for the info!  Yes, you are correct about there being native 65c816 as well.  It switches back and forth between bytecode and assembly, but mostly it is bytecode.

Initially I started writing an offline tool for this, but because of the way code blocks get moved to WRAM, the offline interpreter would get stuck.  The emulator version has no such problems.  Also, I have no plans to turn this into a script editor built into the emulator.  The goal is basically to make bytecode easier to read by adding descriptions.  With the direction I'm taking now, the plan is to use a "key" file to interpret different games.  The program will automatically generate one with generic descriptions like:

00:"Script command 00"

The user can then go back and edit the descriptions, and the output will be updated the next time the emu gets run.  So all you will need is a key file for different games.  I think this is within reach.  Even with only generic descriptions it would be useful, as you could tell the length of the commands and which commands are valid.
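
A key file in the `00:"Script command 00"` form shown above could be generated and read back with something like the following sketch. Only that single line format comes from the post; the function names, the two-hex-digit opcode assumption, and the choice to skip malformed lines are illustrative guesses.

```python
import re

# One entry per line: two hex digits, a colon, then a quoted description.
LINE_RE = re.compile(r'^([0-9A-Fa-f]{2}):"(.*)"$')

def default_key_lines(opcodes=range(256)):
    """Generate generic descriptions for every single-byte command value."""
    return [f'{op:02X}:"Script command {op:02X}"' for op in opcodes]

def parse_key_lines(lines):
    """Build {opcode: description}; later lines override earlier ones."""
    table = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not a key line
            table[int(match.group(1), 16)] = match.group(2)
    return table
```

A user-edited line such as `E9:"Call function"` would then replace the generic text for that command the next time the file is loaded.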

STARWIN

Re: SNES Scripting and Code Compression
« Reply #5 on: April 22, 2016, 01:09:36 pm »
In your example I see two different interpretations for 0A though. Part of an earlier command?

In a generic case, how do you know which bytes are bytecode though? I assume you need more than a key file with descriptions?

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #6 on: April 22, 2016, 01:31:41 pm »
Yes, the second version of 0A is an addendum to bytecode command E9.  E9 acts like a function call, and is followed by a stack adjustment to make up for function arguments that were pushed previously.
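
The point that the same byte value can mean two things depending on context can be sketched as a small decoder. Only "E9 is a call that is followed by a stack adjustment" comes from the post. The two-byte call operand and the treatment of every other command as a single byte are assumptions made to keep the example short.

```python
CALL = 0xE9  # described above as a function-call command

def decode(script, call_operand_bytes=2):
    """Return (offset, text) pairs for a bytes object of script.

    call_operand_bytes is an assumed operand size, not a documented one.
    """
    out = []
    pc = 0
    while pc < len(script):
        op = script[pc]
        if op == CALL:
            target = script[pc + 1 : pc + 1 + call_operand_bytes]
            # The byte after the call operand is consumed as the stack
            # adjustment, so it is never decoded as a command of its own.
            adjust = script[pc + 1 + call_operand_bytes]
            out.append((pc, f"CALL {target.hex().upper()}, stack adjust {adjust:02X}"))
            pc += 2 + call_operand_bytes
        else:
            out.append((pc, f"Script command {op:02X}"))
            pc += 1
    return out
```

For the input `bytes([0x0A, 0xE9, 0x34, 0x12, 0x0A])`, the first 0A is listed as its own command, while the second is folded into the call that precedes it.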

As for determining which bytes are bytecode, the first step is identifying the variable jmp/jsr opcodes 6C, 7C, DC, and FC.  The jump destination for each of these opcodes is determined by a RAM address and sometimes the X register.  I have a system for tracking ROM sources for all RAM addresses and the high and low bytes of the A, X, and Y registers.  If the source of the value is an opcode (e.g. STZ) or an operand (e.g. LDA #$1234), it is considered invalid for tracking purposes.  I only consider it valid when the source can be tracked to a non-assembly ROM source, like an indirect LDA, preferably a single-byte ROM source.  I'm not done with the code yet, but I'm sure I'll have to do some debugging with regards to false positives/false negatives.  That's why I wanted to get some examples of games that use scripting, so I could use them for testing purposes.
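
The tracking idea described above can be sketched as simple provenance bookkeeping. This is a simplified guess at the approach, not the tool's actual code: it tracks whole registers rather than separate high and low bytes, and every class and method name is hypothetical.

```python
# JMP (abs), JMP (abs,X), JML [abs], JSR (abs,X)
INDIRECT_JUMPS = {0x6C, 0x7C, 0xDC, 0xFC}

class Tracker:
    """Remember which ROM data byte each register / RAM value came from."""

    def __init__(self):
        self.reg_src = {"A": None, "X": None, "Y": None}
        self.ram_src = {}          # RAM address -> ROM data address or None
        self.script_bytes = set()  # ROM addresses flagged as likely bytecode

    def load_immediate(self, reg):
        # e.g. LDA #$1234: the value is part of the code, so it is not tracked.
        self.reg_src[reg] = None

    def load_from_rom(self, reg, rom_addr):
        # e.g. an indirect LDA that reads a data byte from ROM.
        self.reg_src[reg] = rom_addr

    def transfer(self, src, dst):
        # e.g. TAX: the provenance follows the value.
        self.reg_src[dst] = self.reg_src[src]

    def store(self, reg, ram_addr):
        self.ram_src[ram_addr] = self.reg_src[reg]

    def store_zero(self, ram_addr):
        # STZ: the value comes from the opcode itself, so it is not tracked.
        self.ram_src[ram_addr] = None

    def indirect_jump(self, opcode, pointer_addr):
        """Flag and return the ROM data bytes that steered this jump."""
        if opcode not in INDIRECT_JUMPS:
            raise ValueError("not an indirect jump opcode")
        sources = [self.ram_src.get(pointer_addr)]
        if opcode in (0x7C, 0xFC):  # X-indexed forms also depend on X
            sources.append(self.reg_src["X"])
        found = [s for s in sources if s is not None]
        self.script_bytes.update(found)
        return found
```

A jump whose pointer or index traces back to a ROM data byte flags that byte as a script command. A jump steered only by immediates or STZ flags nothing, which is the valid/invalid distinction drawn above.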

Gideon Zhi

Re: SNES Scripting and Code Compression
« Reply #7 on: April 22, 2016, 01:35:07 pm »
Quote from darkmoon2321:
Second, does anybody know of any examples of games that use scripting OR otherwise compressed code?  I know Rotk2, SoE, and some more Koei games use scripting, but I'm sure there are more.  Is there any game that uses a compression procedure other than scripting for code?

Ys V is full of compressed code. The "scripting" if you will for each zone is pure 65816, and each zone's data gets decompressed to RAM. This leads to 65816 instructions being executed directly from the 7E RAM bank. Text loading and even a few routines that draw text to the screen are all contained within these compressed blocks; this is the primary reason Ys V was such a pain in the ass to hack.

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #8 on: April 28, 2016, 03:09:08 pm »
I managed to get a little further on the code for identifying script in any SNES game.  As expected, there are some false positives, but it's actually not all that bad.  A little bit of research should be enough to tell if code is correctly labeled as script or not.  The user is going to have to do some research anyway in order to come up with good descriptions for the bytecode.  I haven't gotten any of the key file stuff set up yet; that's my next task.

I found a post in a different topic about some games using RTS/RTL as indirect jumps for bytecode, so I added in the ability to track the sources for those as well.  I tried the program on Romancing SaGa, and it looks like it caught a fair amount of bytecode.  Is there any documentation on the script for Romancing SaGa? I know there are tools for it, and AWJ mentioned a decompiler.  I'm looking to compare my output with known scripting.  I found the game translation, but not the tools used to create it.
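
For anyone unfamiliar with the RTS trick: RTS pulls a 16-bit address off the stack and resumes at that address plus one, so a dispatcher can push "handler address minus one" and execute RTS instead of an indirect jump. The Python below only models that arithmetic. The table contents are invented, and RTL (which also pulls a bank byte) is left out.

```python
def push_word(stack, value):
    # High byte is pushed first so that the low byte is pulled first.
    stack.append((value >> 8) & 0xFF)
    stack.append(value & 0xFF)

def rts(stack):
    """Model RTS: pull low byte, pull high byte, continue at address + 1."""
    lo = stack.pop()
    hi = stack.pop()
    return (((hi << 8) | lo) + 1) & 0xFFFF

def dispatch_via_rts(table, opcode):
    """Return the address execution would continue at.

    `table` maps opcode -> (handler address - 1), as such tables usually do.
    """
    stack = []
    push_word(stack, table[opcode])
    return rts(stack)
```

For tracking purposes, the pushed bytes play the same role as the pointer of an indirect jmp, so their sources can be followed in the same way.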

As for Ys V, I'm not set up to handle actual code compression right now.  It might be something I can work into a key file later though.  I can track bytes in RAM to their ROM sources, but it's still going to look messed up unless you dump the decompressed code to a separate file first.  I think I can work some pretty elaborate things into a key file though, so maybe in the future.

elmer

Re: SNES Scripting and Code Compression
« Reply #9 on: April 28, 2016, 04:06:50 pm »
Quote from Gideon Zhi:
Ys V is full of compressed code. The "scripting" if you will for each zone is pure 65816, and each zone's data gets decompressed to RAM. This leads to 65816 instructions being executed directly from the 7E RAM bank. Text loading and even a few routines that draw text to the screen are all contained within these compressed blocks; this is the primary reason Ys V was such a pain in the ass to hack.

Sounds similar to Legend of Xanadu 1&2 on the PC Engine, which I guess makes sense since Falcom wrote them at roughly the same time.

I'm curious about the "pure 65816" though ... Legend of Xanadu's decompressed "script" chunks contain both assembly code and bytecode (which can call assembly code).

binkers87

Re: SNES Scripting and Code Compression
« Reply #10 on: August 30, 2016, 11:53:33 am »
How is the progress on your tool? I am excited to try it. I've been looking for something like this.

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #11 on: August 30, 2016, 02:23:21 pm »
It's coming along, though I went a different route regarding the key files.  Instead, I'm using plugins that I make for individual games.  You copy and paste the dll plugins to the same folder as the game before running.  I've got two plugins written at the moment, one for Romance of the Three Kingdoms II and one for Secret of Evermore.  The one for SoE has descriptions for EVENT and SUB-EVENT scripting, but I haven't done the animation scripts yet.  On the plus side, the SoE one automatically parses EVENT scripting when you first start the game, so you don't actually have to play through for more than a few seconds to get all the script output.  So unlike Rotk2, SoE is actually pretty well suited to an offline editor.

Source code for the project is here:
https://github.com/darkmoon2321/bsnes-plus-deScriptor

It won't work on Linux in its current state, as I've only been using Windows for my testing.  It's still a little rough overall, though functional enough to get useful output.  It doesn't handle save states very well at the moment, though; I need to look into fixing that.

Example output from the SoE plugin:
http://imgur.com/a/418i4

August 30, 2016, 02:41:41 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
While I only have two plugins at the moment, I should be able to transform the Rotk2 one to handle several Koei games in the near future.  Koei used the same scripting language for several of its earlier SNES titles, and also for its NES games.  For example, there is both an NES and SNES version of Rotk2, and the scripting language is identical.  You can actually copy script from one to the other and it will still have its intended effect.  My tool is based on bsnes-plus though, so getting output from an NES game would require somebody repeating my work on FCEUX or another NES emu.
« Last Edit: August 30, 2016, 02:41:41 pm by darkmoon2321 »

AWJ

Re: SNES Scripting and Code Compression
« Reply #12 on: June 16, 2017, 09:05:19 am »
Necrobumping this topic to link my own research into the bytecode found in Koei NES and early SNES games. Executive summary: the bytecode is almost certainly the output of a C compiler.

KingMike

Re: SNES Scripting and Code Compression
« Reply #13 on: June 16, 2017, 10:20:29 am »
I do remember Square mentioning the tool they wrote (I can't remember if it was a script editor or assembler) in a Nintendo Power preview article of SoE.

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #14 on: June 16, 2017, 12:53:28 pm »
Quote from AWJ:
Necrobumping this topic to link my own research into the bytecode found in Koei NES and early SNES games. Executive summary: the bytecode is almost certainly the output of a C compiler.

I would have to agree.  The way they handle stack data and local variables for the functions reminds me very much of C.

Quote from KingMike:
I do remember Square mentioning the tool they wrote (I can't remember if it was a script editor or assembler) in a Nintendo Power preview article of SoE.

I believe it was called SIGIL.  The lead programmer, Brian Fehdrau, has given a number of interviews discussing it.  He also talked about SAGE (Square's Amazing Graphical Editor).  At some point in the near future I'm planning on getting back to hacking Evermore.  I started writing an editor a while back that was capable of importing the scripting from the game, but I got sidetracked.  I'm currently working on the Deadpool (Ninja Gaiden) hack, but it's getting close to completion.  I really love Evermore, but it's a real challenge to work with for hacking purposes.  I'm hoping I can come up with a tool to make things significantly easier to edit.

goldenband

Re: SNES Scripting and Code Compression
« Reply #15 on: June 18, 2017, 11:11:44 am »
How close are we to having a tool that can dump the script of any given Koei game (or a subset, i.e. the 16-bit games), if such a thing is even possible? In particular, how much do the 16-bit ports build on what we know from their 8-bit cousins? EDIT: OK, the NESdev thread covers a lot of that, my bad.

I've wanted for some time to look into transplanting the English script from the SNES version of Lord of Darkness to the Mega Drive port.

darkmoon2321

Re: SNES Scripting and Code Compression
« Reply #16 on: June 18, 2017, 01:26:36 pm »
The project I linked above was capable of outputting the script and assembly for Romance of the Three Kingdoms II, and Gemfire for the NES.  Originally it only worked for ROTK2, but I made the plugin more generic so it would work for both.  It's SNES only though, so it can't be used for the NES titles.  Unfortunately, I had a heck of a time trying to get it to work completely offline, and in the end I tied the tool to bsnes-plus, so it would record the script as it was executed.

If you would like to see the instructions for the script you can check out:
https://www.dropbox.com/s/889nbq80ajr2ytj/ROTK2_script_wLabels.txt?dl=0

Scroll down to 008000 to start seeing script.  Since the tool only works on script that it sees executed in the emulator, I had to piece parts of it together and it's still not complete, but there's a lot available.  My interpretation of the instruction set is a little more primitive than AWJ's: when I first wrote it, I didn't record the stack positions as local variables/arguments but instead just wrote down the relative position.  In some ways, though, that can be more useful to a hacker, since it's easier to see the addresses used if you want to make modifications.  I have analyzed all of the script commands, though.  There are a few that are unused, and don't appear to be very useful.

AWJ

Re: SNES Scripting and Code Compression
« Reply #17 on: June 20, 2017, 08:50:18 am »
Quote from goldenband:
I've wanted for some time to look into transplanting the English script from the SNES version of Lord of Darkness to the Mega Drive port.

I'd be very surprised if any of the Mega Drive versions use this bytecode. The most likely reason these games were compiled to bytecode on the NES is that the 6502 is almost uniquely ill-suited to C, mainly due to having only a 256-byte stack. That isn't a problem with the 68000 at all.
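
The 256-byte limit follows from the 6502's 8-bit stack pointer, which always addresses page $01. The toy model below (class name is hypothetical) shows what happens once more than 256 bytes are pushed: the pointer wraps and the oldest entries are overwritten with no warning.

```python
class Stack6502:
    """Toy model of the 6502 hardware stack at $0100-$01FF."""

    def __init__(self):
        self.page = bytearray(256)
        self.sp = 0xFF  # 8-bit stack pointer, grows downward

    def push(self, value):
        self.page[self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF  # wraps from $00 back to $FF
```

C-style code keeps return addresses, arguments, and locals on the stack for every nested call, so 256 bytes fills up quickly. A bytecode interpreter can instead keep a much larger software stack in ordinary RAM.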