Is there an easy way for hacking text in FC Chinese bootlegs?

Started by Chatterine, June 14, 2021, 09:18:33 AM


Chatterine

I started making a table, but the problem is that the game itself has lots of Chinese characters. Since they are stored in a 16-bit format, the 1st byte can be a0, b0, c0, d0, e0 or f0 (I think, but there could be other values as well), while the 2nd one goes from 00 to ff. This seems to apply not only to this game, but also to various other bootlegs.

Currently what I can do is take a piece of text I've already located, overwrite it with random values, then OCR the Simplified Chinese that shows up on screen. Repeating this approximately 65,000 times would give a full table, which is tedious and far from optimal. I haven't noticed a pattern in the system yet, and as far as I know this game uses a custom encoding, which makes relative searching for the characters almost impossible.
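
One way to cut down the number of passes is to fill the located string with consecutive codes instead of random ones, so a single screenshot covers a whole batch of glyphs. A minimal Python sketch, assuming each character is stored lead byte first and that the string's ROM offset is already known. The function names, the offset and the dummy ROM are made up for illustration, and some values may turn out to be control codes rather than glyphs:

```python
def sequential_codes(lead, start, count):
    """Return up to `count` two-byte codes sharing one lead byte, as raw bytes."""
    out = bytearray()
    # Stop at 0xFF so the second byte never overflows.
    for low in range(start, min(start + count, 0x100)):
        out += bytes([lead, low])
    return bytes(out)


def patch(rom, offset, payload):
    """Return a copy of `rom` with `payload` written at `offset`."""
    patched = bytearray(rom)
    patched[offset:offset + len(payload)] = payload
    return bytes(patched)


# Stand-in data: a real run would read the ROM file and use the real text offset.
rom = bytes(64)
patched = patch(rom, 0x10, sequential_codes(0xA0, 0x00, 8))
```

Each patched copy can then be loaded in an emulator, and the glyphs read off in order.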

A tbl file wouldn't solve every problem with hacking this thing, but it would at least help me locate text in the ROM more easily.
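
For reference, a partial table can be written out as soon as a few codes are confirmed. The sketch below assumes the common `HEX=character` line layout that most table-aware hex editors read. The two mappings are invented placeholders, not values from this game:

```python
def format_tbl(mapping):
    """Render {16-bit code: character} as HEX=char lines, sorted by code."""
    lines = []
    for code in sorted(mapping):
        lines.append(f"{code:04X}={mapping[code]}")
    return "\n".join(lines) + "\n"


# Placeholder entries: replace with codes verified in the emulator.
known = {0xA001: "我", 0xA000: "你"}
tbl_text = format_tbl(known)

# To save it (UTF-8 so the hanzi survive):
# with open("game.tbl", "w", encoding="utf-8") as f:
#     f.write(tbl_text)
```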

FAST6191

Finding text and entering enough characters to figure out what each value maps to is the main way most people set about it.
For custom stuff in Japanese you tend to find it has some pattern in it: most common characters first, order they appear in the script, variation of some known encoding, variation of some unofficial but otherwise popular grouping method, variation of some method they use to teach kids, or variation of some popular dictionary. I have no reason to suspect Chinese (or any other 16-bit encoding) would be any different. However, I don't know enough about Chinese in this regard to give any suggestions as far as existing encodings and methods for teaching kids or otherwise categorising.

Beyond that you get to play assembly editor. When the game handles the text and in turn fetches the relevant symbol, it will have some kind of logic. Hopefully it is not nested IF commands, though I have been surprised in the past; more likely it is "if value in this range, take value, use it to generate the fetch command for the relevant tile, fetch tile into memory". In some cases you might even get a lookup table of sorts.
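
As a rough illustration of that "if value in this range" style of logic, here is a Python sketch of what such a routine might compute. Everything in it is hypothetical: the 0xA0 threshold, the 256 glyphs per lead byte and the function name are assumptions, and the real routine has to be read out of the game's code in a debugger:

```python
def glyph_index(lead, low, first_lead=0xA0):
    """Map a two-byte text code to a font glyph number (hypothetical scheme)."""
    if lead < first_lead:
        # Below the range: treated here as a control code or single-byte text.
        return None
    # Each lead byte is assumed to select a page of 256 glyphs.
    return (lead - first_lead) * 0x100 + low
```

If the game uses a lookup table instead, the arithmetic would be replaced by a read from that table, and dumping the table would give the mapping directly.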

Chatterine

Out of curiosity, I opened the game's font in a tile editor and the characters seem to be sorted in the order they appear in the script. But the order in the tile editor isn't the same as the order of the byte values in the text. Maybe I could look at other stuff made by the same developer to see if there's a pattern?

FAST6191

You could find that (Capcom on the DS have their own format shared between various games they made for example), or it might be that they made an encoding for a beta version of the script using first seen order and then changed it thus breaking the order, and that beta is sitting on some Chinese homebrew dev's hard drive somewhere.

That said, if you have a 90% accurate ordering then I would call that something of a win, and I would use it as the starting point for the test-everything pass that finds the rest of what you want.
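
One way to put a mostly correct ordering to work, sketched in Python: take a handful of verified pairs of text code and position in the font, and group them by the difference between the two. Where the difference stays constant, the font order can be trusted for that stretch and only the outliers need checking by hand. The sample pairs and the function name are invented for illustration:

```python
from collections import defaultdict


def offsets_by_frequency(pairs):
    """Group (text_code, font_index) pairs by code - index, most common first."""
    groups = defaultdict(list)
    for code, index in pairs:
        groups[code - index].append(code)
    # sorted() is stable, so ties keep their first-seen order.
    return sorted(groups.items(), key=lambda item: -len(item[1]))


# Invented sample: three codes line up with the font order, one does not.
verified = [(0xA000, 0), (0xA001, 1), (0xA005, 5), (0xB000, 200)]
ranked = offsets_by_frequency(verified)
```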

I should also have said you only need to find the characters used in the script. If for some reason the devs wasted NES storage on copies of the entire Hanzi collection then there is that.

Chatterine

Thanks dude! I wish there was more information on hacking this type of stuff, as there is not enough. Do you know someone who's good at this?

KingMike

Quote from: FAST6191 on June 14, 2021, 04:29:21 PM
I should also have said you only need to find the characters used in the script. If for some reason the devs wasted NES storage on copies of the entire Hanzi collection then there is that.

NES has a pretty serious VRAM access bottleneck, so it's possible they'd use CHR-ROM with duplicate tables for speed.

Thinking of this Final Fantasy 1 Chinese translation which hangs the game (with stuttering music) every time it draws a text box loading the text character data into CHR-RAM.

Chatterine

Quote from: KingMike on June 16, 2021, 10:40:10 PM
Thinking of this Final Fantasy 1 Chinese translation which hangs the game (with stuttering music) every time it draws a text box loading the text character data into CHR-RAM.

Yeah, I know that one. Later bootlegs used faster methods for drawing hanzi on screen, I think. At least Nanjing's games didn't have that slowdown.

rafaelguerrero

Hi, I think I can help you. I am also translating a pirate NES game.

Chatterine

Quote from: rafaelguerrero on July 06, 2021, 04:31:08 PM
Hi, I think I can help you. I am also translating a pirate NES game.


Thank you! What game are you working on?