
Topic: Japanese ROM scripting help

secret-glade

Japanese ROM scripting help
« on: October 24, 2017, 11:52:12 am »
So I've been fiddling around quite a bit with hex editors and changing in-game text. Most of what I've read deals with English ROMs, where the text sits in plain view and is easy to edit. From what I've read, Japanese ROMs are sometimes like this as well, but I haven't come across one that shows any readable Japanese text. I have my computer locale set to Japanese and I'm working in WindHex32, but I can't find any Japanese in the ROMs I've opened. I'm not sure if they're all compressed, or if Japanese text just isn't showing up on my computer, or at least in the program. (I'm using a kana text editor called Sakura where literally everything is in Japanese, so I don't think it's a display issue.) I have WindHex set to show kana, but I still only get random symbols.

One ROM I want to work on is a GBC game called Oide Rascal. Opening it in a tile editor shows whole hiragana and katakana sets in the tiles, so I don't believe it's compressed, but I can't find any matching sequences in the hex editor. Is there maybe a better hex editor to use, or can anyone recommend some GB or GBC ROMs that definitely have Japanese characters in the viewable script? I want to practice creating Japanese tables, but I'm stuck at finding the characters at all. Thanks!  :)

filler

Re: Japanese ROM scripting help
« Reply #1 on: October 24, 2017, 08:31:05 pm »
Heh, this I can actually help you with. :)

The issue with viewing Japanese characters in ROMs like this is that unless they use a standard encoding such as Unicode or Shift JIS (S-JIS), they are using a custom text encoding.

The English you're viewing in other ROMs is probably stored in ASCII encoding and is viewable by default in the hex editor. In order to view other text encodings, you'll need to use a table file to tell the editor which characters to show for which values.
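As a rough illustration of what a table file does, here is a minimal Python sketch that parses `HEX=text` lines and uses them to decode raw bytes. The function names `parse_table` and `decode` and the sample entries are made up for this example; they are not taken from any particular hex editor or game.

```python
def parse_table(text):
    """Parse 'HEX=text' lines into a dict mapping byte sequences to strings."""
    table = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        if not key.strip():
            continue  # skip lines with no hex value
        table[bytes.fromhex(key.strip())] = value
    return table


def decode(data, table):
    """Decode bytes with the table, trying the longest entries first."""
    longest = max(len(k) for k in table)
    out, i = [], 0
    while i < len(data):
        for size in range(longest, 0, -1):
            chunk = data[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            out.append(f"[{data[i]:02X}]")  # unknown byte, show it raw
            i += 1
    return "".join(out)


# Hypothetical table entries, not taken from any real game.
sample_table = parse_table("00=あ\n01=い\n02=う\nFF= ")
print(decode(bytes([0x00, 0x01, 0xFF, 0x02, 0x7F]), sample_table))
```

Bytes with no table entry are printed as bracketed hex, which makes it easy to spot control codes or characters still missing from the table.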

Typically these values are one byte each, in the range 00-FF, though they can be longer if there's a lot of kanji.

Others may be able to help you more with pointer tables, but what I normally use is an emulator with a feature that lets you see video memory, or simple relative searching.

If you take a look in video memory when text is on the screen, you'll be able to see the font loaded into memory. This should be the order that the characters appear in the encoding, and in an emulator like FCEUX you can float your cursor over the character to display its value.

You can also do a relative search to look for values that match the pattern of the values relative to each other. You'll normally want to base this on the order of the font as it appears in memory, or in the ROM as viewed with a tile editor. Lately I've been using Translhextion's "value scan relative" for my searches.
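The idea behind a relative search can be sketched in a few lines of Python. This is only an illustration, assuming one byte per character; the function name `relative_search` and the sample data are invented for the example and do not reflect how Translhextion implements its scan.

```python
def relative_search(data, indices):
    """Return offsets where consecutive bytes differ exactly like `indices`."""
    diffs = [b - a for a, b in zip(indices, indices[1:])]
    hits = []
    for start in range(len(data) - len(indices) + 1):
        window = data[start:start + len(indices)]
        if all(window[k + 1] - window[k] == d for k, d in enumerate(diffs)):
            hits.append(start)
    return hits


# Hypothetical example: the font order gives positions 11, 3, 7 for three
# characters, and the game stores them with an unknown base value (0x80 here).
rom = bytes([0x00, 0x8B, 0x83, 0x87, 0x10, 0x20])
print(relative_search(rom, [11, 3, 7]))
```

Only the differences between the font positions matter, so the search works without knowing which byte value the first character of the font maps to.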

Once you find some matches, typically based on some text you find early in the game, try making edits to it and checking it in an emulator until you've found the text you're looking for. Based on the values, and the order of the font tiles, you can normally fill out the rest of your table file. In a pinch, you can enter values into the text you found and check them in an emulator to see what they display.
« Last Edit: October 25, 2017, 01:01:20 pm by filler »

Psyklax

Re: Japanese ROM scripting help
« Reply #2 on: October 25, 2017, 05:39:23 am »
Filler gave you a good outline of the process, so let's look at how it applies to you.

Looking at a gameplay video on YouTube, it looks like it uses just kana, which makes things easier. Basically, games on 8- and 16-bit consoles typically use one byte per character to save space, which limits you to 256 different characters. Obviously that's not enough for proper Japanese script that needs lots of kanji; that would require two bytes (giving a total of 65,536 values, more than enough for every character imaginable). Games on 32-bit and later systems generally use more accepted standards like Unicode or Shift JIS.

Anyway, the GBC is an 8-bit machine, but a very late one, so it can use huge cartridges compared to when the GB first came out. I translated the first Detective Conan, and it was clear that despite the amount of text, compression was unnecessary due to the huge ROM size. This game probably does the same.

However, I did a relative search using the kana order I found in the ROM, as well as using the typical kana order, and I found nothing. It's surprising, but you may need to use a different route to finding the text. On a NES it's easy because of how the system works, but GB is a bit trickier and I don't have experience with it.

It seems that unlike the NES, graphics are loaded into memory individually, then put into place in a different part of memory. The question is how they get into memory in the first place. I'm still figuring this out myself, hopefully I'll find something...

secret-glade

Re: Japanese ROM scripting help
« Reply #3 on: October 25, 2017, 11:30:35 am »
@Filler, thank you for the explanation. I think I had a different idea of how tables worked. Like @Psyklax said, the tricky thing is finding the text at all with this one. Perhaps I'll practice a bit with NES ROMs to make sure I have the general searching process down. Are GB/C ROMs typically difficult to work with? I have trouble searching for GB/C related help since everything is just easy Pokemon utilities.

KingMike

Re: Japanese ROM scripting help
« Reply #4 on: October 25, 2017, 02:49:01 pm »
Quote from: Psyklax on October 25, 2017, 05:39:23 am
However, I did a relative search using the kana order I found in the ROM, as well as using the typical kana order, and I found nothing. It's surprising, but you may need to use a different route to finding the text. On a NES it's easy because of how the system works, but GB is a bit trickier and I don't have experience with it.

It seems that unlike the NES, graphics are loaded into memory individually, then put into place in a different part of memory. The question is how they get into memory in the first place. I'm still figuring this out myself, hopefully I'll find something...

GB games might draw a text window directly into VRAM (as opposed to loading the entire font) because the original GB has less VRAM than the NES. The GB has 8KB (which also applies to backwards-compatible GBC games), while the NES PPU typically has 10KB: an 8KB CHR-ROM/CHR-RAM bank on the cart plus the 2KB of internal VRAM for the nametables. (It can be 12KB with "four-screen mirroring", which requires extra RAM on the cart.)

Psyklax

Re: Japanese ROM scripting help
« Reply #5 on: October 25, 2017, 04:09:48 pm »
Quote from: KingMike on October 25, 2017, 02:49:01 pm
GB games might print a text window directly in VRAM (as opposed to loading the entire font) due to that the original GB has less VRAM than the NES. The GB (which also affects backwards-compatible GBC games) has 8KB while the NES PPU has 10KB (typically, it can be 12KB with "four-screen mirroring" which requires extra RAM on the cart): an 8KB CHR-ROM/CHR-RAM bank on the cart as well as the 2KB internal VRAM for the nametable.

It sure looks like it:

http://gameboy.mongenel.com/dmg/gbc_memorymap.txt

32KB for the program ROM, same as the NES; 6KB for the pattern table, compared to the NES's 8KB, so the sprites get only 2KB (maybe they figured it'll be okay given the lower screen resolution?); 2KB for the name tables (looks like there are two, and they cover a much wider area than the viewable window, presumably making four-way scrolling a breeze - at least that's an advantage over the NES); and 24KB for the rest.

So the GB has four times as much internal work RAM as the NES (8KB versus 2KB), but a little less VRAM. I thought it would have memory limitations given that both graphics and program need to be accessed over the Z80-like CPU's 16-bit address bus, but since so much of the NES's range is wasted on mirroring, it doesn't seem like a problem. In fact, the fact that VRAM can be accessed directly rather than through a register is probably a bonus, allowing for more flexibility.

Nevertheless, I still can't find the text for this game. :D

secret-glade

Re: Japanese ROM scripting help
« Reply #6 on: October 25, 2017, 04:21:14 pm »
I've been messing with Quest of Ki. I made a Japanese table and I'm able to edit some lines I would translate, but every edit corrupts other parts of the game no matter how small it is, even if it's just changing one hiragana character in a line to a different one. The game is still playable, but editing one bit of script screws up most tiles aside from the start screen, the pre- and post-level point total screens, and the green dialogue bits in between.

Only hiragana and katakana values were replaced for this second screen, didn't replace any values that weren't kana values.




Is there a certain way I should be saving the ROMs after edits? I'm using WindHex32.


I'll try messing around with some English GBC ROMs and some other Japanese ones to see if I can get a table made and actually find text somewhere.


goldenband

Re: Japanese ROM scripting help
« Reply #7 on: October 25, 2017, 05:59:44 pm »
I could be wrong (it wouldn't be the first time!), but that sounds like a sign that an emulator is using a checksum to recognize that ROM in order to overcome a bad header (or maybe a bad dump). Changing anything in the ROM means the emulator doesn't recognize it anymore, doesn't compensate for the error in the header, and is probably using the wrong mapper as a result.

Psyklax

Re: Japanese ROM scripting help
« Reply #8 on: October 25, 2017, 07:09:34 pm »
I feel like such an idiot... :D

I spent hours banging my head against the wall trying to understand where Oide Rascal was getting its text from, when I forgot the golden rule about relative searching: try it with spaces between letters.

Get Relativeful Search:
http://www.romhacking.net/utilities/40/

I looked at the graphics in Tile Molester and saw how the kana were laid out, and used that to find the first three letters of the intro ("do u bu"). I searched for "do" as the 41st letter, clicked 'skip value', then "u" as the 6th, skip letter, same for "bu". Search... several matches found. I change the second one... success. :)
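The "skip value" trick amounts to comparing every second byte and ignoring the ones in between. A minimal Python sketch of that idea follows; the function name `relative_search_stride` is invented here, and the position 54 for "bu" is not stated in the post but is counted from the table below (with 829F as position 1). This is not how Relativeful Search itself is implemented.

```python
def relative_search_stride(data, indices, stride=2):
    """Relative search comparing every `stride`-th byte and ignoring the rest."""
    span = (len(indices) - 1) * stride + 1
    hits = []
    for start in range(len(data) - span + 1):
        values = data[start:start + span:stride]
        if all(values[k] - values[0] == indices[k] - indices[0]
               for k in range(len(indices))):
            hits.append(start)
    return hits


# Font positions: "do" = 41, "u" = 6, "bu" = 54 (the last one is an assumption
# derived from the table, not a number given in the post).
# Sample bytes are padding, then "do u bu" as two-byte values, then padding.
sample = bytes.fromhex("00 82C7 82A4 82D4 00")
print(relative_search_stride(sample, [41, 6, 54]))
```

The reported offset points at the byte that was compared, which here is the second byte of the first character, so the character itself starts one byte earlier.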

Basically, the text uses two bytes per character: one selects the graphics bank to load from, the other selects the tile in that bank. So $82C7 is "do", then $82A4 for "u" and so on. Katakana is on $83, punctuation is on $81, which you can see when you load it in Tile Molester. I actually made the table file for you (it's in Romaji because I'm tired and can't be bothered pasting in my Shift JIS characters :D ):

Code: [Select]
8140=
829F=xa
82A0=a
82A1=xi
82A2=i
82A3=xu
82A4=u
82A5=xe
82A6=e
82A7=xo
82A8=o
82A9=ka
82AA=ga
82AB=ki
82AC=gi
82AD=ku
82AE=gu
82AF=ke
82B0=ge
82B1=ko
82B2=go
82B3=sa
82B4=za
82B5=shi
82B6=ji
82B7=su
82B8=zu
82B9=se
82BA=ze
82BB=so
82BC=zo
82BD=ta
82BE=da
82BF=chi
82C0=dji
82C1=tt
82C2=tsu
82C3=dzu
82C4=te
82C5=de
82C6=to
82C7=do
82C8=na
82C9=ni
82CA=nu
82CB=ne
82CC=no
82CD=ha
82CE=ba
82CF=pa
82D0=hi
82D1=bi
82D2=pi
82D3=fu
82D4=bu
82D5=pu
82D6=he
82D7=be
82D8=pe
82D9=ho
82DA=bo
82DB=po
82DC=ma
82DD=mi
82DE=mu
82DF=me
82E0=mo
82E1=xya
82E2=ya
82E3=xyu
82E4=yu
82E5=xyo
82E6=yo
82E7=ra
82E8=ri
82E9=ru
82EA=re
82EB=ro
82EC=xwa
82ED=wa
82EE=wi
82EF=we
82F0=wo
82F1=nn
8340=XA
8341=A
8342=XI
8343=I
8344=XU
8345=U
8346=XE
8347=E
8348=XO
8349=O
834A=KA
834B=GA
834C=KI
834D=GI
834E=KU
834F=GU
8350=KE
8351=GE
8352=KO
8353=GO
8354=SA
8355=ZA
8356=SHI
8357=JI
8358=SU
8359=ZU
835A=SE
835B=ZE
835C=SO
835D=ZO
835E=TA
835F=DA
8360=CHI
8361=DJI
8362=TT
8363=TSU
8364=DZU
8365=TE
8366=DE
8367=TO
8368=DO
8369=NA
836A=NI
836B=NU
836C=NE
836D=NO
836E=HA
836F=BA
8370=PA
8371=HI
8372=BI
8373=PI
8374=FU
8375=BU
8376=PU
8377=HE
8378=BE
8379=PE
837A=HO
837B=BO
837C=PO
837D=MA
837E=MI
8380=MU
8381=ME
8382=MO
8383=XYA
8384=YA
8385=XYU
8386=YU
8387=XYO
8388=YO
8389=RA
838A=RI
838B=RU
838C=RE
838D=RO
838E=XWA
838F=WA
8392=WO
8393=NN
8394=VU
8395=XKA
8396=XKE

Just copy-paste that into a .tbl file and load it into WindHex32 EX (my editor of choice for such things), and you'll find the intro text at $AA054.
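For anyone who wants kana instead of Romaji in the table: the two-byte values listed above coincide with the standard Shift JIS codes for the same kana (for example, 82C7 is ど and 8368 is ド). Under that observation, a kana version of the table can be generated with Python's built-in `shift_jis` codec. This is a sketch with an invented helper name, `kana_table`; whether the game follows Shift JIS for every value, including punctuation and control codes, is not established in this thread.

```python
def kana_table(codes):
    """Build 'HEX=kana' lines by decoding each two-byte code as Shift JIS."""
    return [f"{code}={bytes.fromhex(code).decode('shift_jis')}" for code in codes]


# A few codes copied from the table above; 8140 decodes to a full-width space.
print("\n".join(kana_table(["8140", "82A0", "82A4", "82C7", "82D4", "8368"])))
```

Save the resulting file in an encoding your hex editor can read (many Japanese-aware editors expect Shift JIS table files).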

Quote from: goldenband on October 25, 2017, 05:59:44 pm
I could be wrong (it wouldn't be the first time!), but that sounds like a sign that an emulator is using a checksum to recognize that ROM in order to overcome a bad header (or maybe a bad dump). Changing anything in the ROM means the emulator doesn't recognize it anymore, doesn't compensate for the error in the header, and is probably using the wrong mapper as a result.

That's one possibility, or the other is that he's made an error in his edits somehow. The NES is generally quite easy to find text and replace, but I haven't looked at that game.

Anyway, I hope I've been of help to you! :)

secret-glade

Re: Japanese ROM scripting help
« Reply #9 on: October 26, 2017, 11:00:06 am »
I actually didn't know there was already a Quest of Ki translation. Header was the problem. Changed the $00006 value from 40 to 41 and everything straightened out.
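For reference, here is what that one-byte change means, assuming the ROM is a standard iNES file with a 16-byte header, so that offset $00006 is header byte 6. In that format, bit 0 of byte 6 selects nametable mirroring and the upper four bits hold the low nibble of the mapper number. The helper name `describe_flags6` is made up for this sketch.

```python
def describe_flags6(value):
    """Decode byte 6 of an iNES header into its documented bit fields."""
    return {
        "mirroring": "vertical" if value & 0x01 else "horizontal",
        "battery": bool(value & 0x02),
        "trainer": bool(value & 0x04),
        "four_screen": bool(value & 0x08),
        "mapper_low_nibble": value >> 4,
    }


for value in (0x40, 0x41):
    print(f"{value:02X}", describe_flags6(value))
```

Going from 40 to 41 leaves the mapper nibble untouched and only flips the mirroring bit from horizontal to vertical, which fits the symptom of scrambled background tiles while the game keeps running.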

@Psyklax, thank you so much for the table! Are two-byte characters uncommon in GB/C ROMs? I think I have a better understanding of relative searches now as well: figuring out the distance between certain characters as shown in the tiles, then searching the hex for sequences that follow that pattern of distances from each other. Thanks again for your help  :)

What I have to figure out now is how to keep the translated text inside the text boxes correctly. I edited some text to English but it broke through the text boxes, even with line break controls. Are there any documents dealing with things like this, figuring out where text box tiles are in the code and keeping text within its confines? The only two things I can think of are figuring out the text box frame tile values by looking at the tiles and doing a relative search based on maybe the length of a horizontal side (the top or bottom line of a box), or looking directly at the hex code to try to find patterns of bytes around where text lines stop and start. I'm assuming there's an easier way to do this.


FAST6191

Re: Japanese ROM scripting help
« Reply #10 on: October 26, 2017, 05:36:37 pm »
I don't actually know how common 16 bit encodings are on the GB/GBC ("2 byte" will be understood, and is something of a thing in general PC terms, but "16 bit" is the more commonly seen term). On the GBA and DS they are pedestrian, everyday and utterly non-noteworthy. I would not be surprised to find they are as common as you like on the GB/GBC too: nobody would accuse it of having a lot of memory and storage, but it had it a bit easier than the NES. On the other hand, I would also not be surprised to find some of the older methods bubbling up from time to time: table switching (Japanese may have thousands of characters in common rotation, but 256 at a time can still get you there, especially if you never have more than 256 different characters on screen at any one time), a few 16 bit exceptions, half width, flag bytes and such.

Length based relative search is something I had not really considered before. I, and many others, do a more basic variation on the theme when divining the table as space is typically the most common character and will tend to occur at a fairly fixed interval range so I guess it would work here.

The trouble, though, is that there are many different ways games handle this. Some games need forced new lines or the text goes off the edge/clips (and probably crashes somewhere along the way), some things (mainly menus) are fixed length, some things auto wrap/create a new line, and others may hybridise all of this and probably do something else too.

To answer your question though: I would probably look at font handling, or variable width font conversion, as those necessarily deal with character placement on the screen. Not many hackers edit the text boxes/text location beyond anything the base engine affords, save for the variable width conversion stuff, which itself is the mark of a good hacker dealing with text modification hacking.

Alternatively there tends to be a max limit that you can figure out (if it is not immediately evident from playing the game then make change, run ROM in emulator, make change...). Use whatever you like to tell this (even something as simple as forcing your notepad type program to have a fixed width font, disabling word wrap and setting the window size to the appropriate length, though I would probably step it up and use column editing mode of notepad++ or a spreadsheet or something) and carry on from there. Alternatively alternatively one of your team or testers may go through the whole game looking for such glitches and reporting them back to you.
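The "figure out the max limit and check against it" approach is also easy to script. Here is a minimal Python sketch, assuming a fixed-width font and a known maximum line length; the function name `find_overlong_lines`, the 18-character limit and the sample script are all hypothetical.

```python
def find_overlong_lines(script, max_width, newline="\n"):
    """Return (line_number, length, text) for every line longer than max_width."""
    problems = []
    for number, line in enumerate(script.split(newline), start=1):
        if len(line) > max_width:
            problems.append((number, len(line), line))
    return problems


# Hypothetical script for a text box that fits 18 characters per line.
sample = "Welcome to town!\nThe shop is closed today.\nCome back later."
for number, length, line in find_overlong_lines(sample, 18):
    print(f"line {number} is {length} characters: {line}")
```

If the game uses a control code for line breaks, pass whatever your dumped script uses for it as `newline`.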

Psyklax

Re: Japanese ROM scripting help
« Reply #11 on: October 27, 2017, 03:10:38 am »
Quote from: secret-glade on October 26, 2017, 11:00:06 am
What I have to figure out now is how to keep the translated text inside of the text boxes correctly. I edited some text to English but it broke through the text boxes, even with line break controls. Are there any documents dealing with things like this, figuring out where text box tiles are in the code and keeping text within its confines?

Remember that there is NO set way of programming anything, every game is different. Changing text is generally easy without any knowledge of assembly, but if you want to change text boxes, you're going to need it. You can check out my translation of Aighina's Prophecy where I expanded the text window: it was quite a bit of work, and I needed to learn 6502 assembly to do it. You need to see what the game is doing and change it. So let's say there's an instruction that says "draw the background tile for the text window eight times" and you'd change it to ten times, and so on. But it takes more than changing one byte, and you need to play test it to make sure you don't break anything else.

You also need to see how the game handles text. Some, like Dragon Warrior on the NES, are beautiful: they take the text from the ROM and recompose it with line breaks, and can run on as far as you want. That's because the guys localising it programmed that routine in, but sadly Japanese games don't normally bother with that. With Oide Rascal for example, I provided all the kana, but there are control codes that tell the CPU what to do, so you need to experiment with that.
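When the game itself does not recompose text, the same effect can be approximated at script-insertion time by wrapping the translation before it goes into the ROM. A minimal Python sketch using the standard `textwrap` module follows; the helper name `wrap_for_box`, the 18-character width and the `<br>` line-break marker are placeholders, since the real control codes for Oide Rascal still have to be worked out by experiment.

```python
import textwrap


def wrap_for_box(text, width, line_break="<br>"):
    """Word-wrap text to `width` characters and join lines with a break marker."""
    lines = textwrap.wrap(text, width=width)
    return line_break.join(lines)


# Hypothetical 18-character text box and a placeholder line-break marker.
print(wrap_for_box("The shop is closed today. Come back later.", 18))
```

This only covers fixed-width fonts; with a variable width font the width would have to be measured in pixels instead of characters.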