
Author Topic: Game has English words already in ASCII format. What about the Japanese?  (Read 7856 times)

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
I opened up a Super Famicom game in WindHex and discovered what looks like all the English text in the game readily available without making a table.

My question is: if the Japanese text is also already in ASCII format, wouldn't I see the Japanese words just as I see the English ones?

Gideon Zhi

  • Discord Staff
  • Hero Member
  • *****
  • Posts: 3536
    • View Profile
    • Aeon Genesis
Maybe, except Japanese is never, ever stored as ASCII. There aren't even 100 available characters in the ASCII codepage (it goes from 0x20-0x7E, roughly) which, if you stretch, might barely be enough for a full katakana and hiragana set depending on how diacriticals are handled, but it still wouldn't be "ASCII." To my knowledge there is no 8-bit Japanese text encoding standard.

Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
Some games have the game ID written in Japanese characters (the game ID is mandatory, and it is supposed to be in ASCII), but I'm almost sure that's a custom encoding of the console, not really ASCII. An example is Hero Senki (Hero Chronicle), which shows up as ヘロ センキ.

Anyway, just because some of the text is in ASCII, doesn't mean all the text in the game is. Sometimes you need a table for different instances of dialogue (opening, ending, in-battle dialogue). The most I've found (consistently) is credits written in ASCII.

Zoinkity

  • Hero Member
  • *****
  • Posts: 565
    • View Profile
Some very, very old Japanese PCs used a 7-bit encoding.  In those cases diacriticals are separate characters.  0x80 was used as a flag to denote that an EOL follows the char.  It's not something you'd call a standard though, and we're talking systems like '80s-era microcomputers.
For the most part you'll only see either wide chars (2 byte chars) or multi-byte encodings--unless they omit all kanji and use something "custom". 
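
A rough sketch of how the 0x80 end-of-line flag mentioned above could be read back. The function name and sample bytes are made up for illustration, and ASCII letters stand in for whatever 7-bit character codes a given machine actually used:

```python
def split_lines_7bit(data: bytes) -> list[bytes]:
    """Split a byte stream in which bit 7 marks the last character of a line."""
    lines = []
    current = bytearray()
    for b in data:
        current.append(b & 0x7F)   # low 7 bits hold the character code
        if b & 0x80:               # high bit set: this char ends the line
            lines.append(bytes(current))
            current.clear()
    if current:                    # leftover text with no end-of-line flag
        lines.append(bytes(current))
    return lines


# Made-up sample: "ABC" and "DE", each with the flag on its last character.
sample = bytes([0x41, 0x42, 0x43 | 0x80, 0x44, 0x45 | 0x80])
print(split_lines_7bit(sample))
```

The character table itself would still have to be worked out per system; this only shows the flag handling.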

Internal text, like notes, debug strings, included files, etc. may be encoded in an entirely different method than the rest of the game.  If you're lucky, they may have already implemented a multibyte setup so you can switch between ASCII and whatever without effort.  (Really stinking lucky, more like it.)

tryphon

  • Hero Member
  • *****
  • Posts: 736
    • View Profile
First, the question makes no sense, since ASCII doesn't contain any Japanese characters (the 'A' is for 'American', so it isn't even sufficient to encode accented letters in other languages that use the Latin alphabet).

That said, many standard encodings wider than 7 bits extend ASCII; that is, if single-byte characters are allowed in the standard, they are usually the same as in ASCII. IIRC, that's the case for S-JIS, which is likely the Japanese encoding most often used in games.
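
A small illustration of that point using Python's built-in "shift_jis" codec. The sample strings are arbitrary, and this says nothing about what any particular game does:

```python
# Illustration only: sample strings are arbitrary.
ascii_text = "HERO"
kana_text = "あ"

ascii_bytes = ascii_text.encode("shift_jis")
kana_bytes = kana_text.encode("shift_jis")

# Plain Latin letters keep their one-byte ASCII values.
print(ascii_bytes.hex())   # 4845524f
# Hiragana takes two bytes, and the lead byte is 0x80 or higher.
print(kana_bytes.hex())    # 82a0
```

So a hex editor in plain ASCII mode would show the Latin letters of a Shift-JIS string as readable text and the kana as garbage byte pairs.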

Single-byte Japanese encodings exist, but as already said, they are not standard, and you have to make a table.

VicVergil

  • Hero Member
  • *****
  • Posts: 736
    • View Profile
As an example, Marvelous for the Super Famicom uses the 00-4F range for hiragana, 50-9F for katakana, A0-EF for Latin characters, numbers, punctuation and then a handful of kanji; F0-FF are various control codes.
F5 followed by a byte gives you one of 255 kanji.
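
The layout described above could be sketched like this. The function names are hypothetical, only the F5 kanji prefix is modeled as a two-byte code, and any other control codes that take arguments are not handled:

```python
def classify(b: int) -> str:
    """Name the range a single byte falls into, per the layout described above."""
    if b <= 0x4F:
        return "hiragana"
    if b <= 0x9F:
        return "katakana"
    if b <= 0xEF:
        return "latin/digits/punctuation/kanji"
    return "control"


def tokenize(data: bytes) -> list[tuple[str, int]]:
    """Walk the bytes, treating F5 + one byte as a kanji index."""
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0xF5 and i + 1 < len(data):
            tokens.append(("kanji", data[i + 1]))  # second byte selects the kanji
            i += 2
        else:
            tokens.append((classify(b), b))
            i += 1
    return tokens


# Made-up byte string, not taken from the game.
print(tokenize(bytes([0x00, 0x50, 0xA0, 0xF5, 0x10, 0xF0])))
```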

Terranigma uses single-byte values from 00-7F as English characters in the English version, but in the JP version there's a switch control code before the text that switches them between katakana/hiragana/Latin... and 80/81/82... followed by another byte gives a kanji.
Secret of Mana manages to use only single-byte characters in its Japanese version: it has the katakana/hiragana switch and uses a minimal set of kanji in the 80-FF range.
Games like Bushi Seiriyuden and Chaos Seed even have more than a thousand kanji (using two bytes per character, obviously).

Hearing about Lagrange Point... its text encoding is an ugly mess.

Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
Speaking of switches, how do you guys deal with them when you make table files? I was working on SRW1, and the whole dump was in katakana because of that. I just put a switch before the text (<kata> and <hira>) to improve readability.

You all probably use a custom dumper/inserter, but I would like to hear your takes on this.
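
For what it's worth, a stateful dumper along those lines might look roughly like this. The switch bytes (0xF0, 0xF1) and the three-entry tables are invented for the example and are not taken from SRW1 or any other game:

```python
HIRA_SWITCH = 0xF0  # hypothetical control byte: switch to hiragana
KATA_SWITCH = 0xF1  # hypothetical control byte: switch to katakana

# Tiny made-up tables; a real game's table would be much larger.
HIRAGANA = {0x00: "あ", 0x01: "い", 0x02: "う"}
KATAKANA = {0x00: "ア", 0x01: "イ", 0x02: "ウ"}


def dump(data: bytes, tag_switches: bool = True) -> str:
    """Decode bytes with the table selected by the most recent switch code."""
    table = HIRAGANA                      # assumed default mode
    out = []
    for b in data:
        if b == HIRA_SWITCH:
            table = HIRAGANA
            if tag_switches:
                out.append("<hira>")
        elif b == KATA_SWITCH:
            table = KATAKANA
            if tag_switches:
                out.append("<kata>")
        else:
            # Unknown bytes are kept as hex so nothing is silently lost.
            out.append(table.get(b, f"<${b:02X}>"))
    return "".join(out)


print(dump(bytes([0x00, 0x01, 0xF1, 0x00, 0x02, 0xF0, 0x02])))
```

Keeping the <kata>/<hira> tags in the dump means an inserter can turn them back into the same control bytes; dropping them reads better but loses that round trip.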

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
Ok, I definitely see most if not all the English text from the game, but no Japanese characters. So it looks like I have to make a table for the Japanese dialogue after all.

VicVergil

  • Hero Member
  • *****
  • Posts: 736
    • View Profile
Quote from naxis:
Ok, I definitely see most if not all the English text from the game, but no Japanese characters. So it looks like I have to make a table for the Japanese dialogue after all.

Of course you have to. What did you think? :P
Shift-JIS wasn't really used until consoles with more storage came along.
The earlier ones would almost always use custom values for Japanese characters.
I recommend Monkeymoore.
Try to find the visuals for the font in the ROM or VRAM to get the order the game uses
(sometimes あぁいぃうぅえぇおぉ・・・ or あいうえお・・・わをんぁぃぅぇぉゃゅょ 
かきくけこさ・・・がぎぐげご・・ or かがきぎくぐけげこごさちす・・・ )
When you get it you write it in Monkeymoore.
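
To show why the font order matters: once it is known, a relative search can look for a word by the differences between its characters instead of their absolute values. This is a generic sketch of that idea, not how Monkeymoore is implemented; the font order, the word and the ROM bytes are all made up, and a single-byte encoding is assumed:

```python
def relative_search(rom: bytes, indices: list[int]) -> list[int]:
    """Return offsets where consecutive bytes differ the same way as `indices`."""
    deltas = [b - a for a, b in zip(indices, indices[1:])]
    hits = []
    for start in range(len(rom) - len(indices) + 1):
        window = rom[start:start + len(indices)]
        if all(window[i + 1] - window[i] == d for i, d in enumerate(deltas)):
            hits.append(start)
    return hits


FONT_ORDER = "あいうえおかきくけこ"   # assumed order, as read off the font tiles
word = "あおい"                        # a word known to appear on screen
indices = [FONT_ORDER.index(ch) for ch in word]   # [0, 4, 1]

rom = bytes([0x7F, 0x20, 0x24, 0x21, 0x00])       # made-up data
hits = relative_search(rom, indices)
print(hits)                                       # [1]

# The byte value of the first character in the font order follows from a hit.
base = rom[hits[0]] - indices[0]
print(hex(base))                                  # 0x20
```

Longer words give fewer false hits, since more differences have to line up.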

gadesx

  • Sr. Member
  • ****
  • Posts: 278
    • View Profile
    • Gadesx scene
try this:

sjis tables for hexadecimal editors
http://dl.dropboxusercontent.com/u/22524283/tablas%20sjis.rar

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
I tried looking in a few different tile editors. I can't find the font. I did use VSNES however and found this in the VRAM of the memory from a savestate:



I can see a lot of the kanji clearly, but the hiragana looks torn apart or something. How am I to make a table of the hiragana if I can't see the order it's placed in properly?
« Last Edit: March 08, 2014, 12:29:42 pm by naxis »

VicVergil

  • Hero Member
  • *****
  • Posts: 736
    • View Profile
Neither the kanji nor the kana are showing "correctly", they're just divided in two.
To see it properly, extract the RAM's contents and open it with your favorite tile editor on the correct mode, with two tiles per line.
Well, that should be useful for the table building part...


Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
I can't see your screenshot very well, but I think the roman alphabet is an 8x8 font, and the kana is 16x16. From what I'm seeing, two roman letters occupy the same space as one kana.

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
Quote from VicVergil:
Neither the kanji nor the kana are showing "correctly", they're just divided in two.
To see it properly, extract the RAM's contents and open it with your favorite tile editor on the correct mode, with two tiles per line.
Well, that should be useful for the table building part...

I'm sorry Ghanmi. What do you mean by "extract the RAM's contents"? How do I do that?

Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
Quote from naxis:
I'm sorry Ghanmi. What do you mean by "extract the RAM's contents"? How do I do that?
It's the same thing you did with VSNES. Save a state and open it with a tile editor. If you didn't know, a Save State is just a memory dump (RAM, in this case).

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
I thought that is what it might have meant. Looking at the save state in Tile Layer Pro and YY-CHR in different formats doesn't show the font clearly.

What is meant by "two tiles per line?"

Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
In YY-CHR, change both Width and Height to 16. A "typical" tile is  8x8, so two tiles per line is 16x16... or is it 8x16? I honestly don't remember.
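
To make "two tiles per line" concrete, here is a rough sketch of laying out consecutive 8x8 tiles in rows of a given width. It assumes the four tiles of a 16x16 glyph are stored one after another (top-left, top-right, bottom-left, bottom-right), which may not match this game; the names are invented for the example:

```python
Tile = list[list[int]]  # 8 rows of 8 pixel values


def arrange(tiles: list[Tile], tiles_per_row: int) -> list[list[int]]:
    """Lay consecutive 8x8 tiles out left to right, `tiles_per_row` per line."""
    rows = []
    for first in range(0, len(tiles), tiles_per_row):
        band = tiles[first:first + tiles_per_row]
        for y in range(8):
            line = []
            for tile in band:
                line.extend(tile[y])   # append this tile's pixels for row y
            rows.append(line)
    return rows


def solid(value: int) -> Tile:
    """Helper for the demo: an 8x8 tile filled with one value."""
    return [[value] * 8 for _ in range(8)]


# Four dummy tiles stand in for the quarters of one 16x16 character.
glyph = arrange([solid(0), solid(1), solid(2), solid(3)], tiles_per_row=2)
print(len(glyph), len(glyph[0]))   # 16 16
```

With the wrong width the halves of each character end up on different lines, which looks like the "divided in two" effect mentioned earlier in the thread.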

naxis

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
I'm using YY-CHR to find the Japanese font in the game. I tried the 16x16 option and this is what I got:



Can't see the Japanese font clearly and it doesn't put them back as one either. There is no 8x16 option.

I tried all the options and the 2 bits per pixel came up most clearly:




But still the Japanese font is in half. I'm still not sure how to make the Japanese characters become whole again when looking at them in a tile editor.

What am I missing in this scenario?
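
For reference, this is roughly what the 2 bits per pixel mode does with each tile, assuming the usual SNES 2bpp planar layout (16 bytes per 8x8 tile, two bitplane bytes per row). Where the VRAM sits inside a save state file depends on the emulator and is not handled here:

```python
def decode_2bpp_tile(tile: bytes) -> list[list[int]]:
    """Decode one 16-byte 2bpp planar tile into 8 rows of 8 color indices (0-3)."""
    if len(tile) != 16:
        raise ValueError("a 2bpp 8x8 tile is 16 bytes")
    rows = []
    for y in range(8):
        low, high = tile[2 * y], tile[2 * y + 1]   # bitplane 0, bitplane 1
        row = []
        for x in range(8):
            bit = 7 - x                            # leftmost pixel is bit 7
            row.append(((low >> bit) & 1) | (((high >> bit) & 1) << 1))
        rows.append(row)
    return rows


# Made-up tile: only the first row has pixels set.
tile = bytes([0x80, 0x01] + [0] * 14)
pixels = decode_2bpp_tile(tile)
print(pixels[0])   # [1, 0, 0, 0, 0, 0, 0, 2]
```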

Scio

  • Full Member
  • ***
  • Posts: 155
    • View Profile
Which part of the game did you take that savestate in? I took one right before a stage (when the stage title appears), and it shows pretty cleanly here in YY-CHR. Are there any other parts with Japanese text?

KingMike

  • Forum Moderator
  • Hero Member
  • *****
  • Posts: 7255
  • *sigh* A changed avatar. Big deal.
    • View Profile
Doesn't Tile Molester have a Row Interleaved option or something that might help?
I'm too lazy to install Java at the moment to check.

Also, you shouldn't be hacking the [h1-C] version, as that means it's a bad dump or something. Look for a ROM with [!] in the filename.