
Author Topic: Is translating a game really this much work?  (Read 1213 times)

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Is translating a game really this much work?
« on: August 24, 2017, 09:50:19 pm »
Hi everyone~,

I decided to have a go at translating the Japanese-only SEGA Saturn title "Air Management '96." Just to see if I could get anywhere with it, or if it was too difficult for me.

I'm at a slight disadvantage, because I don't speak any Japanese, I've never modified a ROM before, I'm not familiar with the hardware or software of the Saturn console, and I've also never played the actual game before (if anyone has, do let me know if it's fun or not).

I started yesterday, and have made some progress, but am finding there is an enormous amount of work ahead of me.

Initially, I thought the best way to produce the finished product would be to copy all the files off the game CD, then burn them back to a disc image, but it seems it is not so simple; perhaps the Saturn requires a specific disc size, or some such? Anyway, after going through some applications, I found one called 'CDmage', which is very good, because it lets me open the game disc, replace/overwrite an existing file (i.e. a file containing untranslated text), and the disc still works in the emulator afterwards.

I started off looking in the game's memory - with it running in the emulator - for a number (digits) that was at the end of some Japanese text (I think it is katakana?). From there, I was able to figure out where strings are. Now the hexadecimal data for each byte is slowly starting to make sense, because they correspond to the 'Shift-JIS' table, which I already read a lot about on here. You can see a changed string, here:


So, because I can translate one string - which is the same type of string in the game as all the others - and because I can write that translation into a new disc image to play in an emulator, I feel like I have the whole process from start-to-finish available, now.

After finding the first few addresses, I started making a spreadsheet:

Bigger version: http://imgur.com/UuAC7Ip

You can see that the addresses in the left columns (start/end in game RAM) and the ones in the middle (start/end in containing file) all end with the same low-order digits. I guess this is normal?

But, I'm still finding it very difficult to find strings shown throughout the game. A lot of guesswork. Sometimes, it seems impossible to locate certain strings. I even find it may be easier to look at the Shift-JIS characters, and match them to what I can see on the screen (hamburger face is ツ 8356, TV aerial on roof is イ 8343, etc.), but this is even more slow and difficult.
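Matching glyphs to codes by eye is error-prone when two characters look alike, so a lookup in the other direction can help. The following is a minimal sketch, assuming the game uses the standard Shift-JIS table as supported by Python's built-in `shift_jis` codec; the helper names `sjis_char` and `describe` are made up for illustration. With that table, 8343 comes out as イ, while 8356 comes out as シ; the similar-looking ツ is 8363.

```python
import unicodedata


def sjis_char(code: int) -> str:
    """Decode a two-byte Shift-JIS code such as 0x8343 into its character."""
    return code.to_bytes(2, "big").decode("shift_jis")


def describe(code: int) -> str:
    """Return the Unicode name of the character behind a Shift-JIS code."""
    return unicodedata.name(sjis_char(code))


if __name__ == "__main__":
    # Print ASCII-only names so the output is readable without a Japanese font.
    for code in (0x8343, 0x8356, 0x8363):
        print(f"{code:04X}: {describe(code)}")
```

The Unicode names (for example "KATAKANA LETTER I") are easier to tell apart than the glyphs themselves if you cannot read Japanese.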

It seems like it would take forever to translate a whole game properly.

Is there anything that can be done to speed it up? I know that people talk about script extraction(?), but I would surely have to write a custom program to extract the strings from the binary file they're all in on the disc, and I don't know which bytes are UI strings and which aren't, because I have to work that out manually at the moment... :huh:

Psyklax

  • Sr. Member
  • ****
  • Posts: 412
    • View Profile
    • Psyklax Translations
Re: Is translating a game really this much work?
« Reply #1 on: August 25, 2017, 12:48:36 am »
(hamburger face is ツ 8356, TV aerial on roof is イ 8343, etc.)

:laugh:

You really don't know Japanese, do you? :) Translating a game from Japanese without being able to read Japanese will be a challenge for sure, but the fact that the text is already in Shift-JIS will make things much easier. The reason the addresses in RAM and the file are similar is probably because the file is loaded into RAM at some position. The fact that you've got text on the screen is very positive. With a little work it should be easy to figure out how the strings are stored.
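If the file really is copied into RAM as one block, converting between the two kinds of address is a single subtraction. Here is a rough sketch of that arithmetic; the addresses are invented for illustration and are not taken from the game, and the function names are hypothetical.

```python
def load_base(ram_addr: int, file_offset: int) -> int:
    """Work out where the file was loaded, from one string found in both places."""
    return ram_addr - file_offset


def ram_to_file(ram_addr: int, base: int) -> int:
    """Convert a RAM address into an offset inside the file on the disc."""
    return ram_addr - base


def file_to_ram(file_offset: int, base: int) -> int:
    """Convert an offset inside the file into the RAM address it ends up at."""
    return file_offset + base


# Hypothetical pair: the same string seen in the emulator and in a hex editor.
base = load_base(ram_addr=0x0604A2C0, file_offset=0x0003E2C0)

if __name__ == "__main__":
    print(hex(base))                           # load address of the file
    print(hex(ram_to_file(0x0604B000, base)))  # where to patch in the file
```

When the load address is a round number such as a multiple of 0x1000, the low-order digits of the RAM address and the file offset are identical, which would explain the matching endings in the spreadsheet.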

I'd recommend getting someone who actually knows Japanese to help out with this, though. I'd ask if there was a lot of text in the game, but since you've never played it I guess you're not sure. I have no experience of translating 32-bit games, but if it's like this then it'll be easier than older games. :)

Zoinkity

  • Hero Member
  • *****
  • Posts: 557
    • View Profile
Re: Is translating a game really this much work?
« Reply #2 on: August 25, 2017, 11:15:11 am »
The thing about script extractors is that they can only magically rip (or reinsert) text because somebody already went through by hand and worked out where it all is.  I feel your pain ;*)  Takes forever doing N64 titles; Saturn's about the same scope.

Depending how the game is structured there may be a table of different text banks and whatnot, or it could be a Bad Situation where every pointer is hardcoded in ASM.  In your case each file probably contains its own relevant text with maybe a generic bank someplace too.  Don't rule out some elements actually being images.

Gideon Zhi

  • IRC Staff
  • Hero Member
  • *****
  • Posts: 3389
    • View Profile
    • Aeon Genesis
Re: Is translating a game really this much work?
« Reply #3 on: August 25, 2017, 11:53:00 am »
"Is it really that much work?"

If it's done right, it's *more* than that much work. It isn't just about dropping in English text over the Japanese, it's about understanding how the game stores its strings anywhere it stores its strings (often, there are a multitude of formats in any given game), how it prints its text anywhere it prints its text (again, multitude), and rewriting everything (text, fonts, windows, interfaces, text placement, text speed, etc) not only so it's in English, but so it looks like it was never in Japanese in the first place. The more interface types a game has, generally, the more difficult it becomes to translate well. This, not just text volume, is one of the major reasons RPGs are so difficult to translate properly. Consider how many different interfaces, say, FF5 has. There's regular dialog text, there's shop windows, yes/no prompts, main menu, class screens, inventories, equipment screens, options, status screens, magic screens, save screens, load screens, the combat interface, the battle menu, item/magic/skill screens in combat... This might not even be an exhaustive list. There's a lot to get right, and getting it right in such a fashion that people can't tell that you hacked it in the first place? That's hard.


KingMike

  • Forum Moderator
  • Hero Member
  • *****
  • Posts: 6174
  • *sigh* A changed avatar. Big deal.
    • View Profile
Re: Is translating a game really this much work?
« Reply #4 on: August 25, 2017, 12:10:47 pm »
I'd recommend getting someone who actually knows Japanese to help out with this, though. I'd ask if there was a lot of text in the game, but since you've never played it I guess you're not sure. I have no experience of translating 32-bit games, but if it's like this then it'll be easier than older games. :)
It's a Koei simulation game, so there's probably TONS of text. :)
Quote
Sir Howard Stringer, chief executive of Sony, on Christmas sales of the PS3:
"It's a little fortuitous that the Wii is running out of hardware."

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Is translating a game really this much work?
« Reply #5 on: August 25, 2017, 01:04:18 pm »
The reason the addresses in RAM and the file are similar is probably because the file is loaded into RAM at some position. [...] With a little work it should be easy to figure out how the strings are stored.
Oh~, I did not know this. In fact, the file containing the text strings on the CD is already loaded into the emulator's RAM as the intro FMV is playing. So eager~...

Thank you for your advice!

I thought more about being able to extract the text strings automatically.

This would be so much easier if:
 a.) The memory editor in Yabause, and HxD, would display the same damn characters for the bytes, instead of different ones like in the image!
 b.) My computer could display the bytes in the same Shift-JIS font as the game!


Bigger version: https://i.imgur.com/qbMICVV.png

You can see in the image that there are clear separations of text strings with '00' bytes.

Of course, we can still learn a lot from this.

Shift-JIS's 'blank character' code is 0020, not 0000, so these are not spaces. The byte '00' suggests they are more like some sort of 'null' value, not intended for public exhibition.

Notice how there is always a minimum of two '00' byte characters together.

This suggests that the value '00 00' is actually two bytes. But~, we must not assume! Because '00' is so non-descript, it's impossible to tell whether '00 00' is two sets of zero-page '00' in a row, or one set of '00 00'.

Sometimes there are three or four '00' bytes in succession. Perhaps, the developers left these superfluous null characters in the data, in case they feared they would need the extra characters after proof-reading the game and tweaking sentences? Or perhaps it was a requirement - or maybe even an optimization - to have strings aligned in memory or files this way, with bytes to space them out...

Regardless, there's a distinct pattern! With more study, I think maybe it will start to make sense, and perhaps I can begin to automate the process. It has only been two days, after all.

Depending how the game is structured there may be a table of different text banks and whatnot, or it could be a Bad Situation where every pointer is hardcoded in ASM.  In your case each file probably contains its own relevant text with maybe a generic bank someplace too.  Don't rule out some elements actually being images.
Although it is hard, I feel a bit lucky that the game is not something like a Visual Novel, that is very text-heavy. With a simulation-type game, a lot of the gameplay is procedurally-generated/very non-linear, so there is probably not as much text to translate.

Thank you for your reply!

"Is it really that much work?"
Oh my God, I think I want to give-up already, after reading that! Doing all that work for something like a Final Fantasy game would take more than a year, probably~... :-\.

My lack of Japanese understanding is more of a problem than I anticipated, yesterday. I attempted to play the actual game, but could not figure out how to start a new game. I got stuck in a loop choosing my company's logo colours, and I couldn't understand the woman at the bottom, with the elaborate knot in her neck vest thing, because she only speaks Japanese.

goldenband

  • Full Member
  • ***
  • Posts: 245
    • View Profile
Re: Is translating a game really this much work?
« Reply #6 on: August 25, 2017, 04:13:11 pm »
This game is basically Aerobiz 3, right? So at least there's an existing template for a lot of the localization choices.

KingMike

  • Forum Moderator
  • Hero Member
  • *****
  • Posts: 6174
  • *sigh* A changed avatar. Big deal.
    • View Profile
Re: Is translating a game really this much work?
« Reply #7 on: August 25, 2017, 09:53:03 pm »

Shift-JIS's 'blank character' code is 0020, not 0000, so these are not spaces. The byte '00' suggests they are more like some sort of 'null' value, not intended for public exhibition.

Technically isn't Shift-JIS 'blank' just 20? (as in, backwards compatible with ASCII. A character below 80 is a single-byte ASCII and 80+ indicates both a Japanese character as well as the first byte of which character?)
However of course, games might not adhere to that and expect 2 bytes per character.
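To make the single-byte versus double-byte rule concrete, here is a small sketch of how a Shift-JIS byte stream can be walked one character at a time. It assumes standard Shift-JIS, where lead bytes fall in 0x81-0x9F and 0xE0-0xFC, and everything else (ASCII, plus the half-width katakana in 0xA1-0xDF) is a single byte; a game with a custom encoding would need a different rule. The function names are made up for this example.

```python
from typing import List


def is_lead_byte(b: int) -> bool:
    """True if this byte starts a two-byte Shift-JIS character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_sjis(data: bytes) -> List[bytes]:
    """Split a byte string into one-byte and two-byte Shift-JIS characters."""
    chars = []
    i = 0
    while i < len(data):
        # A lead byte consumes the following byte too, if there is one.
        width = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + width])
        i += width
    return chars


if __name__ == "__main__":
    # 83 43 is a two-byte katakana, 20 is a space, 41 is 'A', 81 48 is a question mark.
    for piece in split_sjis(bytes.fromhex("8343 20 41 8148")):
        print(piece.hex().upper())
```

Because trail bytes never fall below 0x40, a 00, 0A or 20 byte can never be the second half of a double-byte character, so those values are always safe to treat as their single-byte meaning.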

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Is translating a game really this much work?
« Reply #8 on: August 26, 2017, 12:17:29 am »
Technically isn't Shift-JIS 'blank' just 20? (as in, backwards compatible with ASCII. A character below 80 is a single-byte ASCII and 80+ indicates both a Japanese character as well as the first byte of which character?)
However of course, games might not adhere to that and expect 2 bytes per character.
Ah~, I see! I'm pretty slow at figuring this stuff out: it wasn't until I asked Google that I learned what all the '0A' bytes prepending or appending some strings meant. Of course, it's a newline :-[.

So far I've only located eight strings, which is a really poor effort for two days of work. And the last two of those were done by manually matching the squiggle I can see in the emulator to a table of Shift-JIS characters, which is very tedious.

A new approach is surely required.

I looked at the characters surrounding the strings I'd located in RAM/CD file, to try and notice a pattern. '00' often surrounds strings; sometimes '0020' (or '20', as I've now learnt ;)) is used, meaning a space; it's not unusual to see an '8148' for a question mark, which is also a good way to decide where a string should end; and sometimes there are unexplained characters that could start or terminate a string for many reasons, as the users helping me earlier have mentioned :).

We can now begin to automate. So I wrote a program, which takes the binary file on the Saturn's CD, and breaks apart the stream of bytes by the characters I mention above - 00s, 20s, 8148s, etc.
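A tool of this kind could look roughly like the sketch below. This is not the actual program from this post; it is a simplified illustration that splits only on 00 bytes, assumes standard Shift-JIS, and keeps a chunk only if it decodes cleanly into printable text. The name `extract_strings` and the `min_chars` threshold are invented for the example.

```python
from typing import Iterator, Tuple

# Characters allowed in addition to printable ones: newline and full-width space.
EXTRA_ALLOWED = "\n\u3000"


def extract_strings(data: bytes, min_chars: int = 2) -> Iterator[Tuple[int, str]]:
    """Yield (file offset, text) for each 00-separated chunk that looks like text."""
    offset = 0
    for chunk in data.split(b"\x00"):
        start = offset
        offset += len(chunk) + 1  # step past the chunk and its 00 separator
        if not chunk:
            continue
        try:
            text = chunk.decode("shift_jis")
        except UnicodeDecodeError:
            continue  # not valid Shift-JIS, so probably code or other data
        if len(text) >= min_chars and all(
            ch.isprintable() or ch in EXTRA_ALLOWED for ch in text
        ):
            yield start, text


# Tiny made-up sample: a 3-character string, a 1-character string, and junk.
sample = (
    "テスト".encode("shift_jis") + b"\x00\x00"
    + "イ".encode("shift_jis") + b"\x00"
    + b"\x81\x00"
)

if __name__ == "__main__":
    for start, text in extract_strings(sample):
        print(hex(start), len(text))
```

A filter like this will reject any chunk that starts with control bytes, so strings with extra binary data in front of them need separate handling.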

You can see here, that it works very nicely, because the green boxes show where it has detected a string which is one of the few that I can verify is correct:


Bigger version: http://imgur.com/oWqL8l4

But the seventh string, highlighted in my spreadsheet, has the bytes '06 01 21 E4' before the string shown in the game UI. So how would the extraction tool know where the string begins, if I can't tell the tool whether or not it should separate lines on those bytes? Like below:


http://imgur.com/lLtanf8

Because the extraction tool detects the character(s) to separate lines on, it is only as smart as I am, and that is not that smart, so this tool seems really not that useful.

Of the eight strings I've manually found, three of them have random bytes pre-pending the actual string (that is shown in the game's UI).

Like this: E5 32
And this: 06 03 29 DC
And this one I mentioned: 06 01 21 E4

What the heck do they mean? I can't detect any kind of significance or pattern to these random bytes, so the extraction tool is useless! Blast it~...  ::)

August 27, 2017, 04:04:09 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
After taking a day-off from this, and perhaps thinking it was time to stop, I took a closer look at the strange bytes that prepend the genuine byte-strings, causing my extraction tool not to be useful.

So, it goes like this: 00/0A --> weird bytes --> genuine string --> 00/0A.

These are the bytes that precede three genuine strings:


It's strange how there's a pattern to them. The byte '06' is often seen, with either '01', '03', or '05' directly after it. Followed by two random bytes. And then the '060N' appears again.

I decided to fiddle with one of the three, repeatedly, in the emulator, to observe possible changes that might indicate what the bytes do. Of course, I had to document the 'experiment':

Bigger version: https://imgur.com/a/rfsPv

Not very useful information. In fact, I would even say, this experiment was a complete waste of time.

But, you know what just dawned on me? Those bytes before the genuine strings look like they could be memory addresses.

All genuine strings I have found so far, start at '06' in the game's RAM; the furthest genuine string I have found in RAM, so far, is '0641'.

So, I think that these patterns of bytes going '0605NNNN0605NNNN', '0601NNNN0601NNNN0601NNNN0601NNNN0603NNNN', and '0601NNNN0601NNNN' are pointing to locations in the game's memory ::).
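One way to test the pointer idea is to scan the file for 4-byte values that fall inside the RAM range where strings have been seen. The sketch below assumes the values are stored big-endian on 4-byte boundaries and that the interesting range is 0x06000000 up to 0x06100000; both are assumptions to check against the real data, and `find_pointers` is a made-up name.

```python
import struct
from typing import List, Tuple

# Assumed address range for the RAM region the strings were found in.
RAM_START = 0x06000000
RAM_END = 0x06100000


def find_pointers(data: bytes, lo: int = RAM_START, hi: int = RAM_END) -> List[Tuple[int, int]]:
    """Return (file offset, value) for each aligned 32-bit value within [lo, hi)."""
    hits = []
    for off in range(0, len(data) - 3, 4):
        (value,) = struct.unpack_from(">I", data, off)  # big-endian unsigned 32-bit
        if lo <= value < hi:
            hits.append((off, value))
    return hits


# Two of the byte groups seen before strings, followed by unrelated bytes.
blob = bytes.fromhex("060121E4 060329DC 83430000")

if __name__ == "__main__":
    for off, value in find_pointers(blob):
        print(f"offset {off:#06x}: points to {value:#010x}")
```

If most hits land exactly on the start of a known string once converted to a file offset, that would be good evidence that these really are pointers.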

What do the two-byte pairs after the addresses do? I do not know. And why are these address pointers placed directly before a real string? I also do not know.

BUT~ :happy:

A quick Google has revealed that this is not unusual, and that there is information from other people on the Internet and this forum on this, so further reading will shed light on this matter for me, I bet :).
« Last Edit: August 27, 2017, 04:04:09 pm by LapFeaturingMistake »

MarkAss

  • Jr. Member
  • **
  • Posts: 62
    • View Profile
Re: Is translating a game really this much work?
« Reply #9 on: September 04, 2017, 05:31:56 pm »
My experience with game memory is that each memory address is 4 bytes long. That is why sometimes you get up to 4 sets of "00" after you see text.

Since "00" will end the sentences, and you want the next line of text to start in a new memory address, you fill in the remaining bytes with "00". You get 4 sets of "00" when the text ends exactly in a multiple of 4.
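The padding rule described above can be written out as a few lines of arithmetic. This is only a sketch of that rule, assuming each string starts on a 4-byte boundary and always gets at least one 00 terminator; `pad_string` is a hypothetical helper, not something from the game.

```python
def pad_string(raw: bytes, align: int = 4) -> bytes:
    """Append a 00 terminator, then 00 padding up to the next multiple of align."""
    terminated = raw + b"\x00"
    padding = -len(terminated) % align  # bytes needed to reach the boundary
    return terminated + b"\x00" * padding


if __name__ == "__main__":
    for raw in (b"ABC", b"ABCD", b"\x83\x43"):
        padded = pad_string(raw)
        zeros = len(padded) - len(raw)
        print(f"{len(raw)} text bytes -> {zeros} zero bytes")
```

Under these assumptions, text made only of two-byte characters has an even length, so it would always be followed by either two or four 00 bytes, which fits the "minimum of two" observed earlier in the thread.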

Also, older games remap foreign characters in Shift-JIS over English characters, since the English alphabet is one byte per character and most foreign characters are 2 bytes long.

So mapping over English alphabet would take up 1/2 the space.
The only way to see it correctly would be to create a custom table with results the game displays.

It is a lot of work to translate games!
« Last Edit: September 04, 2017, 05:40:48 pm by MarkAss »

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Is translating a game really this much work?
« Reply #10 on: September 04, 2017, 05:44:48 pm »
Since "00" will end the sentences, and you want the next line of text to start in a new memory address, you fill in the remaining bytes with "00". You get 4 sets of "00" when the text ends exactly in a multiple of 4.

^ This guy. Everyone should listen to this guy. He does not lie.

Thank you, Mark, although instead of '00' I used the space character ('20') instead :-\. Well, as long as it works, I'm sure it'll be fine?

Although I have not posted for a while, I have made quite good progress with my translation. I also tried out 'Mednafen', which is very good, and its memory editor can display strings in Shift-JIS! I was so impressed, I copied the feature for my own tool, which is proving quite useful:


Bigger version: https://imgur.com/ZtQbDw5

However~, there is still a long way to go, and a lot of opportunities for the overall task to stall and remain unfinished.

BlackDog61

  • Hero Member
  • *****
  • Posts: 775
    • View Profile
    • Super Robot Wars A Portable translation thread
Re: Is translating a game really this much work?
« Reply #11 on: September 20, 2017, 06:35:25 pm »
However~, there is still a long way to go, and a lot of opportunities for the overall task to stall and remain unfinished.
Sure. But isn't it also what gets you excited and interested in the first place?
Keep learning - that's the best advice possible.
Don't hesitate to read theoretical stuff about the things you're playing with. S-JIS is described if you look for it on the internet, for instance.

I've written it a few times, but you may find that the "SJIS dump" is useful for this exploration phase, at least for the part where you say you're not sure you've found (or you can find) all text.

filler

  • RHDN Patreon Supporter!
  • Hero Member
  • *****
  • Posts: 606
  • "WINNERS DON'T SELL REPROS"
    • View Profile
    • Filler's Translation Projects
Re: Is translating a game really this much work?
« Reply #12 on: September 20, 2017, 06:54:16 pm »
Don't hesitate to read theoretical stuff about the things you're playing with. S-JIS is described if you look for it on the internet, for instance.

Not sure if this is relevant but you can find table files for S-JIS. They are probably on the site somewhere, but here's a link to mine in case it helps. It's useful if stuff happens to be encoded in S-JIS, though that's never the case for 8/16-bit games.

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Is translating a game really this much work?
« Reply #13 on: September 23, 2017, 11:55:03 am »
Sure. But isn't it also what gets you excited and interested in the first place?
Yeah~, I already lost interest in this project. Failed already. What a disappointment. I got distracted about two weeks ago, and started writing a fun plugin for MusicBee :-\.

Regarding translating Air Management '96:

I did all the tricky technical bits (I nearly gave up trying to locate some of the half-width hankaku characters, before eventually finding them), and have - I think - at least two-thirds of the Japanese text in the game translated into rough-and-ready English.

I spent quite a few days playing the game, and fixing-up any sloppy English I came across, but there is still a lot of text that needs reworking to a professional standard, and bits of Japanese text still to locate (the smaller the word (i.e. one or two characters) the harder it seems to be to locate).

I really should make an effort to go back and finish this thing off :D; it's just that the work is boring, now. Although I don't consider myself very technically proficient or smart, I'm wondering if the challenge of making the translation possible was the interesting bit for me, and now that the concept has been proven, it has lost its allure :huh:.

By the way:

Tracking down hangs/console crashes is really annoying. When I don't know where in the data the bug is coming from (because I overwrote some memory address pointer), I use a half-and-half approach to isolate the issue by overwriting some of my changes with the original data, like this:

Round One:
0x0000 - 0x9FFF (first half of data)
0xA000 - 0xFFFF (second half of data) // it's in this one

Round Two:
0xA000 - 0xCFFF (first half of the half)
0xD000 - 0xFFFF (second half of the half) // it's in this one

Round Three (time for the knockout):
0xD000 - 0xECFF // it's in this one
0xED00 - 0xFFFF

Very frustrating, because one single 'hang' takes about twenty minutes to track down... :banghead:
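The half-and-half rounds above are a binary search, and the bookkeeping can be scripted even if the actual test still has to be done by hand in the emulator. The sketch below assumes the original and patched files are the same length and that a single changed region causes the hang; `still_hangs` is a placeholder for whatever check you use (for example, writing the trial file into the disc image, booting it, and answering yes or no).

```python
from typing import Callable, Tuple


def find_bad_region(
    original: bytes,
    patched: bytes,
    still_hangs: Callable[[bytes], bool],
    min_size: int = 0x100,
) -> Tuple[int, int]:
    """Narrow down the [start, end) range of patched data that causes a hang."""
    if len(original) != len(patched):
        raise ValueError("files must be the same length")
    lo, hi = 0, len(patched)
    while hi - lo > min_size:
        mid = (lo + hi) // 2
        # Revert the upper half of the suspect range to the original data.
        trial = bytearray(patched)
        trial[mid:hi] = original[mid:hi]
        if still_hangs(bytes(trial)):
            hi = mid  # culprit is in the half that is still patched
        else:
            lo = mid  # culprit was in the half just reverted
    return lo, hi


if __name__ == "__main__":
    # Simulated run: pretend a bad edit at offset 0xE123 causes the hang.
    original = bytes(0x10000)
    patched = bytearray(original)
    patched[0x0010] = 0x01  # harmless edit
    patched[0xE123] = 0xFF  # harmful edit
    lo, hi = find_bad_region(original, bytes(patched), lambda img: img[0xE123] != 0)
    print(hex(lo), hex(hi))
```

Each round halves the suspect range, so a 64 KB file gets down to a 256-byte window in eight tests.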

goldenband

  • Full Member
  • ***
  • Posts: 245
    • View Profile
Re: Is translating a game really this much work?
« Reply #14 on: November 02, 2017, 04:22:26 pm »
Just saw this:

Regarding translating Air Management '96:

I did all the tricky technical bits (I nearly gave up trying to locate some of the half-width hankaku characters, before eventually finding them), and have - I think - at least two-thirds of the Japanese text in the game translated into rough-and-ready English.

I spent quite a few days playing the game, and fixing-up any sloppy English I came across, but there is still a lot of text that needs reworking to a professional standard, and bits of Japanese text still to locate (the smaller the word (i.e. one or two characters) the harder it seems to be to locate).

I really should make an effort to go back and finish this thing off :D; it's just that the work is boring, now. Although I don't consider myself very technically proficient or smart, I'm wondering if the challenge of making the translation possible was the interesting bit for me, and now that the concept has been proven, it has lost its allure :huh:.

Sorry to hear you've lost interest. Perhaps ROM hacking isn't for you, since it requires enough persistence to get to the finish line even when the work is boring or tedious.

(Then again, I'm not aware of any field of human endeavor in which you can succeed without a willingness to suffer through tedium; maybe it's just a question of what kinds of tedium you can tolerate vs. what kinds you can't.)

In any event, could you upload the final version of your work if you're calling it quits on this?

LapFeaturingMistake

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Is translating a game really this much work?
« Reply #15 on: November 02, 2017, 04:49:19 pm »
Hellooo~.

Perhaps ROM hacking isn't for you, since it requires enough persistence to get to the finish line even when the work is boring or tedious.

(Then again, I'm not aware of any field of human endeavor in which you can succeed without a willingness to suffer through tedium; maybe it's just a question of what kinds of tedium you can tolerate vs. what kinds you can't.)
EDIT - I had the wrong impression in my original version of this reply, sorry. I did go back to this and worked through earlier problems, but had to give-up when I got stuck at a technical problem.

In any event, could you upload the final version of your work if you're calling it quits on this?

No problem, it's here. :D

goldenband

  • Full Member
  • ***
  • Posts: 245
    • View Profile
Re: Is translating a game really this much work?
« Reply #16 on: November 02, 2017, 08:01:42 pm »
^Thanks for that!

And my apologies if my reply initially sounded snarky; it wasn't meant to be. It's just a basic reality of life, I think, that when we're at home in a particular field, it means -- almost by definition -- that we find the rewards outweigh the tedium.

I'm sure you've got interests where you find it easier to get motivated to plough through the dull bits. :) But then again I can't help but think that, having gotten so far, you probably would've stuck with it if you hadn't gotten blocked by a tech problem.

(At least, that's the way it is with me -- when I understand what's going on and know what to do or how to fix it, I usually have no problem going all the way to the end of a project. It's when I have no idea what to do that I tend to give up.)

edale

  • Jr. Member
  • **
  • Posts: 44
    • View Profile
Re: Is translating a game really this much work?
« Reply #17 on: November 04, 2017, 12:30:55 am »
Initially, I thought the best way to produce the finished product would be to copy all the files off the game CD, then burn them back to a disc image, but it seems it is not so simple; perhaps the Saturn requires a specific disc size, or some such?
Without a modchip, you will NEVER get a burned disk to play on a Sega Saturn.

That distinct pattern on the edge of the disk creates a specific wobble as the disk spins, which the system checks for; there is no way to reproduce this that I'm aware of.

For unmodded Saturn owners, the only way to play mods is through emulators.

Lots of interesting info about the Sega Saturn:
https://www.youtube.com/watch?v=jOyfZex7B3E

*Insert obligatory Segata Sanshiro reference here*