

Japanese text problems

Started by gloomspirit, February 09, 2014, 09:00:02 PM



I've decided to translate this Japanese music game for PSP since it seemed like an easier project to begin with (I want to translate visual novels in the future  :laugh:). I was trying to translate the part where some game statistics are shown, for example: Play Time, Song Completion, etc. I've found that most of these are in the eboot.bin file. The thing is that, unlike normal English text in which each character is a single hex value (one byte), in the game each character (hiragana, katakana, kanji) takes two hex values (two bytes), so when I ran the game after the replacement not all the letters were shown. For example, I replaced 5 Japanese characters with "Play Time" and when I ran the game it showed "P a  i e" :(

I want to know what the solution to this problem is. I would like the answer to be precise, but it doesn't have to explain everything I have to do in detail. I just need a pointer to what I'm supposed to do. :)

Also, I've been reading some documents about game hacking and I've come up with a question, so if anyone can answer it I'd be grateful too. Oh, and if I use the wrong terms or mix up definitions, feel free to correct me.

Most of game hacking guides I've read talk about rom hacking but what I want to do is umm.."iso hacking". Rom hacking talks about tables and tiles so I want to know if these concepts remain in iso games.

P.S. I have some knowledge of Japanese, at least regarding hiragana and katakana. I also have some programming knowledge. :)

Thanks in advance.


It's because the game uses the Shift JIS (S-JIS) encoding, in which kana and kanji take two bytes each instead of one. There is a way to convert a game from 16-bit (2-byte) to 8-bit (1-byte) text, but it varies on a game-by-game basis. It's not very hard, but there's no set formula that works for all games, i.e. you have to work it out on your own.
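To make the byte counts concrete, here is a small Python sketch. It also shows one workaround that sometimes helps: Shift JIS contains full-width Latin letters that are two bytes each, so they fit a text routine that reads characters in pairs. Whether this game's font actually has glyphs for them is an assumption you would need to check, and `to_fullwidth` is just a helper name made up for this example.

```python
# Sketch: bytes per character in Shift JIS, plus a full-width Latin workaround.
# Assumption: the game reads text two bytes at a time and its font includes
# full-width Latin glyphs. Neither is confirmed for this particular game.

def to_fullwidth(text: str) -> str:
    """Map ASCII letters, digits and spaces to their full-width forms."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\u3000")               # ideographic (full-width) space
        elif ch.isascii() and ch.isalnum():
            out.append(chr(ord(ch) + 0xFEE0))  # 'A' (U+0041) -> U+FF21
        else:
            raise ValueError(f"no full-width mapping handled for {ch!r}")
    return "".join(out)

kana = "プレイ".encode("shift_jis")
ascii_bytes = "Play Time".encode("shift_jis")
wide_bytes = to_fullwidth("Play Time").encode("shift_jis")

print(len(kana))             # 3 characters -> 6 bytes
print(len(ascii_bytes))      # 9 characters -> 9 bytes
print(len(wide_bytes))       # 9 characters -> 18 bytes
print(wide_bytes[:2].hex())  # 826f, the full-width 'P'
```

Note that the full-width version needs twice as many bytes as the plain ASCII one, so it only fits in place when the original Japanese string was long enough.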

About ROM/ISO hacking: the name comes from cartridge-based games, which usually shipped on ROM (Read-Only Memory) chips. Don't mix that up with a "ROM image", which is a dump of the whole ROM to a single file.
An ISO is a disc image that contains a filesystem (ISO 9660), not just another dump "format". You can use the same basics on both cartridge-based systems and optical disc-based systems. The main difference is that (usually) a ROM image has no filesystem of its own (although there is a logical separation), while an ISO is a filesystem in itself. But there's no rule set in stone, because some ROMs do have filesystems (Nintendo DS for one).
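As a rough illustration of the "ISO is a filesystem" point: an ISO 9660 image carries a volume descriptor at sector 16 whose identifier is the ASCII string `CD001`, which a plain ROM dump would not normally have. The sketch below checks for it on synthetic data; `looks_like_iso9660` is a made-up helper name, and the fake images are stand-ins, not real dumps.

```python
# Sketch: detect an ISO 9660 primary volume descriptor in a disc image.
ISO_SECTOR_SIZE = 2048
PVD_OFFSET = 16 * ISO_SECTOR_SIZE  # volume descriptors start at sector 16

def looks_like_iso9660(image: bytes) -> bool:
    """Return True if a primary volume descriptor sits at sector 16."""
    header = image[PVD_OFFSET:PVD_OFFSET + 6]
    # Byte 0 is the descriptor type (1 = primary), bytes 1-5 are "CD001".
    return len(header) == 6 and header[0] == 1 and header[1:6] == b"CD001"

# Synthetic stand-ins: 16 empty sectors followed by a descriptor header,
# and a blob of zeros playing the part of a filesystem-less ROM image.
fake_iso = bytes(PVD_OFFSET) + b"\x01CD001\x01" + bytes(ISO_SECTOR_SIZE - 7)
fake_rom = bytes(PVD_OFFSET + ISO_SECTOR_SIZE)

print(looks_like_iso9660(fake_iso))  # True
print(looks_like_iso9660(fake_rom))  # False
```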

I tried to explain to the best of my ability, everyone is welcome to correct me.


Quote from: gloomspirit on February 09, 2014, 09:00:02 PM
Most of game hacking guides I've read talk about rom hacking but what I want to do is umm.."iso hacking". Rom hacking talks about tables and tiles so I want to know if these concepts remain in iso games.
Well, it's good that you use the word "concepts"; that's the right question to ask. All the basic underlying principles are the same because computers are computers. The tougher elements of hacking come in because specifics can get pretty ugly, and even if you understand all the principles involved, it can still take a bit of legwork just to hash out exactly what's going on.

Thus, text tables are practically a universally applicable technique for any actual text; you can't encode text without some sort of coding (you might have completely image-based text, but to a computer, that's not really text), even if the text is ultimately stored as some garbage-looking compressed stuff. Readable text strings don't happen without encodings.

Tiles, on the other hand, are an artifact of what techniques were practical for making good-looking realtime graphics on a lot of 80s/90s tech. You can learn quite a bit about graphics from studying how tiles work, and it's a useful technique in a game-design sense to this day (Skyrim uses a lot of squarish tiles so they can build dungeons quickly, but then they add a bunch of other stuff to make it feel natural). But it's probably not terribly relevant to most VNs, except maybe in their font rendering. That is, yes, you should know about tiles, but no, you're probably not hunting for them right here.

If you're lucky, you can find the right information or problem-solving technique to make your modifications with enough detective work (specific information about the hardware and operating environment of your game is a good start). Unfortunately, that's not usually enough unless someone has already gone to the trouble of documenting the exact game or some related software the game uses; sometimes you have to construct the solution piece by piece, or even ask for directions so you know where to look. One of the more common hacking problems for people new to it seems to be in finding documentation that is marginally related and then making it somehow work for their project based on the more general principles. It's like putting a square peg in a round hole, except you don't have a lot of experience at whittling or at making tortured similes. So don't be afraid to explore a bit, and let's hope you can stick with it! :thumbsup:


First of all thanks for the replies. :)

Second, I've done some searching on 16-bit to 8-bit conversion, but all I found is that apparently I will have to do some assembly coding. At this point I've thought of two possible solutions:
1. Do the 16-bit to 8-bit conversion. My question is: if I make this conversion, will I have to translate every single part of the game that is in Japanese? There are a few things that wouldn't need to be translated.
2. Do pointer stuff so that English text, which in most cases is longer than the original text, can be inserted. I think that even if I choose the first option I will have to use pointers at some point, but this second option is just about using pointers without the bit conversion. What worries me a little about this second option is the space it takes; I don't want to increase the game size too much.
I'd like to know which option is more convenient, or, if there is no difference, maybe some pros and cons could help me decide.
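For reference, the pointer idea in option 2 can be sketched in Python as below. This is only a toy model: it assumes the pointer is a plain 32-bit little-endian file offset and that strings end with a 0x00 byte, which may not match how this game's eboot.bin refers to its text. `repoint` and the offsets are hypothetical.

```python
import struct

def repoint(data: bytearray, pointer_offset: int, new_text: bytes, free_offset: int) -> None:
    """Write new_text (plus a 0x00 terminator) into free space and update the pointer."""
    end = free_offset + len(new_text) + 1
    if end > len(data):
        raise ValueError("not enough room at free_offset")
    data[free_offset:end] = new_text + b"\x00"
    # Assumption: the pointer is stored as a 32-bit little-endian file offset.
    struct.pack_into("<I", data, pointer_offset, free_offset)

# Toy file: a 4-byte pointer at 0x00 pointing to "Hi\0" at 0x04, then unused bytes.
blob = bytearray(struct.pack("<I", 4) + b"Hi\x00" + bytes(16))
size_before = len(blob)

repoint(blob, pointer_offset=0, new_text=b"Play Time", free_offset=7)

print(struct.unpack_from("<I", blob, 0)[0])  # 7
print(len(blob) == size_before)              # True: the file did not grow
```

The longer string lands in bytes that were already unused, so the file stays the same size; the size only becomes a concern once the free space runs out.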

Well, now I have two more questions (yeah, doubts kept coming all night XD). Anyway, the first question: is it really necessary to make a table? I've seen a page where a person was translating some in-game English text of a Japanese PSX game and used savestates to locate the text. In the savestate the text could be found easily, but in the game files it couldn't, so he used the emulator to debug the game and check the memory and registers. There were a lot of assembly lines, and he finally realized the text was being displayed with some kind of loop, which helped him locate the text in the game. I've been trying this method and, sure enough, the text shows up with no problems in the savestate. I'm wondering whether I should follow this method or go with the tables, since I think that's kind of easier.
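Either way, locating text in a decrypted file or a savestate dump boils down to searching for the encoded bytes. A minimal Python sketch, assuming the text is stored as plain Shift JIS (it will find nothing if the game compresses or otherwise scrambles its strings); `find_all` and the sample dump are made up for illustration:

```python
def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets

# Synthetic stand-in for a memory dump containing the same string twice.
needle = "プレイ".encode("shift_jis")
dump = b"\x00" * 5 + needle + b"\xff" * 3 + needle

print(find_all(dump, needle))  # [5, 14]
```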

The second question is about the patch. I know there is still a long way to go until then, but I want to be sure there won't be any problems when that time comes. I haven't done any searching on how a program creates a patch. I just think that, umm... maybe you give it the original game and the modified game as parameters, the program finds the differences, and the patch is made. Anyhow, I'm just concerned about files like eboot.bin. I've decrypted it, but in the game it is encrypted and I've read there is no way to encrypt it back, so is that going to be a problem in the future? Also, please tell me about anything else (if there is anything) I should keep in mind so I won't have many problems later on.

Man, I feel like a kid asking about everything that comes into his mind XD. But I feel extremely happy when I find/read the answer I've been looking for, so thanks in advance to everyone who replies. Oh, and don't be afraid of making long replies  :)


Of course you have to make a table. It's the first step needed to read the Japanese text. I think you've mixed it up with finding the font, which is something different. He was probably using savestates because the font is sometimes encrypted inside the ROM, so the only way he found to look at the font was to make a savestate (which is a dump of memory at a particular moment), because the game decrypts the data before using it. A table maps a hex value to a letter (like B9=A, BA=B, and so on), while a font is a graphical representation of those letters.
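A table in that sense is easy to model in code. Here is a minimal Python sketch that parses `HEX=char` lines and uses them to decode raw bytes; the entries are just the example values above, not the real table for any game, and `parse_table` and `decode` are names invented for this sketch.

```python
def parse_table(text: str) -> dict:
    """Parse lines of the form HEX=char into a {bytes: str} mapping."""
    table = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        hex_part, char = line.split("=", 1)
        table[bytes.fromhex(hex_part)] = char
    return table

def decode(data: bytes, table: dict) -> str:
    """Decode data with the table, trying the longest entries first."""
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(data):
        for size in range(max_len, 0, -1):
            chunk = data[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            out.append(f"[{data[i]:02X}]")  # unknown byte, shown as hex
            i += 1
    return "".join(out)

# Hypothetical table entries, taken from the example values in the post.
table = parse_table("B9=A\nBA=B\nBB=C")
print(decode(b"\xba\xb9\xbb\x00", table))  # BAC[00]
```

Because keys are byte strings of any length, the same code handles two-byte entries such as `8260=A` once you add them.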

Yeah, to convert from 16-bit to 8-bit text you need to do some ASM hacking.

About the patch, it's not really rocket science. A patching tool compares two files and records all the differences. For disc-based games, you can also code your own patcher, which changes only the files that need it.
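The "compare two files and record the differences" idea can be sketched in a few lines of Python. This is a simplified model that assumes both files have the same length; real patch formats such as IPS or xdelta carry the same kind of information with more bookkeeping. `make_patch` and `apply_patch` are names invented for this example.

```python
def make_patch(original: bytes, modified: bytes) -> list:
    """Record each run of changed bytes as an (offset, new_bytes) pair."""
    if len(original) != len(modified):
        raise ValueError("this sketch only handles files of equal length")
    patch = []
    i = 0
    n = len(original)
    while i < n:
        if original[i] != modified[i]:
            start = i
            while i < n and original[i] != modified[i]:
                i += 1
            patch.append((start, modified[start:i]))
        else:
            i += 1
    return patch

def apply_patch(original: bytes, patch: list) -> bytes:
    """Rebuild the modified file from the original plus the patch."""
    out = bytearray(original)
    for offset, data in patch:
        out[offset:offset + len(data)] = data
    return bytes(out)

old = b"PLAY TIME: 00"
new = b"Play Time: 00"
patch = make_patch(old, new)

print(patch)                           # [(1, b'lay'), (6, b'ime')]
print(apply_patch(old, patch) == new)  # True
```

The patch holds only the changed bytes, so you distribute that instead of the game files themselves.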


Many games include basic Latin in their fonts. If the game doesn't, then you need to change the font so that it does. None of this requires changing the character encoding. You just need to learn how the encoding the game uses works. That should be simpler than replacing it.