
Author Topic: Japanese Character Recognition Help.  (Read 5304 times)

3ukalipto

  • Newbie
  • *
  • Posts: 3
Japanese Character Recognition Help.
« on: March 04, 2011, 01:27:12 am »
Hello and good afternoon to anyone who has entered this thread.

First of all, let me introduce myself, for I am quite new to the ROM hacking scene. My name is Edgar, and I enjoy playing Japanese games too much. The only downside is that some of the best Japanese games are released only with their native country in mind.

I became VERY interested in the translation hacking process and was a bit overwhelmed by the time it takes to complete a single translation.
I have no experience at hacking; the past few days I've been studying binary and hexadecimal code. I can tell you I'm no wizard at those two codes, but I've learned the basics already. I'm not good at converting one to another and to decimals or text, but I can do it.

I'm most interested in translating Japanese PS2 ISOs. Yes... too complex for a novice, I know. But hey, it does not cost me much to give it a long and painstaking try, right?

I know it is almost impossible for a person of my experience and background (almost none at hacking) to translate a PS2 ISO from Japanese to English or Spanish (I also cannot understand Japanese, but I've been studying it as well these past few days).

Well; enough of almost functionless preambles...

I'm interested in translating these games ( :-\) :

Kengo 3
Hissatsu Ura-Kagyou
Shinobido Takumi
Kamiwaza
Rurouni Kenshin: Enjou! Kyoto Rinne


...and; one that does not seem out of my reach at the time

David Douillet Judo (this game is in French, not Japanese... but it is a judo game!!!)



I have given a couple of tries at Kengo 3. I am trying to create a table for it; the only issue is that the game is possibly 99% in Japanese.
Well; I need the help of anyone who could understand my hand-drawn Japanese characters. I took the time to stare at my TV screen for about 40 minutes trying to get the characters right on a piece of paper; I then scanned it and processed it. And I bring it to you... I hope you can help me identify the characters and their "nature" (romaji, kanji, hiragana, katakana). I need to identify which system the game is using. Perhaps I am making a foolish assumption or doing something wrong. Feel free to point out my mistake.







Ryusui

  • Hero Member
  • *****
  • Posts: 4989
  • It's the greatest day.
Re: Japanese Character Recognition Help.
« Reply #1 on: March 04, 2011, 03:03:42 am »
Quote
Feel free to point out my mistake.

Lemme break this down for you.

Quote
I have given a couple of tries at Kengo 3. I am trying to create a table for it; the only issue is that the game is possibly 99% in Japanese.

Actually, that's a pretty big issue. The one you're trying to rectify, I assume?

Odds are you don't need to make a table. Most Japanese games for modern systems just use Shift-JIS encoding. Here's a ready-made Shift-JIS table you can use. Friendly word of advice: if you're going to be using it with WindHex, get JWPce and save it in Shift-JIS format. That's right: the Shift-JIS table isn't actually stored in Shift-JIS format. It's in UTF-8, which is more widely compatible, but WindHex doesn't like it.
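
For anyone who would rather script the re-encoding step than do it by hand in an editor, here is a minimal sketch. The function name and the file names are made up for illustration; the only thing taken from the post above is that the table ships as UTF-8 and the hex editor wants Shift-JIS. It uses Python's built-in `shift_jis` codec. If the table contains vendor extension characters, the `cp932` codec (Microsoft's Shift-JIS variant) may be the better assumption.

```python
from pathlib import Path


def convert_table_to_shift_jis(src: Path, dst: Path, codec: str = "shift_jis") -> None:
    """Re-encode a UTF-8 table file as Shift-JIS, keeping line endings as they are."""
    # Decode from raw bytes so that \r\n line endings are not silently rewritten.
    # "utf-8-sig" behaves like UTF-8 but also drops a leading byte-order mark.
    text = src.read_bytes().decode("utf-8-sig")
    # Strict encoding: a character that the codec cannot represent raises
    # UnicodeEncodeError instead of being replaced quietly.
    dst.write_bytes(text.encode(codec))


if __name__ == "__main__":
    # Hypothetical file names, used only for this example.
    convert_table_to_shift_jis(Path("sjis_utf8.tbl"), Path("sjis.tbl"))
```

The same codec works in the other direction when checking bytes seen in a hex editor: `bytes.fromhex("82A0").decode("shift_jis")` gives the hiragana あ.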

Quote
Well; I need the help of anyone who could understand my hand-drawn Japanese characters.

No you don't. You'll need a translator; eventually, anyway, but for now, trying to translate the game one handwritten screen at a time is counterproductive at best.

Quote
I took the time to stare at my TV screen for about 40 minutes trying to get the characters right on a piece of paper; I then scanned it and processed it. And I bring it to you...

As much as I respect the effort you've taken here (I thought it was authentic Japanese calligraphy until I read your post more closely), you're taking the wrong direction with this. Assuming you haven't done so already, you'll need a tool to break apart the ISO into its constituent files and reassemble them when you're done. If the game uses its own virtual file system (VFS), this will take the aid of an experienced hacker.

Quote
I hope you can help me identify the characters and their "nature" (romaji, kanji, hiragana, katakana). I need to identify which system the game is using. Perhaps I am making a foolish assumption or doing something wrong.

See this? This is romaji. That's right, it's the good old-fashioned Roman alphabet. That's literally what "romaji" means in the first place ("Roman characters", anyway).

As for the other three? Japanese uses a mix. Check out the following sentence:

Quote
俺の引いたカードは…封印されしエクゾディア!

That's kanji, hiragana and katakana, plus some good old-fashioned Western-style punctuation.
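
Once the text is on a computer rather than on paper, telling the three scripts apart can also be done mechanically. The sketch below is only an illustration (the function name is made up): it sorts a character into a script by its Unicode block, using the Hiragana block (U+3040 to U+309F), the Katakana block (U+30A0 to U+30FF) and the main CJK Unified Ideographs block (U+4E00 to U+9FFF). Rarer kanji in extension blocks and half-width katakana are deliberately ignored.

```python
def script_of(ch: str) -> str:
    """Classify a single character by the Unicode block it falls in."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    # Punctuation, Latin letters, digits, and anything outside the blocks above.
    return "other"


if __name__ == "__main__":
    for ch in "俺の引いたカードは…封印されしエクゾディア!":
        print(ch, script_of(ch))
```

Note that the long-vowel mark ー in カード sits in the Katakana block, so it is reported as katakana.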

Even if you ultimately decide to stay away from the front lines of the actual translation part of translation hacking, it's immensely useful to be able to at least identify the three types of writing on sight; preferably, you'll be able to read hiragana and katakana, possibly a few kanji as well.

You're not making "foolish assumptions"; you're just being terminally newbie. A lot of people come around here expecting easy answers and magical tools that take anything that looks like "work" out of the process. You, on the other hand, understand that this kind of undertaking takes effort - you've just misdirected it so far. Spend some time around here and we might make a decent translation hacker outta you. :3

For a start, it might pay to step back and take a look at a simpler project. Now, in some ways, games for more recent consoles are simpler than their ancestors - data gets separated into a proper file structure instead of tossed together in a ROM - but the problem is, the newer platforms aren't half as well-documented. The NES, SNES, Game Boy and GBA are all well-traveled and familiar ground; there are some hurdles involved, but they make a fine place to start.

Good luck!
In the event of a firestorm, the salad bar will remain open.

Auryn

  • Hero Member
  • *****
  • Posts: 649
Re: Japanese Character Recognition Help.
« Reply #2 on: March 04, 2011, 03:48:02 am »
Always the same with these newbies  :banghead:

Is there a chance you will translate one of those games you mentioned? No, not for at least 7 years.
By then you will have learned Japanese, PS2 emulators will run on a "normal" PC with all the extra gadgets like a disassembler and a tracer, and you will probably no longer have problems with binary and hex.

What you are asking is like asking whether a 1-year-old baby can drive a Ferrari!

First you were born, then you started crying and sleeping, then crawling around, then you learned to sit; one day you could stand up, and some days later you learned to walk; after that you learned to run. In the meantime you should have started speaking too, but that is not important here. Time passes, and you reach the age where you can try to ride a bicycle and then move on to a motorbike... Some time later you get your first car, and after 10 years of work without a holiday you can buy your Ferrari, but this doesn't mean you can drive it!!!

Translating a game is the same!!!!!

Somebody said in another post: if you have to ask how to translate a game, forget about it, because you will never do it!

Deets

  • Full Member
  • ***
  • Posts: 202
Re: Japanese Character Recognition Help.
« Reply #3 on: March 04, 2011, 03:37:44 pm »
Quote
俺の引いたカードは…封印されしエクゾディア!

That's kanji, hiragana and katakana, plus some good old-fashioned Western-style punctuation.
Oh, Ryusui. You and your Yu-Gi-Oh :3333

3ukalipto

  • Newbie
  • *
  • Posts: 3
Re: Japanese Character Recognition Help.
« Reply #4 on: March 04, 2011, 09:35:13 pm »

Hello; thank you very much, and sorry for the delay in replying. I just got back from school; it was a funny day. I ate something that wasn't meant to be eaten, spiced with the wrong stuff, and I guess I have been mildly intoxicated. Nothing too serious or worth worrying about.
I have been feeling weak since I ate it; I could not even form a strong fist, and I had some serious bowel movements along with a pulsing headache.

Enough of preambles...



Quote
Actually, that's a pretty big issue (99% of the game is in Japanese). The one you're trying to rectify, I assume?

I know that's one of the major issues or problems I'll have to tackle before hacking the ISO. I'm aware that without proper knowledge about the language, I will end up being nothing more than another novice.
Quote

Odds are you don't need to make a table. Most Japanese games for modern systems just use Shift-JIS encoding.

That I did not know; thank you for pointing it out.

Quote
Here's a ready-made Shift-JIS table you can use.

Friendly word of advice: if you're going to be using it with WindHex, get JWPce and save it in Shift-JIS format. That's right: the Shift-JIS table isn't actually stored in Shift-JIS format. It's in UTF-8, which is more widely compatible, but WindHex doesn't like it.

I downloaded that file two days ago. I had no idea it could be used in that manner; I shall try it out in a few minutes. Even though I have been working mostly with Hex Workshop, I have also worked with WindHex (although it crashes when I try opening a file that is too big, but my guess is that this is a hardware or memory issue).
I have also acquired JWPce, and I have been using it quite intensely over the past few days; but I didn't think it could be used paired with the Shift-JIS format file. I shall listen to your suggestion.




Quote
No you don't. You'll need a translator; eventually, anyway, but for now, trying to translate the game one handwritten screen at a time is counterproductive at best.

 :) I wasn't planning on using your help to translate the whole thing in this manner. I was just trying to identify the "nature" of each character. All the sets of characters I wrote were 3 characters or longer. I was planning on using Monkey-Moore to make the table file (how? I did not know, but I am stubborn and patient enough to find out). I then planned to use WindHex along with the created table and continue with the procedure (which is trying all kinds of methods or techniques until I stumble upon the right one).


Quote
Assuming you haven't done so already, you'll need a tool to break apart the ISO into its constituent files and reassemble them when you're done. If the game uses its own virtual file system (VFS), this will take the aid of an experienced hacker.

I disassembled one of the ISOs using Apache3; that would be the David Douillet Judo one, but I haven't been working with it much.
Sorry for asking; and pardon my inexperience. How would I know if an ISO uses its own VFS? Would there be a file with the VFS extension within the files of the image?



Quote
See this? This is romaji. That's right, it's the good old-fashioned Roman alphabet. That's literally what "romaji" means in the first place ("Roman characters", anyway).

Thank you for refreshing my memory. I was aware of this fact.

Quote
Japanese uses a mix. Check out the following sentence: 俺の引いたカードは…封印されしエクゾディア!
That's kanji, hiragana and katakana, plus some good old-fashioned Western-style punctuation.

I was also aware that the Japanese writing system uses a combination of different character sets (which seem almost identical to me... most of them, that is).




Quote
... it's immensely useful to be able to at least identify the three types of writing on sight; preferably, you'll be able to read hiragana and katakana, possibly a few kanji as well.

Hehehe,  :laugh:. I'm working on that sir.

Quote
Spend some time around here and we might make a decent translation hacker outta you. :3

I will most definitely do so man.

Quote
The NES, SNES, Game Boy and GBA are all well-traveled and familiar ground; there are some hurdles involved, but they make a fine place to start.

Yes; I have read that the GB and GBA are the best-documented systems right now. I have also been working on an SMB hack for the NES. Basic stuff like making the table for the text, editing the text and such.

I have yet to learn all the basics. I'm reading most of the documentation I could find here and on other sites in order to understand the techniques, terms, tools, etc. I'm also trying out the tools and messing with them along with the ISOs.





Quote
Good luck!


Thank you very much for your kind wishes and information. Also, thank you for taking the time to reply to the thread. I will surely listen to the advice you have given me and try the things you have mentioned.


March 04, 2011, 09:48:54 pm - (Auto Merged - Double Posts are not allowed before 7 days.)


Kind Auryn; I thank you for the time you have taken to reply to my post. Here is my response to you Sir;

I am not asking for anyone to loan me freely their hands or in this case their experience; just as I am not seeking instant knowledge in the hacking scene.

But if a man has translated a game, it is just as easy for another one to do the same; it will take dedication, which I can develop as I work and get more and more involved with the process. It will take patience, which I do have, and in large amounts. It will take 7 or more years to complete all these games... well; as long as the world does not end in December 2012, I am good (that, of course, not keeping in mind my feeble human nature).

Also; I can not drive. I prefer to walk and not pollute the air that has given me such a precious life to enjoy. So I take no interest in Ferraris or any other sort of car for that matter; unless it is ecologically friendly.  ;)

You have described the human progressive physical growth. You learn to connect your brain to your body, then you learn to connect it with your limbs. After doing so, you do the same but this time you develop control over your extremities. You learn to make gestures, crawl, walk, run...etc.
The mind is just as you mentioned. The same... it learns to count, to add, divide, multiply, subtract, etc. As long as you feed it something, it will continue to learn.

The best way to learn is by experimenting with your mind. The same with hacking... it does not matter if you are a novice or an advanced ROM hacker as yourself; the process is the same for everyone. You observe, imitate, experience, experiment and learn.

Again; I am not asking how to translate a game. I am merely doing research, observing, paying attention, imitating, experiencing and learning. It is a slow process, but there is none other that is better.
« Last Edit: March 04, 2011, 09:52:25 pm by 3ukalipto »

Ryusui

  • Hero Member
  • *****
  • Posts: 4989
  • It's the greatest day.
Re: Japanese Character Recognition Help.
« Reply #5 on: March 04, 2011, 11:53:35 pm »
Quote
I disassembled one of the ISOs using Apache3; that would be the David Douillet Judo one, but I haven't been working with it much.
Sorry for asking; and pardon my inexperience. How would I know if an ISO uses its own VFS? Would there be a file with the VFS extension within the files of the image?


It probably won't have a ".vfs" extension, but if the ISO seems to contain one huge file instead of a proper file system, then yes, it's probably a VFS.

Auryn

  • Hero Member
  • *****
  • Posts: 649
Re: Japanese Character Recognition Help.
« Reply #6 on: March 05, 2011, 03:33:06 am »
Sorry if I sounded a bit harsh before, but newbies need that, heheh.
Actually, I meant 7 years for the first game, not for all of them :p

A VFS can be identified by comparing the size of the image with the total size of the files you can see on the disc.
Chrono Cross for PS1 is the first that comes to my mind (700+ MB of image, but only about 10 MB visible on the disc).
Anyway, be prepared, because most games nowadays use some sort of sub-VFS, creating big files with sub-files (probably compressed) inside them.
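
The size comparison is easy to script once the visible files have been extracted to a folder. This is a rough sketch under that assumption; the function names and the paths in the example are invented for illustration. It adds up the sizes of all extracted files and divides by the size of the image. No cut-off is implied here: a result near 1.0 suggests an ordinary file system, while something like the Chrono Cross case above (about 10 MB visible out of 700+ MB) comes out well under 0.05.

```python
import os


def visible_bytes(extracted_root: str) -> int:
    """Total size in bytes of every file under the extracted folder."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(extracted_root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def visible_fraction(image_path: str, extracted_root: str) -> float:
    """Share of the disc image that is accounted for by visible files."""
    image_size = os.path.getsize(image_path)
    if image_size == 0:
        raise ValueError("image file is empty")
    return visible_bytes(extracted_root) / image_size


if __name__ == "__main__":
    # Hypothetical paths, used only for this example.
    print(visible_fraction("game.iso", "game_extracted"))
```

This only catches the "hidden data" kind of VFS. It says nothing about the sub-VFS case, where the big container files are perfectly visible and the ratio looks normal.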

The alphabets are easily identifiable because one is very "squared" (lots of edges with few curves); the other is almost the opposite, almost all rounded; and the last... for now let's say that if the "symbol" has more than 5 lines (straight lines without corners) or curves, it's probably a kanji (there are kanji with fewer than 5 curves or lines, but leave it like this for now; later, when you remember the hiragana / katakana symbols, the rest will be kanji :p)

Ryusui

  • Hero Member
  • *****
  • Posts: 4989
  • It's the greatest day.
    • View Profile
    • Tumblr
Re: Japanese Character Recognition Help.
« Reply #7 on: March 05, 2011, 04:17:42 am »
Hiragana look more like cursive writing; katakana look more like print. And generally speaking, kanji are more complex than either, but there are plenty of exceptions.

3ukalipto

  • Newbie
  • *
  • Posts: 3
Re: Japanese Character Recognition Help.
« Reply #8 on: March 05, 2011, 07:16:14 pm »
Quote
Sorry if I sounded a bit harsh before, but newbies need that, heheh.
Actually, I meant 7 years for the first game, not for all of them

Do not worry; I took no offense in your comment. I'm a very passive person and getting my feelings hurt or getting me upset is quite a difficult task... hehehe :). I suppose you were trying to see if I wasn't just a bothersome joker asking an already answered question.

About the 7 years: phew, that is quite some time. I suppose I will have to drop some of those games in the long run until new tools become available; yet I am sticking with Kengo 3, which is the one I am most interested in at the moment. I have seen that it is already on the requested Japanese translations list, but since it is so famous, and not one of the first games, I've decided to take this matter into my own hands. Or at the very least give it a try.



Quote
A VFS can be identified by comparing the size of the image with the total size of the files you can see on the disc.

I have noticed that in almost every ISO there are one or more large files with the AFS extension. I downloaded some applications that let me open them and see what is inside; I was amazed by the number of files contained within them.
I did some research about the extension and was encouraged to take a closer look at those files.
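
As an illustration of what those tools are doing, here is a small sketch that reads the index of an AFS archive. It rests on an assumption that should be checked against the actual files: that the archive follows the layout AFS is commonly described as having, namely a 4-byte magic `AFS\0`, a 32-bit little-endian file count, and then one (offset, size) pair of 32-bit little-endian integers per file. The function name and file name are made up for this example.

```python
import struct
from typing import List, Tuple


def read_afs_index(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, size) for each entry, assuming the common AFS header layout."""
    if data[:4] != b"AFS\x00":
        raise ValueError("not an AFS archive (magic mismatch)")
    # Assumed layout: the entry count follows the magic as a little-endian uint32.
    (count,) = struct.unpack_from("<I", data, 4)
    # Assumed layout: the entry table starts at byte 8, 8 bytes per entry.
    return [struct.unpack_from("<II", data, 8 + 8 * i) for i in range(count)]


if __name__ == "__main__":
    # Hypothetical file name, used only for this example.
    with open("DATA.AFS", "rb") as f:
        for offset, size in read_afs_index(f.read()):
            print(hex(offset), size)
```

If the magic check fails or the offsets point past the end of the file, the archive probably uses a different layout and the assumption above does not hold.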


Quote
Hiragana look more like cursive writing; katakana look more like print. And generally speaking, kanji are more complex than either, but there are plenty of exceptions.

I'll take that into consideration Ryusui. I had gathered some reading material and software to help with the Japanese. I plan to study it these next days, and probably do some experimentation with the hacking as well.

I shall get back to you guys if I stumble upon anything interesting or a problem.


Thank you Auryn and Ryusui, for taking the time to reply to my post.