
Author Topic: Unicode and Graphics Problems - Any Help Is Nice  (Read 2518 times)

RaiKirida452

  • Jr. Member
  • **
  • Posts: 3
Unicode and Graphics Problems - Any Help Is Nice
« on: July 09, 2013, 02:01:20 pm »
Greetings,

I am new to ROM hacking. My ultimate goal is to translate a particular PSX game. Yes, I have read all the beginner docs that warn not to start with a big project - and I have taken this to heart. I figured I'd use easier NES games that have been hacked a lot and various other model games to master each given technique before applying it to my target. However, I have run into a few serious problems in just attempting the basics.

The way I thought I'd do this is:
1) Extract the script - that is to say - get every piece of Japanese text that IS text in the game into one place, translate it, then finalize an adapted English script.
2) Record the actors (yes, this game has voice acting, and yes, I'm going to have an English version. Sound is my specialty as far as multimedia goes, and the voices are a must. That said, I am notoriously picky about the acting and I'll try to make sure it's up to my standards. Either way, I'd like to put in an "off" option if there isn't one already.)
3) Finalize all of the English text and reinsert it into the game - this includes editing the font (probably the trickiest part).
4) Edit the graphics that have Japanese text
5) Reinsert the English audio files into the game - either in place of the Japanese voice files or in addition, in which case I'll have to add an extra option in the code. (Wait... that will be the hardest part.)
6) Burn it to a disc (personal copy) that plays on my own PS2.

And I'd also like to rip the music and sprites for good measure.

I may be forgetting something, but that was the most basic overall plan.

Now, for the issues. These are pretty big issues, and have kept me from really getting anything done. Yes, there have been quite a few solutions offered in the documents. However, some skip over a certain level of basic (between REALLY basic and "alright, now you're ready to go!") and some things are quite simply out of date.

The Game Text

The Unicode Problem
First of all... extracting the script. Well, it's a bit difficult to get a script without a table... and it's impossible to build a Japanese table without a hex editor that reads unicode.
The solution to this problem given was to use NJStar Communicator. I downloaded the trial and... well. Still getting little boxes and whacko symbols. Which is weird since I can type in Japanese in all word processors, in Chrome, in anything I want all the time. It's just that anything that relies completely on Windows (folder names, application GUIs) will not show any Asian characters. The other problem is that it seems a lot of hex editors do not have Unicode capabilities anyway. Hex Workshop seems to... but my trial expired. I used HxD, but all of its options were showing up as gobbledygook as well, even with NJStar.
I'm kinda stuck here. Is there a hex editor... that can read unicode? Or a windows patch that makes unicode work in anything? If I can't get that working, I'm in pretty deep trouble.

"Script"
When I think "script," and correct me if I'm wrong, I think all text in the game. What I'm wondering is how this (and other data) is stored in a ROM or ISO. A lot of the tutorials assume we already understand this, and it does seem pretty straightforward, but I just wanted to confirm since I HAVEN'T been able to see this firsthand due to my unicode problem.
If I throw my ISO file into a hex editor, assuming the correct table is in, will strings of text from the game magically appear before my eyes? Or do they go in and mix the strings around and put in code that organizes it when the actual game runs? I assumed it was the former, which is what leads to the problem of not being able to extend the string length without messing with pointers (which I assume tell the game where in the hex code the strings start and end.)
This isn't a high priority to answer, so if it's just too dumb for your time, then feel free to ignore it. I can get an English ROM and play with it (all the ones I have are Japanese.) I just like to know how things work.

Which leads me to my next problem...

Data Storage

Extracting Graphics
PSX 2D graphics are supposedly TIM. The problem is... there are (obviously) no TIM files lying around on the ISO. I assume they are in the BIN files somewhere. Buuut, the only format my editing tools can read is TIM. I can't even throw the whole ISO at them. I assume the BIN files contain other smaller files like the graphics. However, I don't have anything that can break them down. It's a pretty bothersome problem simply because it is so trivial. I've seen sprite rips of this very game, yet I can't even view the stupid sprites. My understanding of this might be completely off! I intend to look into this again, searching the internet through the eyes of a graphics ripper, but I think I'm having a fundamental misunderstanding here.
All the basic graphic editing docs tell me to use TLP... which doesn't work for PSX. But anything on PSX just says "well, it's TIM format." Okay... well I think what I'm looking for is a piece of software that can pick out what's graphics from the entire ISO or even from an individual BIN file. I can keep looking, but I would much appreciate being steered to a doc that goes over sorting out the graphics from an entire ROM or ISO.

Audio Files
I guess I'm getting pretty ahead of myself here, but I thought I'd check just in case.
The files on my ISO are thus:
- 11 irreducible (according to ISO breaker) BIN files (D_Anime, D_BGM, D_Effect, D_Face, D_Field, D_SCE, D_Screen, D_SE, D_System, D_Unit, Movie)
- A .da file (MYUVOICE.da)
- SLPM_871.76
- System.CNF
- Voice.STR
I used basic music/ audio rip tools to get everything off the .da and the .str - which turned out to be all the sound effects and voice recordings.
My hypothesis is that the code contains information that tells the game what file to use and what part of the file to use to play audio clips when appropriate. The rest of the audio (music) is MIDI, and thus contained elsewhere.
So...
If I want to rip the music, I need to find and extract all of the MIDI data and throw it in a synthesizer to render it to mp3.
If I want to mess with the audio files, I need to mess with the code itself to change the audio "pointers" to tell the game where to pull the new files from.
I could be completely wrong; this is just a hypothesis.

In short, this is what I'm having problems with:
1) Viewing unicode in a hex editor (NJStar is not working for me)
       - A Japanese hex editor would be superb.
2) Finding the graphics on the ISO and getting them into a format where I can throw them into an editor to view and edit them.
         + More than the problem itself is my general confusion as to how the data is arranged on the ISO.


Sorry if my questions are nooby, but that's why I posted in the noob forum.

I'd appreciate any help!

Thank you!

-Rai

UPDATE:
I got the graphics extracted.

They're TMDs, NOT TIMs. Sure don't look 3D in game, but what the heck.
« Last Edit: July 09, 2013, 04:56:12 pm by RaiKirida452 »

henke37

  • Hero Member
  • *****
  • Posts: 643
Re: Unicode and Graphics Problems - Any Help Is Nice
« Reply #1 on: July 09, 2013, 02:19:55 pm »
You are going to have to write your own extractor and rebuilder for the custom archive file formats.

RaiKirida452

  • Jr. Member
  • **
  • Posts: 3
Re: Unicode and Graphics Problems - Any Help Is Nice
« Reply #2 on: July 09, 2013, 04:56:40 pm »
Figured it would come to that. Sounds like a ball.

Thanks anyway.

Klarth

  • Sr. Member
  • ****
  • Posts: 499
Re: Unicode and Graphics Problems - Any Help Is Nice
« Reply #3 on: July 09, 2013, 07:34:17 pm »
Quote
I'm kinda stuck here. Is there a hex editor... that can read unicode? Or a windows patch that makes unicode work in anything? If I can't get that working, I'm in pretty deep trouble.
Windhex from this site.  Load a table and (I think) you enable Unicode and Japanese display.  For other hex editors, you usually enable Unicode and may need to tell the GUI to use a font with Japanese support.  Other hex editors won't load tables and it's very likely that your game does not use Unicode, so use Windhex for this specific task.
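If the game does not use Unicode, one quick thing to try before building a table is checking whether any file contains plain Shift-JIS text. That is only an assumption: Shift-JIS is common in Japanese PSX games, but a game can just as easily use a custom encoding, in which case this finds nothing useful. A minimal sketch (the function name `find_sjis_runs` and the minimum run length of 4 characters are arbitrary choices for illustration):

```python
import re

# Shift-JIS double-byte characters: lead byte 0x81-0x9F or 0xE0-0xEF,
# trail byte 0x40-0x7E or 0x80-0xFC. Match runs of at least 4 of them.
SJIS_RUN = re.compile(
    rb"(?:[\x81-\x9f\xe0-\xef][\x40-\x7e\x80-\xfc]){4,}"
)

def find_sjis_runs(data: bytes):
    """Yield (offset, text) for runs of plausible Shift-JIS double-byte text."""
    for m in SJIS_RUN.finditer(data):
        try:
            text = m.group().decode("shift_jis")
        except UnicodeDecodeError:
            continue  # byte pattern matched but is not valid Shift-JIS
        yield m.start(), text
```

Expect false positives: random binary data regularly matches the byte pattern, so the hits still need to be read by a human.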

Quote
What I'm wondering is how this (and other data) is stored in a ROM or ISO. A lot of the tutorials assume we already understand this, and it does seem pretty straightforward, but I just wanted to confirm since I HAVEN'T been able to see this firsthand due to my unicode problem.
If I throw my ISO file into a hex editor, assuming the correct table is in, will strings of text from the game magically appear before my eyes?
This is why there is no tool to automate the translation hacking process.  Every game is different and requires slightly different approaches.  If you're hacking an NES or SNES ROM, then upon loading a correct table, you'll see garbled "fake text" in a hex editor for 90% of the game's data and real text for maybe 10% of the game's data.  It's up to you to scan through the ROM and identify the correct data to dump (which is usually in large blocks at a time).  On disc-based systems, you generally identify files with text either through intuition or brute force.  Using intuition is important because it cuts down on the amount of data you need to browse through.  But if the game has compression, then you're still not going to find anything using this basic technique because the data is generally not comprehensible until you can decompress it.  The compression algorithm very well could be custom, especially during the PSX era and before.
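To make the "fake text versus real text" point concrete, here is a rough sketch of what a table-aware hex editor does. It assumes the common `HEX=text` table layout, one entry per line; the table contents below are made up and do not come from any real game.

```python
def parse_table(tbl_text: str) -> dict:
    """Parse 'HEX=text' lines into a {bytes: str} mapping."""
    table = {}
    for line in tbl_text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        table[bytes.fromhex(key.strip())] = value
    return table

def decode_with_table(data: bytes, table: dict) -> str:
    """Decode data greedily, trying the longest table entries first."""
    max_len = max(len(k) for k in table)
    out, i = [], 0
    while i < len(data):
        for n in range(max_len, 0, -1):
            chunk = data[i:i + n]
            if len(chunk) == n and chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            out.append(f"<{data[i]:02X}>")  # byte with no table entry
            i += 1
    return "".join(out)

# Hypothetical table, for illustration only.
EXAMPLE_TBL = "00=A\n01=B\n02=C\nFE01=<name>\nFF=<end>"
```

Running this over code or graphics data still produces output, just meaningless output, which is the garbled "fake text" described above. Only the regions that really hold script data decode into something readable.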

Quote
The problem is... there are (obviously) no TIM files lying around on the ISO. I assume they are in the BIN files somewhere.
As with all file formats, the developer can choose whether to use it or not.  Many games do not use TIM and pretty much none use TIMs exclusively.  The TIM file format is basically just a header before some graphics data.  So the developer can choose to use a TIM, custom format, or no format.  But the graphics data is usually pretty standard, being very similar to BMP in its layout.  They're linear pixel formats, which means you need to pick a bit depth (bpp) and guess the width (and perhaps stride) to correctly grab the graphics data.  I think there are several graphics editors that have linear pixel format support...but not many that allow you to adjust widths.
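As a rough illustration of "pick a bit depth and guess the width", the sketch below unpacks raw linear 4bpp data into rows of palette indices. It assumes two pixels per byte with the low nibble as the left pixel, and no padding between rows (stride equal to width); both are assumptions to check against the actual data. The function name `unpack_4bpp` is made up for this example.

```python
def unpack_4bpp(data: bytes, width: int) -> list:
    """Unpack linear 4bpp data into rows of palette indices (0-15).

    Assumes two pixels per byte with the low nibble as the left pixel,
    and that `width` is an even guess at the image width in pixels.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number")
    bytes_per_row = width // 2
    rows = []
    # Only whole rows are decoded; a trailing partial row is ignored.
    for start in range(0, len(data) - bytes_per_row + 1, bytes_per_row):
        row = []
        for b in data[start:start + bytes_per_row]:
            row.append(b & 0x0F)  # left pixel (low nibble)
            row.append(b >> 4)    # right pixel (high nibble)
        rows.append(row)
    return rows
```

In practice you would try a handful of widths (16, 32, 64, 128, 256 are common guesses) and render each result; the right width is the one where the image stops looking sheared.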

Quote
I used basic music/ audio rip tools to get everything off the .da and the .str - which turned out to be all the sound effects and voice recordings.
My hypothesis is that the code contains information that tells the game what file to use and what part of the file to use to play audio clips when appropriate. The rest of the audio (music) is MIDI, and thus contained elsewhere.
Like before, it may or (most likely) may not be MIDI.  If your game has a PSF or PSF2 made from it, then I'd suggest you email the author or look through the PSF/PSF2.  Also, the game's event engine controls how audio gets loaded/played/looped/etc.  Modifying the event code may/will require extensive reverse engineering to figure out which event codes are relevant to you.

Quote
Finding the graphics on the ISO and getting them into a format where I can throw them into an editor to view and edit them.  More than the problem itself is my general confusion as to how the data is arranged on the ISO.
As henke37 suggested, you'll have to create extractors and rebuilders for each file.  This means doing some basic programming with loops and file i/o once you've reverse engineered the structure.
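For a sense of scale, an extractor and rebuilder pair can be quite short once the structure is known. The layout below is entirely hypothetical (a file count followed by an offset/size table); the real BIN files will almost certainly differ, and working out their actual layout is the reverse engineering step.

```python
import struct

def extract_entries(archive: bytes) -> list:
    """Split an archive into its member blobs.

    HYPOTHETICAL layout, for illustration only:
      u32 count, then `count` pairs of (u32 offset, u32 size),
      all little-endian, offsets relative to the archive start.
    """
    (count,) = struct.unpack_from("<I", archive, 0)
    entries = []
    for i in range(count):
        offset, size = struct.unpack_from("<II", archive, 4 + i * 8)
        if offset + size > len(archive):
            raise ValueError(f"entry {i} runs past the end of the archive")
        entries.append(archive[offset:offset + size])
    return entries

def build_archive(entries: list) -> bytes:
    """Rebuild an archive in the same hypothetical layout."""
    header_size = 4 + 8 * len(entries)
    table, body, offset = [], b"", header_size
    for blob in entries:
        table.append(struct.pack("<II", offset, len(blob)))
        body += blob
        offset += len(blob)
    return struct.pack("<I", len(entries)) + b"".join(table) + body
```

A useful sanity check for any real format is the round trip: extract an untouched archive, rebuild it, and confirm the result is byte-identical to the original before changing any of the contents.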

Quote
They're TMDs, NOT TIMs. Sure don't look 3D in game, but what the heck.
You can "flat" render 3D images as 2D.  It's a technique called billboarding.  For modern systems, it's much faster to draw in 3D->2D than explicitly in 2D.  Not sure about PSX/PS2.