
Author Topic: Reading the Hex Code of PSP Games  (Read 6825 times)

dothacktranslate

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Reading the Hex Code of PSP Games
« on: October 20, 2016, 08:52:23 am »
Hi, I am wondering if anyone knows of a good hex reader or editor that can handle PSP files like .ccs well, or if anyone knows how to decrypt PSP files. I found a couple of decryption programs, but I find them difficult to use.

I was able to read the game's data.cpk which delineates all of the game files. I extracted the files and they are in .ccs format (a type of C-language format?).

When I open these files in a hex editor, the hex values look normal, but the ASCII/ANSI text column shows characters beyond the standard ANSI range and it's all gibberish.

I was hoping the code would appear in C or some programming language I could read.

Some files do have normal text mixed in with the gibberish and that's what I've been translating, but there are still bits of files that I'd like to access. Some files seem to be entirely gibberish, but I believe they have the parts of the game that could still use some translation such as the subtitles in brief cutscenes.

Here's an example:

This is the start of the file which shows all the links to assets in the game. I have not edited this.



This is the part I was able to translate. You can see English text now mixed in with the nonsense code.



This is the middle of the file. There is Japanese text in there which I can edit, but I want to be able to show the nonsense code as proper ASCII text if possible so I can know how all these things are linking together. The same goes for all other files in the game.
 


I use Notepad++ to read the Japanese text, but many files are complete gibberish code. These are likely files I would not need to edit, but it would be nice to at least know what they say so I can make any changes needed for this localization. Anyone have any experience with converting or decrypting Hex code?
« Last Edit: October 20, 2016, 09:23:42 am by dothacktranslate »

NoOneee

  • Jr. Member
  • **
  • Posts: 99
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #1 on: October 20, 2016, 10:30:19 am »
It's hard to guess what that means without seeing how the game uses that data. I don't think things are compressed or encrypted here, as we can see clear text. You should try using PPSSPP's debugger.

Here's an example of what you can try:
Use PPSSPP to go to an untranslated scene.
Open the memory viewer, press the right mouse button and take a memory dump.
Open the dump in a hex editor that supports UTF-8 or Shift-JIS. I don't know which one the game uses... You may know better than me.
I personally use the 5.0 beta from http://www.hexedit.com/downloads.aspx. You can change the character set in View, Character Set.
Hopefully you'll find the untranslated text. If you found it in the dump at position, say, 0x123, it is actually at 0x08800123 on the PSP. The 0x08800000 region is where the user-available RAM is located on the PSP.
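
The dump search and address arithmetic above can be sketched in Python. This is only a rough illustration: it assumes the dump starts at 0x08800000 as described above, and the function name and the sample dump are made up for the example.

```python
PSP_USER_RAM_BASE = 0x08800000  # start of user RAM, as stated in the post above

def find_text_in_dump(dump: bytes, text: str, encodings=("utf-8", "shift_jis")):
    """Return (encoding, dump_offset, psp_address) for every match of text."""
    hits = []
    for enc in encodings:
        try:
            needle = text.encode(enc)
        except UnicodeEncodeError:
            continue  # this text cannot be written in that encoding
        pos = dump.find(needle)
        while pos != -1:
            hits.append((enc, pos, PSP_USER_RAM_BASE + pos))
            pos = dump.find(needle, pos + 1)
    return hits

# Synthetic "dump": 0x123 zero bytes, then some UTF-8 text (made up for the demo).
sample_dump = bytes(0x123) + "こんにちは".encode("utf-8") + bytes(16)
for enc, off, addr in find_text_in_dump(sample_dump, "こんにちは"):
    print(f"{enc}: dump offset 0x{off:X} -> PSP address 0x{addr:08X}")
# prints: utf-8: dump offset 0x123 -> PSP address 0x08800123
```

With a real dump you would read the file with `open(path, "rb").read()` and pass in the line of text you see on screen.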

You should probably search for the hex sequence equivalent to this text in all your files... If you have a lot of files, I'd suggest merging them into a single file so it is easier to search. I use 7zip's tar to do this. If you can't find it in any file, then it is probably compressed, encrypted, or encoded in another character set... In that case we need to use PPSSPP to see where that text came from.
Open a game save just before that untranslated scene. Make sure the text isn't in RAM at this point (open the memory viewer with Ctrl+M and go to the position where the text is located to check that. Remember to add 0x08800000 to the position where you found the text).
I'd open the PPSSPP debugger (Ctrl+D) and set a write breakpoint at the position where you found the text in your memory dump (again, remember to add 0x08800000). That way the emulation will stop when that address gets written. Now proceed to the scene where the text is located. Hopefully the emulation will stop and you'll be able to see the MIPS instructions responsible for getting the text there.
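
As an alternative to merging everything into one tar, the file search mentioned above could also be done with a small Python script. This is just a sketch; the folder name in the usage comment is hypothetical.

```python
from pathlib import Path

def search_files(root, needle: bytes):
    """Yield (path, offset) for every occurrence of needle in files under root."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        pos = data.find(needle)
        while pos != -1:
            yield path, pos
            pos = data.find(needle, pos + 1)

# Usage (hypothetical folder holding the files extracted from data.cpk):
# for path, off in search_files("extracted_cpk", "some on-screen text".encode("utf-8")):
#     print(f"{path}: offset 0x{off:X}")
```

If nothing turns up in UTF-8, it is worth repeating the search with the text encoded as `shift_jis` before concluding that the data is compressed or encrypted.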

I hope this is enough for you to get started... I know it seems complicated, but you'll get the hang of it.

If you can't find the text in RAM in the first place, you'll have to open the PPSSPP graphics debugger, find how the text is getting displayed on the screen, get the offset of the texture where the text is written, and set a memory write breakpoint on that texture so the emulation stops when the game writes to it.

Also, did you check the main executable for text? In PPSSPP go to Tools, Developer Tools, Dump Decrypted Eboot.bin on game boot.
Now open the game, and check the "memstick\PSP\SYSTEM\DUMP" folder in your PPSSPP install. You should have a "bin" file with the game code. This is the decrypted EBOOT.bin (i.e. the game's main executable). There's usually some text there.

BlackDog61

  • Hero Member
  • *****
  • Posts: 784
    • View Profile
    • Super Robot Wars A Portable translation thread
Re: Reading the Hex Code of PSP Games
« Reply #2 on: October 20, 2016, 11:10:46 am »
Quote from: dothacktranslate
I was hoping the code would appear in C or some programming language I could read.
Just to be clear - that never happens. Ever. (Well someone will have fun finding an exception out of 10000 cases, maybe.)
The file you mentioned was (I guess) just an archive. When the game runs, it unarchives some of the files within it as needed. That's what games usually do.
Which means the "ccs" (was it?) is probably a script file, in a format the game can use directly.
Compared to a "C" equivalent, it's closer to the compiled binary output than to the source.
Good news is - you can read the text. So this *probably* means you're really seeing a script file, with some binary "special codes" and some text. That special code would be entirely specific to the game. We've seen some games with special codes for many different things, so the examples hereafter are probably not relevant to your game, but could give you ideas: change text color, end line, end dialog box and wait for button press, change speaker, change portrait of the speaker to reflect happiness/fright/anger/..., change text size, check if the player has taken choice A earlier, check if player is male, propose choice among 3 possibilities, etc.
You have to interpret these. And find text pointers, if there are any, too. Only then will you be able to translate properly, with longer text if needed, with more dialog boxes if needed.
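
To get a first look at where text ends and "special codes" begin, one option is to split a file into runs of readable UTF-8 text and runs of everything else, and print the latter as hex. The sketch below assumes the text is UTF-8; the function names and the sample bytes are invented for illustration, and it cannot tell you what any code actually means.

```python
def utf8_char_at(data: bytes, i: int):
    """Return (char, byte_length) if a printable UTF-8 character starts at i, else None."""
    b = data[i]
    if b < 0x80:
        n = 1
    elif 0xC2 <= b <= 0xDF:
        n = 2
    elif 0xE0 <= b <= 0xEF:
        n = 3
    elif 0xF0 <= b <= 0xF4:
        n = 4
    else:
        return None  # not a valid UTF-8 lead byte
    try:
        ch = data[i:i + n].decode("utf-8")
    except UnicodeDecodeError:
        return None
    return (ch, n) if ch.isprintable() else None

def split_runs(data: bytes, min_chars: int = 2):
    """Split data into ("text", str) and ("code", bytes) runs, in file order."""
    runs = []

    def add_code(raw: bytes):
        # Merge with the previous run if it is also a code run.
        if runs and runs[-1][0] == "code":
            runs[-1] = ("code", runs[-1][1] + raw)
        else:
            runs.append(("code", raw))

    def close_text(text: str):
        # Very short "text" is more likely a stray code byte, so treat it as code.
        if len(text) >= min_chars:
            runs.append(("text", text))
        elif text:
            add_code(text.encode("utf-8"))

    text, i = "", 0
    while i < len(data):
        hit = utf8_char_at(data, i)
        if hit:
            text += hit[0]
            i += hit[1]
        else:
            close_text(text)
            text = ""
            add_code(data[i:i + 1])
            i += 1
    close_text(text)
    return runs

# Made-up sample: two code bytes, Japanese text, three code bytes, English text, NUL.
sample = b"\x02\x10" + "こんにちは".encode("utf-8") + b"\x00\x01\xff" + b"Hello" + b"\x00"
for kind, value in split_runs(sample):
    print(value if kind == "text" else "<" + value.hex(" ") + ">")
```

Comparing the hex runs that surround lines you already understand (a line break, a new speaker, a choice) is usually how the meaning of each code gets worked out.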

NoOneee

  • Jr. Member
  • **
  • Posts: 99
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #3 on: October 20, 2016, 11:24:14 am »
Yeah, what BlackDog61 said is correct. The C/C++ code that the game was probably coded with got compiled to MIPS machine code and now resides mostly in the EBOOT.BIN and external PRX files.

dothacktranslate

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #4 on: October 20, 2016, 12:23:50 pm »
Thanks to both of you. I'll try your suggestions and see what sort of results I get.

The good news is I can translate probably 97-99% of the game directly from the script files, but there remains that 1-3% where I will probably have to do some pointer tweaking to make things work a bit better. It's probably best if I translate everything I can first, since once the patch is applied, whatever files don't change I will then know that they are not related to the content I already translated. Then I'll know what files to concentrate on and which memory dumps to go to first.

flame

  • Full Member
  • ***
  • Posts: 120
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #5 on: October 21, 2016, 08:14:09 am »
The other hint, and it's not always even true, but as a general rule, for MIPS:
a0, a1, a2, a3 -> parameters passed to a function. The caller expects them to get changed, and the called function often does change them.
v0, v1 -> return values used by the calling function

It doesn't always hold true. Compilers could do anything, you know. If you are making your own function, feel free to use any register - these are just the ones compiler programs usually use.

If you step through something simple like strlen, you'll see it's the case, at least for simple functions.

Modifying the passed in values can sometimes allow you to change function behavior without understanding the code of the underlying function. I have found it useful because for games with automatic text wrapping, the wrapping length is often a function parameter. Like for Last Ranker and R-Type Tactics II. It could be given in characters or pixels depending on how the game is programmed. I could actually use some help on wrapping length for a different game; I'll make a separate post about it later.

BlackDog61

  • Hero Member
  • *****
  • Posts: 784
    • View Profile
    • Super Robot Wars A Portable translation thread
Re: Reading the Hex Code of PSP Games
« Reply #6 on: October 21, 2016, 09:33:51 am »
Quote from: dothacktranslate
The good news is I can translate probably 97-99% of the game directly from the script files, but there remains that 1-3% where I will probably have to do some pointer tweaking to make things work a bit better. It's probably best if I translate everything I can first, since once the patch is applied, whatever files don't change I will then know that they are not related to the content I already translated. Then I'll know what files to concentrate on and which memory dumps to go to first.
From my experience, you'd do a much better job translating if you don't impose length restrictions on yourself. If we take that into account, that 1-3% probably becomes 60%-ish, doesn't it?
In other words, you'd better invest the time now to free your translation than do it later and have to re-edit everything you've done.
Not to mention, once you dump from pointers instead of directly, the dump format will be different and you'll have to redo what you've done (or re-insert it). By the time you get to that point, it may discourage you from redoing things.
Just my 3 cents, rooting for success of the project!

flame

  • Full Member
  • ***
  • Posts: 120
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #7 on: October 21, 2016, 11:35:09 am »
Quote from: BlackDog61
From my experience, you'd do a much better job translating if you don't impose length restrictions on yourself. If we take that into account, that 1-3% probably becomes 60%-ish, doesn't it?
Wrong again. That's UTF-8 encoded above, if you hadn't noticed. A JA character typically takes 3 bytes in UTF-8 while an EN character takes 1, so they get 3 EN characters for every JA character originally used, which means most strings will fit; 97% sounds like a good number. Translations take around 2.5 EN characters for every JA character. The longer the string, the more likely there is to be enough space for the translation, due to the law of averages.
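
The byte budget is easy to check per string. A minimal sketch in Python, assuming UTF-8 and an in-place replacement that must not exceed the original byte length (the function name and sample strings are made up):

```python
def fits_in_place(original: str, translation: str, encoding: str = "utf-8"):
    """Return (fits, original_bytes, translation_bytes) for an in-place replacement."""
    orig_len = len(original.encode(encoding))
    new_len = len(translation.encode(encoding))
    return new_len <= orig_len, orig_len, new_len

print(fits_in_place("こんにちは", "Hello there!"))         # (True, 15, 12)
print(fits_in_place("ありがとう", "Thank you very much"))  # (False, 15, 19)
```

Five kana give a 15-byte budget, so a 12-character English line fits and a 19-character one does not.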

They mentioned length restriction on the Valkyria Chronicles 3 project. This is probably what they were talking about.

Last Ranker uses UTF-8 encoding. I figured out the pointers anyway; that game uses null-terminated strings but there are pointers for them. I guess I didn't really need to figure them out, huh.
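
For anyone wondering what hunting for pointers looks like in practice, here is a rough Python sketch. It assumes a hypothetical layout where pointers are 4-byte-aligned, 32-bit little-endian values holding the string's file offset plus some base; real games may store them differently, so treat any hits as candidates only.

```python
import struct

def find_pointer_candidates(data: bytes, target_offset: int, base: int = 0):
    """Return offsets of aligned little-endian 32-bit words equal to base + target_offset."""
    needle = struct.pack("<I", base + target_offset)
    hits = []
    for pos in range(0, len(data) - 3, 4):
        if data[pos:pos + 4] == needle:
            hits.append(pos)
    return hits

# Made-up file: a table of two pointers followed by two null-terminated strings.
blob = struct.pack("<II", 8, 12) + b"abc\x00" + b"defg\x00"
string_offset = blob.find(b"defg")                   # 12
print(find_pointer_candidates(blob, string_offset))  # [4]
```

If the strings are loaded into RAM, the `base` argument would be the load address, which you can work out from a PPSSPP memory dump.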

wxMEdit, to answer OP's question, has support for UTF-8. Shift-JIS too, but you probably won't need it.

Really, you don't want to use a hex editor at all, because hex editors don't support the backspace key, which you need a lot. Am I alone in this?
« Last Edit: October 21, 2016, 12:30:25 pm by flame »

NoOneee

  • Jr. Member
  • **
  • Posts: 99
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #8 on: October 21, 2016, 12:51:26 pm »
Quote from: flame
Really, you don't want to use a hex editor at all, because hex editors don't support the backspace key, which you need a lot. Am I alone in this?
The hex editor I've linked here does support removing/adding bytes to the file. It starts in "replace mode", but you can activate Insert mode (Alt+I) and everything you type will get inserted, and backspace will delete the bytes instead of zeroing them out.


dothacktranslate

  • Jr. Member
  • **
  • Posts: 13
    • View Profile
Re: Reading the Hex Code of PSP Games
« Reply #9 on: October 21, 2016, 01:03:18 pm »
Quote from: flame
Wrong again. That's UTF-8 encoded above, if you hadn't noticed. It means they get 3 EN characters for every JA character originally used, which means most strings will fit; 97% sounds like a good number. Translations take around 2.5 EN characters for every JA character. The longer the string, the more likely there is to be enough space for the translation, due to the law of averages.

They mentioned length restriction on the Valkyria Chronicles 3 project. This is probably what they were talking about.

Last Ranker uses UTF-8 encoding. I figured out the pointers anyway; that game uses null-terminated strings but there are pointers for them. I guess I didn't really need to figure them out, huh.

wxMEdit, to answer OP's question, has support for UTF-8. Shift-JIS too, but you probably won't need it.

Really, you don't want to use a hex editor at all, because hex editors don't support the backspace key, which you need a lot. Am I alone in this?

Yeah, it's all in UTF-8, so it's easy enough to read the proper strings. It's not clear from the images, but the game does use NUL and various other pointers for actions on the strings. I don't know the system yet, but if I figure it out I may be able to add some tweaks here and there.

Quote from: NoOneee
The hex editor I've linked here does support removing/adding bytes to the file. It starts in "replace mode", but you can activate Insert mode (Alt+I) and everything you type will get inserted, and backspace will delete the bytes instead of zeroing them out.

Thanks. I'll check that out.


I came across what I thought was a file to translate. It contained a list of all player names, but they were separated by #. I'm not sure what # represents in this file, but upon patching the game with the translated file, it crashed the game. I think it may be a file of player references rather than strings that display, but I don't know. I replaced the original back into the game and everything returned to normal.

That's a post for another time, but the program you mentioned may help me experiment with this file to see how the game uses it.
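
Before experimenting further with a delimited file like that, a quick sanity check can show what changed structurally between the original and the edited copy. The sketch below is only a diagnostic under the assumption that "#" separates fields; it does not explain the crash, and the sample names are invented.

```python
def compare_delimited(original: bytes, edited: bytes, sep: bytes = b"#"):
    """Compare field count, total size, and per-field growth between two versions."""
    orig_fields = original.split(sep)
    new_fields = edited.split(sep)
    return {
        "field_count_same": len(orig_fields) == len(new_fields),
        "size_same": len(original) == len(edited),
        # Indexes of fields that became longer (in bytes) than the original.
        "longer_fields": [
            i for i, (o, n) in enumerate(zip(orig_fields, new_fields)) if len(n) > len(o)
        ],
    }

original = "タロウ#ハナコ#".encode("utf-8")   # invented sample data
edited = b"Taro#Hanako#"
print(compare_delimited(original, edited))
# {'field_count_same': True, 'size_same': False, 'longer_fields': []}
```

If the game only works when the total size or each field's length stays the same, this kind of report would at least narrow down which change it objects to.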

BlackDog61

  • Hero Member
  • *****
  • Posts: 784
    • View Profile
    • Super Robot Wars A Portable translation thread
Re: Reading the Hex Code of PSP Games
« Reply #10 on: October 22, 2016, 03:48:56 am »
Quote from: flame
That's UTF-8 encoded above if you hadn't noticed. It means they get 3 EN characters for every JA character originally used, which means most strings will fit, 97% sounds like a good number. Translations take around 2.5 EN characters for every JA character. The longer the string, the more likely there is to be enough space for the translation due to law of averages.
OK - good.