
Messages - 730

Pages: 1 2 3 [4]
61
I've found multiple copies like that to be very common. Usually the text goes through a series of tests each time it is copied, and variables like names are filled in, etc. If you keep going, you should get to the original 'compressed' text as loaded from the CD. You might get lucky and find some simple compression or encryption routine that can be bypassed altogether.

I think I was wrong with what I said, but I did find the part in memory where the compressed text loaded from the CD is, and it's getting it from there. I haven't figured out how it's decompressed yet, but I did see it change the register that points to the current byte to load from the compressed section to an address several bytes before where the current byte should be, similar to what I read about LZ reusing previously written bytes instead of storing them again.

The concrete example was something like this:
Code:
00 01 02 03 04 05 06 07 08 09 0A

Decompressed bytes that end up being stored:
30 E6 02 80 68 DD 02 80 84 DD

Compressed bytes:
30 E6 02 80 68 DD 02 80 82 84 FC

When it's time to write to 0x09, it should load 0x0A, and somehow in the process,
the address ends up being 0x05, loading that DD then storing it in 0x09.

Don't see how this is compressing anything at all but maybe it works out in the long run?
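For what it's worth, the behaviour in that example (reading from 0x05 while writing 0x09) is what an LZ-style back-reference looks like in general. Here is a minimal Python sketch of just the copy step. The function name and the distance/length values are my own assumptions read off the example above, not the game's actual encoding. A one-byte copy like this saves nothing by itself; any gain comes from references that stand for longer runs.

```python
def copy_back_reference(out: bytearray, distance: int, length: int) -> None:
    """Append `length` bytes copied from `distance` bytes behind the end of `out`.

    Hypothetical helper: how the real routine derives distance and length
    from the compressed stream is not known here.
    """
    if distance <= 0 or distance > len(out):
        raise ValueError("distance must point inside the bytes already written")
    for _ in range(length):
        # One byte at a time, so a copy longer than `distance` repeats data.
        out.append(out[-distance])


# The first nine decompressed bytes from the example (offsets 0x00 to 0x08).
out = bytearray.fromhex("30 E6 02 80 68 DD 02 80 84")

# Writing offset 0x09 by reading offset 0x05 is a distance of 4 (assumed length 1).
copy_back_reference(out, distance=4, length=1)
print(out.hex(" ").upper())
```

With a short distance and a long length, the same loop turns a couple of bytes into a long repeated run, which is where the savings would come from.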

And I don't think there's anything I can bypass, it does need to decompress/decrypt the text in the CD after all, and I need to be able to insert text compressed/encrypted like this into the image.
I'm guessing if it's encryption I could technically rewrite it to bypass the decryption and load non-encrypted text I would write into the image, though judging from what Vehek posted, it looks like it really is compression.

Edit: I've been digging some more. There's apparently something like 3 routes through which it can go. It either loads bytes directly from the "compressed" zone and stores them into the other zone, loops loading bytes from the compressed zone until it finds the appropriate byte to load and store, or it uses a loaded byte to calculate (with some ANDs and ORs and shifts and values in other registers) an address pointing to the byte that must be stored. I've seen this address end up pointing to the uncompressed zone, way before the position where it would store the new byte. It might also have ended up pointing to a posterior position in the compressed zone, but I might've confused that for the second case I described where it skips over some bytes.

I've also seen it compare two registers, where one holds a set value and the other keeps incrementing during a loop, and depending on whether the incrementing one is less than the set one, it branches off somewhere or not. I think this loop counts off a certain number of bytes to either load directly from the compressed zone, copy from a previously written part of memory, or skip over in the compressed zone; I don't remember exactly which right now.
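The routes described above (plain copies, address calculation with ANDs/ORs/shifts, a counting loop) resemble the usual shape of an LZSS-type decoder. Below is a rough Python sketch of that general shape only. Everything about the layout is an assumption for illustration: a flag byte read LSB first, a 0 bit meaning a literal, and a 2-byte little-endian token with a 12-bit distance and 4-bit length. The game's real bit order, field widths and offsets are almost certainly different and would have to be read out of the ASM.

```python
def decode_lzss(data: bytes, out_size: int) -> bytes:
    """Decode a HYPOTHETICAL LZSS layout (not the game's confirmed format)."""
    out = bytearray()
    pos = 0
    while len(out) < out_size:
        flags = data[pos]          # one flag byte describes the next 8 items
        pos += 1
        for bit in range(8):
            if len(out) >= out_size:
                break
            if flags & (1 << bit) == 0:
                # Route 1: literal byte copied straight from the compressed zone.
                out.append(data[pos])
                pos += 1
            else:
                # Route 2: build an address into already written output
                # with shifts and masks (field widths are assumed).
                token = data[pos] | (data[pos + 1] << 8)
                pos += 2
                distance = (token & 0x0FFF) + 1
                length = (token >> 12) + 3
                # Counting loop: one register counts up to a fixed length.
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out[:out_size])


# Flag 0x08: three literals, then one reference (distance 3, length 6).
sample = bytes([0x08]) + b"abc" + bytes([0x02, 0x30])
print(decode_lzss(sample, 9))
```

If the real routine turns out to follow this pattern, the things to pin down while tracing would be which bit value means "literal", how many bits go to distance versus length, and whether the distance is relative to the write position or to a fixed window.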

62
I almost wonder if it's encryption rather than compression. Since you know the RAM location of the 'decompressed' text, have you tried putting a 'breakpoint on memory-write' at that location? That would let you discover the tail-end of the routine generating the final text. From there you should be able to trace backwards to see how it's being produced.

I tried that (though not with a breakpoint, I just managed to find the moment it started to write the text at the location I knew it would), and it was a bit hard to follow. I also found that it's getting the value to write from another location in memory, already decompressed/decrypted. I guess I'll try again, maybe setting a breakpoint on the other location to find out how it gets written there.

The bottom one is not compressed, "Tenemos que subir". pSX can break on CDROM DMA; if you do not know the DMA address, set the range to all memory: 80000000-801FFFFF.

Hmm, it looks like you're right for the string itself, though the bytes that come before it are different from what is in RAM, so those must still be compressed.

63
Here are some notes I made a few years ago.

Wow, I didn't expect someone to reply to this thread that had actually done something with this game. Thanks, though I think at my current level of expertise it won't do much good, but I'll keep it in my own notes for later.

64
Edit: link from last post is dead, reuploaded to Mediafire here. This program takes paths to PROGRAM.BIN chunks and decodes them. The InputFile class also includes an unfinished encode() method for encoding them back (don't think it should be that hard to reverse the algorithm in order to encode them, though I doubt it would be very useful since it would probably just make the game crash). Here is the QuickBMS script for splitting PROGRAM.BIN into said chunks.

I wanted to translate this game, but to be honest I think I should come back to it in a few months or years when I've learned more about programming/ASM/computers. I figured I might as well make a post before giving up completely, though.
The image of this game contains a file called PROGRAM.BIN, which seems to hold most of the game besides cutscenes and maybe music/voices (there are some XA files I haven't looked into).
PROGRAM.BIN seems to contain several headers labelled PS-X EXE (plus some other stuff like Sony Computer Entertainment etc.) that separate it into different chunks (there should be 141 of them; I split them with QuickBMS).
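In case anyone wants to do the split without QuickBMS, here is a small Python sketch of the idea. It assumes each chunk starts exactly at a "PS-X EXE" string and runs until the next one, and that the string never appears inside chunk data. This is not the QuickBMS script, and the function names are made up.

```python
MAGIC = b"PS-X EXE"  # header string seen at the start of each chunk


def find_chunk_offsets(blob: bytes, magic: bytes = MAGIC) -> list:
    """Return every offset where `magic` occurs in `blob`."""
    offsets = []
    pos = blob.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(magic, pos + 1)
    return offsets


def split_chunks(blob: bytes, magic: bytes = MAGIC) -> list:
    """Cut `blob` at each magic offset; bytes before the first one are dropped."""
    offsets = find_chunk_offsets(blob, magic)
    ends = offsets[1:] + [len(blob)]
    return [blob[start:end] for start, end in zip(offsets, ends)]


# Tiny synthetic stand-in for PROGRAM.BIN, just to show the behaviour.
demo = b"junk" + MAGIC + b"AAAA" + MAGIC + b"BB"
print(find_chunk_offsets(demo))
print([len(chunk) for chunk in split_chunks(demo)])
```

On the real file you would read it with open("PROGRAM.BIN", "rb").read() and check that the number of chunks comes out at 141. If it doesn't, one of the assumptions above is wrong.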

I managed to find in-game lines in PROGRAM.BIN but they're all jumbled up, missing characters and being shortened. I looked up the lines in RAM for comparison. Here are two examples:


Code:
付近に客室が落ちたらしい
RAM: 95 74 8B DF 82 C9 8B 71 8E BA 82 AA 97 8E 82 BF 82 BD 82 E7 82 B5 82 A2
BIN: 95 74 8B DF 82 C9 8B A8 71 8E BA D4 01 8E 1E 05 BD 67 00 D8 B5 82 A2

<stuff>Tenemos que subir
RAM: 11 35 12 31 13 3D 34 30 39 35 3B 03 54 65 6E 65 6D 6F 73 20 71 75 65 20 73 75 62 69 72
BIN: 11 26 35 EC 7E 54 65 6E 00 65 6D 6F 73 20 71 75 65 80 20 73 75 62 69 72

I read some posts about LZ compression, but those algorithms seem to be different on every game they're used on, and I also can't find anything like an LZ header in the file, so I don't know what to make of that.

I tried figuring out how the text got decompressed by debugging with this and tracing with this but I... wasn't very successful. I realized there was an option for CD-ROM reading breakpoint, but I can't figure out how data is being read and where it's being written.

Some extra info: as you can see above, the game uses both Shift-JIS and ASCII, since it has both Japanese and Spanish/English, though it doesn't seem to support special characters like á or ñ (I edited a savestate to see if I could edit text through those and when I put characters like those in, it crashed when the dialogue box came up). I haven't played the game much to see if it had symbols like those in-game, but I doubt it does.
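As a quick sanity check on the encodings, the RAM bytes from the two examples can be decoded with Python's standard codecs. This is only a sketch to confirm the byte dumps match the text; the can_encode helper is something I made up for illustration, and it says nothing about the game's font or about why it crashed.

```python
# RAM bytes copied from the two examples above.
ram_jp = bytes.fromhex(
    "95 74 8B DF 82 C9 8B 71 8E BA 82 AA 97 8E 82 BF 82 BD 82 E7 82 B5 82 A2"
)
ram_es = bytes.fromhex("54 65 6E 65 6D 6F 73 20 71 75 65 20 73 75 62 69 72")

print(ram_jp.decode("shift_jis"))  # the Japanese line
print(ram_es.decode("ascii"))      # the Spanish line, after the <stuff> bytes


def can_encode(text: str, codec: str) -> bool:
    """Hypothetical helper: True if `text` has a representation in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Check whether the accented characters exist in either encoding at all.
for ch in "áñ":
    print(ch, can_encode(ch, "ascii"), can_encode(ch, "shift_jis"))
```

If the accented characters have no encoding in either codec, a translation would probably need either a custom byte-to-glyph mapping or unaccented substitutes, assuming the font can be changed at all.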
