Romhacking.net

Romhacking => Newcomer's Board => Topic started by: Sledge on June 13, 2011, 12:37:46 pm

Title: Help cracking encrypted text
Post by: Sledge on June 13, 2011, 12:37:46 pm
I've got a scenario/script file from a PC game. After analysing it in a hex editor, I've come to the conclusion that it's not compressed but encrypted. I managed to modify some characters and see the results in-game, so I figured: since each character is stored in its own 8 bits, it can't be compressed. (Veterans, please correct me; I'm still a noob at this encryption subject.)

After observing some more, I found a 2-byte value right after the file's header (which is unencrypted, in ASCII). What caught my attention was that this 2-byte value sits isolated between runs of zero padding.

Could this be the key to the encryption? Help.
Title: Re: Help cracking encrypted text
Post by: Malias on June 13, 2011, 03:36:00 pm
So, you're saying that each character has its own hex value?  Have you considered constructing a Table (http://www.romhacking.net/start/#text)?
Title: Re: Help cracking encrypted text
Post by: Sledge on June 13, 2011, 04:52:32 pm
Quote
So, you're saying that each character has its own hex value?  Have you considered constructing a Table?

Relative Search was my first attempt.

This is how I know it's encryption: like I said, I located the script text from right at the beginning of the game. Not only that, but I also managed to modify some of these characters.

This is a string that I found and below it, its hex values.

the breakup came
27 13 1A 13 11 01 16 1A 18 02 03 45 0D 04 1B 08

If this could be simply solved with constructing a table, then I would simply assign the character 'e' to $1A, but hey, character 'e' is used again, in 'breakup', this time, as $16, and later, $08.
Might be encryption, I figured.
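
The simple-table idea can be checked mechanically against the sample above. This is only a minimal sketch: the helper name `table_conflicts` is made up for illustration, and it assumes exactly one stored byte per displayed character.

```python
from collections import defaultdict

def table_conflicts(text: str, data: bytes) -> dict:
    """Return the characters that map to more than one byte value.

    Assumes one byte per displayed character (text and data line up 1:1).
    """
    if len(text) != len(data):
        raise ValueError("text and data must have the same length")
    seen = defaultdict(set)
    for ch, b in zip(text, data):
        seen[ch].add(b)
    # A plain substitution table would give every character exactly one byte.
    return {ch: vals for ch, vals in seen.items() if len(vals) > 1}

sample_text = "the breakup came"
sample_hex = "27 13 1A 13 11 01 16 1A 18 02 03 45 0D 04 1B 08"
conflicts = table_conflicts(sample_text, bytes.fromhex(sample_hex))
for ch, vals in sorted(conflicts.items()):
    print(repr(ch), sorted(f"${v:02X}" for v in vals))
```

Any character reported here has more than one encoding in the sample, which a fixed one-to-one table cannot produce.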
Title: Re: Help cracking encrypted text
Post by: Ryusui on June 13, 2011, 05:07:05 pm
Might be Huffman coding, or a variant thereof. Try breaking it down into a bit string and see if any patterns show up.
Title: Re: Help cracking encrypted text
Post by: Klarth on June 13, 2011, 05:12:36 pm
We'd need both the on-screen text and its hex representation if we were to make a better guess.
Title: Re: Help cracking encrypted text
Post by: Sledge on June 13, 2011, 05:48:00 pm
Quote
Might be Huffman coding, or a variant thereof. Try breaking it down into a bit string and see if any patterns show up.

I only know Huffman in theory, but I know that it consists of storing some characters in fewer than 8 bits; that's why it's called compression (please correct me if I'm mistaken). In this script file, each character takes 8 bits, which is why I said earlier that it might not be compression.

What caught my attention, though, was this 2-byte value stored right after the file's header (which is completely readable in ASCII). Who knows, this might be some kind of key. That's where I'm stuck.

Quote
We'd need both the on-screen text and its hex representation if we were to make a better guess.

You mean, more examples of dumped text like in my previous post, or the script file itself? (which is very lightweight btw)
Title: Re: Help cracking encrypted text
Post by: Jorpho on June 13, 2011, 06:21:10 pm
You say the only thing special about this two-byte value is that it is padded by zeros.  That doesn't seem very special.

You also say you managed to modify some characters, so what exactly did you do?
Title: Re: Help cracking encrypted text
Post by: tomaitheous on June 13, 2011, 06:39:57 pm
Quote
You mean, more examples of dumped text like in my previous post, or the script file itself? (which is very lightweight btw)

 I think more examples would help. Especially with reoccurring groups/pairs of the same characters (including spaces).

For example:
Code: [Select]
the breakup came

t  h  e  '' b  r  e  a  k  u  p  '' c  a  m  e
27 13 1A 13 11 01 16 1A 18 02 03 45 0D 04 1B 08
    \ /         \ /                        \ /
    $7         $15                        -$13

The above 'looks' like relative encoding, but there's not enough recurring pattern information in the example text to know for sure. It could be a re-arranged ASCII table with relative encoding, or a selective table with that header value as the key. It could be ring buffer/overflow instead of signed relative (which is still a method of relative encoding; you just let the value wrap back into the ring buffer, i.e. forward-moving relative values only), etc. Could be a lot of things. More examples and info are needed.
Title: Re: Help cracking encrypted text
Post by: Sledge on June 13, 2011, 07:40:20 pm
Quote
You also say you managed to modify some characters, so what exactly did you do?

Yeah, I modified it using a hex editor. I knew it was a script file because of the file's extension, .sce (scenario).
Not only was I right, but I was also able to see the changes in-game, which I'll describe further in this post.
BTW this suspicious 2-byte value is $41E4


The breakup came from out of the blue
Code: [Select]
00000190            27 13 1A 13 11  01 16 1A 18 02 03 45 0D      '..........E.
000001A0   04 1B 08 4F 42 05 08 1B  4D 1B 01 13 55 0C 14 5B   ...OB...M...U..[
000001B0   06 1E 16 53 10 13 02 11                            ...S....

I changed the last value $11 ('e') to $12 and the game showed the letter 'f' instead. Then from $12 to $13, the game displayed 'g'. But then from $13 to $14, I was expecting an 'h' but it displayed an apostrophe (')...
Really strange...

Another example:

I just don't have the confidence
Code: [Select]
000001C0                                                 30                  0
000001D0   5D 01 07 1A 04 54 07 0C  1C 48 03 44 48 12 15 00   ]....T...H.DH...
000001E0   4E 15 1A 0C 4F 43 1C 0D  14 00 14 11 0D 10 17      N...OC.........

I changed the last byte of this last sentence from $17 to $18, and the game displayed 'j' instead of 'e'.



BTW, here's the first bytes of the file, as displayed in the hex viewer

Code: [Select]
00000000   73 63 65 6E 61 72 69 6F  00 00 00 00 00 00 00 00   scenario........
00000010   00 00 00 00 00 2D 3C 3C  20 63 72 6F 77 64 20 58   .....-<< crowd X
00000020   63 68 61 6E 67 65 32 20  53 63 65 6E 61 72 69 6F   change2 Scenario
00000030   20 44 61 74 61 20 3E 3E  2D 00 00 00 00 00 00 00    Data >>-.......
00000040   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
00000050   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
00000060   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
00000070   00 00 00 00 00 00 00 00  00 00 63 72 65 61 74 65   ..........create
00000080   64 20 62 79 20 43 52 4F  57 44 00 00 00 00 00 00   d by CROWD......
00000090   00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00   ................
000000A0   41 E4 18 00 00 00 00 00  00 00 00 00 00 00 00 00   Aä..............
000000B0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................

After that come seemingly random bytes, which I suspect are encrypted.

EDIT: Well, I just tried modifying that suspicious 2-byte value and nothing happened... the game still played and the dialog text wasn't altered.
Title: Re: Help cracking encrypted text
Post by: Jorpho on June 14, 2011, 01:23:48 pm
Well, so far a valid hypothesis would be that the characters are stored (somewhere) in the order of "jefg[apostrophe]".  It's not impossible.

This could be disproved if, in your last example, changing the last byte to $19 did not cause the game to display "f" instead of "j".
Title: Re: Help cracking encrypted text
Post by: Sledge on June 14, 2011, 04:07:57 pm
Quote
Well, so far a valid hypothesis would be that the characters are stored (somewhere) in the order of "jefg[apostrophe]".  It's not impossible.

This could be disproved if, in your last example, changing the last byte to $19 did not cause the game to display "f" instead of "j".

Changing the last byte from $17 to $19 causes the game to display 'k' instead of 'e'.
I'm still making random changes to characters, but they seem to follow no logic at all.

Modification results at offset $1ED ($10)
$10 = c
$11 = b
$12 = a
$13 = `        <<< that's what I called apostrophe previously
$14 = g

Modification results at last byte $17 = 'e' (offset: $1EE):
$18 = j
$19 = k
$1A = h
$1B = i
$1C = n
$1D = o
$1E = l
$1F = m
$20 = R
Title: Re: Help cracking encrypted text
Post by: Jorpho on June 14, 2011, 04:33:58 pm
Are these two sentences "I just don't have the confidence" and "The breakup came from out of the blue" in the same piece of dialog?  Are they both displayed with exactly the same font?
Title: Re: Help cracking encrypted text
Post by: Sledge on June 14, 2011, 04:44:50 pm
Quote
Are these two sentences "I just don't have the confidence" and "The breakup came from out of the blue" in the same piece of dialog?  Are they both displayed with exactly the same font?

The first one is at $193 and the second one is at $1CF, so they're pretty close to each other.
I don't know what you mean by the "same piece of dialog", but the second one is displayed after a mouse click on the first one. And yeah, same font.

There are more words between them, but if I posted the whole sentence it might be confusing for you because there might be some [linebreak]s that you wouldn't be aware of (neither would I).

Code: [Select]
00000190            27 13 1A 13 11  01 16 1A 18 02 03 45 0D      '..........E.
000001A0   04 1B 08 4F 42 05 08 1B  4D 1B 01 13 55 0C 14 5B   ...OB...M...U..[
000001B0   06 1E 16 53 10 13 02 11  1E 51 53 44 5E 53 05 49   ...S.....QSD^S.I
000001C0   4F 01 43 43 40 18 50 35  00 1A 42 1A 4B 5C 4E 30   O.CC@.P5..B.K\N0
000001D0   5D 01 07 1A 04 54 07 0C  1C 48 03 44 48 12 15 00   ]....T...H.DH...
000001E0   4E 15 1A 0C 4F 43 1C 0D  14 00 14 11 0D 10 17      N...OC.........
Title: Re: Help cracking encrypted text
Post by: golden on June 14, 2011, 05:19:51 pm
What game is this by the way?
Title: Re: Help cracking encrypted text
Post by: Sledge on June 14, 2011, 05:23:43 pm
X-Change 2 (PC)
--Link Removed-- (not sure this board allows me to post download links, if not I'll remove when warned)
Title: Re: Help cracking encrypted text
Post by: snarfblam on June 14, 2011, 05:32:39 pm
Is it just me or does this look like some kind of XOR encryption? The first thing that I noticed was the ordering of the letters.
Code: [Select]
$18 = j     <     
$19 = k       <   
$1A = h <         
$1B = i   <       
$1C = n             <
$1D = o               <
$1E = l         <   
$1F = m           < 
$20 = R       
See the pattern? Now, if you XOR each value on the left with the ANSI code of the character on the right, you'll see that you get a value of $72 each time. This just leaves the question of how the game determines what value is XORed with each successive character.
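
The XOR observation can be reproduced from the modification results posted earlier in the thread. A minimal sketch, assuming the displayed characters are plain ASCII; the names `observed_1ee`, `observed_1ed` and `implied_keys` are made up for illustration.

```python
# (byte written into the file, character the game then displayed),
# copied from the modification results for offsets $1EE and $1ED.
observed_1ee = [(0x17, "e"), (0x18, "j"), (0x19, "k"), (0x1A, "h"), (0x1B, "i"),
                (0x1C, "n"), (0x1D, "o"), (0x1E, "l"), (0x1F, "m"), (0x20, "R")]
observed_1ed = [(0x10, "c"), (0x11, "b"), (0x12, "a"), (0x13, "`"), (0x14, "g")]

def implied_keys(pairs):
    """XOR each file byte with the ASCII code of the displayed character."""
    return {byte ^ ord(ch) for byte, ch in pairs}

keys_1ee = implied_keys(observed_1ee)
keys_1ed = implied_keys(observed_1ed)
print("offset $1EE:", [f"${k:02X}" for k in sorted(keys_1ee)])
print("offset $1ED:", [f"${k:02X}" for k in sorted(keys_1ed)])
```

Each offset collapses to a single value, but the two offsets do not share it, which is consistent with the open question of how the value for each successive character is chosen.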
Title: Re: Help cracking encrypted text
Post by: Klarth on June 14, 2011, 05:36:45 pm
It's not a straightforward XOR encryption.  So here's a few features:

A) Each Xth byte represents Xth character (no overt byteswapping in the data)
B) Each byte is encrypted differently.  This means either XOR value or at least one addition/subtraction value changes over time.
C) We don't know if that value change is based on a hardcoded value or an encryption key.  I'm leaning towards the former.
D) It'd probably be easier to look at the PC assembly than crack this encryption.  Though I'm sure somebody at NIST would crack it in under an hour.  (Don't go to them)
E) I'd need different data patterns to even possibly figure this out.  Patterns with consecutive letters (like "look", "poor", "good", "greed", etc) would help...especially if it was the first word of a sentence, but that's asking too much.

Oh, an H-game.  H-games almost universally use some variant of XOR encryption.  I'm trying to get a copy of the game for disassembly but I'm not having much luck.
Title: Re: Help cracking encrypted text
Post by: Vegetaman on June 14, 2011, 07:54:41 pm
If it is just a standard XOR encryption, and the cipher key is trivial (meaning they weren't really trying to lock it up like Fort Knox), then doing a frequency analysis on the hex data that contains all of the text strings from the game may well allow the key to be discovered -- but the assembler method is probably the fastest.
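
A byte count like the one described is easy to sketch. The sample below is just the $190-$1EE range posted earlier in the thread, far too small for real analysis; the function name `byte_frequencies` is an illustrative choice, and a real run would read the whole script file instead.

```python
from collections import Counter

def byte_frequencies(data: bytes, top: int = 5):
    """Return the `top` most common byte values with their counts."""
    return Counter(data).most_common(top)

# Bytes $193-$1EE as posted in the thread (a very small sample).
sample = bytes.fromhex(
    "27 13 1A 13 11 01 16 1A 18 02 03 45 0D "
    "04 1B 08 4F 42 05 08 1B 4D 1B 01 13 55 0C 14 5B "
    "06 1E 16 53 10 13 02 11 1E 51 53 44 5E 53 05 49 "
    "4F 01 43 43 40 18 50 35 00 1A 42 1A 4B 5C 4E 30 "
    "5D 01 07 1A 04 54 07 0C 1C 48 03 44 48 12 15 00 "
    "4E 15 1A 0C 4F 43 1C 0D 14 00 14 11 0D 10 17"
)

for value, count in byte_frequencies(sample):
    print(f"${value:02X}: {count}")
```

With a single fixed key, the most common byte would be expected to line up with the most common plaintext character (usually the space). If the key changes from byte to byte, as the edits above suggest, the counts would have to be taken separately per key position.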
Title: Re: Help cracking encrypted text
Post by: tomaitheous on June 14, 2011, 09:24:59 pm
The disassembling/trace option makes it too easy. Where's the fun in that?

Quote
The first one is at $193 and the second one is at $1CF, so they're pretty close to each other.
I don't know what you mean by the "same piece of dialog", but the second one is displayed after a mouse click on the first one. And yeah, same font.

 I suspect he wants to know if the strings are part of the same 'string'. Because the encryption could differ string to string block (i.e. a string or block of text is connected to a pointer). A 'key' per point(er/ed) block of text or data. 
Title: Re: Help cracking encrypted text
Post by: Sledge on June 14, 2011, 11:58:30 pm
Quote
I'd need different data patterns to even possibly figure this out.  Patterns with consecutive letters (like "look", "poor", "good", "greed", etc) would help...especially if it was the first word of a sentence, but that's asking too much.

Since I'm the one being helped here, I'm the one who's gotta worry about asking too much :beer:

Here's a sentence that uses the same characters consecutively.

''Goodbye, Takuya.''
Code: [Select]
00000170                               54 4E 28 4E 1C 07 11            TN(N...
00000180   10 14 59 43 27 13 10 07  0F 12 51 50 50            ..YC'.....QPP

As you can see, it uses ['] twice at the beginning and twice at the end. Surprisingly, the last two have the same hex value, $50. I tried changing the last two bytes to $51, and it caused the game to show [&] instead of ['].

Also, it uses [ o ] twice, but not with the same hex value.
Title: Re: Help cracking encrypted text
Post by: Vegetaman on June 15, 2011, 12:24:42 am
Quote
Here's a sentence that uses the same characters consecutively.

''Goodbye, Takuya.''
Code: [Select]
00000170                               54 4E 28 4E 1C 07 11            TN(N...
00000180   10 14 59 43 27 13 10 07  0F 12 51 50 50            ..YC'.....QPP

As you can see, it uses ['] twice at the beginning and twice at the end. Surprisingly, the last two have the same hex value, $50. I tried changing the last two bytes to $51, and it caused the game to show [&] instead of ['].


As I understand it, it breaks up like this, then?

Code: [Select]
STRING:    '  '  G  o  o  d  b  y  e  ,     T  a  k  u  y  a  .  '  '
GAME HEX:  54 4E 28 4E 1C 07 11 10 14 59 43 27 13 10 07 0F 12 51 50 50

STRING:    '  '  G  o  o  d  b  y  e  ,     T  a  k  u  y  a  .  '  '
ASCII HEX: 27 27 47 6F 6F 64 62 79 65 2C 20 54 61 6B 75 79 61 2E 27 27

What is telling is that you have a 0x4E meaning a " ' " in one spot, and a 0x4E meaning an " o " in another, in the same string... As well as having 0x54, 0x4E, and 0x50 all meaning " ' " at different points, apparently... At least, IMO...

The ASCII hex row shows what the string would be if the game were using an ASCII character table (which, depending on how the XOR encoding is done, may or may not be the case). That part was added to see if any patterns emerge... which, for me, is going to have to wait until tomorrow...
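
One way to look for a pattern between the two rows is to XOR them position by position. A minimal sketch, assuming the on-screen text is plain ASCII and lines up one byte per character; `xor_with_text` is a made-up helper name.

```python
game_hex = bytes.fromhex(
    "54 4E 28 4E 1C 07 11 10 14 59 43 27 13 10 07 0F 12 51 50 50"
)
text = "''Goodbye, Takuya.''"

def xor_with_text(data: bytes, text: str) -> bytes:
    """XOR each stored byte with the ASCII code of the displayed character."""
    expected = text.encode("ascii")
    if len(data) != len(expected):
        raise ValueError("length mismatch between data and text")
    return bytes(d ^ e for d, e in zip(data, expected))

diff = xor_with_text(game_hex, text)
print(diff.hex(" ").upper())

# Cross-check against the in-game edit: what would $51 decode to
# at the last position under this assumption?
print(repr(chr(0x51 ^ diff[-1])))
```

The last two positions both give $77, and $51 XOR $77 is the ASCII code for '&', which agrees with what the game showed after that edit. The rest of the values vary from position to position, so this shows a per-position difference only, not how it is generated.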
Title: Re: Help cracking encrypted text
Post by: Ryusui on June 15, 2011, 01:09:19 am
Try this. Replace an entire string with the same byte. Then you can clearly see any patterns that emerge.
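
If the scheme does turn out to be a per-position XOR, as suggested earlier in the thread, a constant fill would expose the value used at each position directly. The sketch below only simulates that idea: the key values in `made_up_keys` are invented for the demonstration and are not taken from the game.

```python
def keys_from_constant_fill(fill: int, displayed: str) -> list:
    """Assuming displayed = stored XOR key, recover one key byte per position."""
    return [fill ^ ord(ch) for ch in displayed]

# Hypothetical keystream, used here only to simulate what the game would show.
made_up_keys = [0x73, 0x72, 0x77, 0x77]
fill = 0x10

# What a game applying those keys would display for a string of $10 bytes.
shown = "".join(chr(fill ^ k) for k in made_up_keys)
print(repr(shown))
print([f"${k:02X}" for k in keys_from_constant_fill(fill, shown)])
```

In practice some fill values would decode to unprintable characters at some positions, so more than one fill value might be needed to read off every position.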
Title: Re: Help cracking encrypted text
Post by: Klarth on June 15, 2011, 05:56:02 am
I've isolated the code in question.  It's not too large (the file i/o code is half of it and can be trimmed from the top).  I'll post some of the tips and techniques I used probably tomorrow.  I know for certain that the start/end points are true because I verified with memory dumps.  There may be a bit of fluff in between.  The routine decrypts a file in its entirety.

Here's the routine:
http://pastebin.com/4YaJdxXk

1. Routine opens the encrypted file.  The routine is built for normal files and win32 LZ lib.  I hit the breakpoint for the LZRead version instead of the win32api ReadFile version of the routine.
2. Read 0xC0 bytes (the header length).  0xA0 bytes into the header is the size for the rest of the file.  So it creates a buffer and reads the rest of the file.
3. The data in memory at this point is unreadable.  It's also verbatim, according to several large searches against the source data.  (My free hex editor apparently doesn't support file compare for free...)
4. Start at loc_403401 and figure out the scheme!  I have a funny suspicion some magic may be in sub_402E70 though, included at the bottom.
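
Step 2 can be sketched against the header bytes posted earlier in the thread. The header length (0xC0) and field position (0xA0) come from the steps above; the field width and byte order are NOT stated there, so the 32-bit little-endian read below is an assumption, and `payload_size` is a made-up name.

```python
import struct

HEADER_LEN = 0xC0    # header length given in step 2
SIZE_OFFSET = 0xA0   # position of the size field given in step 2

def payload_size(header: bytes) -> int:
    """Read the payload size, ASSUMING a 32-bit little-endian field."""
    if len(header) < HEADER_LEN:
        raise ValueError("header too short")
    (size,) = struct.unpack_from("<I", header, SIZE_OFFSET)
    return size

# Rebuild just enough of the posted header to try it out.
header = bytearray(HEADER_LEN)
header[0:8] = b"scenario"
header[0xA0:0xA4] = bytes.fromhex("41 E4 18 00")

print(hex(payload_size(bytes(header))))
```

Read that way, the $41 $E4 bytes noticed earlier would be the low two bytes of a size value rather than a key, which would fit with the dialog text being unaffected when they were changed. The field width remains a guess.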

I haven't given the actual routine a good look yet but will tomorrow.  Unless somebody else figures it out.  I tried pretty hard to do it without the source and failed so I'm interested in what obfuscation it uses.
Title: Re: Help cracking encrypted text
Post by: Ryusui on June 15, 2011, 06:02:38 am
Quote
(My free hex editor apparently doesn't support file compare for free...)

WindHex.
Title: Re: Help cracking encrypted text
Post by: Nightcrawler on June 15, 2011, 08:56:53 am
Quote
(My free hex editor apparently doesn't support file compare for free...)

HxD (http://mh-nexus.de/en/hxd/) is a great general purpose freeware hex editor that does file compare and many other things.
Title: Re: Help cracking encrypted text
Post by: Jorpho on June 15, 2011, 11:03:24 am
I've used FrHed in the past when I've needed to compare files.  It is surprising that the feature is so often lacking from other quality editors.

Quote
Oh, an H-game.  H-games almost universally use some variant of XOR encryption.

I wonder why?

Quote
I've isolated the code in question.  It's not too large (the file i/o code is half of it and can be trimmed from the top).  I'll post some of the tips and techniques I used probably tomorrow.

Oh yes, I'd love to know how you did something like that.
Title: Re: Help cracking encrypted text
Post by: KaioShin on June 15, 2011, 01:46:49 pm
Quote
Oh, an H-game.  H-games almost universally use some variant of XOR encryption.
I wonder why?

The CG graphics are the selling point of h-games for a lot of people. There is a thriving scene in Japan that rips the CGs from newly released games and puts them online. So the developers encrypt their materials to make sure that not every day-1 buyer can just take them and send them to his friends. Of course there are people who also work on keeping up decryption tools with new releases, it's just like western games and copy protections. They know they'll get cracked anyway, but they try anyway each time.

By the way, there is a very good chance that a decryption tool already exists because of this. But don't count on repacking, and don't bother asking the authors of such decryption tools, they aren't interested. I tried :P
Title: Re: Help cracking encrypted text
Post by: Jorpho on June 15, 2011, 02:24:02 pm
Quote
The CG graphics are the selling point of h-games for a lot of people. There is a thriving scene in Japan that rips the CGs from newly released games and puts them online. So the developers encrypt their materials to make sure that not every day-1 buyer can just take them and send them to his friends. Of course there are people who also work on keeping up decryption tools with new releases, it's just like western games and copy protections. They know they'll get cracked anyway, but they try anyway each time.

That sort of makes sense, but if it's the CGs that get ripped, why encrypt the text?  And wouldn't even the most stringent encryption fail to stop someone with a proper screen-capture program from nabbing the CGs?
Title: Re: Help cracking encrypted text
Post by: KaioShin on June 15, 2011, 02:44:59 pm
Quote
why encrypt the text?

Mostly it's "why not?". They just pack everything, sounds and music too. It's not like it increases the loading times by a significant amount; other than loading and displaying text and pictures, those engines have nothing to do after all. As for screen capturing, of course that works. But it's annoying work: it requires someone to play through the game in all routes (eroge has multiple in 9 out of 10 cases), etc. It's more of a deterrent than a really effective measure. If they keep the CG pack from being available at day 0 on 2chan, I think the devs are already happy. This is just a guess on my part, but I also believe there is generally a legal component involved: most copyright laws related to digital things make circumventing copy protections illegal, not making copies itself, so they are obligated to have some form of protection, no matter how effective it is.
Title: Re: Help cracking encrypted text
Post by: Klarth on June 16, 2011, 12:54:36 am
Well, I'm stuck on this code at the moment.  The data flow just isn't making sense to me (x86 isn't my strong suit).

So as far as how I isolated the code:
1. I searched the .exe to see if there was a reference to xc2.sce (the file in question) and there was.  Which made things much easier.
2. I loaded the .exe into IDA Pro to find that string and checked the cross reference.
3. I found the routine that used the string, along with the LZ* file functions and CreateFile, so I knew I was on the right track.
4. I saw the reference to operator new and compared the allocation size against the file's length value and got a match.
5. I narrowed the routine down a bit into some loops and put a breakpoint just before it started and just after.
6. Debug the exe with IDA Pro and at each breakpoint, go into task manager, right click the exe, and dump memory with Create Dump File.  (Can do this in Vista/Win7)
7. Opened both dumps and got lucky enough to find the entire script in the post dump and the encoded version in the pre dump.
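
Once both dumps are in hand, a known snippet can be used to locate the script in each and compare the two regions. This is a rough sketch only: the tiny byte strings stand in for real dump files, the helper names are made up, and comparing by XOR assumes the per-position XOR idea from earlier in the thread actually holds for this routine.

```python
def find_all(haystack: bytes, needle: bytes) -> list:
    """Return every offset at which `needle` occurs in `haystack`."""
    hits = []
    start = haystack.find(needle)
    while start != -1:
        hits.append(start)
        start = haystack.find(needle, start + 1)
    return hits

def xor_regions(pre: bytes, post: bytes, pre_off: int, post_off: int, length: int) -> bytes:
    """XOR `length` bytes of the encoded dump against the decoded dump."""
    a = pre[pre_off:pre_off + length]
    b = post[post_off:post_off + length]
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-ins for the two memory dumps, built from a line posted in the thread.
encoded = bytes.fromhex("54 4E 28 4E 1C 07 11 10 14 59 43 27 13 10 07 0F 12 51 50 50")
decoded = b"''Goodbye, Takuya.''"
pre_dump = b"\x00" * 8 + encoded + b"\x00" * 4
post_dump = b"\xff" * 3 + decoded + b"\xff" * 5

pre_off = find_all(pre_dump, encoded)[0]
post_off = find_all(post_dump, decoded)[0]
diff = xor_regions(pre_dump, post_dump, pre_off, post_off, len(decoded))
print(pre_off, post_off, diff.hex(" ").upper())
```

With real dumps, the same two offsets would let the comparison be extended over the whole script rather than one sentence.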

I've been trying to use the trace feature with minimal results.  The data is there, I'm just not matching my head up with it yet.  IDA has features to better inspect memory than a memory dump, but I'm not familiar with how to use them.
Title: Re: Help cracking encrypted text
Post by: golden on June 16, 2011, 01:26:24 pm
I'm pretty sure I've read that someone managed to extract X-Change's resources with AnimED. If not, there are also various tools that might be useful: http://tlwiki.tsukuru.info/index.php?title=Tools.