Curious about a compressed file...

Started by irvgotti452, January 18, 2017, 12:53:55 AM


irvgotti452



Now I've read a few things, but I'm a bit of a brute when it comes to learning about compression and packaged files. Anybody care to educate me a bit? I'm curious to know what I can find in these files I have. Since I can edit this one, I think it's safe to assume that this one is not compressed?

BlackDog61

Completely wild guess: it may be "just an archive of files" (Files ARChive?) where all of the files you edit have been grouped as one, with a kind of table of contents at the start of the file. Just like "tar" does?

henke37

Looks like a fairly simple table. It stores the size and offset of each file.

irvgotti452

Quote from: BlackDog61 on January 18, 2017, 03:05:45 AM
Completely wild guess: it may be "just an archive of files" (Files ARChive?) where all of the files you edit have been grouped as one, with a kind of table of contents at the start of the file. Just like "tar" does?
Yea it looks exactly like that. Do you mean ".tar"? or is that short for something else?

Quote from: henke37 on January 18, 2017, 03:28:15 AM
Looks like a fairly simple table. It stores the size and offset of each file.
Ah, how would I be able to tell the offset and size?



Sooo... red is offset, blue is size? And also, how do I know where each line of data starts and how long each value is (e.g. 2 bytes, 4 bytes)?

Really tryna learn how to unpack the file too.

STARWIN

offset 00 08 00 00 -> 800
size B0 B3 0A 00 -> AB3B0

offset 00 C0 0A 00 -> AC000
size 00 41 00 00 -> 4100
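
The conversions above are little-endian 32-bit reads: the lowest-order byte is stored first, so you read the four bytes backwards. A small Python sketch of the same arithmetic (the helper name le32 is made up for illustration):

```python
def le32(hex_bytes: str) -> int:
    """Interpret four hex bytes (as shown in a hex editor) as a little-endian unsigned 32-bit value."""
    raw = bytes.fromhex(hex_bytes)  # spaces between the byte pairs are ignored
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, "little")

# The four values from the table, in the order they appear in the file.
for label, raw in [("offset", "00 08 00 00"), ("size", "B0 B3 0A 00"),
                   ("offset", "00 C0 0A 00"), ("size", "00 41 00 00")]:
    print(label, raw, "->", format(le32(raw), "X"))
```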

irvgotti452

Quote from: STARWIN on January 18, 2017, 09:42:45 AM
offset 00 08 00 00 -> 800
size B0 B3 0A 00 -> AB3B0

offset 00 C0 0A 00 -> AC000
size 00 41 00 00 -> 4100

Thank you very much. I'm starting to understand this. So for the first one with the size of AB3B0 converted to decimal 701360 would be the size in bytes?

BlackDog61

Quote from: irvgotti452 on January 18, 2017, 11:30:45 AM
So for the first one with the size of AB3B0 converted to decimal 701360 would be the size in bytes?
Yes.
Do note that "offset of line 1" + "size of line 1" = 0xABBB0 <= "offset of line 2". This seems to be because offsets have to start at a multiple of 0x800. (This typically happens when data needs to be aligned with something else, such as disk sector sizes.)
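
To check the alignment guess with numbers instead of a calculator, here is a minimal Python sketch. It assumes the alignment really is 0x800 (that is only what the two entries seen so far suggest), and align_up is a made-up helper name:

```python
ALIGN = 0x800  # assumed alignment, inferred from the two table entries seen so far

def align_up(value: int, align: int = ALIGN) -> int:
    """Round value up to the next multiple of align (unchanged if already aligned)."""
    return (value + align - 1) // align * align

end_of_file_1 = 0x800 + 0xAB3B0      # offset of line 1 + size of line 1
print(hex(end_of_file_1))            # 0xabbb0
print(hex(align_up(end_of_file_1)))  # 0xac000, which is the offset of line 2
```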

STARWIN

15 00 00 00 in the header is the number of files (0x15 = 21)
00 08 00 00 happens to match the multiplier (0x800)

irvgotti452

Quote from: BlackDog61 on January 18, 2017, 11:48:38 AM
Yes.
Do note that "offset of line 1" + "size of line 1" = 0xABBB0 <= "offset of line 2". This seems to be because offsets have to start at a multiple of 0x800. (This typically happens when data needs to be aligned with something else, such as disk sector sizes.)
Quote from: STARWIN on January 18, 2017, 02:19:32 PM
15 00 00 00 in the header is the amount of files
00 08 00 00 happens to match the multiplier
Is this for every file or is it indicated in the header?

STARWIN

You just look around and guess stuff. Keep a calculator in hexadecimal mode open and check if the numbers make sense.
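
If you would rather script the "do the numbers make sense" check than do it by hand, something like this rough Python sketch works. It assumes the entries are stored in ascending order and that offsets are multiples of 0x800; both are guesses from this thread, and check_toc is a made-up name:

```python
def check_toc(toc, align=0x800, archive_size=None):
    """Return a list of human-readable problems found in a list of (offset, size) entries."""
    problems = []
    prev_end = 0
    for i, (offset, size) in enumerate(toc):
        if offset % align:
            problems.append(f"entry {i}: offset {offset:#x} is not a multiple of {align:#x}")
        if offset < prev_end:
            problems.append(f"entry {i}: starts at {offset:#x}, before the previous entry ends at {prev_end:#x}")
        if archive_size is not None and offset + size > archive_size:
            problems.append(f"entry {i}: ends at {offset + size:#x}, past the end of the archive")
        prev_end = offset + size
    return problems

# The two entries decoded earlier in the thread.
toc = [(0x800, 0xAB3B0), (0xAC000, 0x4100)]
print(check_toc(toc) or "table looks consistent")
```

An empty result does not prove the guess is right, it only means nothing contradicts it yet.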

The main tool that answers questions is the debugger emulator, if you don't like guessing.

irvgotti452

Quote from: STARWIN on January 18, 2017, 06:13:33 PM
You just look around and guess stuff. Keep a calculator in hexadecimal mode open and check if the numbers make sense.

The main tool that answers questions is the debugger emulator, if you don't like guessing.

What do I look for on the debugger?

BlackDog61

Quote from: irvgotti452 on January 19, 2017, 12:26:42 AM
What do I look for on the debugger?
It's more complex than guessing, in my opinion.
You have to find at which time the game uses this data, and then look at the assembly that does it (stepping in the debugger) to understand what it does.

If you guess, you have to verify that you guessed right, most of the time by modifying / corrupting data and checking that the game goes as expected by your guesses. (Crashing or infinite locking are not typically useful states to validate hypotheses. Making the game load a different graphic, making it use different statistics, or whatever visible is much better.)

You can also mix "guess verification" with "looking at the debugger" if you don't want to / cannot get a visible result.

STARWIN

Yeah, this is not a very good question for a debugger to answer. For example, if the game never uses this file, that would be difficult to prove. But the issue is with the question itself, which is somewhat arbitrary: you wouldn't necessarily prove it by changing the file either, as it could be used only in some special case.

If you want to figure out something in the game, you can use a debugger at that game state to answer detailed questions (and track things back to files, for example). Or, without using a debugger, look for files that have the same pattern as the game state you are interested in. That is a much more productive way of thinking IMO. Of course, you need to understand at least some asm, breakpoint use, etc. in order to use the debugger, if that sounds like the interesting way.

irvgotti452

You guys are so helpful. Thanks.
Alright, looks like I'll have a lot of reading to do to learn some ASM. But in the meantime, since I have this information, I'll be able to extract files properly. I'll do it via hex since that's where I'm at skill-wise lol.

Also, P2IGa in a header, anybody have an idea what kinda graphics file this is?

flame

Did you try googling it? If you did, say you did, post the Google link even.
Time to say the game you're working on.

Unpacking a simple archive file like this is REALLY simple. Rebuilding is pretty simple too.
Read ID string (4 bytes) and make sure it matches FARC
Seek to 0x8, read 4 bytes (little-endian) -> num_files
Make empty list
Seek to 0x10
For 0 to num_files - 1:
    Read offset and size (4 bytes each, little-endian), append to list
For offset, size in list:
    filedata = the bytes from offset to offset + size
    name your file something (0.bin, 1.bin, etc...) and then write it using filedata

Your archive does not have file names so you will need to refer to them by file number throughout your project.

I write in Python. Maybe it's less lines in Python than it is in pseudocode, lol.
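import struct

filename = 'myfile.bin'  # replace with the name of your archive
with open(filename, 'rb') as f:
    # The archive must start with the 4-byte ID string.
    if f.read(4) != b'FARC':
        raise Exception('ID string does not match "FARC".')
    # The number of files is a little-endian 32-bit integer at 0x8.
    f.seek(0x8)
    num_files = struct.unpack('<I', f.read(4))[0]
    # The table of contents starts at 0x10: one (offset, size) pair per file.
    f.seek(0x10)
    TOC = []
    for x in range(num_files):
        TOC.append(struct.unpack('<II', f.read(8)))
    # Copy each file's bytes out to 0.bin, 1.bin, ...
    for i, (offset, size) in enumerate(TOC):
        f.seek(offset)
        with open('{}.bin'.format(i), 'wb') as g:
            g.write(f.read(size))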
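
Going the other way (rebuilding) could look roughly like the sketch below. It is untested against the real game and rests on guesses from this thread: data offsets are multiples of 0x800, padding is zero bytes, the last file is padded too, and the header fields nobody has identified (at 0x4 and 0xC) can be copied unchanged from the original archive. build_archive and align_up are made-up names.

```python
import struct

ALIGN = 0x800     # assumed: every data offset is a multiple of 0x800
TOC_START = 0x10  # the (offset, size) table starts here, as in the unpacker

def align_up(value, align=ALIGN):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align

def build_archive(original_header: bytes, files: list) -> bytes:
    """Rebuild an archive image from the original 16-byte header and a list of file contents."""
    if len(original_header) != TOC_START or original_header[:4] != b'FARC':
        raise ValueError('expected the 16-byte header of a FARC archive')
    header = bytearray(original_header)
    struct.pack_into('<I', header, 0x8, len(files))  # file count lives at 0x8
    toc = bytearray()
    body = bytearray()
    data_start = align_up(TOC_START + 8 * len(files))  # first aligned offset after the table
    for data in files:
        toc += struct.pack('<II', data_start + len(body), len(data))
        body += data
        body += b'\x00' * (align_up(len(body)) - len(body))  # assumed zero padding
    out = bytes(header) + bytes(toc)
    out += b'\x00' * (data_start - len(out))  # pad between the table and the first file
    return out + bytes(body)
```

To use it, read the first 16 bytes of the original archive as original_header, read 0.bin, 1.bin, ... in numeric order into the files list, and write the returned bytes to a new file. Keep a backup of the original, since the game may check things this sketch does not know about.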


It would be even clearer if you wrote it in QuickBMS script (special programming language for this kind of problem), but I don't know how to write those.

Jorpho

I made something like this for the PC version of Megaman X4 some time ago.
http://www.romhacking.net/forum/index.php/topic,20753.msg291681.html#msg291681

Of course, in that case it was really easy because the "archive" was just a bunch of WAV files strung together, and every WAV file starts with "RIFF".  You may be able to exploit a similar consistency in the case of this archive, if every file within has a consistent header.
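
For reference, the signature trick can be sketched in a few lines of Python. This is not the tool from the linked post, just a minimal illustration; find_all and split_on_signature are made-up names, and a signature that also occurs by chance inside a file's data would produce false splits.

```python
def find_all(data: bytes, signature: bytes) -> list:
    """Return every offset at which signature occurs in data."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets

def split_on_signature(data: bytes, signature: bytes) -> list:
    """Cut data into chunks, each starting at one occurrence of signature."""
    starts = find_all(data, signature)
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]

sample = b'RIFFaaaaRIFFbbRIFFc'
print(find_all(sample, b'RIFF'))            # [0, 8, 14]
print(split_on_signature(sample, b'RIFF'))  # [b'RIFFaaaa', b'RIFFbb', b'RIFFc']
```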
This signature is an illusion and is a trap devisut by Satan. Go ahead dauntlessly! Make rapid progres!

irvgotti452

I love the wealth of knowledge that is here. Thanks for your replies.

Quote from: flame on January 20, 2017, 01:25:55 PM
Did you try googling it? If you did, say you did, post the Google link even.
Time to say the game you're working on.
The only game I'll be working on for the next couple of months is Namco X Capcom (http://www.romhacking.net/forum/index.php/topic,23187.0.html). Oh, I've tried looking everywhere for months now. This was the closest I ever came to an answer: http://forum.xentax.com/viewtopic.php?f=16&t=5460&p=45365&hilit=+namco+x+capcom+#p45365.

Quote
It would be even clearer if you wrote it in QuickBMS script (special programming language for this kind of problem), but I don't know how to write those.
Through searching I have found this program. It is great, I used the premade scripts to browse some other games.

Quote from: Jorpho on January 20, 2017, 10:38:41 PM
I made something like this for the PC version of Megaman X4 some time ago.
http://www.romhacking.net/forum/index.php/topic,20753.msg291681.html#msg291681

Of course, in that case it was really easy because the "archive" was just a bunch of WAV files strung together, and every WAV file starts with "RIFF".  You may be able to exploit a similar consistency in the case of this archive, if every file within has a consistent header.

I've learned a lot so far but there is still a lot I don't know so please bear with me. As far as the code you fine folks posted, how would I go about putting it into use? My main goal in this thread is to be able to unpack -> identify image file -> edit image file -> pack it back up.

So far I've managed to do smaller images in the game via TileMolester and Photoshop but now it's to the point where TileMolester isn't helping.

BlackDog61

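Save Flame's code into "decompress.py".
Replace "myfile.bin" with your file name, of course.
Install Python / make sure you have it on your machine.
In a command line (on Windows: windows-x, cmd), go to the folder that contains both the script and your file, then run "python decompress.py". The extracted files (0.bin, 1.bin, ...) are written to that same folder.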
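
Once the N.bin files exist, a quick first step towards identifying them is to print the first few bytes of each one, so that recurring markers (like the "P2IGa" mentioned above) stand out. A rough Python sketch, assuming the files sit in the current folder; leading_bytes and survey are made-up names:

```python
from pathlib import Path

def leading_bytes(path, count=8):
    """Return the first count bytes of the file at path."""
    with open(path, 'rb') as f:
        return f.read(count)

def survey(folder='.', pattern='*.bin', count=8):
    """Map each matching file name in folder to its first count bytes."""
    return {p.name: leading_bytes(p, count) for p in sorted(Path(folder).glob(pattern))}

# Print hex and raw views of each file's first bytes.
for name, head in survey().items():
    print(name, head.hex(), head)
```

Files that share the same leading bytes are probably the same format, which narrows down what to search for.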