CUE is right that this looks like a simple LZSS. You can tell because the compressed data starts out much like the decompressed file and gradually deteriorates, and because original bytes are clearly present throughout. If you could see original bytes but it was messy throughout, it might be DTE, and if it were just a mess of random noise, you would guess some kind of entropy coding (e.g. Huffman... or Huffman on top of LZSS!).
I can't debug your code for you, but it looks like it might be designed for bitwise LZSS compression. From a general computer science perspective this is more common, but video games tend to use bytewise LZSS compression. You can tell that this is bytewise because the original bytes are still legible. The difference is that bitwise LZSS puts a single flag bit in front of every code to tell the decompressor whether a literal or a compression code follows, while bytewise LZSS collects the flags for a group of codes into whole bytes, so everything stays byte-aligned and is easier to process. If it were bitwise, the data would dissolve into noise very quickly, because as soon as one flag bit shifts the stream we are "out of sync" with the byte boundaries of the original.
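To make the legibility point concrete, here is a small Python sketch that packs the same literal bytes both ways. The function names and the exact layouts (flag bit 1 means "literal", flags stored low bit first in the bytewise case) are my own assumptions for illustration, not the format of your file.

```python
def pack_bitwise(literals: bytes) -> bytes:
    """Hypothetical bitwise layout: one flag bit (1 = literal) before every byte."""
    bits = ""
    for value in literals:
        bits += "1" + format(value, "08b")   # flag bit, then the 8 data bits
    bits += "0" * (-len(bits) % 8)           # pad the tail to a whole byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def pack_bytewise(literals: bytes) -> bytes:
    """Hypothetical bytewise layout: one flag byte, then up to 8 literal bytes."""
    out = bytearray()
    for i in range(0, len(literals), 8):
        chunk = literals[i:i + 8]
        out.append((1 << len(chunk)) - 1)    # one flag bit per literal, low bit first
        out += chunk                         # the literals stay byte-aligned
    return bytes(out)


if __name__ == "__main__":
    text = b"HELLO"
    print(pack_bytewise(text).hex(" "))      # the ASCII bytes are still visible
    print(pack_bitwise(text).hex(" "))       # every byte is shifted by the flag bits
```

In the bytewise output the original bytes sit untouched after the flag byte, which is what you are seeing in your hex dump. In the bitwise output the very first flag bit already pushes everything off the byte boundary.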
Bytewise LZSS compression is frequently split up into three kinds of "codes":
literals, chunks of hex that appear verbatim in the decompressed file;
bitfields, one- or two-byte segments that tell the decompressor how to treat the next chunk of following codes;
compression codes, which tell the decompressor how far to go back and/or how long a string of hex to copy.
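For reference, this is roughly what a decompressor for one common bytewise layout looks like in Python. Everything about the layout here is an assumption for the sake of the sketch (one flag byte per 8 codes read low bit first, 1 = literal, two-byte compression codes holding a 12-bit distance and a 4-bit length); your game almost certainly differs in the details, which is exactly what you have to work out.

```python
def decompress(data: bytes) -> bytes:
    """Decode a hypothetical bytewise LZSS stream.

    Assumed layout (NOT taken from the file being discussed):
      - a flag byte (the "bitfield") describes the next 8 codes, low bit first
      - flag bit 1: the code is a single literal byte
      - flag bit 0: the code is two bytes, little-endian, with
        distance - 1 in the low 12 bits and length - 3 in the high 4 bits
    """
    out = bytearray()
    pos = 0
    while pos < len(data):
        flags = data[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(data):
                break                          # stream ended inside a flag group
            if flags & (1 << bit):
                out.append(data[pos])          # literal: copy straight through
                pos += 1
            else:
                if pos + 1 >= len(data):
                    raise ValueError("truncated compression code")
                pair = data[pos] | (data[pos + 1] << 8)
                pos += 2
                distance = (pair & 0x0FFF) + 1
                length = (pair >> 12) + 3
                if distance > len(out):
                    raise ValueError("distance points before start of output")
                for _ in range(length):        # byte by byte, so overlaps work
                    out.append(out[-distance])
    return bytes(out)


if __name__ == "__main__":
    # flag byte 0x07: three literals, then one compression code
    sample = bytes([0x07, 0x41, 0x42, 0x43, 0x02, 0x30])
    print(decompress(sample))
```

Copying one byte at a time matters: a compression code is allowed to reach into bytes it is itself producing, which is how a short pattern gets repeated many times.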
Someone else may have a better way, but generally I have attacked these schemes like ciphers. For example, here's my attempt to parse your file:
On the left is the compressed version, and on the right is the decompressed version. It can be hard at first, and you'll need to start just by separating it into "literals" and "non-literals". (In fact, examining this, it's clear that the last three lines should be "bitfield compressed literal", not "compressed bitfield literal".) But with this you can start to guess at what your bitfield codes mean.
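If you'd rather have a script take the first pass at the literal / non-literal split, something like the following Python sketch can help. It uses the standard library's difflib to find runs of bytes that appear in the same order in both files; the function name and the min_run threshold are just my choices, and the result is only a guess that you still need to check by eye, since a bitfield or compression byte can match the decompressed data by coincidence.

```python
from difflib import SequenceMatcher


def mark_literals(compressed: bytes, decompressed: bytes, min_run: int = 2):
    """Split the compressed data into (offset, chunk, probably_literal) segments.

    Runs of at least min_run bytes that occur in the same order in both
    files are flagged as probable literals; everything in between is left
    as "non-literal" (bitfields and compression codes) to puzzle out by hand.
    """
    matcher = SequenceMatcher(None, compressed, decompressed, autojunk=False)
    segments = []
    pos = 0
    for block in matcher.get_matching_blocks():
        if block.size < min_run:
            continue                           # too short to trust (or the end marker)
        if block.a > pos:
            segments.append((pos, compressed[pos:block.a], False))
        segments.append((block.a, compressed[block.a:block.a + block.size], True))
        pos = block.a + block.size
    if pos < len(compressed):
        segments.append((pos, compressed[pos:], False))
    return segments


if __name__ == "__main__":
    comp = bytes([0x07, 0x41, 0x42, 0x43, 0x02, 0x30])
    for offset, chunk, literal in mark_literals(comp, b"ABCABCABC"):
        kind = "literal" if literal else "non-literal"
        print(f"{offset:04X}  {chunk.hex(' '):<12} {kind}")
```

The non-literal segments are the interesting part: line them up against how many decompressed bytes each one has to account for, and patterns in the bitfields usually start to show.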
If you're comfortable with assembly, examining the decompression routine in action with a debugger might be faster... but to be honest I kind of like puzzling it out this way.
My impression is that the compression code just tells the decompressor how far back to go ("distance"), not how long to copy ("length"), so the "length" may be buried in the bitfields.
If that's confusing, sorry. It's just a complicated subject. But maybe that will give you some basis to go on.