
Author Topic: Opening the Guts of Corpse Party 2U: A Diary  (Read 13516 times)

retrocombine

  • Newbie
  • *
  • Posts: 4
    • View Profile
Opening the Guts of Corpse Party 2U: A Diary
« on: November 20, 2013, 06:45:20 pm »
Hello. I'm interested in translating and possibly patching Corpse Party: Sachiko's Game of Love ♥ Hysteric Birthday 2U (or just Corpse Party 2U). I have translated from Japanese to English before, and consider myself intermediate in the language. Unfortunately, I don't think anyone has managed to access the game's script or assets yet, so I took it upon myself to figure out how. I have little background in computer science and haven't done romhacking before, so... this will be an adventure.

First thing I did was find a tool to extract the files inside the .iso. This took a while - as far as I can tell, there's no real alternative to UMDGen on OSX except Prometeus. Prometeus itself, well...

Quote
dyld: Library not loaded: /usr/local/lib/libusb-0.1.4.dylib
  Referenced from: /Users/rebecca/Downloads/Prometeus_v0.9.8B/Prometeus.app/Contents/MacOS/Prometeus
  Reason: no suitable image found.  Did find:
   /usr/local/lib/libusb-0.1.4.dylib: mach-o, but wrong architecture
   /usr/local/lib/libusb-0.1.4.dylib: mach-o, but wrong architecture
Trace/BPT trap: 5

It doesn't like libusb-1.0 and refuses to use it, and even then something in the architecture has changed so that not even libusb-0.1.4 works. Down goes that idea. I finally give up and just try and make a Wineskin out of UMDGen - and it works!



Hallelujah. Right off the bat I notice something interesting: KANDICT.BIN. This is relevant to my interests - after speaking with DarkHamsterlord, the main issue with unpacking the DATA0.CPK file is something about "not a @UTF table at 16". Could the difficulty in unpacking the CPK be related to its encoding, and could we use KANDICT to fix that? Or am I on the wrong track?

Let me know if I'm doing anything wrong re: this topic.

November 20, 2013, 07:45:23 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
Well, I've got another problem. I need a tool to unpack CPK files, but QuickBMS doesn't play well with WINE, and even with a working tool there's still the "not a @UTF table at 16" issue. Looking at others with the same issue, it has something to do with encryption. Apparently, there is a tool called "cpk_unpack", but it's distributed as an .exe. Luckily, there's source code and a Makefile, so I managed to build a binary of it anyway. Trying to run it gives me...

Quote
Retros-MacBook-Pro:USRDIR retro$ /Users/retro/Downloads/utf_tab07b3/cpk_unpack DATA0.CPK
cpk_unpack 0.7 beta 3

utf_tab.c:461:query_utf_nofail: didn't find valid @UTF table where one was expected

...nothing. It looks like the same issue as "not a @UTF table at 16".

What is "16" referring to in this case? Is it some sort of memory address? I can open the file up in a hex editor, here's what it looks like (url for looooooong). How do I figure out what the @UTF table is? Once I figure out what it is, how do I use it? What am I even doing? I am so lost.
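One way to test a guess about the "16": a minimal Python sketch, assuming it is a byte offset from the start of the file (0x10 in a hex editor) rather than a memory address, and that an unencrypted archive would have the ASCII tag "@UTF" at that position. The function name `has_utf_table` is made up for illustration.

```python
import io

def has_utf_table(stream, offset=16):
    """Return True if the four bytes at `offset` are the ASCII tag '@UTF'.

    Assumption: the "16" in the error message is a file offset.
    """
    stream.seek(offset)
    return stream.read(4) == b"@UTF"

# Hypothetical usage on the real archive:
# with open("DATA0.CPK", "rb") as f:
#     print(has_utf_table(f))

# Self-contained demo on fake data: a 16-byte header followed by the tag.
fake_plain = io.BytesIO(b"CPK " + bytes(12) + b"@UTF" + bytes(8))
fake_scrambled = io.BytesIO(b"CPK " + bytes(12) + b"\x1f\x9e\x12\x77")
print(has_utf_table(fake_plain), has_utf_table(fake_scrambled))
```

If the check fails on the real file, that would be consistent with the table being encrypted, which is what others with the same error seem to have run into.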

Anyway, is there a good CPK unpacker tool for OSX/Linux? Preferably something I can build on Mavericks without it screaming about a missing .h file or something.

UPDATE: Apparently there's a "special" version of utf_tab. Running it on DATA0.CPK gives me...

Quote
Retros-MacBook-Pro:USRDIR retro$ /Users/retro/Downloads/utf_tab07b5_special/cpk_crypt DATA0.CPK
cpk_crypt 0.7 beta 5 for over.cpk

s=5f m=15

I assume this is some sort of decryption key? Looking at the code for cpk_crypt suggests that this line doesn't get printed unless the file is actually encrypted. s is "xor_bytes", m is "(unsigned int)mult".
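Going only by the variable names, here is a rough Python sketch of what a key like this could mean. It assumes (not confirmed against the cpk_crypt source) that s is the starting XOR byte and that the key is multiplied by m after every byte, keeping only the low 8 bits. The name `xor_stream` is hypothetical.

```python
def xor_stream(data, seed=0x5F, mult=0x15):
    """XOR each byte with a rolling one-byte key.

    Assumption: key starts at `seed` and becomes (key * mult) & 0xFF
    after every byte. XOR is its own inverse, so the same call both
    scrambles and unscrambles.
    """
    key = seed
    out = bytearray()
    for b in data:
        out.append(b ^ key)
        key = (key * mult) & 0xFF
    return bytes(out)

scrambled = xor_stream(b"@UTF")
print(scrambled.hex(), xor_stream(scrambled))
```

Since the operation is symmetric, running it over a scrambled table header should give back "@UTF" if the scheme and the key are right.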

Maybe if I unpack it now...

Quote
Retros-MacBook-Pro:USRDIR retro$ /Users/retro/Downloads/utf_tab07b5_special/cpk_unpack DATA0.cpk
cpk_unpack 0.7 beta 5 for over.cpk

/BGL00.CPK 0x4ab800 72220148
/BGS00.CPK 0x2d76c000 8378856
/CHL00.CPK 0x498b800 45722352
/CHS00.CPK 0x2df6a000 11448372
/CONFIG.CPK 0x2000 36880
/DATA0.CLS 0x1800 163
/FONTS.CPK 0xb800 1212708
/ICON.CPK 0x134000 235863
/OBJAB.CPK 0x2d6b3800 753908
/OBJSY.CPK 0x16e000 2797712
/SCENE00.CPK 0x419800 596094
/SE00.CPK 0x2114d000 53618688
/SE01.CPK 0x2446f800 153370624
/VO00.CPK 0x7526800 207794350
/VO01.CPK 0x13b52000 224374039

We have liftoff! :crazy: I got somewhere, at least. That said I have a bunch more CPKs now...I'm willing to bet that SE and VO are SFX and voice-overs, respectively, but some of the rest are a mystery.
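The listing above looks like "name, hex offset, size in bytes" (that reading of the columns is an assumption). A small Python sketch of parsing it and cutting one entry out of the archive by hand, assuming entries are stored uncompressed; `parse_listing` and `carve` are hypothetical helper names:

```python
import io

def parse_listing(text):
    """Turn lines like '/CONFIG.CPK 0x2000 36880' into (name, offset, size)."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or not parts[0].startswith("/"):
            continue  # skip banner lines and blanks
        name, offset, size = parts
        entries.append((name.lstrip("/"), int(offset, 16), int(size)))
    return entries

def carve(stream, offset, size):
    """Read `size` bytes starting at `offset` from an open binary stream."""
    stream.seek(offset)
    return stream.read(size)

listing = """cpk_unpack 0.7 beta 5 for over.cpk

/CONFIG.CPK 0x2000 36880
/DATA0.CLS 0x1800 163
"""
for name, offset, size in parse_listing(listing):
    print(name, hex(offset), size)
```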

Anyway, unpacking everything gives me a shitload of binaries. Only one set is immediately obvious - ICON.CPK's .bins look like this:




Yep, they're good ol' PNGs. Got the right header information and everything. These look like the system menu logos for the game or something.

Some of the other bins are TIM2 files. Using Game Graphic Studio, I can open them up and see them (but I don't know how to batch convert them all :( ). The SFX and VOs are .ADX, I think, and I think I'll skip over those for now.
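Sorting the .bins by their first few bytes can be scripted. This is a minimal Python sketch; the PNG signature is the standard one, while treating "TIM2" as the TIM2 tag and a leading 0x80 0x00 as a sign of ADX are assumptions worth checking against a few known files (the ADX check in particular is only two bytes, so it is a weak hint).

```python
MAGICS = [
    (b"\x89PNG\r\n\x1a\n", "png"),   # standard 8-byte PNG signature
    (b"TIM2", "tim2"),               # assumed TIM2 tag
    (b"\x80\x00", "adx"),            # assumed ADX start, weak two-byte hint
]

def sniff(data):
    """Guess a file type from the first bytes of `data`."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"

print(sniff(b"\x89PNG\r\n\x1a\n" + bytes(16)))
print(sniff(b"TIM2" + bytes(16)))
print(sniff(bytes(16)))
```

Reading only the first 16 bytes or so of each file is enough for this, so it stays fast even on the big voice archives.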

Now, the really important part is SCENE00.CPK. Its BINs should contain the script for the game! (i.e., the text.) Unfortunately, it's garbled nonsense when I open it up in the hex editor. I don't see anything that looks remotely like text here.

I assume I'm supposed to extract text from these bins, but how? Anybody know?
« Last Edit: November 20, 2013, 10:35:50 pm by retrocombine »

cj iwakura

  • Full Member
  • ***
  • Posts: 240
  • The Rhythm Rogue
    • View Profile
    • iwakura.productions
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #1 on: November 21, 2013, 12:38:42 am »
I would love to play this and wish you luck. I could even help with editing if it gets that far.


Does the game ever turn horrific (I kind of suspected it's a giant setup), or is it really just a comedy?

蒼く咲く華 日は灯り 天に流れる | Kill The Past

DSwizzy145

  • Sr. Member
  • ****
  • Posts: 427
  • Super Famicom Fanatic
    • View Profile
    • Super Famicom Game List A-C + SNES Game List
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #2 on: November 21, 2013, 10:32:36 am »
That reminds me, I need to get Corpse Party Book of Shadows before you guys finish the translation :) I wish you the best of luck with this project! P.S. Have you guys ever played Criminal Girls?

retrocombine

  • Newbie
  • *
  • Posts: 4
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #3 on: November 21, 2013, 05:21:51 pm »
So I'm stuck on trying to extract the script from SCENE00.CPK. The output is a collection of bins that, when I open them up in the hex editor, don't really contain any text. Obviously, any text would be in Japanese, but I'm not sure how to convert these files to the right format. Has anyone done script extraction from a PSP game before?

Quote
Does the game ever turn horrific (I kind of suspected it's a giant setup), or is it really just a comedy?

It's horrific in a...very different way. Suffice to say that it's a very Japanese game.

esperknight

  • Full Member
  • ***
  • Posts: 130
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #4 on: November 23, 2013, 12:24:52 pm »
I was curious so took a peek :D  The bins that are extracted from SCENE00.cpk do contain all your text (or at least quite a bit).  Not sure what hex editor you're using but I recommend MadEdit as that supports SJIS natively by setting the encoding for it.  From looking at it, extraction should be pretty easy as all the pointers are aligned to 4 bytes.

The real problem is avoiding false positives...  For that it looks like if you start at the end of the file and scan backwards every two bytes for 0x0D02 then you'll find the start of the text block.  From there you can go back to the beginning of the file, step through it 4 bytes at a time, and treat a value as a valid pointer only if it points past that position.
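The backwards scan could look something like this in Python. It is only a sketch of the idea described above: it assumes the marker shows up as the bytes 0D 02 in file order, and it returns the offset of the marker itself. `find_text_start` is a made-up name.

```python
def find_text_start(data):
    """Scan backwards from the end, two bytes at a time, for 0D 02.

    Returns the offset of the marker, or None if it never shows up.
    """
    for pos in range(len(data) - 2, -1, -2):
        if data[pos:pos + 2] == b"\x0d\x02":
            return pos
    return None

# Fake file: 8 bytes of "pointer table", the marker, then some text bytes.
sample = bytes(8) + b"\x0d\x02" + b"ab\x00\x00"
print(find_text_start(sample))
```

Because the scan runs from the end, it stops at the last aligned 0D 02 in the file, so it is worth eyeballing the result in a hex editor for a couple of bins before trusting it.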

Far as insertion, I'd recommend just sticking with SJIS as it looks like it may use only 2 byte reads since even the ASCII is 2 byte aligned (using % as an indicator).  Plus it's trivial to insert using Atlas that way and I tend to do it if I don't have to worry about space.   Course if it doesn't use a VWF for ASCII well.. that may be an issue. But won't tell 'till you try it ;)

Feel free to hit me up if you need anything :)  I can't say I'm quite familiar with the PSP but I've looked at a few games on there.  Far as debugging and all that though, I haven't really tried and mostly just got lucky in that the games I've checked into have been easy enough to work with without special tools (this is the first I've seen of the CPK format so appreciate you posting about your findings as I'll keep this in mind for others :D )

retrocombine

  • Newbie
  • *
  • Posts: 4
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #5 on: December 26, 2013, 11:31:39 pm »
Struck gold! Opening 24.bin in a hex editor that supports Shift-JIS gets me the game script in more-or-less plaintext. That's about where I'm stuck, though. Judging by the text, it doesn't seem like all the lines are in order - the conversations are all jumbled up. Maybe you have to piece the files together or something?

What's also odd is that certain .BINs only have a line or two of code, while some others are over 200kb each. 14.bin in particular has this:

Quote
DATA.INIT.SHORTCUT.DICT.DMENU.DMENU2.DMENU4.DMENU5.
DMENU6.DBG00.DBG01.DBG02.DBG03.
CLRFLG.MAIN00.MAIN90.CP2U00A.CP2U00B.CP2U01.CP2U02.CP2U03.
CP2U04.CP2U05.CP2U06.CP2U07.CP2U08.CP2UEX.CP2UNOTE.MES_CHK.MES_CHK.MES_CHK. MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.MES_CHK.
MES_CHK.MES_CHK.MES_CHK.MES_CHK

I don't think this is as simple as just dumping the whole thing and tada, script. Other bits also mention "return to main menu" and "clear list". As far as I know, all I can tell is "it's in Shift-JIS".

edit: I did manage to get a dump! Opening it as hexadecimal in Sublime Text seems to get me the right code, but it's still encoded as Shift-JIS and I'd like to convert it to UTF8 or something. Anyway, is it still bad if I share one of the files so I can get another opinion?

edit 2: Yeah, still stuck on how to convert this to (from?) Shift-JIS, or at least figure out how to make this bunch of hexadecimal into readable Japanese text. Any help?
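For turning raw Shift-JIS bytes into readable text, Python's built-in codecs should be enough. A minimal sketch, assuming plain "shift_jis" is the right codec for these files (Python also ships "cp932", "shift_jis_2004" and "shift_jisx0213" if it is not); `sjis_to_utf8` is a hypothetical helper name.

```python
def sjis_to_utf8(raw, codec="shift_jis"):
    """Decode Shift-JIS bytes and re-encode them as UTF-8.

    Undecodable bytes become U+FFFD instead of raising, which helps when
    the input still has binary junk mixed in with the text.
    """
    text = raw.decode(codec, errors="replace")
    return text.encode("utf-8")

raw = "こんにちは".encode("shift_jis")
print(raw.hex())
print(sjis_to_utf8(raw).decode("utf-8"))
```

Note this works on the actual bytes of the file (opened with mode "rb"), not on a hex dump pasted into a text editor.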
« Last Edit: December 26, 2013, 11:47:54 pm by retrocombine »

esperknight

  • Full Member
  • ***
  • Posts: 130
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #6 on: December 27, 2013, 11:27:10 pm »
You don't need to convert it to UTF8 to make it usable.  Depending on your text editor (I use Editpad Lite) it can display it in SJIS no problem.  Then again I should mention I have Japanese language pack installed too which helps (Edit: I realized you're on a Mac, so this may not help you...).  And as I mentioned before, MadEdit supports SJIS as well and it's a pretty good hex editor.  But honestly it's up to you.  I don't know how to do it in C++ as I've never bothered with unicode there but I know C# can easily convert to and from UTF8 (internally its strings are actually UTF16, I believe).

Far as extraction goes : http://www.mediafire.com/download/43ti4aqa44mkyae/cp_birthday_script_extract.rar

I included the scripts and my code.  It's coded in C# as I prefer it when doing text manipulation but I could easily write it in C++ if you'd like.  Should be compilable with Mono as I'm not using anything fancy.  But let me know and I can help you out.

From what I saw the pointers are a wee bit funky.  Some are 2 bytes and some 4 even if the file may not be that large (like 02.bin).

Far as out of orderness and all, not surprising at all.  Most games don't store their strings in order.  Now this may be extracted in order as I'm thinking it would be using a scripting engine since I believe it's a VN right?  So they might be...

Also, this isn't made for insertion at all. I'd normally use Atlas for that as I'm lazy. Plus I'd make sure none of the strings duplicated and all that.  This is just a quick one to show how I'd extract it.  I can post other code to show how I'd set it up for insertion with Atlas.

Also, some scripts are probably just scenes only with no text or what not or even just possibly menu scripts.  Never know.  But you may be right, it could be more difficult but one can hope :)  (The difficult part I think would be rebuilding the cpks and all but I've never done it before).

Let me know if you have any more questions or anything.  Hope this helps! :)

« Last Edit: December 28, 2013, 10:45:47 am by esperknight »

retrocombine

  • Newbie
  • *
  • Posts: 4
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #7 on: January 15, 2014, 12:34:49 pm »
Xamarin works just fine for opening C# files. I'm used to Python, but I think I understand what you're doing anyway. The particulars of iterating through a .bin and reading it byte by byte are new to me, though. Let me try and read through what your script does to see if I understand it...



First, you define the source directory.

Then, for each file in that source directory, you create a new BinaryReader object called bin.
This is used to "read primitive data types as binary values in a specific encoding". It defaults to UTF-8.
Then, if I understand correctly, you read backwards from the end, two bytes at a time? (Seek(-2, SeekOrigin.End))
While the read position is greater than or equal to 0, you read forward two bytes as a single 16-bit value, and if that value is equal to "0x020D" (the start position? how did you determine this?), then this position is set as the text start position and the while loop is broken out of.
Else, move backwards four bytes (since you went back two and went forward two).

Create a list of pointers in UInt32.
Start reading from the beginning, and while the current pointer is less than the text_start (what does this mean?), iterate through each pointer.
If the current position (is that an index or a value?) is less than 0x250 (what is this?), set break_me to 0 (this variable only shows up once and is never used).
From the current position, read a 4-byte unsigned integer (UInt32) as a pointer.
If the pointer is greater than or equal to text_start AND less than the total length of the stream, add it to the list.
Else, repeat the condition for the pointer AND 0xFFFF together. If they both pass, add both the pointer and 0xFFFF to the list. Then, go back two bytes.
Else, go back two bytes.
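As a cross-check of the pointer-gathering steps just described, here is how they might read in Python. This is an interpretation, not a port of the C# code: it assumes little-endian 32-bit reads, advances 4 bytes after a good 32-bit pointer and 2 bytes otherwise, and stops before a read would cross into the text block. `collect_pointers` is a hypothetical name.

```python
import struct

def collect_pointers(data, text_start):
    """Gather values in the pre-text area that point into the text block."""
    pointers = []
    pos = 0
    while pos + 4 <= text_start:
        (value,) = struct.unpack_from("<I", data, pos)
        if text_start <= value < len(data):
            pointers.append(value)       # plain 4-byte pointer
            pos += 4
            continue
        low = value & 0xFFFF             # retry with only the low 2 bytes
        if text_start <= low < len(data):
            pointers.append(low)
        pos += 2                         # net move of 2 bytes, then retry
    return pointers

# Fake file: one 4-byte pointer (10), one 2-byte pointer (12) followed by
# FF FF, then an 8-byte "text block" starting at offset 8.
sample = struct.pack("<I", 10) + struct.pack("<HH", 12, 0xFFFF) + b"\x0d\x02" + bytes(6)
print(collect_pointers(sample, text_start=8))
```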

When the while loop is complete, instantiate a StreamWriter for encoding the pointers and writing to text.
For each pointer in the list, Seek "ptr - 1"(?) from the beginning.
Read the byte to check whether it is text, and if the byte is equal to 0x00 OR 0x02:
If it hasn't been done already, set the StreamWriter instantiation to encode the bytes in ShiftJIS, and write to a file called "(original filename).txt".
Then, read the byte to check whether it is 00 (i.e. whether or not it is empty). Keep track of the current length. While it is not empty, increment the length counter and check whether the next byte is 00.

Once the byte is 00, check to see if the length is greater than 0. If so, return to the beginning, find the pointer, encode it in ShiftJIS, then write to the file the pointer, then the ShiftJIS encoding after a newline. This will be repeated for each pointer in the pointer list.
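The string-reading half, sketched in Python under the same reading of the steps: look at the byte just before the pointer, accept it only if that byte is 0x00 or 0x02, then take everything up to the next 0x00 and decode it as Shift-JIS. `read_string` is a made-up name, and returning None for rejected pointers is a choice of this sketch.

```python
def read_string(data, ptr):
    """Return the Shift-JIS string at `ptr`, or None if it looks like junk."""
    if not 1 <= ptr < len(data):
        return None
    if data[ptr - 1] not in (0x00, 0x02):
        return None                      # not preceded by a terminator/marker
    end = data.find(b"\x00", ptr)
    if end == -1:
        end = len(data)                  # unterminated: read to end of file
    if end == ptr:
        return None                      # zero-length string
    return data[ptr:end].decode("shift_jis", errors="replace")

# Fake text block: marker, one Japanese character, terminator, "hi", terminator.
block = b"\x0d\x02" + "あ".encode("shift_jis") + b"\x00" + b"hi\x00"
print(read_string(block, 2), read_string(block, 5), read_string(block, 3))
```

Searching for a single 0x00 byte is safe here because the second byte of a two-byte Shift-JIS character is never 0x00.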



So if I were to try making a program like this in Python, basically what I need to do is split the bin file into chunks of two bytes (would that work?), find the first occurrence of 0x020D, and set that as the beginning of the script.

Make a list to append pointers to, then start reading from the beginning, and while the current pointer is less than the beginning of the script (what does this mean?), iterate through each pointer. Take each pointer, encode it in ShiftJIS, and write it to a file. Stop when the stream has ended. Would that work?

Sorry if I get this wrong, I'm a novice developer. I'm not familiar with how binary streams are read, but if I can translate this into Python objects I can usually flub my way through it.

edit: Oh yeah, I was wondering: what version of Shift-JIS does this use? sjis, 2004, or x0213?
« Last Edit: January 15, 2014, 01:26:32 pm by retrocombine »

esperknight

  • Full Member
  • ***
  • Posts: 130
    • View Profile
Re: Opening the Guts of Corpse Party 2U: A Diary
« Reply #8 on: January 16, 2014, 07:31:30 pm »
Pretty spot on for what I'm doing :)  Ignore the break_me.... That was leftover debugging code.  I was using it to break at that position because near there was a funky pointer where the first 2 bytes were one pointer and the next 2 were another (guessing it concatenates the strings together).  I had missed quite a bit of text that way so I wanted to make sure I picked it up.

The way I figured out the 0x020D (anything in the binary is stored little-endian, so it'll be read in backwards; when looking in a hex editor it'll be 0x0D02): I just looked at a few of the files and noticed all the text blocks start with that.  So from there I just determine where the text block starts.
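The byte-order point is easy to see for yourself in Python with the standard `struct` module. A tiny demonstration (the variable names are arbitrary): the same two bytes give 0x020D when read little-endian and 0x0D02 when read big-endian.

```python
import struct

raw = b"\x0d\x02"                                 # as shown in a hex editor: 0D 02
(little,) = struct.unpack("<H", raw)              # little-endian 16-bit read
(big,) = struct.unpack(">H", raw)                 # big-endian 16-bit read
print(hex(little), hex(big))

# int.from_bytes gives the same answer without struct.
same = int.from_bytes(raw, "little")
print(hex(same))
```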

The reason I do the ptr - 1 is to verify if it's legit or not as you'll invariably get junk if you don't.  Each string will have a 0 in front of it, except the first one, which will have the 0x02 (since the text block starts that way).  This is a nice way to filter out the junk.  You also want to make sure it starts past the text start of the bank as there is no other text in the banks (or shouldn't be, but you never know...).  Plus far as reading pointers you don't want to read anything in the text bank as none of those will have pointers (or shouldn't...).  So really this is all to make sure we have legit data.

For writing this out, not sure how Python handles files but for C# I chose to use a StreamWriter which handles text nicely but not binary data (not that you can't write out binary blobs with it).  So for this I encode it in SJIS for proper writing otherwise it'll come out in the writer's default encoding (UTF8 rather than SJIS) which won't help me for use with Atlas.  If I was using C++ I'd just write it directly as binary data and not worry about conversion.
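On the Python side, the built-in `open` takes an `encoding` argument, so the text file can be written as Shift-JIS directly. A minimal sketch; the "pointer on one line, text on the next" layout and the name `write_dump` are assumptions for illustration, not the format Atlas expects.

```python
import os
import tempfile

def write_dump(path, entries, encoding="shift_jis"):
    """Write (pointer, text) pairs to `path`, encoded as Shift-JIS."""
    with open(path, "w", encoding=encoding, newline="\n") as out:
        for ptr, text in entries:
            out.write(f"{ptr:08X}\n{text}\n\n")

# Demo: write one entry to a temporary file and show the raw bytes.
demo_path = os.path.join(tempfile.mkdtemp(), "24.bin.txt")
write_dump(demo_path, [(0x10, "あ")])
with open(demo_path, "rb") as f:
    print(f.read())
```

Passing `newline="\n"` keeps Python from translating line endings, so the output bytes are the same on every platform.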

For version of SJIS, it's just SJIS, nothing fancy.

For a good example of how I'd insert it using Atlas see here : http://subversion.assembla.com/svn/transprojects/psx/addies_present/script_txt/CHAP0/WORLD1.DLR.txt  It makes it nice and easy to write pointers with.  Course you could code it yourself to do it but I prefer being lazy ;)  If you're curious to see other code feel free to browse through here and look for the tools folder in each game folder : http://subversion.assembla.com/svn/transprojects/  I'm a bit lazy in uploading stuff though so not all my code is uploaded yet...

Also feel free to email me if you like at esperknight at yahoo but I don't mind posting here either :)  I can't say I'm as familiar with Python but I can read it and code a bit in it but I prefer C# and C++ :)