
Author Topic: Legend of Dragoon script rewrite (and pointer questions)  (Read 8348 times)

BlackDog61

Re: Legend of Dragoon script rewrite (and pointer questions)
« Reply #20 on: April 12, 2016, 02:42:35 pm »
Not so fast. ;D
Yes, you have to change the extracted files.
In order to avoid having to change "all of the pointers in the file where you add 4 bytes", I propose that you add 4 bytes at the end of the file in question. (Just make the last text longer. Or, if there is non-text stuff after it, you can take a step 0: just add bytes to the file without changing the text.)
Note: 4 bytes is likely not enough to make the file span an extra sector. I don't remember the sector size for the PSX, but for the PSP you'd add 2048 bytes (that is, 1 sector). Don't add that much to a text entry. ;)
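
To make the sector point concrete, here is a minimal Python sketch that checks whether growing a file would push it into an extra sector. It assumes 2048 bytes of data per sector (the PSP figure quoted above; adjust it if the PSX value turns out to differ), and the function names are made up for illustration.

```python
SECTOR_DATA_SIZE = 2048  # assumed data bytes per sector


def sectors_needed(file_size, sector_size=SECTOR_DATA_SIZE):
    """Number of sectors a file of file_size bytes occupies (rounded up)."""
    return (file_size + sector_size - 1) // sector_size


def slack_in_last_sector(file_size, sector_size=SECTOR_DATA_SIZE):
    """Unused bytes left in the file's last sector."""
    return (-file_size) % sector_size


def crosses_into_new_sector(old_size, added, sector_size=SECTOR_DATA_SIZE):
    """True if adding `added` bytes makes the file occupy more sectors."""
    return sectors_needed(old_size + added, sector_size) > sectors_needed(
        old_size, sector_size
    )
```

If crosses_into_new_sector() comes back False, the sector count does not change and the extra bytes fit in the slack of the last sector.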

Then re-build an image with the ISO building tools that were mentioned either in this thread or in other ones.

(Adding 4 bytes to the ISO image itself wasn't a valid test, indeed. It most likely just broke the structure somewhere.)

theflyingzamboni

« Reply #21 on: April 13, 2016, 01:06:48 pm »
That sounds potentially promising. I'd like to clarify what you're saying with respect to the structure of this particular game. All the text is stored in a single .BIN file within the ISO. This is what I have to work with, as the text portions are not extractable as files with any software I know of. Only the images are extractable. As far as I can tell, this BIN file contains just level-specific information: art assets, text, maybe item placements and event scripts.

As I mentioned previously, it's split into different sections by area mainly, and the text is buried in the middle of these sections; there is a bunch of non-text stuff after it, as well as more level sections, so I don't think that I can add things to the end of the entire BIN file. It would be totally separated from the rest of the text, by tens of thousands of bytes or more, and I'm not sure the game would know how to put it into RAM with the rest of the text. Also, I originally thought all material for a single area was stored in one section, but I'll explain why I think that was wrong in a moment.

I'm not quite sure how to deal with the sectors. I know that the images, and I'm sure the larger sections of text, span many sectors. A CD sector is apparently 2352 bytes. For game data, there is only a 16-byte header, meaning 2336 bytes are usable for data. Using that knowledge and jPSXdec's display of which sectors a particular image is located in, I believe I can identify where they end. If I multiply the sector length (2352) by the number of sectors, I get a value that is 0x0920 less than the total file length (0x0920 is 2336, which I suspect is no coincidence). If I multiply any given sector number by 2352 and add 2336, it puts me at a line of (to my non-computer brain) garbage just above another line that always reads 0x00FFFFFF FFFFFFFF FFFFFF00 XXXXXXXX. Above this line there's always a block of non-spaced hex all clustered together, so this is probably the code telling it to continue on to the next sector or something. And now I know why these code blocks were in the middle of particularly long sections of text.

So to explain about level assets: I now know they are not in the same section (not to be confused with sector). For the level I've been looking at, there is a section with all the background images, below that a section with a bunch of other stuff I don't know (maybe event scripts), and below that the section with my text and the unknown stuff surrounding it. So I think they're at least clustered by level.

There is no blank space between sectors though, so I have no way to add text to a sector without overrunning it, if that is what you were suggesting. I doubt I can add sectors without a pretty deep knowledge of PS1 asm and how it manages memory.

However. At the end of an individual section, there's a block of repeating 0x8C that pads out the empty space in the last sector of that section. The size varies depending on how much of the last sector was used, but I may be able to use this space for overflow text, as opposed to the end of the .BIN file. I checked a couple dump files, and being part of a sector for that section of code, it gets put into RAM with the rest of the section, so I know that it will load. Does this seem like a potentially viable route to take?
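
For anyone wanting to measure that padding rather than eyeball it in a hex editor, here is a rough Python sketch. It assumes the filler byte is always 0x8C, as described above, and that the section has already been read into a bytes object; the function names are hypothetical. Note that real data which happens to end in 0x8C would be counted as padding too, so treat the result as an upper bound.

```python
PAD_BYTE = 0x8C  # filler value observed at the end of each section


def trailing_padding(section, pad=PAD_BYTE):
    """Count the consecutive pad bytes at the very end of the section."""
    stripped = section.rstrip(bytes([pad]))
    return len(section) - len(stripped)


def fits_in_padding(section, extra_bytes, pad=PAD_BYTE):
    """True if extra_bytes of new text could be absorbed by the padding."""
    return extra_bytes <= trailing_padding(section, pad)
```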
« Last Edit: April 13, 2016, 01:12:00 pm by theflyingzamboni »
ROM wasn't hacked in a day.

STARWIN

« Reply #22 on: April 13, 2016, 03:38:14 pm »
IIRC out of each 2352 bytes you usually have only 2048 bytes for visible data. Here is the format: https://en.wikipedia.org/wiki/CD-ROM#CD-ROM_XA_extension

00FFFF... for example is the "Sync pattern". I had this stuff written down somewhere.. (edit2: here http://www.romhacking.net/forum/index.php/topic,20306.msg285853.html#msg285853)

It is possible for there to be sectors with 2324 visible bytes, but those IIRC are used for audio and other less interesting things. The header bytes tell whether it is a Form 1 or Form 2 sector, but you can pretty much assume you have 2048 visible bytes. So when you extract a file from the image, the tool reads a certain place in the image to get the folder/filesystem, gets the file's sector and size, and glues those 2048s together for you. When you insert an equal-size file back into the image, it also recalculates the error correction/detection bytes.
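
As a rough illustration of that "gluing", here is a Python sketch that pulls the 2048 visible bytes out of each raw 2352-byte sector. It assumes every sector is Mode 2 Form 1 (12 sync bytes, 4 header bytes, 8 subheader bytes, 2048 data bytes, then 280 bytes of error detection/correction) and does not look at the subheader to check the form, so it is not a replacement for a real extraction tool. The function name is made up.

```python
RAW_SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # the 00 FF..FF 00 sync pattern
DATA_OFFSET = 24   # 12 sync + 4 header + 8 subheader
DATA_SIZE = 2048   # visible bytes in a Form 1 sector


def extract_user_data(raw):
    """Concatenate the 2048-byte data areas of consecutive raw sectors."""
    if len(raw) % RAW_SECTOR_SIZE != 0:
        raise ValueError("input is not a whole number of raw sectors")
    out = bytearray()
    for start in range(0, len(raw), RAW_SECTOR_SIZE):
        sector = raw[start:start + RAW_SECTOR_SIZE]
        if sector[:12] != SYNC:
            raise ValueError("missing sync pattern at offset %d" % start)
        out += sector[DATA_OFFSET:DATA_OFFSET + DATA_SIZE]
    return bytes(out)
```

Going the other way (putting data back into raw sectors) also requires recomputing the error detection/correction bytes, which is why the insertion tools exist.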

So if you just insert 4 bytes like you did a moment ago, most of the sectors get out of place.. and if you change 4 bytes, the error code stuff remains as it was (and now incorrect). This is not terribly important but nice to understand.

By the way, the programming specs for no$psx contain a lot of information about ps1: http://problemkaputt.de/psx-spx.htm

I think you would no doubt be intelligent enough to be a hacker (read: asm is not very deep), but that of course doesn't mean certain things wouldn't take time..

I don't have a perfect attack plan for figuring out memory management of a PS1 game. Maybe I'll think about that a bit. There is a BIOS function in a known place that can be used by the game for reading files to RAM, for example (edit: err, actually, cdrom controller io port writes are more universal). Breaking there might reveal things, and at least no$psx has some kind of logging support for the cd rom commands sent. But on the other hand, the audio system is doing things most of the time. IIRC write breakpoints in RAM won't trigger when a section of data is loaded..

edit3:

basically, jumping 2352 bytes forward in the image in the data area of a sector containing that .bin file should be the same as jumping 2048 bytes forward from the same spot in the extracted .bin file (hopefully :p).
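
Put as arithmetic, a hedged Python sketch of that mapping: it assumes the file sits in consecutive Form 1 sectors (24 bytes of sync/header/subheader before each 2048-byte data area), and first_sector_raw_offset is a hypothetical parameter for the byte position in the image where the file's first raw sector begins.

```python
RAW_SECTOR_SIZE = 2352
DATA_OFFSET = 24   # sync + header + subheader before the data area
DATA_SIZE = 2048


def raw_offset(file_offset, first_sector_raw_offset=0):
    """Map an offset in the extracted file to an offset in the raw image."""
    sector_index, within = divmod(file_offset, DATA_SIZE)
    return (first_sector_raw_offset
            + sector_index * RAW_SECTOR_SIZE
            + DATA_OFFSET
            + within)
```

So moving 2048 bytes forward in the extracted file moves exactly 2352 bytes forward in the image.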

to your question, it is pretty much as you can see it. a specific amount (maybe too little?) of bytes that map to the RAM at required times - good. probably unused bytes if in the end of something. does it absolutely work? can't be certain but better chances than random. one small thing you can do is to put a read/write breakpoint on such a byte address and check if it does anything with it in/before that section of the game, if you want.
« Last Edit: April 14, 2016, 09:37:11 am by STARWIN »

BlackDog61

« Reply #23 on: April 14, 2016, 04:59:23 pm »
Err, gents... Don't bother with sectors just yet. I mean, thanks & OK for the size, but that's all we needed so far. (Again: don't edit the raw CD image; only edit the files in it and regenerate a valid image with tools.)

Now that you've identified there's an extra structure with images and so on (Well Done!), how about checking the start of the file for a kind of header / pointers to these section starts?

theflyingzamboni

« Reply #24 on: April 14, 2016, 09:33:15 pm »
Good call on the sectors. The thing I looked at before didn't talk about the different forms for the XA extension. They seem to fit Form 1. I can pick out all the different pieces, and there are exactly 280 bytes at the end to match up with error detection/correction. However, the header and error bytes still show up in the extracted .bin file, so the jump is still 2352 bytes (though they are absent in the memory dump). Is this an issue?

As for checking the start of the file: I actually did that a few days ago. The first "MRG"-labelled code segment at the very start of the extracted .bin file (which starts with a standard 24-byte Mode 2/Form 1 header, and nothing additional before that) contains what looks like a table. I couldn't figure out how it pointed though, so I put it on hold temporarily. If it is a table, it seems to alternate between 2-byte "pointers" in the 1st and 3rd 4-byte columns, and 2-3 byte "pointers" in the 2nd and 4th, like so: http://i.imgur.com/ZhqcDo3.png

I'm a little confused by it because the 1st and 3rd columns ascend, while the 2nd and 4th columns are kind of all over. The 3-byte values might point to addresses in RAM by left-shifting by 0x02, but if they do there's also an offset added to it that I'm not sure how to figure out, not knowing what word points to what. I thought maybe the 2-byte codes were referring to the number of sectors into the extracted file a particular segment was, but I can't see how that would work mathematically. This segment of the code still seems the most likely candidate to me. It looks the most like a table of what I've seen, it's at the beginning of the extracted file, AND it's loaded into the same location in every memory dump that I've made (near the end), regardless of the level. If I can just figure out how it works.

EDIT: I am almost positive now. I counted up the number of "MRG" headers from the bottom of the extracted file until I reached the section with my dialogue. Then I counted up that same number of 3-byte words from the bottom of the presumed table. I set a breakpoint at the address that word was at in memory, and sure enough it broke right as that level was loading, and did not break in other levels. Setting more breakpoints, it broke at the 2-byte word before that, and the two 2- and 3-byte words after that. So there were three 2-/3-byte word pairs that the game broke at when loading that level, which (hopefully not just confirmation bias) is the same number of sections I had proposed had assets for that level. The odd thing was that the section with the text, which is the last of the three in the extracted file, seemed by my counting to correspond to the first word pair, meaning that if I'd started counting up to find one of the two preceding sections, I would have counted too far in the table. Which is why the "almost." I'm reasonably sure it's the pointer table, but I still don't know how it works mathematically, why there are two types of words, or why either the ordering or my counting seems to be slightly different than I would expect. I'm trying to figure it out with the debugger, but I haven't had any luck yet.
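
The counting described above can be automated. This is only a sketch: it assumes the three ASCII bytes "MRG" mark the start of each section, and a plain byte search can produce false positives if the same bytes turn up inside image or script data. The function names are made up.

```python
def find_markers(data, marker=b"MRG"):
    """Return the offset of every occurrence of marker in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


def index_from_end(offsets, target_offset):
    """1-based position, counted from the last marker backwards, of the
    section that contains target_offset."""
    preceding = [i for i, off in enumerate(offsets) if off <= target_offset]
    if not preceding:
        raise ValueError("target_offset lies before the first marker")
    return len(offsets) - preceding[-1]
```

Given the offset of a known line of dialogue, index_from_end() gives the number of table entries to count up from the bottom of the presumed table.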
« Last Edit: April 15, 2016, 01:28:56 am by theflyingzamboni »

STARWIN

« Reply #25 on: April 15, 2016, 10:43:49 am »
Try using cdmage b5 for extracting the file (check http://ffhacktics.com/wiki/Tools for a safe link), and check if the extracted .bin is any different. I think it should be like I said.

theflyingzamboni

« Reply #26 on: April 15, 2016, 08:57:30 pm »
You are correct. Is it important for proper insertion and iso rebuilding for it to have the headers and error correction stuff removed?

STARWIN

« Reply #27 on: April 16, 2016, 09:16:21 am »
All that is invisible to the game and did not exist for the game developers, so this way it maps better to RAM addresses. cdmage is all you need if file sizes stay the same, and people always use this sort of extracted file unless the game hides something outside of the filesystem. More than that, you'd probably have to remove the headers/error stuff manually if you based your work on such a polluted file, because the insertion process assumes pure data!

I haven't tried rebuilding an image, but you probably don't have to do that manually, and even if you were to, it would be relevant only later in the process.

theflyingzamboni

« Reply #28 on: April 18, 2016, 06:06:27 pm »
I've been a little busy recently, but just to keep this up to date, I did figure out the first part of that table of tables. I had the obvious idea that I could divide the total file length by 2048 to get the number of sectors. Then I multiplied the decimal value of the 2-byte word from the pair I talked about in my last big post by 2048 (in scientific mode to avoid rounding), and the result lined up with where that sub-file starts. So that definitely points to the sector that each "sub-file" starts at. I also found two more similar tables in the dump that correspond to the hex code at the start of the other two .bin files in the SECT folder of the .iso. So now I'm absolutely positive that this is one of the tables that I'm looking for.

EDIT: Ignore what I said in this paragraph here before if you saw it. The second word is the length of the sub-file. So the table heading each .bin file in the .iso tells the game what sector a sub-file starts on and how long it is. It does not give any indication of how it decides where to put it in RAM. I may still be able to change these values to add a sector and have it load properly though, I don't know. Still something I'd hope to avoid.
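
A minimal Python sketch of reading such a table, under assumptions that are not confirmed above: each entry is taken to be two 32-bit little-endian words (start sector, then length in bytes), and table_offset and entry_count are hypothetical parameters that would have to be worked out from the file itself.

```python
import struct

SECTOR_DATA_SIZE = 2048  # bytes per sector in the extracted .bin


def parse_sector_table(data, table_offset, entry_count):
    """Return a list of (byte_offset, length) for each sub-file.

    Assumes entries are pairs of 32-bit little-endian words:
    (start sector, length in bytes).
    """
    entries = []
    for i in range(entry_count):
        sector, length = struct.unpack_from("<II", data, table_offset + 8 * i)
        entries.append((sector * SECTOR_DATA_SIZE, length))
    return entries
```

Each returned offset is simply the sector word multiplied by 2048, which is the check described in the post above.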

This is all probably important, but doesn't really help me with the original problem. I found the pointer table for sub-files, but not the pointer table for pointer tables within a sub-file that I assume exists. That brings me back to the start of the sub-file, which looks like some kind of table, but I've yet to figure out how it works, since it doesn't ascend consistently all the way through.

EDIT: Starting with the first number on the "table", it ascends every other word, while the words in between are more random. This may be another location/size kind of thing.

April 20, 2016, 04:11:10 am - (Auto Merged - Double Posts are not allowed before 7 days.)
EDIT 2: My guess was correct. The word-pair table within a sub-file gives the location (relative to the start of the sub-file) and length of each "asset" (text, images, etc.) contained in that sub-file (no adjustment). So each sub-file uses the same sort of indexing structure that each .bin file uses. As it turns out, there is a whole bunch of code preceding the text pointer table and text itself that is a part of that "asset". I don't know what all that code does, but hopefully it's not important for my purposes. I think this is as far as it goes. I don't think an individual asset contains a table for tables, at least not that I've noticed. So I guess my question is, does this sound like I've found all the tables I need to start trying to alter text size? Maybe add a few bytes and change the subfile tables to give the right offsets and lengths? Start messing with Cartographer and Atlas? I'm not sure what my best next move is.

Also, thanks to STARWIN. The addresses are based on a 2048 filesize, which was part of why I couldn't figure the table out before.

April 20, 2016, 01:52:35 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
EDIT 3: GOT IT! I'm still not sure about adding sectors and changing file size, but I added 4 bytes to a sentence of text, deleted 4 bytes of the 0x8C buffer at the end of the sub-file to keep the file size the same, altered the text and asset pointers, and inserted the updated file into the iso. The game worked normally from startup, and the new text displayed.
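
The manual steps above could be sketched in Python roughly as follows. Assumptions, none of them confirmed by the game itself: the asset table is made of (offset, length) pairs of 32-bit little-endian words, table_offset and entry_count are hypothetical parameters, and the text pointers inside the asset still have to be fixed separately (that part is not shown).

```python
import struct

PAD_BYTE = 0x8C  # filler at the end of the sub-file


def insert_and_fix(subfile, table_offset, entry_count, insert_at, new_bytes):
    """Insert new_bytes at insert_at, drop the same number of pad bytes from
    the end, and update the (offset, length) asset table to match."""
    n = len(new_bytes)
    if insert_at < table_offset + 8 * entry_count:
        raise ValueError("insertion point must come after the table")
    if insert_at > len(subfile) - n:
        raise ValueError("insertion point falls inside the padding to remove")
    if not subfile.endswith(bytes([PAD_BYTE]) * n):
        raise ValueError("not enough 0x8C padding at the end")
    # Splice in the new bytes and cut n pad bytes so the size is unchanged.
    out = bytearray(subfile[:insert_at] + new_bytes
                    + subfile[insert_at:len(subfile) - n])
    for i in range(entry_count):
        pos = table_offset + 8 * i
        offset, length = struct.unpack_from("<II", out, pos)
        if offset >= insert_at:
            offset += n          # asset starts after the insertion: shift it
        elif insert_at <= offset + length:
            length += n          # insertion is inside this asset: grow it
        struct.pack_into("<II", out, pos, offset, length)
    return bytes(out)
```

Because the total size stays the same, the sector/length table at the start of the .bin file does not need to change in this case.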

So now my to-do list is down to:
1. Learn Cartographer and Atlas so I don't have to manually alter all the hex code
2. (Maybe optional) Learn how to rebuild an iso with altered file sizes, in case a particular sub-file doesn't have enough extra space at the end for additional text
« Last Edit: April 20, 2016, 01:52:35 pm by theflyingzamboni »

theflyingzamboni

« Reply #29 on: May 16, 2016, 02:46:32 pm »
After an extensive detour learning basic C++ so I could program a basic bin splitter/merger that would update the game's VFS tables, and modify a few lines of Atlas, I actually have something to show for it all. Thanks to being able to rebuild the file and update the VFS, I can now add extra text without problem (although I still need to learn to rebuild the ISO in case I go over sector boundaries).
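
For readers curious what the merge step of such a tool might look like, here is a rough Python sketch. It is not the C++ program mentioned above. It assumes each sub-file starts on a sector boundary, is padded out with 0x8C, and is recorded in the table by its unpadded length; first_sector is a hypothetical parameter for where the first sub-file begins.

```python
SECTOR_DATA_SIZE = 2048
PAD_BYTE = 0x8C


def merge_subfiles(subfiles, first_sector=1):
    """Pad each sub-file to a whole number of sectors and build the table.

    Returns (table, body), where table is a list of
    (start_sector, unpadded_length) and body is the concatenated data.
    """
    table = []
    body = bytearray()
    sector = first_sector
    for blob in subfiles:
        table.append((sector, len(blob)))
        sectors = -(-len(blob) // SECTOR_DATA_SIZE)  # round up
        padded_len = sectors * SECTOR_DATA_SIZE
        body += blob + bytes([PAD_BYTE]) * (padded_len - len(blob))
        sector += sectors
    return table, bytes(body)
```

Writing the table back into the file header is left out, since the exact header layout is not spelled out here.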

Here's a video showing the original and hacked dialogue in the opening scenes: https://www.youtube.com/watch?v=dXt229MOmlE

And a couple of images: (images not preserved)

Thanks for all the help you all have given me so far! At times I didn't think I would even make it to this point. Now I'm just hoping I can get the Cartographer source code so I don't have to either program my own dumper or type in everything by hand for the full game.