
Author Topic: .  (Read 6023 times)

creeperton

.
« on: September 03, 2014, 09:29:54 pm »
.
« Last Edit: November 16, 2015, 01:39:33 am by creeperton »

Vehek

Re: How do I figure out which file an instruction is loaded from?
« Reply #1 on: September 04, 2014, 03:36:36 am »
I don't fully understand what it's doing, but I think I found the data for that specific instruction.

Search for the hex 40 53 42 30.

Using a write breakpoint, I discovered that it was loading values from the scratch pad. Then I used a write breakpoint on the scratch pad and found various routines, but I needed to narrow it down. I set the breakpoint condition so it would only trigger when the register holding the value to write had the exact value of the instruction. Then I looked at the memory address it had loaded from when building up the value.

I don't know how much changing those bytes will affect, though. I think I saw the very next or nearby instructions coming from memory addresses that were very far apart.

Edit: Here's an example of one of the routines.
Code:
00070b28: 92a50000 lbu r5,0x0000(r21)
00070b2c: 26d6ffff addiu r22,r22,0xffff
00070b30: 12c000fa beq r22,r0,0x00070f1c
00070b34: 26b50001 addiu r21,r21,0x0001
00070b38: 00121c00 sll r3,r18,0x10
00070b3c: 00031c03 sra r3,r3,0x10
00070b40: 000310c0 sll r2,r3,0x03
00070b44: 00451004 sllv r2,r5,r2
00070b48: 146f0017 bne r3,r15,0x00070ba8
00070b4c: 02629825 or r19,r19,r2
00070b50: ae130000 sw r19,0x0000(r16)
00070b54: 26100004 addiu r16,r16,0x0004
« Last Edit: September 05, 2014, 02:01:59 am by Vehek »

STARWIN

Re: How do I figure out which file an instruction is loaded from?
« Reply #2 on: September 04, 2014, 06:07:54 am »
Did you load a savestate that had already loaded that code to RAM to test your change? Then the old code would be there (until it loads the code again).
« Last Edit: September 04, 2014, 06:16:36 am by STARWIN »

BlackDog61

Re: How do I figure out which file an instruction is loaded from?
« Reply #3 on: September 04, 2014, 05:00:53 pm »
Since the PS1 is little endian, is it possible that you weren't searching for the bytes in the right order the first time you tried? (But if the value is calculated by code, as Vehek's post seems to indicate, then that wouldn't work either.)
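
To make the byte-order point concrete, here is a small Python sketch that builds both byte orders of a 32-bit instruction word and searches a buffer for each. The word 0x30425340 is only an illustration: it assumes the bytes 40 53 42 30 that Vehek posted are the instruction stored in little-endian order. The helper names are made up for this example.

```python
import struct

def word_patterns(word):
    """Return the 4-byte encodings of a 32-bit word in both byte orders."""
    return {
        "little": struct.pack("<I", word),
        "big": struct.pack(">I", word),
    }

def find_all(data, pattern):
    """Return every offset at which pattern occurs in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets

# Hypothetical example word (assumption: 40 53 42 30 is the little-endian form).
word = 0x30425340
patterns = word_patterns(word)

# Stand-in for a file's contents; a real search would read the file instead.
sample = bytes.fromhex("00000000" "40534230" "ffffffff")
for order, pattern in patterns.items():
    print(order, pattern.hex(), find_all(sample, pattern))
```

Searching for the big-endian form of the word in a little-endian file finds nothing, which would explain a failed first search.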

creeperton

.
« Reply #4 on: September 05, 2014, 03:24:06 pm »
.
« Last Edit: November 16, 2015, 01:39:27 am by creeperton »

Vehek

Re: How do I figure out which file an instruction is loaded from?
« Reply #5 on: September 05, 2014, 06:12:35 pm »
Just try changing the '40' in the bytes I posted to '00'. It may seem strange, but those bytes are really where it loads the instruction from, not the uncompressed data from BATTLE.OUT.

I'm guessing it's part of BATTLE.ARC (took a guess and looked in this file). Hmm, looks like there's compression. Until this is figured out, try to avoid any of the codes that patch memory after 0x180000.

Quote
Unless you know of a Windows tool that lets you search through specified binary files or directories for specific bytes?
Sorry, I don't know. I tend to resort to placing the files in a ZIP with no compression and searching through it for the hex, then searching/scrolling up to find the file name. Knowing what the LBAs of all the files are would probably help.
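
If scripting is an option, the same search can be done without the ZIP trick. This is a rough Python sketch that walks a directory of extracted files and reports where a byte pattern occurs. The function name is made up for illustration, and it reads each file fully into memory, which should be fine for PS1-sized files.

```python
from pathlib import Path

def search_tree(root, pattern):
    """Yield (path, offset) for every occurrence of pattern in files under root."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # whole file in memory; simple but not streaming
        pos = data.find(pattern)
        while pos != -1:
            yield path, pos
            pos = data.find(pattern, pos + 1)
```

Usage would look something like `for path, offset in search_tree("extracted_disc", bytes.fromhex("40 53 42 30")): print(path, hex(offset))`, where "extracted_disc" is a placeholder for wherever the files were extracted to. Note that this only finds the bytes in uncompressed files.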

Edit:
Compression is standard Haruhiko Okumura LZSS compression. You can use one of the LZS tools for FF7/FF8.
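
For anyone curious what those tools do, here is a minimal Python sketch of a decoder for the classic Okumura-style LZSS layout. The parameters (4096-byte window, write position starting at 0xFEE, flag bits read from the low bit up, a set bit meaning a literal byte) are those of the classic scheme. Whether this game's data has a length header in front of the stream, and which byte the window is pre-filled with, are assumptions here: the caller is expected to strip any header, and the fill byte is a parameter. The function name is made up.

```python
WINDOW_SIZE = 4096      # N in the classic implementation
MAX_MATCH = 18          # F
THRESHOLD = 2           # matches encode lengths THRESHOLD+1 .. MAX_MATCH

def lzss_decompress(data, fill=0x00):
    """Decode a raw Okumura-style LZSS stream (no header) into bytes."""
    window = bytearray([fill]) * WINDOW_SIZE
    r = WINDOW_SIZE - MAX_MATCH          # initial write position, 0xFEE
    out = bytearray()
    i = 0
    flags = 0
    while i < len(data):
        flags >>= 1
        if not (flags & 0x100):
            # Load a new flag byte; the high byte counts the 8 uses.
            flags = data[i] | 0xFF00
            i += 1
            if i >= len(data):
                break
        if flags & 1:
            # Literal byte: copy it straight through.
            c = data[i]
            i += 1
            out.append(c)
            window[r] = c
            r = (r + 1) & (WINDOW_SIZE - 1)
        else:
            # Reference: 12-bit window position and 4-bit length.
            if i + 1 >= len(data):
                break
            lo, hi = data[i], data[i + 1]
            i += 2
            pos = lo | ((hi & 0xF0) << 4)
            length = (hi & 0x0F) + THRESHOLD + 1
            for k in range(length):
                c = window[(pos + k) & (WINDOW_SIZE - 1)]
                out.append(c)
                window[r] = c
                r = (r + 1) & (WINDOW_SIZE - 1)
    return bytes(out)
```

If the output looks like garbage, the header or fill-byte assumptions are the first things to check.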
« Last Edit: September 05, 2014, 06:59:37 pm by Vehek »

creeperton

.
« Reply #6 on: September 05, 2014, 09:58:11 pm »
.
« Last Edit: November 16, 2015, 01:45:52 am by creeperton »

Valendian

Re: How do I figure out which file an instruction is loaded from?
« Reply #7 on: September 09, 2014, 05:12:33 pm »
In answer to the question cross posted to the following forum

http://www.gamefaqs.com/boards/914326-vagrant-story/70000670#8

I thought it best to keep all this in one place.

Quote from: creeperton
Ok, time for an update.

Vehek found the value I need to change to make that one code permanent. He found it in BATTLE.ARC.
http://www.romhacking.net/forum/index.php?topic=18596

It turns out that BATTLE.ARC is compressed. It uses LZS compression, which is handy because there are tools to work with that at Qhimm.
http://forums.qhimm.com/index.php?topic=15521

LZS tools:
http://forums.qhimm.com/index.php?topic=15325.0

An *.ARC file is an uncompressed archive.
http://biolab.warsworldnews.com/viewtopic.php?f=3&t=125

I split BATTLE.ARC into its 2 subfiles, subfile #0 and subfile #1. #0 apparently doesn't use LZS compression - I'll scan it with Trid later. #1 is LZS and it decompresses just fine.

My current questions for you are semi-related to these things.

My spreadsheets don't work too great because I don't have a tool that allows me to import multiple files into a disc image all at once. This is a problem because I have a directory of 256 files that I need to import into a disc image. CD Mage doesn't let me do this, nor does CD Tool or cdprog.

The solution is to work directly with the disc image.

When I was making my magic-only mod I needed to calculate the defense values of the armors in the game. This is a pain, because there are 5 bytes which determine these values. They work like this:

base defense = signed byte
defense boost 1 = signed byte
defense boost 2 = signed byte
defense selector 1 = 8 bits, 1 for each element
defense selector 2 = 8 bits, 1 for each element

=IF(defense selector 1 = yes)AND(defense selector 2 = yes)
THEN(effective defense = base defense + defense boost 1 + defense boost 2)

=IF(defense selector 1 = yes)AND(defense selector 2 = no)
THEN(effective defense = base defense + defense boost 1)

=IF(defense selector 1 = no)AND(defense selector 2 = yes)
THEN(effective defense = base defense + defense boost 2)

=IF(defense selector 1 = no)AND(defense selector 2 = no)
THEN(effective defense = base defense)

The actual spreadsheet is more involved than this, this is just a summary.
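
The four cases above can be sketched in Python as a single function. This is only an illustration of the summary: it assumes each selector bit lines up with one element, and which bit belongs to which element is not stated, so the bit index used here is hypothetical. The function names are made up.

```python
def to_signed(byte):
    """Interpret a value 0..255 as a signed byte (-128..127)."""
    return byte - 256 if byte >= 128 else byte

def effective_defense(base, boost1, boost2, selector1, selector2, element_bit):
    """Effective defense for one element.

    base, boost1, boost2 are raw bytes (0..255) read from the file.
    selector1, selector2 are 8-bit masks, one bit per element.
    element_bit is the bit index (0..7) of the element; the mapping of
    bits to elements is an assumption.
    """
    mask = 1 << element_bit
    total = to_signed(base)
    if selector1 & mask:
        total += to_signed(boost1)
    if selector2 & mask:
        total += to_signed(boost2)
    return total
```

Checking each selector independently covers all four yes/no combinations without spelling them out one by one.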

My point is that after doing this I realized that I can easily calculate base addresses to patch to each file in the disc image, but only if I know exactly how those addresses are found in the disc image.

Is there a specification for how *.bin/*.cue disc images are organized? Like is there a header which lists the starting points of each file, is there some relationship between the size, or name, or type, or location of a file in the human-readable disc image you see when you open it in CD Mage - and the location of that file in the human-unfriendly version?

Also what is error correction data? How can I locate it? Is it located in certain fixed places in a disc image (every 40,000 bytes)?

For example, let's say I have a disc image called "game.bin" with 3 files in it.

file-------length (bytes)
game.exe---80000
vid.str----200000
info.arc---40000

How would I find the location and length of each of these files in game.bin? How would I find error correction data?
---
http://biolab.warsworldnews.com/index.php
SaGa Frontier Community Forum

Yes, there are standard specifications. A whole host of them: the CD formats themselves are informally known as "The Rainbow Books", and the file system is ISO 9660. Specifically you want ECMA-119 (the same standard as ISO 9660) and ECMA-130 (the CD-ROM format of the Yellow Book). You don't need to worry about this stuff unless you are building CD authoring software. But I will give you an overview of the topic and you can look further if you wish.

http://www.ecma-international.org/publications/standards/Ecma-119.htm
http://www.ecma-international.org/publications/standards/Ecma-130.htm

ECMA-130 specifies the physical structure of a CD-ROM. It details how the sectors are laid out and some of the formats a sector may take. The error correction is found within the sector: each sector contains its own error detection and correction data. There are different versions of sectors for different purposes. Music CDs don't require strong error correction, as the player can simply interpolate between the last good sector and the next good sector and no one would be the wiser. But data CDs place a much higher burden on the error correction. Such sectors sacrifice usable disc space for integrity against defects. It's a compromise between capacity and recoverability.

The sector-level error correction is implemented as a Reed-Solomon Product-like Code with two sets of parity bytes, called P parity and Q parity in ECMA-130. The sector data is arranged as a grid: P parity protects the columns as (26,24) codewords, and Q parity protects the diagonals as (45,43) codewords. On its own each codeword is weak: it can correct one byte at an unknown position, or two bytes whose positions are already known. But more importantly, one set of parity can be used to flag which codewords contain errors, so that the other set can repair them as bytes at known positions. Taken together, P and Q parity provide almost perfect protection from defects. For your information, the polynomial that defines the arithmetic behind both parities looks like this
 P(x) = x^8 + x^4 + x^3 + x^2 + 1
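
To show what that polynomial does in practice, here is a small Python sketch of multiplication in GF(2^8) with it as the reduction polynomial (0x11D when written as bits). This is only the field arithmetic that Reed-Solomon codes are built on, not the P/Q parity computation itself, and the function names are made up for illustration.

```python
FIELD_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 as a bit pattern

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by FIELD_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY      # reduce when the degree reaches 8
        b >>= 1
    return result

def gf_powers(alpha=2):
    """Return alpha^0 .. alpha^254 in order."""
    value, powers = 1, []
    for _ in range(255):
        powers.append(value)
        value = gf_mul(value, alpha)
    return powers
```

For example, multiplying x^7 (0x80) by x (0x02) overflows to x^8, which the polynomial reduces to x^4 + x^3 + x^2 + 1, so `gf_mul(0x80, 0x02)` gives 0x1D.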

To understand the Virtual File System you will need to read ECMA-119. This will tell you more than you need to know about how files are organised on the physical disc: how sectors can be allocated to files, how you can move files to a new range of sectors, how you can resize a file, or even create new files and directories.

The first 16 sectors are reserved for system use. Sony use these sectors to store their license agreement data (which, by the way, has invalid error correction / detection by design). Then sector LBA=16 contains the Primary Volume Descriptor. All PSX discs have only one such descriptor, although the standard allows multiple descriptors, called supplementary volume descriptors; PSX doesn't use them. The last volume descriptor is called the volume descriptor set terminator. This is found at sector LBA=17.

Next follows the Path Table, which is stored 4 times (twice in little endian and twice in big endian, for redundancy purposes). You can use the Path Tables as a quick way to locate files given a file path. The path tables store only references to directories, with no references to the files they contain. So to use the path table method you seek the directory, then you scan that directory for the file. It's like a shortcut, but it actually requires that you write more code to make use of it, as the act of scanning a directory is all that is actually required and can be achieved without use of the path table.

Once all of the path tables have been defined, the next sector begins the root directory record. This contains directory entries for all the sub directories and files contained within the root directory. These record entries will tell you the file name, its size and its sector number (in the form of an LBA). The only differences between an entry that refers to a sub directory and an entry that refers to a file are:
[1] a sub directory will have the directory flag set in the flags field
[2] a file will have an extension and a ";1" appended (if there are multiple files with the same name you would have "file.ext;1" then "file.ext;2" and "file.ext;3" and so on).
The file names are listed in alphabetical order with no distinction between files and directories.
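
As a rough illustration of the layout described above, this Python sketch reads the Primary Volume Descriptor at LBA 16, follows it to the root directory and lists each entry's name, LBA and size. It assumes a raw *.bin image with 2352-byte Mode 2 sectors, where the 2048 bytes of user data start 24 bytes into each sector. The field offsets are those of the ISO 9660 directory record, and the function names are made up.

```python
import struct

RAW_SECTOR = 2352    # bytes per sector in a raw .bin image
USER_OFFSET = 24     # sync (12) + header (4) + subheader (8), assuming Mode 2
USER_SIZE = 2048     # user data bytes per sector

def read_user_data(image, lba, count=1):
    """Return the user data of `count` sectors starting at `lba`."""
    out = bytearray()
    for n in range(lba, lba + count):
        start = n * RAW_SECTOR + USER_OFFSET
        out += image[start:start + USER_SIZE]
    return bytes(out)

def parse_directory(extent):
    """Return (name, lba, size, is_directory) for each record in a directory."""
    entries = []
    pos = 0
    while pos < len(extent):
        rec_len = extent[pos]
        if rec_len == 0:
            # Records never cross a sector boundary; skip the padding.
            pos = (pos // USER_SIZE + 1) * USER_SIZE
            continue
        lba, = struct.unpack_from("<I", extent, pos + 2)
        size, = struct.unpack_from("<I", extent, pos + 10)
        flags = extent[pos + 25]
        name_len = extent[pos + 32]
        name = extent[pos + 33:pos + 33 + name_len]
        # Names 0x00 and 0x01 are the "this directory" and "parent" entries.
        if name not in (b"\x00", b"\x01"):
            entries.append((name.decode("ascii"), lba, size, bool(flags & 0x02)))
        pos += rec_len
    return entries

def list_root(image):
    """List the root directory of a raw disc image held in memory."""
    pvd = read_user_data(image, 16)
    if pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no Primary Volume Descriptor at LBA 16")
    # The root directory record sits at offset 156 inside the descriptor.
    root_lba, = struct.unpack_from("<I", pvd, 156 + 2)
    root_size, = struct.unpack_from("<I", pvd, 156 + 10)
    sectors = (root_size + USER_SIZE - 1) // USER_SIZE
    return parse_directory(read_user_data(image, root_lba, sectors))
```

With the image loaded as bytes, `list_root(image)` gives the starting LBA and length of every file in the root directory. Multiplying an LBA by 2352 and adding 24 gives the offset of that file's first data byte in the *.bin, under the same sector-format assumption.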

Now, once you get to the stage where you have scanned a directory, found the file you are looking for and read all the sectors that compose that file into memory, we begin to leave the world of nice specifications and enter the world of Custom Virtual File Systems. This is when a software developer has implemented their own version of a Virtual File System that breaks up a huge archive into little bite-sized chunks. This is why, although many people here will tell you that you don't need to deal directly with the ISO9660 File System, you may still have to deal directly with sectors and LBAs by manipulating these custom virtual file systems.

How do you find these LBA tables? Well, you need to run pSX with logging enabled, and you need to log all CDROM IO. Now just play the game as normal and save the log. Any time you see a line in the log such as

[015b009c] cdrom: setloc 00000000:00000002:00000005

it will be closely followed by lines such as these
[015b00a4] cdrom: read byte 07801800 = 09 (800584dc)
[015b00a4] cdrom: write byte 07801800 = 01 (800584ec)

Those addresses in the brackets
 (800584dc)
 (800584ec)
are part of the function "setloc". Mark the entry point of that function in your disassembly.
Place breakpoints at the entry point and follow execution to the jr without entering function calls (use F7 to step into; if you reach a jal, use F6 to step over). Once you have reached a range of addresses that you are familiar with, you can reload a previous save state and inspect what is happening there. Where is the game engine obtaining the LBA? That will almost always be a large table of sector addresses
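
To match entries in such a table against the log, it helps to convert the setloc positions into LBAs. Here is a small Python sketch that pulls the minute:second:frame triple out of a log line like the one above and converts it, using the usual CD addressing of 75 frames per second with the data area starting at 00:02:00. It assumes the three fields in the log are plain numbers printed in hex rather than BCD, which I have not verified, and that the *.bin is a single raw image with 2352-byte sectors. The function names are made up.

```python
import re

SETLOC_RE = re.compile(
    r"setloc\s+([0-9a-fA-F]+):([0-9a-fA-F]+):([0-9a-fA-F]+)")

FRAMES_PER_SECOND = 75
PREGAP_FRAMES = 150        # the 2 second lead before LBA 0
RAW_SECTOR_SIZE = 2352

def parse_setloc(line):
    """Return (minute, second, frame) from a setloc log line, or None."""
    match = SETLOC_RE.search(line)
    if match is None:
        return None
    # Assumption: fields are plain hex numbers, not BCD.
    return tuple(int(group, 16) for group in match.groups())

def msf_to_lba(minute, second, frame):
    """Convert a minute:second:frame position to a logical block address."""
    return (minute * 60 + second) * FRAMES_PER_SECOND + frame - PREGAP_FRAMES

def lba_to_bin_offset(lba):
    """Byte offset of the start of a sector in a raw single-track .bin."""
    return lba * RAW_SECTOR_SIZE
```

Under those assumptions the example line (minute 0, second 2, frame 5) works out to LBA 5, so that is the kind of value to look for in the game's table.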