News: 11 March 2016 - Forum Rules, Mobile Version
Current Moderators - DarkSol, KingMike, MathOnNapkins, Azkadellia

Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - Valendian

Pages: [1]
Following pointers is a difficult task. It requires a lot of detective work. I find that you can use the stack as a cookie trail. If the pointers are read in one place then used in another, there is likely to be a function call where the pointer is passed as an argument. Break on memory read/write to the data and save state. Then search the save state for the pointer. You are hoping to see a few occurrences near each other. This area is the stack (typically found at the top of memory, around 801Fxxxx).
The stack grows down, so the occurrence with the highest address is the place where the pointer was read. Break on write to that address and you are within the function that reads the pointer. Locate the start of this function and step through it until the pointer is read. It's a lot of work following pointers, but it pays off.
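To make the save-state search concrete, here is a minimal sketch (mine, not part of the original post; the function name is made up) of scanning a RAM dump for a 32-bit little-endian pointer value and reporting the highest-address hit:

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a save-state / RAM dump for every occurrence of a 32-bit
   little-endian pointer value.  Returns the offset of the HIGHEST
   match, or -1 if the value never appears.  Since the stack grows
   down, the highest-address occurrence is the most recent stack
   slot, i.e. where the pointer was last read. */
long find_highest_pointer(const uint8_t *dump, size_t len, uint32_t ptr)
{
    long best = -1;
    for (size_t i = 0; i + 4 <= len; i++) {
        uint32_t w = (uint32_t)dump[i]
                   | (uint32_t)dump[i + 1] << 8
                   | (uint32_t)dump[i + 2] << 16
                   | (uint32_t)dump[i + 3] << 24;
        if (w == ptr)
            best = (long)i;
    }
    return best;
}
```

Add the fixed offset between save-state file and RAM base to turn the result into an address you can break on.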

ROM Hacking Discussion / Re: PSX - Relocating pointer table
« on: February 27, 2018, 05:01:45 pm »
I'm not the biggest fan of that mnemonic format, but what is this doing?
Code: [Select]
800164bc 00621821: ADDU    801a6d3c (v1), 801a6d3c (v1), 000003f8 (v0)
There is no opcode for 32-bit indexing. Is there any chance that this is a pseudo-opcode? If so, the pointer will appear split up like this:

Code: [Select]
lui v0, 0x801a
ori v0, 0x6d3c
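As a sketch of how the two halves combine (the helper names here are mine): ori simply ORs the low 16 bits onto the lui half, whereas addiu sign-extends its immediate, which is why compilers bump the lui half by one whenever the low half is 0x8000 or above:

```c
#include <stdint.h>

/* lui hi / ori lo: the high half is taken as-is. */
uint32_t from_lui_ori(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* lui hi / addiu imm: the immediate is sign-extended before the add,
   so a "negative" low half pulls the high half back down by one. */
uint32_t from_lui_addiu(uint16_t hi, int16_t imm)
{
    return ((uint32_t)hi << 16) + (uint32_t)(int32_t)imm;
}
```

So when you see lui 0x801b paired with a low half of 0x8000 or more, the real base is still in the 0x801axxxx range.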

You're using a std::vector<char> for the sliding window; I think this is creating a new problem rather than solving an existing one. The problem is made simpler by using a fixed-size C-style char array: all you need to do is use modulus wrap-around to enforce the boundary constraints. The negative start index in the sliding window is also dubious. I'm not sure if this is because you are using a vector, but I really would consider dropping the std::vector and switching to a fixed-size C-style char array.

Code: [Select]
    // using a std::vector is creating more problems than it solves
    realOffset = ((offset + i) > 0xFFF ? (offset + i) - 0x1000 : offset + i);
    buf.push_back(buf[sldWindow->getStart() + realOffset]);

Code: [Select]
    // if you use a char[] it would look more like this
    buf[end++ % sizeof(buf)] = buf[(offset + i) % sizeof(buf)];

And this appears to be LZSS. The distinguishing feature of this LZ variant is that it maintains a separate sliding window, treated as a FIFO queue. Earlier variants simply took the position in the output stream and grabbed bytes from earlier positions (the sliding window was a subset of the output stream and would contain duplicate entries).

Also, when you get around to encoding you will need to layer some other technique on top to get satisfactory results; the basic example I pointed to earlier doesn't do anything fancy, so its compression ratio is rather poor. Huffman coding can be used to improve this ratio.

ROM Hacking Discussion / Re: ROM Hacks to Disable Dithering in PS1 Games
« on: February 21, 2018, 10:18:03 pm »
You need to modify the GPU Rendering Attributes which is command GPU0(E1).

Search for the following memory-mapped IO address, 0x1F801810. It will probably appear similar to the following:
Code: [Select]
  lui     r4, 0x1F80     # r4 = 0x1F800000, base of the IO segment
  lw      r3, 0x1810(r4) # access the GPU port at 0x1F801810

Modify the code so bit 9 is cleared.
Forgot to add that you need to find the references that write 0xE1xxxxxx to that port address (the E1 command).
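As a hedged sketch of the patch itself (the helper name is hypothetical): the E1 draw-mode command carries the dither-enable flag in bit 9, so clearing that bit in every 0xE1xxxxxx word headed for the port disables dithering while leaving other commands alone:

```c
#include <stdint.h>

/* Given a word about to be written to the GP0 port (0x1F801810),
   clear bit 9 (dither enable) if it is an E1 draw-mode command;
   pass every other command through untouched. */
uint32_t patch_e1_no_dither(uint32_t cmd)
{
    if ((cmd >> 24) == 0xE1)      /* top byte selects the GPU command */
        cmd &= ~(1u << 9);        /* bit 9 = dither enable */
    return cmd;
}
```

In practice you would patch the immediate in the instruction that builds the 0xE1xxxxxx constant rather than hook the write, but the bit arithmetic is the same.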

You can zero-init the buffer with char buf[BUFFER_SIZE] = {0};

Not when the array is used more than once.

@730, if those errors only show up at the end, then all you are missing is the end-of-file logic.

EI is the bit size of the pointer and EJ is the same for the copy size. N and F convert these to the buffer size and lookahead size. The lookahead is used during compression to find longer duplicates within the sliding window. P is also only used during compression: it is the minimum string length, and anything less than this will be stored raw.
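One common arrangement of these parameters (the values here are illustrative, not necessarily the ones in your target) looks like:

```c
/* Okumura-style LZSS parameters: EI bits of window offset, EJ bits of
   match length, P = minimum match worth encoding as a pointer. */
#define EI 11                 /* offset bits  -> window size          */
#define EJ 4                  /* length bits  -> lookahead size       */
#define P  2                  /* shorter matches are stored raw       */
#define N  (1 << EI)          /* sliding-window (buffer) size = 2048  */
#define F  ((1 << EJ) + P)    /* lookahead size = 18                  */
```

Since a stored length of j really means a copy of j + P bytes, the encoder never wastes codes on matches too short to pay for themselves.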

The decoder works like this.
Read a bit.
If it's 1, the next byte is raw: read 8 bits and copy that byte from the packed input to the output. Every raw byte is also appended to the sliding window (buffer).

But if the bit was clear, read a pointer + size pair and copy from the window instead.

Your version is not bit-packing, so replace getbit with readbyte.

Note LZSS uses a sliding window; plain LZ just offsets from the current output pointer. Watch that slight difference. As Gemini noted, all LZ decoders have a similar form; the different versions are just tweaks.

it keeps getting shifted right by 1 bit every loop that a byte is stored to the decompressed area (also at some point it gets 0xFF added to it, such that it ends up being 0xFFXX, XX being the loaded byte, and I think this loaded byte is also the bytes that are "skipped" that I mentioned).

I think you found the getbits function. The typical LZ type decoder has three paths:
  1  Uncompressed raw data
  2  Compressed pointer / size pair
  3  End of file marker (-1 in the code below)
Although any attempt to compress such a small payload will result in expansion, I think this is more of a text preprocessor than any form of depacker or decrypter.

Anyways here's what an LZ decoder looks like
Code: [Select]
// state shared by the decoder (N, F, EI, EJ are the parameters above)
unsigned int   insize;        // bytes available in indata
unsigned int   outsize;       // bytes produced so far
unsigned char *indata;        // packed input
unsigned char *outdata;       // decompressed output
unsigned int   inptr;         // read position in indata
unsigned int   buf;           // current bit-buffer byte
unsigned int   mask = 0;      // bit mask, 0 = need a new byte
unsigned char  buffer[N * 2]; // sliding window (N must be a power of two)

// read n bits, MSB first; returns -1 at end of input
int getbit(int n) {
    int i, x;
    x = 0;
    for (i = 0; i < n; i++) {
        if (mask == 0) {
            if (inptr >= insize) return -1;
            buf = indata[inptr++];
            mask = 128;
        }
        x <<= 1;
        if (buf & mask) x++;
        mask >>= 1;
    }
    return x;
}

void decode(void) {
    int i, j, k, r, c;
    for (i = 0; i < N - F; i++) buffer[i] = 0; // clear the window to a known fill byte
    r = N - F;
    while ((c = getbit(1)) != -1) {
        if (c) {                               // flag bit set: one raw byte
            if ((c = getbit(8)) == -1) break;
            outdata[outsize++] = c;
            buffer[r++] = c;
            r &= (N - 1);
        } else {                               // flag bit clear: window offset + length
            if ((i = getbit(EI)) == -1) break;
            if ((j = getbit(EJ)) == -1) break;
            for (k = 0; k <= j + 1; k++) {
                c = buffer[(i + k) & (N - 1)];
                outdata[outsize++] = c;
                buffer[r++] = c;
                r &= (N - 1);
            }
        }
    }
}

The bottom one is not compressed: "Ten emos que subdir". pSX can break on CDROM DMA; if you do not know the DMA address, then set the range to all memory, 80000000-801FFFFF.

The PS1 memory addresses usually end in 0x80 (the high byte, stored last in little-endian).

Just to expand this a little:
pointers which refer to cached memory are in the 2 MB range:
    0x80000000 (00 00 00 80) - 0x801FFFFF (FF FF 1F 80)
Tune your eye to see 80 in the fourth column of a 4-byte word; it's an important signature for a pointer.
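A quick sketch of that signature as a programmatic check (the helper name is mine): a word falls in the cached 2 MB range exactly when its top 11 bits are 100 0000 0000:

```c
#include <stdint.h>

/* A word is a plausible PSX KSEG0 pointer when it lies in the 2 MB
   cached range 0x80000000-0x801FFFFF, i.e. its top 11 bits equal
   0b10000000000 (0x400). */
int looks_like_psx_pointer(uint32_t w)
{
    return (w >> 21) == 0x400;
}
```

Handy when sweeping a save state for pointer tables rather than eyeballing the hex.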

The MIPS CPU strictly enforces alignment of data. The instruction set requires that 4-byte words lie on a 4-byte boundary, and likewise halfwords (2 bytes) on a 2-byte boundary. However, small data like a byte may just happen to be 4-byte aligned. You can use a debugger to verify the size of data once you know where it lives in RAM. Place a break-point on read/write. You will see one of the following assembly instructions:
  4 byte word ... LW/SW (load/store)
  2 byte half ... LH/LHU/SH
  1 byte ... LB/SB
(Just be mindful that memory transfers will use word copies for byte arrays).

If you are using a debugger then the bytes are right there, ready to be fuzzed; you can save/reload state and the turnaround time is instant. You can lean on the hex editor for searching the save state. There is a fixed difference between the save-state offset and the RAM address, for pSX at least (it doesn't compress its save states).

Nice choice to take on Tekken 3. Shows good taste. I did notice that those data structures have a variable-length string of text. This usually means that the name is the last thing in the structure. You already noticed those names are zero-filled to a four-byte alignment; nulls like these are used to mark the end of text. Now, you have a note that indicates that the structure begins four bytes later. I would question that.
Not sure if it helps, but have you tried counting up all the names and searching for that number? You will likely find a descriptor in the header.

Keep fuzzing those bytes

ROM Hacking Discussion / Re: How to use pSX emulator with hardware logging?
« on: February 02, 2018, 01:16:12 pm »
You need to use the Front End or else it doesn't capture things correctly.
When you have the logging window open, the log will display all BIOS calls and parameters. However, you will need to learn how to read each log entry; the information is there but can be tricky to extract.

The pSX frontend can force pSX to run in a 320x240 window; grab it from here

fixed dead link

Gaming Discussion / Re: Vagrant Story Graphics Hack Released.
« on: September 14, 2014, 06:34:55 am »
Hmm... Model Swap + Twin Blades + All Debug Rooms

Just a graphics hack you say?

In answer to the question cross posted to the following forum

I thought it best to keep all this in one place.

Quote from: creeperton
Ok, time for an update.

Vehek found the value I need to change to make that one code permanent. He found it in BATTLE.ARC.

It turns out that BATTLE.ARC is compressed. It uses LZS compression, which is handy because there are tools to work with that at Qhimm.

LZS tools:

An *.ARC file is an uncompressed archive.

I split BATTLE.ARC into its 2 subfiles, subfile #0 and subfile #1. #0 apparently doesn't use LZS compression - I'll scan it with Trid later. #1 is LZS and it decompresses just fine.

My current questions for you are semi-related to these things.

My spreadsheets don't work too great because I don't have a tool that allows me to import multiple files into a disc image all at once. This is a problem because I have a directory of 256 files that I need to import into a disc image. CD Mage doesn't let me do this, nor does CD Tool or cdprog.

The solution is to work directly with the disc image.

When I was making my magic-only mod I needed to calculate the defense values of the armors in the game. This is a pain, because there are 5 bytes which determine these values. They work like this:

base defense = signed byte
defense boost 1 = signed byte
defense boost 2 = signed byte
defense selector 1 = 8 bits, 1 for each element
defense selector 2 = 8 bits, 1 for each element

=IF(defense selector 1 = yes)AND(defense selector 2 = yes)
THEN(effective defense = base defense + defense boost 1 + defense boost 2)

=IF(defense selector 1 = yes)AND(defense selector 2 = no)
THEN(effective defense = base defense + defense boost 1)

=IF(defense selector 1 = no)AND(defense selector 2 = yes)
THEN(effective defense = base defense + defense boost 2)

=IF(defense selector 1 = no)AND(defense selector 2 = no)
THEN(effective defense = base defense)

The actual spreadsheet is more involved than this, this is just a summary.

My point is that after doing this I realized that I can easily calculate base addresses to patch to each file in the disc image, but only if I know exactly how those addresses are found in the disc image.

Is there a specification for how *.bin/*.cue disc images are organized? Like is there a header which lists the starting points of each file, is there some relationship between the size, or name, or type, or location of a file in the human-readable disc image you see when you open it in CD Mage - and the location of that file in the human-unfriendly version?

Also what is error correction data? How can I locate it? Is it located in certain fixed places in a disc image (every 40,000 bytes)?

For example, let's say I have a disc image called "game.bin" with 3 files in it.

file-------length (bytes)

How would I find the location and length of each of these files in game.bin? How would I find error correction data?
SaGa Frontier Community Forum

Yes, there are standard specifications, a whole host of them. The file system is ISO9660, specified in ECMA-119, and the physical sector layout in ECMA-130; the broader family of CD standards is informally known as "The Rainbow Books". You don't need to worry about this stuff unless you are building CD authoring software, but I will give you an overview of the topic and you can look further if you wish.

ECMA-130 specifies the physical structure of a CDROM. This details how the sectors are laid out and some of the various formats a sector may take. The error correction is found within the sector: each sector contains its own error detection and correction. There are different versions of sectors for different purposes. Music CDs don't require strong error correction, as the player can simply interpolate between the last good sector and the next good sector and no one would be the wiser. But data CDs place a much higher burden on the error correction; such sectors sacrifice usable disc space for integrity against defects. It's a compromise between capacity and recoverability.

The error detection and correction is implemented as a Reed-Solomon Product-Like Code with two channels. The inner channel (called P parity in ECMA-130) is much weaker than the outer channel (Q parity). P parity can correct up to 2 bytes within a 28-byte block, but more importantly, P parity can be used to flag which 28-byte blocks contain errors that can be handled by the stronger outer channel, which can recover up to 4 bytes per 28-byte block. Taken together, P and Q parity provide almost perfect protection from defects. For your information, a P parity check looks like this:
 P(x) = x^8 + x^4 + x^3 + x^2 + 1

To understand the virtual file system you will need to read ECMA-119. This will tell you more than you need to know about how files are organised on the physical disc: how sectors can be allocated to files, how you can move files to a new range of sectors, how you can resize a file, or even create new files and directories.

The first 16 sectors are reserved for system use; Sony uses these sectors to store their license agreement data (which, by the way, has invalid error correction/detection by design). Then sector LBA=16 contains the Primary Volume Descriptor. All PSX discs have only one such descriptor, but there may be multiple descriptors, called supplementary volume descriptors; PSX doesn't use them. The last volume descriptor is called the volume descriptor set terminator, found at sector LBA=17.

Next follows the path table, which is duplicated 4 times (twice in little endian and twice in big endian, for redundancy purposes). You can use the path tables as a quick way to locate files given a file path. The path tables store only references to directories, with no references to the files they contain, so to use the path table method you seek the directory, then scan that directory for the file. It's like a shortcut, but it actually requires that you write more code to make use of it, as the act of scanning a directory is all that is really required and can be achieved without the path table.

Once all of the path tables have been defined, the next sector begins the root directory record. This contains directory entries for all the subdirectories and files contained within the root directory. These record entries will tell you the file name, its size, and its sector number (in the form of an LBA). The only differences between an entry that refers to a subdirectory and an entry that refers to a file are
[1] a sub directory will have the directory flag set in the flags field
[2] a file will have an extension and a ";1" appended (if there are multiple versions of a file with the same name you would have "file.ext;1", then "file.ext;2", "file.ext;3", and so on). The file names are listed in alphabetical order with no distinction between files and directories.
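For illustration, here is a minimal sketch of pulling the interesting fields out of one directory record (offsets per ECMA-119; the struct and helper names are mine, and only the little-endian copies of the both-endian numeric fields are read):

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t lba;        /* first sector of the file's extent        */
    uint32_t size;       /* data length in bytes                     */
    uint8_t  flags;      /* bit 1 set = this entry is a directory    */
    char     name[64];   /* identifier, e.g. "SLUS_000.01;1"         */
} DirEntry;

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Parse one directory record; returns its length field
   (0 = no more records in this sector). */
int parse_dir_record(const uint8_t *rec, DirEntry *out)
{
    int len = rec[0];                 /* offset 0: record length     */
    if (len == 0) return 0;
    out->lba   = rd_le32(rec + 2);    /* offset 2: extent LBA (LE)   */
    out->size  = rd_le32(rec + 10);   /* offset 10: data length (LE) */
    out->flags = rec[25];             /* offset 25: file flags       */
    int idlen = rec[32];              /* offset 32: identifier len   */
    if (idlen > 63) idlen = 63;
    memcpy(out->name, rec + 33, idlen);
    out->name[idlen] = '\0';
    return len;
}
```

Walking a directory sector is then just repeated calls, advancing by the returned length until it comes back 0.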

Now, once you get to the stage where you have scanned a directory, found the file you are looking for, and read all the sectors that compose that file into memory, we begin to leave the world of nice specifications and enter the world of custom virtual file systems. This is when a software developer has implemented their own version of a virtual file system that breaks up a huge archive into little bite-sized chunks. This is why many people here will tell you that you don't need to deal directly with the ISO9660 file system: you may still have to deal directly with sectors and LBAs by manipulating these custom virtual file systems.

How do you find these LBA tables? Well, you need to run pSX with logging enabled and log all CDROM IO. Now just play the game as normal and save the log. Any time you see a line in the log such as

[015b009c] cdrom: setloc 00000000:00000002:00000005

it will be closely followed by lines such as these
[015b00a4] cdrom: read byte 07801800 = 09 (800584dc)
[015b00a4] cdrom: write byte 07801800 = 01 (800584ec)

Those addresses in the brackets are part of the function "setloc". Mark the entry point of that function in your disassembly.
Place break points at the entry point and follow execution to the jr without entering function calls (use F7 to step into; if you reach a jal, use F6 to step over). Once you have reached a range of addresses that you are familiar with, you can reload a previous save state and inspect what is happening there. Where is the game engine obtaining the LBA? That will almost always be a large table of sector addresses.
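The setloc arguments in that log line are BCD minutes:seconds:frames, so a small helper (mine, not pSX's) converts one of those entries to an LBA, remembering the 150-sector (two second) pregap:

```c
#include <stdint.h>

/* BCD byte -> binary, e.g. 0x15 -> 15 */
static int from_bcd(uint8_t b) { return (b >> 4) * 10 + (b & 0x0F); }

/* MSF (BCD, as passed to the CDROM setloc command) -> logical block
   address: 75 frames per second, and LBA 0 sits at MSF 00:02:00. */
int32_t msf_to_lba(uint8_t m, uint8_t s, uint8_t f)
{
    return (from_bcd(m) * 60 + from_bcd(s)) * 75 + from_bcd(f) - 150;
}
```

So "setloc 00000000:00000002:00000005" in the log above is LBA 5.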

Newcomer's Board / Re: Vagrant Story [PSX] SHP Model Format
« on: May 03, 2013, 02:02:19 pm »
The PlayStation lacked a floating point unit, so all PSX games use fixed point in place of floating point. The values are stored as normal signed integers. The Q format used can vary widely, but Q16.16 or Q8.8 are the most logical choices. I just do a straight cast to float, so the floats are integers in the range -65535.0 to +65535.0.
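A hedged sketch of the conversion, if you want real fractional values instead of the straight cast (the function names are mine): divide by 2 to the power of the fraction-bit count.

```c
#include <stdint.h>

/* Q8.8: 8 integer bits, 8 fraction bits -> divide by 256.  */
float q8_8_to_float(int16_t v)   { return (float)v / 256.0f; }

/* Q16.16: 16 integer bits, 16 fraction bits -> divide by 65536. */
float q16_16_to_float(int32_t v) { return (float)v / 65536.0f; }
```

Which Q format a given model uses is something you have to confirm by eye: render with both and see which one produces sane proportions.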

Another thing to watch out for is that the polygons index the vertices by a sort of compromise shifted offset: it's one shift left to make it a true offset, and two shifts right are required to make it an array index.

There are duplicate polygons. No idea why, but they are there; if you figure out why some polygons need to be duplicated, please let me know.

Some models still don't parse correctly; perhaps these models were works in progress that got cut from the final production.

Anyway, I will share my SHP/WEP model viewer and its source code via PM. It is not animated, but you can learn from it.

Please keep in touch with me.

ROM Hacking Discussion / Re: Replacement for CD Mage in the works
« on: March 04, 2012, 11:51:19 pm »
For most PS1 games there are no such tools. I'm thinking of the future reverser that wants to break new ground and explore games that haven't been touched before. This is a situation where generic tools are the only option. You wouldn't need to resort to such a low level approach if there were tools already available that did the job.

I do understand your point about keeping all the files on disk and keeping the image "clean". But you still have to keep a virgin image somewhere and make copies of it that you can mangle to your heart's content. I do this myself. But what I'm worried about is the time that is wasted extracting the file from the image to disk, only to open the file in a hex editor, make a few edits, and reinsert it back into the image. This could be so much more productive and user friendly. Just stick a plugin architecture onto the image tool. There could be a hex editor plugin, or a disassembler plugin, or a TIM plugin. Who knows what a determined user would like to add to it.

ROM Hacking Discussion / Re: Replacement for CD Mage in the works
« on: March 04, 2012, 12:04:27 pm »
@Gemini: It's true that the game may ignore the TOC entirely, but that doesn't mean that it isn't important to maintain the TOC. How would you go about extracting/importing files using standard CD image tools if the TOC is invalid? Does it not make sense to keep the files readily accessible? The modder would need to maintain the internal LBAs and filesizes, which is something you would have to do anyway. But the files would still be visible to general-purpose tools, not just dumped somewhere in the image and out of reach.

Personal Projects / Re: Code Naturalizer
« on: October 24, 2010, 10:10:50 pm »
So is this gonna be a tool that generates the type of comments an amateur asm coder would write, or will it actually be more of a decompiler?

If it's gonna be a decompiler, I'd suggest you forget about the natural language interface and just focus on decompilation. The ideal user of such a tool would be someone well versed in coding asm and some form of HLL. Cater to this person's needs; let the person who's only learning asm pick things up in their own good time.

The difference is a tool that looks at a push opcode and says "a value is being pushed on top of the stack" versus a tool that analyzes further and finds that this is a local variable kept on the stack, and that it appears to be an unsigned int, or a pointer to a structure made up of 5 ints and 2 chars. It's this kind of feedback that is needed, IMO.

Pages: [1]