Saving gfx memory with MAME Debugger - "Unable to find CPU ':gfx'"

Started by Hop, February 13, 2016, 12:11:58 PM

Hop

Quote: Did you notice how all the 8x8 tiles are duplicated twice?

No - I've only managed to extract the first 8x8 set so far. TBH I've been struggling to figure out how to calculate the start addresses for the other sets from the gfx_range mapper_S9263B_table, or perhaps I am looking in the wrong place? Do all of the sets start at address zero, but only have a valid range in the address space?

I've tried to confirm that all the 8x8 tiles are duplicated twice with a quick search of gfx memory in a hex editor. I expected every 4-byte aligned 4-byte sequence from the region near address zero to be seen elsewhere. This seems true for most, but not all sequences.

Quote: two 8x8 layouts that are offset by 32 bits but otherwise identical

I don't understand where the 32 bit offset comes from. From my understanding memory is built up by adding two bytes from ROM 1, then two from 2, then 3 then 4 then back to 1 and round again - each cycle will be 8 bytes. Each 8x8 element is 8x8x4 bits = 32 bytes and is contiguous in memory (albeit in short 1D bitplanes). I think this means that the data for each 8x8 element must be split between the 4 ROMs. This would mean that to draw 8x8 pixels data is required from all 4 roms. I must have misunderstood something.

EDIT: I think I've figured this out. In the 8x8 layouts the y spacing is 4*16 bits = 8 bytes, whereas each row is only 4 bytes. So each pair of tiles in the pair of 8x8 sets is interleaved row by row. For each row, a tile from the first set uses 2 bytes from ROM 1 and 2 bytes from ROM 2, and a tile from the second set uses 2 bytes from ROM 3 and 2 bytes from ROM 4. This also explains why layout.charincrement is 2x the tile size in bytes.
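
The interleave described in the EDIT above can be sketched roughly as follows. This is only an illustration under the assumptions stated in this post (2 bytes per ROM per cycle, an 8-byte row stride, the second set offset by 4 bytes, 64 bytes between consecutive tiles). All names here are made up for the example and are not MAME's.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout, taken from the description above (hypothetical names):
// interleaved memory = 2 bytes from ROM 1, 2 from ROM 2, 2 from ROM 3,
// 2 from ROM 4, then round again.
constexpr std::uint32_t kBytesPerRomChunk = 2;  // each ROM supplies 2 bytes per cycle
constexpr std::uint32_t kRomsPerBank      = 4;
constexpr std::uint32_t kRowStrideBytes   = kBytesPerRomChunk * kRomsPerBank; // 8
constexpr std::uint32_t kRowBytes         = 4;  // 8 pixels x 4 bits per pixel
constexpr std::uint32_t kRowsPerTile      = 8;
constexpr std::uint32_t kCharIncrement    = kRowStrideBytes * kRowsPerTile;   // 64

// Byte offset, within the interleaved data, of one row of an 8x8 tile.
// set = 0 is the copy held in ROMs 1+2, set = 1 is the copy in ROMs 3+4.
constexpr std::uint32_t Tile8x8RowOffset(std::uint32_t tile, std::uint32_t row,
                                         std::uint32_t set)
{
    return tile * kCharIncrement + row * kRowStrideBytes + set * kRowBytes;
}

// Which ROM of the bank (0..3) supplies the byte at an interleaved offset.
constexpr std::uint32_t RomForOffset(std::uint32_t offset)
{
    return (offset % kRowStrideBytes) / kBytesPerRomChunk;
}

// charincrement is twice the size of a single 8x8 tile (32 bytes).
static_assert(kCharIncrement == 2 * kRowBytes * kRowsPerTile,
              "tile spacing should be 2x the tile size");

// Small self-check: rows of set 0 come from ROMs 0 and 1, set 1 from ROMs 2 and 3.
inline void CheckTileInterleave()
{
    const std::uint32_t a = Tile8x8RowOffset(3, 5, 0);
    const std::uint32_t b = Tile8x8RowOffset(3, 5, 1);
    assert(RomForOffset(a) == 0 && RomForOffset(a + 2) == 1);
    assert(RomForOffset(b) == 2 && RomForOffset(b + 2) == 3);
}
```

With this layout no single 8x8 tile needs data from all four ROMs: each row of a tile touches only one pair of ROMs.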


Quote: otherwise the game could only display a particular tile at odd screen positions or at even positions depending on which ROMs it was in.

I might get this for backgrounds in which the tiles are drawn on a regular grid, but if an 8x8 sprite can be drawn at any pixel position then do you mean odd and even positions within a 2D array of tiles that make up an object?

Quote: Are you familiar at all with 2D console hacking (NES, SNES, GBA...)?

I've programmed the Amiga 500 and modern GPUs, but I missed out on the classic tile era and I'm enjoying learning how it all worked. I guess that old video hardware liked to have data fed to it in a particular format for performance. I actually find it quite hard to believe that data has to be duplicated to facilitate it in this case!

AWJ

Yep, you've got it. I think it works that way because the hardware fetches data from all four ROMs in parallel, over a 64-bit data bus (each ROM is 16 bits wide). The ROM chips available in 1988 weren't very fast, so rendering 3 scrolling layers plus sprites at an 8MHz dot clock required a fair bit of parallelism.

Quote: I might get this for backgrounds in which the tiles are drawn on a regular grid, but if an 8x8 sprite can be drawn at any pixel position then do you mean odd and even positions within a 2D array of tiles that make up an object?

There are no 8x8 sprites on CPS1. There are four layers: a tilemap (scrollable character grid) of 8x8 tiles, a tilemap of 16x16 tiles, a tilemap of 32x32 tiles, and sprites that are made out of 16x16 cels. It will help you understand the MAME source if you know that "scroll 1" is the 8x8 tilemap, "scroll 2" is the 16x16 one, and "scroll 3" is the 32x32.

Okay, if you look at the mapper table you'll see that there are three "banks", each corresponding to one set of four ROMs. Bank 0 is the ROMs numbered 1, 2, 3 and 4; bank 1 is the ROMs numbered 5, 6, 7 and 8, and bank 2 is the ROMs numbered 10, 11, 12 and 13. Banks 0 and 1 are entirely used by sprites, while bank 2 contains the last few thousand sprites, then the 32x32 tiles (SCROLL3), then the 8x8 tiles (SCROLL1), then the 16x16 tiles (SCROLL2).

You can see in gfxrom_bank_mapper() that if the requested code is a sprite or a SCROLL2 tile it gets multiplied by 2 (left shifted by 1, same thing), if it is a SCROLL3 tile it gets multiplied by 8, and if it is a SCROLL1 tile it is left alone. So the "unit" that the ranges in the mapper table are measured in is one duplicated 8x8 tile (which is half a 16x16 tile, or one eighth of a 32x32 tile). Meaning 16 bytes in one ROM, or 64 bytes in the whole interleaved set.
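
As a rough sketch of the unit arithmetic in the paragraph above: the shifts and the 16/64-byte unit sizes are taken from this post, while the enum and function names are invented for the illustration and are not the MAME identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; only the shift values and unit sizes come from the text.
enum class GfxKind { Sprites, Scroll1, Scroll2, Scroll3 };

constexpr std::uint32_t kUnitBytesPerRom      = 16; // one duplicated 8x8 tile, per ROM
constexpr std::uint32_t kUnitBytesInterleaved = 64; // same unit across all 4 ROMs

// How many bits a tile code is shifted left to express it in mapper units.
inline unsigned UnitShift(GfxKind kind)
{
    switch (kind)
    {
    case GfxKind::Scroll1: return 0; // 8x8   = 1 unit
    case GfxKind::Sprites: return 1; // 16x16 = 2 units
    case GfxKind::Scroll2: return 1; // 16x16 = 2 units
    case GfxKind::Scroll3: return 3; // 32x32 = 8 units
    }
    return 0; // not reached for valid enum values
}

// Convert a tile or sprite code into the units used by the mapper table ranges.
inline std::uint32_t UnitsFromCode(GfxKind kind, std::uint32_t code)
{
    return code << UnitShift(kind);
}

// Size of one graphic element in the interleaved data, in bytes.
inline std::uint32_t ElementBytesInterleaved(GfxKind kind)
{
    return kUnitBytesInterleaved << UnitShift(kind);
}
```

For example, a 32x32 tile code of 0x10 corresponds to unit 0x80, and each 32x32 tile occupies 8 * 64 = 512 interleaved bytes, which matches 32 x 32 pixels at 4 bits per pixel.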

So, to answer your original question: the 32x32 tiles are at 0x20000-0x3FFFF in ROMs 10, 11, 12 and 13 (that's the address in each individual ROM--multiply by 4 if you're working with them interleaved). The 8x8 tiles are at 0x40000-0x4FFFF in those same ROMs, and the 16x16 tiles are at 0x50000-0x7FFFF. The first 0x20000 bytes of ROMs 10, 11, 12, 13, and the entirety of the other eight ROMs, contain sprites (which are in exactly the same format as 16x16 tiles).
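
The addresses above follow from multiplying the mapper unit numbers by 16 bytes per ROM. A minimal sketch of that arithmetic, with made-up helper names, and with the unit ranges in the comments worked backwards from the addresses quoted in this post rather than copied from the MAME table:

```cpp
#include <cassert>
#include <cstdint>

// Assumptions from the text: one mapper unit is 16 bytes in each ROM, and a
// bank is 4 ROMs interleaved, so interleaved offsets are 4x the per-ROM ones.
constexpr std::uint32_t kUnitBytesPerRom = 16;
constexpr std::uint32_t kRomsPerBank     = 4;

// First byte of a unit within a single ROM of the bank.
constexpr std::uint32_t PerRomStart(std::uint32_t unit)
{
    return unit * kUnitBytesPerRom;
}

// Last byte of a unit within a single ROM of the bank.
constexpr std::uint32_t PerRomEnd(std::uint32_t unit)
{
    return unit * kUnitBytesPerRom + (kUnitBytesPerRom - 1);
}

// Same position expressed as an offset into the bank's interleaved data.
constexpr std::uint32_t InterleavedStart(std::uint32_t unit)
{
    return PerRomStart(unit) * kRomsPerBank;
}

// Unit ranges implied by the addresses in this post (derived, not quoted):
//   32x32 tiles: units 0x2000-0x3FFF -> 0x20000-0x3FFFF per ROM
//   8x8 tiles:   units 0x4000-0x4FFF -> 0x40000-0x4FFFF per ROM
//   16x16 tiles: units 0x5000-0x7FFF -> 0x50000-0x7FFFF per ROM
static_assert(PerRomStart(0x2000) == 0x20000, "32x32 start");
static_assert(PerRomEnd(0x3FFF)   == 0x3FFFF, "32x32 end");
static_assert(InterleavedStart(0x2000) == 0x80000, "32x32 start, interleaved");
```

These offsets are relative to the start of the bank (ROMs 10 to 13 here), not to the start of the whole 12-ROM region.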

If you press F4 and look at the graphics in MAME, you'll see the entire region (all 12 ROMs) decoded in all three formats, meaning you have to page through lots and lots of garbage before you get to valid 8x8 or 32x32 tiles. That's because MAME uses the same GFXDECODE info for all CPS1 games, regardless of which mapper they use. Most 1980s sprite-and-tile arcade systems weren't as flexible as CPS1 with its single ROM bus and per-game mappers; in most systems each layer had an entirely separate bus to its own dedicated ROMs, and the MAME F4 graphics viewer was designed with that as the assumption.

ETA: I think the "8x8" graphics you think you've extracted may actually be the quarters of the 16x16 sprites. The 8x8 and 16x16 formats have the same row stride, so if you decode one of them as the other you'll still get recognizable graphics (whereas the 32x32 will look like complete garbage).

Hop

Many thanks for all of the info.
Quote: There are no 8x8 sprites on CPS1 ... sprites that are made out of 16x16 cels ... I think the "8x8" graphics you think you've extracted may actually be the quarters of the 16x16 sprites ...

Yes. I was getting confused by seeing pieces of sprite data in the first two 8x8 layout visualisations in the MAME GFX viewer. I can now see that the scroll 1 data is indeed duplicated. I've managed to implement MAME GFX viewer / sf2 test mode "character" viewer and I almost understand everything that's going on.

Quote: You can see in gfxrom_bank_mapper() that if the requested code is a sprite or a SCROLL2 tile it gets multiplied by 2 (left shifted by 1, same thing), if it is a SCROLL3 tile it gets multiplied by 8, and if it is a SCROLL1 tile it is left alone. So the "unit" that the ranges in the mapper table are measured in is one duplicated 8x8 tile (which is half a 16x16 tile, or one eighth of a 32x32 tile). Meaning 16 bytes in one ROM, or 64 bytes in the whole interleaved set.

This still confuses me a little. I've re-jigged the logic in the function into two functions that make sense to me:


void CalculateMinMaxCode(GfxType gfxType, unsigned int& minCode_out, unsigned int& maxCode_out)
{
    int shift = 0;
    switch (gfxType)
    {
    case kGfxType_Sprites:
        shift = 1; // 16x16 = 2 units (2x 2x8x8 tiles)
        break;
    case kGfxType_Scroll1:
        shift = 0; // 8x8 = 1 unit
        break;
    case kGfxType_Scroll2:
        shift = 1; // 16x16 = 2 units
        break;
    case kGfxType_Scroll3:
        shift = 3; // 32x32 = 8 units
        break;
    default:
        FATAL_ERROR( "Unhandled case" );
    }

    unsigned int minUnitIndex = UINT32_MAX;
    unsigned int maxUnitIndex = 0;
    const struct gfx_range *range = mapper_table;
    ASSERT(range);
    bool bFoundRange = false;
    while (range->gfxType != kGfxType_Invalid)
    {
        if (range->gfxType == gfxType)
        {
            minUnitIndex = Min(minUnitIndex, range->start);
            maxUnitIndex = Max(maxUnitIndex, range->end);
            bFoundRange = true;
        }

        ++range;
        ASSERT(range);
    }

    ASSERT(bFoundRange);
    minCode_out = minUnitIndex >> shift;
    maxCode_out = maxUnitIndex >> shift;
}

unsigned int FindGfxAddressInBytes( GfxType type, unsigned int code )
{
    int shift = 0;
    switch (type)
    {
    case kGfxType_Sprites:
        shift = 1; // 16x16 = 2 units (2x 2x8x8 tiles)
        break;
    case kGfxType_Scroll1:
        shift = 0; // 8x8 = 1 unit
        break;
    case kGfxType_Scroll2:
        shift = 1; // 16x16 = 2 units
        break;
    case kGfxType_Scroll3:
        shift = 3; // 32x32 = 8 units
        break;
    default:
        FATAL_ERROR( "Unhandled case" );
    }

    // find range containing this code
    unsigned int unitIndex = code << shift;
    const struct gfx_range *range = mapper_table;
    ASSERT(range);
    while (range->gfxType != kGfxType_Invalid)
    {
        if( (range->gfxType == type) && (unitIndex >= range->start) && (unitIndex <= range->end) )
        {
            unsigned int bankBaseInUnits = range->bank * kBankSizesInUnits;
            unsigned int unitIndexInBank = unitIndex & (kBankSizesInUnits - 1);
            unsigned int unitIndexInRom = (bankBaseInUnits + unitIndexInBank);
            unsigned int address = unitIndexInRom * kUnitSizeBytes;
            ASSERT(address < kTotalGfxRomSizeBytes);
            return address;
        }

        ++range;
    }

    return kInvalidOffset;
}


CalculateMinMaxCode calculates the valid range of codes for each type, which can be fed into FindGfxAddressInBytes to find the address in the (interleaved) gfx ROM so that the tile can be drawn. The min codes for each set match the min values I see in gfx RAM for the 4 types, which is good (sprite codes start at zero, scroll 1 starts at $4000, scroll 2 at $2800, scroll 3 at $400).

So it seems that the original code in cps_state::gfxrom_bank_mapper is designed to take a tile code (which is different for each set) and return an index into gfx ROM memory in units of the size of that set's element.



AWJ

Yeah--the output of the MAME gfxrom_bank_mapper() function is in MAME "gfx_element" units, which start at the beginning of the first set of ROMs and are variable in size. An output of "0x8000" when a 32x32 tile is requested means "the 0x8000'th 32x32 tile if all the ROMs are bodged together and all decoded as 32x32". An output of "0x8000" when requesting an 8x8 tile means "the 0x8000'th 8x8 tile", which is not at the same address at all.
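
To make the "same index, different address" point concrete, here is a small sketch. The element strides are assumptions derived from the sizes discussed earlier in this thread (64 bytes between 8x8 elements because of the duplication, 128 bytes per 16x16, 512 bytes per 32x32), and the names are hypothetical rather than MAME's.

```cpp
#include <cassert>
#include <cstdint>

// Assumed spacing between consecutive elements when the whole interleaved
// region is decoded in a single format (derived from this thread, not from
// the MAME headers).
constexpr std::uint32_t kStride8x8   = 64;  // 2 x 32 bytes: the two copies interleave
constexpr std::uint32_t kStride16x16 = 128; // 16 x 16 pixels x 4 bits / 8
constexpr std::uint32_t kStride32x32 = 512; // 32 x 32 pixels x 4 bits / 8

// Byte offset of the index'th element from the start of the region,
// for a given element stride.
constexpr std::uint32_t ElementByteOffset(std::uint32_t index, std::uint32_t strideBytes)
{
    return index * strideBytes;
}

// The same index lands at very different byte offsets depending on the format.
static_assert(ElementByteOffset(0x8000, kStride8x8)   == 0x200000,  "8x8");
static_assert(ElementByteOffset(0x8000, kStride16x16) == 0x400000,  "16x16");
static_assert(ElementByteOffset(0x8000, kStride32x32) == 0x1000000, "32x32");
```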

Like I said, the MAME 2d graphics subsystem assumes that each category of graphic data (16x16 sprites, 8x8 foreground tiles, 32x32 background tiles, etc.) resides in its own completely separate set of ROMs, which is how most arcade hardware works. Take a look at e.g. 1943, which is a board that fits MAME's assumptions very closely (and as a result the amount of code in its "video" module is tiny). Rather than a single "gfx" ROM region, 1943 has one region for each layer, and character codes in emulated VRAM directly map to MAME gfx_element indexes. CPS1 has to be massaged a bit in order to fit into the MAME system, and that massaging slightly obfuscates what's going on at the real hardware level.