News: 11 March 2016 - Forum Rules
Current Moderators - DarkSol, KingMike, MathOnNapkins, Azkadellia, Danke

Poll

Is it worth the effort to update IPS?

Yes, I think it would be useful
13 (56.5%)
No, this idea wouldn't be worth it
10 (43.5%)

Total Members Voted: 23

Author Topic: Proposal: IPS Revision for CRC checking  (Read 7910 times)

Spinner 8

  • Sr. Member
  • ****
  • Posts: 447
  • Pink Pretty Princess
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #20 on: April 01, 2016, 10:26:20 am »
It is the reason why I like IPS personally. That it doesn't ensure file identity.

Same. As long as there's a prompt, or a (disabled by default) selectable option to continue anyways, then I'm fine. The two BPS patchers I've tried don't have anything like that - they just refuse to work.

STARWIN

  • Sr. Member
  • ****
  • Posts: 450
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #21 on: April 01, 2016, 03:16:01 pm »
Putting it at the front has a problem I didn't think of before... it hinders patch-layering for patches which change the first couple of bytes (which may be common in NES hacks).

What does this mean?

Maybe they could be put at the front... but instead of the offset always being 0, it could be arbitrary -- that way you could put metadata blocks in a region of the file that the patch will be overwriting anyway.  I think I like that better.

Why don't you want to rewrite old data? Also, you cannot guarantee that the ips will have a sequence long enough for that anyway.

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #22 on: April 01, 2016, 04:23:09 pm »
What does this mean?
Why don't you want to rewrite old data? Also, you cannot guarantee that the ips will have a sequence long enough for that anyway.

If metadata blocks are at offset 0, they'll overwrite/mangle the first few bytes of the file.  To counteract that, the IPS will have to include the first few bytes of the original file in order to "fix" the mangling caused by the metadata blocks.

This can be a problem for patch layering.  If patch A changes the first few bytes, and you want to layer patch B over patch A ... patch B will overwrite A's changes... even if B doesn't really change anything in the first few bytes.

This will be a problem for NES hacks -- where header changes are fairly common.

So instead of metadata blocks having an offset of 0, they should be able to have any offset -- that way the patch generator can choose to mangle a portion of the file that the patch is already overwriting.

Yeah there's no guarantee there will be a block big enough -- but the mangling can at least be minimized.

STARWIN

  • Sr. Member
  • ****
  • Posts: 450
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #23 on: April 01, 2016, 04:53:09 pm »
Huh.. one point for Gemini I suppose, he already covered this.

edit: one issue for the EOF way when respecting the truncation as snarfblam suggested is that if patch A extends the ROM and patch B is a compatible patch with original/patch A version, doing A->B will truncate the file, which results in broken mess.

i have no idea what purpose the truncation is supposed to serve..
« Last Edit: April 01, 2016, 05:18:37 pm by STARWIN »

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #24 on: April 02, 2016, 03:11:54 am »
Yeah @ STARWIN.  That kind of mangling can't be avoided (or corrected!) with the EOF approach.  I'm pretty much sold on the "put it in a block" approach.

I started working on the interface for a lib that hopefully shouldn't be TOO difficult to drop into other patchers and/or soft-patching emulators.  And in doing so, I've wanted to change around/simplify a few things.

I'll try to write up a new/revised spec tomorrow with the changed ideas -- and details more fleshed out... but I might not get to it until Sunday.

Bregalad

  • Hero Member
  • *****
  • Posts: 2737
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #25 on: April 02, 2016, 08:19:34 am »
The problem if the patcher checks the CRC of the whole ROM before applying patches, is that it will not be possible to apply 2 patches one on the top of the other. However, doing that is often desirable, for instance a translation and two bugfixes patches you'll want to be applied simultaneously.

My opinion is that only the CRC of the patched blocks should be checked, instead of the whole ROM. That way, it becomes simple to check whether two patches are "compatible" or not.

In some cases, the blocks could overlap while still being somewhat compatible, it has happened to me that using PATCH1 then PATCH2 does not work, but that using PATCH2 and then PATCH1 does work. I do not remember the details unfortunately (*). But in those cases, it'd be nice to have just a warning with the location of the potentially conflicting data, but still allows the patching to be done. As opposed to have an error and have the patched refusing to apply it.

In all cases, any sane person will keep a backup of the clean ROM, so even if patching doesn't work, that's not the end of the world, you just throw the failed ROM away and there we go.

(*) As far as I can remember I was trying to patch Chrono Trigger with a french translation, while simulaneously apply the "Level-0" patch. It took me a lot of tries, but eventually I got something that was somewhat working. Apparently the Level-0 patch moves text arround, so applying that patch first and then applying the french translation did NOT work. However, applying first the french translation first, and then Level-0 worked - some text was in english again, (due to the Level-0 patch moving it arround), but the game as a whole still worked.

Now we can all agree that the "Level-0" patch is somewhat defective to do things this way, however, it just happens to do things that way and we cannot change that.

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #26 on: April 02, 2016, 10:33:16 am »
My opinion is that only the CRC of the patched blocks should be checked, instead of the whole ROM. That way, it becomes simple to check whether two patches are "compatible" or not.

Starwin suggested this a while back -- and I agree.  I'm leaning towards replacing the full-file CRC with an "only specific bytes" CRC.

Quote
But in those cases, it'd be nice to have just a warning with the location of the potentially conflicting data, but still allows the patching to be done. As opposed to have an error and have the patched refusing to apply it.

I agree.  The patcher I'm writing will never refuse, but will give a prompt when CRC or whatever doesn't match.

Also no patcher should modify the original file.  That's bonkers.  I realize many of them do -- and I still say it's bonkers.

My patcher will prompt for 3 files:  original/patch/modified.  It'll overwrite only if you provide the same filename for original & modified.

April 02, 2016, 03:54:31 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
So this is what I'm thinking for the final format spec.

I welcome a proofread and/or criticism



==========================================
Original IPS File Format:
=======================================
    5-byte      Header          'PATCH' (50 41 54 43 48)

    Data Block -- repeat this block until Footer is found
    {
        3-byte      Offset
        2-byte      Length
       
        if(Length == 0)
        {
            2-byte      RLE Length
            1-byte      RLE Value   (to be repeated 'RLE Length' times)
        }
        if(Length != 0)
        {
            N bytes     Data  (N = 'Length' bytes long)
        }
    }

    3-byte      Footer          'EOF' (45 4F 46)

    3-byte [optional]   Truncation value.  If present, file size should be reduced to this many bytes
                after patching

    -------------------

    Notes:
    - All values are stored unsigned big endian (high byte first)
    - It is impossible to have a Data Block with offset 0x454F46, since that would be the 'EOF' footer

==========================================




==========================================
Extended IPS Proposal:
==========================================
    5-byte      Header          'PATCH' (50 41 54 43 48)
   
    MetaData Block -- repeat this block until a normal Data block found
    {
        3-byte      Faux-offset (must not be 0x454F46 ('EOF'))
        2-byte      Length      (must be > 5)
       
        N bytes     Meta Data   -- must start with 'IPS2_'  (49 50 53 32 5F)
                                   See below for possible meta data blocks
    }
   
    Data Block -- repeat this block until Footer is found
    {
        (same as original format)
    }
   
    3-byte      Footer          'EOF' (45 4F 46)
   
    3-byte [optional]  Truncation value.  Same as original
   
    -------------------
   
    Notes:
    - MetaData Blocks should not make modifications to the target file.  The patcher should just extract information
        from the metadata to use.  However, legacy patchers will treat Metadata blocks as normal data blocks, and therefore
        it is likely metadata will be patched into the file regardless, causing the target file to be "mangled".
       
        To correct this, new patches should give a faux-offset for the metadata which will put the mangled data in a section
        of the file that will be overwritten by a future normal Data block.
     
    - Once the first normal Data block is found -- all future blocks should be treated as normal Data blocks.  All Metadata
        blocks must be placed before the first normal Data block.
       
    - In the rare instance that the first/only Data block's data begins with 'IPS2_', to prevent that block from being treated
        as Metadata, patch generators should either put a different block before it, or they should split the block into two
        parts to break up the header.  Ex:  One block for 'IP' and another block for 'S2_...'.
       
    - The truncation value exists for backwards compatability.  When creating patches, new patchers should include it *when
        necessary* to make the patch compatible with legacy patchers.  When applying patches, new patches should use this value
        only when the patch is a legacy patch (contains no MetaData blocks).  If the patch contains at least 1 MetaData block --
        regardless of what that block is, the truncation value should be ignored, and the Trnc/TRNC Metadata should be used
        instead (see below)
       
       
    -------------------
   
    MetaData blocks:
   
    Note:  Unless otherwise indicated -- only ONE of each type of block should exist.  If multiple are found, patchers should use
        the last one they find (later ones override previous ones).
       
    Note:  Future expansion via new blocks is possible.  Just remember that every block name must start with 'IPS2_'.
       
   
   
    - IPS2_Crc_####   ('####' is 4-byte big, endian, unsigned CRC value)
       
        This block provides a CRC so the patcher can verify the patch is being applied to an appropriate file.  Note that
        the CRC is meant to be more of a warning than a restriction -- and due to the multi-patch layering functionality
        of IPS, it may be common for users to want to apply patches even if the CRC doesn't match.
       
        Note that this is **NOT** a CRC of the entire file.  It is a CRC of the data to be overwritten by the patch (again
        to allow for patch layering)
       
        Rules for calculating the CRC:
       
        * Standard CRC-32 calculation (polynomial = 0x04C11DB7)
        * Iterate through all normal Data blocks in order, and hash the area they'll be overwriting
        * Hash only bytes that exist in the target file.  If the Data block exceeds the target file size (Ex: if the patch
            is expanding the file), stop hashing at EOF.
        * DO NOT hash MetaData faux-offsets.  Only normal Data blocks.
        * Note that it is possible for some bytes to be hashed multiple times, if multiple data blocks overlap. That is OK.
       
       
       
    - IPS2_SrcSize_##...  ('##...' is an N-byte, big endian, unsigned file size)
   
        This block specifies the expected file size of the target file.
        The file size can be a variable number of bytes, but MUST be between 1 and 8... though it usually will not exceed 3.
       
        Like IPS2_Crc_, this block is more of a warning than a rule, and should not prevent users from patching.
       
        Patchers can use this data to adjust for common minor differences in files.  For example, a patch for a SNES ROM
        may have been made for a ROM with no header, and the user is trying to apply it to a patch with a header.  The file
        size should make such a mistake easy to spot, and combined with a CRC the patcher should be able to automatically
        adjust and apply the patch as intended.
       
       
    - IPS2_Trnc_##...  ('##...' is an N-byte (between 1-8), big endian, unsigned file size)
    - IPS2_TRNC_##...
   
        These blocks should be treated as the same kind of block, and a patch file should only have 1 or the other, and never
        both.  If multiple are found, use whatever is found last.
       
        The given file size here is used to truncate the file.  This supersedes the legacy truncation value that was tacked on
        after EOF.
       
        'Trnc' will truncate the file **only if** the file size matched the given 'SrcSize'.
        'TRNC' will truncate the file unconditionally
       
        Notes:
        * 'Trnc' should be preferred since it is friendlier for multiple-patch layering.
        * 'Trnc' must appear in the patch *AFTER* a 'SrcSize' block.  If there was no SrcSize block before the Trnc block,
            the patcher should completely ignore the Trnc block.
        * 'SrcSize' block is not required for 'TRNC'
        * Truncation must only reduce the file size.  This block cannot be used to extend the file.  When applying a patch,
            if the target file size is less than or equal to the truncation value, do nothing.
           
           
           
    - IPS2_SrcName_nnn...  ('nnn...' is an N-byte UTF-8 encoded string... NOT null terminated)
   
        This block merely provides a name for the desired target file.  When applying a patch, a patcher may present this
        information to the user so they know what file they need to provide.
       
        Example:  "Super Mario World (no header)"
« Last Edit: April 02, 2016, 03:54:31 pm by Disch »

STARWIN

  • Sr. Member
  • ****
  • Posts: 450
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #27 on: April 03, 2016, 08:01:42 am »
I read most of it. It is so boring that it probably is consistent.

If you want to avoid checking for that rare instance IPS2_, you could instead define the metadata blocks to be those blocks that have the highest amount of the same offset (typically the only duplicate I assume). edit3: well, won't work if there is just one short block.. how about just having the metadata at the start of the block sequence and having a metadata end block hmm.. something along those lines (still being the most repetitive block)

If you want to save a few bytes (+ more chance to fit inside changed areas + less important reasons), you could name the metadata blocks with a single letter/byte value, and leave one value for extension purposes.

edit: as far as optimizing the bytes overall if the above reason is of interest, SrcName content is the biggest offender, so it could be split into many blocks with a certain max block size. although this is something that can be checked during patch creation if that matters..

edit2: if you are being as careful as you seem to in the spec, it should be stressed that the main theoretical flaw/risk is that it is not clear if there is an order in which the block patches have to be applied in "the original ips spec".
« Last Edit: April 03, 2016, 09:23:58 am by STARWIN »

snarfblam

  • Submission Reviewer
  • Hero Member
  • *****
  • Posts: 593
  • CANT HACK METROID
    • View Profile
    • snarfblam
Re: Proposal: IPS Revision for CRC checking
« Reply #28 on: April 03, 2016, 10:08:57 am »
Looks pretty good to me. One or two nitpicks:

Quote from: Original
        To correct this, new patches should give a faux-offset for the metadata which will put the mangled data in a section
        of the file that will be overwritten by a future normal Data block.

Perhaps:

Quote from: Suggestion
        To correct this, new patches must use a faux-offset for the metadata that will place the mangled data in a section
        of the file that will be completely overwritten by the following normal Data blocks.

And although it is implied by the above, for clarity it might be best to add:

Quote from: Suggestion
        Faux-offsets for metadata blocks must place the mangled data within the bounds of the resulting file.

tvtoon

  • Sr. Member
  • ****
  • Posts: 366
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #29 on: April 03, 2016, 12:28:34 pm »
As the guy who wrote a patcher that doesn't care about "EOF", I disagree with the idea. :D
I also don't understand all this silly thinking about checking a file consistency inside itself. It is logical fail, to state the least. ;)

Disch

  • Hero Member
  • *****
  • Posts: 2814
  • NES Junkie
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #30 on: April 03, 2016, 02:10:11 pm »
Quote
As the guy who wrote a patcher that doesn't care about "EOF", I disagree with the idea. :D

You don't conform to file specs?  For shame!   :P


I also don't understand all this silly thinking about checking a file consistency inside itself. It is logical fail, to state the least. ;)

The CRC isn't about checking IPS file consistency - it's about checking to make sure you're applying the patch to the correct file.

Patching the wrong file is a common problem, especially among beginners... and especially in the SNES world where sometimes you have a header and sometimes you don't.

tvtoon

  • Sr. Member
  • ****
  • Posts: 366
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #31 on: April 03, 2016, 02:46:08 pm »
I wasn't pointing your proposal but how things are implemented in some file formats (also patches). ;)

Zoinkity

  • Hero Member
  • *****
  • Posts: 564
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #32 on: April 04, 2016, 01:27:11 pm »
As a few posters have already mentioned, there's already patchers implementing some extensions following EOF, none of which test for a mark but assume the data is what they expect it to be if it follows EOF.

So, do you know how all of these existing patchers implemented that already?
For example, do they look at 3-4 bytes following the EOF mark, or are they going to the end of the file, reading 3-4 bytes back, and if not EOF treating as their data?  The implementation is very important in this case if you want yours to be ignored properly by existing tools, and hopefully it's implemented the same way across all that handle the resize extension.
(I want to say there's other extensions too, but that's complicating the point)

That said, I'm in the camp that extending IPS is not the best idea in the world.  CRC tests make sense for delta patching formats to ensure each command operates as expected.  CRC tests in something like IPS would only exist to ensure you're patching against the correct source, but this is a problem that can already be solved with a readme or nfo.
The common complaints against readmes would be people don't read them, and they're often excluded from redistributions for no good reason.  Containing that data in the patch simplifies enforcing it, versus, say, distributing an xml providing additional instructions/data a patcher could use for this and other extensions.
It may seem like IPS is such an old format people have become entrenched in their position on it, but the situation is somewhat reversed.  The real issue is that it's been around so long that it isn't just difficult to change, but changes could affect hundreds of different patchers that have been built and distributed over the years.  This wouldn't be obvious to users of the patch without the inclusion of a readme noting it deviates from the original IPS format, but again, you're in the same situation as before.

FAST6191

  • Hero Member
  • *****
  • Posts: 2848
    • View Profile
Re: Proposal: IPS Revision for CRC checking
« Reply #33 on: December 28, 2019, 10:55:25 am »
Raising the dead.
The other day I was wandering along and had another idea that would possibly solve the legacy patcher issue, or unknowns as far as respecting EOF.

As NOP, much less two, is seemingly a dubious prospect would it be possible to include some kind of CRC within the patch as patch commands but have the patch itself "correct" any issues that might be caused by it after the fact?
Patch location 10h with A
Patch location 10h with 9
Patch location 10h with 4
Patch location 10h with F
...
Patch location 10h with [whatever the original value was]
Oh look CRC (or SHA256 if you really wanted) is A94F...

CRCs are fixed length so you should not need a end of CRC section indicator, though one could surely be done (repeat restoration patch or some pattern some larger number of times). Put this at a fixed/knowable location in the IPS (before, after, in the middle of the file... it matters little even if I probably would pick the start).

Even if it ends up being binary you could dash out enough and then the patcher being aware of the original files restores it to its "proper value" for those legacy patchers. Only real negative beyond supplementary/optional patching would be it reduces patch size carrying capability but most things using IPS don't brush up against its limit and it is easily made optional. While post EOF stuff is tricky I don't imagine any IPS patcher having some kind of memory as far as what it has already patched and instead just repeats/iterates along the commands until EOF (be it literal end of file or EOF end of patch indicator) and throwing a hissy fit as a result. IPS optimisers and those "check for collisions" programs might have a hard time, but I think the former is one proof of concept program I doubt I could find today and the latter... even if thought them useful they are so few in number we could probably just update them to be aware of this.