Romhacking.net

Romhacking => ROM Hacking Discussion => Topic started by: Disch on March 29, 2016, 12:12:02 pm

Title: Proposal: IPS Revision for CRC checking
Post by: Disch on March 29, 2016, 12:12:02 pm
We all know IPS sucks.  But it's kind of like a "it's been here so long it'll never go away" situation.

So rather than trying to get everyone to switch to a new format, why don't we fix arguably the biggest problem with IPS?  That is... the inability to tell whether or not the patch is being applied to the proper ROM?


NOTE:   I'VE REVISED THIS FORMAT -- IGNORE THE FORMAT IN THIS POST AND LOOK AT THE ONE HERE:

http://www.romhacking.net/forum/index.php/topic,21580.msg302018.html#msg302018



current format:
Code: [Select]
5 byte header =     'PATCH'

{
    3 bytes =       offset
    2 bytes =       length of block
    N bytes =       block data
}  <- repeat block X times

3 byte footer =     'EOF'


My proposal:
Code: [Select]
5 byte header =     'PATCH'

{
    3 bytes =       0
    2 bytes =       length of block
    N bytes =       meta info block  (see below)
}  <- repeat block Y times

{
    3 bytes =       offset
    2 bytes =       length of block
    N bytes =       block data
}  <- repeat block X times

3 byte footer =     'EOF'



;;;;;;;;;;;;;;;;;;;;;;;;;;
meta info blocks:

'PATCH_CRC4_****'       = **** is the 4-byte CRC32 value of the original target file (binary, big endian)
'PATCH_SIZE_****'       = **** is the size of the original target file  (binary, big endian)
'PATCH_NAME_****'       = **** is a non-null terminated UTF-8 string indicating a casual name for the target file.  IE:  "Legend of Zelda" ... or "Super Metroid (no header)"
'PATCH_FILE_****'       = **** is a non-null terminated UTF-8 string indicating a common 'good-list' file name.  IE:  "Legend of Zelda, The (U) (PRG1).nes"

   ...  possible future ones if needed, but above should cover most cases well enough

For CRC & SIZE fields, each meta block indicates a 'valid' file.  So if there are multiple CRC blocks, a ROM with any of the given CRCs is considered "good".  Ditto for SIZE.  If both CRC and SIZE blocks are provided, ROM must match at least 1 CRC and at least 1 SIZE for it to be "good".

NAME and FILE fields are more suggestions than requirements, and are more for a GUI patcher to relay information to the user so they're not asking themselves "wtf am I supposed to do with this patch?"

Note:  SIZE field can be used by a smart patcher to identify headered/headerless SNES ROMs and adjust patch offsets appropriately so patch makers/users don't have to worry so much about that.


------------

The big pro to this approach is that it is completely backwards compatible for all files greater than 16 or so bytes.  Since it's backwards compatible, all existing patchers and soft-patching emus will still work, and old IPS patches will not need to be updated

The big con is that it'll need some redundant data.  Since the 'PATCH_XXX' bit will overwrite the first few bytes of the file, a following block will need to restore it.

I won't have time to actually work on a patcher until Wednesday -- but it should be pretty easy to implement.


Thoughts?  Feedback?



EDIT:  added 'NAME'/'FILE' bits.  Elaborated a bit on CRC and SIZE
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Gemini on March 29, 2016, 01:17:30 pm
If a patcher does effectively use EOF as an end marker (thus ignoring whatever comes after it), you can safely append the same metadata let whatever new patcher detect it and use it for additional checks. Older patches will ignore it and there should be no overwrite issues either at the beginning of a newly patched rom.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on March 29, 2016, 01:50:19 pm
Backwards compability may be undesirable.

old ips + old patcher = nothing changes, but people may think it checks if the file is good because this patcher does
new ips + old patcher = nothing changes, but people may think it checks if the file is good because this patcher does
old ips + new patcher = people may think it checks if the file is good because this patcher does, even when it actually does not
new ips + new patcher = people need to get a new patcher to get a new feature

You can get around half of it by making the patcher tell if it checked if the file is ok, but it won't cover old patchers / emulators.

maybe people use ips just because the patches are in ips. not because they are incapable of trying a new format or patcher if the instructions said so. in that case there is no need to call the new patch .ips.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: FAST6191 on March 29, 2016, 02:13:28 pm
If a patcher does effectively use EOF as an end marker (thus ignoring whatever comes after it), you can safely append the same metadata let whatever new patcher detect it and use it for additional checks.

That may be a bigger if than you realise. It is a bit better now people have stopped using command line and visual basic but there are still enough awful legacy patchers and emulators with IPS support out there, done wrong it could be seeing us all begging to go back to today when we had the retron5 and android emulators as the main annoyances.
That said most of the issues I saw there were from patches that omitted the EOF rather than things not respecting it, not that anybody tried to append things to them.

What may be better in some regards would be to consider an anime/certain other warez scenes style sticking a crc in square brackets approach and having a suggested patcher scan for that. Might get a bit ugly if you have optional patches that work with each or or hardcoded cheats or something but as long as there is a "do it anyway" option then I guess it would work.
Even without the renaming issue I wonder is there a way to effectively make a dummy/NOP/do nothing command in IPS that even the legacy patchers will respect/ignore/not change things for?
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on March 29, 2016, 06:34:08 pm
If a patcher does effectively use EOF as an end marker (thus ignoring whatever comes after it)

That's a risk I'm not willing to take.  It'd be very easy to imagine a patcher which doesn't stop until the actual eof.

Backwards compability may be undesirable.

If we're not going to preserve backwards compatability, there's no point in sticking to IPS.  Any number of other patching formats already do this -- the problem is they're not as widely used because IPS is the de-facto standard (at least in NES/SNES/etc era ROM hacks)

Quote
old ips + old patcher = nothing changes, but people may think it checks if the file is good because this patcher does
new ips + old patcher = nothing changes, but people may think it checks if the file is good because this patcher does

Nothing can be done about old patchers.  But we can create new patchers to phase old ones out.  This is all about introducing a gradual change to make IPS better.  It's not about making an abrupt change to instantly solve all the problems.  If we want an abrupt change we'd just stop using IPS entirely.

Quote
old ips + new patcher = people may think it checks if the file is good because this patcher does, even when it actually does not
New patchers can give a notice or warning if they're applying an 'old' patch.  Not really a big deal.

Quote
new ips + new patcher = people need to get a new patcher to get a new feature

New IPS's will work on an old patcher -- that's the whole point with this being backwards compatible.  So there's no NEED to update... you don't gain or lose anything if you stick with an old patcher.  But if you update, you gain extra checking.

Quote
maybe people use ips just because the patches are in ips. not because they are incapable of trying a new format or patcher if the instructions said so. in that case there is no need to call the new patch .ips.

There have been superior alternatives to IPS for years.  IPS remains relevant.  It's harder than you think to replace a de-facto standard.

The NES scene has a similar problem with the iNES (.nes) format -- they experimented with a much more descriptive UNIF format but it never caught on because everybody had iNES ROMs.  They're now kicking around the idea of iNES 2.0 which is fully backwards compatible with the current iNES format, but is much more descriptive.  The problem is people aren't really "creating new" ROMs so 2.0 isn't getting much use.


What may be better in some regards would be to consider an anime/certain other warez scenes style sticking a crc in square brackets approach and having a suggested patcher scan for that.

I'm not a fan of storing meta data in filenames.  I suppose it could work, but it'd be harder to work into a patcher (especially a soft patching emulator where the filename is predetermined).

Also I'm not sure I see the benefit.  Is there any benefit of doing this over doing what I proposed above?

Quote
Even without the renaming issue I wonder is there a way to effectively make a dummy/NOP/do nothing command in IPS that even the legacy patchers will respect/ignore/not change things for?

Only way to MAYBE do a NOP would be to have a block of length zero, but then you can't embed any metadata because your length is zero  =/

Hence my approach of just overwriting data with the metadata -- then overwriting it again to restore it.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: FAST6191 on March 29, 2016, 06:52:46 pm
Yeah I am not a fan of using file names either. I have seen some truly creative ways of screwing things up done in the past (my favourite was deleting the ROMs after they were selected to regain space... from NES ROMs) and file names is often the first thing people change -- "it had all this weird punctuation at the end of the file name".

I considered the zero length commands but the lack of a payload would pose the main problem. Are there multiple types of zero/NOP do you reckon? If so then rig up some fancy decoder that uses the different zero/NOP as a kind of key. Even if there are only two options you could do a very expanded form of binary. About all that would cost is the capacity of the eventual payload and there are not too many 8 or 16 bit era consoles we deal in that hit the 16 meg range anyway.

Personally I am happy enough to see IPS die, about the only benefit to it right now for me is that it is considerably easier to implement in hardware than almost any other type of patcher. The only effort I would make is to make a truly bombproof implementation of a patcher that handles every edge case we can find and cook up, and in the most permissive license we can think of and possibly in enough languages that embedded hardware through to php users can implement it.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on March 29, 2016, 07:03:52 pm
Are there multiple types of zero/NOP do you reckon?

A zero/nop consists of 8 bytes:

3 for the offset (theoretically you might be able to put a payload here)
2 for length (0 to indicate an RLE block)
2 for RLE length (0 to indicate NOP)
1 for RLE byte (theoretically you could have payload here)


With the offset and RLE byte you have 4 bytes... which is enough to hold a CRC and it would effectively be a NOP.  But there are problems with this:

1)  Behavior when the RLE length is zero is undefined.  I would assume patchers would treat this as a NOP since that's most logical, but there's no guarantee

2)  Putting payload in the offset portion is dangerous because a patcher may extend the file size to reach the offset before even realizing the block is a NOP -- this could result in an old patcher creating inflated files.

3)  'EOF' is a reserved value and can't be used in the offset -- and there's no way to ensure that value never appears as the CRC

4)  This is not very extensible anyway.  4 bytes of payload is very little, and the only way to expand this approach would be to string multiple of these NOP blocks together.

Quote
Personally I am happy enough to see IPS die,

Same.  But it should have died a decade ago and it still hasn't.  Might as well do what we can to improve it.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: snarfblam on March 29, 2016, 07:31:20 pm
There is already an extension to the IPS format (http://www.smwiki.net/wiki/IPS_file_format) that appends data after EOF. No reason you can't further extend it. It's a simple 3-byte file length specifier to allow a patch to truncate the ROM. Just to be super-duper-extra-mega-safe, you should include a file truncation entry that specifies the correct file size. That way, a patcher checking for that info won't be thrown off when it encounters something else following EOF. It should be relatively safe to include additional metadata following the truncate entry, giving you backwards compatible patches.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on March 29, 2016, 07:45:44 pm
There is already an extension to the IPS format (http://www.smwiki.net/wiki/IPS_file_format) that appends data after EOF. No reason you can't further extend it.

Hrm... that makes me feel a little better about appending data.  If that goes as far back as ancient SNEStool, then it's probably safe.

A downside to appending is you have to parse the entire file to get to the metadata -- and you probably will want the metadata before you care about the actual patch information.

One thing I definitely DON'T want to do is just tack-on a value like that truncation value.  That is not extensible at all... and in a way it kind of shoots this effort in the foot because now if we want to append something at the end of the file we have to include a target file size even if we don't want to.


I'm still leaning towards my original idea of putting all that stuff up front in a kind of block format ... and leaving EOF untouched.

What do you guys think would be better?
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on March 30, 2016, 08:09:21 am
I think your method is better, assuming that all patchers patch in order (edit: ... and assuming they do a while loop over ips content and not something stupid like file offsets). Though my original post stands otherwise.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: snarfblam on March 30, 2016, 01:42:47 pm
I think if you really wanted to get technical you could try to make the assertion that overlapping patches invoke undefined behavior. (Every spec I've looked at only explained file structure.) A patcher that is thrown off by unexpected data after EOF, on the other hand, is ill-behaved.

In reality, any patcher that is remotely worth using will tolerate either condition. Neither approach is any harder to implement, and neither has any significant drawback. (Who cares if you have to traverse the entire file to get the checksum? How many IPS patches have you run into that can't be applied virtually instantly?)

It's really a matter of personal preference, and since you're taking the initiative, I'd say it's really your call, Disch.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: AWJ on March 31, 2016, 04:23:14 am
I don't understand what you think is being gained by extending IPS instead of using one of the existing patch formats that does support verifying the base file (e.g. BPS).

No matter what format you use, you need new patches that contain the checksum information. Since the patches need to be updated one way or the other, you might as well just update them to BPS.

Likewise, patchers and emulators need to be updated to support the new checksum extension. Since they need to be updated anyway, you might as well update them by adding BPS support instead of yet another new format.

If anything, this idea is the worst of both worlds, because the users who are still stubbornly using ZSNES 1.42 in 2016 will be able to use these extended-IPS patches without seeing any of the actual benefits of the extension.

ETA: Posted before reading the entire thread. tl;dr: what STARWIN said. Also, AFAIK the problems with UNIF and the reason it failed went well beyond "not being iNES".
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Zynk on March 31, 2016, 04:43:00 am
What if you want that patch to work with multiple hacks? What should you use?
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on March 31, 2016, 08:08:39 am
What if you want that patch to work with multiple hacks? What should you use?

It is the reason why I like IPS personally. That it doesn't ensure file identity. This property could be improved too but.. I don't know if this is the discussion for that.

Overall we got ips because the patches are in ips and RHDN doesn't want to alter the patches, so it doesn't seem that anything is going to change anytime soon.

edit: I changed my mind, let's post it: a nice way would be to count the checksum over the old bytes over the patched offsets. This way the patch applies only to files where those particular locations were unchanged, which in various cases rules out bad alignment (header vs no header) and obviously uncompatible other hacks (though some risk remains). Would still fail to apply in some desirable situations where ips would work.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on March 31, 2016, 11:35:25 am
Likewise, patchers and emulators need to be updated to support the new checksum extension. Since they need to be updated anyway, you might as well update them by adding BPS support instead of yet another new format.

That's the thing though, this isn't yet another format.  It's still IPS.  People who are too stubborn to update will still be able to use apply these new patches.

The whole point is that updating will grant you extra functionality -- but if you don't update you don't gain or lose anything.  So updating isn't necessary.  Nobody will actually have to download a new patcher.

The problem with newer patch formats like BPS is familiarity and incompatability.  Aside from there being several conflicting formats, everyone knows and is familiar with IPS -- and they have to be because 99% of NES/SNES/etc ROM hacks are in IPS format.  So everyone in ROM hacking needs an IPS patcher.  It's kind of a pain in the arse to download a hack that's in a format you don't have a patcher for, then you have to hunt down a new patching program.  And yeah there are programs that apply multiple formats -- and you might not say it's that big of a deal... but it kind of is.

(Your choice of words in describing "yet another patch format" is sort of a testament that there isn't any real successor to IPS -- there are just multiple competing formats.  That might be another reason why none of them have really caught on)


Ultimately I mostly agree with you guys.  IPS is stupid and nobody should really use it.  But people still do.  And at this point it's unavoidable.  IPS has too big of an existing library, user base, and software support to ignore.  Simply discarding it and using a different format might be the ideal solution -- but it is not practical -- which is why it hasn't happened and likely will not happen at any point in the foreseeable future.

Quote
If anything, this idea is the worst of both worlds, because the users who are still stubbornly using ZSNES 1.42 in 2016 will be able to use these extended-IPS patches without seeing any of the actual benefits of the extension.

That's not the worst of any world -- that's the entire point.

Upgrade to deluxe package?  Get extra bells and whistles!
Don't want to upgrade?  Don't worry, you don't have to, it'll still work.

IMO all software packages should take that approach instead of trying to cram updates down users throats.  Backwards compatability is a wonderful thing.

Quote
edit: I changed my mind, let's post it: a nice way would be to count the checksum over the old bytes over the patched offsets. This way the patch applies only to files where those particular locations were unchanged, which in various cases rules out bad alignment (header vs no header) and obviously uncompatible other hacks (though some risk remains). Would still fail to apply in some desirable situations where ips would work.

That's a FANTASTIC idea!  =)

Full-file CRC does block that.  Changed-byte CRC would be much more flexible.

This would mean you'd have to parse the entire IPS before calculating the CRC... so it might make sense to put this on the end of the file instead of the front.

Putting it at the front has a problem I didn't think of before... it hinders patch-layering for patches which change the first couple of bytes (which may be common in NES hacks).

Maybe they could be put at the front... but instead of the offset always being 0, it could be arbitrary -- that way you could put metadata blocks in a region of the file that the patch will be overwriting anyway.  I think I like that better.

Quote from: STARWIN
It is the reason why I like IPS personally. That it doesn't ensure file identity. This property could be improved too but.. I don't know if this is the discussion for that.

This is absolutely the discussion for that.  I want to iron out all these details before I write up a patcher.  =)
Title: Re: Proposal: IPS Revision for CRC checking
Post by: FAST6191 on March 31, 2016, 11:58:42 am
Changed CRC would be nice but we do still see nested patches change other hacks, and indeed have encouraged others to nest things rather than use someone else's work. That said if it is the header (or possibly interleaved vs not) issue you are dodging then it could work.

Things other than IPS caught on where IPS falls short (those things with file systems and greater than 16 meg size ROMs/ISOs/whatever).

Anyway other than the reference + edge case thing I mentioned before I think I am in the "let them suffer" camp with regards to people persisting in using IPS.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on March 31, 2016, 12:51:07 pm
So I'm seeing a lot of pushback with people saying "don't bother, this wouldn't be worth it"

Which is a legitimate point.  So I added a poll to this thread so see how many people actually think this idea is worth pursuing.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: snarfblam on March 31, 2016, 05:56:09 pm
I think it would totally be worth it. It doesn't make the current patch format mess any more complicated. Worst case scenario, it doesn't catch on and we end up with a handful of patches that are slightly bigger than they need to be, best case scenario it supersedes LIPS and the world is a better place. Regardless of the final details, I'd definitely incorporate the functionality into my own IPS code.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: tryphon on March 31, 2016, 06:23:58 pm
^this
Title: Re: Proposal: IPS Revision for CRC checking
Post by: FAST6191 on March 31, 2016, 07:13:55 pm
Though I am not inclined to put in much effort beyond a discussion like this for the would be project I have no objections to a legacy friendly extension to the IPS format, and would applaud the effort at least. To what extent people would take it up I do not know, though I suspect not many for the same reasons we are all still suffering IPS to this day (technically superior formats to IPS having been around for how long now?).

Also relevant XKCD because why not
https://xkcd.com/927/
If this was the next next successor to IPS then I am sure would would all be pointing at the corpses of UPS, Ninja, BPS, DPS, fireflower and whatever else there is out there, the legacy friendly thing being about the only thing to prevent everybody from saying that (though I guess you knew that going in).
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Spinner 8 on April 01, 2016, 10:26:20 am
It is the reason why I like IPS personally. That it doesn't ensure file identity.

Same. As long as there's a prompt, or a (disabled by default) selectable option to continue anyways, then I'm fine. The two BPS patchers I've tried don't have anything like that - they just refuse to work.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on April 01, 2016, 03:16:01 pm
Putting it at the front has a problem I didn't think of before... it hinders patch-layering for patches which change the first couple of bytes (which may be common in NES hacks).

What does this mean?

Maybe they could be put at the front... but instead of the offset always being 0, it could be arbitrary -- that way you could put metadata blocks in a region of the file that the patch will be overwriting anyway.  I think I like that better.

Why don't you want to rewrite old data? Also, you cannot guarantee that the ips will have a sequence long enough for that anyway.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on April 01, 2016, 04:23:09 pm
What does this mean?
Why don't you want to rewrite old data? Also, you cannot guarantee that the ips will have a sequence long enough for that anyway.

If metadata blocks are at offset 0, they'll overwrite/mangle the first few bytes of the file.  To counteract that, the IPS will have to include the first few bytes of the original file in order to "fix" the mangling caused by the metadata blocks.

This can be a problem for patch layering.  If patch A changes the first few bytes, and you want to layer patch B over patch A ... patch B will overwrite A's changes... even if B doesn't really change anything in the first few bytes.

This will be a problem for NES hacks -- where header changes are fairly common.

So instead of metadata blocks having an offset of 0, they should be able to have any offset -- that way the patch generator can choose to mangle a portion of the file that the patch is already overwriting.

Yeah there's no guarantee there will be a block big enough -- but the mangling can at least be minimized.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on April 01, 2016, 04:53:09 pm
Huh.. one point for Gemini I suppose, he already covered this.

edit: one issue for the EOF way when respecting the truncation as snarfblam suggested is that if patch A extends the ROM and patch B is a compatible patch with original/patch A version, doing A->B will truncate the file, which results in broken mess.

i have no idea what purpose the truncation is supposed to serve..
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on April 02, 2016, 03:11:54 am
Yeah @ STARWIN.  That kind of mangling can't be avoided (or corrected!) with the EOF approach.  I'm pretty much sold on the "put it in a block" approach.

I started working on the interface for a lib that hopefully shouldn't be TOO difficult to drop into other patchers and/or soft-patching emulators.  And in doing so, I've wanted to change around/simplify a few things.

I'll try to write up a new/revised spec tomorrow with the changed ideas -- and details more fleshed out... but I might not get to it until Sunday.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Bregalad on April 02, 2016, 08:19:34 am
The problem if the patcher checks the CRC of the whole ROM before applying patches, is that it will not be possible to apply 2 patches one on the top of the other. However, doing that is often desirable, for instance a translation and two bugfixes patches you'll want to be applied simultaneously.

My opinion is that only the CRC of the patched blocks should be checked, instead of the whole ROM. That way, it becomes simple to check whether two patches are "compatible" or not.

In some cases, the blocks could overlap while still being somewhat compatible, it has happened to me that using PATCH1 then PATCH2 does not work, but that using PATCH2 and then PATCH1 does work. I do not remember the details unfortunately (*). But in those cases, it'd be nice to have just a warning with the location of the potentially conflicting data, but still allows the patching to be done. As opposed to have an error and have the patched refusing to apply it.

In all cases, any sane person will keep a backup of the clean ROM, so even if patching doesn't work, that's not the end of the world, you just throw the failed ROM away and there we go.

(*) As far as I can remember I was trying to patch Chrono Trigger with a french translation, while simulaneously apply the "Level-0" patch. It took me a lot of tries, but eventually I got something that was somewhat working. Apparently the Level-0 patch moves text arround, so applying that patch first and then applying the french translation did NOT work. However, applying first the french translation first, and then Level-0 worked - some text was in english again, (due to the Level-0 patch moving it arround), but the game as a whole still worked.

Now we can all agree that the "Level-0" patch is somewhat defective to do things this way, however, it just happens to do things that way and we cannot change that.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on April 02, 2016, 10:33:16 am
My opinion is that only the CRC of the patched blocks should be checked, instead of the whole ROM. That way, it becomes simple to check whether two patches are "compatible" or not.

Starwin suggested this a while back -- and I agree.  I'm leaning towards replacing the full-file CRC with an "only specific bytes" CRC.

Quote
But in those cases, it'd be nice to have just a warning with the location of the potentially conflicting data, but still allows the patching to be done. As opposed to have an error and have the patched refusing to apply it.

I agree.  The patcher I'm writing will never refuse, but will give a prompt when CRC or whatever doesn't match.

Also no patcher should modify the original file.  That's bonkers.  I realize many of them do -- and I still say it's bonkers.

My patcher will prompt for 3 files:  original/patch/modified.  It'll overwrite only if you provide the same filename for original & modified.

April 02, 2016, 03:54:31 pm - (Auto Merged - Double Posts are not allowed before 7 days.)
So this is what I'm thinking for the final format spec.

I welcome a proofread and/or criticism



==========================================
Original IPS File Format:
=======================================
    5-byte      Header          'PATCH' (50 41 54 43 48)

    Data Block -- repeat this block until Footer is found
    {
        3-byte      Offset
        2-byte      Length
       
        if(Length == 0)
        {
            2-byte      RLE Length
            1-byte      RLE Value   (to be repeated 'RLE Length' times)
        }
        if(Length != 0)
        {
            N bytes     Data  (N = 'Length' bytes long)
        }
    }

    3-byte      Footer          'EOF' (45 4F 46)

    3-byte [optional]   Truncation value.  If present, file size should be reduced to this many bytes
                after patching

    -------------------

    Notes:
    - All values are stored unsigned big endian (high byte first)
    - It is impossible to have a Data Block with offset 0x454F46, since that would be the 'EOF' footer

==========================================




==========================================
Extended IPS Proposal:
==========================================
    5-byte      Header          'PATCH' (50 41 54 43 48)
   
    MetaData Block -- repeat this block until a normal Data block found
    {
        3-byte      Faux-offset (must not be 0x454F46 ('EOF'))
        2-byte      Length      (must be > 5)
       
        N bytes     Meta Data   -- must start with 'IPS2_'  (49 50 53 32 5F)
                                   See below for possible meta data blocks
    }
   
    Data Block -- repeat this block until Footer is found
    {
        (same as original format)
    }
   
    3-byte      Footer          'EOF' (45 4F 46)
   
    3-byte [optional]  Truncation value.  Same as original
   
    -------------------
   
    Notes:
    - MetaData Blocks should not make modifications to the target file.  The patcher should just extract information
        from the metadata to use.  However, legacy patchers will treat Metadata blocks as normal data blocks, and therefore
        it is likely metadata will be patched into the file regardless, causing the target file to be "mangled".
       
        To correct this, new patches should give a faux-offset for the metadata which will put the mangled data in a section
        of the file that will be overwritten by a future normal Data block.
     
    - Once the first normal Data block is found -- all future blocks should be treated as normal Data blocks.  All Metadata
        blocks must be placed before the first normal Data block.
       
    - In the rare instance that the first/only Data block's data begins with 'IPS2_', to prevent that block from being treated
        as Metadata, patch generators should either put a different block before it, or they should split the block into two
        parts to break up the header.  Ex:  One block for 'IP' and another block for 'S2_...'.
       
    - The truncation value exists for backwards compatability.  When creating patches, new patchers should include it *when
        necessary* to make the patch compatible with legacy patchers.  When applying patches, new patches should use this value
        only when the patch is a legacy patch (contains no MetaData blocks).  If the patch contains at least 1 MetaData block --
        regardless of what that block is, the truncation value should be ignored, and the Trnc/TRNC Metadata should be used
        instead (see below)
       
       
    -------------------
   
    MetaData blocks:
   
    Note:  Unless otherwise indicated -- only ONE of each type of block should exist.  If multiple are found, patchers should use
        the last one they find (later ones override previous ones).
       
    Note:  Future expansion via new blocks is possible.  Just remember that every block name must start with 'IPS2_'.
       
   
   
    - IPS2_Crc_####   ('####' is 4-byte big, endian, unsigned CRC value)
       
        This block provides a CRC so the patcher can verify the patch is being applied to an appropriate file.  Note that
        the CRC is meant to be more of a warning than a restriction -- and due to the multi-patch layering functionality
        of IPS, it may be common for users to want to apply patches even if the CRC doesn't match.
       
        Note that this is **NOT** a CRC of the entire file.  It is a CRC of the data to be overwritten by the patch (again
        to allow for patch layering)
       
        Rules for calculating the CRC:
       
        * Standard CRC-32 calculation (polynomial = 0x04C11DB7)
        * Iterate through all normal Data blocks in order, and hash the area they'll be overwriting
        * Hash only bytes that exist in the target file.  If the Data block exceeds the target file size (Ex: if the patch
            is expanding the file), stop hashing at EOF.
        * DO NOT hash MetaData faux-offsets.  Only normal Data blocks.
        * Note that it is possible for some bytes to be hashed multiple times, if multiple data blocks overlap. That is OK.
       
       
       
    - IPS2_SrcSize_##...  ('##...' is an N-byte, big endian, unsigned file size)
   
        This block specifies the expected file size of the target file.
        The file size can be a variable number of bytes, but MUST be between 1 and 8... though it usually will not exceed 3.
       
        Like IPS2_Crc_, this block is more of a warning than a rule, and should not prevent users from patching.
       
        Patchers can use this data to adjust for common minor differences in files.  For example, a patch for a SNES ROM
        may have been made for a ROM with no header, and the user is trying to apply it to a patch with a header.  The file
        size should make such a mistake easy to spot, and combined with a CRC the patcher should be able to automatically
        adjust and apply the patch as intended.
       
       
    - IPS2_Trnc_##...  ('##...' is an N-byte (between 1-8), big endian, unsigned file size)
    - IPS2_TRNC_##...
   
        These blocks should be treated as the same kind of block, and a patch file should only have 1 or the other, and never
        both.  If multiple are found, use whatever is found last.
       
        The given file size here is used to truncate the file.  This supersedes the legacy truncation value that was tacked on
        after EOF.
       
        'Trnc' will truncate the file **only if** the file size matched the given 'SrcSize'.
        'TRNC' will truncate the file unconditionally
       
        Notes:
        * 'Trnc' should be preferred since it is friendlier for multiple-patch layering.
        * 'Trnc' must appear in the patch *AFTER* a 'SrcSize' block.  If there was no SrcSize block before the Trnc block,
            the patcher should completely ignore the Trnc block.
        * 'SrcSize' block is not required for 'TRNC'
        * Truncation must only reduce the file size.  This block cannot be used to extend the file.  When applying a patch,
            if the target file size is less than or equal to the truncation value, do nothing.
           
           
           
    - IPS2_SrcName_nnn...  ('nnn...' is an N-byte UTF-8 encoded string... NOT null terminated)
   
        This block merely provides a name for the desired target file.  When applying a patch, a patcher may present this
        information to the user so they know what file they need to provide.
       
        Example:  "Super Mario World (no header)"
Title: Re: Proposal: IPS Revision for CRC checking
Post by: STARWIN on April 03, 2016, 08:01:42 am
I read most of it. It is so boring that it probably is consistent.

If you want to avoid checking for that rare instance IPS2_, you could instead define the metadata blocks to be those blocks that have the highest amount of the same offset (typically the only duplicate I assume). edit3: well, won't work if there is just one short block.. how about just having the metadata at the start of the block sequence and having a metadata end block hmm.. something along those lines (still being the most repetitive block)

If you want to save a few bytes (+ more chance to fit inside changed areas + less important reasons), you could name the metadata blocks with a single letter/byte value, and leave one value for extension purposes.

edit: as far as optimizing the bytes overall if the above reason is of interest, SrcName content is the biggest offender, so it could be split into many blocks with a certain max block size. although this is something that can be checked during patch creation if that matters..

edit2: if you are being as careful as you seem to in the spec, it should be stressed that the main theoretical flaw/risk is that it is not clear if there is an order in which the block patches have to be applied in "the original ips spec".
Title: Re: Proposal: IPS Revision for CRC checking
Post by: snarfblam on April 03, 2016, 10:08:57 am
Looks pretty good to me. One or two nitpicks:

Quote from: Original
        To correct this, new patches should give a faux-offset for the metadata which will put the mangled data in a section
        of the file that will be overwritten by a future normal Data block.

Perhaps:

Quote from: Suggestion
        To correct this, new patches must use a faux-offset for the metadata that will place the mangled data in a section
        of the file that will be completely overwritten by the following normal Data blocks.

And although it is implied by the above, for clarity it might be best to add:

Quote from: Suggestion
        Faux-offsets for metadata blocks must place the mangled data within the bounds of the resulting file.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: tvtoon on April 03, 2016, 12:28:34 pm
As the guy who wrote a patcher that doesn't care about "EOF", I disagree with the idea. :D
I also don't understand all this silly thinking about checking a file consistency inside itself. It is logical fail, to state the least. ;)
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Disch on April 03, 2016, 02:10:11 pm
Quote
As the guy who wrote a patcher that doesn't care about "EOF", I disagree with the idea. :D

You don't conform to file specs?  For shame!   :P


I also don't understand all this silly thinking about checking a file consistency inside itself. It is logical fail, to state the least. ;)

The CRC isn't about checking IPS file consistency - it's about checking to make sure you're applying the patch to the correct file.

Patching the wrong file is a common problem, especially among beginners... and especially in the SNES world where sometimes you have a header and sometimes you don't.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: tvtoon on April 03, 2016, 02:46:08 pm
I wasn't pointing your proposal but how things are implemented in some file formats (also patches). ;)
Title: Re: Proposal: IPS Revision for CRC checking
Post by: Zoinkity on April 04, 2016, 01:27:11 pm
As a few posters have already mentioned, there's already patchers implementing some extensions following EOF, none of which test for a mark but assume the data is what they expect it to be if it follows EOF.

So, do you know how all of these existing patchers implemented that already?
For example, do they look at 3-4 bytes following the EOF mark, or are they going to the end of the file, reading 3-4 bytes back, and if not EOF treating as their data?  The implementation is very important in this case if you want yours to be ignored properly by existing tools, and hopefully it's implemented the same way across all that handle the resize extension.
(I want to say there's other extensions too, but that's complicating the point)

That said, I'm in the camp that extending IPS is not the best idea in the world.  CRC tests make sense for delta patching formats to ensure each command operates as expected.  CRC tests in something like IPS would only exist to ensure you're patching against the correct source, but this is a problem that can already be solved with a readme or nfo.
The common complaints against readmes would be people don't read them, and they're often excluded from redistributions for no good reason.  Containing that data in the patch simplifies enforcing it, versus, say, distributing an xml providing additional instructions/data a patcher could use for this and other extensions.
It may seem like IPS is such an old format people have become entrenched in their position on it, but the situation is somewhat reversed.  The real issue is that it's been around so long that it isn't just difficult to change, but changes could affect hundreds of different patchers that have been built and distributed over the years.  This wouldn't be obvious to users of the patch without the inclusion of a readme noting it deviates from the original IPS format, but again, you're in the same situation as before.
Title: Re: Proposal: IPS Revision for CRC checking
Post by: FAST6191 on December 28, 2019, 10:55:25 am
Raising the dead.
The other day I was wandering along and had another idea that would possibly solve the legacy patcher issue, or unknowns as far as respecting EOF.

As NOP, much less two, is seemingly a dubious prospect would it be possible to include some kind of CRC within the patch as patch commands but have the patch itself "correct" any issues that might be caused by it after the fact?
Patch location 10h with A
Patch location 10h with 9
Patch location 10h with 4
Patch location 10h with F
...
Patch location 10h with [whatever the original value was]
Oh look CRC (or SHA256 if you really wanted) is A94F...

CRCs are fixed length so you should not need a end of CRC section indicator, though one could surely be done (repeat restoration patch or some pattern some larger number of times). Put this at a fixed/knowable location in the IPS (before, after, in the middle of the file... it matters little even if I probably would pick the start).

Even if it ends up being binary you could dash out enough and then the patcher being aware of the original files restores it to its "proper value" for those legacy patchers. Only real negative beyond supplementary/optional patching would be it reduces patch size carrying capability but most things using IPS don't brush up against its limit and it is easily made optional. While post EOF stuff is tricky I don't imagine any IPS patcher having some kind of memory as far as what it has already patched and instead just repeats/iterates along the commands until EOF (be it literal end of file or EOF end of patch indicator) and throwing a hissy fit as a result. IPS optimisers and those "check for collisions" programs might have a hard time, but I think the former is one proof of concept program I doubt I could find today and the latter... even if thought them useful they are so few in number we could probably just update them to be aware of this.