That sounds like extreme overkill. Calculating a SNES checksum should not take much time for practically anything capable of running SNES games.
Well, the time it takes to calculate the checksum depends on the language. My PowerShell implementation, even when multithreaded, was taking 10-15 seconds.
And yes, in native code it will be really fast even without multiple threads, but there is still a difference between near instantaneous and instantaneous.
PCs today are all multi-core, so why not use the extra cores?
My code is about 700 lines long and about 70% of it is header detection, error handling and debugging code. It was no trouble at all to add multithreading, even with a low-level API such as std::thread.
A 32 Mbit ROM is 4 MB, so the checksum alone takes 4+ million additions. Even assuming each one is just an unoptimized x86 mov and add, there is no reason not to spread the load across multiple cores if the implementation is simple enough.
As for why 8 threads: the size of the ROM is a multiple of 32 KB, so it's also divisible by 8, and there's no need for any extra calculations to get the block size for each thread.
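To illustrate the idea, here is a minimal sketch of the 8-way split with std::thread. It is not my actual code: the function name sum_rom_bytes and the vector-based ROM buffer are made up for the example, and it only does the raw byte sum modulo 2^16, ignoring header detection and any special handling for ROM sizes that are not a power of two.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (name and signature are assumptions for this sketch):
// adds up every byte of the ROM image and returns the sum modulo 2^16.
std::uint16_t sum_rom_bytes(const std::vector<std::uint8_t>& rom) {
    constexpr std::size_t kThreads = 8;

    // With a ROM size that is a multiple of 32 KB this division is exact.
    const std::size_t block = rom.size() / kThreads;

    // One partial sum per thread, so no locking is needed.
    std::array<std::uint32_t, kThreads> partial{};
    std::vector<std::thread> workers;
    workers.reserve(kThreads);

    for (std::size_t t = 0; t < kThreads; ++t) {
        workers.emplace_back([&rom, &partial, block, t] {
            std::uint32_t sum = 0;
            for (std::size_t i = t * block; i < (t + 1) * block; ++i) {
                sum += rom[i];
            }
            partial[t] = sum;
        });
    }

    std::uint32_t total = 0;
    for (std::size_t t = 0; t < kThreads; ++t) {
        workers[t].join();
        total += partial[t];
    }

    // Safety net for sizes not divisible by 8: add the leftover tail bytes.
    for (std::size_t i = kThreads * block; i < rom.size(); ++i) {
        total += rom[i];
    }

    // Only the low 16 bits are kept.
    return static_cast<std::uint16_t>(total & 0xFFFFu);
}
```

Each thread writes only to its own slot of the partial array, so the only synchronization is the join at the end.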
Also, most modern CPUs are quad-cores, and some of them have Hyper-Threading (8 logical cores).
Overall, as I've mentioned, I made this mainly for use in scripts. The faster the implementation, the less it holds up script execution, all the more so because I run it frequently (I just attach this calculator to all my make scripts).