SNES higher sound rate - more quality or placebo?

Started by kindlekat, February 01, 2023, 09:56:26 PM

kindlekat

There are enough people who swear that SNESAmp's higher sound rate gives crisper, clearer audio, particularly in cutoffs and transitions. I'm assuming (if true) that it's filling in some of the ADSR waveform gaps at the standard 32K rate via interpolation. But can internal resampling make that much of a quality difference? 32K -> 48K -> 96K.

And if it's that important to people, why don't emulators take advantage of it (besides the possible coding difficulty)? Or are these "improvements" just misleading, magical-fantasy artifacts and side effects of some up/oversampling? Or does it work better on MIDI-type audio and not sample-based?

This subject has bothered me for a while. I'm assuming the upgrade is marginal but have still wondered about it for years, and I don't see anyone mentioning anything newer than SNESAmp. (I'm ignoring MSU-1 CD quality and any future Mesen-like HD sample replacement technology.)

FAST6191

Will have to look into the hardware in question but I do also have my audiophool alarm gently pulsing.

If you will permit some wittering on things you likely already know: for those playing along at home, improvements in this world that are not all in the ear of the beholder generally happen in a few ways.

1) The replacement also deals with any aging capacitors and suchlike.
2) The replacement allows a nicer audio output path: instead of an RF signal into a 14 inch, possibly mono CRT firing sound out of the side through a nasty old dry paper cone, you get RCA into a properly angled hi-fi speaker system or something.
3) Loudness wars, though that might lift something further above the noise floor and thus be an actual improvement once everything is set to the chosen volume.
4) Whatever 80s "good enough for a cost-concerned mass-produced device" digital-to-analogue converter (DAC) was used gets replaced with a modern one. Whether the original discards information, as it were, I do not know, and a 32K sample rate* only chops off ranges that many of those old enough to have enjoyed the console back then can no longer hear, and that were almost certainly never actively used for anything more than a noise channel on the baseline hardware. Waveform gaps are more likely explained by a lot of these things being square waves when all is said and done, which modern hardware might well reproduce better (there will always be some capacitance or inductance to prevent infinitely quick rise and fall times, or indeed to cause overshoot. See also https://www.electronics-tutorials.ws/filter/filter_4.html and some of the other chapters in that series: basic filters are easy, higher-order and active ones need more, and nice clamping setups more still, each costing more for less and less, for want of a better term, gain).

1 and 2 might also combine if an external board is kept away from a noisy part of the main board.

*For those who have never had an introduction to digital signals, the Nyquist-Shannon sampling theorem is the search term of choice. Short version: you need a sample rate of at least double whatever frequency you want to recreate (one sample going up and one coming down). That is why, with the best human hearing topping out at around 20000 Hz, CDs used 44100 Hz and very little has gone much above that since. The main exception is material for mastering purposes, where it can matter more when slowing things down or speeding them up, and when pushing volume to the limit: one sample going up and one going down does not necessarily land on the peak of the wave and the matching position on the other side, so more samples improve the chances of getting close to it and let you get nearer the maximum volume before the tops get chopped off (clipping). https://www.researchgate.net/figure/Age-related-hearing-loss-according-to-the-International-Organization-for-Standardization_fig1_338597788 is a reasonable graph of hearing loss by age and frequency, and anybody exposed to concerts, building sites, loud headphones and such (an awful lot of people in the cohort that grew up in the 80s-2000s) is probably worse still.
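
To put rough numbers on the sampling footnote above, here is a minimal Python sketch. The function names `nyquist_hz` and `sampled_peak` are made up for illustration, and the 8000 Hz test tone and 45 degree offset are arbitrary choices, not anything taken from the SNES hardware. It shows the highest frequency each rate can carry, and how a higher rate lands samples closer to the true peak of a wave.

```python
import math

def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

def sampled_peak(freq_hz, sample_rate_hz, phase=0.0, n=64):
    """Largest absolute sample value seen when sampling a unit-amplitude sine."""
    return max(
        abs(math.sin(2 * math.pi * freq_hz * i / sample_rate_hz + phase))
        for i in range(n)
    )

# Nyquist limit for the rates mentioned in this thread.
for rate in (32000, 44100, 48000, 96000):
    print(rate, "Hz sample rate ->", nyquist_hz(rate), "Hz ceiling")

# An 8000 Hz sine sampled 45 degrees away from its peaks (hypothetical case).
# At 32000 Hz the samples never land on the real peak of 1.0.
print(round(sampled_peak(8000, 32000, phase=math.pi / 4), 3))  # 0.707
# At 96000 Hz the samples land much closer to it.
print(round(sampled_peak(8000, 96000, phase=math.pi / 4), 3))  # 0.966
```

Note that the true waveform is the same in both cases; only how close the stored samples come to its peak changes, which is the headroom point made in the footnote.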

There are some emulators that can sample at higher rates than the baseline hardware (DeSmuME for the DS has had such an option for some time now). At the same time, the only golden-ear audio enjoyers on consoles (and complainers in emulator discussions) who would give the SNES crowd a run for their money are the Mega Drive/Genesis ones, so the lack of it, and of discussion about it, is a curious thing.

kindlekat

The above is very good. Things that tend to get lost and buried over time, over-and-over again. Or genuinely just forgotten for awhile. :woot!:


My mind may be telling me that Mesen at 96000 vs 48000 produces a slightly crisper output when I listen to something like DuckTales (-5% fuzzy). SameBoy felt sharper when it was upped to 384000 -> native output. Then I heard that someone brought up higher internal resampling for the GBA with endrift (mGBA) but had the idea kicked down as nonsense (I can't find where).

I know there used to be a lot more arguments / threads about SNES internal sampling rate but I can only find this one now:
https://x704.net/bbs/viewtopic.php?f=13&t=3195

Did not know DeSmuME does it, so it's worth checking out someday. :thumbsup:


I was still wondering whether others felt some improvement on the emulator side. Given that the original interest mostly died away, it feels like some hocus-pocus illusion I'm hearing.

FAST6191

The GBA has an interesting twist in rather more recent times too
https://gbatemp.net/threads/hq-sound-in-native-gba-games-is-entirely-possible.625549/

On resampling, you mentioned substitution. Such things have been done in some cases: the Donkey Kong soundtracks were exported using the base samples from the original digital audio workstation, audio that would necessarily have been crushed on the SNES any way you care to swing it. It is not an unquestionable act (master your audio for the worst building-site speaker and all that). Emulators built explicitly to do much there have not appeared the way they have for texture replacement, and what little I did see for actual play was someone having some fun with Lua scripts (if a track is playing, disable everything but the in-game sound effects and play the track on the PC in the background, maybe also handling loops and interrupts if you are really fancy). I imagine the future, if shared sample libraries were used (entirely possible), would be such things getting replaced like so many sound font/instrument libraries, but hey.

kindlekat

Thank you for mentioning both points!

I feel like I can see things clearly now:

1. The PSG systems generate in the MHz range. SameBoy feeds out a large amount of raw data (384000 Hz) so a frontend can do the complex downsampling that it can't. That just results in a clearer signal, not really more detail.

2. The "soundfont" samplers go the other way. The source material is low quality, as you point out; no AI or algorithm can reconstruct the missing texture that got grunged and banged up (without some hint or cheat sheet). Resamplers (et al.) can't replicate the higher fidelity of better material, though we can sometimes substitute it or track down the original source.

Maybe a higher internal rate can make things clearer, but I'm now thinking it's the frontend up-converter that is doing more of the magic. If we could engineer the GBA to combine the two paths (PSG at a 1M+ rate and WAVE at a 32768 rate), then the quality would go up (for the PSG part). Or hack in some 16-bit WAVE channel extension, but I'm getting greedy.


Thank you, again. I can be more at peace now. Time to think of other matters like when to use SC-55 vs SC-88 Pro soundfonts. :)

kindlekat

There's this really good thread about SNES vs Mega Drive hardware DACs, with lots of tech talk thrown around
https://archive.nes.science/nesdev-forums/f5/t6327.xhtml


Absolutely worth reading!
(although you likely know about it)  :happy:

FAST6191

I never really went much on nesdev forums. If I am wanting to contemplate the scope of my lack of hearing (concerts, engines, building sites are all in my past) and musical ability I usually go on https://hcs64.com/mboard/forum.php or whatever forums are hosting discussions for foobar2000 plugins.

tacoschip

Is there a downside to oversampling at 384K and using a Kaiser filter to massage it down to 96K native, as opposed to the normal way of sinc-filtering to 32K and then upsampling to native? I find SNES audio has a small, dirty blurriness even with a sinc filter.
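
For anyone wanting to see what "Kaiser it down from 384K to 96K" means mechanically, here is a rough pure-Python sketch of a Kaiser-windowed sinc low-pass followed by keeping every fourth sample. It is not taken from any emulator: the function names, the 63-tap length, the 0.11 cutoff and the beta of 8.0 are all assumptions picked for illustration, and it does not settle whether this route sounds better than the usual one.

```python
import math

def bessel_i0(x):
    """Zeroth-order modified Bessel function, summed as a power series."""
    total, term, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2 * k)) ** 2
        total += term
        k += 1
    return total

def kaiser_lowpass(num_taps, cutoff, beta):
    """Kaiser-windowed sinc low-pass; cutoff is a fraction of the input rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = bessel_i0(beta * math.sqrt(1 - (x / mid) ** 2)) / bessel_i0(beta)
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC

def decimate(samples, taps, factor):
    """Apply the (symmetric) filter, keeping only every `factor`-th output."""
    out = []
    for start in range(0, len(samples) - len(taps) + 1, factor):
        window = samples[start:start + len(taps)]
        out.append(sum(s * t for s, t in zip(window, taps)))
    return out

# Hypothetical settings: 384000 Hz in, 96000 Hz out (factor 4).
taps = kaiser_lowpass(num_taps=63, cutoff=0.11, beta=8.0)
rate = 384000
# A 100 kHz tone is above the 48 kHz ceiling of the 96000 Hz output,
# so the filter should remove almost all of it before samples are dropped.
tone = [math.sin(2 * math.pi * 100000 * i / rate) for i in range(4000)]
leak = max(abs(v) for v in decimate(tone, taps, 4))
print("100 kHz leakage after decimation:", leak)
```

The point of filtering before dropping samples is that anything above half the output rate would otherwise fold back into the audible range as aliasing.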

Bregalad

Quote from: kindlekat on February 01, 2023, 09:56:26 PM
There's enough people who swear by SNESAmp's higher sound rate = more crisp, clear audio as in cutoffs and transitions. I'm assuming (if true) that it's filling in some of the ADSR waveform gaps at the standard 32K rate via interpolation. But can it make that much of a quality difference (internal resampling)? 32K -> 48K -> 96K.
While it's quite possible you'll hear a slight improvement in the treble by switching from 32 kHz to 48 kHz, it's extremely unlikely you'll hear any difference above that mark (or that your sound hardware supports it in the first place).

What makes a lot of SNES game music sound crisper/better is using cubic interpolation instead of Gaussian. Many emulators and SPC players have this feature.
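
As a rough idea of what cubic interpolation does when a sample is played back at a different pitch, here is a small Python sketch using the Catmull-Rom form of a 4-point cubic. Which cubic variant any given emulator or SPC player uses is an assumption here, the function names are invented for illustration, and the console's Gaussian table is not reproduced.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Cubic interpolation between p1 and p2 at fraction t in [0, 1]."""
    a = -0.5 * p0 + 1.5 * p1 - 1.5 * p2 + 0.5 * p3
    b = p0 - 2.5 * p1 + 2.0 * p2 - 0.5 * p3
    c = -0.5 * p0 + 0.5 * p2
    return ((a * t + b) * t + c) * t + p1

def resample_cubic(samples, step):
    """Walk through `samples` in increments of `step`, interpolating between points."""
    out = []
    pos = 1.0  # start at index 1: each output needs one sample before and two after
    while pos < len(samples) - 2:
        i = int(pos)
        t = pos - i
        out.append(
            catmull_rom(samples[i - 1], samples[i], samples[i + 1], samples[i + 2], t)
        )
        pos += step
    return out

# A step of 0.5 plays the data back at half speed (one octave down).
ramp = [0, 1, 2, 3, 4, 5]
print(resample_cubic(ramp, 0.5))  # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
```

The curve passes exactly through the stored sample points, which is one plausible reason it is heard as crisper than a smoothing kernel.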

TamiSaunders

Quote from: kindlekat on February 01, 2023, 09:56:26 PM
There's enough people who swear by SNESAmp's higher sound rate = more crisp, clear audio as in cutoffs and transitions. I'm assuming (if true) that it's filling in some of the ADSR waveform gaps at the standard 32K rate via interpolation. But can it make that much of a quality difference (internal resampling)? 32K -> 48K -> 96K.

And if it's that important to people, why emulators don't take advantage of it (besides the possible coding difficulty)? Or are these "improvements" just magical fantasy misleading artifacts and side-effects of some up/over-sampling? Or does it work better on MIDI-type and not sample-based?

This subject has bothered me for awhile. I'm assuming the upgrade is marginal but have still wondered about for years - I don't see anyone mentioning something newer than SNESamp. Ignoring MSU-1 CD quality, and some future HD sample Mesen-like replacement technology.
The idea that a higher sound rate can lead to a more crisp and clear audio is generally true. When audio is recorded at a higher sample rate, it can capture more detail and nuance in the sound wave, resulting in a higher quality audio output. However, the benefits of upsampling or resampling audio are often overstated, and the actual improvement in audio quality can be marginal.

It is possible that SNESAmp's resampling is filling in some of the gaps in the ADSR waveform and improving the audio quality, but the actual improvement will depend on the quality of the original audio source and how it was recorded. Furthermore, the improvement may not be significant enough to justify the additional computational resources required for resampling.