Gentlemen, interesting conversation! How do the considerations here change in the case of digitized vinyl? For example, take a Rolling Stones record from the late 60s or early 70s that has been played only once or twice, and digitize it with top-notch equipment (additionally eliminating stylus microphonics). The guy does everything at 24/192, and it sounds fabulous (also very few ticks and pops).
I will say straight out that I am extremely passionate about >16 bit playback, regardless of the signal to noise ratio of the actual material that was originally digitised to begin with. Many are obviously going to disagree with me, but I will always steadfastly maintain that when it comes to quantisation noise in the digital domain, it does not need to be even remotely audible in order to noticeably affect what we DO hear.
I can, for example, make a high quality 24 bit transcription of one of the modern-generation audiophile 45 RPM LP sets. Let's take for example a Decca recording from the early 1960s or even the late 50s - so pre-Dolby A to boot. The signal to noise ratio on these recordings would measure poorly to begin with, even before the additional noise introduced by delivering it in LP format (the LP itself, the turntable, the phono amplifier, etc.).
Yet when I take the 24 bit file I create from the above material, I can reduce the word length to 16 bits using any number of noise shaping dither techniques, and the resulting 16 bit files all sound different to each other - and none of them sound like (or as good as) the original 24 bit file. This goes against the theory that so long as the signal to noise ratio and dynamic range of the material are less than 96 dB, 16 bit makes no difference at all compared to 24 bit. If we believe the maths and the theory, the 16 bit files should sound identical to the 24 bit file, because the signal to noise ratio of the material is not high enough for the quantisation noise to matter.
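For anyone who wants to see what the word-length reduction actually does to the samples, here is a minimal numpy sketch (purely my own illustration - the tone, level and frequency are made-up values, and real dither tools use far more sophisticated, psychoacoustically shaped noise than this flat TPDF example):

```python
import numpy as np

# Reduce a float signal to 16 bits: plain rounding versus flat TPDF dither.
rng = np.random.default_rng(0)

fs = 44100
t = np.arange(fs) / fs
signal = 0.5 * np.sin(2 * np.pi * 997 * t)   # made-up 997 Hz test tone

step = 1.0 / 2**15                            # one 16-bit LSB (full scale = +/-1.0)

# Plain rounding: the error is correlated with the signal (harmonic distortion).
truncated = np.round(signal / step) * step

# TPDF dither: add +/-1 LSB triangular noise before rounding, which turns the
# error into benign, signal-independent noise instead of distortion.
tpdf = (rng.uniform(-0.5, 0.5, fs) + rng.uniform(-0.5, 0.5, fs)) * step
dithered = np.round((signal + tpdf) / step) * step

err = dithered - signal
print("worst-case error:", np.max(np.abs(err)) / step, "LSBs")  # bounded by 1.5
```

Both outputs sit exactly on the 16 bit grid; the dithered one trades a slightly higher noise floor for freedom from signal-correlated distortion, which is the whole point of dithering in the first place.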
And I find this to happen consistently - I have never encountered any 24 bit file - even a fully digital one - that does not lose sound quality when the word length is reduced. And bear in mind that when I listen, I don't listen loudly. When it comes to classical, the loudest peaks might measure only 90 dB at the most, which means the average listening level is more like the low 60s dB.
In fact, only last week I was going through my 24 bit Decca vinyl LP transcriptions and deciding on an individual case-by-case basis which of three noise shaping curves I would employ (this is only for my CD versions - obviously the 24 bit files remain intact). I decided on those three curves after many months of auditioning many dither modules, and the three I came up with sounded the best, even though none of them sound like the 24 bit source material. The reason for having three curves to choose from is that each curve has certain sonic strengths and weaknesses. The trick therefore is to use the curve that represents the best compromise, and it is different in each and every case. Of course, if I subscribed to pure mathematical theory, I'd have been wasting my time. But my listening skills tell me a completely different story - what happens 120 dB down is VERY audible at NORMAL (far lower than even 100 dB peak) listening levels.
The only thing I agree with is that for practical purposes, the vast majority of recording equipment isn't going to manage a whole lot better than the low 120s dB range to begin with. But that is still a whole lot better than 16 bits, especially when some headroom is needed during the recording process.
One additional thing I should point out, even though it should be obvious - the higher the sample rate, the less effect reducing the word length will have. If, for example, I had a 96 kHz 24 bit file, I could reduce the word length to 16 bits and likely not be able to hear the difference. The reason is that I can push the quantisation noise so far above the musical spectrum that it has no audible effect, whilst the audible spectrum can enjoy a noise floor exceeding 20 bits across the entire frequency range. But at low sample rates (48 kHz or below), there simply isn't enough bandwidth to achieve a flat noise floor below around 20 to 21 bits across the entire audio spectrum. I suppose this is also a reason why progressively higher DSD rates sound better and better too.
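The bandwidth argument can be sketched numerically. Below is a crude first-order error-feedback quantiser - nothing like a commercial psychoacoustic curve, just an illustration of the principle - which already moves some of the quantisation noise out of the 0-20 kHz band at 96 kHz. The gain grows with sample rate, because the audible band becomes an ever smaller fraction of the Nyquist bandwidth that the noise can be shoved into:

```python
import numpy as np

rng = np.random.default_rng(1)
STEP = 1.0 / 2**15  # one 16-bit LSB

def requantise(x, step, shaped):
    """Quantise to 16 bits with TPDF dither; optionally apply first-order
    error feedback (a crude stand-in for a real noise-shaping curve)."""
    d = (rng.uniform(-0.5, 0.5, len(x)) + rng.uniform(-0.5, 0.5, len(x))) * step
    y = np.empty_like(x)
    e_prev = 0.0
    for n in range(len(x)):
        u = x[n] - (e_prev if shaped else 0.0)
        y[n] = np.round((u + d[n]) / step) * step
        e_prev = y[n] - u          # total error, fed back one sample later
    return y

def inband_rms(err, fs, f_hi=20000.0):
    """RMS of the error spectrum below f_hi."""
    spec = np.abs(np.fft.rfft(err))
    freqs = np.fft.rfftfreq(len(err), 1.0 / fs)
    return np.sqrt(np.mean(spec[freqs < f_hi] ** 2))

fs = 96000                          # one second at 96 kHz
t = np.arange(fs) / fs
x = 0.25 * np.sin(2 * np.pi * 1000 * t)   # made-up 1 kHz test tone

flat = requantise(x, STEP, shaped=False)
shaped = requantise(x, STEP, shaped=True)

gain_db = 20 * np.log10(inband_rms(flat - x, fs) / inband_rms(shaped - x, fs))
print(f"in-band (0-20 kHz) noise reduction from shaping: {gain_db:.1f} dB")
```

Even this trivial shaper lowers the 0-20 kHz noise at 96 kHz; the proper multi-order, ear-weighted curves I audition buy far more than that, and at 44.1 kHz there is simply much less spectrum above the audible band to shape the noise into.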