That is what is bothering me. If HDCD embeds four more bits in the same space, how can that not get in the way of the transport reading the unadulterated music?
Well, it is an extremely neat trick.
HDCD takes advantage of the fact that your music is not jumping from very loud to very faint every millisecond. If you look at the level of your music over time, you see that there are loud passages and not-so-loud passages. When the music is loud, you don't hear the faint sounds; when the levels drop, you do.
HDCD exploits this by letting the 16 bits represent different windows of loudness. When the music is quiet, they represent the bottom 16 bits of a 20-bit sample. When the music is very loud, they represent the upper 16 bits of the 20. In between, the window slides to match the level.
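The sliding-window idea above can be sketched in a few lines. This is a simplified illustration, not HDCD's actual algorithm: the function names, the fixed 4-bit shift range, and the example values are all hypothetical, chosen just to show how 16 bits can carry different slices of a 20-bit sample.

```python
def encode_sample(sample_20bit: int, shift: int) -> int:
    """Keep 16 of the 20 bits, dropping `shift` low-order bits.

    shift=0 keeps the bottom 16 bits (quiet passages);
    shift=4 keeps the top 16 bits (loud passages).
    """
    assert 0 <= shift <= 4
    return (sample_20bit >> shift) & 0xFFFF

def decode_sample(sample_16bit: int, shift: int) -> int:
    """Restore a 20-bit value; the dropped low-order bits return as zeros."""
    return sample_16bit << shift

# Quiet passage: the value already fits in 16 bits, so nothing is lost.
quiet = 0x0ABC
assert decode_sample(encode_sample(quiet, 0), 0) == quiet

# Loud passage: the top 16 bits survive; the bottom 4 bits are discarded,
# which is acceptable because they are masked by the loud signal.
loud = 0xFABCD  # a 20-bit value
assert decode_sample(encode_sample(loud, 4), 4) == loud & ~0xF
```

Note the asymmetry: quiet passages round-trip exactly, while loud passages lose only the bits that would be inaudible anyway.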
The HDCD decoder therefore needs to be told which window the encoder used, so it can scale the sample values back accordingly. That information is buried in the low-order bits of the audio itself. Since the volume of the music doesn't change rapidly, the control data is inserted only sparsely, and it is modulated to resemble noise, so it doubles as dither.
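The low-order-bit channel works roughly like this sketch: one control bit replaces the least significant bit of a sample, and a multi-bit code is spread across several consecutive samples. This is a bare-bones illustration under assumed names; real HDCD inserts its codes sparsely and noise-shapes them, which this sketch does not attempt.

```python
def embed_flag(sample: int, bit: int) -> int:
    """Overwrite the sample's least significant bit with one control bit."""
    return (sample & ~1) | (bit & 1)

def extract_flag(sample: int) -> int:
    """Read the control bit back out of the LSB."""
    return sample & 1

# Spread one control byte across the LSBs of eight consecutive samples.
samples = [0x1000, 0x1001, 0x0FFE, 0x1003, 0x1002, 0x1000, 0x0FFF, 0x1001]
code = 0b10110010
encoded = [embed_flag(s, (code >> i) & 1) for i, s in enumerate(samples)]

# The decoder reassembles the byte from the LSBs.
recovered = 0
for i, s in enumerate(encoded):
    recovered |= extract_flag(s) << i
assert recovered == code
```

The audible cost is tiny: each carrier sample changes by at most one LSB, which is exactly the magnitude of ordinary dither noise.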
Putting it together, HDCD fits a 20-bit value into the 16 bits of the CD format by discarding information that is either zero (the upper bits when the music is not so loud) or inaudible (the low-order bits when the music is loud). A special form of essentially lossless compression.
Make sense?