It seems as though there is a clash between the hardware-oriented people and the software types, with very few people able to think in terms of the whole system and its genuine beauty.
In my mind, it all seems remarkably simple and extremely elegant:
The local DAC clock is what directly defines the output sample rate and timing; sample jitter at the output derives from this clock alone. There are two basic configurations:
1. In an optimal configuration, the local DAC clock is the master clock and the rest of the system is slaved to it (examples would be the standard CD player, a PCI sound card, asynchronous USB). Here, the clock can be a simple but high-precision, fixed-frequency crystal oscillator, giving very low jitter. There is a small hardware buffer at the DAC and other buffers in the system. Except for gross dropouts and data loss (if the overall system is not "real time enough"), this configuration cannot be influenced by the timing of the software/OS etc.
2. In less optimal configurations (e.g. SPDIF, isochronous USB-based DACs), the local DAC clock has to be adjusted in response to the data (and embedded clock) streaming to it, and its jitter is affected by noise in the cable and by jitter in the timing of the data/clock transmission. Here the master clock (defining the sample rate) is the one at the sending end, and the local DAC clock is slaved to it. Data retrieval within the PC/playback device is also slaved to this clock, so its precise timing does not influence jitter. Data is, again, hardware-buffered at the DAC and played out at a rate set by the local clock's frequency, but that frequency has to be derived from the timing of the data stream by a jitter-attenuating PLL that makes occasional, minimal frequency adjustments to maintain the same average sample rate as the sending end (there is a rough numerical sketch of both configurations after the quoted excerpt below). As long as the sending end is hardware-buffered and regulated by a fixed-frequency clock, the jitter will be 'worse' than in option (1) but still unaffected by software/OS timing. Luckily, this is just what happens:
Isochronous transfer
Data is sent out in frames every millisecond...
...The rate at which the frames go out is determined by an oscillator driving the USB bus.
This rate is independent of everything else going on in the PC.
http://www.thewelltemperedcomputer.com/KB/USB.html
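
To put rough numbers on the difference between the two configurations, here is a minimal sketch in Python. The PLL is a toy first-order model, and every figure in it (the jitter magnitudes, the loop gain) is an illustrative assumption, not a measurement of any real hardware:

```python
# A toy model of the two clocking schemes. All numbers are illustrative
# assumptions, not measurements of any real DAC.
import random

FRAMES = 10_000
FRAME_PERIOD = 1.0e-3            # isochronous USB: one frame per millisecond

# Configuration 1: the DAC runs from its own fixed-frequency crystal.
# Edge timing error is just the oscillator's tiny intrinsic jitter.
CRYSTAL_JITTER_RMS = 10e-12      # assume 10 ps RMS, for illustration
crystal_edges = [i * FRAME_PERIOD + random.gauss(0.0, CRYSTAL_JITTER_RMS)
                 for i in range(FRAMES)]

# Configuration 2: the DAC clock must be recovered from the arrival times
# of the incoming frames, which carry far larger transmission jitter.
LINK_JITTER_RMS = 5e-9           # assume 5 ns RMS arrival jitter
arrivals = [i * FRAME_PERIOD + random.gauss(0.0, LINK_JITTER_RMS)
            for i in range(FRAMES)]

# First-order PLL: low-pass the phase error so only a small fraction of
# each frame's timing error reaches the local clock. (A real PLL still
# passes slow wander; this toy only shows white-jitter attenuation.)
LOOP_GAIN = 0.01                 # small gain = narrow loop bandwidth
phase = 0.0
pll_edges = []
for t_arrival in arrivals:
    phase += FRAME_PERIOD        # free-run one nominal period
    error = t_arrival - phase    # phase error vs. the incoming stream
    phase += LOOP_GAIN * error   # apply only a minimal correction
    pll_edges.append(phase)

def jitter_rms(edges):
    """RMS deviation of each edge from its ideal grid position."""
    devs = [t - i * FRAME_PERIOD for i, t in enumerate(edges)]
    mean = sum(devs) / len(devs)
    return (sum((d - mean) ** 2 for d in devs) / len(devs)) ** 0.5

print(f"config 1, crystal:       {jitter_rms(crystal_edges):.1e} s RMS")
print(f"raw link arrivals:       {jitter_rms(arrivals):.1e} s RMS")
print(f"config 2, PLL-recovered: {jitter_rms(pll_edges):.1e} s RMS")
```

On a typical run the crystal comes out around 10 ps RMS, the raw link arrivals around 5 ns, and the recovered clock at a few hundred ps: attenuated by more than an order of magnitude, but still clearly 'worse' than option (1), which is exactly the point.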
And that's, er, it... The debate is only about (a) whether the OS can be guaranteed to deliver the data in time (not the precise accuracy of that timing; failure here means severe, highly audible glitches, as the sketch below illustrates), or (b) some mysterious, undefinable noise/PSU/ground issue related to the load the CPU is under from everything it has to do. Everything else is pretty sensible, and was not designed by people completely ignorant of what jitter is or mystified as to how computers and DACs work.
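
To make that glitch-versus-jitter distinction concrete, one last sketch. The buffer depth, chunk size and delivery-delay pattern are made-up numbers, not any particular driver's behaviour: the software hands over data at erratic, OS-scheduled moments, yet every sample plays at an instant defined solely by the DAC crystal, and the only possible failure is an empty buffer.

```python
# A toy model of hardware buffering: erratic software delivery feeding a
# DAC that consumes one sample per crystal tick. Buffer depth, chunk size
# and delay pattern are made-up illustrative numbers.
import random
from collections import deque

SAMPLE_PERIOD = 1 / 44_100       # one tick of the DAC's local crystal
BUFFER_DEPTH = 512               # hypothetical hardware FIFO, in samples
CHUNK = 256                      # samples per software delivery
TICKS = 200_000

fifo = deque([0.0] * BUFFER_DEPTH)   # playback starts with a filled buffer
next_delivery = 0.0
dropouts = 0

for n in range(TICKS):
    now = n * SAMPLE_PERIOD      # DAC ticks: perfectly regular, crystal-timed

    # Software side: a delivery lands whenever the OS gets around to it,
    # and blocks (retries next tick) if the FIFO has no room.
    if next_delivery <= now and len(fifo) + CHUNK <= BUFFER_DEPTH:
        fifo.extend([0.0] * CHUNK)   # dummy audio samples
        # schedule the next delivery after a wildly variable OS delay
        next_delivery = now + random.uniform(0.2, 1.5) * CHUNK * SAMPLE_PERIOD

    # Hardware side: one sample out per tick, timed only by the crystal.
    if fifo:
        fifo.popleft()           # plays at exactly `now`, regardless of
                                 # when the software delivered it
    else:
        dropouts += 1            # underrun: an audible glitch, not jitter

print(f"dropouts: {dropouts} of {TICKS} samples")
```

On a typical run the dropout count is zero even though the delivery times wander by hundreds of sample periods; starve the buffer badly enough and you get counted, grossly audible dropouts, never a subtle shift in output timing.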