It takes about 100 ps of random jitter at 20 kHz (and much more at lower frequencies) to reach the lsb level of a 16-bit DAC. At 1 kHz it takes over 2 ns (see e.g.
http://www.whatsbestforum.com/showthread.php?1322-Jitter-101). That's quite a bit of jitter, and it only induces an error some 96 dB below full-scale. Add some music or a movie and it is unlikely you'd hear several ns of random jitter. My belief (shared by others) is that random jitter is essentially a non-issue for modern systems.
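As a rough sanity check on those numbers, here is a minimal Python sketch. It assumes a full-scale sine sampled at its steepest point (the zero crossing) and asks what timing error produces a half-lsb amplitude error; the half-lsb criterion and the function name are my assumptions for illustration, not something taken from the linked thread.

```python
import math

def jitter_for_half_lsb(freq_hz, bits):
    """Timing error (s) that gives a 1/2-lsb error on a full-scale sine.

    Full-scale sine: v(t) = A*sin(2*pi*f*t), maximum slew = 2*pi*f*A.
    One lsb = 2*A / 2**bits, so half an lsb = A / 2**bits.
    dt = (A / 2**bits) / (2*pi*f*A) = 1 / (2*pi*f*2**bits)
    """
    return 1.0 / (2.0 * math.pi * freq_hz * 2**bits)

for f in (20e3, 1e3):
    dt = jitter_for_half_lsb(f, 16)
    print(f"{f / 1e3:g} kHz: {dt * 1e12:.0f} ps")
```

Under those assumptions this works out to roughly 120 ps at 20 kHz and a bit over 2.4 ns at 1 kHz, in line with the figures above. A full-lsb criterion would double both values.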
Signal-dependent (deterministic) jitter is another issue, with lsb-level jitter causing significant reductions in spurious-free dynamic range. The same goes for coupling a bit of clock or other signal into the input (or vice versa). For example, lsb-level injection of a fixed tone can reduce the maximum spur from ~140 dB below full-scale to "only" 96 dB or so. See my series on Jitter (101, 102, plus a thread on cable bandwidth and jitter) in the technical area of WBF (you can start at
http://www.whatsbestforum.com/showthread.php?2829-Don-s-Tech-Series and click on the links). Isolating the clock and signal is critical for high-resolution converters, and so a trade can be made between keeping the clock close to reduce added noise, and moving it away to keep it from coupling to the signal. However, the maximum clock coupling typically occurs inside the DAC, so keeping the clock source near the DAC is generally optimal.
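The "96 dB or so" figure for an lsb-level tone follows from simple arithmetic, sketched below in Python. The assumption here is that the injected tone has a peak-to-peak amplitude of one lsb and is measured against a full-scale peak-to-peak range of 2^16 lsbs; the function name is hypothetical.

```python
import math

def tone_level_dbfs(amplitude_lsb, bits):
    """Level (dB re full scale) of a tone spanning amplitude_lsb lsbs peak-to-peak."""
    return 20.0 * math.log10(amplitude_lsb / 2**bits)

print(f"{tone_level_dbfs(1, 16):.1f} dB re full scale")
```

For 16 bits this gives about -96 dB, i.e. roughly 6 dB per bit, so a coupled tone at the lsb level sets the spur floor there no matter how clean the converter is otherwise.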
Asynchronous DACs and other schemes serve to reduce clock jitter (random and deterministic) at the DAC, where the actual conversion takes place. On the output side, the analog buffer and image filters typically (almost always) dominate the performance.
The AES spec is for studios, where you want a number of components all sharing the same master clock to stay closely synchronized during recording, mix-down, and mastering sessions that can take hours or days. I don't know if you actually need 1 ppm/year stability; without running any numbers, my guess is that it is another of those things chosen to ensure inaudibility of any artifacts induced by clock drift.
HTH - Don