Here's an excellent description by Dustin Forman of ESS and Resonessence on the Invicta/Mirus SD card and the pitfalls of USB:
"We all like digital music because information stored in digital form is, in principle, incorruptible and preserved. The first forays into digital music were compromised by the need to compress the data: the lowest sample rate was used and even then the data files were too large to store conveniently. Compression was invented (MP3) to make long play-lists viable on early hardware. Thankfully, as technology has moved on, lossy compression is no longer needed and emerging standards are all loss-less. All manner of digital sources can now deliver digital music to a player: USB is popular, WiFi is often mentioned, and of course SD cards can be used as well.
To the casual listener the digital data is just another source of music: bit-perfect digital data from any source is identical, whether it be delivered by USB or on an SD card. What matters is the quality and care with which that digital data was captured at the studio, because, we generally assume, once captured into the digital domain it is now inviolate and available for reproduction anywhere, hence the digital revolution that surrounds us.
To reproduce the quality captured by the digital encoding at the studio is the challenge for Resonessence and other high-end Audio DAC manufacturers. To produce the very best sound quality requires attention to a number of aspects that are well known and some that are less well-known. For example, we learn from the comments of experienced audiophiles that the listening experience can differ in detail depending on where the digital data is originating (USB, Memory stick, CD etc).
This is a surprise: why do different digital music sources sound different? Is there something about the audio engineering in the DAC that can explain this? What do we need to attend to in the DAC design to minimize unexpected issues such as this?
The first consideration is that the audio device must be the timing master – we cannot rely on the imprecise timing that a typical computer provides – that is far inferior to the audio requirement. Audiophiles are well aware of the problems of jitter and good clock management. WiFi is an even bigger challenge because data flow within a typical radio environment is exceptionally unpredictable and prone to drop-out.
At the lowest level, low jitter and precise timing means that the audio clock has to be the “master clock” to the degree that even a phase locked loop cannot achieve. [Some solutions attempt to lock the average rate of the audio clock with a phase locked loop – but this is inferior to the extreme high “Q” that a master crystal audio oscillator can achieve]. Consequently, for the highest quality audio reproduction a low phase noise oscillator defines the master clock at the audio reproduction site (that is, within the DAC system and in a managed noise environment). Well-known buffering and flow control devices then surround this audio subsystem and ensure that data is clocked in and out of the interface as the audio subsystem needs it.
In principle this solution (audio master clock, sophisticated flow control to the digital transport electronics) is all we need – we are done with the design. However, experience teaches otherwise. Audiophiles can perceive a difference between data sources such as USB and SD card – how can this be, since digital data is just digital data, isn’t it? How can there be any distinction between data that has arrived over the USB link as opposed to data that has been extracted from the SD card since they are each a bit-perfect copy of the other?
There can be differences due to what engineers and scientists call second or sometimes “higher order” effects. We will describe one of them in order to explain what these can be, and we will simply state that the Invicta has been designed to mitigate this and other similar effects.
Consider the example of USB audio. All credible systems have solved the problem of flow control and placed the master clock at the DAC (asynchronous USB and the Audio USB standards etc) and commonly, to lessen the load on the USB host, a buffer memory of significant depth is present in the signal path to allow for unpredictable data delays. Two processes (at least, maybe more) are then running in the DAC: the first is the high precision clock and data flow to the DAC element itself to minimize phase noise, the second is a supervisory asynchronous process that is watching the buffer memory status and feeding back to the USB host to maintain long term synchronization. And that, surprisingly, is the source of a second order problem that evidence suggests may explain why one data source differs from another.
That asynchronous process of flow control is occurring with frequency characteristics right in the middle of the audio band: the flow control process kicks in and out in the millisecond timescale, and as a result of this, data is flowing in bursts that have audio band frequency products. To make the situation worse, the USB driver itself is a low impedance buffer sucking pulses of charge from the power supply over many frequencies, but with a strong audio band component. The buffer memory load and unload is again a current drain, more so if the memory is larger, and again it has audio frequency elements in it. All this means that the power supply is stressed: there will be a small (if the design is done right) modulation of the power proportional to the USB flow control. This modulation breaks into the audio stream primarily through clock phase modulation (which does not degrade DNR and does not degrade THD and so cannot be seen in the specifications, but there is no doubt at all that it is audible).
All high end manufacturers have a guideline: the more power supplies the better – precisely to mitigate effects such as these. A good DAC design will have a completely separate power supply for the USB interface to minimize this effect. However, there is a next step after multiple supplies and that is galvanic isolation. Galvanic isolation truly separates the power supplies because the use of multiple power supplies alone does not isolate the ground connection. Galvanic isolation as used in the Invicta does truly isolate the grounds as well as the power. With galvanic isolation the audio system ground is not disturbed at all by the USB flow control artifact and the phase noise modulation with power supply variation is at the very minimum. [It is still not zero because there are other effects that even couple between galvanically isolated domains, but it is the very lowest achievable.]
Contrast the difficulty of achieving the absolute maximum performance in the presence of these second order effects with the USB and SD Card. In the USB interface the flow control is affected by the USB host: its performance will change the detailed operation of the flow control process in the audio DAC. The basic problem is that the USB is a serial data source and marshalling (or “serializing”) of the data into the USB “pipe” is a necessity. However, in the SD Card the data is randomly accessed at a far higher rate than in the USB (in a few nanoseconds) and in any order that the controller cares to ask for it. In this case then the frequency domain characteristics of the data access process are under our control: we can ensure that access is “even” and does not exhibit a frequency characteristic in the audio band. But even more than this, the SD Card does not have a line capacitance to charge and discharge (the USB does: the wavefront propagating in the controlled environment of the cable presents a significant load to the driver). The charge disturbances are lower to begin with before we add our “evening out” procedure when using the SD Card source.
We should stress that these are indeed second order effects, far below the level of attention typically applied to a lower-cost consumer product.
Finally, there are other second order effects. You may rightly guess that the display has a similar artifact: a characteristic current draw that can be in the audio band if care is not applied and so forth. All that we can find are taken care of in the Invicta."
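To put rough numbers on the flow-control argument above, here is a minimal Python sketch. It is my own toy model, not Resonessence's design. The assumptions that are not in the quote: USB frame periods of 1 ms (full speed) and 125 µs (high-speed microframes), a 44.1 kHz stream, a 20 Hz to 20 kHz audio band, a device that simply asks for a 44- or 45-sample packet each frame depending on its buffer level, and a hypothetical "even" SD card schedule of one read per output sample. Every function name is made up for illustration.

```python
AUDIO_BAND_HZ = (20.0, 20_000.0)  # conventional limits of human hearing


def repetition_rate_hz(period_s):
    """Fundamental frequency of an event that repeats every period_s seconds."""
    return 1.0 / period_s


def in_audio_band(freq_hz):
    low, high = AUDIO_BAND_HZ
    return low <= freq_hz <= high


def simulate_feedback(frames=1000, sample_rate=44_100, frames_per_second=1000):
    """Toy asynchronous flow control (an assumption, not a real USB stack).

    The DAC clock drains sample_rate / frames_per_second samples per frame
    (44.1 here). The host can only send whole samples, so the device asks for
    one extra sample whenever its FIFO has dropped below the target level.
    Returns the packet size used in each frame.
    """
    short_pkt = sample_rate // frames_per_second  # 44 samples
    long_pkt = short_pkt + 1                      # 45 samples
    # Fill level relative to the target, in units of 1/frames_per_second of
    # a sample, so the arithmetic stays in exact integers.
    level = 0
    packets = []
    for _ in range(frames):
        size = long_pkt if level < 0 else short_pkt
        level += size * frames_per_second - sample_rate
        packets.append(size)
    return packets


packets = simulate_feedback()
long_frames = [i for i, n in enumerate(packets) if n == 45]
gaps = {b - a for a, b in zip(long_frames, long_frames[1:])}
correction_rate_hz = 1000 / min(gaps)  # 1000 frames per second

for name, period_s in [
    ("USB full-speed frame", 1e-3),
    ("USB high-speed microframe", 125e-6),
    ("hypothetical one-read-per-sample schedule", 1 / 44_100),
]:
    rate = repetition_rate_hz(period_s)
    print(f"{name}: {rate:.0f} Hz, in audio band: {in_audio_band(rate)}")

print(f"samples delivered in 1 s: {sum(packets)}")
print(f"45-sample packet every {min(gaps)} frames -> {correction_rate_hz:.0f} Hz")
```

In this model the data arrives bit-perfect (44,100 samples in one second), yet the frame rate and the recurring longer packet both repeat at audio-band frequencies, while the per-sample read schedule repeats above the band. Real devices use a fractional feedback value and hosts behave differently, so treat this only as an illustration of the timescales involved.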
And here's some info on how it sends data and decodes FLAC from the SD card:
"The FLAC is decoded into RAM where it becomes normal lossless PCM data, it then is sent to a FIFO for buffering between the RAM and the DAC. Then it is transmitted to the DAC via I2S interface. All this is done for the lowest jitter. The FLAC file is decoded in chucks in a streaming fashion. This is VERY difference from a regular CPU, where the bursty processing can have audible effects. The FLAC decoder is actually a hybrid hardware decoder and software. We did this so the CPU loading is actually very constant (smooth) during the stream decoding. This is only possible since we did not use a general purpose CPU, but rather made a custom decoder in the FPGA to accomplish this. I think the results speak for themselves. I have not heard of one person claiming they can tell the difference on the INVICTA (MIRUS), where its fairly well accepted that on a computer that is decoding on the fly, audible differences can be perceived."