Sampling 101

DonH50

Member Sponsor & WBF Technical Expert
Jun 22, 2010
3,947
306
1,670
Monument, CO
Sampling is a very complex problem but the basics are not all that difficult to understand. We’re simply taking samples of things, audio signals in this case, at a particular rate and resolution (terms to be defined in context shortly). Let’s start with the Shannon sampling theorem, the basis for digital audio:

“If a function of time f(t) contains no frequencies higher than W Hertz, it is completely determined by giving the value of the function at a series of points spaced 1/2W seconds apart.”

Claude Shannon was researching information theory and the most efficient ways to transmit and recover information. Harry Nyquist, a controls expert, had laid the groundwork earlier in his work on telegraph signaling, and his name is attached to the well-known Nyquist frequency limit: the maximum bandwidth that can be captured by sampling at frequency fs is fbw < fs/2.

A few comments on the sampling theorem:

  1. It applies to an infinite series of samples of infinite precision (i.e. “analog” samples, or using an infinite number of bits). This is not quite digital…
  2. The theorem relates to signal bandwidth, not its maximum frequency. For example, a cell phone band only 30 kHz wide but centered at 1 GHz (10^9 Hz) can in theory be recovered by a sampler operating at just over 60 kS/s (60,000 samples per second). The sampler must have 1 GHz input bandwidth, and all other signals must be filtered out. I will not discuss bandpass sampling further here.
  3. Frequencies above Nyquist are still captured by an ideal sampler, but their frequency content is folded, or aliased, so the signals fall within the fs/2 region. Thus frequency information is lost, but amplitude information is not. To prevent aliasing, there must be no frequency >= fs/2, which implies filters are needed before the analog to digital converter (ADC).
  4. The theorem works on the output (DAC) side as well, though the DAC can (and does) generate higher-frequency components than fs/2 without aliasing since they occur after sampling has taken place.
  5. The theorem itself does not describe practical implementation.
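
To make comment 3 concrete, here is a minimal Python sketch of where an ideal sampler puts a tone that lies above fs/2. The function name alias_frequency is just something made up for illustration, and it assumes a perfect sampler with no filtering in front of it.

```python
def alias_frequency(f_in: float, fs: float) -> float:
    """Apparent frequency (0 .. fs/2) of a tone at f_in after ideal sampling at fs."""
    f = f_in % fs                        # a sampler cannot tell f_in from f_in + k*fs
    return fs - f if f > fs / 2 else f   # the upper half folds back around fs/2


fs = 44_100.0  # CD rate
for f_in in (1_000.0, 21_000.0, 23_100.0, 43_100.0, 45_100.0):
    print(f"{f_in:8.0f} Hz in -> {alias_frequency(f_in, fs):8.0f} Hz out")
```

Tones at 43.1 kHz and 45.1 kHz both come out looking like 1 kHz, which is why the frequency information cannot be recovered after the fact and the filter has to go before the ADC.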

Most ADCs first convert a signal into discrete (quantized) time values using a track-and-hold (T/H) circuit, then quantize the amplitude to produce a digital signal. Thus we go from signals continuous in time and amplitude, to discrete time and continuous amplitude, and finally to digital signals having discrete time and amplitude. An ADC typically includes the T/H so we don’t see that intermediate step. However, sampling errors in time and amplitude are important in achieving a high-quality digital result.

The figure below shows a ramp input signal before (middle) and after (bottom) sampling by a T/H. The clock is high to track the signal, and low to hold. A perfect T/H instantly follows the signal when in track mode, and perfectly maintains the signal value in hold mode. There are myriad non-ideal effects in real systems including noise, nonlinearity (distortion), clock and signal feedthrough, finite bandwidth and acquisition time, hold-mode droop, glitches when changing state, etc. For now, just knowing what happens in a T/H is enough. Not all ADCs include a T/H; some combine time sampling with amplitude quantization in one step. (Click on the figure to see bigger -- I shrunk them down to upload.)

[Figure: ramp input before (middle) and after (bottom) sampling by a T/H]
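
The ideal T/H behavior described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the signal and clock are lists of equal length, the clock is a boolean (True means track), and none of the non-ideal effects are modeled. The name track_and_hold is hypothetical.

```python
def track_and_hold(signal, clock):
    """Ideal T/H: output follows the input while clock is high, freezes while low."""
    out = []
    held = signal[0]       # assumed starting value if the clock begins low
    for x, clk in zip(signal, clock):
        if clk:            # track mode: follow the input instantly
            held = x
        out.append(held)   # hold mode: repeat the last tracked value
    return out


ramp = [n / 10 for n in range(12)]               # slow ramp input
clock = [(n // 3) % 2 == 0 for n in range(12)]   # high for 3 points, low for 3
print(track_and_hold(ramp, clock))
```

The output climbs with the ramp while the clock is high and sits flat while it is low, giving the staircase look of the bottom trace.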

After the T/H a quantizer converts the held values to discrete amplitude values. The resolution of the quantizer, and thus the complete ADC, is expressed in bits – numbers to the base two. One bit has two levels (2^1 = 2); four bits, 16 levels (2^4), etc. A 16-bit ADC has 65,536 discrete amplitude steps – quite a few, but a long way from infinite. The figure below shows a 1 kHz signal sampled at 44.1 kS/s (44.1 kHz, CD rate) by a 4-bit ADC. Very low resolution, but makes it easy to see what happens. The input is in red and output in blue. The output has levels from 0 to 15 (16 different values, or least-significant bits, lsbs). The error signal (dashed line) ties to the right-hand Y axis and for this perfect ADC ranges from -0.5 to +0.5 lsbs. Note the random appearance of the error signal; for an ideal ADC this is very close to (and generally treated as) simple white noise. This error is the quantization noise.

[Figure: 1 kHz signal sampled at 44.1 kS/s by a 4-bit ADC, showing input, output, and error]
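
For anyone who wants to reproduce something like the 4-bit example, here is a rough Python sketch of an ideal quantizer. It is not the code used for the plots; the helper names and the choice to scale the input to a 0..1 range are my own assumptions.

```python
import math


def quantize(x: float, n_bits: int) -> int:
    """Map x in the range 0..1 to an integer code in 0 .. 2**n_bits - 1."""
    levels = 2 ** n_bits
    code = math.floor(x * (levels - 1) + 0.5)   # round to the nearest level
    return max(0, min(levels - 1, code))        # clip at the rails


def sine_samples(f_sig: float, fs: float, count: int):
    """Full-scale sine shifted and scaled to span 0..1."""
    return [0.5 + 0.5 * math.sin(2 * math.pi * f_sig * n / fs) for n in range(count)]


n_bits = 4
xs = sine_samples(1_000.0, 44_100.0, 441)       # 10 ms of a 1 kHz tone at CD rate
codes = [quantize(x, n_bits) for x in xs]
errors_lsb = [c - x * (2 ** n_bits - 1) for c, x in zip(codes, xs)]

print("codes span", min(codes), "to", max(codes))
print("largest error in lsbs:", max(abs(e) for e in errors_lsb))
```

The codes run from 0 to 15 and the error never exceeds half an lsb, which is the behavior of the perfect ADC described above.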

Stepping up to 16 bits gives us the plot below. The number of time steps is the same because the sampling rate has not changed. However, the amplitude resolution is much finer (notice the output more closely follows the input). The error curve is a bit misleading since it is in lsbs; the error is actually only 1/4096 the size of the 4-bit ADC’s error! (The difference between 4 and 16 bits is 12 bits, and 2^12 = 4096.) I thought about plotting the two error curves together, but by eye the 16-bit error is essentially a flat line at the same scale as the 4-bit error signal.

[Figure: the same 1 kHz signal sampled at 44.1 kS/s by a 16-bit ADC]

To really see the difference in resolution, let’s look at the frequency spectra (using fast Fourier transforms, FFTs) of the two converters as shown below. Note the difference in the noise floor between the two ADCs – big change! Recall that these are ideal ADCs with no error sources in time or amplitude. The difference is all from the difference in resolution.

[Figure: FFT of the 4-bit ADC output]

[Figure: FFT of the 16-bit ADC output]

Now let’s put up some numbers. Treating quantization noise as white noise, it is fairly easy to calculate the signal-to-noise ratio (SNR). It can be calculated in closed form for a sine wave, or we can root-sum-square all the noise in the FFT and compare it to the signal power. Either way, after playing with the math we find:

SNR = 6.02N+1.76 dB for an N-bit ADC (or DAC)
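
For those who want to see where the two constants come from, here is the usual textbook derivation. It assumes the quantization error is uniformly distributed over one step q (the white-noise treatment above) and that the input is a full-scale sine wave; q and the P terms are symbols introduced only for this sketch.

```latex
\begin{align}
P_{\text{noise}}  &= \frac{1}{q}\int_{-q/2}^{q/2} e^{2}\,de = \frac{q^{2}}{12} \\
P_{\text{signal}} &= \frac{1}{2}\left(\frac{2^{N} q}{2}\right)^{2} = \frac{2^{2N} q^{2}}{8} \\
\mathrm{SNR}      &= \frac{P_{\text{signal}}}{P_{\text{noise}}} = \frac{3}{2}\,2^{2N} \\
\mathrm{SNR_{dB}} &= 10\log_{10}\!\left(\tfrac{3}{2}\right) + 20N\log_{10}(2) \approx 1.76 + 6.02N
\end{align}
```

So each added bit buys about 6 dB, and the 1.76 dB is just the factor of 3/2 that falls out of comparing a sine wave to uniformly distributed noise.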

The 4-bit ADC has SNR of only ~26 dB while the 16-bit ADC’s SNR is over 98 dB. To put this in perspective, consider that a quiet room may be around 50 dB in sound pressure level (SPL); the threshold of pain is around 120 dB, and maximum peaks for something like a close gunshot or loud jet engine may be 140 dB or more. Using 120 dB as a maximum, that’s 70 dB of dynamic range in our room (before we start hurting). Getting 70 dB requires a little over 11 bits (so 12 in practice); using 16 bits provides additional headroom for those really big peaks and quiet rooms. Just like analog systems, you really don’t want to hear clipping!
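
Here is a small Python sketch that checks these numbers two ways: from the formula, and by brute force (quantize a full-scale sine and compare signal power to error power). The function names are hypothetical, clipping at the rails is ignored, and the irrational phase step is simply a convenient way to keep the samples from repeating.

```python
import math


def ideal_snr_db(n_bits: int) -> float:
    """SNR of an ideal N-bit converter with a full-scale sine input."""
    return 6.02 * n_bits + 1.76


def bits_for_snr(snr_db: float) -> int:
    """Smallest whole number of bits whose ideal SNR meets snr_db."""
    return math.ceil((snr_db - 1.76) / 6.02)


def simulated_snr_db(n_bits: int, count: int = 100_000) -> float:
    """Quantize a full-scale sine to 1-lsb steps and measure signal/error power."""
    amplitude = 2 ** n_bits / 2          # sine amplitude measured in lsbs
    step = (math.sqrt(5) - 1) / 2        # irrational cycles per sample
    signal_power = noise_power = 0.0
    for n in range(count):
        x = amplitude * math.sin(2 * math.pi * step * n)
        error = round(x) - x             # ideal quantizer error, within +/-0.5 lsb
        signal_power += x * x
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)


for bits in (4, 16):
    print(bits, "bits:", round(ideal_snr_db(bits), 2), "dB ideal,",
          round(simulated_snr_db(bits), 2), "dB simulated")
print("bits needed for 70 dB:", bits_for_snr(70.0))
```

The simulated values should land close to the formula, a little less so at 4 bits where the error is not as noise-like.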

The SNR is the ratio between the signal and the total noise. The difference between the signal and the highest other (noise or distortion) peak is the spurious-free dynamic range (SFDR). Calculating the SFDR is more complex and there is not a simple solution (one solution uses Bessel functions, but they make my head hurt). For an ideal N-bit ADC, the SFDR is about 9N dB, or about 36 dB for a 4-bit and 144 dB for a 16-bit ADC. The noise floor in the 16-bit plot is higher than 144 dB because only 64k points were used in the FFT; more points would reduce the noise floor but the figures get big and hard to upload.
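
The dependence of the plotted noise floor on FFT length can be estimated with a quick calculation. This is a sketch under simplifying assumptions: the quantization noise is white, the sampling is coherent with no window function, and the noise power spreads evenly over the M/2 bins between DC and fs/2. The function name is made up.

```python
import math


def fft_noise_floor_db(n_bits: int, fft_points: int) -> float:
    """Approximate average noise level per FFT bin, in dB relative to full scale."""
    snr_db = 6.02 * n_bits + 1.76                       # total noise below the signal
    processing_gain_db = 10 * math.log10(fft_points / 2)  # noise spread over M/2 bins
    return -(snr_db + processing_gain_db)


for points in (4_096, 65_536, 1_048_576):
    print(points, "points:", round(fft_noise_floor_db(16, points), 1), "dB")
```

Every doubling of the FFT length drops the average floor by about 3 dB, which is why a longer FFT is needed before spurs far below the signal become visible.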

Some other interesting things can be seen in the frequency plots. Although the actual input frequency is not exactly 1 kHz to avoid spectral leakage (1), there is still some “clumping” of energy visible in the 4-bit plot. The clumps are not all harmonically related to the signal, creating an unpleasant sound (2), and contribute to “digital noise” that nobody seems to enjoy. With 16 bits, the noise floor is much smoother in addition to being much lower. Aside: early delta-sigma converters also exhibited non-harmonic tones related to the modulator loop and digital filters. These tones contributed to the poor sound from those early converters, both ADCs and DACs. Modern designs use more advanced architectures and techniques that virtually eliminate those tones.

So, more bits yield higher resolution and a much lower noise floor. Higher sampling rates allow higher frequencies to be captured and, because filters can affect frequency relationships in the signals, facilitate lower-order filters with higher cut-off frequencies that reduce their impact in the desired signal (audio) band.
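
To put a rough number on the filter point, here is a Python sketch that estimates the filter order needed to keep a 20 kHz passband and still reach a given attenuation by fs/2. The Butterworth response is purely my assumption for illustration (real anti-alias filters are usually other types), as are the 20 kHz corner and 98 dB target.

```python
import math


def butterworth_order(f_corner: float, f_stop: float, atten_db: float) -> int:
    """Smallest Butterworth order giving atten_db of attenuation at f_stop.

    Uses |H|^2 = 1 / (1 + (f / f_corner)^(2n)), with f_corner the -3 dB point.
    """
    ratio = f_stop / f_corner
    return math.ceil(math.log10(10 ** (atten_db / 10) - 1) / (2 * math.log10(ratio)))


for fs in (44_100.0, 96_000.0, 192_000.0):
    order = butterworth_order(20_000.0, fs / 2, 98.0)
    print(f"fs = {fs:8.0f} Hz -> order {order}")
```

At 44.1 kS/s the order comes out absurdly high for this filter type, while at 192 kS/s it drops to single digits. That is the sense in which a higher sampling rate lets the filter be gentler and sit further from the audio band.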

Enough for now! Feel free to ask questions; more to come. - Don

Notes:

(1) FFTs operate upon a finite data set. Spectral leakage occurs when the signal samples do not fit precisely within the FFT’s window, creating artificial “skirts” around the signal frequency. Window functions are used to smoothly roll off the data at the start and end of the sample set and reduce these skirts. To eliminate leakage, the input signal period must start and end exactly at the window boundaries. There are several schemes to ensure this; I used the routine in the IEEE’s Standard 1241 for ADC testing.
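
One common way to pick a leakage-free test frequency is to force a whole number of signal cycles into the FFT record, with that number sharing no common factor with the record length. The Python sketch below shows that general idea only; I am not claiming it matches the IEEE 1241 routine step for step, and the function name is hypothetical.

```python
import math


def coherent_frequency(f_target: float, fs: float, fft_points: int):
    """Return (frequency, cycles) near f_target with a whole number of cycles
    in the record and no common factor between cycles and record length."""
    cycles = round(f_target * fft_points / fs)
    while math.gcd(cycles, fft_points) != 1:   # e.g. must be odd for a power-of-two FFT
        cycles += 1
    return cycles * fs / fft_points, cycles


freq, cycles = coherent_frequency(1_000.0, 44_100.0, 65_536)
print(f"{cycles} cycles in the record, input frequency {freq:.3f} Hz")
```

For a 64k-point FFT at 44.1 kS/s this gives a frequency a fraction of a hertz away from 1 kHz, consistent with the input not being exactly 1 kHz.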

(2) We are more sensitive to distortion frequencies that are unrelated to the signal. A pure sine wave (pure tone) is a single spike in an FFT. A triangle wave, sort of a “buzzy” sine wave, has only odd harmonics (multiples of the signal frequency). A square wave, a nastier raspy buzz, also has only odd harmonics, but at a higher level than a triangle wave. Frequency spikes not harmonically related tend to “stick out” more and sound worse than harmonic spurs to our hearing. That's one reason intermodulation distortion, the distortion that happens when two signals mix (multiply) imperfectly (i.e. the multiplier is not perfectly linear), is often more important than harmonic distortion.
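
The harmonic content of square and triangle waves is easy to check numerically. The sketch below builds one period of each and computes the first few harmonic amplitudes with a plain DFT; it uses only the standard library, and the names and the 1024-point period are arbitrary choices of mine.

```python
import cmath
import math


def harmonic_amplitudes(wave, max_harmonic: int):
    """Amplitudes of harmonics 1..max_harmonic for one period of a waveform."""
    n = len(wave)
    amps = []
    for k in range(1, max_harmonic + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(wave))
        amps.append(2 * abs(s) / n)
    return amps


N = 1024
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]
triangle = [1.0 - 4.0 * abs(i / N - 0.5) for i in range(N)]

sq = harmonic_amplitudes(square, 7)
tri = harmonic_amplitudes(triangle, 7)
for k in range(7):
    print(f"harmonic {k + 1}: square {sq[k]:.4f}   triangle {tri[k]:.4f}")
```

Both waves show essentially nothing at the even harmonics. The odd harmonics of the square wave fall off as 1/n, while those of the triangle fall off as 1/n^2, hence the softer buzz.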

As an interesting aside, bipolar transistors have an exponential error characteristic and when operated differentially (as most circuits operate) exhibit mostly odd harmonics. Tubes have a factorial distortion series and thus have intrinsically lower distortion than bipolar transistors. Because they are typically used in single-ended circuits, tubes exhibit primarily second-order distortion and sound better to our ears when distorting. An ideal FET’s distortion is a square-law characteristic ending at the second-order term, providing the lowest theoretical distortion. In practice, higher-order terms are present, but FETs still provide among the lowest distortion when properly biased. They do have other issues so are not a panacea, natch.
 

amirm

Banned
Apr 2, 2010
15,813
37
0
Seattle, WA
Great job Don. I am sure it takes a lot of effort to write such detailed articles and remove a lot of the complexity.
 

Ethan Winer

Banned
Jul 8, 2010
1,231
3
0
75
New Milford, CT
That's a great introduction Don and I feel guilty correcting this one minor error:

A triangle wave, sort of a “buzzy” sine wave, has only even harmonics (multiples of the signal frequency). A square wave, a nastier raspy buzz, has only odd harmonics.

A triangle wave has only odd harmonics, at a lower level than a square wave. The "rule" is if a non-sine wave form is symmetrical it contains only odd harmonics. If it's asymmetrical it has both odd and even harmonics. I'm sure you know that and this was just a typo. I mention this only for the benefit of others who may not know the difference.

--Ethan
 

DonH50

Thanks folks!

@Amir -- Yeah, it took more work than I thought, and I'm still probably riding a line between too much and too little technical content (too much for absolute laymen, too little for anybody that actually knows this stuff). It's actually a little easier to deal with upper level and grad students as I don't have to worry if I am covering enough of the basics.

@muralman1 -- I have in mind to address delta-sigma vs. standard Nyquist converters in a future thread. The simple answer is that a really good Nyquist converter has potentially fewer dynamic issues than an oversampled delta-sigma design and does not require extensive digital noise filters (though requires a much higher-order anti-alias filter before the ADC). In practice, it's not clear to me there's an advantage one over the other, but a lot depends on the implementation. However, you can oversample with a Nyquist converter, so I need to clear up some semantics/terminology first.

@Ethan -- Ouch, yes, thanks! I was thinking of a sinusoidal pulse train (half-wave) since that's what a "real" triangle wave ends up looking like (and I had that on my mind for other reasons). I'll correct that error.

Amir and Ethan, thanks for help on the figures. I need to piddle more with figures -- I went for GIFs to save the bits, but perhaps JPEGs a little larger would let people actually see the plots without having to click on each one (breaks up the flow).
 

Ethan Winer

^^^ My "rule" for images is to use JPG for photos, and GIF files for graphs and text. Using JPG for graphs usually makes larger files, and there are always those little dot-type artifacts around the lines and text. Photos benefit from more than 256 colors, and the lossy artifacts blend in well. But graphs usually have only a few colors, and the razor sharp edges with GIF files maintain clarity. Posting GIF files around 400-500 pixels wide works well.

--Ethan
 

DonH50

Thanks Ethan, I agree, great minds and so forth. I'll stick with GIF for plots, but play around with the size/resolution a bit. Part of the problem may have been the conversion from the raw graphs in my math analysis program to EMFs for Word then to GIF. Now I know I can upload the files, I'll try a more direct approach by exporting straight from Mathcad/Matlab/whatever. Always better with fewer wires in the signal path... :) - Don
 

Ethan Winer

I often use the screen capture feature in my graphics app to copy data from a program to a GIF file. That might be more direct than exporting through several formats. Then again, lossless formats should be fine no matter how many "generations" the data goes through.

--Ethan
 

vinylphilemag

WBF Founding Member
Apr 30, 2010
810
1
328
56
Kelowna, BC
www.vinylphilemag.com

May I humbly suggest PNG images for all types? Unlike JPEGs, PNGs use lossless compression, hence fewer (or even no) artifacts around solid colours.
 

DonH50

@Ethan: No screen capture per se (that I am aware of) in the high-level math programs I am using, but their export functions actually work well and go straight from the original data to whatever format I need. I have a lot of control over the graphs, but they are very high-resolution so I need to tone them down for the web. At work I tend to write in Word and paste in the figures as EMFs, then export everything to PDF (or rather our tech writer does). We have to use Word for most reports so that's what I am used to. In this case, creating the files straight from the math program will bypass any intermediate steps. I have high hopes...

@Rich: I tried PNG from my Word file and it didn't seem to make much difference, but OTOH the GIFs in my post are not too clear compared to what I saw on my PC before uploading so I may try that again, thanks. I have fairly high resolution on my monitors (1920 x 1200 or higher). I'll also have to see if my math programs export PNG (probably). I tried a JPEG export, at which point I remembered that Word does not like the JPEGs from my math s/w (they come out solid black). Not sure who's to blame, but rather than have them point fingers at each other I'll try PNG instead.

@anyone: I noticed when I clicked on my plots in this thread that the files now appear to have a .jpg extension. Does the BBS automatically convert any uploaded figures to JPG? That may explain why they look "fuzzier" than on my screen (prior to uploading).

Curious, and confused, so life is normal - Don
 

LesAuber

Well-Known Member
Jun 21, 2010
141
0
361
Don,
Same here FWIW. When reading I opened the plots and found them adequately legible. I got what you were driving at even though my first thought was ouch. Been years since I'd seen some of that math. I'm interested in seeing what comes next.

One way to represent aliasing and how it gets folded back into the recording is in video. Everyone has seen how fans, propellers, etc seem to turn backwards under some conditions as a result of the frame rate being slower than the rpm. If I'm following correctly this happens with audio too. The sample is there but in the wrong place or out of time.

Ethan can correct me if I'm misinterpreting but by screen capture I believe he means the Windows hotkey function, Alt-PrintScreen, which captures the active window to the clipboard. From there you can paste into Word, a photo editor, etc. I don't know of any Windows-based software that doesn't work with this other than ones that set an overlay like video players. I'm guessing this is his intent since some of his graphs contained the toolbars etc. which are captured too with this method. If you've already been here, sorry.
 

amirm

Don, I made a forum change last night which now shows attached images inline rather than as thumbnails. So that fixes one of the usability issues.

I think the main problem right now is that the images are simply too small. Any resizes will soften the images some. So if there is a way to tell matlab to create your target resolution, say, 640x480, that would fix that problem.

As to forum converting images to JPG, I didn't think it did that but it is possible.
 

DonH50

Les: The real scary thing is that I left out most all the math and it's still probably too much. I'm hoping that even those who haven't used it for a while can scratch their heads a bit and figure it out (guess time will tell). Hey, I left out calculus, what more do you want? :)

I have a scheme I have used for years to explain aliasing involving triangles around fs/2; I was just too lazy to include it (and was afraid the post was getting too long). I shall try to add it to this thread.

Amir: Thanks for that! It is not a killer for me either way; I just prefer to not have to click on the image and try to remember the text whilst clicking back and forth (senility?) As for size, I went for small and probably (ok, obviously) overdid it. I can output most any size, but have been trained by other sites to keep things as small as possible and remain legible. I can bump them up -- how large is reasonable?

Amir and Steve: When I hover my cursor over a picture, the file tag appears to have a ".jpg" extension, thus my query. I don't really care, save if I know it's going to JPEG automatically, I'll probably just start that way.

Thanks all! - Don
 

amirm

Looks like it does indeed convert attached images to JPG. I use external servers and there, they show up as is.
 

Ethan Winer

@Ethan: No screen capture per se (I am aware of) in the high-level math programs I am using

I meant to use the screen-cap feature of a separate graphics program. I've used Paintshop Pro for years, which has a ton of great features for a reasonable cost, but it's no longer sold. I'm pretty sure even basic graphics programs can do this. A good example is capturing screens from Room EQ Wizard, the room acoustic measuring program I use. REW offers graphic export, but it saves images as JPG files which are larger than needed and not perfectly clear. So when I want to save a screen to post online or use in a video, I set the screen size in REW as I want, which auto-sizes the graphs. Then I go into Paintshop Pro and use it to capture just the graph area. Then I can save it in any format I want.

--Ethan
 
