Simple Subtraction Shows Huge Difference Between MP3 and PCM Version of Same Audio!

#1
Last night, I decided to try a little experiment. I took a clip of jazz music from the original CD and excerpted about 40 seconds or so. I saved that as uncompressed WAV. Then I saved a second copy as MP3 at 320kbps.

The top waveform is the original file:
[Image: waveform of the uncompressed original]


This second waveform is the DIFFERENCE between the MP3 and the PCM original:
[Image: waveform of the difference between the MP3 and the PCM original]

As a sanity check, I tested this method with two copies of the PCM audio and they nulled out to zero, a flat line. So the second waveform is all that you're MISSING from 'transparent audio MP3', apparently. I never expected the difference to be so great. I was expecting some high frequency differences, very small in amplitude.
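A minimal Python sketch of the null test described above, assuming both clips are already decoded to equal-length lists of float samples. The helper names (sine, difference, peak) are illustrative only and do not come from any particular audio editor. Two identical copies should leave a residual of exactly zero.

```python
import math

def sine(freq_hz, n_samples, sample_rate=44100, amplitude=0.5):
    """Generate a test tone as a list of float samples."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def difference(a, b):
    """Sample-by-sample subtraction (same as inverting b and summing)."""
    if len(a) != len(b):
        raise ValueError("signals must be the same length to null")
    return [x - y for x, y in zip(a, b)]

def peak(signal):
    """Largest absolute sample value."""
    return max(abs(s) for s in signal)

original = sine(400.0, 4410)      # 0.1 s of a 400 Hz tone
copy = list(original)             # bit-identical second copy
null_peak = peak(difference(original, copy))
print(null_peak)                  # identical copies cancel completely
```

The length check matters: if one file is longer or shifted, the subtraction no longer compares like with like.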

Audibly, the sustained bass notes are MISSING in the summed/difference out of phase version, so the bass fundamentals are the only thing MP3 reproduced accurately. The transients are all different, as are the harmonic overtones on the string bass, all the midrange, and the treble.

Bitrates 128 and above produce similar results. Something to consider the next time you think about compressing that CD to an MP3. ;)
 

Nicholas Bedworth

WBF Founding Member
May 7, 2010
312
0
0
Maui, where else?
#2
If I'm understanding the context of what you're doing here, keep in mind that MP3 is highly lossy, and to my ears, unsatisfactory regardless of the bit rate. There are all kinds of artifacts and distortions (spatial, timbral, whatever) added, and of course, a great deal is lost in translation into Lower Slobovian, in this case.

The (much) more recent and sophisticated codecs such as Windows Media give much-easier-to-take compression, meaning better sound quality, for a given bit rate. And of course there are many truly lossless codecs such as WMA Lossless, FLAC, and so forth.

MP3 sounds, to my ears, lousy even on car radios, and in a decent system, even worse... completely destructive of musical reality.
 
#3
I'm aware that MP3 is highly lossy on the high frequency end; I just wasn't as aware of how much it mucks with the phase throughout the upper bass harmonics and midrange as well. The resulting audio was downright surprising. I expected far less difference audio.

For storage, I use Monkey's Audio and FLAC most of the time. I often rip my CDs to WAV files and keep them on the hard drive for fast access.

Windows Media Encoder is pretty impressive. I use it for my Adventures in Anime Music live stream and at 64kbps, it sounds almost as good as MP3 at 128kbps.

MP3 just lacks so much in terms of dynamics. It's not just the swishy sound. It completely falls apart with encodes from analog cassette tapes, as it can't handle random phase shifts between channels well at all.

My test was a very revealing and enlightening experience.
 
Jul 1, 2010
8,677
2
0
#4
So it would appear, at a glance, that your mp3 lost more than the file had in the first place. Interesting. Of course what matters is what we hear. As an internet radio listener, I'm very familiar with what lossy audio sounds like. I don't hear it at 320kbps (AAC). I really have to listen closely to hear any of it at 256kbps, and by the time I get there I'm not paying much attention to the music, which really misses the point. MHO: I think loss (and jitter, for that matter) is to today's audiophiles what harmonic distortion may have been to us in the 70s, something to get all worked up about when looking at scopes that doesn't have much audible effect, even when the numbers look pretty bad. YMMV.

P
 
Apr 3, 2010
15,814
2
0
Seattle, WA
#5
Just a bit of math:

128kbps MP3 represents 11:1 compression. Inverted, 90% of the file is thrown away and 10% is what you are hearing! Quite remarkable, isn't it? PCM is extremely wasteful in the way it represents audio.

Even 384kbps represents 3.7:1 compression or only 27% of the original file!!!
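To make the arithmetic explicit, here is a small Python sketch. It assumes the source is standard CD audio (44.1 kHz, 16-bit, stereo), which works out to 1411.2 kbps; the function names are just for illustration.

```python
# CD audio: 44,100 samples/s x 16 bits x 2 channels = 1411.2 kbps
CD_BITRATE_KBPS = 44100 * 16 * 2 / 1000

def compression_ratio(codec_kbps):
    """How many times smaller the compressed stream is than CD PCM."""
    return CD_BITRATE_KBPS / codec_kbps

def fraction_kept(codec_kbps):
    """Share of the original bitrate that survives compression."""
    return codec_kbps / CD_BITRATE_KBPS

for kbps in (128, 320, 384):
    print(f"{kbps} kbps: {compression_ratio(kbps):.1f}:1, "
          f"{fraction_kept(kbps) * 100:.0f}% of the original bitrate")
```

Under that assumption, 128kbps keeps about 9% of the bits and 384kbps about 27%, in line with the figures quoted above.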

Now, MP3 codecs also roll off the high frequencies in addition to performing perceptual compression. Other codecs like WMA or AAC do not do this at 128kbps and above (WMA can also avoid it at lower bitrates).
 

Vincent Kars

WBF Technical Expert: Computer Audio
Jul 1, 2010
860
0
0
#6
Might it be that these enormous differences are not a matter of ‘real’ differences between the 2 formats but due to latency of the MP3 encoder?
If it trails a couple of samples behind due to the latency of the encoding process, the mathematical differences will be big by ‘design’
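A rough Python sketch of that idea, using a pure tone instead of music. The 3-sample delay and 5 kHz tone are made-up illustration values, not measurements of any encoder. It shows that subtracting a slightly late copy of an otherwise identical signal leaves a large residual, which can even exceed the original peak.

```python
import math

SAMPLE_RATE = 44100

def sine(freq_hz, n_samples, amplitude=1.0):
    """Generate a test tone as a list of float samples."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def delayed(signal, delay_samples):
    """Shift the signal later in time, padding the start with silence."""
    return [0.0] * delay_samples + signal[:len(signal) - delay_samples]

def peak_of_difference(signal, delay_samples):
    """Peak of (signal - delayed copy), ignoring the silent lead-in."""
    shifted = delayed(signal, delay_samples)
    pairs = list(zip(signal, shifted))[delay_samples:]
    return max(abs(a - b) for a, b in pairs)

tone = sine(5000.0, 4410)                 # amplitude 1.0
aligned_peak = peak_of_difference(tone, 0)
late_peak = peak_of_difference(tone, 3)   # hypothetical 3-sample lag

# For a pure tone the residual amplitude is 2*sin(pi * f * delay / fs)
expected = 2 * math.sin(math.pi * 5000.0 * 3 / SAMPLE_RATE)
print(aligned_peak, late_peak, expected)
```

With perfect alignment the residual is zero; with the small lag it is larger than the tone itself, so a big difference waveform by itself does not prove the audio content differs much.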
 
#7
The increase in peak amplitude caught my attention too. How can you get more than you started with via subtraction? Baffling.

It's mostly transient attacks and overtones in the difference file.

MP3 joint stereo cuts off like a cliff at 16 kHz. But I did find that in Stereo Mode, the LAME encoder will encode without this cutoff filter. I analyzed the spectrum of two versions of an MP3 file and found this difference.

320kbps doesn't sound bad for some types of music, though some transient energy seems lost, compared with the original, based purely on listening. The lower bitrates have that swishy sound on the highs AND loss of dynamics.
 
Jul 1, 2010
8,677
2
0
#8
Are you sure what it says is true - the top waveform is the original? Because it sure looks like the bottom has the better dynamic range and no obvious loss of information. Of course, showing no obvious loss of information would be what you would expect if the compression is doing its job. It should only throw away that which is masked.

P
 

Nicholas Bedworth

WBF Founding Member
May 7, 2010
312
0
0
Maui, where else?
#9
@ Vincent... there are quite a few differences between MP3 and WMA/WMA Pro, in the underlying algorithms, in the overall approach, etc.

Perhaps Amir could throw in a few words on the topic?

Of course these days, compression, at least for audio, is less of a necessity than when either MP3 (~1990) or WMA (~2000) was developed. With today's gear and bandwidth, in much of the world, I'm not sure why one would bother going beyond lossless compression, which usually achieves somewhere between 1.5:1 and 2.5:1, depending on the content.

The sonic differences are present in just about any parameter of evaluation one could imagine.
 
#10
Are you sure what it says is true - the top waveform is the original? Because it sure looks like the bottom has the better dynamic range and no obvious loss of information. Of course, showing no obvious loss of information would be what you would expect if the compression is doing its job. It should only throw away that which is masked.

P
Yup, absolutely. That anomaly has been defying explanation. I'm going to do some tests with sine waves and see if I can figure out what's going on with the phase rotation during compression. Interestingly, the difference audio sounds like a mix-minus, sans the bass sustain. The rest of the music sounds pretty much all there in a vague sort of way.
 
#11
For one thing, the phenomenon is repeatable with music, but not with synthesized tones. First I tried a 400 Hz sine wave. The compressed version subtracted to a perfect null. So too did an FM horn sound I generated with FM synthesis. It seems that when the data is the same in both channels, the compression is pretty accurate. Next is to try mixing different things in Left and Right and then subtracting the MP3 version from such a program.
 
#12
I just made a discovery which invalidates the whole test!

Apparently, when saving as MP3, 24 ms of padding is added to both ends of the file. This throws off the time alignment of the subtraction, resulting in these crazy results.

Oddly enough, when I did the simple sine wave test, the file length remained unchanged when I saved as MP3. But when I created a test file with FM horn in the right and 400 Hz sine in the left, as soon as I saved as MP3, it got 48 ms longer.

I went back and carefully checked the length of my music clips. The compressed version was 48 ms longer too! So it's impossible to sum them accurately. I wonder if MP3 works like Long-GOP MPEG, in that you can only trim files to the nearest GOP, which is 1/2 second in the video world. It seems we may have a corollary to that rule in MP3 audio compression.
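One way to line the files up before subtracting is to search for the offset at which the two signals correlate best. The Python sketch below assumes the 24 ms of padding per end reported above and simulates it with silence around a noise burst; the function name best_lag and the test signal are hypothetical, and a real MP3 decode would not match the original sample for sample the way this simulated copy does.

```python
import random

SAMPLE_RATE = 44100
PAD_SAMPLES = round(0.024 * SAMPLE_RATE)   # 24 ms is about 1058 samples

def best_lag(reference, padded, max_lag):
    """Return the offset into `padded` that best matches `reference`.

    Brute-force cross-correlation: try each lag and keep the one with
    the largest sum of products.
    """
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(r * p for r, p in zip(reference, padded[lag:]))
        if score > best_score:
            best, best_score = lag, score
    return best

# Stand-in for the original clip: a short burst of reproducible noise.
rng = random.Random(0)
original = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

# Stand-in for the encoded file: same audio with silence at both ends.
padded = [0.0] * PAD_SAMPLES + original + [0.0] * PAD_SAMPLES

lag = best_lag(original, padded, max_lag=1500)
aligned = padded[lag:lag + len(original)]
residual_peak = max(abs(a - b) for a, b in zip(original, aligned))
print(lag, residual_peak)
```

Trimming the longer file by the detected lag, rather than padding the shorter one by a guessed amount, avoids being off by a fraction of a millisecond.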
 
#14
I was able to pad the WAV file to the same length to get them to line up. Even lined up, the sound of the difference file had a flangey, phasey sound to it. MP3 definitely rotates the phase, probably several times as the frequency band increases.

No more MP3 for me, unless it's the ONLY available copy of something hard to find.
 

FrantzM

Member Sponsor & WBF Founding Member
Apr 20, 2010
6,464
2
38
#15
Jul 8, 2010
1,232
0
0
70
New Milford, CT
#16
the sound of the difference file had a flangey, phasey sound to it. MP3 definitely rotates the phase, probably several times as the frequency band increases.
In my experience, when people talk about a "flangey phasey" sound what they really mean is the hollow sound of missing frequencies due to comb filtering. That is, the sound of a phaser or flanger effect unit. Phase shift per se isn't usually audible, but the severely skewed response you get from combining a signal with a phase shifted version of itself is definitely audible. Maybe you can clarify what sound you mean? Or, since this is what you heard in the difference signal, I guess it makes sense that it would sound like comb filtering.

--Ethan
 
#17
That's what I said, "The sound of the difference file". :)
I would conclude that the MP3 has a rotating phase where the angle of rotation is frequency-dependent. That, when subtracted from the PCM file, sure would create the effect of a comb filter.
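The simplest frequency-dependent phase shift is a fixed time delay, so here is a Python sketch of what subtracting a delayed copy does to the spectrum. This models only a constant delay, which is an assumption; it does not claim to describe what an MP3 encoder actually does to phase. The 10-sample delay is an arbitrary illustration value.

```python
import math

def difference_gain(freq_hz, delay_samples, sample_rate=44100):
    """Gain of y[n] = x[n] - x[n - delay_samples] at a given frequency."""
    return 2.0 * abs(math.sin(math.pi * freq_hz * delay_samples / sample_rate))

def null_frequencies(delay_samples, sample_rate=44100):
    """Frequencies up to Nyquist where the difference cancels completely."""
    spacing = sample_rate / delay_samples
    count = int((sample_rate / 2) // spacing)
    return [k * spacing for k in range(count + 1)]

nulls = null_frequencies(10)               # hypothetical 10-sample delay
print(nulls)                               # evenly spaced notches: the comb
print(difference_gain(2205.0, 10))         # midway between notches
print(difference_gain(50.0, 10))           # low bass sits near the first notch
```

The notches are evenly spaced, which is the comb shape, and the first one sits at 0 Hz, so under this model low frequencies mostly cancel in the difference while the frequencies between notches come through at up to twice their original level.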
 
Dec 22, 2010
2
0
0
#19
However slow you read, you still can read faster than a narrator can speak.

The version you downloaded is called an "abridged" version. They trim out whole sections to shorten the book.

Just delete it and try one of the versions from the page above.
 
