Over on REG's Audio Forum, Robert E. Greene has recently been making the case that the problems blocking progress toward realism in home music reproduction come down to just two things: (1) how commercial recordings are made, and (2) how speakers radiate energy into the room and how the room reflects/absorbs that energy.
To understand how it all could come down to these factors, you need to understand the goals of music reproduction in the home. Then we can look at why the above two factors are so important to realistic music reproduction at home and why they are such obstacles to realistic reproduction.
"You are there" v. "they are here"
The Goal
This is a non-issue for anyone who pursues "the absolute sound" as defined in The Absolute Sound (TAS) and by purists like me. Surely the goal of accurate reproduction is "you are there," not "they are here."
Interpreting the Goal
The problems arise (1) in interpreting what "you are there" means in terms of perspective and (2) in mapping from the soundfield captured by the mikes to the soundfield heard by our ears when reproduced by the listening room speakers. This “reciprocity” problem is the biggest stumbling block remaining in achieving “you are there” realism at home.
First, opinions differ over whether the goal is to reproduce some idealized listener perspective in the concert hall, or "what the mikes heard." The first is somewhat arbitrary, but can perhaps be agreed upon (say, 10th row center). The second is unknowable in certain absolute ways (spatial perspective, for instance, with widely separated mikes and more than two mikes in any configuration), but clearly deducible in others (frequency response, amount of small details audible, etc.). But "concert hall sound" is the goal of both camps, even though they may disagree on exactly the perspective from which the reproduced result should be judged.
Problems in Getting There
Second, assuming "you are there" is the goal, and we adopt the "10th row center" idealized perspective as the proper interpretation of the goal, it is apparent that the recording/playback system would have to introduce some distortions of perspective (distortions of both space and frequency response) in order to achieve this goal for most commercial recordings, given where microphones are typically located. Compared to 10th row center, the mikes are usually placed very close to the musicians and very high up. Certainly the sound captured by the mikes will have considerably more high frequency content than any audience perspective, both because high frequencies diminish with distance and because many orchestral instruments beam their sound upward, owing to the construction of the instruments and to the reflective floor under the musicians’ chairs.
Why is such a technique almost universally used? One reason has to do with the apparent spatial perspective captured by most stereo arrays of microphones compared to what the human ear/brain hears from any given position. Most stereo mike arrays, no matter whether you listen through headphones or monitor speakers, make things sound farther away than our ears do in a given position. Thus, most stereo mike arrays exaggerate both the absolute distance from the mikes and the relative distances of first row v. last row musicians from the mikes: if you aren't careful in how you set up the mikes, the first row musicians will sound like they are in your lap and the last row will sound like they are in the next county. Positioning the mikes near the front of the stage and high up, through simple geometry, tends to reduce the differences in mike-to-musician distances between the front row and the back row. If done artfully, such positioning can yield a much more lifelike perspective, getting the front row folks out of your lap and increasing the presence of the back of the orchestra, while still yielding a good feeling of front-to-back depth of the group.
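The "simple geometry" can be sketched with a few lines of Python. This is only an illustration under assumptions that are not part of the argument above: the musicians and the mike are treated as points, the distances (2 m to the front row, 12 m to the back row) are made up, and the level figure assumes free-field inverse-square falloff, which a real hall does not strictly obey. The function names are hypothetical.

```python
import math

def distance_ratio(front_m, back_m, height_m):
    """Ratio of the mike-to-back-row distance to the mike-to-front-row distance.

    front_m and back_m are horizontal distances from the mike to each row;
    height_m is how far the mike sits above the instruments.
    """
    front = math.hypot(front_m, height_m)
    back = math.hypot(back_m, height_m)
    return back / front

def level_difference_db(front_m, back_m, height_m):
    """Front-row level advantage in dB, assuming inverse-square falloff."""
    return 20.0 * math.log10(distance_ratio(front_m, back_m, height_m))

# Hypothetical layout: front row 2 m away horizontally, back row 12 m away.
for height in (0.0, 3.0, 6.0):
    ratio = distance_ratio(2.0, 12.0, height)
    db = level_difference_db(2.0, 12.0, height)
    print(f"height {height:.0f} m: distance ratio {ratio:.2f}, level difference {db:.1f} dB")
```

With the mike at instrument height the back row is six times as far away as the front row; raising the mike shrinks that ratio, which is the geometric effect described above.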
But you can never really know how such recordings are supposed to sound from either a tonal balance perspective OR a spatial perspective. As to tonal balance, if the engineer does not include some high frequency roll-off, the sound captured by mikes with flat frequency response will be far too bright compared to any audience perspective. And even if the engineer specifies an exact roll-off curve in the recording notes, this is still just his judgment about what makes that recording sound balanced on some arbitrary monitoring system. The home listener, unless using the same room, speakers, speaker and listener positioning, acoustical treatment, etc., will not hear what the engineer heard, and, even if he does match the engineer’s setup as closely as possible, the listener’s tastes may not match those of the engineer.
And as to spatial perspective, experiments such as cupping your hands behind your ears suggest rather strongly that increasing the space between the stereo mikes will tend to exaggerate the spatial qualities of a group of instruments through some sort of auditory parallax or triangulation effect. Compared to normal listening, if you effectively increase the distance between your ears by cupping your hands behind them, front-to-back depth is exaggerated, side-to-side perspective is unnaturally focused, and individual instruments acquire an exaggerated sense of palpability or three-dimensionality. Such exaggerated spatial effects are probably actually present on recordings of orchestras made with widely separated stereo mikes.
Whether he knew it or not, HP, through TAS, pushed the high-end audio industry in directions from which it has never recovered. He did so by using such Mercury and RCA orchestral recordings as touchstones and then trying to reproduce something like an audience perspective from them when played back at home without actual tone controls or other overt equalization. The tonal balance problems caused by the non-flat mikes and up-high/in-close positioning used for such recordings favor electronics which greatly soften the highs (think traditional tube sound) and speakers which exaggerate the bass and slope off the highs (think big Infinity and Genesis products). And the fact that exaggerated spatial effects are in fact present on the recordings led to an obsession with the ability of equipment to reproduce three-dimensional soundstaging and even three-dimensional images of individual instruments.
Two-Channel Movements Toward the Goal
In a letter published in Issue 77 of TAS, I argued that even such recordings (i.e., ones made with widely spaced in-close/up-high mike arrays) could be used as references if only they were made with some sort of “Rosetta Stone” technique which would thoroughly document the differences between the live sound in the hall as heard from a preferred audience perspective and the live mike feed and tape recording when played back on a known standard monitoring system. Certainly this would be a giant step in the right direction, but such a method is still beset with difficulties stemming from uncontrolled variables, including those related to the acoustics of the listening room v. those of the studio. What is really needed is a stereo microphone arrangement which, played back over two loudspeakers in the listening room, delivers tonal balance and spatiality quite similar to what a listener would have heard in the recording hall from the physical position where the microphones are placed. That is the “reciprocity” I'm talking about.
Single-point (e.g., Blumlein, M-S, X-Y) and quasi-single-point (e.g., ORTF) setups seem to yield less of the perspective distortion effect and can therefore be placed farther from the musicians with good results. Blumlein does this best of all and, if the microphone response is subjectively correct, can produce a believable facsimile of a 10th-row-center perspective simply by placing the stereo mike array in the 10th-row-center physical position. Nevertheless, most practitioners still mount even Blumlein mike arrays fairly high up to reduce the spatial distortion. We are still left wondering what such recordings should really sound like. Also, Blumlein only works its spatial magic for the front quadrant of 90 degrees; the hall ambience coming from angles outside that quadrant is not well mapped in the sense of reciprocity between concert hall and listening room.
[Continued in Part Two . . . .]