Music Recording Techniques For Stereo Playback

© 2008 Dave Luepke, Carrollton, TX

The purpose of this paper is to provide information, in a straightforward manner, to people who want to gain a better understanding of microphone placement for recording music, whether instrumental or vocal, in an effort to make better recordings. This paper focuses on stereo recording techniques, and will not address surround sound recording.

There are thousands of people making recordings who have never gotten any basic information about microphone placement techniques, and so they simply set up a couple of mics in what they think is a good place. Often, the recordings have too much or too little bass, poor or exaggerated left/right stereo imaging, poor depth (front/back) balance among various sections of the instruments or voices, or too much or too little room "ambience". But to those folks, I say, take heart! You're not alone! Many commercially released recordings from major labels have obvious deficiencies as well. Classical music listeners have no doubt heard orchestral recordings where the violins sound "on mic" and the brass or percussion sound like they're at the other end of the concert hall. Hopefully, the following information will help to reduce such deficiencies, and improve your recording quality.

For music recording which is intended for stereo playback, a variety of microphone placement techniques have become well-established over many decades. Among them are A/B, Decca tree, MS, ORTF, X-Y, close miking, and others. Each of these techniques has proven to possess certain useful and desirable characteristics in various situations. Each of them also possesses characteristics which are undesirable in certain situations. A knowledge of these characteristics will help a person to make better recordings.

First, let's go over some basics of how we hear. Our two ears and the head between them are more than just a nice-looking way to design a human. They're fundamental to how we determine the direction from which a sound is coming; in other words, how we locate a sound source. This ability is called "localization". Two fundamental and dominant cues allow this to happen: the difference in a sound's "arrival time" at each ear, and the difference in sound level and quality at each ear. If a sound is produced directly in front, behind, above, or below a person (or anywhere else in that plane), there is no difference in arrival time or level between the two ears. In these cases, our brain is able to use other information to determine the source direction - above, front, back and below all have their own unique characteristics, but, in general, they are more subtle than the lateral differences. As a result, our localization ability is much more acute in the horizontal plane than in the vertical plane.

When a sound comes from some "off-axis" direction (i.e., left or right), it arrives at one ear before the other (arrival time difference), and, because the head acts as a barrier, there is also a level difference (which varies with frequency) between the two ears. Our brain processes these differences and determines the direction of the sound source. An everyday example: when somebody somewhere in a room talks to us, we are able to tell in what direction that person is.
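To get a feel for how small these arrival time differences are, here is a rough sketch in Python. It rests on a simplified model that is my assumption for illustration only: the ears are treated as two points about 0.18 m apart with nothing between them, the source is far away, and sound travels at 343 m/s. The function name and constants are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed: air at roughly 20 degrees C
EAR_SPACING_M = 0.18         # assumed: rough adult ear-to-ear distance

def arrival_time_difference_ms(angle_deg, spacing_m=EAR_SPACING_M):
    """Time difference (ms) between two points for a distant source.

    angle_deg is measured from straight ahead (0) toward one side (90).
    Simplified model: extra path length = spacing * sin(angle).
    """
    extra_path_m = spacing_m * math.sin(math.radians(angle_deg))
    return 1000.0 * extra_path_m / SPEED_OF_SOUND_M_S

for angle in (0, 15, 45, 90):
    print(f"{angle:3d} degrees: {arrival_time_difference_ms(angle):.2f} ms")
```

Even for a source directly to one side, this model gives a difference of only about half a millisecond. A real head lengthens the path somewhat, so treat the numbers as ballpark figures.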

On the other hand, our ability to process arrival time and level differences becomes seriously degraded at low frequencies. This is why, in stereo systems, it is sometimes acceptable to have only one sub-woofer rather than two. For very low tones (lower than about 70 Hz), we have a hard time knowing where the sound is coming from.

Intimately related to the level difference phenomenon is the fact that the difference becomes more significant as the frequency of the tone is raised. In other words, higher frequency tones lose more level than lower frequency tones as they diffract around the head to the ear on the other side.
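A rough way to see why frequency matters is to compare the wavelength of a tone with the size of the head: an obstacle only casts much of a sound "shadow" when it is comparable to, or larger than, the wavelength. The sketch below simply computes wavelength as speed of sound divided by frequency. The 343 m/s speed and 0.18 m head width are assumed round numbers, not measurements from this paper.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed: air at roughly 20 degrees C
HEAD_WIDTH_M = 0.18          # assumed: rough adult head width

def wavelength_m(frequency_hz):
    """Wavelength in meters of a tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz

for frequency_hz in (70, 500, 2000, 8000):
    length_m = wavelength_m(frequency_hz)
    ratio = length_m / HEAD_WIDTH_M
    print(f"{frequency_hz:5d} Hz: wavelength {length_m:.2f} m "
          f"({ratio:.1f} head widths)")
```

At 70 Hz the wavelength is close to 5 meters, far larger than the head, so the head is hardly an obstacle at all. At several kHz the wavelength is shorter than the head is wide, and the shadowing becomes significant.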

The key point here is that differences in "arrival time" and sound "level" account for the majority of our sound localization ability.

However, the shape of the ear also plays an important role. Because the ear is not symmetrical, sound which arrives from different directions is passed to the middle and inner ear with different "transfer" characteristics. In the course of reflecting off the various curves and shapes of the outer ear (the pinna), some frequencies are accentuated and some are attenuated. Thus, each source direction has its own unique "signature" by the time it reaches the eardrum. Our brain is very familiar with these differences, and uses them to help determine the direction of the sound. This, along with head/neck/body shape, is the main clue which helps us to determine source direction for above/front/behind/below sound sources.

We can also determine if a sound source is close to us, or farther away. In a nutshell, this is dependent on two main factors: the overtone structure of the sound, and the relative amounts of "direct" sound and "reflected" (or "reverberant") sound. Reflected sound is sound which has struck a surface and bounced toward the listener. For example, when a person is standing near you and talking, you can hear more detail in their voice than if they are some distance away, because the direct sound from their mouth significantly dominates over the sound reflected from the room’s surfaces. This is why, especially in a very "live" room or a very noisy room, it's easier to understand a person who is close to you than one who is far away: the farther away, the more significant the reflected, or "reverberant", sound becomes, causing "smearing" of the details and a loss of clarity.
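The balance between direct and reverberant sound can be roughed out with a very simple model. The sketch below assumes (and these are my simplifications, not claims from this paper) that the direct sound drops with distance as it would from a small source in open air, that the reverberant level is about the same everywhere in the room, and that there is some "critical distance" at which the two are equal. The 2 meter default is a made-up value for illustration; real rooms vary widely.

```python
import math

def direct_to_reverberant_db(distance_m, critical_distance_m=2.0):
    """Direct sound level relative to reverberant level, in dB.

    Assumed model: direct level falls 6 dB per doubling of distance,
    reverberant level is constant, and both are equal at the
    (hypothetical) critical distance.
    """
    return 20.0 * math.log10(critical_distance_m / distance_m)

for distance_m in (0.5, 1.0, 2.0, 4.0, 8.0):
    ratio_db = direct_to_reverberant_db(distance_m)
    print(f"{distance_m:4.1f} m: direct is {ratio_db:+.1f} dB "
          f"relative to reverberant")
```

In this model, every doubling of distance costs the direct sound about 6 dB relative to the room sound, which is why a listener (or a microphone) that backs away from the source quickly ends up hearing mostly room.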

You may ask: What does all of this have to do with stereo recording techniques? First, in order to make a high quality stereo recording, a person needs an understanding of how we hear and how we determine where a sound originates and how far away it is. Secondly, and this is where it really hits home: with a stereo system, both ears hear both speakers (two sources), and those speakers are a fixed distance from the listener, with a fixed spacing between the speakers.

Two goals of producing good stereo recordings are to convince our brains that various musical instruments or voices are located across a space from one speaker to the other and places in between, and to make them sound closer or farther away than the speakers really are. A big part of the recording engineer’s ability to make that happen depends on understanding human hearing, and then applying that understanding when deciding on microphone placement.

The size and acoustics of the room and the ensemble also play an important role in deciding what mic technique and placement is best, but that's a topic for a little farther down the road. Before that, let's look at some well-established techniques. Later, I’ll give some examples of why one technique may work better than another in a particular situation.

ORTF. (See photo.) This technique was developed by the French government broadcasting organization, "Office de Radiodiffusion Télévision Française", which accounts for the acronym name. The technique consists of using a pair of directional microphones with their capsules (the business end of the mic) placed anywhere from about 8 inches to as much as 15 inches apart and angled away from each other by anywhere from about 45 degrees to as much as 180 degrees. The arrangement most often quoted as the ORTF standard is a pair of cardioids spaced 17 cm (about 6.7 inches) apart and angled 110 degrees from each other. Note that the specifics of the technique were developed in concert with the standardization of microphones used by ORTF, and that the use of other microphones, along with various ensemble and room sizes, accounts for the wide latitude of placement and angle in common usage. Variations in specific microphone choice, spacing and angle aside, the concept is generally referred to as ORTF.

This technique delivers both of the primary aspects of directional localization: Level differences due to the angled orientation, (the mic which is aimed more toward a source will pick up that source louder than the one which is aimed more away from it); and time differences due to the distance between the two microphone capsules. By changing the distance between the capsules, and the angle between them, a skilled engineer has a great deal of control over the resulting stereo image.
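The two cues can be estimated with a little arithmetic. The sketch below is an idealized model, not a description of any real microphone: it assumes perfect cardioid patterns that don't change with frequency, a distant source, and a 343 m/s speed of sound. The defaults of 17 cm and 110 degrees are the commonly quoted nominal ORTF values, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C

def cardioid_gain(off_axis_deg):
    """Ideal cardioid sensitivity: 1.0 on axis, 0.0 directly behind."""
    return 0.5 * (1.0 + math.cos(math.radians(off_axis_deg)))

def near_coincident_cues(source_deg, spacing_m=0.17, included_angle_deg=110.0):
    """Return (level_difference_db, time_difference_ms) for a distant source.

    source_deg is measured from the center line, positive toward the right.
    Positive results mean the right mic is louder / receives the sound first.
    """
    half_angle = included_angle_deg / 2.0
    left = cardioid_gain(source_deg + half_angle)   # left mic aimed at -half_angle
    right = cardioid_gain(source_deg - half_angle)  # right mic aimed at +half_angle
    floor = 1e-6  # avoids log of zero when a source sits in a mic's null
    level_db = 20.0 * math.log10(max(right, floor) / max(left, floor))
    extra_path_m = spacing_m * math.sin(math.radians(source_deg))
    time_ms = 1000.0 * extra_path_m / SPEED_OF_SOUND_M_S
    return level_db, time_ms

for angle in (0, 15, 30, 60):
    level_db, time_ms = near_coincident_cues(angle)
    print(f"{angle:2d} degrees: {level_db:4.1f} dB, {time_ms:.2f} ms")
```

For a source 30 degrees to the right, this model gives roughly 5 dB of level difference and about a quarter of a millisecond of time difference. Widening the angle mostly increases the level difference, while widening the spacing increases the time difference. Setting spacing_m to 0 leaves only the level difference.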

The pickup pattern (cardioid, hypercardioid, etc.) of the specific microphones used, along with the angle between them, will determine how subtle or obvious the level differences will be, and will also play a role in the relative quality and quantity of ambient sound picked up. It should be noted that a microphone's pickup pattern varies with frequency, and different microphones have different characteristics in this regard. So, for example, not all cardioid microphones have the same pickup pattern characteristics. This has a definite and important effect on the quality of "off-axis" sound.
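The textbook pickup patterns can all be written as a blend of an omnidirectional part and a bi-directional (figure-8) part. The sketch below uses that idealized formula with the usual textbook blend values. Keep in mind the point made above: real microphones drift away from these ideal shapes as frequency changes, so this is a starting point, not a prediction of how a particular mic behaves. The names here are hypothetical.

```python
import math

# Omni share of each ideal first-order pattern (textbook values).
PATTERNS = {"omni": 1.0, "cardioid": 0.5, "hypercardioid": 0.25, "figure-8": 0.0}

def pattern_gain(pattern, off_axis_deg):
    """Ideal sensitivity at an off-axis angle; negative means inverted polarity."""
    omni_share = PATTERNS[pattern]
    return omni_share + (1.0 - omni_share) * math.cos(math.radians(off_axis_deg))

def pattern_level_db(pattern, off_axis_deg):
    """Level relative to on-axis, in dB (-inf in a perfect null)."""
    magnitude = abs(pattern_gain(pattern, off_axis_deg))
    if magnitude < 1e-9:
        return float("-inf")
    return 20.0 * math.log10(magnitude)

for name in PATTERNS:
    levels = ", ".join(f"{angle} deg: {pattern_level_db(name, angle):.1f} dB"
                       for angle in (0, 90, 180))
    print(f"{name:14s} {levels}")
```

Note how the ideal cardioid is down about 6 dB at the side and has its null directly behind, while the ideal hypercardioid rejects more from the side but picks up some sound from the rear.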

X-Y. (See photo.) This consists of a pair of directional microphones, one placed immediately above the other, with their capsules aligned horizontally and angled anywhere from 45 to as much as 150 degrees to each other, depending on the specific mics and the situation. In other words, two mics are positioned at essentially the same place, but one is pointed more to the left and the other is pointed more to the right. This is also often referred to as a "coincident" mic technique.

Because the capsules are as close to being in the same place as is practical, there is no difference in the time that a sound reaches one or the other. As a result, this technique relies solely on level differences resulting from the fact that the mics are aimed in different directions.

A potential shortcoming of this technique is that there is no "uncorrelated" sound recorded. ("Uncorrelated sound", in this context, is sound which arrives at one microphone at a different time, and with a different pattern of reverberant sound, as compared to the other microphone.) So, because both mics are picking up the same sound, just at different levels, the impression of "spaciousness" can be noticeably lacking (as a result of the lack of uncorrelated sound). However, that can often be remedied by adding a little artificial stereo reverberation during editing.

A/B. This technique generally uses two omnidirectional mics, although directional mics can be used. They are spaced anywhere from about 2 feet to several yards apart and are usually oriented parallel to each other, although they may be angled outward. This technique is often referred to as "spaced omnis". It tends to provide a very "spacious" sounding recording, and, because omnis are commonly used, low frequency (bass) quality is often better than with directional microphones. A potential downside is placing the mics too far apart, resulting in an "echo" effect between the two playback speakers, although this is rare except with very large spacing (greater than several yards) or in certain rooms with side wall reflection issues.

Because omnidirectional mics are generally used for this technique, most of the directional information is created by arrival time differences (which are significant when the mics are spaced several yards apart). Also, when using omnis, although one mic is closer to some sound sources than the other mic, sound level differences don't play a major role in the directional information: even with a large distance between the mics, the difference in distance to a source is often not enough to produce more than a few dB of level difference between them. As a result, the impression of "spaciousness" can be quite noticeable, but source localization is often indistinct.
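A quick calculation shows why time differences dominate with spaced omnis. The sketch below assumes ideal omnidirectional mics, direct sound only (no room reflections), level falling off in proportion to distance, and a 343 m/s speed of sound. The layout, the function name, and the example numbers are hypothetical and chosen only for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C

def spaced_pair_cues(source_x_m, source_y_m, spacing_m=3.0):
    """Return (level_difference_db, time_difference_ms) for a spaced omni pair.

    The mics sit on the x axis, spacing_m apart, centered on the origin.
    The source is at (source_x_m, source_y_m), with y measured out in front
    of the mics; it must not sit exactly on a mic. Positive results mean
    the right mic is louder / receives the sound first.
    """
    half_spacing = spacing_m / 2.0
    dist_left = math.hypot(source_x_m + half_spacing, source_y_m)
    dist_right = math.hypot(source_x_m - half_spacing, source_y_m)
    time_ms = 1000.0 * (dist_left - dist_right) / SPEED_OF_SOUND_M_S
    level_db = 20.0 * math.log10(dist_left / dist_right)
    return level_db, time_ms

level_db, time_ms = spaced_pair_cues(2.0, 6.0)
print(f"level difference {level_db:.1f} dB, time difference {time_ms:.1f} ms")
```

With the mics 3 meters apart and a source 2 meters right of center and 6 meters back, this model gives a time difference of roughly 2.7 ms but a level difference of only a little over 1 dB.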

M/S (Mid/Side). This is actually a combination of mic placement and electronic signal processing. A bi-directional mic is placed so that it faces directly to the left and the right, and a directional mic (e.g., a cardioid) is placed immediately above or below it, pointed straight forward. This technique has the advantage of allowing the mixing engineer to vary the amount of "blend" between the two mics in a manner which provides more or less ambient sound, along with a broader or narrower stereo image. In effect, at mix time, the ratio between the direct sound from the source and the ambient sound from the room can be controlled.

Since it involves some basic signal processing, additional equipment is needed, along with a person who understands how to use it. Unless a purpose-built M/S mic is used (and there are several), it can also be an unsightly mic mounting for live performances. However, these factors should not deter a person from using the technique when it is warranted, because it can produce a very nice stereo image. It should be noted that, as with the X-Y technique, there is no uncorrelated sound, so the ambient quality (not quantity) can still be very lacking in spaciousness.
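The signal processing in question is usually a simple sum and difference of the two mic signals. The sketch below shows that arithmetic on plain lists of sample values. It assumes the side mic is wired so that sound from the left produces a positive signal (flip the sign of the side signal if yours is the other way around), and it leaves out any overall level scaling. The function and parameter names are hypothetical.

```python
def decode_mid_side(mid, side, side_gain=1.0):
    """Turn mid and side sample lists into left and right sample lists.

    side_gain sets the blend: 0.0 gives mono (mid mic only), and larger
    values give a wider image with more of the side mic's pickup.
    Assumes the positive lobe of the side mic faces left.
    """
    left = [m + side_gain * s for m, s in zip(mid, side)]
    right = [m - side_gain * s for m, s in zip(mid, side)]
    return left, right

mid_samples = [1.0, 0.5]
side_samples = [0.5, -0.25]
print(decode_mid_side(mid_samples, side_samples))
print(decode_mid_side(mid_samples, side_samples, side_gain=0.0))
```

Because the blend is applied after the fact, the same pair of recorded tracks can be decoded again and again with different side_gain values until the image sounds right.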

Summary of "stereo pair" techniques: Through placement and choice of microphones, the ORTF, X-Y, and A/B techniques all provide the opportunity to control the relative amounts of direct sound and reverberant sound, and to control the apparent horizontal spread and near/far placement of the ensemble in the stereo "stage". The M/S technique also provides this, but requires more equipment. Choosing the best technique and finding just the right place to put the mics is a combination of the science explained here, the ensemble being recorded, the sound qualities of the room, and experience. So, especially for live location recording, knowing something about the science is essential to getting off to a good start, but to become highly skilled at it, a person will need to gain experience by recording a wide variety of ensembles in a wide variety of venues, and objectively analyze and critique each recording in an effort to improve one's skill.

Close miking. This is just as the name implies: microphones placed close to the various sound sources, as a means of achieving some isolation of their sound from the sound of other sources in the ensemble, and from the ambient sound in the room. Whether it is a jazz quartet or a 100-piece orchestra, close miking allows the engineer to have more control over the balance of the various instruments, and to create a stereo image which differs from the actual conditions in the recording venue. This can be advantageous when recording outdoors, or in a room which doesn't have the most desirable acoustics, or when a particular "sound stage" is desired for the reproduced sound.

However, it also requires many more mics and associated equipment, and more time to set up. For high quality results, it also requires the engineer to have a solid understanding of instrument sound radiation patterns, sound pressure levels, and the characteristics of the specific mics which are used. When doing the final mix, it will also require some artificial reverberation and EQ-ing, as well as a skilled mixing engineer, in order to achieve a believable audio balance and perspective.

There are other techniques, which are typically combinations of the ones noted above, but this overview will allow us to discuss the usage of each of these popular basic techniques.

In the next section of this paper, which I hope to finish soon, I’ll take a few example scenarios and discuss the pros and cons of various techniques in those situations.
