The difference between monophonic (mono) and stereophonic (stereo) sound is the number of channels used to record and play back audio. Mono signals are recorded and played back using a single audio channel, while stereo signals are recorded and played back using two audio channels. As a listener, the most noticeable difference is that stereo sounds are capable of producing the perception of width, whereas mono sounds are not.
Mono vs. Stereo Audio Files
Playback systems that make use of two speakers are referred to as stereo systems. Stereo audio files, such as stereo MP3 and WAV files, contain left channel and right channel information that tells the left and right speakers when to push and pull air.
If you’ve ever looked at the waveform of a stereo audio file within a digital audio workstation (DAW), you’ve likely noticed that the file contains two waveforms. Each waveform represents a single channel of audio.
Mono audio files only contain a single audio channel.
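To make the channel difference concrete, here is a minimal Python sketch that builds a short mono WAV and a short stereo WAV in memory and reads back how many channels each one stores. The helper names (make_sine_wav, count_channels) and the tone settings are illustrative assumptions, not part of any particular tool; only the standard library's wave and struct modules are used.

```python
import io
import math
import struct
import wave


def make_sine_wav(n_channels, freq_hz=440.0, seconds=0.1, rate=44100):
    """Return the bytes of a 16-bit PCM WAV holding a sine tone.

    Hypothetical helper for illustration: every channel gets the same tone.
    """
    frames = bytearray()
    for i in range(int(seconds * rate)):
        sample = int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        # One frame holds one sample per channel, interleaved (L, R for stereo).
        frames += struct.pack("<" + "h" * n_channels, *([sample] * n_channels))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(n_channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def count_channels(wav_bytes):
    """Return the number of audio channels stored in a WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnchannels()


mono_wav = make_sine_wav(n_channels=1)
stereo_wav = make_sine_wav(n_channels=2)
print("mono channels:", count_channels(mono_wav))
print("stereo channels:", count_channels(stereo_wav))
```

For the same duration and sample rate, the stereo file carries twice as much sample data as the mono file, because every frame stores one sample for the left channel and one for the right.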
If you're interested in learning more about how speakers work, check out our Studio Monitor Buyer's Guide.
Mono vs. Stereo Playback
Stereo systems are capable of creating the impression of sound source localization. Sound source localization refers to the human ability to locate the position of a sound source within a space.
For example, if you hear a dog barking, it’s relatively easy to determine the direction the sound is coming from, and how far away the sound source (the dog) is. Most people should be able to localize sounds with decent accuracy, even with their eyes closed.
It makes sense to assume that you’d perceive the sound produced by a stereo system to come from two distinct sound sources: the left speaker and the right speaker. In some situations, you will perceive sound coming from two different directions, but this isn’t always the case.
The human brain is easy to trick because it relies on a handful of simple cues to localize sounds. These cues include timing differences between sounds reaching your left and right ear, sound wave frequency, sound wave pressure levels, dynamic range, and reverberation amount.
Stereo systems exploit how gullible your brain is to create the “impression” of sound source localization between the system’s left and right speakers.
For example, when the left and right speakers play the exact same signal, you’ll perceive the source of the sound to be positioned directly between the speakers; this is referred to as a phantom mono sound source (often called a phantom center) because the true sound sources (the speakers) are positioned out to the sides.
Your brain relies on sound wave timing differences to decipher the left/right positioning of a sound. A sound source closer to your left ear will produce sound waves that reach your left ear before reaching your right ear. Even though these timing differences are small, they help your brain localize the sound.
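To give a sense of how small these timing differences are, the sketch below estimates the arrival-time gap between the two ears for a distant source at a given angle. It is a deliberately simplified model: the ear spacing (0.18 m), the speed of sound (343 m/s), and the straight-line path difference d * sin(angle) are all assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C
EAR_SPACING_M = 0.18        # assumed: rough distance between the ears


def interaural_time_difference(azimuth_deg):
    """Approximate arrival-time gap between the ears, in seconds.

    azimuth_deg is the source direction: 0 is straight ahead, negative is
    to the listener's left, positive is to the right. A negative result
    means the sound reaches the left ear first. Uses the simple far-field
    path difference d * sin(angle), which ignores the shape of the head.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return extra_path_m / SPEED_OF_SOUND_M_S


for angle in (-90, -30, 0, 30, 90):
    itd_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>4} degrees: {itd_us:+.0f} microseconds")
```

Under these assumptions the gap never exceeds roughly half a millisecond, and it is zero for a source straight ahead, which matches the idea that identical signals from both sides are heard as centered.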
When you mirror this process using a second speaker, and you feed both speakers the same signal, your brain assumes the sound source is in front of you.
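As differences, such as small timing or tonal changes, are introduced to one of the signals, the two speakers no longer deliver identical sound to your ears, and the sound they produce will be perceived as wider.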
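One rough way to put a number on how different the two channels are is to measure their correlation, in the spirit of the correlation meters found in many DAWs. The sketch below is an illustration under stated assumptions, not a model of perception: the helper names are hypothetical, random noise stands in for real audio, and correlation is only a proxy for perceived width.

```python
import math
import random


def channel_correlation(left, right):
    """Normalized correlation of two equal-length channels (-1.0 to 1.0)."""
    dot = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return dot / energy if energy else 0.0


def blend(shared, independent, amount):
    """Replace `amount` (0.0 to 1.0) of the shared signal with other content."""
    return [(1.0 - amount) * s + amount * n for s, n in zip(shared, independent)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
left = [rng.uniform(-1.0, 1.0) for _ in range(4410)]
other = [rng.uniform(-1.0, 1.0) for _ in range(4410)]

for amount in (0.0, 0.5, 1.0):
    right = blend(left, other, amount)
    corr = channel_correlation(left, right)
    print(f"difference {amount:.1f}: correlation {corr:+.2f}")
```

Identical channels give a correlation of about 1, which corresponds to the centered phantom image. As more independent content is blended into the right channel, the correlation falls toward 0.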
Width (X-axis) is just one of the three dimensions you’re capable of perceiving via the use of a stereo system. The other dimensions include height (Y-axis) and depth (Z-axis). Collectively, these three dimensions form a 3-D space known as a stereo image.
David Gibson’s book, The Art of Mixing, does a great job of visualizing stereo imaging. In the following video, you’ll be able to see and hear how track elements have been positioned within the stereo field of different songs.
Note: Make sure you listen to the audio in the following video using a pair of headphones, or a stereo playback system. If you choose to use a stereo playback system, ensure that your head and the two speakers form an equilateral triangle. Each speaker should be angled inward by about 30° so that it points directly at you; this will ensure that you perceive the stereo image as intended.
Frequency dictates the height at which you perceive sound within a stereo field. High-frequency sounds localize themselves above low-frequency sounds. For example, the hi-hats in a song will sound as though they’re positioned above the bass guitar.
Depth is affected by a sound’s relative level, dynamic range, and reverberation amount. Sounds with a low level and less dynamic range tend to appear toward the back of the stereo field, as do sounds containing excessive reverb.
Mono playback systems use one speaker and can only produce a two-dimensional image consisting of height and depth. Two speakers are required to create the directional timing differences that your brain needs to perceive width.
Mono vs. Stereo Recording
When you record a sound source using a single microphone, you capture a single channel of audio. Playing back a mono recording like this can be achieved using a single speaker, or a pair of speakers.
Mono sounds that are played back through a stereo system will play back in dual mono. The single channel of audio is duplicated and sent to both the left and right speakers.
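The duplication step can be sketched in a few lines of Python. The function name and the sample values are hypothetical; the point is only that every stereo frame ends up with the same value in its left and right slots.

```python
def mono_to_dual_mono(samples):
    """Duplicate each mono sample into a (left, right) frame."""
    return [(sample, sample) for sample in samples]


mono = [0.0, 0.3, -0.6, 0.9]  # hypothetical mono samples
dual_mono = mono_to_dual_mono(mono)

for left, right in dual_mono:
    print(f"left={left:+.1f}  right={right:+.1f}")
```

Because both channels are identical, a dual mono signal played over two speakers is heard as a single source centered between them rather than as a wide image.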
To capture true stereo recordings, you need to use two microphones. When you process a stereo recording, you’ll need to pan one of the microphone recordings to the left, and the other to the right.
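As a rough illustration of the panning step, the sketch below mixes two microphone recordings into left/right frames, with one panned hard left and the other hard right. The constant-power pan law used here is one common choice and is an assumption of this sketch, as are the function names and the sample values; a DAW may use a different pan law.

```python
import math


def pan_gains(pan):
    """Left/right gains for pan in [-1.0, 1.0] using a constant-power law.

    -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # -1 -> 0, 0 -> pi/4, +1 -> pi/2
    return math.cos(angle), math.sin(angle)


def mix_to_stereo(tracks):
    """Mix (samples, pan) pairs of equal length into (left, right) frames."""
    length = len(tracks[0][0])
    frames = [(0.0, 0.0)] * length
    for samples, pan in tracks:
        gain_l, gain_r = pan_gains(pan)
        frames = [(l + s * gain_l, r + s * gain_r)
                  for (l, r), s in zip(frames, samples)]
    return frames


mic_a = [0.2, 0.4, -0.1]  # hypothetical samples from the first microphone
mic_b = [0.1, -0.3, 0.5]  # hypothetical samples from the second microphone

stereo = mix_to_stereo([(mic_a, -1.0), (mic_b, 1.0)])
for left, right in stereo:
    print(f"left={left:+.2f}  right={right:+.2f}")
```

With hard panning, the left channel contains essentially only the first microphone and the right channel only the second. Moving either pan value toward 0.0 shares that microphone between both channels and narrows the image.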
The following video by Steven Law contains a back-to-back comparison of a mono guitar recording and stereo guitar recording. The stereo microphone technique he's using is called the X-Y technique; it tends to produce a moderately wide stereo image that is also mono compatible. You can achieve an even wider stereo image by angling the microphones outward further, or by using different acoustic guitar recording techniques.
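Mono compatibility refers to how well a stereo signal survives being folded down to a single channel. The sketch below shows the two extremes using hypothetical sample values. Averaging the two channels is assumed as the fold-down method here; real playback systems may sum or scale the channels differently.

```python
def fold_to_mono(frames):
    """Average each (left, right) frame into one mono sample."""
    return [(left + right) / 2.0 for left, right in frames]


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


signal = [0.5, -0.25, 0.75, -1.0]  # hypothetical samples

in_phase = [(s, s) for s in signal]   # both channels agree
inverted = [(s, -s) for s in signal]  # right channel polarity flipped

print("in-phase peak after fold-down:", peak(fold_to_mono(in_phase)))
print("inverted peak after fold-down:", peak(fold_to_mono(inverted)))
```

When the channels agree, the mono fold-down preserves the signal. When one channel is the polarity-inverted copy of the other, the two cancel completely. Real stereo recordings fall somewhere between these extremes, and the more the two channels cancel, the less mono compatible the recording is.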
Many handheld recorders can record in stereo, capturing sound using a pair of built-in microphones. Information captured by the left microphone is saved to the left channel of the audio file, while information captured by the right microphone is saved to the right channel. The Zoom H4n Pro, for example, has two built-in microphones and allows you to record in stereo.