How Sound Waves Create Music: Frequency, Timbre, and Acoustics

Music is, at its physical foundation, organized vibration — patterns of sound waves that the human ear and brain interpret as pitch, tone color, and harmony. This article explains the physics of sound and how the properties of waves give rise to the rich world of musical experience.

The InfoNexus Editorial Team | May 8, 2026 | 7 min read

What Is Sound?

Sound is a mechanical wave — a pattern of compression and rarefaction (expansion) propagating through a medium, typically air. When any object vibrates — a plucked guitar string, a struck drumhead, a singer's vocal cords — it pushes on the surrounding air molecules, creating regions of high pressure (compressions) and low pressure (rarefactions) that radiate outward in all directions from the source. These pressure variations travel through the air at approximately 343 meters per second (at 20°C at sea level), reach our eardrums, and are transduced by the auditory system into electrical nerve signals that the brain interprets as sound.

Unlike electromagnetic waves (such as light), sound cannot propagate through a vacuum; it requires a physical medium. Its speed depends on how stiff the medium is relative to its density, so sound generally travels faster in liquids and solids than in air: roughly 1,480 m/s through water and 5,100 m/s through steel. This is why you can hear a distant train through the rail tracks before you hear it through the air. The surroundings matter as well: sound in a concert hall, where walls and ceiling reflect it, behaves very differently from sound in an open field.
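The train example can be checked with simple arithmetic. The sketch below uses the speeds quoted in this section; the 1 km distance is a hypothetical value chosen only for illustration.

```python
# Approximate speeds of sound quoted in the text, in meters per second.
SPEED_M_PER_S = {"air": 343.0, "water": 1480.0, "steel": 5100.0}


def travel_time(distance_m, medium):
    """Return the seconds sound needs to cover distance_m in the given medium."""
    return distance_m / SPEED_M_PER_S[medium]


distance = 1000.0  # hypothetical: a train 1 km away
for medium in ("air", "steel"):
    print(f"{medium}: {travel_time(distance, medium):.2f} s")
# air: 2.92 s
# steel: 0.20 s
```

Under these assumptions the sound arrives through the rail almost three seconds before it arrives through the air.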

A sound wave is characterized by three fundamental physical properties: frequency, amplitude, and waveform. Each of these maps directly onto a perceptual quality that defines how we experience music — pitch, loudness, and timbre respectively.

Frequency and Pitch

Frequency is the number of complete pressure cycles (compressions + rarefactions) that pass a given point per second, measured in Hertz (Hz). One Hz equals one cycle per second. The human ear is sensitive to frequencies between approximately 20 Hz and 20,000 Hz (20 kHz), though this range narrows with age — particularly at the high end. Frequencies below 20 Hz (infrasound) and above 20 kHz (ultrasound) exist but are inaudible to humans.
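To give the hearing range a physical scale, the following sketch converts frequency into period (seconds per cycle) and wavelength in air. Wavelength is not discussed in the article itself; the relation wavelength = speed / frequency and the function names are additions made here for illustration, using the 343 m/s figure quoted earlier.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at 20 °C, as quoted in the article


def period_s(frequency_hz):
    """Duration of one complete pressure cycle, in seconds."""
    return 1.0 / frequency_hz


def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND):
    """Distance between successive compressions, in meters."""
    return speed / frequency_hz


for f in (20.0, 440.0, 20000.0):
    print(f"{f:>8.0f} Hz  period {period_s(f) * 1000:.3f} ms  "
          f"wavelength {wavelength_m(f):.4f} m")
```

The lowest audible tones have wavelengths of roughly 17 m in air, while the highest are under 2 cm.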

Frequency is perceived as pitch: the higher the frequency, the higher the pitch. The musical note A above middle C (A4, concert A) is standardized at exactly 440 Hz — this is the international reference pitch to which orchestras tune. The note A in the next higher octave (A5) vibrates at 880 Hz — exactly double the frequency. This doubling relationship defines the octave, the most fundamental interval in music, and explains why notes an octave apart sound so similar: the higher note's overtones (see below) align almost perfectly with those of the lower note, producing a sense of unity despite the difference in pitch.

In Western music's equal temperament tuning system, the octave is divided into 12 equal semitones. Each semitone represents a frequency ratio of 2^(1/12) ≈ 1.0595 — meaning each note is approximately 5.95% higher in frequency than the note a semitone below it. This precise mathematical partitioning, while a slight compromise from the pure ratios of earlier tuning systems, allows instruments to play equally in tune in all 12 major and minor keys — a crucial practical requirement once chromatic keyboard instruments became standard.
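The semitone ratio can be turned into a small calculator. This is a minimal sketch assuming A4 = 440 Hz as the reference; the function name and the convention of counting semitones from A4 are choices made here, not something the article prescribes.

```python
A4_HZ = 440.0  # reference pitch stated in the article


def note_frequency(semitones_from_a4):
    """Equal-tempered frequency of the note n semitones above (or below) A4."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


print(f"{note_frequency(12):.2f}")   # A5, one octave up: 880.00
print(f"{note_frequency(-12):.2f}")  # A3, one octave down: 220.00
print(f"{note_frequency(-9):.2f}")   # middle C (C4), nine semitones below A4: 261.63
```

Twelve steps of 2^(1/12) multiply to exactly 2, which is why the octave comes out as a clean doubling.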

Musical Note      Frequency (Hz)   Octave Relationship
A2                110              2 octaves below A4
A3                220              1 octave below A4
A4 (Concert A)    440              Reference pitch
A5                880              1 octave above A4
A6                1,760            2 octaves above A4

Amplitude and Loudness

Amplitude is the magnitude of pressure variation in a sound wave: how much the air pressure deviates from its ambient level. Larger amplitude means more energetic vibration and is perceived as greater loudness. Sound level is measured on a logarithmic scale in decibels (dB), because the human ear's sensitivity spans an enormous dynamic range: from the threshold of hearing (0 dB) to the threshold of pain (around 130 dB), a factor of 10^13 in sound intensity, or roughly 3 million in pressure amplitude.

The logarithmic nature of the decibel scale means that a 10 dB increase represents a tenfold increase in sound intensity but is perceived as roughly a doubling of loudness. A normal conversation registers around 60 dB; a rock concert approximately 110–120 dB; a jet engine at 100 feet around 140 dB. Prolonged exposure to sounds above 85 dB can cause cumulative, irreversible hearing damage, a major occupational hazard for musicians and sound engineers.
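The decibel arithmetic above can be sketched in a few lines. The code assumes the standard definitions (level difference = 10 log10 of the intensity ratio, or 20 log10 of the pressure ratio); the function names are illustrative.

```python
import math


def intensity_ratio(db_difference):
    """Intensity (power) ratio corresponding to a level difference in dB."""
    return 10 ** (db_difference / 10)


def pressure_ratio(db_difference):
    """Pressure-amplitude ratio corresponding to the same level difference."""
    return 10 ** (db_difference / 20)


def db_from_intensity_ratio(ratio):
    """Level difference in dB for a given intensity ratio."""
    return 10 * math.log10(ratio)


for db in (10, 60, 130):
    print(f"{db:>3} dB: intensity x{intensity_ratio(db):.3g}, "
          f"pressure x{pressure_ratio(db):.3g}")
```

Because intensity scales with the square of pressure, the 130 dB span from the threshold of hearing to the threshold of pain is a factor of 10^13 in intensity but only about 3 million in pressure amplitude.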

In music notation, dynamics (loudness markings) are expressed in Italian terms: pianissimo (pp, very soft), piano (p, soft), mezzo-forte (mf, moderately loud), forte (f, loud), fortissimo (ff, very loud). These relative markings give performers interpretive latitude within a physical range suited to the instrument and hall.

Timbre: The Fingerprint of Sound

Timbre (pronounced "TAM-ber" or "TIM-ber") is the quality or "color" of a sound that distinguishes different instruments or voices playing the same pitch at the same loudness. It is why a violin and a clarinet playing A4 at the same amplitude sound obviously different — recognizably so, even to untrained ears. Timbre is the most complex of the three fundamental sound properties, and the most musically expressive.

The physical basis of timbre is the harmonic series. When a real instrument produces a note, it does not generate a pure sine wave at a single frequency. Instead, it produces the fundamental frequency (the perceived pitch) simultaneously with a series of overtones (also called harmonics or partials) at integer multiples of the fundamental. A string vibrating at 220 Hz also vibrates simultaneously at 440 Hz (2nd harmonic), 660 Hz (3rd harmonic), 880 Hz (4th harmonic), and so on. The relative amplitudes of these harmonics — which ones are strong, which are weak, which are absent — determine the timbre.
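As a quick check of the integer-multiple rule, this minimal sketch lists the first few harmonics of a fundamental. The function name is illustrative.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics; the 1st harmonic is the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


print(harmonics(220, 4))  # [220, 440, 660, 880]
```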

A flute, for example, produces relatively few strong overtones — its tone is dominated by the fundamental, giving it a pure, clear quality. A violin's bowed string generates a rich array of overtones up to very high frequencies, producing its warm but complex, reedy character. A clarinet, due to its cylindrical bore and single reed, preferentially produces odd-numbered harmonics (1st, 3rd, 5th...) giving it its distinctive hollow, woody color. Percussion instruments like drums and cymbals have inharmonic overtones — partials that are not simple integer multiples of the fundamental — which is why they produce "noise-like" tones rather than clear pitches.
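One way to hear (or plot) this idea is additive synthesis: summing sine waves at the harmonic frequencies with different weights. The sketch below is a rough illustration only. The amplitude lists are hypothetical recipes invented here to mimic "mostly fundamental" and "odd harmonics only" spectra, not measured instrument data, and the 44,100 Hz sample rate is an assumed common audio rate.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed)


def additive_tone(fundamental_hz, harmonic_amps, duration_s=0.01):
    """Sum sine waves at integer multiples of the fundamental.

    harmonic_amps[k] is the amplitude of harmonic k + 1.
    Returns a list of sample values.
    """
    n_samples = round(SAMPLE_RATE * duration_s)
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        value = sum(
            amp * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
            for k, amp in enumerate(harmonic_amps)
        )
        samples.append(value)
    return samples


# Hypothetical spectra, for illustration only.
flute_like = [1.0, 0.2, 0.05]               # dominated by the fundamental
clarinet_like = [1.0, 0.0, 0.33, 0.0, 0.2]  # odd harmonics only

flute_wave = additive_tone(440.0, flute_like)
clarinet_wave = additive_tone(440.0, clarinet_like)
print(len(flute_wave), len(clarinet_wave))
```

Both waveforms repeat 440 times per second, so they share a pitch, but their shapes differ, which is the physical counterpart of a difference in timbre.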

Beyond the steady-state harmonic content, timbre is also shaped by the attack (how quickly and in what manner a note starts), sustain (how the tone is maintained), and decay/release (how it fades). These time-varying aspects of a sound's character — collectively described as its envelope — are why a piano note with its initial click and rapid decay sounds so different from a bowed string with its smooth, sustained swell, even if both are producing the same fundamental frequency.
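Synthesizers often approximate an envelope with four straight-line stages: attack, decay, sustain, and release (ADSR). The sketch below is that simplified model, not a description of how any real instrument behaves; all parameter names and values are assumptions, and it presumes the note is held at least as long as attack plus decay.

```python
def adsr(t, attack, decay, sustain_level, release, note_length):
    """Linear ADSR envelope: amplitude between 0 and 1 at time t (seconds).

    The note is held for note_length seconds, then released.
    Assumes note_length >= attack + decay.
    """
    if t < 0:
        return 0.0
    if t < attack:                      # rise from 0 to full level
        return t / attack
    if t < attack + decay:              # fall from full level to sustain level
        frac = (t - attack) / decay
        return 1.0 - frac * (1.0 - sustain_level)
    if t < note_length:                 # hold
        return sustain_level
    if t < note_length + release:       # fade to silence
        frac = (t - note_length) / release
        return sustain_level * (1.0 - frac)
    return 0.0


# Hypothetical settings: a fast, percussive attack versus a slow, bowed swell.
percussive = dict(attack=0.005, decay=0.3, sustain_level=0.1, release=0.2, note_length=1.0)
bowed = dict(attack=0.25, decay=0.1, sustain_level=0.8, release=0.3, note_length=1.0)
for t in (0.0025, 0.1, 0.5, 1.1):
    print(t, round(adsr(t, **percussive), 3), round(adsr(t, **bowed), 3))
```

Multiplying a steady tone by such an envelope changes its perceived character even though the harmonic content stays the same.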

Resonance, Acoustics, and the Instrument

Resonance is the phenomenon by which a physical system vibrates with greater amplitude at certain frequencies — its resonant frequencies or natural frequencies. Every musical instrument exploits resonance to amplify and shape sound. A guitar string on its own produces a nearly inaudible sound; coupled to the guitar's hollow wooden body, the string's vibration drives the body at its resonant frequencies, amplifying certain harmonics and shaping the guitar's characteristic sound.

Wind instruments are essentially resonating air columns. By buzzing the lips (brass), setting a reed vibrating (clarinet, oboe, saxophone), or directing an air jet across an edge (flute), the player excites standing waves in the air column: resonant patterns determined by the tube's length, shape, and end conditions. Opening or closing holes (woodwinds) or changing tube length (valves in brass, slide in trombone) alters the resonant length and therefore the pitch. The flare of a bell on a horn is not decorative; it modifies the reflection conditions at the open end in ways that improve intonation and enrich the harmonic content of the tone.
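For an idealized cylindrical tube, the standing-wave frequencies follow from the length and the end conditions. The sketch below uses the textbook formulas for a tube open at both ends (f_n = n·v / 2L) and a tube closed at one end (f_n = (2n − 1)·v / 4L). It ignores end corrections, bore shape, and tone holes, so real instruments deviate from these numbers; the 0.5 m length is a hypothetical example.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at 20 °C


def open_tube_modes(length_m, count):
    """Resonances of an ideal tube open at both ends: all integer multiples."""
    return [n * SPEED_OF_SOUND / (2 * length_m) for n in range(1, count + 1)]


def stopped_tube_modes(length_m, count):
    """Resonances of an ideal tube closed at one end: odd multiples only."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4 * length_m) for n in range(1, count + 1)]


length = 0.5  # hypothetical tube length in meters
print(open_tube_modes(length, 3))     # [343.0, 686.0, 1029.0]
print(stopped_tube_modes(length, 3))  # [171.5, 514.5, 857.5]
```

The closed tube sounds an octave lower for the same length and supports only odd multiples of its fundamental, which is consistent with the odd-harmonic emphasis attributed to the clarinet earlier.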

Room acoustics profoundly affect the musical experience. Reverberation — the persistence of sound after the source stops, due to repeated reflections off walls, ceiling, and floor — is the acoustic signature of a space. Concert halls are acoustically engineered to achieve ideal reverberation times (1.8–2.2 seconds for symphonic music) and diffuse reflection patterns that create envelopment (the sense of being surrounded by sound) without muddying clarity. Recording studios use absorptive materials to minimize reverberation for clean recording, then add artificial reverb digitally to simulate the warmth of real spaces.
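Reverberation time can be estimated to first order with Sabine's formula, RT60 ≈ 0.161 · V / A, where V is the room volume in cubic meters and A is the total absorption in square meters. The article does not mention this formula; it is included here as a standard rough estimate, and the hall dimensions below are hypothetical.

```python
def sabine_rt60(volume_m3, absorption_m2):
    """Estimate the time (s) for sound to decay by 60 dB, using Sabine's formula.

    absorption_m2 is the sum over all surfaces of area times absorption coefficient.
    """
    return 0.161 * volume_m3 / absorption_m2


# Hypothetical concert hall: 20,000 m^3 with 1,610 m^2 of total absorption.
print(f"{sabine_rt60(20000, 1610):.1f} s")  # 2.0 s
```

Adding absorptive material raises A and shortens the reverberation time, which is the effect studio treatment aims for.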

Psychoacoustics: How the Brain Hears Music

The ear's mechanical transduction of sound into nerve signals is only the beginning of musical perception. The brain applies sophisticated processing that transforms raw acoustic data into musical experience — and often generates perceptions that go beyond (or even contradict) the physical stimulus.

Beats are a classic psychoacoustic phenomenon: when two pure tones very close in frequency are sounded together, the listener perceives a periodic fluctuation in loudness — a wavering or throbbing — at a rate equal to the frequency difference. If two tuning forks vibrate at 440 Hz and 444 Hz, the listener hears 4 beats per second. This phenomenon is how instrument technicians fine-tune to precise pitch: when the beats disappear, the instruments are in unison. Beats slower than approximately 15 Hz are perceived as rhythmic fluctuation; faster beats are heard as a rough, dissonant texture.
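The beat rate follows from a trigonometric identity: the sum of two sines equals a tone at the average frequency multiplied by a slowly varying cosine envelope. The sketch below checks that identity numerically for the 440 Hz and 444 Hz example; the function names are illustrative.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Number of loudness fluctuations per second when two tones sound together."""
    return abs(f1_hz - f2_hz)


def two_tones(t, f1_hz, f2_hz):
    """Direct sum of two unit-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def carrier_times_envelope(t, f1_hz, f2_hz):
    """Same signal, rewritten as an average-frequency tone times a slow envelope."""
    mean = (f1_hz + f2_hz) / 2
    half_diff = (f1_hz - f2_hz) / 2
    return 2 * math.sin(2 * math.pi * mean * t) * math.cos(2 * math.pi * half_diff * t)


print(beat_frequency(440, 444))  # 4
for t in (0.0, 0.0137, 0.1, 0.25):
    assert math.isclose(two_tones(t, 440, 444),
                        carrier_times_envelope(t, 440, 444), abs_tol=1e-9)
```

The envelope term oscillates at half the frequency difference, but loudness peaks twice per envelope cycle, so the audible beat rate equals the full difference.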

The missing fundamental is another striking phenomenon: if the 2nd, 3rd, and 4th harmonics of a tone are played (e.g., 440, 660, and 880 Hz) without the fundamental (220 Hz), the brain still "hears" the missing 220 Hz pitch. The auditory system infers the fundamental from the pattern of harmonics above it — a form of auditory scene analysis that helps the brain identify the pitch of sounds even in noisy environments where some frequencies may be masked.
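A very simple stand-in for this inference is the greatest common divisor of the partial frequencies. This is only a toy model under the assumption of exact integer frequencies; the auditory system does not literally compute a GCD and copes with mistuned or noisy partials in ways this sketch does not.

```python
from functools import reduce
from math import gcd


def implied_fundamental(partials_hz):
    """Largest frequency (Hz) of which every given partial is an integer multiple."""
    return reduce(gcd, partials_hz)


print(implied_fundamental([440, 660, 880]))  # 220
```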

Consonance and dissonance — the sense that certain intervals are stable and pleasing while others are tense and unsettling — have both physical and cultural dimensions. Intervals with simple frequency ratios (the octave at 2:1, the perfect fifth at 3:2) tend to sound consonant because their overtone series align with minimal beating. Intervals with complex ratios (the minor second at 16:15) produce many fast, rough-sounding beats and are generally perceived as dissonant. But cultural context matters enormously: the tritone (the interval dividing the octave exactly in half), historically called diabolus in musica (the devil in music) by medieval theorists, is a familiar, even attractive sound in blues, jazz, and heavy metal. Music training and cultural exposure significantly shape which intervals are experienced as consonant or dissonant.
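Interval sizes are often compared in cents, where one equal-tempered semitone is 100 cents and an octave is 1,200. The sketch below converts the simple ratios mentioned above into cents and compares them with their equal-tempered counterparts; the cents unit and the helper names are additions made here for illustration.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Size of a frequency ratio in cents (1,200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


# (ratio from the text, size of the nearest equal-tempered interval in semitones)
INTERVALS = {
    "octave": (Fraction(2, 1), 12),
    "perfect fifth": (Fraction(3, 2), 7),
    "minor second": (Fraction(16, 15), 1),
}

for name, (ratio, semitones) in INTERVALS.items():
    print(f"{name}: {ratio} = {cents(ratio):.1f} cents, "
          f"equal-tempered {100 * semitones} cents")

# The equal-tempered tritone is six semitones: a ratio of the square root of 2.
print(f"tritone: {cents(2 ** 0.5):.1f} cents")
```

The pure fifth comes out at about 702 cents, very close to the equal-tempered 700, while the 16:15 minor second is about 112 cents against an equal-tempered 100. The tritone sits at exactly half the octave.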
