Speech Frequency Spectrogram

See your voice as a scrolling spectrogram — frequency content over time, with brighter colour for stronger energy. Watch vowels, formants, and consonant bursts appear in real time.

ℹ This is a spectrogram: a picture of frequency against time, computed with an FFT. It does not recognise words or label phonemes (that needs speech recognition). The vertical axis is logarithmic over your chosen range. To save an image, use your device’s screenshot tool. Your mic is analysed live and never recorded or uploaded.

Microphone is off. Click “Start microphone”, then speak or sing to see the spectrogram scroll.

How It Works

Many times a second, the tool runs a Fast Fourier Transform (FFT) on the latest slice of your microphone signal, which gives the energy at each frequency. It paints that result as a vertical column of coloured pixels, with low frequencies at the bottom and high frequencies at the top on a logarithmic scale, then scrolls the image left so that a continuous spectrogram builds up over time. Brighter pixels mean more energy. In speech you’ll see broad horizontal bands (the formants that shape vowels), fine stripes in the low region (the harmonics of the voice’s pitch), brief vertical smears for plosives, and high-frequency fuzz for fricatives like “s” and “sh”. Everything runs live on your device; nothing is recorded.
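
For readers who want to see the idea in code, here is a minimal sketch of how one spectrogram column could be computed. It is not this tool's actual source: the function names (fft, row_frequency, spectrogram_column), the Hann window, the nearest-bin lookup, and the -80 dB brightness floor are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def row_frequency(row, rows, f_min, f_max):
    """Frequency shown at a pixel row on a log axis; row 0 is the bottom."""
    return f_min * (f_max / f_min) ** (row / (rows - 1))

def spectrogram_column(frame, sample_rate, rows, f_min, f_max, floor_db=-80.0):
    """Return one column of brightness values in [0, 1], bottom row first.

    floor_db (an assumed display setting, must be negative) is the level
    that maps to black; 0 dB maps to full brightness.
    """
    n = len(frame)
    # Hann window (assumed choice) to reduce spectral leakage.
    windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = fft(windowed)
    column = []
    for row in range(rows):
        freq = row_frequency(row, rows, f_min, f_max)
        # Nearest FFT bin to this row's frequency, capped at Nyquist.
        bin_index = min(n // 2, round(freq * n / sample_rate))
        # Scale so an unwindowed full-scale sine would read about 1.0 (0 dB).
        magnitude = abs(spectrum[bin_index]) / (n / 2)
        level_db = 20 * math.log10(max(magnitude, 1e-12))
        brightness = (level_db - floor_db) / -floor_db
        column.append(min(1.0, max(0.0, brightness)))
    return column

# Example: a 1 kHz test tone sampled at 8 kHz, one 256-sample frame.
rate = 8000
frame = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(256)]
column = spectrogram_column(frame, rate, rows=64, f_min=100.0, f_max=4000.0)
brightest = max(range(len(column)), key=column.__getitem__)
print(round(row_frequency(brightest, 64, 100.0, 4000.0)))
```

With the test tone, the brightest row lands close to 1 kHz. A real display would repeat this for each new frame, shift the existing image one pixel to the left, and draw the new column at the right edge. Because the rows are log-spaced and the FFT bins are linearly spaced, several low rows can share one bin while high rows skip bins; a more careful version might average the bins that fall within each row.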

A spectrogram shows sound; it doesn’t read it. This tool does not transcribe words or tag phonemes — that’s speech recognition, a different problem. Use it to study how vowels and consonants look, to compare your speech or singing over time, or to spot a nasal or breathy quality. The picture also depends on your microphone’s response and the room.

Frequently Asked Questions

Is my microphone recorded or uploaded?
No. Audio is analysed live in your browser and never leaves your device. Stopping the mic releases it immediately.
Can it identify words or phonemes?
No. It draws frequency-vs-time; it doesn’t recognise speech. Reading words from audio is speech recognition, which this tool intentionally doesn’t do.
What are the horizontal bands?
Those are formants: resonances of your vocal tract that define vowels. Their positions shift as you change vowel. The finer stripes in the low region are not formants; they are the harmonics of your voice’s pitch.
How do I save the image?
Use your operating system’s screenshot tool (e.g. Win+Shift+S on Windows, Cmd+Shift+4 on macOS). Freeze the display first to capture a still moment.
Why does the same sound look different on another device?
Every microphone has its own frequency response, and rooms add reflections. Compare sounds on the same setup for consistency.