Frequency Heatmap Generator
Drop an audio file to render a full static spectral heatmap: time on the horizontal axis, frequency on the vertical, and magnitude as colour. The tool computes a windowed STFT (FFT with a Hann window, 50% overlap) across the whole clip and offers five colour scales, a log or linear frequency axis, an adjustable dB floor and ceiling, and PNG and SVG export. Everything runs locally; nothing is uploaded.
Idle — upload an audio file to generate its spectral heatmap.
Heatmap controls
Floor maps to the darkest colour, ceiling to the brightest. Lower the floor to reveal quiet detail; tighten the range for high-contrast peaks.
Analysis details
| File | — |
| Duration | — |
| Sample rate | — |
| Nyquist (max freq) | — |
| FFT size | 2048 (1024 bins) |
| Time columns | — |
| Resolution | — |
Frequencies above Nyquist (sample rate ÷ 2) cannot appear because they were never captured in the file. Long clips are thinned to at most 1,400 time columns for display: analysis frames are skipped, while the audio itself is not resampled.
How to Use
- Drop a file (or click the zone to browse). WAV, MP3, OGG, FLAC and M4A all work where your browser can decode them. The file is decoded entirely in your browser.
- The heatmap renders automatically: each vertical slice is one FFT, plotted left→right over time, with colour encoding magnitude in dB.
- Pick a colour scale and a frequency axis (log is best for music and voice; linear spreads high frequencies evenly).
- Drag the floor / ceiling sliders to re-map the colour range — lower the floor to see faint detail, raise it to suppress noise.
- Change FFT size to trade frequency resolution against time resolution (a larger FFT gives narrower frequency bins but smears events over a longer time span), then Export PNG or SVG.
Understanding Your Results
A spectral heatmap (a spectrogram rendered as a static image) shows three dimensions at once: time across the bottom, frequency up the side, and magnitude as colour. Horizontal stripes are sustained tones or harmonics; vertical streaks are transients (clicks, drum hits, consonants); diagonal lines are pitch glides or sweeps.
The colour represents relative magnitude in dBFS (decibels relative to digital full scale), not calibrated sound-pressure level. The brightest colour is whatever your ceiling slider is set to; the darkest is the floor. Two files mastered at different levels will look different even if they contain the same content, because each is measured against digital full scale rather than a physical SPL: a quieter master simply sits lower on the dB scale.
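To make the level dependence concrete, here is a minimal sketch of a dBFS conversion. The function name `amplitudeToDbfs` and the silence floor of 1e-12 are illustrative assumptions, not taken from the tool.

```typescript
// Convert a linear amplitude (1.0 = digital full scale) to dBFS.
// The 1e-12 floor is an arbitrary choice that avoids -Infinity for exact silence.
function amplitudeToDbfs(amplitude: number): number {
  return 20 * Math.log10(Math.max(Math.abs(amplitude), 1e-12));
}

// The same content mastered at two levels: the second is ten times quieter.
const loud = amplitudeToDbfs(0.5); // roughly -6 dBFS
const quiet = amplitudeToDbfs(0.05); // roughly -26 dBFS
console.log(loud.toFixed(2), quiet.toFixed(2), (loud - quiet).toFixed(2));
```

Scaling a signal by a constant shifts every value by the same number of dB, so the quieter file shows the same shapes in darker colours unless the floor and ceiling are moved by the same amount.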
The vertical extent is bounded by Nyquist — half the file’s sample rate. A 44.1 kHz file can only show up to ~22 kHz; nothing above that was ever recorded. The vertical resolution is one FFT bin, Δf = sample rate ÷ FFT size. The horizontal resolution is the number of time columns; for long files the frames are sampled evenly down to a fixed maximum (1,400) — one frame kept, the rest skipped — so a very brief event that falls on a skipped frame can be thinned out or missed entirely.
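The arithmetic in this paragraph can be sketched as a small helper. The names (`describeResolution`, the `Resolution` fields) are hypothetical, and the sketch assumes that a trailing partial frame is dropped rather than zero-padded, which the page does not specify.

```typescript
// Hypothetical helper: names and return shape are illustrative, not the tool's API.
interface Resolution {
  nyquistHz: number; // highest representable frequency
  binWidthHz: number; // Δf = sample rate ÷ FFT size
  hopSamples: number; // step between frames at 50% overlap
  frameCount: number; // full frames that fit in the clip (assumes no padding)
  columnCount: number; // frames kept after capping at maxColumns
}

function describeResolution(
  sampleRate: number,
  totalSamples: number,
  fftSize = 2048,
  maxColumns = 1400,
): Resolution {
  const hopSamples = fftSize / 2;
  const frameCount =
    totalSamples < fftSize ? 0 : Math.floor((totalSamples - fftSize) / hopSamples) + 1;
  return {
    nyquistHz: sampleRate / 2,
    binWidthHz: sampleRate / fftSize,
    hopSamples,
    frameCount,
    columnCount: Math.min(frameCount, maxColumns),
  };
}

// Example: a 60-second mono clip at 44.1 kHz with a 2048-point FFT.
const resolution = describeResolution(44100, 44100 * 60);
console.log(resolution);
```

For this clip the bin width is about 21.5 Hz and the 2,582 available frames exceed the 1,400-column cap, so a little under half of them would be skipped.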
How It Works
When you drop a file, the browser’s AudioContext.decodeAudioData turns it into raw PCM samples — locally, with no upload. The tool then runs a Short-Time Fourier Transform (STFT):
- The chosen channel is split into overlapping frames of N samples (the FFT size). Each frame starts N/2 samples after the previous one (the hop), which gives 50% overlap.
- Each frame is multiplied by a Hann window to reduce spectral leakage, then transformed with a radix-2 FFT.
- Each frame’s magnitude spectrum becomes one column of the heatmap. Magnitudes are window-normalised and converted to dBFS (20 · log10 of the normalised magnitude).
- If the clip would produce more than 1,400 frames, frames are sampled evenly down to 1,400 columns so the canvas, memory and exports stay bounded.
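The steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the tool's source: the function names, the normalisation by half the window sum (chosen so a full-scale sine peaks near 0 dBFS), the 1e-12 silence floor, and the rule for picking evenly spaced frames are all assumptions.

```typescript
// In-place iterative radix-2 FFT; the length must be a power of two.
function fftInPlace(re: Float64Array, im: Float64Array): void {
  const n = re.length;
  // Bit-reversal permutation.
  for (let i = 1, j = 0; i < n; i++) {
    let bit = n >> 1;
    for (; (j & bit) !== 0; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) {
      const tr = re[i]; re[i] = re[j]; re[j] = tr;
      const ti = im[i]; im[i] = im[j]; im[j] = ti;
    }
  }
  // Butterfly stages of doubling length.
  for (let len = 2; len <= n; len <<= 1) {
    const half = len >> 1;
    for (let k = 0; k < half; k++) {
      const angle = (-2 * Math.PI * k) / len;
      const wr = Math.cos(angle);
      const wi = Math.sin(angle);
      for (let a = k; a < n; a += len) {
        const b = a + half;
        const tr = re[b] * wr - im[b] * wi;
        const ti = re[b] * wi + im[b] * wr;
        re[b] = re[a] - tr;
        im[b] = im[a] - ti;
        re[a] += tr;
        im[a] += ti;
      }
    }
  }
}

// Returns one Float32Array of dBFS values (fftSize / 2 bins) per kept frame.
function stftDb(samples: Float32Array, fftSize = 2048, maxColumns = 1400): Float32Array[] {
  const hop = fftSize / 2; // 50% overlap
  const bins = fftSize / 2;
  const hann = new Float64Array(fftSize);
  let hannSum = 0;
  for (let i = 0; i < fftSize; i++) {
    hann[i] = 0.5 - 0.5 * Math.cos((2 * Math.PI * i) / fftSize);
    hannSum += hann[i];
  }
  const frameCount =
    samples.length < fftSize ? 0 : Math.floor((samples.length - fftSize) / hop) + 1;
  const columnCount = Math.min(frameCount, maxColumns);
  const re = new Float64Array(fftSize);
  const im = new Float64Array(fftSize);
  const columns: Float32Array[] = [];
  for (let c = 0; c < columnCount; c++) {
    // Evenly spaced frame index; frames in between are skipped.
    const offset = Math.floor((c * frameCount) / columnCount) * hop;
    for (let i = 0; i < fftSize; i++) {
      re[i] = samples[offset + i] * hann[i];
      im[i] = 0;
    }
    fftInPlace(re, im);
    const column = new Float32Array(bins);
    for (let k = 0; k < bins; k++) {
      // Assumed normalisation: a full-scale sine at a bin centre maps to 1.0.
      const magnitude = (2 * Math.hypot(re[k], im[k])) / hannSum;
      column[k] = 20 * Math.log10(Math.max(magnitude, 1e-12));
    }
    columns.push(column);
  }
  return columns;
}

// Usage: a full-scale sine placed exactly on bin 64 of a 2048-point FFT.
const sampleRate = 44100;
const tone = new Float32Array(8192);
const toneHz = (64 * sampleRate) / 2048;
for (let i = 0; i < tone.length; i++) {
  tone[i] = Math.sin((2 * Math.PI * toneHz * i) / sampleRate);
}
const columns = stftDb(tone);
let peakBin = 0;
for (let k = 1; k < columns[0].length; k++) {
  if (columns[0][k] > columns[0][peakBin]) peakBin = k;
}
console.log(columns.length, peakBin, columns[0][peakBin]);
```

With this normalisation the test tone should peak at bin 64 at very nearly 0 dBFS, and its two neighbouring bins sit about 6 dB lower because of the Hann window's main lobe.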
Rendering maps each pixel row to a frequency bin through the selected axis (log or linear) and each pixel column to a time column, then looks the dB value up in a 256-entry colour-map lookup table whose endpoints are your floor and ceiling. PNG export re-renders the heatmap into a clean canvas and downloads it via canvas.toBlob. SVG export writes a standalone SVG document: the heatmap raster is embedded as a bounded base64 image and the axis ticks/labels are emitted as vectors, downloaded via a Blob. This is a 2D-canvas renderer — no WebGL is used or claimed.
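The pixel-to-bin and dB-to-colour lookups described above might look like the sketch below. The function names, the nearest-bin rounding, and the requirement that the log axis starts above 0 Hz (for example at 20 Hz) are assumptions made for illustration.

```typescript
// Map a pixel row (0 = top of the image) to a frequency in Hz.
// For the log axis, minHz must be greater than zero.
function rowToFrequency(
  row: number,
  height: number,
  minHz: number,
  maxHz: number,
  logAxis: boolean,
): number {
  const t = 1 - row / (height - 1); // 0 at the bottom row, 1 at the top row
  return logAxis ? minHz * Math.pow(maxHz / minHz, t) : minHz + (maxHz - minHz) * t;
}

// Map a frequency to the nearest FFT bin, clamped to 0 .. fftSize / 2 - 1.
function frequencyToBin(hz: number, sampleRate: number, fftSize: number): number {
  const bin = Math.round((hz * fftSize) / sampleRate);
  return Math.min(Math.max(bin, 0), fftSize / 2 - 1);
}

// Map a dB value to a 0..255 lookup-table index; values outside the
// floor..ceiling range are clamped to the darkest or brightest entry.
function dbToLutIndex(db: number, floorDb: number, ceilingDb: number): number {
  const t = (db - floorDb) / (ceilingDb - floorDb);
  return Math.round(Math.min(Math.max(t, 0), 1) * 255);
}

// Usage: which bin and colour index does the middle row of a 512-pixel image use?
const hz = rowToFrequency(256, 512, 20, 22050, true);
const bin = frequencyToBin(hz, 44100, 2048);
console.log(hz.toFixed(1), bin, dbToLutIndex(-45, -90, 0));
```

On a log axis the middle row lands near the geometric mean of the axis limits (a few hundred Hz here), which is why low-frequency detail gets far more vertical space than on a linear axis.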
Honesty & limits. Results are bounded by the file’s sample rate (Nyquist) and the FFT resolution you choose. Colours are relative, uncalibrated dBFS — useful for comparing content within one clip, not for absolute SPL. Long files are downsampled for display. The model is the standard windowed STFT; like all spectrograms it trades time resolution against frequency resolution and cannot beat that uncertainty limit. Nothing leaves your device.