AI Note Detection Tool

Detect musical notes in real time from your microphone or an uploaded audio file. See notes on a virtual staff, piano roll, and spectrum display, with polyphonic chord detection, note duration tracking, confidence scores, and CSV or MIDI-like export. Everything is processed locally in your browser, with zero uploads.

Your audio never leaves your device: 100% local processing, zero uploads. Supported browsers: Chrome, Firefox, Safari, and Edge.

How to Use the AI Note Detection Tool

  1. Grant Microphone Access or Upload a File

    Click Start Listening to use your microphone, or upload an audio file (MP3, WAV, OGG, FLAC, AAC, M4A, or WebM). All processing happens locally in your browser. Nothing is ever uploaded.

  2. Play or Sing Into Your Microphone

    Play an instrument, sing, or hold your device near a speaker. The tool detects notes in real time, including polyphonic chords with multiple simultaneous notes.

  3. Read the Detected Notes

    The current note appears in the large display with its frequency and confidence score. Detected chords are identified below. The notes list shows every note with its start time and duration.

  4. Explore the Visualizations

    The piano roll shows notes as colored rectangles scrolling over time (like a DAW). The staff notation places note heads on treble and bass clef lines. The spectrum shows the raw frequency content with peaks at detected notes.

  5. Export Your Results

    Click Export CSV to download a file with every detected note, its start time, duration, frequency, and confidence. Use Export MIDI Data for a MIDI-like output with note number, velocity, and timestamps.

Understanding Your Results

Current Note Display

The large note name shows the strongest detected pitch at this moment, along with its octave number and exact frequency in Hz. The confidence bar indicates how clearly the note stands out from background noise — higher confidence means a cleaner, more distinct pitch.

Chord Detection

When multiple notes sound simultaneously, the tool identifies the chord name (e.g., C major, Am7, Dm). It analyzes the intervals between detected pitches and matches them against common chord patterns including major, minor, diminished, augmented, and seventh chords.

Piano Roll

The piano roll displays notes as colored rectangles on a time axis, similar to a digital audio workstation (DAW). Pitch maps to the vertical axis (higher notes are higher on screen) and time scrolls from right to left. Note length corresponds to the detected duration. This view is ideal for seeing melodic patterns and rhythmic timing.

Staff Notation

Notes are placed on a traditional five-line staff with treble and bass clef. Note heads appear at the correct line or space for their pitch. Ledger lines are drawn for notes above or below the staff. This view is useful for musicians who read standard notation.
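
As a rough illustration of how a pitch can be mapped to a staff position, here is a minimal Python sketch. It assumes notes are given as MIDI numbers, spells black keys as sharps, and counts positions in diatonic steps from the bottom line of the clef (E4 for treble, G2 for bass). The function name `staff_position` and these conventions are illustrative assumptions, not the tool's actual code.

```python
# Hypothetical helper: map a MIDI note number to a staff position.
LETTER_STEPS = {"C": 0, "D": 1, "E": 2, "F": 3, "G": 4, "A": 5, "B": 6}
SHARP_SPELLING = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def staff_position(midi, clef="treble"):
    """Return (steps_above_bottom_line, ledger_lines, accidental).

    Even step values fall on a line, odd values in a space.
    Steps 0..8 are inside the five-line staff.
    """
    name = SHARP_SPELLING[midi % 12]
    octave = midi // 12 - 1                    # MIDI 60 is C4
    diatonic = octave * 7 + LETTER_STEPS[name[0]]
    bottom_line = 30 if clef == "treble" else 18   # E4 or G2 in diatonic steps
    pos = diatonic - bottom_line
    if pos < 0:
        ledger = (-pos) // 2                   # ledger lines below the staff
    elif pos > 8:
        ledger = (pos - 8) // 2                # ledger lines above the staff
    else:
        ledger = 0
    return pos, ledger, name[1:]
```

Under these assumptions, middle C (MIDI 60) sits two steps below the treble staff on one ledger line, and ten steps above the bottom of the bass staff, also on one ledger line.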

Frequency Spectrum

The spectrum shows the frequency distribution of the current audio frame. Peaks in the spectrum correspond to detected notes. The height of each peak relates to the amplitude (loudness) of that frequency component.

Notes List & History

Every detected note is logged with its start time, duration, frequency, and confidence. The history log provides a scrollable text record of all detected notes in chronological order, useful for reviewing a performance or transcription session.

How AI Note Detection Works

Multi-Pitch Detection via FFT Peak Finding

The core of this tool is multi-pitch estimation using the Fast Fourier Transform (FFT). The audio signal is windowed (Hann window) and transformed into the frequency domain, producing a magnitude spectrum. Unlike monophonic pitch detectors that find a single fundamental frequency, this tool identifies multiple spectral peaks simultaneously by scanning for local maxima above a dynamic noise-floor threshold. Each peak is matched against a table of equal-temperament note frequencies (A4 = 440 Hz) to determine the closest musical note.
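
The peak-finding and note-matching steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: the noise floor is estimated as a multiple of the median magnitude, and the note lookup uses the closed-form equal-temperament formula instead of a precomputed table. All function names and the `factor` parameter are hypothetical.

```python
import math
import statistics

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def noise_floor_threshold(magnitudes, factor=4.0):
    """Assumed dynamic threshold: a multiple of the median bin magnitude."""
    return factor * statistics.median(magnitudes)

def find_peaks(magnitudes, threshold):
    """Indices of local maxima that rise above the threshold."""
    return [i for i in range(1, len(magnitudes) - 1)
            if magnitudes[i] > threshold
            and magnitudes[i] > magnitudes[i - 1]
            and magnitudes[i] >= magnitudes[i + 1]]

def bin_to_freq(bin_index, sample_rate, fft_size):
    """Centre frequency in Hz of an FFT bin."""
    return bin_index * sample_rate / fft_size

def freq_to_note(freq_hz, a4_hz=440.0):
    """Nearest equal-temperament note as (name, midi_number, cents_off)."""
    midi_float = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi = round(midi_float)
    cents = (midi_float - midi) * 100
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, midi, cents
```

For example, `freq_to_note(440.0)` yields A4 (MIDI 69) with zero cents of deviation. The cents value is one plausible input to a confidence score, though the tool's actual scoring is not documented here.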

Harmonic Product Spectrum for Polyphonic Analysis

Raw FFT peaks alone can confuse harmonics (overtones) with fundamentals. The tool applies the Harmonic Product Spectrum (HPS) technique: the spectrum is downsampled by factors of 2, 3, and 4, and the downsampled copies are multiplied bin by bin with the original spectrum. After downsampling, a note's harmonics line up with its fundamental, so the product is reinforced at the fundamental and suppressed at the harmonics. This lets the algorithm distinguish a note's fundamental from its overtones, which is critical for polyphonic detection where multiple notes produce overlapping harmonic series.
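
A minimal Python sketch of the Harmonic Product Spectrum follows. It assumes the input is a plain list of magnitude values, one per FFT bin, and downsamples by simple decimation (taking every k-th bin). The function name and the `max_factor` parameter are illustrative.

```python
def harmonic_product_spectrum(magnitudes, max_factor=4):
    """Multiply the spectrum by copies of itself decimated by 2..max_factor.

    Only the first len(magnitudes) // max_factor bins are returned,
    because higher bins have no decimated counterpart.
    """
    usable = len(magnitudes) // max_factor
    hps = list(magnitudes[:usable])
    for factor in range(2, max_factor + 1):
        for i in range(usable):
            # Bin i of the copy decimated by `factor` is bin i * factor
            # of the original spectrum.
            hps[i] *= magnitudes[i * factor]
    return hps
```

If a note has its fundamental at bin 5 and a stronger second harmonic at bin 10, the raw spectrum peaks at bin 10, while the HPS peaks at bin 5, because only bin 5 has energy at all of its multiples.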

Note Onset Detection via Spectral Flux

Knowing when a note begins is as important as knowing which note it is. The tool computes spectral flux — the sum of positive differences between consecutive magnitude spectra. A sudden increase in spectral flux indicates a note onset (a new note has been struck or sung). A dynamic threshold adapts to the signal level to avoid false triggers in quiet passages and missed onsets in loud ones. Note offset is detected when the energy at a note's frequency drops below the threshold for a sustained period, allowing the tool to estimate note duration from onset to offset.
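
The onset and offset logic above can be sketched as follows. This Python example assumes each frame is a list of magnitudes, and it uses one possible adaptive threshold: a small floor plus a multiple of the mean flux over the previous few frames. The names `detect_onsets` and `find_offset` and the parameters `window`, `multiplier`, `floor`, and `hold_frames` are hypothetical.

```python
def spectral_flux(prev_mags, curr_mags):
    """Sum of positive bin-wise increases between two magnitude spectra."""
    return sum(max(c - p, 0.0) for p, c in zip(prev_mags, curr_mags))

def detect_onsets(frames, window=4, multiplier=1.5, floor=1e-3):
    """Frame indices whose flux exceeds an adaptive threshold."""
    flux = [0.0] + [spectral_flux(frames[i - 1], frames[i])
                    for i in range(1, len(frames))]
    onsets = []
    for i in range(1, len(flux)):
        recent = flux[max(0, i - window):i]
        threshold = floor + multiplier * (sum(recent) / len(recent))
        if flux[i] > threshold:
            onsets.append(i)
    return onsets

def find_offset(energies, onset, threshold, hold_frames=3):
    """First frame after the onset where the note's energy stays below
    the threshold for hold_frames consecutive frames."""
    below = 0
    for i in range(onset + 1, len(energies)):
        below = below + 1 if energies[i] < threshold else 0
        if below == hold_frames:
            return i - hold_frames + 1
    return len(energies)  # note still sounding at the end of the data
```

The note duration in frames is then the offset index minus the onset index. Multiplying by the hop size and dividing by the sample rate converts it to seconds.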

Chord Identification

Simultaneously detected notes are analyzed for intervallic relationships. The tool computes the semitone intervals between all note pairs, normalizes them to a single octave, and matches the resulting interval set against a dictionary of known chord types (major, minor, diminished, augmented, dominant 7th, major 7th, minor 7th, suspended, and more). The root note is determined by testing each detected note as a potential root and selecting the match with the highest confidence.
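
To make the matching step concrete, here is a small Python sketch. It assumes notes arrive as MIDI numbers, uses a deliberately short chord dictionary, and replaces the confidence-based root selection with a simpler rule: candidate roots are tried from the lowest-sounding note upward and the first exact match wins. The dictionary contents and function name are illustrative assumptions.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Interval sets in semitones above the root, folded into one octave.
CHORD_PATTERNS = {
    frozenset({0, 4, 7}): "",
    frozenset({0, 3, 7}): "m",
    frozenset({0, 3, 6}): "dim",
    frozenset({0, 4, 8}): "aug",
    frozenset({0, 4, 7, 10}): "7",
    frozenset({0, 4, 7, 11}): "maj7",
    frozenset({0, 3, 7, 10}): "m7",
    frozenset({0, 2, 7}): "sus2",
    frozenset({0, 5, 7}): "sus4",
}

def identify_chord(midi_notes):
    """Return a chord name such as 'Am7', or None if nothing matches."""
    # Unique pitch classes, ordered from the lowest-sounding note up.
    pitch_classes = []
    for note in sorted(midi_notes):
        pc = note % 12
        if pc not in pitch_classes:
            pitch_classes.append(pc)
    # Test each detected pitch class as a potential root.
    for root in pitch_classes:
        intervals = frozenset((pc - root) % 12 for pc in pitch_classes)
        suffix = CHORD_PATTERNS.get(intervals)
        if suffix is not None:
            return NOTE_NAMES[root] + suffix
    return None
```

Trying the bass note first is a cheap way to break ties between symmetric or overlapping patterns, for example C, F, G reading as Csus4 when C is lowest and as Fsus2 when F is lowest. Inversions still resolve correctly because every pitch class is eventually tested as a root.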

Frequently Asked Questions

Can this tool detect multiple notes at the same time?

Yes. The tool uses multi-pitch detection via FFT peak finding and Harmonic Product Spectrum analysis to identify multiple simultaneous notes. It can detect chords, intervals, and polyphonic passages with up to 6–8 concurrent notes depending on signal clarity.

How accurate is the note detection?

For clear, sustained notes (piano, guitar, voice), the detected pitch is typically within one semitone of the played note and is reported with high confidence. Polyphonic accuracy depends on the complexity of the audio: simple chords are detected reliably, while dense orchestral textures may produce partial results. The confidence score for each note indicates detection reliability.

What audio file formats can I upload?

The tool accepts any format your browser can decode: MP3, WAV, OGG, FLAC, AAC, M4A, and WebM. Files are decoded entirely in your browser — nothing is uploaded to any server. Files up to 50 MB are supported.

How does the piano roll visualization work?

The piano roll displays detected notes as colored rectangles on a scrolling time axis, similar to how notes appear in a DAW (Digital Audio Workstation). The vertical position represents pitch (higher notes are higher on screen), the horizontal length represents duration, and the color intensity reflects confidence. Time scrolls from right to left so you can see the most recent notes.

Can I use this to transcribe music?

Yes, for simple to moderately complex passages. Upload an audio file or play music near your microphone, and the tool will detect notes with timestamps and durations. Export the results as CSV for use in a spreadsheet or the MIDI-like data format for further processing. For best results, use clean recordings with minimal reverb and background noise.

What is the MIDI-like export format?

The MIDI-like export produces a list of note events with MIDI note number, start time (seconds), duration (seconds), and velocity (derived from the note's amplitude). This format can be imported into DAWs, notation software, or custom scripts for further editing and arrangement.
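
As an illustration of what such an export might look like, the Python sketch below writes note events to CSV text. The column names, the input dictionary keys, and the linear mapping from a 0 to 1 amplitude onto MIDI velocity 1 to 127 are all assumptions made for this example, not the tool's documented format.

```python
import csv
import io
import math

def to_midi_number(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency (A4 = 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

def amplitude_to_velocity(amplitude):
    """Map an amplitude in the range 0..1 to a MIDI velocity in 1..127."""
    clamped = min(max(amplitude, 0.0), 1.0)
    return max(1, round(clamped * 127))

def export_midi_like_csv(notes):
    """Serialize note events to CSV text, one row per note."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["midi_note", "start_s", "duration_s", "velocity"])
    for note in notes:
        writer.writerow([
            to_midi_number(note["freq_hz"]),
            f'{note["start_s"]:.3f}',
            f'{note["duration_s"]:.3f}',
            amplitude_to_velocity(note["amplitude"]),
        ])
    return buffer.getvalue()
```

A single A4 held for half a second at full amplitude would come out as the row `69,0.000,0.500,127` under these assumptions.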

Is my audio data private and secure?

Absolutely. All note detection, spectrum analysis, and visualization run 100% in your browser using the Web Audio API. No audio is recorded, stored, transmitted, or uploaded to any server. Your microphone stream is processed in real time and never persisted.