Free Video Captioner & Audiogram Maker Online

What is Free Video Captioner & Audiogram Maker Online?

Video Captioner & Audiogram Maker is a privacy-first browser tool that transcribes speech automatically, then either burns the captions into your video or turns an audio file into a shareable audiogram clip. It uses the Whisper base.en speech recognition model, executed in the browser with WebAssembly, so the full transcription pipeline runs locally on your device. The result is a downloadable WebM video with captions timed to the spoken words.

How it works

The tool decodes your media file with the browser's AudioContext API, downmixes and resamples the audio to 16 kHz mono Float32 samples, and passes them to a Web Worker running the Whisper base.en ONNX model. The model returns timestamped text segments, which the tool draws onto an HTML canvas in sync with media playback. On export, the browser's MediaRecorder API captures the canvas stream together with the audio track and saves the result as a WebM file. No server is involved at any stage.
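The audio preparation step can be sketched as two pure functions. This is a minimal illustration, not the tool's actual code: the names mixToMono and resampleLinear are hypothetical, and linear interpolation is an assumption chosen for brevity (it applies no anti-aliasing filter, so a production pipeline would more likely let the browser resample).

```typescript
// Target rate taken from the description above: 16 kHz mono Float32.
const TARGET_RATE = 16000;

/** Average all channels into a single mono channel (hypothetical helper). */
function mixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

/** Resample by linear interpolation. Sketch only: no low-pass filtering. */
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number = TARGET_RATE
): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}
```

In a browser, the inputs would typically come from a decoded AudioBuffer: one getChannelData(c) call per channel for the channel arrays, and the buffer's sampleRate for fromRate.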

Features & Benefits

Runs 100% in your browser: video and audio never leave your device
Four caption style presets: TikTok, Classic, Minimal, and Karaoke word-highlight
Inline transcript editor lets you fix any AI transcription mistake before exporting
Audiogram mode adds an animated waveform visualizer (bars, line, or blocks) over a dark gradient background
No account, no upload limits, no watermark
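
To illustrate the "bars" waveform style, here is one way to reduce a block of audio samples to a fixed number of bar heights. It is a sketch under assumptions: the function name barHeights is hypothetical, and taking the peak absolute value per bucket is one common choice, not necessarily what the tool does.

```typescript
/**
 * Split samples into barCount buckets and return the peak absolute
 * amplitude of each bucket, clamped to the range 0..1.
 */
function barHeights(samples: Float32Array, barCount: number): number[] {
  if (barCount <= 0) return [];
  const heights: number[] = new Array(barCount).fill(0);
  if (samples.length === 0) return heights;
  const bucketSize = samples.length / barCount;
  for (let b = 0; b < barCount; b++) {
    const start = Math.floor(b * bucketSize);
    // Every bucket covers at least one sample, even when bars outnumber samples.
    const end = Math.max(start + 1, Math.floor((b + 1) * bucketSize));
    let peak = 0;
    for (let i = start; i < end && i < samples.length; i++) {
      peak = Math.max(peak, Math.abs(samples[i]));
    }
    heights[b] = Math.min(1, peak);
  }
  return heights;
}
```

Each returned value can be multiplied by the maximum bar height in pixels before drawing a rectangle on the canvas.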

Frequently Asked Questions

What file formats are supported?

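Video: MP4, MOV, WebM, MKV, AVI. Audio: MP3, WAV, M4A. Because decoding happens in the browser, a file loads only if your browser can decode the codecs inside it. The tool auto-detects whether to show video frames or an audiogram background.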

Why does the first run download 74 MB?

The Whisper base.en model weights are downloaded once and cached in your browser. Later sessions load the model from that cache instead of downloading it again.

How long does export take?

Export records in real time using the browser's MediaRecorder API, so a 30-second video takes roughly 30 seconds to capture. A progress bar tracks the recording.

Can I edit the transcription?

Yes. Click any segment in the transcript panel to edit the text inline before exporting. Changes are reflected immediately on the canvas.
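
The edit-then-redraw behavior can be sketched with two small helpers. The CaptionSegment shape and both function names are assumptions for illustration; the tool's internal data model is not documented here.

```typescript
// Hypothetical shape for one timestamped segment (times in seconds).
interface CaptionSegment {
  start: number;
  end: number;
  text: string;
}

/** Return the segment whose time range contains the playback time, if any. */
function activeSegment(
  segments: CaptionSegment[],
  time: number
): CaptionSegment | undefined {
  return segments.find((s) => time >= s.start && time < s.end);
}

/** Return a new list with one segment's text replaced; timestamps are kept. */
function editSegment(
  segments: CaptionSegment[],
  index: number,
  text: string
): CaptionSegment[] {
  return segments.map((s, i) => (i === index ? { ...s, text } : s));
}
```

Because editSegment returns a new array rather than mutating the old one, the next canvas frame that calls activeSegment on the updated list picks up the corrected text without any extra bookkeeping.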

What is the output format?

The exported file is a WebM video (VP9 video, Opus audio). Most current browsers and many media players can play it; if a platform or device does not accept WebM, you can convert the file to MP4 offline.
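
Not every browser's MediaRecorder supports the same codecs, so a recorder setup usually probes for a supported type first. The sketch below is an assumption about how that could be done, not the tool's confirmed logic: the candidate list and the name pickRecorderType are hypothetical, and the support check is passed in as a function so the helper can run outside a browser.

```typescript
// Preferred container/codec strings, best first. VP9 + Opus matches the
// output format described above; the fallbacks are assumptions.
const CANDIDATE_TYPES: string[] = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/webm",
];

/** Return the first candidate the environment reports as supported. */
function pickRecorderType(
  isSupported: (type: string) => boolean
): string | undefined {
  return CANDIDATE_TYPES.find((type) => isSupported(type));
}
```

In a browser, the check would be (type) => MediaRecorder.isTypeSupported(type), and the chosen string would be passed as the mimeType option when constructing the MediaRecorder.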
