Free tool · runs in your browser

Video Transcriber

Turn your own recordings — interviews, talking-head videos, webinars, voice notes — into a punctuated transcript and subtitle files, entirely in your browser. Your file is never uploaded.

The AI model downloads once (about 75 MB) and is then cached. Long recordings take a while to process on a laptop.

A transcript is the raw material for a week of content

Once a recording is text, it stops being a single video and becomes a quarry. The transcript of one interview or webinar gives you the pull-quotes for graphics, the captions for clips, the show notes, the blog draft, and the thread — without re-watching anything. It’s the first step of the repurposing workflow: transcribe the core asset, then mine it. And before you post the clips, our video resizer reframes them for each platform.

How it works

How to transcribe a recording

Add a file, transcribe, export. The whole flow:

  1. Add your video or audio

     Drop in a recording you have: an interview, a talking-head video, a webinar, a voice note. It's read into the browser, never uploaded.

  2. Pick the language

     Choose the spoken language, or leave it on auto-detect for mixed or uncertain audio.

  3. Transcribe

     The AI model downloads once (~75 MB) and is cached; then it transcribes locally. Long recordings take a while on a laptop.

  4. Read and tidy the transcript

     The text comes back punctuated and paragraph-broken. Skim it and fix any names or jargon the model didn't know.

  5. Export what you need

     Plain text for repurposing, timestamped text to navigate, or SRT / VTT subtitle files for captions.

Which export for which job

Four ways to take the transcript out, and what each is for:

| Export | Best for | Timing |
| --- | --- | --- |
| Plain text | Show notes, blog drafts, and repurposing into posts | None |
| Timestamped text | Skimming to a moment or pulling a quote with its time | Inline [mm:ss] |
| SRT | Subtitles for most video editors and upload screens | Yes |
| VTT | Captions for web video and HTML5 players | Yes |
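
To make the difference between the four exports concrete, here is a minimal TypeScript sketch that builds each one from the same list of timed segments. It is an illustration only, not this tool's actual code: the `Segment` shape, the function names, and the sample data are all assumptions. The timing formats themselves are standard: SRT writes `hh:mm:ss,mmm` with numbered cues, while VTT writes `hh:mm:ss.mmm` under a `WEBVTT` header.

```typescript
// Hypothetical shape for one transcribed segment; times are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Zero-pad a non-negative integer to a fixed width.
function pad(value: number, width: number): string {
  let out = String(value);
  while (out.length < width) out = "0" + out;
  return out;
}

// mm:ss label for inline timestamps (minutes keep counting past 59).
function clockLabel(seconds: number): string {
  const total = Math.floor(seconds);
  return `${pad(Math.floor(total / 60), 2)}:${pad(total % 60, 2)}`;
}

// hh:mm:ss plus milliseconds; SRT separates them with "," and VTT with ".".
function cueTime(seconds: number, separator: "," | "."): string {
  const ms = Math.round(seconds * 1000);
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${separator}${pad(ms % 1000, 3)}`;
}

// Plain text: no timing, segments joined into running prose.
function toPlainText(segments: Segment[]): string {
  return segments.map((seg) => seg.text.trim()).join(" ");
}

// Timestamped text: one line per segment, prefixed with [mm:ss].
function toTimestampedText(segments: Segment[]): string {
  return segments
    .map((seg) => `[${clockLabel(seg.start)}] ${seg.text.trim()}`)
    .join("\n");
}

// SRT: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${cueTime(seg.start, ",")} --> ${cueTime(seg.end, ",")}\n${seg.text.trim()}\n`
    )
    .join("\n");
}

// VTT: "WEBVTT" header, then unnumbered cues separated by blank lines.
function toVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) =>
      `${cueTime(seg.start, ".")} --> ${cueTime(seg.end, ".")}\n${seg.text.trim()}\n`
  );
  return `WEBVTT\n\n${cues.join("\n")}`;
}

// Made-up sample data for illustration.
const sample: Segment[] = [
  { start: 0, end: 2.5, text: "Welcome to the show." },
  { start: 75.25, end: 78, text: "Let's talk pricing." },
];

console.log(toTimestampedText(sample));
console.log(toSrt(sample));
```

Keeping the segments as the single source and rendering each format on demand means a correction to a name or term only has to be made once before exporting.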

Frequently asked questions

Is my file uploaded to a server?

No. The audio is extracted and transcribed entirely in your browser by an AI model that downloads once and then runs on your device. Your file never leaves your computer, there's no account to create, and nothing is logged. You can transcribe confidential interviews or unreleased recordings safely.

Why is there a one-time download, and why can it be slow?

Because the transcription happens on your machine rather than a server, the speech model (around 75 MB) downloads to your browser the first time and is then cached. Running speech recognition in the browser is also genuinely heavy: a long recording can take several minutes on a laptop, while a short clip finishes quickly. That's the trade-off for keeping your audio private and the tool free.

Can I transcribe a YouTube video or paste a URL?

No — and that's deliberate. This tool only works on a file you already have, because the media is processed in your browser, not fetched from a server. It's built for transcribing your own recordings — interviews, videos, webinars, voice notes — not for pulling text from someone else's video.

How accurate is it?

Good on clear audio, and the transcript comes back punctuated with paragraph breaks. Clean, single-speaker recordings transcribe well; heavy background noise, strong accents, crosstalk, and specialist jargon are where errors creep in. Treat it as a strong first draft — always read it through and fix names, terms, and the odd misheard word before you publish.

What languages does it handle?

The model is multilingual and covers the major world languages, with auto-detect for when you're not sure or the audio is mixed. Accuracy is highest in English and the other widely spoken languages it was trained on most heavily. For the best result, pick the language explicitly when you know it.

What can I export?

Plain text for repurposing into posts and show notes, timestamped text for navigating to a moment, and SRT or VTT subtitle files for captioning. SRT is the safe default for most video tools and upload screens; VTT is for web players.
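
If you exported SRT and later need VTT for a web player, the two formats are close enough that one converts to the other mechanically. The sketch below is a hedged illustration, not part of this tool: `srtToVtt` is a hypothetical helper, and it assumes well-formed SRT with cue blocks separated by blank lines. It adds the `WEBVTT` header, drops the cue numbers, and swaps the comma before the milliseconds for a dot.

```typescript
// Convert SRT subtitle text to WebVTT text.
// Assumes well-formed SRT: cue blocks separated by blank lines, each with a
// numeric index line, a timing line, then one or more text lines.
function srtToVtt(srt: string): string {
  const blocks = srt
    .replace(/\r\n?/g, "\n") // normalise Windows and old Mac line endings
    .trim()
    .split(/\n{2,}/);

  const cues = blocks.map((block) => {
    const lines = block.split("\n");
    // Drop the SRT cue number; WebVTT cue identifiers are optional.
    if (lines.length > 1 && /^\d+$/.test(lines[0].trim())) {
      lines.shift();
    }
    // Only the timing line changes: "," becomes "." before the milliseconds.
    lines[0] = lines[0].replace(/(\d{2}:\d{2}:\d{2}),(\d{3})/g, "$1.$2");
    return lines.join("\n");
  });

  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}

// Made-up sample input for illustration.
const srtSample =
  "1\n00:00:00,000 --> 00:00:02,500\nHello, and welcome.\n\n" +
  "2\n00:01:15,250 --> 00:01:18,000\nLet's talk pricing.\n";

console.log(srtToVtt(srtSample));
```

The replacement runs only on each cue's timing line, so commas inside the spoken text are left alone.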