When to use Vosk Small vs Vosk Large

The two models are not interchangeable. Here is an honest breakdown of when each is the right pick.

Finn Glas, Co-Founder + Engineering

April 22, 2026

1 min read

Key takeaways

Small: ~50 MB, runs in ~5 % of audio duration on a single CPU core, ~92 % accuracy on clean German.

Large: ~1.5 GB, runs in ~25 % of audio duration, ~96 % accuracy on the same audio.

The 4-point gap means Small makes roughly twice as many errors (~8 % vs. ~4 %), and it matters disproportionately for proper nouns and technical vocabulary.

The free plan ships Small only. Paid plans default to Large, with a one-click re-run on Small if you want speed.
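To make the trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It only restates the approximate figures from the takeaways above. The names `ModelProfile`, `SMALL`, `LARGE`, and `estimate` are made up for this illustration and are not part of Vosk or Sprachmemo, and reading "accuracy" as the share of correctly recognized words is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Approximate figures from the key takeaways (hypothetical container)."""
    name: str
    size_mb: int
    realtime_percent: int   # processing time as a percentage of audio duration
    accuracy_percent: int   # assumed to mean: share of words recognized correctly


SMALL = ModelProfile("small", 50, 5, 92)
LARGE = ModelProfile("large", 1500, 25, 96)


def estimate(profile: ModelProfile, audio_seconds: int, word_count: int):
    """Return (processing time in seconds, expected number of wrong words)."""
    processing_seconds = audio_seconds * profile.realtime_percent / 100
    expected_errors = round(word_count * (100 - profile.accuracy_percent) / 100)
    return processing_seconds, expected_errors


if __name__ == "__main__":
    # A 10-minute recording with roughly 1,000 spoken words.
    for profile in (SMALL, LARGE):
        seconds, errors = estimate(profile, audio_seconds=600, word_count=1000)
        print(f"{profile.name}: ~{seconds:.0f} s to process, ~{errors} words wrong")
```

Under these assumptions, a 10-minute memo costs about two extra minutes of processing on Large and saves about 40 corrections per 1,000 words.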

Use Small when

You're capturing a quick thought on the go, the kind of note you'd otherwise type into your phone as three words. The transcript is for you, not for publication. Speed matters more than accuracy. Most voice memos and journaling fall here.

Use Large when

The transcript will leave Sprachmemo. Interview that becomes an article. Meeting that becomes minutes the team relies on. Lecture that's the source for your exam prep. Anywhere a downstream reader will hold the text up against the audio, the 4-point gap is too embarrassing to ship. For interviews specifically, see how to transcribe a long German interview accurately.
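The rule of thumb from the two sections above can be written down as a small decision function. This is a sketch of the reasoning in this article, not Sprachmemo's actual selection logic; `pick_model` and its parameters are hypothetical names.

```python
def pick_model(plan: str, leaves_app: bool, speed_first: bool = False) -> str:
    """Suggest "small" or "large" following the guidance in this article.

    plan        -- "free" or "paid" (assumed labels for the two plan types)
    leaves_app  -- True if someone other than you will read the transcript
    speed_first -- True if a fast result matters more than accuracy
    """
    if plan == "free":
        # The free plan ships the small model only.
        return "small"
    if leaves_app:
        # Articles, minutes, exam notes: accuracy wins.
        return "large"
    # Personal notes: paid plans default to large, small is the fast option.
    return "small" if speed_first else "large"


if __name__ == "__main__":
    print(pick_model("paid", leaves_app=True))                     # interview
    print(pick_model("paid", leaves_app=False, speed_first=True))  # quick memo
```

Note that `leaves_app` takes priority over `speed_first`: a transcript other people will rely on gets Large even when you are in a hurry.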

Switch between models any time

Re-transcription is one click in the model picker on the recording row. Old runs aren't lost: the small-model transcript stays available alongside the large-model one in data.transcriptions. Useful if you want to compare the two or go back to the earlier transcript.
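If you export both transcripts, a word-level diff shows quickly where the two models disagree. The sketch below uses Python's standard `difflib`. It takes two plain strings because this article does not describe how records in data.transcriptions are laid out, and `changed_words` is a name invented for this example.

```python
import difflib


def changed_words(small_text: str, large_text: str):
    """Return (small, large) pairs for every stretch where the transcripts differ."""
    a = small_text.split()
    b = large_text.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side means words were inserted or dropped.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


if __name__ == "__main__":
    small = "termin mit frau maier am montag"
    large = "termin mit frau meyer am montag"
    for before, after in changed_words(small, large):
        print(f"small: {before!r}  large: {after!r}")
```

In practice the differences tend to cluster around names and jargon, which is a quick way to check whether the large-model run was worth it for a given recording.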

Try Sprachmemo

Free plan, no credit card. We host in Germany. You can export and delete everything self-serve.

Written by

Finn Glas

Co-Founder + Engineering

Finn is one of the Co-Founders. He owns the engineering side, the infrastructure, and most of the late-night fixes that ship before anyone notices.

finn.glas at aicuflow dot com