What AI model does VocalFuse use for transcription?

VocalFuse uses OpenAI Whisper Large (ggml-large-v3), running locally through whisper.cpp rather than through a cloud API or a smaller distilled model.

Which platforms support VocalFuse AI transcription?

VocalFuse runs on Windows 10+, macOS 12+, and Ubuntu 22.04+. NVIDIA GPU acceleration is optional via CUDA.

DESKTOP APP · v1.0.1

VocalFuse

Local AI transcription and AI note taking powered by OpenAI Whisper Large — dictation, meetings, and batch exports without sending audio to the cloud.

Built on whisper.cpp · Model file ggml-large-v3.bin · From $10/mo

<100 ms · Avg dictation latency

~3 GB · Whisper Large on disk

$10–15 · Monthly plans

100% local · Audio stays on-device

Read the docs · Create account

Windows 10+ · macOS 12+ · Ubuntu 22.04+ · CUDA optional

Speech to text at desk speed

Hold your hotkey, speak, and watch text land in Slack, Word, VS Code, or any focused window. VocalFuse runs inference on your GPU or CPU — no round-trip to a cloud API, no per-minute billing surprises.

✓ Always-on-top pill overlay with 16-bar live visualizer
✓ Drag grip, note-mode switch, settings & close controls
✓ Batch transcribe files to TXT, DOCX, SRT, VTT
✓ Pro adds AI note taker + email summaries

Always-on-top pill · drag grip to reposition

The VocalFuse desktop UI

Pixel-accurate replicas of the Windows client from VocalFuse_cpp — 248×38 pill overlay, six capture states, settings dialog, and system tray menu.

Pill overlay — all states

State	Pill content	Details
Idle	“Hold to speak”	Border #c4a04a
Note ready	“Click for notes”	Note switch on
Recording	16-bar visualizer	Border #e2c06a
Busy	“Working...”	Border #9a7a34
Note taking	Live meters	Pro batch capture
Note sending	“Sending notes...”	SMTP summary dispatch

Settings dialog

460px panel with Product License, Note Taking Mode (email SMTP), and Updates — matching settings.cpp.

Settings

Vocal Fuse

Product License

Required to use Vocal Fuse.

Product key

VF-7K2M-9QPL-4XHN

Product key verified.

Note Taking Mode

Record continuously in 3-minute batches until stopped.

Email Delivery

Sent via SMTP when a note session ends.

Send to

notes@company.com

SMTP server

smtp.company.com

Port

587

Username

smtp-user

Password

••••••••••••

Updates

Compare your installed version with the latest release.

Installed: 1.0.1

Latest: 1.0.1

You are up to date.

Control layout

Right-side controls: 36×16 note switch, settings gear, close button. Hover states shown below.

Hover — settings

Hold to speak

Hover — close

Hold to speak

System tray

Tray context menu: Settings, Minimize, Show, Close — from tray.cpp.

Layout reference: Pill body #451218, drag grip 18px, logo 18px, Segoe UI labels, note switch 36×16. See the full wireframes in VocalFuse docs.

How VocalFuse works

Subscribe on the web, install the desktop client, activate with your product key, and dictate anywhere.

Subscribe

Pick Basic ($10/mo) or Pro ($15/mo) on the pricing page. Stripe handles billing, and you can cancel anytime.

Download & install

Grab the installer from Downloads. On first launch, the client downloads ggml-large-v3.bin (~3 GB) if the model file is not already present.

Activate & configure

Paste your product key, set your hotkey, enable GPU acceleration, and position the pill overlay where you want it.

Dictate & export

Real-time speech to text into any app. Pro users batch-record meetings and export structured notes.
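The first-launch model check from the Download & install step can be pictured roughly as follows. This is a minimal sketch, not the client's actual logic: the function name `model_needs_download` and the 2 GiB sanity threshold are assumptions made for illustration, and only the file name `ggml-large-v3.bin` and the ~3 GB size come from this page.

```python
from pathlib import Path

MODEL_NAME = "ggml-large-v3.bin"

# Hypothetical sanity threshold: the page lists the model at ~3 GB, so a much
# smaller file is treated here as a truncated or interrupted download.
MIN_EXPECTED_BYTES = 2 * 1024**3


def model_needs_download(models_dir, min_bytes=MIN_EXPECTED_BYTES):
    """Return True if the model file is missing or suspiciously small."""
    model_path = Path(models_dir) / MODEL_NAME
    if not model_path.is_file():
        return True
    return model_path.stat().st_size < min_bytes
```

A caller would pass its models directory (for example the model path listed under System requirements) and start a download only when the function returns `True`.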

Built for professionals who can't send voice to the cloud

Real-time dictation

Global hotkey capture with partial results in under 100 ms on capable hardware. Text is injected into the focused field or copied to the clipboard.

OpenAI Whisper Large

The full Large weights, not a distilled model or a cloud API. Studio-grade accuracy for accents, jargon, and long-form narration.

Batch transcription

Drop podcasts, interviews, and lectures. Export subtitles and documents without uploading audio anywhere.
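To make the subtitle export concrete, here is a small sketch of how timed transcript segments map onto SRT and WebVTT text. It is an illustration only: the `Segment` class and the function names are hypothetical and are not taken from VocalFuse; the cue layout follows the general SRT and WebVTT conventions (comma versus dot before the milliseconds, and a `WEBVTT` header line).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span; times are in seconds (hypothetical structure)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds, decimal_marker=","):
    """Render seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{millis:03d}"


def to_srt(segments):
    """Build SRT text: numbered cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


def to_vtt(segments):
    """Build WebVTT text: a WEBVTT header, then cues with '.' milliseconds."""
    cues = ["WEBVTT\n"]
    for seg in segments:
        cues.append(
            f"{format_timestamp(seg.start, '.')} --> "
            f"{format_timestamp(seg.end, '.')}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

The two formats differ mainly in the header and the millisecond separator, which is why one timestamp helper can serve both exporters.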

AI note taker (Pro)

Capture up to 3 minutes per batch, structure meeting notes locally, and email summaries via SMTP.
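The batch-and-email flow described above could look something like the sketch below. Everything here is an assumption for illustration: the function names, the subject line, and the use of STARTTLS are not confirmed by this page. Only the 3-minute batch length, SMTP delivery, and the example port 587 come from the text. The code uses Python's standard `smtplib` and `email.message` modules.

```python
import smtplib
from email.message import EmailMessage

BATCH_SECONDS = 180  # 3-minute batches, as described for note-taking mode


def batch_windows(total_seconds, batch_seconds=BATCH_SECONDS):
    """Return (start, end) offsets in seconds that cover a session in batches."""
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + batch_seconds, total_seconds)
        windows.append((start, end))
        start = end
    return windows


def build_summary_email(sender, recipient, batch_notes):
    """Compose one plain-text email containing the notes from every batch."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Meeting notes ({len(batch_notes)} batches)"
    body = "\n\n".join(
        f"Batch {i}:\n{note}" for i, note in enumerate(batch_notes, start=1)
    )
    msg.set_content(body)
    return msg


def send_summary(msg, host, port, username, password):
    """Send the message over SMTP; STARTTLS on the given port is an assumption."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

For example, `batch_windows(400)` yields three windows (two full 3-minute batches and one 40-second remainder), and `send_summary` would be called once when the note session ends.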

Privacy by architecture

Microphone audio never leaves your machine. License checks use HTTPS; voice data does not.

Developer-friendly

License verification API, architecture docs, and local AI guides for teams embedding Fuse intelligence.

Basic vs Pro

Feature	Basic — $10/mo	Pro — $15/mo
OpenAI Whisper Large local dictation	✓	✓
Always-on-top pill widget	✓	✓
Hold-to-record hotkey	✓	✓
Batch file transcription (TXT, SRT, VTT, DOCX)	✓	✓
AI note taker mode	—	✓
3-minute batch meeting capture	—	✓
Email summaries via SMTP	—	✓
Priority updates	—	✓

Compare plans & subscribe

Who uses VocalFuse?

Developers & writers

Dictate commits, docs, and emails without leaving the IDE. Whisper Large handles technical vocabulary better than lightweight cloud models.

Legal & healthcare

Keep sensitive conversations on-device. No third-party transcription bot joins your calls — audio stays in your trust boundary.

Students & creators

Transcribe lectures and podcasts offline. Pro turns long sessions into structured notes you can export or email.

System requirements

Operating system	Windows 10/11 (64-bit), macOS 12+, Ubuntu 22.04+
Memory	16 GB RAM recommended for OpenAI Whisper Large
GPU	Optional NVIDIA GPU with CUDA for lowest latency
Storage	~3 GB for `ggml-large-v3.bin`
Model path	`C:\VocalFuse\models\` (Windows)
Permissions	Microphone; optional accessibility APIs for text injection
Account	Active VocalFuse subscription for downloads and license validation

Need tuning advice? Read the Local AI Guide for GPU setup, or the full VocalFuse documentation for install and troubleshooting.

Start dictating in minutes

VocalFuse FAQ