VocalFuse Documentation

Complete reference for installing, licensing, and using the VocalFuse desktop client — local dictation with OpenAI Whisper Large, batch transcription, and Pro AI note taking.

Version: v1.0.1 · Model: ggml-large-v3.bin · Platforms: Windows, macOS, Linux · Engine: whisper.cpp

1 · Account & plan

2 · Install

Download from Downloads, run the installer, allow microphone access.

3 · Dictate

Paste your product key, press Ctrl+Shift+Space, speak into any app.

Overview

VocalFuse is Fuse Intelligence's local-first speech-to-text desktop app. Audio is transcribed on your machine with OpenAI Whisper Large through whisper.cpp (VocalFuse.exe). Subscriptions, billing, and installers are managed through this website.

Basic ($10/mo): Unlimited local dictation, pill overlay, batch file transcription, product key & updates
Pro ($15/mo): Everything in Basic plus AI note taker mode, 3-minute batch recording, email summaries
Model: ggml-large-v3.bin (~3 GB) stored under C:\VocalFuse\models\
Privacy: Voice audio is processed locally — not uploaded for cloud transcription

New here? Follow the Getting Started walkthrough, or explore the VocalFuse product page for feature comparisons.

Requirements

Component	Minimum	Recommended
OS	Windows 10 64-bit, macOS 12, Ubuntu 22.04	Latest stable release with security patches
CPU	4-core x64	8+ cores for smoother partials
RAM	8 GB	16 GB for OpenAI Whisper Large
GPU	Not required	NVIDIA with CUDA for dictation latency under 100 ms
Disk	~3 GB free	SSD for faster model load
Network	HTTPS for license checks	Stable connection on first model download

Grant microphone permission at OS level. Some apps require accessibility permissions for direct text injection — clipboard paste is the fallback.
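As a rough pre-install check, the CPU and disk minimums from the table can be tested with the Python standard library. This is only a sketch and not part of VocalFuse: the helper name check_minimums and its parameters are illustrative, and RAM and GPU are left out because the standard library has no portable call for them.

```python
import os
import shutil


def check_minimums(path=".", min_cores=4, min_free_gb=3.0):
    """Return pass/fail results for the CPU and disk minimums.

    The default thresholds mirror the Minimum column above. The function
    is a hypothetical helper, not a VocalFuse API.
    """
    # os.cpu_count() reports logical cores and may return None.
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "cpu_ok": cores >= min_cores,
        "disk_ok": free_gb >= min_free_gb,
        "cores": cores,
        "free_gb": round(free_gb, 1),
    }


if __name__ == "__main__":
    print(check_minimums())
```

Pass the drive or folder where the model will be stored as path so the free-space figure refers to the right disk.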

Install

Sign in at Login with an active VocalFuse subscription.
Open Downloads and install the build for your platform.
On first launch, VocalFuse downloads ggml-large-v3.bin into C:\VocalFuse\models\ if the file is missing.
Launch VocalFuse.exe, sign in with your Fuse account, and enter your product key from Product Keys.
Configure your dictation hotkey and test with the pill overlay visible.

Downloads and license validation require an active plan. Manage billing at Subscriptions.
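If the first-launch download is interrupted, a quick check of the model file can tell a missing file from a partial one. The sketch below assumes a size threshold slightly under the documented ~3 GB; that threshold and the helper name model_status are illustrative, not an official checksum or tool.

```python
from pathlib import Path

MODEL_NAME = "ggml-large-v3.bin"  # file name from this documentation


def model_status(models_dir, min_bytes=2_500_000_000):
    """Classify the model file as 'missing', 'incomplete', or 'ok'.

    min_bytes is an assumed lower bound for a file documented as ~3 GB.
    """
    model = Path(models_dir) / MODEL_NAME
    if not model.is_file():
        return "missing"
    if model.stat().st_size < min_bytes:
        return "incomplete"
    return "ok"


if __name__ == "__main__":
    print(model_status(r"C:\VocalFuse\models"))
```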

Product key & license API

Your product key ties VocalFuse to your subscription tier. Copy it from Product Keys and paste it inside the desktop app on first run or after reinstall.

The client validates against the Fuse API on startup with an offline grace period when HTTPS is temporarily unavailable.

GET /api/license/verify.php?key=YOUR_LICENSE_KEY&product=vocalfuse

Returns JSON confirming validity, product slug, and plan entitlements.

{ "valid": true, "product": "vocalfuse", "plan": "pro", "expires_at": null }
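For scripting or support checks, the request URL and the response fields can be handled as in the sketch below. The base URL is a placeholder because this page does not state the API host, and the rule in has_pro (a valid key with plan "pro") is an assumption about how entitlements map to features.

```python
import json
from urllib.parse import urlencode


def build_verify_url(base_url, key, product="vocalfuse"):
    """Build the verify URL; base_url is a placeholder for the Fuse API host."""
    query = urlencode({"key": key, "product": product})
    return f"{base_url.rstrip('/')}/api/license/verify.php?{query}"


def parse_verify_response(body):
    """Extract the documented fields from the JSON response body."""
    data = json.loads(body)
    return {
        "valid": bool(data.get("valid", False)),
        "product": data.get("product"),
        "plan": data.get("plan"),
        "expires_at": data.get("expires_at"),  # null in the sample response
    }


def has_pro(info):
    """Assumed rule: a valid key on the 'pro' plan unlocks Pro features."""
    return info["valid"] and info["plan"] == "pro"


if __name__ == "__main__":
    sample = '{ "valid": true, "product": "vocalfuse", "plan": "pro", "expires_at": null }'
    info = parse_verify_response(sample)
    print(build_verify_url("https://example.invalid", "YOUR_LICENSE_KEY"))
    print(info["plan"], has_pro(info))
```

To query a live server, the URL can be fetched with urllib.request.urlopen(url, timeout=10) and the decoded body passed to parse_verify_response.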

If verification fails repeatedly, confirm your subscription is active and that firewalls allow outbound HTTPS to the Fuse API endpoint (/api/license/verify.php).
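The offline grace period mentioned above can be pictured as a simple decision rule. The sketch below is an illustration only: the three-day window and the function name license_allows_start are assumptions, since the actual grace length and client logic are not documented here.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=3)  # assumed length; the real value is not documented


def license_allows_start(online_result, last_success, now, grace=GRACE):
    """Decide whether the app may start.

    online_result: True or False when the API answered, None when HTTPS
                   was unreachable.
    last_success:  datetime of the last successful verification, or None.
    """
    if online_result is not None:
        return online_result  # a definite answer from the API always wins
    if last_success is None:
        return False  # never verified, so there is nothing to extend
    return now - last_success <= grace


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(license_allows_start(None, now - timedelta(days=1), now))
```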

Dictation modes

Mode	Plan	Description
Hold-to-record	Basic + Pro	Press and hold the global hotkey while speaking. Release to finalize and inject text into the focused application.
Toggle dictation	Basic + Pro	Optional setting: tap hotkey to start/stop continuous capture for longer passages.
Batch file drop	Basic + Pro	Drag audio files onto the app window for offline transcription and export.
AI note taker batch	Pro only	Record up to 3 minutes from the microphone, structure notes locally, optionally email a summary.

For best live dictation latency, enable GPU acceleration and close competing GPU workloads. See Performance tuning.
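The difference between hold-to-record and toggle dictation comes down to how key-down and key-up events change the capture state. The class below is a minimal sketch of that behavior; DictationController and its method names are hypothetical and do not describe VocalFuse internals.

```python
class DictationController:
    """Tracks whether audio capture is active for the two hotkey modes."""

    def __init__(self, mode="hold"):
        if mode not in ("hold", "toggle"):
            raise ValueError("mode must be 'hold' or 'toggle'")
        self.mode = mode
        self.recording = False
        self._pressed = False

    def key_down(self):
        if self._pressed:
            # Ignore OS auto-repeat while the key stays down.
            return self.recording
        self._pressed = True
        if self.mode == "hold":
            self.recording = True  # capture while the key is held
        else:
            self.recording = not self.recording  # each tap flips capture
        return self.recording

    def key_up(self):
        self._pressed = False
        if self.mode == "hold":
            self.recording = False  # release finalizes the utterance
        return self.recording
```

In hold mode the transcript is finalized on key_up; in toggle mode it is finalized on the second key_down.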

Hotkeys & defaults

Action	Default	Notes
Toggle / hold dictation	`Ctrl+Shift+Space`	Rebind in Settings → Hotkeys
Show / hide pill	Tray menu	Pill persists per-monitor position via drag grip
Paste fallback	`Ctrl+V`	Used when injection into target app is blocked
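The paste fallback in the table amounts to "try direct injection, then fall back to the clipboard". A minimal sketch of that order follows; the inject and paste callables are hypothetical stand-ins supplied by the caller, not VocalFuse functions.

```python
def deliver_text(text, inject, paste):
    """Try direct injection first, then fall back to clipboard paste.

    inject should return True on success; it may return False or raise
    OSError when the target app blocks injection.
    """
    try:
        if inject(text):
            return "injected"
    except OSError:
        pass  # treat an OS-level failure like a blocked injection
    paste(text)
    return "pasted"
```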

Pill overlay UI

The floating overlay uses the documented rv-pill component: draggable grip, logo, and level meters. Border color reflects capture state.

Idle: Ready. Press your hotkey to dictate.
Recording: Warm border. Audio is being captured locally.
Busy: Muted border. Transcription or injection is in flight.

Drag the grip to reposition. Placement is saved per monitor and restored on next launch.
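Per-monitor placement can be thought of as a small map from a monitor identifier to coordinates. The sketch below stores that map in a JSON file; the file layout, the monitor identifiers, and both function names are assumptions for illustration, not the format VocalFuse uses.

```python
import json
from pathlib import Path


def save_position(store, monitor_id, x, y):
    """Persist the pill position for one monitor in a JSON file."""
    path = Path(store)
    data = json.loads(path.read_text()) if path.exists() else {}
    data[monitor_id] = {"x": x, "y": y}
    path.write_text(json.dumps(data, indent=2))


def load_position(store, monitor_id, default=(0, 0)):
    """Return the saved (x, y) for a monitor, or a default if none exists."""
    path = Path(store)
    if not path.exists():
        return default
    pos = json.loads(path.read_text()).get(monitor_id)
    return (pos["x"], pos["y"]) if pos is not None else default
```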

Batch transcription

Drop supported audio files (WAV, MP3, M4A, FLAC depending on build) onto the VocalFuse window or use the batch panel from the tray menu. Inference runs locally with the same OpenAI Whisper Large weights as live dictation.

Add one or more files to the queue.
Select output format (see Exports).
Start processing — progress appears in the app shell; no upload step.
Open the output folder or copy transcript text into your workflow.

Pro tip: For meeting capture, use Pro batch recording from the microphone when you cannot record system audio directly. See Meeting transcription.
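The queue step can be sketched as filtering dropped files by extension and pairing each accepted file with an output path. The extension set below comes from the list above and may differ by build; plan_batch is a hypothetical helper that only plans the work and does not run any transcription.

```python
from pathlib import Path

SUPPORTED = {".wav", ".mp3", ".m4a", ".flac"}  # build-dependent per the text


def plan_batch(files, out_dir, out_ext=".txt"):
    """Split dropped files into (jobs, skipped).

    jobs is a list of (input_path, output_path) pairs. skipped holds files
    whose extension is not in SUPPORTED.
    """
    jobs, skipped = [], []
    for f in map(Path, files):
        if f.suffix.lower() in SUPPORTED:
            jobs.append((f, Path(out_dir) / (f.stem + out_ext)))
        else:
            skipped.append(f)
    return jobs, skipped


if __name__ == "__main__":
    jobs, skipped = plan_batch(["talk.wav", "notes.pdf"], "out", ".srt")
    print(jobs, skipped)
```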

Export formats

Format	Extension	Best for
Plain text	`.txt`	Quick paste into docs, tickets, chat
Word document	`.docx`	Formatted handoff to stakeholders
SubRip subtitles	`.srt`	Video editors, YouTube uploads
WebVTT	`.vtt`	HTML5 players, web captions
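The two subtitle formats differ mainly in the timestamp separator (comma in SubRip, period in WebVTT), the numbered cues in SubRip, and the WEBVTT header line. The sketch below renders both from a list of (start, end, text) segments; the segment shape and function names are assumptions, not the VocalFuse exporter.

```python
def _stamp(seconds, sep):
    """Format seconds as HH:MM:SS<sep>mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}{sep}{ms:03d}"


def to_srt(segments):
    """Render (start_sec, end_sec, text) segments as SubRip text."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{i}\n{_stamp(start, ',')} --> {_stamp(end, ',')}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments):
    """Render (start_sec, end_sec, text) segments as WebVTT text."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines += [f"{_stamp(start, '.')} --> {_stamp(end, '.')}", text, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    segs = [(0.0, 1.5, "Hello"), (61.25, 3661.0, "World")]
    print(to_srt(segs))
    print(to_vtt(segs))
```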

Settings panel

In-app preferences are grouped into account, hotkeys, inference, and Pro features. Web-side theme and profile settings live at Account Settings.

VocalFuse settings (example panel)

Account: License key ••••-••••-VF, shown as ✓ Verified, with a Manage on web link
Hotkeys: Toggle dictation, Ctrl + Shift + Space
GPU acceleration: CUDA / DX / Vulkan (build-dependent)
Model: OpenAI Whisper Large · C:\VocalFuse\models\
Pro: Note taker & email summaries

Setting	Location
Product key	Desktop → Account (synced from web)
Dictation hotkey	Desktop → Hotkeys
GPU / CPU threads	Desktop → Performance
AI note taker + SMTP	Desktop → Pro (requires Pro plan)
Theme & profile name	Web → Account Settings

AI note taking (Pro)

VocalFuse Pro extends dictation into structured capture: batch microphone recording (up to 3 minutes), local note organization, and optional email summaries via SMTP — while keeping audio off third-party transcription clouds.

Confirm your plan includes Pro on Subscriptions.
Enable Note taker in the desktop Settings panel.
Start a batch capture session from the tray menu or Pro panel.
Review structured notes, export, or send email summary if SMTP is configured.

Organize running notes during calls; paste into Slack, Docs, or your IDE from the overlay.
Prefer GPU acceleration during live narration for responsive partials.
Use batch file mode for pre-recorded lectures when you already have audio files.
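Two parts of the Pro workflow are easy to illustrate: the 3-minute cap on batch recording and the email summary. The sketch below uses Python's standard email library to build a plain-text message; the subject format, bullet layout, and function names are assumptions, and the actual sending step is left out.

```python
from email.message import EmailMessage

MAX_RECORDING_SECONDS = 180  # the documented 3-minute Pro batch limit


def clamp_duration(requested_seconds):
    """Cap a requested recording length at the 3-minute limit."""
    return max(0, min(requested_seconds, MAX_RECORDING_SECONDS))


def build_summary_email(sender, recipient, title, bullet_points):
    """Build a plain-text summary message; sending is left to the caller."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Notes: {title}"
    msg.set_content("\n".join(f"- {p}" for p in bullet_points))
    return msg


if __name__ == "__main__":
    note = build_summary_email(
        "me@example.com", "team@example.com", "Standup", ["Ship 1.0.1"]
    )
    print(note)
```

A message built this way can be handed to smtplib.SMTP(host, port) after starttls() and login(), using send_message(msg), with whatever SMTP server you have configured.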

Learn more on the AI note taker landing page.

Performance tuning

Symptom	Fix
Slow partials (>200 ms)	Enable GPU in Settings → Performance; close other GPU apps
High RAM usage	Ensure ggml-large-v3.bin is the only loaded model; restart after long sessions
CPU-only laptops	Reduce background apps; expect higher latency than CUDA builds
Long batch jobs	Process overnight; exports write to disk as each file completes

Deep dive: Local AI Guide · Architecture

Updates

Stable releases publish to Downloads when ready. User-visible changes are summarized on the changelog.

This documentation targets release 1.0.1 as served by the Fuse API. After updating, restart VocalFuse so license and model manifests refresh.

Troubleshooting

No text appearing at cursor: Confirm the destination app is focused. Grant accessibility permissions on macOS. Try clipboard paste fallback from the pill menu.
High latency or stuttering partials: Enable GPU acceleration, verify CUDA drivers, and reduce competing GPU workloads. See Performance tuning.
“Invalid key” or license errors: Renew on Subscriptions, re-copy the key from Product Keys, check HTTPS/firewall rules.
Model missing or failed download: Ensure ggml-large-v3.bin exists in C:\VocalFuse\models\. Re-run first-launch download or place the file manually, then restart VocalFuse.exe.
Pro features greyed out: Verify Pro (or admin) entitlements via license API response. Basic plans do not include AI note taker or email summaries.
Microphone not detected: Check OS privacy settings, select the correct input device in Sound settings, and make sure no other app holds exclusive mic access.

VocalFuse FAQ

What AI model does VocalFuse use for transcription?

VocalFuse uses OpenAI Whisper Large (ggml-large-v3) running locally through whisper.cpp — not a small cloud API or distilled model.

Which platforms support VocalFuse AI transcription?

VocalFuse runs on Windows 10+, macOS 12+, and Ubuntu 22.04+. NVIDIA GPU acceleration is optional via CUDA.