VocalFuse Documentation

Complete reference for installing, licensing, and using the VocalFuse desktop client — local dictation with OpenAI Whisper Large, batch transcription, and Pro AI note taking.

Version 1.0.1 · Model: ggml-large-v3.bin · Platforms: Windows, macOS, Linux · Engine: whisper.cpp
1 · Account & plan

Register, then subscribe on Pricing (Basic $10/mo or Pro $15/mo).

2 · Install

Download from Downloads, run the installer, allow microphone access.

3 · Dictate

Paste your product key, press Ctrl+Shift+Space, speak into any app.

Overview

VocalFuse is Fuse Intelligence's local-first speech-to-text desktop app. Audio is transcribed on your machine by the desktop client (VocalFuse.exe), which runs OpenAI Whisper Large through whisper.cpp. Subscriptions, billing, and installers are managed through this website.

  • Basic ($10/mo): Unlimited local dictation, pill overlay, batch file transcription, product key & updates
  • Pro ($15/mo): Everything in Basic plus AI note taker mode, 3-minute batch recording, email summaries
  • Model: ggml-large-v3.bin (~3 GB) stored under C:\VocalFuse\models\
  • Privacy: Voice audio is processed locally — not uploaded for cloud transcription
New here? Follow the Getting Started walkthrough, or explore the VocalFuse product page for feature comparisons.

Requirements

Component | Minimum | Recommended
OS | Windows 10 64-bit, macOS 12, Ubuntu 22.04 | Latest stable release with security patches
CPU | 4-core x64 | 8+ cores for smoother partials
RAM | 8 GB | 16 GB for OpenAI Whisper Large
GPU | Not required | NVIDIA with CUDA for <100 ms dictation
Disk | ~3 GB free | SSD for faster model load
Network | HTTPS for license checks | Stable connection on first model download
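
As a rough pre-install check, the minimums in the table can be compared against a machine's specs. The sketch below is illustrative only: `MINIMUMS`, `shortfalls`, and `local_specs` are hypothetical names, not part of VocalFuse, and RAM is supplied by the caller because Python's standard library has no portable way to read it.

```python
import os
import shutil

# Minimums copied from the requirements table; the dict itself is illustrative.
MINIMUMS = {"cpu_cores": 4, "ram_gb": 8, "free_disk_gb": 3}

def shortfalls(cpu_cores, ram_gb, free_disk_gb):
    """Return the names of any specs that fall below the documented minimums."""
    actual = {"cpu_cores": cpu_cores, "ram_gb": ram_gb, "free_disk_gb": free_disk_gb}
    return [name for name, minimum in MINIMUMS.items() if actual[name] < minimum]

def local_specs(path=".", ram_gb=8):
    """Collect what the standard library can see; RAM is passed in by the caller."""
    cores = os.cpu_count() or 1          # logical cores, which may exceed physical cores
    free_gb = shutil.disk_usage(path).free / 1024**3
    return cores, ram_gb, free_gb

if __name__ == "__main__":
    missing = shortfalls(*local_specs())
    print("OK" if not missing else "Below minimum: " + ", ".join(missing))
```

An empty list means every checked spec meets the minimum. GPU and network are left out because neither is a hard requirement that a script can verify reliably.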

Grant microphone permission at OS level. Some apps require accessibility permissions for direct text injection — clipboard paste is the fallback.

Install

  1. Sign in at Login with an active VocalFuse subscription.
  2. Open Downloads and install the build for your platform.
  3. On first launch, VocalFuse downloads ggml-large-v3.bin into C:\VocalFuse\models\ if the file is missing.
  4. Launch VocalFuse.exe, sign in with your Fuse account, and enter your product key from Product Keys.
  5. Configure your dictation hotkey and test with the pill overlay visible.
Downloads and license validation require an active plan. Manage billing at Subscriptions.
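
To confirm that step 3 completed, you can check for the model file yourself. The following is a minimal sketch, not VocalFuse code: `model_status` and its `min_bytes` threshold are hypothetical, and the path shown is the Windows location named above.

```python
from pathlib import Path

MODEL_NAME = "ggml-large-v3.bin"

def model_status(models_dir, min_bytes=1):
    """Report 'missing', 'incomplete', or 'present' for the model file.

    min_bytes is an assumed threshold for spotting a truncated download;
    the documentation gives only an approximate size (~3 GB).
    """
    model_path = Path(models_dir) / MODEL_NAME
    if not model_path.is_file():
        return "missing"
    if model_path.stat().st_size < min_bytes:
        return "incomplete"
    return "present"

if __name__ == "__main__":
    print(model_status(r"C:\VocalFuse\models"))
```

A result of `missing` or `incomplete` suggests re-running the first-launch download before signing in.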

Product key & license API

Your product key ties VocalFuse to your subscription tier. Copy it from Product Keys and paste it inside the desktop app on first run or after reinstall.

The client validates against the Fuse API on startup with an offline grace period when HTTPS is temporarily unavailable.
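
The offline grace period can be pictured as a simple time-window check. This sketch is an assumption about how such logic might look, not the client's actual implementation: the three-day `GRACE_PERIOD` and the `license_usable` function are hypothetical, and the real grace period length is not documented here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value: the documentation does not state the grace period length.
GRACE_PERIOD = timedelta(days=3)

def license_usable(online_result, last_verified, now, grace=GRACE_PERIOD):
    """Decide whether to unlock the app.

    online_result: True/False from a completed check, or None if HTTPS failed.
    last_verified: time of the last successful check, or None if never verified.
    """
    if online_result is not None:
        return online_result              # the server answered, so trust it
    if last_verified is None:
        return False                      # never verified, nothing to fall back on
    return now - last_verified <= grace   # offline: allow only inside the window

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(license_usable(None, now - timedelta(hours=6), now))
```

The key design point is that a definite answer from the server always wins; the grace window applies only when no answer could be obtained.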

GET /api/license/verify.php?key=YOUR_LICENSE_KEY&product=vocalfuse

Returns JSON confirming validity, product slug, and plan entitlements.

{ "valid": true, "product": "vocalfuse", "plan": "pro", "expires_at": null }

If verification fails repeatedly, confirm your subscription is active and that firewalls allow outbound HTTPS to the Fuse API license endpoint (/api/license/verify.php).
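
For scripting or debugging, the request and response shown above can be handled with a few lines of Python. This is a sketch under stated assumptions: `build_verify_url` and `parse_entitlements` are hypothetical helpers, the base URL is a placeholder you must replace with your Fuse site address, and only the JSON fields shown in the sample response are used.

```python
import json
from urllib.parse import urlencode

def build_verify_url(base_url, key, product="vocalfuse"):
    """Build the documented GET URL; base_url is a placeholder supplied by the caller."""
    query = urlencode({"key": key, "product": product})
    return f"{base_url.rstrip('/')}/api/license/verify.php?{query}"

def parse_entitlements(body):
    """Extract the fields shown in the sample response from a JSON string."""
    data = json.loads(body)
    valid = data.get("valid") is True
    return {
        "valid": valid,
        "plan": data.get("plan"),
        # Only the "pro" plan value appears in the sample; other values are not assumed.
        "is_pro": valid and data.get("plan") == "pro",
        "expires_at": data.get("expires_at"),
    }

if __name__ == "__main__":
    sample = '{ "valid": true, "product": "vocalfuse", "plan": "pro", "expires_at": null }'
    print(build_verify_url("https://example.invalid", "YOUR_LICENSE_KEY"))
    print(parse_entitlements(sample))
```

Fetching the URL is left out on purpose so the sketch stays offline; any HTTPS client can perform the GET and pass the response body to `parse_entitlements`.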

Dictation modes

Mode | Plan | Description
Hold-to-record | Basic + Pro | Press and hold the global hotkey while speaking. Release to finalize and inject text into the focused application.
Toggle dictation | Basic + Pro | Optional setting: tap hotkey to start/stop continuous capture for longer passages.
Batch file drop | Basic + Pro | Drag audio files onto the app window for offline transcription and export.
AI note taker batch | Pro only | Record up to 3 minutes from the microphone, structure notes locally, optionally email a summary.
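
The difference between hold-to-record and toggle dictation is easiest to see as a small state machine. The sketch below is a conceptual model only, not the client's code: the `State` enum borrows the idle, recording, and busy states from the pill overlay, while `next_state` and the event names are hypothetical.

```python
from enum import Enum

class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    BUSY = "busy"   # transcription or injection in flight

def next_state(state, event, mode):
    """Advance the capture state.

    event is 'key_down', 'key_up', or 'done'; mode is 'hold' or 'toggle'.
    Unrecognized combinations leave the state unchanged.
    """
    if event == "done":
        return State.IDLE if state is State.BUSY else state
    if mode == "hold":
        if event == "key_down" and state is State.IDLE:
            return State.RECORDING
        if event == "key_up" and state is State.RECORDING:
            return State.BUSY       # releasing the hotkey finalizes the take
    elif mode == "toggle" and event == "key_down":
        if state is State.IDLE:
            return State.RECORDING
        if state is State.RECORDING:
            return State.BUSY       # a second tap stops capture
    return state
```

In hold mode the key release ends capture, while in toggle mode key releases are ignored and a second press ends it.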

For best live dictation latency, enable GPU acceleration and close competing GPU workloads. See Performance tuning.

Hotkeys & defaults

Action | Default | Notes
Toggle / hold dictation | Ctrl+Shift+Space | Rebind in Settings → Hotkeys
Show / hide pill | Tray menu | Pill persists per-monitor position via drag grip
Paste fallback | Ctrl+V | Used when injection into target app is blocked

Pill overlay UI

The floating overlay uses the documented rv-pill component: draggable grip, logo, and level meters. Border color reflects capture state.

Idle

Ready — press your hotkey to dictate.

Recording

Warm border — capturing audio locally.

Busy

Muted border — transcription or injection in flight.

Drag the grip to reposition. Placement is saved per monitor and restored on next launch.

Batch transcription

Drop supported audio files (WAV, MP3, M4A, FLAC depending on build) onto the VocalFuse window or use the batch panel from the tray menu. Inference runs locally with the same OpenAI Whisper Large weights as live dictation.

  1. Add one or more files to the queue.
  2. Select output format (see Exports).
  3. Start processing — progress appears in the app shell; no upload step.
  4. Open the output folder or copy transcript text into your workflow.
Pro tip: For meeting capture, use Pro batch recording from the microphone when you cannot record system audio directly. See Meeting transcription.
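
If you prepare batches with a script before dropping them onto the app, a pre-filter by extension avoids queuing unsupported files. This is a sketch with hypothetical helper names (`build_queue`, `output_path`); the extension set mirrors the formats listed above, and actual support depends on your build.

```python
from pathlib import Path

# Formats named in this section; actual support depends on the build.
SUPPORTED = {".wav", ".mp3", ".m4a", ".flac"}

def build_queue(paths):
    """Split candidate paths into (accepted, rejected) by file extension."""
    accepted, rejected = [], []
    for p in map(Path, paths):
        (accepted if p.suffix.lower() in SUPPORTED else rejected).append(p)
    return accepted, rejected

def output_path(audio_path, out_dir, extension=".txt"):
    """Name the transcript after the audio file, with the chosen export extension."""
    return Path(out_dir) / (Path(audio_path).stem + extension)

if __name__ == "__main__":
    ok, skipped = build_queue(["standup.WAV", "agenda.pdf", "lecture.m4a"])
    print([p.name for p in ok], [p.name for p in skipped])
```

The extension comparison is case-insensitive, so `standup.WAV` is accepted alongside lowercase names.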

Export formats

Format | Extension | Best for
Plain text | .txt | Quick paste into docs, tickets, chat
Word document | .docx | Formatted handoff to stakeholders
SubRip subtitles | .srt | Video editors, YouTube uploads
WebVTT | .vtt | HTML5 players, web captions
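
The two subtitle formats differ mainly in their header and timestamp separator: SubRip numbers each cue and writes milliseconds after a comma, while WebVTT starts with a `WEBVTT` line and uses a period. The sketch below illustrates that difference with hypothetical helpers (`to_srt`, `to_vtt`) operating on `(start_seconds, end_seconds, text)` tuples; it is not VocalFuse's exporter.

```python
def _stamp(seconds, sep):
    """Format seconds as HH:MM:SS<sep>mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}{sep}{ms:03d}"

def to_srt(segments):
    """SubRip: numbered cues, comma before milliseconds, blank line between cues."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{i}\n{_stamp(start, ',')} --> {_stamp(end, ',')}\n{text}\n")
    return "\n".join(blocks)

def to_vtt(segments):
    """WebVTT: 'WEBVTT' header, period before milliseconds, cue numbers optional."""
    cues = [f"{_stamp(s, '.')} --> {_stamp(e, '.')}\n{t}\n" for s, e, t in segments]
    return "WEBVTT\n\n" + "\n".join(cues)

if __name__ == "__main__":
    segments = [(0.0, 1.5, "Hello"), (61.25, 63.0, "World")]
    print(to_srt(segments))
    print(to_vtt(segments))
```

Choose .srt when the target is a video editor and .vtt when captions will be served to a browser player.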

Settings panel

In-app preferences are grouped into account, hotkeys, inference, and Pro features. Web-side theme and profile settings live at Account Settings.

VocalFuse settings (panel preview)

  • Account: ••••-••••-VF, ✓ Verified, Manage on web
  • Hotkeys: Ctrl + Shift + Space
  • GPU acceleration: CUDA / DX / Vulkan (build-dependent)
  • Model: OpenAI Whisper Large · C:\VocalFuse\models\
  • Pro: Note taker & email summaries

Setting | Location
Product key | Desktop → Account (synced from web)
Dictation hotkey | Desktop → Hotkeys
GPU / CPU threads | Desktop → Performance
AI note taker + SMTP | Desktop → Pro (requires Pro plan)
Theme & profile name | Web → Account Settings

AI note taking (Pro)

VocalFuse Pro extends dictation into structured capture: batch microphone recording (up to 3 minutes), local note organization, and optional email summaries via SMTP — while keeping audio off third-party transcription clouds.

  1. Confirm your plan includes Pro on Subscriptions.
  2. Enable Note taker in the desktop Settings panel.
  3. Start a batch capture session from the tray menu or Pro panel.
  4. Review structured notes, export, or send email summary if SMTP is configured.
  • Organize running notes during calls; paste into Slack, Docs, or your IDE from the overlay.
  • Prefer GPU acceleration during live narration for responsive partials.
  • Use batch file mode for pre-recorded lectures when you already have audio files.
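
For a sense of what an email summary over SMTP involves, the sketch below builds a plain-text message with Python's standard library. It is illustrative only: `build_summary_email`, the subject format, and the bulleted body layout are assumptions, not the format VocalFuse actually sends.

```python
from email.message import EmailMessage

def build_summary_email(sender, recipient, title, notes):
    """Assemble a plain-text summary; the subject and layout are hypothetical."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"VocalFuse notes: {title}"
    msg.set_content("\n".join(f"- {line}" for line in notes))
    return msg

if __name__ == "__main__":
    message = build_summary_email(
        "me@example.com", "team@example.com", "Standup",
        ["Ship 1.0.1", "Review hotkey rebinding"],
    )
    print(message)
    # Sending would go through smtplib with your own server details, for example:
    #   with smtplib.SMTP(host, port) as smtp:
    #       smtp.starttls()
    #       smtp.login(user, password)
    #       smtp.send_message(message)
```

Only the structured notes are placed in the message, which is consistent with audio staying on the local machine.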

Learn more on the AI note taker landing page.

Performance tuning

Symptom | Fix
Slow partials (>200 ms) | Enable GPU in Settings → Performance; close other GPU apps
High RAM usage | Ensure ggml-large-v3.bin is the only loaded model; restart after long sessions
CPU-only laptops | Reduce background apps; expect higher latency than CUDA builds
Long batch jobs | Process overnight; exports write to disk as each file completes

Deep dive: Local AI Guide · Architecture

Updates

Stable releases publish to Downloads when ready. User-visible changes are summarized on the changelog.

This documentation targets release 1.0.1 as served by the Fuse API. After updating, restart VocalFuse so license and model manifests refresh.

Troubleshooting

No text appearing at cursor
Confirm the destination app is focused. Grant accessibility permissions on macOS. Try clipboard paste fallback from the pill menu.
High latency or stuttering partials
Enable GPU acceleration, verify CUDA drivers, and reduce competing GPU workloads. See Performance tuning.
“Invalid key” or license errors
Renew on Subscriptions, re-copy the key from Product Keys, check HTTPS/firewall rules.
Model missing or failed download
Ensure ggml-large-v3.bin exists in C:\VocalFuse\models\. Re-run first-launch download or place the file manually, then restart VocalFuse.exe.
Pro features greyed out
Verify Pro (or admin) entitlements via license API response. Basic plans do not include AI note taker or email summaries.
Microphone not detected
Check OS privacy settings, correct input device in Sound settings, and that no other app holds exclusive mic access.

VocalFuse FAQ

What AI model does VocalFuse use for transcription?

VocalFuse uses OpenAI Whisper Large (ggml-large-v3) running locally through whisper.cpp — not a small cloud API or distilled model.

Which platforms support VocalFuse AI transcription?

VocalFuse runs on Windows 10+, macOS 12+, and Ubuntu 22.04+. NVIDIA GPU acceleration is optional via CUDA.