Fix Poor Transcription Quality

If Hedy’s transcripts are full of errors (missed words, wrong proper nouns, garbled phrases), the cause is rarely the AI model itself. It’s almost always one of four things: a poor audio capture environment, the wrong microphone, the wrong language setting, or a provider that doesn’t match your use case. Here’s how to diagnose and fix each one, starting with the most common culprits.

First, Confirm the Basics

Before changing anything, check these:

Is the Meeting/Class Language set to the language you’re actually speaking? Settings > Profile > Language Preferences. The default speech recognition provider (Whisper) does not auto-detect language — it transcribes assuming the language you configured. If those don’t match, every word will be wrong. See Transcript Came Out in the Wrong Language.
Is the right microphone selected? Settings > Sessions > Microphone Settings. If you accidentally chose a Bluetooth headset that’s powered off or a USB mic that’s unplugged, Hedy records silence and the transcript comes out as garbage.

Most “low quality” complaints trace back to one of these two settings rather than to the speech recognition itself.

Improve the Audio Environment

Hedy doesn’t apply any client-side noise suppression, automatic gain control, or echo cancellation. The audio that goes into transcription is essentially what your microphone picks up. Cleaner audio in = cleaner transcript out.

Get the microphone closer to the speakers. For in-person meetings, a phone placed in the middle of a small table works for 4-5 people. For a large room, 8+ people, or noisy environments, use multiple devices or a dedicated conference mic.
Reduce background noise. Fans, air conditioning, kitchen appliances, traffic, and other people talking in the background all degrade accuracy. Close doors and windows. Turn off the fan if possible.
Avoid using the laptop’s microphone to record audio played through its own speakers. If you’re trying to capture a meeting or video that’s playing through laptop speakers (e.g., a video on YouTube), use the system audio capture features instead. See Hedy Isn’t Capturing Other Participants in Virtual Meetings.
Don’t talk over each other. Overlapping speech is the hardest case for any speech recognition system. Hedy’s diarization tries to separate speakers, but when multiple people speak at once, accuracy drops sharply.

Choose the Right Speech Recognition Provider

Hedy supports five speech recognition providers — three local, two cloud. You can see and change them at Settings > Speech & AI > Speech Recognition Options.

Provider	Type	Best for	Trade-off
Local Speech Recognition (Whisper) — default	Local	Privacy-sensitive use, working offline, broad language support	Slower than cloud on integrated graphics; uses configured meeting language (no auto-detect)
Local Speech Recognition (Parakeet) [Beta]	Local (Apple Silicon Macs and supported iPhone/iPad models)	Faster real-time transcription for English and major European languages	Beta; narrower language list than Whisper; may misidentify similar languages
Local Speech Recognition (Nemotron) [Beta]	Local (Apple Silicon Macs and supported iPhone/iPad models)	Faster real-time transcription with on-device speaker labels; has an English-only and a multilingual mode	Beta; identifies language from audio rather than your meeting language setting
Deepgram (requires your own API key)	Cloud	Cloud accuracy, multi-language auto-detect, large meetings	Requires Deepgram account and API key; not local
OpenAI (requires your own API key)	Cloud	Cloud accuracy, language auto-detection	Requires OpenAI account and API key; not local

If you’re using the default Whisper provider and accuracy isn’t good enough, try whichever of the following matches your situation:

On Apple Silicon Macs or supported iPhone/iPad models, for English or major European languages: try Parakeet or Nemotron. They run on Apple’s Neural Engine and are often faster and more accurate than Whisper for real-time transcription. Both are still in beta, and both identify the language from the audio, so watch for “similar language” misidentification (e.g., German vs. Dutch). For non-English meetings on Nemotron, use its Multilingual mode.
For multi-language meetings, accented speech, or noisy environments: try Deepgram (multi-language auto-detect) or OpenAI (auto-detect). Both require you to bring your own API key, but they typically outperform local models on hard audio.
If you need to stay offline or fully private and Whisper is slow on your hardware: see Fix Slow Transcription on Windows (GPU Settings) for the Windows-specific GPU acceleration fix, or move to Parakeet if you’re on Apple Silicon.
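
The guidance above can be read as a small decision procedure. The following Python sketch is only an illustration of that reasoning: the Situation fields and the suggest_providers function are hypothetical names invented here, not part of Hedy, and the ordering simply mirrors the list above.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical description of a setup; not a Hedy data structure."""
    apple_silicon: bool = False              # Apple Silicon Mac or supported iPhone/iPad
    english_or_major_european: bool = True   # language spoken in the meeting
    hard_audio: bool = False                 # multi-language, accented, or noisy
    must_stay_local: bool = False            # offline or privacy-sensitive use
    has_cloud_api_key: bool = False          # own Deepgram or OpenAI API key


def suggest_providers(s: Situation) -> list[str]:
    """Return provider names to try, following the order of the list above."""
    suggestions = []
    # Local beta providers need supported Apple hardware and a covered language.
    if s.apple_silicon and s.english_or_major_european:
        suggestions += ["Parakeet", "Nemotron"]
    # Cloud providers only make sense if audio may leave the device
    # and you have your own API key.
    if s.hard_audio and not s.must_stay_local and s.has_cloud_api_key:
        suggestions += ["Deepgram", "OpenAI"]
    # Otherwise stay on the default and tune it (e.g. GPU settings on Windows).
    if not suggestions:
        suggestions.append("Whisper")
    return suggestions


print(suggest_providers(Situation(apple_silicon=True)))
```

If nothing else applies, the sketch falls back to Whisper, which matches the advice to tune the default provider when you need to stay offline.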

Use Custom Vocabulary for Proper Nouns

If Hedy mis-transcribes names, technical terms, product names, or industry jargon, add them to Custom Vocabulary.

Open Hedy’s Settings
Go to Personalization > Custom Vocabulary > Manage Vocabulary Terms
Enter each term in “Enter a custom term…” and tap Add
Make sure Enable Custom Vocabulary is on

Custom Vocabulary feeds directly into the local Whisper transcription as a prompt, helping it recognize and spell domain-specific terms correctly. It also helps the transcript cleanup step (which runs across all providers, including Parakeet, Nemotron, Deepgram, and OpenAI) catch and fix mistakes.

Note: Custom Vocabulary has its strongest direct effect when you’re using the local Whisper provider. With Parakeet, Nemotron, Deepgram, and OpenAI, the speech recognizer itself doesn’t receive your vocabulary list as a prompt, but the cleanup step still uses it.
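
To make “as a prompt” more concrete: Whisper-style models can be given a short piece of text before decoding, which nudges them toward the spellings it contains. How Hedy formats that text is not documented here, so the Python sketch below is an assumption-laden illustration only. The build_vocabulary_prompt function and its max_chars cap are hypothetical, not Hedy’s actual behavior or limit.

```python
def build_vocabulary_prompt(terms, max_chars=800):
    """Join unique, non-empty terms into one comma-separated prompt string.

    max_chars is an arbitrary illustrative cap, not a documented Hedy limit.
    """
    seen = set()
    kept = []
    length = 0
    for term in terms:
        cleaned = " ".join(term.split())        # trim and collapse whitespace
        key = cleaned.casefold()                # case-insensitive duplicate check
        if not cleaned or key in seen:
            continue
        added = len(cleaned) + (2 if kept else 0)   # 2 accounts for ", "
        if length + added > max_chars:
            break                               # stop once the cap would be exceeded
        seen.add(key)
        kept.append(cleaned)
        length += added
    return ", ".join(kept)


print(build_vocabulary_prompt(["Hedy", " hedy ", "Kubernetes", "", "Deepgram"]))
```

The practical takeaway is the same either way: keep the list short and free of duplicates, and spell each term exactly as you want it to appear in the transcript.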

For a longer guide on building a good vocabulary list, see Custom Vocabulary Guide.

Fix Microphone Hardware Issues

If audio quality degrades mid-session or only certain speakers come through, suspect the hardware:

Bluetooth headset audio often degrades as the battery runs down or as you move farther from the device. See AirPods and Bluetooth Headphones Cutting Out.
USB microphones can suffer from cable issues. Try a different USB port or a different cable.
Built-in laptop mics are fine for one or two people sitting close to the keyboard. They’re not great for conference rooms.
Phones inside cases or under fabric can sound muffled.

A quick test: record a short voice memo with the same microphone in Voice Memos / Recorder / a similar simple app. If that recording sounds bad, the problem is the mic — not Hedy.
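
If you would rather measure than listen, the Python sketch below reports the peak and average (RMS) level of a test recording. It assumes you have exported or converted the recording to a 16-bit PCM WAV file, which many recorder apps do not produce by default. The wav_levels function is a hypothetical helper written for this article, not a Hedy tool.

```python
import array
import math
import sys
import wave


def wav_levels(path):
    """Return (peak, rms) of a 16-bit PCM WAV file, each scaled to 0.0-1.0."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)      # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()                  # WAV data is little-endian
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples) / 32768
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768
    return peak, rms
```

A peak very close to 0 suggests the microphone is capturing silence, for example because the wrong input is selected. A peak at or near 1.0 suggests clipping, which usually means the speaker is too close or the input gain is too high.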

Audio Format Hedy Uses

For reference, Hedy captures audio at 16 kHz, mono, 16-bit PCM, the standard for speech recognition. This format goes directly to local Whisper and Deepgram. For OpenAI Realtime, Hedy resamples to 24 kHz before sending (OpenAI’s required format). These formats are fine for speech but too limited for music or high-fidelity audio: a 16 kHz capture only preserves frequencies up to 8 kHz, and resampling to 24 kHz doesn’t restore what was lost. Don’t expect great results trying to transcribe songs.
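
To illustrate what resampling from 16 kHz to 24 kHz involves, here is a minimal Python sketch using linear interpolation. Hedy’s actual resampler is not described in this article, and production resamplers typically use band-limited filtering instead. The resample_linear function is a hypothetical example, not Hedy code.

```python
def resample_linear(samples, src_rate=16000, dst_rate=24000):
    """Resample a mono sample sequence by linear interpolation."""
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate               # position in source samples
        left = int(pos)                             # sample at or before pos
        right = min(left + 1, len(samples) - 1)     # clamp at the last sample
        frac = pos - left
        value = samples[left] * (1 - frac) + samples[right] * frac
        out.append(round(value))
    return out


# Four samples at 16 kHz become six samples at 24 kHz.
print(len(resample_linear([0, 300, 600, 900])))
```

Every output sample is computed from neighboring input samples, so the 24 kHz stream carries no more detail than the 16 kHz capture it came from.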

When to Escalate

If you’ve checked all of the above and accuracy is still poor:

Note the specific kind of error (wrong words, missed sections, wrong speaker attribution, total garbage)
Capture a 30-second sample where the error happens
Contact us through the chat widget with the sample and your provider/language setup

We can usually identify whether it’s environmental, configuration, or a provider issue.

Still having trouble? Contact us through the chat widget with your provider, your Meeting/Class Language setting, your device model, and a sample where the issue is visible.