TalkThrough

Live captions and translation for any audio playing on your Mac. Captures system audio with ScreenCaptureKit, transcribes it on-device using macOS 26's SpeechTranscriber, and translates each line via DeepL — all rendered in a floating overlay window that stays on top of every Space.

Useful for watching foreign-language video or sitting in a meeting held in a language you don't speak.

Requirements

macOS 26.0 or newer (uses SpeechTranscriber, introduced in the macOS 26 Speech framework).
Xcode 26 or newer to build.
A free DeepL API key — sign up at https://www.deepl.com/pro-api.
An Apple Developer account (free tier is fine) for code signing during development.

Setup

git clone https://github.com/greg7gkb/TalkThrough.git
cd TalkThrough
cp Config/Secrets.example.xcconfig Config/Secrets.xcconfig
# Edit Config/Secrets.xcconfig and paste your DeepL API key
open TalkThrough.xcodeproj

Build and run from Xcode. The first launch will prompt for:

Speech Recognition — required to use SpeechTranscriber.
Screen & System Audio Recording — required for ScreenCaptureKit to capture audio playing through other apps.

The first transcription session also downloads the speech model for your locale (a few seconds to a minute, one-time).

Configuration

The target translation language is hard-coded to Spanish ("ES") in ContentView.swift for now. Change it to any DeepL target language code — "EN", "FR", "DE", "JA", etc. — to translate into something else. A picker is on the roadmap; see PLAN.md.
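
As a rough illustration of what replacing the hard-coded string might look like, here is a minimal sketch. The `TargetLanguage` enum and `defaultTarget` constant are hypothetical names that do not come from the project; the raw values are the DeepL codes mentioned above.

```swift
/// Hypothetical wrapper around DeepL target language codes (not project code).
enum TargetLanguage: String, CaseIterable {
    case spanish = "ES"
    case english = "EN"
    case french = "FR"
    case german = "DE"
    case japanese = "JA"
}

// The current hard-coded default, expressed through the enum.
let defaultTarget = TargetLanguage.spanish

// The raw value is the string that would be sent to DeepL as the target language.
print(defaultTarget.rawValue)
```

An enum like this would also give a future picker a fixed list to iterate over via `TargetLanguage.allCases`.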

How it works

ScreenCaptureKit ──audio buffers──▶ SpeechTranscriber ──text──▶ ViewModel
                                    (macOS 26 Speech                │
                                     framework)                     ▼
                                                                 DeepL
                                                                    │
                                                                    ▼
                                                            floating overlay

AudioCaptureEngine opens an SCStream on the main display, captures audio only (video is throttled to 1 fps at 2×2 pixels to keep ScreenCaptureKit happy), and forwards the audio as AVAudioPCMBuffers.
SpeechEngine wraps SpeechAnalyzer + SpeechTranscriber with the progressiveTranscription preset. It resamples the audio to whatever format the transcriber wants, runs the results stream, and emits a running transcript by tracking results keyed by audio time range.
LiveTranslateViewModel drives both engines and debounces translation calls: partials wait 500 ms in case more text arrives, but a max-wait clamp forces a call after 1 s so that continuous source audio doesn't starve the translator.
TranslationService is a thin DeepL REST client using header-based auth (Authorization: DeepL-Auth-Key).
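
The running-transcript bookkeeping described for SpeechEngine can be pictured with a small sketch. It assumes each result carries the start of its audio range and its text, and that a later result for the same range replaces the earlier one. `TranscriptTracker` and its members are hypothetical names, and plain seconds stand in for the media time ranges the real results carry.

```swift
/// Hypothetical sketch: keeps the latest text for each audio range and
/// rebuilds the running transcript in time order.
struct TranscriptTracker {
    /// Latest text seen for each range, keyed by the range's start in seconds.
    private var segments: [Double: String] = [:]

    /// Stores or overwrites the text for the range starting at `start`.
    mutating func update(start: Double, text: String) {
        segments[start] = text
    }

    /// Joins all segments in ascending start-time order.
    var transcript: String {
        segments.keys.sorted()
            .compactMap { segments[$0] }
            .joined(separator: " ")
    }
}

var tracker = TranscriptTracker()
tracker.update(start: 0.0, text: "Hello")
tracker.update(start: 1.5, text: "how are")
tracker.update(start: 1.5, text: "how are you")  // revised partial replaces the earlier one
print(tracker.transcript)
```

Keying by range is what lets a revised partial overwrite its earlier version instead of being appended as duplicate text.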

See PLAN.md for the prior phases, current state, and what's next.
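
The debounce with a max-wait clamp described for LiveTranslateViewModel can be sketched as a pure timing rule. This is an illustration under assumptions, not the project's code: `DebouncePolicy` and `fireTime` are hypothetical names, and the sketch assumes the 1 s clamp is measured from the first partial that is still waiting to be translated.

```swift
/// Hypothetical helper: decides when the next translation request should fire.
struct DebouncePolicy {
    /// How long to wait after the latest partial, in seconds.
    var quietPeriod: Double = 0.5
    /// Upper bound on the wait after the first pending partial, in seconds.
    var maxWait: Double = 1.0

    /// - Parameters:
    ///   - firstPending: arrival time of the first partial since the last request
    ///   - latest: arrival time of the most recent partial
    /// - Returns: the time at which the request should fire
    func fireTime(firstPending: Double, latest: Double) -> Double {
        // Whichever deadline comes first wins.
        min(latest + quietPeriod, firstPending + maxWait)
    }
}

let policy = DebouncePolicy()
// A single partial at t = 0: the quiet period decides.
print(policy.fireTime(firstPending: 0.0, latest: 0.0))
// Partials keep arriving (latest at t = 0.9): the max-wait clamp decides.
print(policy.fireTime(firstPending: 0.0, latest: 0.9))
```

Without the clamp, a speaker who never pauses for 500 ms would keep pushing the deadline back and no translation would ever be requested.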

Privacy and data

Audio capture is local: raw audio buffers are passed to Apple's SpeechTranscriber, which transcribes on-device on macOS 26+, and are then discarded.
Transcribed text is sent to DeepL over HTTPS for translation. Apart from the one-time speech model download noted under Setup, there are no other network calls.
Your API key lives only in Config/Secrets.xcconfig, which is gitignored.
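
To make the single outbound call concrete, here is a minimal sketch of how a header-authenticated DeepL request might be built. `makeTranslateRequest` is a hypothetical helper, not the project's TranslationService, and the sketch assumes a free-tier key (which uses the api-free.deepl.com host) and DeepL's JSON request body. It only builds the request and sends nothing.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Hypothetical helper: builds (but does not send) a DeepL translate request.
func makeTranslateRequest(text: String, targetLang: String, apiKey: String) throws -> URLRequest {
    // Assumption: free-tier key, so the api-free host is used.
    let url = URL(string: "https://api-free.deepl.com/v2/translate")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    // Header-based auth: the key never appears in the URL or the body.
    request.setValue("DeepL-Auth-Key \(apiKey)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    // Only the transcribed text and the target language go in the body.
    let body: [String: Any] = ["text": [text], "target_lang": targetLang]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    return request
}

let request = try makeTranslateRequest(text: "Hello", targetLang: "ES", apiKey: "YOUR_KEY")
print(request.value(forHTTPHeaderField: "Authorization") ?? "")
```

The request carries text only; no audio ever appears in it, which matches the data flow listed above.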

License

MIT — see LICENSE.
