Local text-to-speech on Ubuntu using Kokoro TTS with fallbacks. Use when the user asks you to speak, say, read aloud, announce, or narrate text. Also use for TTS playback testing, switching Kokoro voices, adjusting speech volume, or debugging audio output issues. Triggers: "say this", "say hello", "read that back to me", "read aloud", "speak", "announce", "narrate", "tell me out loud", "hear it out loud", "text to speech", "TTS", "voice test". Do NOT use for: recording audio, transcription, speech-to-text, playing music/media files, or dictation input.
Install via the SkillsCat registry:
npx skillscat add ema93sh/ai-voice-stack/ai-say
AI Say
Local text-to-speech via ~/.local/bin/ai-say. Kokoro-first with flite and
speech-dispatcher fallbacks. Requires the voice stack to be installed first.
When to use this
Use this skill when the user wants text spoken aloud through their speakers:
- "say hello", "say this out loud"
- "read that back to me", "read this aloud"
- "speak", "announce", "narrate"
- "text to speech", "TTS"
- "voice test", "test audio output"
- "switch voice", "change TTS voice", "try a different voice"
- "it's too quiet", "adjust volume", "speech volume"
- "audio not working", "TTS broken", "no sound"
When NOT to use this
Do not activate for these requests — they belong to other tools:
- Recording or transcription — "transcribe this", "speech to text", "STT", "dictate"
- Media playback — "play this mp3", "play music", "open audio file"
- Audio hardware — "install audio drivers", "configure ALSA", "set up Bluetooth speaker"
- Dictation input — "type what I say", "voice input" (that's dictate-start/stop, not ai-say)
Prerequisites
The voice stack must be installed before using this skill:
git clone https://github.com/Ema93sh/ai-voice-stack.git
cd ai-voice-stack
scripts/voice-up-setup.sh --with-system-deps
This installs ai-say to ~/.local/bin/ and sets up Kokoro TTS.
Usage
Speak text:
~/.local/bin/ai-say "Hello world"
Pipe text:
echo "Read this aloud" | ~/.local/bin/ai-say
Temporary voice override:
AI_KOKORO_VOICE=am_michael ~/.local/bin/ai-say "Different voice"
Available Voices
English (American)
Female: af_alloy, af_aoede, af_bella, af_heart (default), af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky
Male: am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck, am_santa
English (British)
Female: bf_alice, bf_emma, bf_isabella, bf_lily
Male: bm_daniel, bm_fable, bm_george, bm_lewis
Other Languages
Spanish (ef_dora, em_alex), French (ff_siwis), Hindi (hf_alpha, hf_beta, hm_omega, hm_psi), Italian (if_sara, im_nicola),
Japanese (jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo),
Portuguese (pf_dora, pm_alex), Chinese (zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang)
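The voice IDs listed above appear to follow a two-letter prefix pattern: the first letter marks the language and the second marks female or male. The sketch below decodes that pattern, which can help catch a mistyped ID before it is assigned to AI_KOKORO_VOICE. The function name describe_voice is hypothetical, and the prefix meanings are inferred from this list rather than taken from a Kokoro specification.

```shell
# describe_voice: decode a voice ID such as "af_heart" into language and gender.
# Hypothetical helper; prefix meanings are inferred from the voice list above.
describe_voice() {
  local voice="$1" lang gender
  # Require two prefix letters, an underscore, and a non-empty name.
  case "$voice" in
    ??_?*) ;;
    *) echo "malformed voice ID: $voice" >&2; return 1 ;;
  esac
  # First letter: language.
  case "$voice" in
    a*) lang="English (American)" ;;
    b*) lang="English (British)" ;;
    e*) lang="Spanish" ;;
    f*) lang="French" ;;
    h*) lang="Hindi" ;;
    i*) lang="Italian" ;;
    j*) lang="Japanese" ;;
    p*) lang="Portuguese" ;;
    z*) lang="Chinese" ;;
    *) echo "unknown language prefix: $voice" >&2; return 1 ;;
  esac
  # Second letter: f for female, m for male.
  case "$voice" in
    ?f*) gender="female" ;;
    ?m*) gender="male" ;;
    *) echo "unknown gender prefix: $voice" >&2; return 1 ;;
  esac
  printf '%s, %s\n' "$lang" "$gender"
}
```

For example, describe_voice bm_george prints "English (British), male". The helper only checks the prefix shape; it does not confirm that the named voice is actually installed.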
Persistent Voice Change
Edit ~/.config/ai-audio.env:
export AI_KOKORO_VOICE="${AI_KOKORO_VOICE:-af_bella}"
Volume / Gain
If speech is too quiet:
export AI_KOKORO_GAIN_DB=18
~/.local/bin/ai-say "Volume test"
Diagnostics
Check stack health:
~/.local/bin/voice-status
Check audio devices:
pactl list short sinks
pactl list short sources
cat ~/.config/ai-audio.env
Test direct tone on a specific sink:
ffmpeg -hide_banner -loglevel error -f lavfi -i 'sine=frequency=880:duration=3' -f wav - | paplay --device='<sink-name>'
Subcommands
Install
Run the full voice stack installer. Finds the repo automatically or clones it:
bash scripts/install.sh --with-system-deps
Doctor
Check installation health — reports PASS/FAIL for every component:
bash scripts/doctor.sh
How to use ai-say (process)
Always use ~/.local/bin/ai-say as the entry point. Never call Kokoro,
flite, or spd-say directly — ai-say handles engine selection, chunking,
volume boost, and sink routing automatically.
- Pass text as an argument: ~/.local/bin/ai-say "text"
- Or pipe text: echo "text" | ~/.local/bin/ai-say
- Override voice with env var: AI_KOKORO_VOICE=am_fenrir ~/.local/bin/ai-say "text"
- For diagnostics, run ~/.local/bin/voice-status first, then bash scripts/doctor.sh
Do NOT:
- Call kokoro-synthesize.py or the Kokoro Python API directly
- Use spd-say, flite, espeak, or aplay directly
- Modify files under ~/.local/share/kokoro-tts/ unless troubleshooting
- Speak content the user did not ask to hear
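The three invocation forms above (argument, pipe, voice override) can be folded into one small wrapper so every call goes through ai-say. This is a minimal sketch, not part of the voice stack: the function name speak, its -v option, and the AI_SAY_BIN variable are all hypothetical. AI_SAY_BIN exists only so the wrapper can be pointed at a stub while testing.

```shell
# Hypothetical override for the entry point; defaults to the documented path.
AI_SAY_BIN="${AI_SAY_BIN:-$HOME/.local/bin/ai-say}"

# speak [-v VOICE] [TEXT...]
# With TEXT, passes it to ai-say as one argument; without TEXT, forwards stdin.
speak() {
  local voice=""
  if [ "${1:-}" = "-v" ]; then
    if [ "$#" -lt 2 ]; then
      echo "usage: speak [-v VOICE] [TEXT...]" >&2
      return 2
    fi
    voice="$2"
    shift 2
  fi
  (
    # Export the voice only inside this subshell so the override is temporary.
    if [ -n "$voice" ]; then
      export AI_KOKORO_VOICE="$voice"
    fi
    if [ "$#" -gt 0 ]; then
      "$AI_SAY_BIN" "$*"   # text passed as a single argument
    else
      "$AI_SAY_BIN"        # no text given: ai-say reads the piped input
    fi
  )
}
```

Typical calls would be speak "Build finished", speak -v am_fenrir "Build finished", or echo "Build finished" | speak. Because the wrapper never touches an engine directly, engine selection and fallbacks stay with ai-say.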
Definition of done
The skill is working correctly when:
- ~/.local/bin/ai-say "test" produces audible speech
- bash scripts/doctor.sh reports all checks PASS
- The agent used ~/.local/bin/ai-say (not a direct TTS engine call)
- Only text the user requested was spoken
Notes
- ai-say is Kokoro-first, with flite and spd-say fallback paths.
- Text is truncated at 1600 chars and chunked at 320 chars for reliable playback.
- Keep messages respectful and follow user intent exactly for spoken content.
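To make the truncation and chunking limits concrete, here is a rough illustration of what "truncate at 1600, chunk at 320" means for a long input. It is not ai-say's implementation: the function name chunk_text is hypothetical, and ai-say's real splitting rules (for example, whether it breaks at sentence boundaries) are not documented here.

```shell
# chunk_text TEXT [MAX_CHARS] [CHUNK_CHARS]
# Illustration only: cut TEXT to MAX_CHARS, then wrap into lines of at most
# CHUNK_CHARS, preferring to break at spaces (fold -s).
# head -c counts bytes, so the limits match characters only for ASCII text.
chunk_text() {
  local text="$1" max_chars="${2:-1600}" chunk_chars="${3:-320}"
  text="$(printf '%s' "$text" | head -c "$max_chars")"
  printf '%s\n' "$text" | fold -s -w "$chunk_chars"
}
```

With the default limits, at most five chunks come out of any input (1600 / 320 = 5), and anything past the first 1600 characters is dropped. A longer passage would therefore presumably need to be split across several ai-say calls to be heard in full.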