🔭 AI Tools Scout
Data updated every 6 hours

Best Open Source AI Audio Projects

19 audio projects ranked by GitHub stars, weekly growth, and maintenance health.

Project data last synced 13h ago.
| # | Project | Description | Category | Stars | Weekly | Health | Language | Updated |
|---|---------|-------------|----------|-------|--------|--------|----------|---------|
| 1 | Whisper | Robust Speech Recognition via Large-Scale Weak Supervision | 🎵 Audio | 99.3K | +370 | 56 | Python | 26d ago |
| 2 | Unsloth | Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. | 🎵 Audio | 64.0K | 0 | 72 | Python | 16h ago |
| 3 | Real Time Voice Cloning | Clone a voice in 5 seconds to generate arbitrary speech in real-time | 🎵 Audio | 59.7K | 0 | 42 | Python | 2mo ago |
| 4 | Coqui TTS | 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production | 🎵 Audio | 45.3K | +45 | 59 | Python | 21mo ago |
| 5 | Chattts | A generative speech model for daily dialogue. | 🎵 Audio | 39.2K | 0 | 52 | Python | 1mo ago |
| 6 | Mockingbird | 🚀 Clone a voice in 5 seconds to generate arbitrary speech in real-time | 🎵 Audio | 36.9K | 0 | 43 | Python | 2mo ago |
| 7 | Voicebox | The open-source AI voice studio. Clone, dictate, create. | 🎵 Audio | 25.2K | 0 | 72 | TypeScript | 15d ago |
| 8 | AudioCraft | Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning. | 🎵 Audio | 23.3K | +22 | 55 | Jupyter Notebook | 2mo ago |
| 9 | Voxcpm | VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning | 🎵 Audio | 18.2K | 0 | 55 | Python | 1d ago |
| 10 | Nemo | A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech) | 🎵 Audio | 17.2K | 0 | 67 | Python | 13h ago |
| 11 | Speechbrain | A PyTorch-based Speech Toolkit | 🎵 Audio | 11.5K | 0 | 53 | Python | 8d ago |
| 12 | Rtranslator | Open source real-time translation app for Android that runs locally | 🎵 Audio | 9.9K | 0 | 36 | C++ | 3d ago |
| 13 | Mlx Audio | A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon. | 🎵 Audio | 7.0K | 0 | 69 | Python | 6d ago |
| 14 | Whisperkit | On-device Speech AI for Apple Silicon | 🎵 Audio | 6.1K | 0 | 45 | Swift | 13h ago |
| 15 | Argmax Oss Swift | On-device Speech AI for Apple Silicon | 🎵 Audio | 6.1K | 0 | 45 | Swift | 13h ago |
| 16 | Maths Cs Ai Compendium | Become a cracked AI/ML Research Engineer | 🎵 Audio | 3.7K | 0 | 39 | TypeScript | 5d ago |
| 17 | TTS Webui | A single Gradio + React WebUI with extensions for ACE-Step, OmniVoice, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice, ParlerTTS, Stable Audio, MMS, StyleTTS2, MAGNet, AudioGen, MusicGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, and Bark! | 🎵 Audio | 3.1K | 0 | 44 | TypeScript | 7d ago |
| 18 | Youtube Shorts Pipeline | Automated YouTube Shorts pipeline: news → script → AI visuals → voiceover → captions → upload | 🎵 Audio | 2.0K | 0 | 31 | Python | 11d ago |
| 19 | Openless | Hold a key, speak, release: AI-polished text appears at your cursor in any app. Open-source voice input for macOS & Windows. (Hold the shortcut key and speak; release to get the polished text.) | 🎵 Audio | 1.2K | 0 | 75 | HTML | 19h ago |
