Speech agents

This page lists every AI agent in the MeshKore directory tagged with the Speech capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.
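As a rough illustration of the normalize-and-rank step described above, the sketch below deduplicates a few entries by name and sorts them by star count. The `Agent` record, its field names, the `source` values, and the tie-breaking rule are assumptions made for this example; the actual MeshKore worker is not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    # Hypothetical record; field names are assumptions, not the MeshKore schema.
    name: str
    stars: int
    source: str


def rank_by_stars(agents):
    """Keep the highest-starred record per name, then sort by stars descending."""
    best = {}
    for agent in agents:
        key = agent.name.lower()  # simple stand-in for normalization
        if key not in best or agent.stars > best[key].stars:
            best[key] = agent
    # Ties are broken alphabetically so the order is deterministic (an assumption).
    return sorted(best.values(), key=lambda a: (-a.stars, a.name.lower()))


sample = [
    Agent("faster-whisper", 22139, "github"),
    Agent("ChatTTS", 39069, "github"),
    Agent("leon", 17154, "github"),
    Agent("chattts", 120, "pypi"),  # made-up duplicate to show deduplication
]

ranked = rank_by_stars(sample)
for position, agent in enumerate(ranked, start=1):
    print(f"{position}. {agent.name} ({agent.stars:,} ★)")
```

Breaking ties alphabetically is only one possible choice; entries with equal star counts on this page may be ordered by some other rule.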

522 agents in this capability · ranked by popularity

Top 200 Speech agents

ChatTTS · 39,069 ★

A generative speech model for daily dialogue.

faster-whisper · 22,139 ★

Faster Whisper transcription with CTranslate2

leon · 17,154 ★

🧠 Leon is your open-source personal assistant.

AudioGPT · 10,188 ★

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Awesome-Prompt-Engineering · 5,743 ★

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative…

ml-road · 4,711 ★

Machine Learning and Agentic AI Resources, Practice and Research

speech-to-speech · 4,651 ★

Build local voice agents with open-source models

WhisperLive · 3,953 ★

A nearly-live implementation of OpenAI's Whisper.

auto-subs · 3,080 ★

Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.

whisper-standalone-win · 2,974 ★

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

speechgpt · 2,759 ★

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

openwhispr · 2,416 ★

Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and…

awesome-whisper · 2,257 ★

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

alan-sdk-ios · 1,892 ★

The Self-Coding System for Your App — Alan AI SDK for iOS

NLP-Models-Tensorflow · 1,782 ★

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0

openai-edge-tts · 1,775 ★

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

pluely · 1,773 ★

The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly…

alan-sdk-flutter · 1,772 ★

The Self-Coding System for Your App — Alan AI SDK for Flutter

react-simple-chatbot · 1,757 ★

💬 Easy way to create conversation chats

alan-sdk-ionic · 1,664 ★

The Self-Coding System for Your App — Alan AI SDK for Ionic

ElatoAI · 1,459 ★

Realtime Voice AI on Arduino ESP32 with OpenAI Realtime, Gemini, Grok, Eleven Labs with >15 minutes…

Dragonfire · 1,406 ★

the open-source virtual assistant for Ubuntu based Linux distributions

alan-sdk-cordova · 1,142 ★

The Self-Coding System for Your App — Alan AI SDK for Cordova

AI-Waifu-Vtuber · 1,065 ★

AI Vtuber for Streaming on Youtube/Twitch

whisper-writer · 1,036 ★

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

Whisperboard · 1,015 ★

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

Foundation-Models-Framework-Example · 969 ★

Example apps for Foundation Models Framework in iOS 26 and macOS 26

Transformers-for-NLP-2nd-Edition · 959 ★

Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and…

voquill · 845 ★

The open source wisprflow alternative

local-talking-llm · 838 ★

A talking LLM that runs on your own computer without needing the internet.

whisper-playground · 831 ★

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

sokuji · 816 ★

Live speech translation powered by on-device AI and cloud providers — OpenAI, Google Gemini, Palabra.ai…

use-whisper · 784 ★

React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in

SwiftWhisper · 776 ★

🎤 The easiest way to transcribe audio in Swift

whisper.rn · 757 ★

React Native binding of whisper.cpp.

ttsfm · 713 ★

TTSFM mirrors OpenAI's TTS service, providing a compatible interface for text-to-speech conversion with…

whisper.unity · 708 ★

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

BabelDuck · 687 ★

Beginner-friendly AI conversation practice application

whisper_android · 644 ★

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

gpt-home · 637 ★

ChatGPT at home! A better alternative to commercial smart home assistants, built on the Raspberry Pi using…

speech-to-text · 615 ★

Real-time transcription using faster-whisper

alan-sdk-reactnative · 582 ★

The Self-Coding System for Your App — Alan AI SDK for React Native

chatterbox-tts-api · 576 ★

Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate voice cloned…

ollama-voice-mac · 517 ★

Mac compatible Ollama Voice

JARVIS-ChatGPT · 444 ★

A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM…

alan-sdk-pcf · 430 ★

The Self-Coding System for Your App — Alan AI SDK for Power Apps

Stream-Omni · 385 ★

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across…

potato · 377 ★

potato: the portable annotation tool

edgen · 372 ★

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs…

insanely-fast-whisper-api · 348 ★

An API to transcribe audio with OpenAI's Whisper Large v3!

dograh · 340 ★

Open Source Voice Agent Platform

Webscout · 338 ★

Webscout is the all-in-one search and AI toolkit you need. Discover insights with Yep.com, DuckDuckGo, and…

llm · 334 ★

A powerful Rust library and CLI tool to unify and orchestrate multiple LLM, Agent and voice backends (OpenAI…

whisper-website · 323 ★

Simple self-hosted web application, which can be used to convert audio to subtitles by OpenAI's Whisper model

openai-chat-api-workflow · 319 ★

🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT models 🤖💬 It also allows image…

ChatGPT-OpenAI-Smart-Speaker · 311 ★

This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice…

RuntimeSpeechRecognizer · 303 ★

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI…

gpt-voice-conversation-chatbot · 302 ★

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while…

tetos · 278 ★

A unified interface for multiple Text-to-Speech (TTS) providers.

MITSUHA · 273 ★

World's First Multilingual Inexpensive Therapeutic Sophisticated Ultra-responsive Holographic Agent. In…

ai_webui · 269 ★

AI-WEBUI: A universal web interface for AI creation; an easy-to-use AI tool for processing images, audio, and video

DB-GPT-Web · 265 ★

DB-GPT WebUI, LLM to vision.

Stage-Whisper · 258 ★

The main repo for Stage Whisper — a free, secure, and easy-to-use transcription app for journalists, powered…

project-raven · 254 ★

Open-source AI meeting copilot - real-time transcription, echo cancellation, and AI assistance. Captures…

react-native-chatbot · 253 ★

💬 Easy way to create conversation chats

keras-llm-robot · 252 ★

A web UI project for learning about large language models. This project includes features such as chat…

sepia-docs · 251 ★

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section…

openclaw.net · 246 ★

Self-hosted OpenClaw gateway + agent runtime in .NET (NativeAOT-friendly)

openclaw-assistant · 242 ★

OpenClaw voice assistant app for Android - Wake word activation & system assistant integration

SpotifyTranscripts · 217 ★

🎙️ AI generated subtitles and segmented chapters for podcasts

baibot · 213 ★

🤖 A Matrix bot for using different capabilities (text-generation, text-to-speech, speech-to-text…

amazon-sumerian-hosts · 209 ★

Amazon Sumerian Hosts (Hosts) is an experimental open source project that aims to make it easy to create…

nodejs-whisper · 202 ★

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

gdansk-ai · 199 ★

Full stack voice chatbot

chatbot-watson-android · 198 ★

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

BentoChain · 194 ★

A voice-enabled chatbot application built using of 🦜️🔗 LangChain, text-to-speech, and speech-to-text models…

Dictate · 194 ★

A powerful Whisper AI keyboard for reliable speech transcription

voqal · 191 ★

Voice native AI agent for the builders of tomorrow

samantha-os1-openai-realtime · 186 ★

Samantha OS1 is a conversational AI assistant powered by the Realtime API from OpenAI

openai_tts · 186 ★

Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible endpoint to…

jarvis-ai-assistant · 177 ★

Voice-activated AI assistant with speech recognition and NLP. Automate tasks effortlessly with this…

zai-tts · 175 ★

🗣️ ZAI/GLM TTS to OpenAI Speech API: a free speech synthesis API with voice cloning support, based on Zhipu TTS

BlahST · 172 ★

Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak…

ospeak · 171 ★

CLI tool for running text through OpenAI Text to speech

sayna · 167 ★

Sayna is a unified Voice Layer for AI Agents with seamless integration into existing agentic frameworks

ai · 167 ★

Official one-stop shop for AI Agents and developers building with Telnyx.

web-whisper · 165 ★

OpenAI's Whisper Audio to text transcription right into your web browser! An open source AI subtitling suite.

kobold_assistant · 161 ★

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using…

aidialer · 160 ★

A full stack app for interruptible, low-latency and near-human quality AI phone calls built from stitching…

whitelightning · 149 ★

WhiteLightning distills massive, state-of-the-art language models into lightweight, hyper-efficient text…

whisper-clip · 137 ★

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly…

PersonalAssistantChatbot · 133 ★

A personal assistant chatbot capable of performing many of the same tasks as Google Assistant, plus more extra…

Auto-Subtitled-Video-Generator · 130 ★

Input a YouTube video link or upload a video file and get a video with subtitles.

cerul · 122 ★

The video search layer for AI agents. Search video by meaning — across speech, visuals, and on-screen text.

WhatsappAPI · 121 ★

A simple API to integrate chatbots written in JavaScript with WhatsApp Web 💬📲 (Store…

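Unitale · 120 ★

An AI audiobook production tool based on Indextts and Qwen3TTS. Uses an LLM to automatically break down scripts and recognize emotions, integrating multi-character TTS…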

PodAgent · 120 ★

PodAgent: A Comprehensive Framework for Podcast Generation

template-repo · 119 ★

Agent orchestration & security template featuring MCP tool building, agent2agent workflows, mechanistic…

uttertype · 118 ★

Short code for dictation using OpenAI Whisper for transcription.

whisper-to-input · 118 ★

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and input the recognized text…

MisterWhisper · 113 ★

Push to talk voice recognition using Whisper

workersai · 110 ★

Full-stack AI chat platform built on Cloudflare using Workers, Durable Objects, KV, and AI Gateway. Features…

speechless · 110 ★

LLM based agents with proactive interactions, long-term memory, external tool integration, and local…

awesome-openai-whisper · 105 ★

A curated list of awesome OpenAI's Whisper

InsightSolver-Colab · 102 ★

InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine…

LLMChat · 101 ★

A Discord chatbot that supports popular LLMs for text generation and ultra-realistic voices for voice chat.

go-whisper · 101 ★

Speech to Text using a Docker image with ggerganov/whisper.cpp

gptspeaker · 99 ★

The ChatGPT/DeepSeek Voice Assistant uses a Raspberry Pi (or desktop) to enable spoken conversation with…

speech-rest-api · 99 ★

Transcription and TTS Rest API (OpenAI Whisper, Speechbrain)

JARVIS-AI-ASSISTANT · 98 ★

A true Artificial Intelligent Assistant with ALICE as backend and offline speech recognition with vosk engine…

openclaw-voice · 95 ★

🦞 Open-source browser-based voice chat for AI assistants. Self-hosted, private, free. Whisper STT +…

ios-chatbot · 94 ★

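laibot-client · 94 ★

Open-source AI: build voice dialogue robots, smart speakers… on open-source software and hardware. Human-machine dialogue, natural interaction: Laibot (来宝) has unlimited possibilities. Note that Laibot runs on Python 3!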

ChatGPT-voice-control · 94 ★

Voice control for ChatGPT. Talk to ChatGPT and hear ChatGPT's responses in a natural voice.

Chiku · 93 ★

A modern AI chatbot with chat, image generation, and text-to-speech features, designed for a smooth and…

simulflow · 93 ★

A Clojure library for building real-time voice-enabled AI Agents. Simulflow handles the orchestration of…

unspeech · 92 ★

🗣️🔊 Your Text-to-Speech Services, All-in-One.

ha-openai-whisper-stt-api · 91 ★

HACS custom integration for using Whisper speech-to-text (OpenAI, GroqCloud or Mistral) API in the Assist…

Talk2GPT · 90 ★

GPT-3 client for Windows and Unix with memories management that supports both text and speech in any…

pywhisper · 90 ★

openai/whisper + extra features

achatbot · 89 ★

An open source chat bot architecture for voice/vision (and multimodal) assistants, local(CPU/GPU bound) and…

qvac · 89 ★

QVAC - Local AI SDK and libraries for building private, cross-platform, peer-to-peer AI applications. Run…

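chatgpt_android · 88 ★

ChatGPT for Android: a personalized AI that only needs an API key configured locally. Chat history is stored locally. To try the voice version, download the commercial edition or integrate the Azure Speech SDK yourself (paid, with a free quota currently offered).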

realtime-interview-copilot · 88 ★

Realtime Interview Copilot is a web application that assists users in crafting responses during interviews…

NOVA-NodeJS · 86 ★

NOVA is a customizable voice assistant made with Node.js.

SpeechAgents · 85 ★

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

asktube · 84 ★

AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation…

J.A.R.V.I.S · 83 ★

Iron man inspired Personal virtual assistant

OpenAI-Text-To-Speech-for-Unity · 82 ★

Implementation of OpenAI's Text-To-Speech in Unity. Synthesize any text and play it via any AudioSource.

openai-whisper-api · 80 ★

A sample speech transcription app implementing OpenAI Speech to Text API based on Whisper, an automatic…

Awesome-Multimodal-Chatbot · 79 ★

Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize…

Whisper_to_ChatGPT · 77 ★

Chrome extension for voice-to-text conversations with ChatGPT using OpenAI Whisper API

AmigaGPT · 75 ★

AmigaOS 3.1/4.1 and MorphOS application for chatting with ChatGPT or generating images

whisper-openai-gradio-implementation · 75 ★

Whisper is an automatic speech recognition (ASR) system Gradio Web UI Implementation

lycoris · 75 ★

Real-time speech recognition & AI-powered note-taking app for macOS with offline/online modes, multilingual…

ttv-chat-bot · 73 ★

Twitch livestream bot that can control colors for overlays from Stream Elements, play sound effects, handle…

WatBot · 72 ★

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with…

Echo · 71 ★

Production-ready audio and video transcription app that can run on your laptop or in the cloud.

svelte-openai-realtime-api · 70 ★

svelte component for using the openai realtime api

computing-Korean-STT-error-rates · 69 ★

A Python function package that computes the character error rate (CER) and word error rate (WER) of output transcripts from Korean STT sentence recognizers

AI-Voice-assistant · 67 ★

AI Voice Assistant: Talk to an AI agent that helps you with event scheduling, contact management, accessing…

OpenAI_Whisper_ASR · 67 ★

A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the…

IntelliJava · 64 ★

Integrate with the latest language models, image generation, speech, and deep learning frameworks like…

VRCTextboxSTT · 63 ★

A SpeechToText application that uses OpenAI's whisper via faster-whisper to transcribe audio and send that…

speechdigest · 62 ★

Audio to summary with openAI Whisper & GPT 3.5/4 using streamlit

web-ai-toolkit · 62 ★

The Web AI Toolkit is a powerful, privacy-first JavaScript library that brings advanced AI capabilities…

Voice_ChatBot · 60 ★

Chatbot in Russian with speech recognition using PocketSphinx and speech synthesis using RHVoice. The…

OpenAI-TTS-Gradio · 59 ★

Use OpenAI TTS(Text to Speech) API with Gradio

AgentOS2-Live · 58 ★

AgentOS2-Live by OrionStar — an end-to-end real-time voice interaction solution based on the Realtime API. No…

swiftube-frontend · 57 ★

It's like ChatGPT for videos.

ai-powered-speech-analytics-for-amazon-connect · 56 ★

The AI Powered Speech Analytics for Amazon Connect solution provides the combination of speech to text…

gpt_chatbot · 55 ★

This chatbot lets you use your microphone to communicate with GPT-4. It uses the OpenAI text to speech to…

adk-mcp-a2a-crash-course · 52 ★

This project demonstrates a multi-agent system using Google's Agent Development Kit (ADK), Agent2Agent (A2A)…

Voice-Chat-Bot · 51 ★

Real-time AI ChatBot and voice-enabled AI VoiceBot using Deepgram (STT ↔ TTS) and Groq LLM for natural…

whisper.cpp_windows · 51 ★

Just an .exe that can be used for those unable to build whisper.cpp in Windows.

DigitalLife · 49 ★

A "digital life" with long-term memories and a Live2D body

deepgram-voice-agent-demo · 49 ★

Demo for Deepgram Voice Agent API

streamlit_whisper_transcription · 48 ★

Streamlit Audio Transcription with OPENAI's Whisper Ai: An interactive Streamlit app demonstrating real-time…

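kuon · 47 ★

KUON (久远): a large language model voice assistant under development, currently focused on ease of use and a simple start, with support for selective conversation memory and Model Context Protocol (MCP) services. KUON: A large language model-based…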

Unity-QuestConversationalAI · 45 ★

Unity packages for real-time conversational AI with speech-to-speech capabilities. Integrates OpenAI and…

live-interview · 44 ★

Chatbot with a 3D avatar that can answer interview questions on your behalf. It can speak and understand…

MeuxCompanion · 44 ★

A self-hosted AI companion web app with anime-style Live2D and VRM characters. Talk with your companion via…

JARVIS-AI-Assistant · 44 ★

JARVIS AI Assistant 🤖 A virtual assistant project inspired by Tony Stark's JARVIS, powered by speech…

kwami · 43 ★

👻 kwami.io | A 3D Interactive AI Companion Library for creating engaging AI companions with visual (blob)…

Python-Voice-Assistant · 43 ★

A Python based Voice Assistant like Siri

azure-podcast-generator · 42 ★

Generate an engaging podcast based on your document using Azure OpenAI and Azure Speech.

dispatch · 41 ★

Revamp your morning routine and supercharge productivity with Dispatch. The ultimate Apple Shortcut powered…

MMM-WhisperGPT · 41 ★

A Whisper + ChatGPT MagicMirror Module.

OpenVoiceUI · 40 ★

Voice-powered AI assistant platform — connect any LLM, any TTS, with a live web canvas, music generation, and…

saiku · 40 ★

AI Agent capable of automating various tasks using MCP

whisper-subtitles · 40 ★

🎬 AI-powered localhost subtitle generator for hearing-impaired users. Automatic speech recognition using…

OpenAI_Whisper_Streamlit · 40 ★

A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper

Jugalbandi-Manager · 39 ★

Jugalbandi (JB) Manager is a full AI-powered conversational chatbot platform. It's platform agnostic and can…

audiolizr · 39 ★

A bentoML-powered API to transcribe audio and make sense of it

Web-AI-Spotify-DJ · 39 ★

Spotify Web AI DJ - client side agentic smarts using Gemma 2, two billion parameter LLM, to play what a user…

Daisy-openAI-chat · 38 ★

Python platform for working with LLMs

on-the-road-copilot · 38 ★

A minimal speech-to-structured output app built with Azure OpenAI Realtime API.

GPT_ALL · 37 ★

This project aims to combine the latest LLMs, Multi-Step Asynchronous Function Calling, Natural Language…

Taiwanese-Whisper · 37 ★

Fine-tune the Whisper model for Taiwanese speech recognition

styletts2-ukrainian-openai-tts-api · 37 ★

OpenAI TTS Compatible Ukrainian TTS StyleTTS2 Pipeline

pdf-to-audiobook · 37 ★

Uses OpenAI API to clean pdf then converts it to professional grade audiobook with text to speech.

VISOR---A-Voice-Assistant · 36 ★

V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!

Unity_OpenAI · 36 ★

This GitHub repository shows how to integrate openai GPT-3 language model and ChatGPT API into a Unity…

Audio-transcriber · 36 ★

Simple Python audio transcriber using OpenAI's Whisper speech recognition model

whisper-server · 35 ★

macOS menu bar app providing a local HTTP server compatible with the OpenAI Whisper API for fast and private…

sussurro · 35 ★

A fully local, open-source voice-to-text tool that acts as a system-wide AI dictation layer, converting…

Youtube-Shorts-Generator · 34 ★

Harness OpenAI's power to effortlessly create YouTube Shorts with this project. Includes tools for generating…

voice-gpt · 34 ★

Let's turn ChatGPT into VoiceGPT (Vue JS, Vite, Open AI, AWS Polly) ChatGPT Clone (kind of lol)

Waifu_AI_Vtuber · 34 ★

Waifu_AI_Vtuber is an AI virtual YouTuber chatbot powered by OpenAI GPT-3.5, interacting in real-time with…

azure-avatar-demo · 34 ★

Text To Speech Demo in ReactJS Application using Azure Avatar AI Service.

voice-stream · 34 ★

A framework for creating voice-based agents. Integrates LLMs with speech recognition and text-to-speech

sky-livekit-agent-perplexica · 34 ★

Sky LiveKit Agent Perplexica is a local, free solution integrating LiveKit with advanced internet search. It…

Assistant · 33 ★

A machine learning powered, voice-based virtual assistant for Raspberry Pi. Supports several features like…

whisper-speech-to-text · 33 ★

Whisper Speech-to-Text is a JavaScript library for recording and transcribing user audio into text via…

antigravity-awesome-skills · 31 ★

🌌 Explore 255+ essential skills for AI coding assistants like Claude Code and GitHub Copilot to enhance your…

word-teacher · 31 ★

Efficient AI English Learning: Read & Speak via Web | An efficient web app for practicing English reading aloud and conversation with AI

AIAudioTranscriber · 31 ★

A minimalistic web app, built using Python, that generates transcriptions for audio

OpenAI-Realtime-API-for-Unity · 31 ★

Implementation of OpenAI's Realtime API in Unity. Easily integrate low-latency, multi-modal conversations via…

openai_stt_ha · 31 ★

OpenAI Whisper in Home Assistant via the OpenAI API for use in the Assist pipeline

YATSEE · 31 ★

YATSEE - Yet Another Tool for Speech Extraction & Enrichment
