
Inference agents

This page lists every AI agent in the MeshKore directory tagged with the Inference capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.

437 agents in this capability · ranked by popularity
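
The ranking described above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `Agent` record, the `normalize` and `rank` helpers, and the raw field names are hypothetical and are not the MeshKore worker's actual code or schema. The sketch only shows the general idea of coercing records from different sources into one shape and sorting them by star count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical normalized record; not the real MeshKore schema."""
    name: str
    stars: int
    source: str


def normalize(record: dict) -> Agent:
    # Star counts may arrive as ints or as strings such as "76,376".
    stars = int(str(record.get("stars", 0)).replace(",", ""))
    return Agent(
        name=record["name"].strip(),
        stars=stars,
        source=record.get("source", "unknown"),
    )


def rank(records: list[dict], limit: int = 200) -> list[Agent]:
    # Sort by stars, highest first; sorted() is stable, so ties keep input order.
    agents = [normalize(r) for r in records]
    return sorted(agents, key=lambda a: -a.stars)[:limit]


raw = [
    {"name": "whisper.cpp", "stars": "49,594", "source": "github"},
    {"name": " vllm ", "stars": "76,376", "source": "github"},
    {"name": "ChatTTS", "stars": 39069, "source": "github"},
]
print([a.name for a in rank(raw, limit=2)])  # ['vllm', 'whisper.cpp']
```

The sketch deliberately does not deduplicate by name, since distinct projects can share one (two different `wingman` entries appear in the list below).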

Top 200 Inference agents

vllm 76,376 ★

A high-throughput and memory-efficient inference and serving engine for LLMs

whisper.cpp 49,594 ★

Port of OpenAI's Whisper model in C/C++

ChatTTS 39,069 ★

A generative speech model for daily dialogue.

Langchain-Chatchat 37,806 ★

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama |…

faster-whisper 22,139 ★

Faster Whisper transcription with CTranslate2

CosyVoice 20,540 ★

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

free-llm-api-resources 18,619 ★

A list of free LLM inference resources accessible via API.

llama-cookbook 18,287 ★

Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with…

petals 10,064 ★

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

plano 6,288 ★

Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety…

deepreasoning 5,361 ★

A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with…

superduper 5,265 ★

Superduper: End-to-end framework for building custom AI applications and agents.

eko 4,902 ★

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

gpustack 4,827 ★

A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for…

csghub 4,672 ★

CSGHub is a brand-new open-source platform for managing LLMs, developed by the OpenCSG team. It offers both…

cactus 4,621 ★

Low-latency AI engine for mobile devices & wearables

FedML 4,032 ★

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and…

GenerativeAIExamples 3,908 ★

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

FastDeploy 3,673 ★

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

optillm 3,421 ★

Optimizing inference proxy for LLMs

spiceai 2,870 ★

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI…

YC-Killer 2,664 ★

A library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free…

intel-extension-for-transformers 2,176 ★

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run…

claw-compactor 2,134 ★

14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis…

dstack 2,089 ★

Control plane for agents and engineers to provision compute and run training and inference across NVIDIA…

llama2-webui 1,940 ★

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper`…

neuron-ai 1,837 ★

The PHP Agentic Framework to build production-ready AI driven applications. Connect components (LLMs, vector…

LLMCompiler 1,837 ★

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

nanocoder 1,693 ★

A beautiful local-first coding agent running in your terminal - built by the community for the community ⚒

company-research-agent 1,666 ★

An agentic company research tool powered by LangGraph and Tavily that conducts deep diligence on companies…

AgentDock 1,622 ★

Build Anything with AI Agents

edgeai-for-beginners 1,411 ★

This course is designed to guide beginners through the exciting world of Edge AI, covering fundamental…

airunner 1,315 ★

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows

Jlama 1,268 ★

Jlama is a modern LLM inference engine for Java

awesome-ai-web-search 1,264 ★

List of software that allows searching the web with the assistance of AI…

parallax 1,238 ★

Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere

EmbedAnything 1,171 ★

Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀

llmgateway 1,084 ★

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface.

OpenAlpha_Evolve 999 ★

OpenAlpha_Evolve is an open-source Python framework inspired by the groundbreaking research on autonomous…

Llama-2-Open-Source-LLM-CPU-Inference 976 ★

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

openinference 919 ★

OpenTelemetry Instrumentation for AI Observability

vllm-mlx 820 ★

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama…

llama3.java 804 ★

Llama 3+ inference in pure Java

blast 775 ★

Browser-LLM Auto-Scaling Technology

GenossGPT 752 ★

One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4ALL, HuggingFace…

mlx-omni-server 698 ★

MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple…

LLaMA_MPS 584 ★

Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.

uzi 576 ★

CLI for running large numbers of coding agents in parallel with git worktrees

MiniSearch 553 ★

Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM…

aikit 515 ★

🏗️ Fine-tune, build, and deploy open-source LLMs easily!

rkllama 499 ★

Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning models on…

LLM-VM 491 ★

irresponsible innovation. Try now at https://chat.dev/

vllm-cli 487 ★

A command-line interface tool for serving LLM using vLLM.

chipper 486 ★

✨ AI interface for tinkerers (Ollama, Haystack RAG, Python)

MARTI 483 ★

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

DreamServer 481 ★

Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image…

edsl 454 ★

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market…

sagify 443 ★

LLMs and Machine Learning done easily

mlxstudio 443 ★

MLX Studio - Home of JANG_Q - Image Gen/Edit + Chat/Code All in one - + OpenClaw (Anthropic API)

local-llm-function-calling 437 ★

A tool for generating function arguments and choosing what function to call with local LLMs

super-rag 392 ★

Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one…

OpenArc 384 ★

Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over…

Context-Engine 381 ★

Context-Engine MCP - Agentic Context Compression Suite

NanoLLM 366 ★

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models…

incognide 355 ★

Explore the unknown, build the future, own your data.

HackBot 347 ★

AI-powered cybersecurity chatbot designed to provide helpful and accurate answers to your…

skills 331 ★

inference.sh Agent skills for using our API to give your agents access to hundreds of apps and other agents

chat.petals.dev 318 ★

💬 Chatbot web app + HTTP and Websocket endpoints for LLM inference with the Petals client

zinc 306 ★

Zig INferenCe Engine — Local LLM inference on AMD GPUs and Apple Silicon

LLM-Hub 288 ★

Local AI Assistant on your phone

Kotlin-AI-Examples 262 ★

A collection of Kotlin-based examples featuring AI frameworks such as Spring AI, LangChain4j, and more …

ThunderAgent 259 ★

A simple, fast and robust program-aware agentic inference system.

ChordMiniApp 250 ★

Music Analysis, Chord Recognition, Beat Tracking, Guitar Diagrams, Piano Visualizer, Lyrics Transcription…

Mano-P 241 ★

Mano-P: Open-source GUI-VLA agent for edge devices. #1 on OSWorld (specialized, 58.2%). Runs locally on Apple…

Rustchain 232 ★

DePIN for Vintage Hardware — Proof-of-Antiquity blockchain where old machines outmine new ones. AI-powered…

bespoke_automata 222 ★

Bespoke Automata is a GUI and deployment pipeline for making complex AI agents locally and offline

gym-cooking 221 ★

🏆 gym-cooking: Code for "Too many cooks: Bayesian inference for coordinating multi-agent collaboration"…

openai_trtllm 219 ★

OpenAI compatible API for TensorRT LLM triton backend

pocketgroq 217 ★

PocketGroq is a powerful Python library that simplifies integration with the Groq API, offering advanced…

xagent 216 ★

A production-ready platform for dynamic AI agents — plan, use tools, and complete real work without hardcoded…

insights-lm-local-package 209 ★

Open-source, fully private and local alternative to NotebookLM. Chat with your documents, generate audio…

graphsignal-python 205 ★

Graphsignal Python SDK

demo-chatbot 199 ★

A template to create any LLM Inference Web Apps using Python only

decapod 197 ★

Decapod is the daemonless, local-first control plane agents call on demand to converge on human intent, shape…

grove 192 ★

Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling

HiveMind 191 ★

HiveMind Protocol - A Local-First, Privacy-Preserving Architecture for Agentic RAG

CVPR2018_attention 179 ★

Context Encoding for Semantic Segmentation MegaDepth: Learning Single-View Depth Prediction from Internet…

local-deepsearch-academic 178 ★

An implementation of Google Deep Search 🕵️ with support for 1000+ references, local inference, chatting with…

DyPRAG 177 ★

[arxiv: 2503.23895] Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement

llm-api 170 ★

Run any Large Language Model behind a unified API

booster 168 ★

Booster - open accelerator for LLM models. Better inference and debugging for AI hackers

libre-chat 165 ★

🦙 Free and Open Source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline capable and…

smg 161 ★

Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM…

grps_trtllm 160 ★

Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service…

promptbook 156 ★

Turn your company's scattered knowledge into AI ready Books ✨

kongruity_ 155 ★

kongruity leverages LLMs and cohesion scoring to transform unstructured creative-engineering process…

aibitat 150 ★

Multi-Agent Conversation Framework in TypeScript

ialacol 147 ★

🪶 Lightweight OpenAI drop-in replacement for Kubernetes

gpt4local 140 ★

Openai-style, fast & lightweight local language model inference w/ documents

EcoAssistant 134 ★

EcoAssistant: using LLM assistant more affordably and accurately

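Gensokyo-llm 133 ★

An open-source agent project supporting 6 chat platforms, Onebotv11 one-to-many connections, streaming messages, agent conversation keyboard bubble generation, and 10+ LLM APIs (continuously updated), with the ability to convert multiple LLM APIs into a common format that carries context.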

Toolio 133 ★

GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered structured output (3SO) and…

faster-chat 131 ★

A blazingly fast, privacy first & OPEN AI Chat Interface

llm-inference 126 ★

Large Language Model (LLM) Inference API and Chatbot

local-llms-on-android 123 ★

Run large language models like Qwen and LLaMA locally on Android for offline, private, real-time question…

Awesome-AI-For-Security 122 ★

A curated list of tools, papers, and datasets for applying AI to cybersecurity tasks. This list primarily…

llm-interface 121 ★

A simple NPM interface for seamlessly interacting with 36 Large Language Model (LLM) providers, including…

SLED 119 ★

SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model …

MAS-Zero 117 ★

Designing Multi-Agent Systems with Zero Supervision

axiom-voice-agent 114 ★

Run a <400ms latency Voice Agent on just 4GB VRAM. Fully offline, no API keys required. Optimized for GTX…

opentau 113 ★

Using Large Language Models for Repo-wide Type Prediction

lm-proxy 111 ★

OpenAI-compatible HTTP LLM proxy / gateway for multi-provider inference (Google, Anthropic, OpenAI, PyTorch)…

twosetai 111 ★

All the code and materials

inference-gateway 109 ★

An open-source, cloud-native, high-performance gateway unifying multiple LLM providers, from local solutions…

Oxide-Lab 107 ★

Modern desktop application (Rust + Tauri v2 + Svelte 5 + Candle (HF)) for communicating with AI models that…

BlazorGPT 99 ★

BlazorGPT is a Blazor Server application that uses Semantic Kernel plus OpenAI, Azure OpenAI and Ollama for…

deep-active-inference-mc 99 ★

Deep active inference agents using Monte-Carlo methods

Llamatik 98 ★

True on-device AI for Kotlin Multiplatform (Android, iOS, Desktop, JVM, WASM). LLM, Speech-to-Text and Image…

LLMinator 97 ★

Gradio-based tool to run open-source LLM models directly from Huggingface

langport 94 ★

Langport is a language model inference service

Indic-Subtitler 94 ★

Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.

llmariner 94 ★

Extensible generative AI platform on Kubernetes with OpenAI-compatible APIs.

orka-reasoning 93 ★

Orchestrator Kit for Agentic Reasoning - OrKa is a modular AI orchestration system that transforms Large…

deep-recall 91 ★

Enterprise-grade memory framework for LLMs featuring GPU-optimized inference, vector storage, and automated…

infermux 89 ★

Route inference across LLM providers. Track cost per request.

InferrLM 84 ★

On-device AI for iOS & Android

Conduit 84 ★

🦑 Unified Swift SDK for LLM inference across local and cloud providers

pixelbot 82 ★

Multimodal AI agent, an interactive data studio with on-demand ML inference, media generation, and a database…

Noema-Declarative-AI 80 ★

A declarative way to control LLMs.

pasllm 80 ★

PasLLM - LLM inference engine in Object Pascal (synced from my private work repository)

Rules.txt 80 ★

A rationalist ruleset for "debugging" LLMs, auditing their internal reasoning and uncovering biases; also a…

edge-veda 80 ★

On-device AI SDK for Flutter — LLM inference, vision, STT, TTS, image generation, embeddings, RAG, and…

gitvoyant 77 ★

Temporal Code Intelligence platform. Time-series complexity analysis across Python, JavaScript, Java, and Go…

home 76 ★

Confidential is software for private, secure AI workloads, agents, and inference. It lets you provide on-prem…

ContextPilot 74 ★

Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM…

monocle 74 ★

Monocle is a framework for tracing GenAI app code. This repo contains implementation of Monocle for GenAI…

wingman 74 ★

Inference Hub for AI at Scale

otto-m8 72 ★

Flowchart-like UI to interconnect LLMs and Huggingface models, and deploy them as a REST API with little to…

quickstart-streaming-agents 72 ★

Build, deploy, and orchestrate event-driven agents natively on Apache Flink® and Apache Kafka®

rag-ops 72 ★

This project applies the core knowledge from the LLMOps module, including the design and implementation of…

Awesome-LLMs-ICLR-24 66 ★

It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on…

LLM_Powered_Video_Search 65 ★

[SOICT 2024] LLM-Powered Video Search: A Comprehensive Multimedia Retrieval System

llm.f90 64 ★

LLM inference in Fortran

web3-ai-trading-agent 62 ★

Build an Autonomous Web3 AI Trading Agent (BASE + Uniswap V4 example)

AI-ML 61 ★

A curated, hands-on library of notebooks, demos, and resources for AI/ML, Deep Learning, Generative AI…

minrlm 59 ★

Stop forcing LLMs to answer in one pass. Give them a runtime. Recursive Language Model that improves any LLM…

Eloquent 59 ★

The most feature-complete local AI workstation. Multi-GPU inference, integrated Stable Diffusion + ADetailer…

yalla 56 ★

A tiny LLM Agent with minimal dependencies, focused on local inference.

awesome-local-ai 55 ★

152 open-source tools to run LLMs 100% locally – no cloud, no API keys, no censorship

LlamaLib 55 ★

Cross-Platform High-Level LLM Library

home-mind 54 ★

AI assistant for Home Assistant with cognitive memory. Supports Anthropic, OpenAI, and Ollama (local…

sibila 54 ★

Extract structured data from local or remote LLM models

auto-ollama 52 ★

run ollama & gguf easily with a single command

ai-platform 52 ★

PHP library for interacting with AI platform provider.

aura 51 ★

A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering…

tokio-prompt-orchestrator 51 ★

Multi-core, Tokio-native orchestration for LLM pipelines.

sample-genai-on-eks-starter-kit 51 ★

A comprehensive toolkit for deploying production-ready Generative AI infrastructure on Amazon EKS. Includes…

Zikkaron 51 ★

Biologically-inspired persistent memory engine for Claude Code. 26 cognitive subsystems, Hopfield networks…

flame 50 ★

A distributed system for Agentic AI

taskyon 49 ★

Browser based Interface for Generative AI. Chat/Agent/Taskmanager Hybrid.

Project-Chimera 49 ★

Neuro-Symbolic-Causal AI - Project Chimera | 🌌 An open research project exploring formal verification of AI…

Cre4T3Tiv3 49 ★

RAG-with-Cross-Encoder-Reranker 49 ★

Testing speed and accuracy of RAG with, and without Cross Encoder Reranker.

dataclysm 47 ★

Pull high-quality, efficient embeddings for PubMed, arXiv and Wikipedia from Huggingface and use for local…

MAHPPO 47 ★

PyTorch implementation of the paper: Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate…

llmtrace 46 ★

Zero-code LLM security & observability proxy. Real-time prompt injection detection, PII scanning, and cost…

monkeys-with-typewriters 46 ★

The complete AI platform on a $3 microcontroller. Sub-millisecond inference. Zero hallucinations.

membership_inference 46 ★

Python package to create adversarial agents for membership inference attacks against machine learning models

llmBench 45 ★

llmBench is a high-depth benchmarking tool designed to measure the raw performance of local LLM runtimes…

wingman 44 ★

Wingman is the fastest and easiest way to run Llama models on your PC or Mac.

AutoToM 44 ★

[NeurIPS 2025 Spotlight] AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

BlockRank 43 ★

BlockRank makes LLMs efficient and scalable for RAG and in-context ranking

SocialDeductionLLM 43 ★

Training and inference code for "Training Language Models for Social Deduction with Multi-Agent Reinforcement…

Helios-Engine 43 ★

Helios Engine is a powerful and flexible Rust framework for building LLM-powered agents with tool support…

Qurio 42 ★

Qurio brings multi-provider models, custom agents, reusable skills, MCP servers, HTTP tools, retrieval…

OpenVitamin 41 ★

OpenVitamin is a local-first AI execution platform that unifies Agents, Workflows, and multi-model inference…

MOTO-Autonomous-ASI 41 ★

MOTO - Autonomous ASI Deep Research Harness by Intrafere - creative novelty-seeking researcher for S.T.E.M…

llmedge 41 ★

Android native AI inference library, bringing gguf models and stable-diffusion inference on android devices…

opla 40 ★

Empower Your Productivity with Local AI Assistants

Maestro 40 ★

LM-Kit Maestro is a secure, innovative desktop application that orchestrates AI agents offline, empowering…

ht 39 ★

ht - a shell command that answers your questions about shell commands

drama-engine 39 ★

A Framework for Narrative Agents

crew-news 38 ★

CrewNews is an AI news generator that delivers an unbiased version of the news for a given topic, using…

anthropic-proxy-rs 37 ★

A proxy server that intercepts Anthropic API requests and converts them to OpenAI-compatible format, enabling…

Live2D-LLM-Chat 36 ★

Live2D + ASR + LLM + TTS → Real-time communication + Offline Deployment/Cloud Inference

llm-scale-deploy-guide 36 ★

An end-to-end pipeline to optimize and host LLM for 100K parallel queries

axe 36 ★

axe - a precision agentic coder. large codebases. zero bloat. terminal-native. precise retrieval. powerful…

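CausalAgent 35 ★

A causal-analysis multi-agent system driven by the LangGraph protocol. It combines tools such as MCP and RAG to assist the analysis and gives the user a complete causal analysis report and a causal graph.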

vllm-factory 33 ★

Production inference for encoder models - ColBERT, GLiNER, ColPali, embeddings etc. - as vLLM plugins for…

snapllm 33 ★

🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching

financial-CrewAI-Agents-streamlit 32 ★

Financial CrewAI Agents (LangChain, YF Tools, Ai Crew, Groq Inference)

magnet 32 ★

the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly

shinzo 32 ★

Complete observability platform for AI agents and MCP servers. Improve your AI deployment outcomes, identify…

fluent_gpt_app 32 ★

Fluent GPT App, an open-source, multi-platform desktop application that brings the power of GPT models to…

asimov 32 ★

A Python framework for building AI agent systems with robust task management in the form of a graph execution…

kjarni 31 ★

Native and Private ML inference engine, embeddings, classification, reranking, search, and text generation…

ballradar 31 ★

[KDD 2023] Ball Trajectory Inference from Multi-Agent Sports Contexts Using Set Transformer and Hierarchical…

captain-claw 30 ★

AI agent with multi-agent orchestration, autonomous cognitive systems, and a full management dashboard

fara-agent 30 ★

A local browser automation agent based on Microsoft Fara-7B model optimized for LM Studio inference.

turbo-ocr 30 ★

Fast GPU OCR server. 270 img/s on FUNSD. TensorRT FP16, PP-OCRv5, HTTP + gRPC.
