Category
Image agents
1,832 Image AI agents indexed on MeshKore — the most complete public catalog, ranked by popularity and updated daily.
Top 100 Image agents
AI generates natively editable PPTX from any document — real PowerPoint shapes with native animations, not images · by Hugo He
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Open-source components, blocks, and AI agents designed to speed up your workflow. Import them seamlessly into your favorite tools through Registry and MCPs.
Replace port numbers with stable, named local URLs. For humans and agents.
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capabilities.
Deep Learning and Reinforcement Learning Library for Scientists and Engineers
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
The agent-native LLM router for OpenClaw. 41+ models, <1ms routing, USDC payments on Base & Solana via x402.
🌻 One-click access to your own ChatGPT plus many AI web services
AI Product Design Agent - Open Source
ConardLi's open-source Skills collection, featuring web design, knowledge retrieval, image generation, and more.
🐬DeepChat - A smart assistant that connects powerful AI to your personal world
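The best Chinese edition of Google's new book on agent design patterns (Agentic Design Patterns), continuously improved. Includes online reading plus PDF and EPUB e-book downloads.
[🔞🔞🔞 Contains images unsuitable for minors] AI explorations and summaries built around my strengths in programming, painting, and writing: Stable Diffusion is a powerful image generation model that can produce new images by evolving an existing one. ChatGPT is a Transformer-based language generation model that can automatically write a suitable article for a given topic. GitHub Copilot is an intelligent programming assistant that speeds up everyday programming work.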
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
End-to-end App Store screenshot creation using AI
Kode CLI — Design for post-human workflows. One unit agent for every human & computer task.
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
A collection of agent skills for CAD, robotics and hardware design
A White-Box Guide to Building Large Models: a fully hand-built Tiny-Universe
The UI design language and React library for Conversational UI
Riona Ai Agent 🌸 is built using Node.js and TypeScript 🛠️, designed for seamless job execution 📸. It's lightweight, efficient, and still evolving 🚧—exciting new features coming soon! 🌟
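[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC/LLM/AI Agent algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied intelligence, the metaverse, and AGI.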
The visual feedback tool for agents.
ChatGPT + DALL-E + WhatsApp = AI Assistant 🚀 🤖
🦖 Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system ~ source code + video & reading materials
Implementation of 17+ agentic architectures designed for practical use across different stages of AI system development.
Generate, animate and schedule your AI characters 🤖
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
🌊 AChat - An open-source/self-hosted/local-first AI platform, designed for enterprises and teams, perfectly combining powerful local processing capabilities with seamless remote synchronization.
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
🌊 A Human-in-the-Loop workflow for creating HD images from text
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)
A secure persistent personal agent server in Rust. One binary, sandboxed execution, multi-provider LLMs, voice, memory, Telegram, WhatsApp, Discord, Teams, and MCP tools. Secure by design, runs on your hardware.
Generate images with NovelAI | A drawing bot based on NovelAI
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance
Open-source video generation workbench driven by AI agents: novel → character/scene/prop design → script → storyboard → video, with consistent characters and scenes across shots | Open-source AI video workspace powered by AI Agents, Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI
🤖📐 An agent designed for mathematical modeling: it automatically completes the modeling and generates a complete paper ready for submission.
PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
🍭 Lobe UI - an open-source UI component library for building AIGC web apps
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
DingTalk Workspace is an officially open-sourced cross-platform CLI tool from DingTalk. It unifies DingTalk’s full suite of product capabilities into a single package and is designed for both human users and AI agent scenarios.
Supercharged experience for multiple models such as ChatGPT, DALL-E and Stable Diffusion.
Free, open-source, easy-to-use eCommerce platform based on Laravel. It supports multiple languages and currencies and integrates AI agents. The platform features customizable visual design and a rich plugin marketplace.
Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech synthesis and recognition, web search, memory, presets, assistants, and more. Linux, Windows, Mac
Turn your PC, Mac, or Linux box into a private AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation.
Agentic Design Patterns
🚀 LangGraph for Java. A library for developing agentic AI architectures in the Java ecosystem. Designed to work seamlessly with both Langchain4j and Spring AI.
Generate images from text. In Russian
Supports Telegram, Discord, Slack, Lark (Feishu), DingTalk, WeCom, QQ, and WeChat; compatible with various LLMs including OpenAI, Gemini, DeepSeek, Doubao, and OpenRouter. It offers intelligent conversation, image generation, video creation, and more. Works seamlessly in both private chats and group settings.
ROSA 🤖 is an AI Agent designed to interact with ROS1- and ROS2-based robotics systems using natural language queries. ROSA helps robot developers inspect, diagnose, understand, and operate robots.
A simple yet powerful agent framework for personal assistants, designed to enable intelligent interaction, multi-agent collaboration, and seamless tool integration.
JS reverse engineering MCP server designed for AI agents, with agent-first tool design and built-in anti-detection. Rebuilt from chrome-devtools-mcp.
This course is designed to guide beginners through the exciting world of Edge AI, covering fundamental concepts, popular models, inference techniques, device-specific applications, model optimization, and the development of intelligent Edge AI agents.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
[EMNLP 2025 Oral] MemoryOS is designed to provide a memory operating system for personalized AI agents.
Easily select and manage your preferred AI digital assistants on Android.
It's not AI that takes away your job, but the people who master the use of AI tools. The most deadly attack is a dimension-reducing strike: destroying you has nothing to do with you (from "The Three-Body Problem").
Application implementation with business use cases for safely utilizing generative AI in business operations
The TypeScript library for building AI applications.
Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows
An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC. Ideal for VTubing, streaming, and virtual assistant applications.
Agent-MCP is a framework for creating multi-agent systems that enables coordinated, efficient AI collaboration through the Model Context Protocol (MCP). The system is designed for developers building AI applications that benefit from multiple specialized agents working in parallel on different aspects of a project.
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
Build and Deploy a Full Stack MERN AI Image Generation App MidJourney & DALL E Clone
Build your own Cowork, AI Scientist and other SoTA Agents just by editing config files. Support anthropic skills. An infinite-horizon agent framework designed for long-running, complex tasks.
A Python-based lightweight robot simulator designed for navigation, control, and learning
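Super AI Brain: built on a Spring Cloud microservice architecture and already integrated with GPT-3.5, GPT-4.0, Baidu ERNIE Bot, Stable Diffusion AI image generation, Midjourney image generation, and more. Supports web, Android, iOS, and H5 clients, and uses OpenAI's ChatGPT model to implement an intelligent chatbot. Users can talk with the chatbot in the interface, and it automatically generates replies based on their input. Image generation is also supported: users enter text, and text-to-text and text-to-image output is produced automatically. Continuously updated, with more features waiting for you to unlock.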
WebRover is an autonomous AI agent designed to interpret user input and execute actions by interacting with web elements to accomplish tasks or answer questions. It leverages advanced language models and web automation tools to navigate the web, gather information, and provide structured responses based on the user's needs.
Autonomous self-evolving agents. Vision-grounded layered memory and self-written skills for LLM agents that operate your computer.
Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for designing complex, interactive environments where agents can act, learn, and evolve.
ChatGPT CLI is a powerful, multi-provider command-line interface for working with modern LLMs. It supports OpenAI, Azure, Perplexity, LLaMA, and more, with features like streaming, interactive chat, prompt files, image/audio I/O, MCP tool calls, and an experimental agent mode for safe, multi-step automation.
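Awesome AI Memory | LLM Memory | A curated knowledge base on AI memory for LLMs and agents, covering long-term memory, reasoning, retrieval, and memory-native system design. Awesome-AI-Memory is a centralized, continuously updated AI memory knowledge base that systematically organizes frontier research, engineering frameworks, system designs, evaluation benchmarks, and real-world application practices related to LLM memory and agent memory.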
🤖 Components Library for Quickly Building LLM Chat Interfaces.
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents.
[New: agent mode] All-scenario GPT assistant for Android, activated via the volume keys for voice interaction, supporting features such as web access, taking photos, templates, parsing PDF and Office documents, and agent mode.
Open-source real-time digital human agent platform. Build voice-first AI agents with WebRTC, persona memory, tools, RAG, and optional digital-human video.
AI-powered tools to enhance Anki flashcards with explanations, mnemonics, illustrations, and adaptive learning for medical school and beyond
A Cursor skill that gives AI agents real UI component knowledge — best practices, layout patterns, and design-system conventions for 60+ interface components — so it generates production-grade UI instead of generic output.
[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!
AI-First Album: Chat with your gallery using plain language! LLM Vision + RAG + Album/Gallery.
🤖 Beautifully designed chatbot components based on shadcn/ui
Train Models Contrastively in PyTorch
🎨 Image collector with support for custom acquisition sources, compatible with Windows and macOS!
High-fidelity HTML design and prototype guidance skill for AI agents
Microsoft Foundry (demos, documentation, accelerators).
JTokkit is a Java tokenizer library designed for use with OpenAI models.
Multi-agent framework for design, simulation, and auditing.
AI Agnostic (Multi-user and Multi-bot) Chat with Fictional Characters. Designed with scale in mind.
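End-to-end RAG system design, evaluation, and optimization. Geek Time RAG training camp: a full breakdown of the 10 major RAG components, with 4 hands-on projects covering the entire RAG pipeline. Putting RAG into production usually means building RAG to serve the business, not building the business around RAG. That is why targeted adjustment, optimization, and customization are needed for different scenarios and problems. The devil is in the details, so we dig into them.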
A self-hostable personal AI agent with vector memory, Composio tools, and Telegram.
🦀An agentic AI assistant that lives in your chats, inspired by nanoclaw and incorporating some of its design ideas. Built with Rust 🦀
🌌 Give a soul to your digital waifu. Soul of Waifu is an immersive desktop roleplay & AI companion engine with Live2D/VRM avatars, real-time voice chat, and local LLM support. Watch your characters come to life.
ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.
A DESIGN.md collection for getting AI agents to build Japanese UIs correctly, extending the Google Stitch format with CJK typography.
Self-healing infrastructure for AI agent payments. 90.3% auto-recovery.