Image agents

1,832 Image AI agents indexed on MeshKore — the most complete public catalog, ranked by popularity and updated daily.

Top 100 Image agents

AI generates natively editable PPTX from any document — real PowerPoint shapes with native animations, not images · by Hugo He

stable-baselines3★ 13,330

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

ui★ 11,895

Open-source components, blocks, and AI agents designed to speed up your workflow. Import them seamlessly into your favorite tools through Registry and MCPs.

portless★ 9,504

Replace port numbers with stable, named local URLs. For humans and agents.

Machine-Learning-Interviews★ 8,308

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

awesome-gpt4o-images★ 8,057

Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capabilities.

TensorLayer★ 7,389

Deep Learning and Reinforcement Learning Library for Scientists and Engineers

AppAgent★ 6,753

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

ClawRouter★ 6,517

The agent-native LLM router for OpenClaw. 41+ models, <1ms routing, USDC payments on Base & Solana via x402.

ChatAny★ 6,516

🌻 One-click access to your own ChatGPT plus many AI web services

superdesign★ 6,505

AI Product Design Agent - Open Source

garden-skills★ 6,288

ConardLi's open-source Skills collection, featuring web design, knowledge retrieval, image generation, and more.

deepchat★ 5,841

🐬DeepChat - A smart assistant that connects powerful AI to your personal world

agentic-design-patterns★ 5,392

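The best Chinese edition of Google's new book on agent design patterns (Agentic Design Patterns), continuously improved. Includes online reading and PDF and EPUB e-book downloads.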

understand-prompt★ 5,391

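[🔞🔞🔞 Contains images unsuitable for minors] AI explorations and summaries built around my strengths in programming, drawing, and writing: StableDiffusion is a powerful image generation model that can generate new images by evolving an existing one. ChatGPT is a Transformer-based language generation model that can automatically write suitable articles on a given topic. GitHub Copilot is an intelligent programming assistant that speeds up everyday programming.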

ComfyUI-Copilot★ 5,198

An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance

app-store-screenshots★ 5,157

End-to-end App Store screenshot creation using AI

Kode-CLI★ 5,077

Kode CLI — Design for post-human workflows. One unit agent for every human & computer task.

LLM-Engineers-Handbook★ 5,065

The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices

text-to-cad★ 4,928

A collection of agent skills for CAD, robotics and hardware design

tiny-universe★ 4,867

"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe

ChatUI★ 4,396

The UI design language and React library for Conversational UI

Riona-AI-Agent★ 4,218

Riona Ai Agent 🌸 is built using Node.js and TypeScript 🛠️, designed for seamless job execution 📸. It's lightweight, efficient, and still evolving 🚧—exciting new features coming soon! 🌟

AIGC-Interview-Book★ 3,800

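[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC/LLM/AI Agent algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied intelligence, the metaverse, and AGI.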

agentation★ 3,758

The visual feedback tool for agents.

whatsapp-chatgpt★ 3,751

ChatGPT + DALL-E + WhatsApp = AI Assistant 🚀 🤖

hands-on-llms★ 3,412

🦖 Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system ~ source code + video & reading materials

all-agentic-architectures★ 3,377

Implementation of 17+ agentic architectures designed for practical use across different stages of AI system development.

agentheroes★ 3,371

Generate, animate and schedule your AI characters 🤖

Ask-Anything★ 3,339

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

AChat★ 3,258

🌊 AChat - An open-source/self-hosted/local-first AI platform, designed for enterprises and teams, perfectly combining powerful local processing capabilities with seamless remote synchronization.

InternGPT★ 3,203

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

lsp-ai★ 3,183

LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.

dalle-flow★ 2,832

🌊 A Human-in-the-Loop workflow for creating HD images from text

rl-baselines3-zoo★ 2,810

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

dalle-playground★ 2,743

A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

moltis★ 2,709

A secure persistent personal agent server in Rust. One binary, sandboxed execution, multi-provider LLMs, voice, memory, Telegram, WhatsApp, Discord, Teams, and MCP tools. Secure by design, runs on your hardware.

novelai-bot★ 2,538

Generate images with NovelAI | A NovelAI-based drawing bot

HuixiangDou★ 2,490

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

ArcReel★ 2,363

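Open-source video generation workbench driven by AI agents: novel → character/scene/prop design → script → storyboard → video, with consistent characters and scenes across shots | Open-source AI video workspace powered by AI Agents, Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI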

MathModelAgent★ 2,129

🤖📐 An agent designed for mathematical modeling. It automatically completes the modeling work and generates a complete paper ready for submission.

gym-pybullet-drones★ 2,022

PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control

lobe-ui★ 2,017

🍭 Lobe UI - an open-source UI component library for building AIGC web apps

cambrian★ 2,001

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

dingtalk-workspace-cli★ 1,992

DingTalk Workspace is an officially open-sourced cross-platform CLI tool from DingTalk. It unifies DingTalk’s full suite of product capabilities into a single package and is designed for both human users and AI agent scenarios.

anse★ 1,974

Supercharged experience for multiple models such as ChatGPT, DALL-E and Stable Diffusion.

beikeshop★ 1,903

Free, open-source, easy-to-use eCommerce platform based on Laravel. It supports multiple languages and currencies and integrates AI agents. The platform features customizable visual design and a rich plugin marketplace.

py-gpt★ 1,799

Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech synthesis and recognition, web search, memory, presets, assistants, and more. Linux, Windows, Mac

DreamServer★ 1,792

Turn your PC, Mac, or Linux box into a private AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation.

Agentic-Design-Patterns★ 1,705

Agentic Design Patterns

langgraph4j★ 1,682

🚀 LangGraph for Java. A library for developing AI agentic architectures in the Java ecosystem. Designed to work seamlessly with both Langchain4j and Spring AI.

ru-dalle★ 1,646

Generate images from texts. In Russian

MuseBot★ 1,590

Supports Telegram, Discord, Slack, Lark (Feishu), DingTalk, WeCom, QQ, and WeChat, and is compatible with various LLMs including OpenAI, Gemini, DeepSeek, Doubao, and OpenRouter. It offers intelligent conversation, image generation, video creation, and more. Works seamlessly in both private chats and group settings.

rosa★ 1,527

ROSA 🤖 is an AI Agent designed to interact with ROS1- and ROS2-based robotics systems using natural language queries. ROSA helps robot developers inspect, diagnose, understand, and operate robots.

NagaAgent★ 1,519

A simple yet powerful agent framework for personal assistants, designed to enable intelligent interaction, multi-agent collaboration, and seamless tool integration.

js-reverse-mcp★ 1,498

JS reverse engineering MCP server with agent-first tool design and built-in anti-detection. Rebuilt from chrome-devtools-mcp.

edgeai-for-beginners★ 1,477

This course is designed to guide beginners through the exciting world of Edge AI, covering fundamental concepts, popular models, inference techniques, device-specific applications, model optimization, and the development of intelligent Edge AI agents.

Ovis★ 1,452

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

MemoryOS★ 1,408

[EMNLP 2025 Oral] MemoryOS is designed to provide a memory operating system for personalized AI agents.

SwitchAI★ 1,358

Easily select and manage your preferred AI digital assistants on Android.

hello-ai★ 1,350

It's not AI that takes away your job, but the people who master the use of AI tools. The most deadly attack is a dimension-reducing strike: destroying you has nothing to do with you (from "The Three-Body Problem").

generative-ai-use-cases★ 1,347

Application implementation with business use cases for safely utilizing generative AI in business operations

modelfusion★ 1,320

The TypeScript library for building AI applications.

airunner★ 1,315

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows

handcrafted-persona-engine★ 1,276

An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC. Ideal for VTubing, streaming, and virtual assistant applications.

Agent-MCP★ 1,239

Agent-MCP is a framework for creating multi-agent systems that enables coordinated, efficient AI collaboration through the Model Context Protocol (MCP). The system is designed for developers building AI applications that benefit from multiple specialized agents working in parallel on different aspects of a project.

rl-baselines-zoo★ 1,204

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

project_ai_mern_image_generation★ 1,194

Build and Deploy a Full Stack MERN AI Image Generation App MidJourney & DALL E Clone

infiAgent★ 1,171

Build your own Cowork, AI Scientist and other SoTA Agents just by editing config files. Support anthropic skills. An infinite-horizon agent framework designed for long-running, complex tasks.

ir-sim★ 1,084

A Python-based lightweight robot simulator designed for navigation, control, and learning

springboot-openai-chatgpt★ 1,001

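Super AI Brain: built on a SpringCloud microservice architecture and integrated with GPT-3.5, GPT-4.0, Baidu ERNIE Bot, Stable Diffusion AI drawing, Midjourney drawing, and more. Supports web, Android, iOS, and H5 multi-platform apps, and uses OpenAI's ChatGPT model to implement an intelligent chatbot. Users can converse with the chatbot in the interface, and it automatically generates replies based on their input. Drawing is also supported: enter text and it automatically produces text-to-text and text-to-image output. Continuously updated, with more features waiting for you to unlock.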

WebRover★ 994

WebRover is an autonomous AI agent designed to interpret user input and execute actions by interacting with web elements to accomplish tasks or answer questions. It leverages advanced language models and web automation tools to navigate the web, gather information, and provide structured responses based on the user's needs.

Photo-agents★ 959

Autonomous self-evolving agents. Vision-grounded layered memory and self-written skills for LLM agents that operate your computer.

Agentarium★ 934

Open-source framework for creating and managing simulations populated with AI-powered agents. It provides an intuitive platform for designing complex, interactive environments where agents can act, learn, and evolve.

chatgpt-cli★ 930

ChatGPT CLI is a powerful, multi-provider command-line interface for working with modern LLMs. It supports OpenAI, Azure, Perplexity, LLaMA, and more, with features like streaming, interactive chat, prompt files, image/audio I/O, MCP tool calls, and an experimental agent mode for safe, multi-step automation.

Awesome-AI-Memory★ 918

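Awesome AI Memory | LLM Memory | A curated knowledge base on AI memory for LLMs and agents, covering long-term memory, reasoning, retrieval, and memory-native system design. Awesome-AI-Memory is a centralized, continuously updated AI memory knowledge base that systematically organizes cutting-edge research, engineering frameworks, system designs, evaluation benchmarks, and real-world applications related to LLM memory and agent memory.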

pro-chat★ 899

🤖 Components Library for Quickly Building LLM Chat Interfaces.

FinMem-LLM-StockTrading★ 899

FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design

AIOpsLab★ 885

A holistic framework to enable the design, development, and evaluation of autonomous AIOps agents.

gpt-assistant-android★ 878

[New: agent mode] GPT assistant for Android, activated via volume keys for voice interaction, supporting features such as networking, taking photos, templates, parsing PDF and Office documents, and agent mode.

CyberVerse★ 857

Open-source real-time digital human agent platform. Build voice-first AI agents with WebRTC, persona memory, tools, RAG, and optional digital-human video.

AnkiAIUtils★ 855

AI-powered tools to enhance Anki flashcards with explanations, mnemonics, illustrations, and adaptive learning for medical school and beyond

ui-design-brain★ 803

A Cursor skill that gives AI agents real UI component knowledge — best practices, layout patterns, and design-system conventions for 60+ interface components — so it generates production-grade UI instead of generic output.

4KAgent★ 793

[NeurIPS 2025] 4KAgent: Agentic Any Image to 4K Super-Resolution. An intelligent computer vision agent that can magically restore any image to perfect-4K!

album-ai★ 792

AI-First Album: Chat with your gallery using plain language! LLM Vision + RAG + Album/Gallery.

shadcn-chatbot-kit★ 790

🤖 Beautifully designed chatbot components based on shadcn/ui

contrastors★ 790

Train Models Contrastively in Pytorch

pic-gather★ 789

🎨 Image collector with support for custom acquisition sources, compatible with Windows and macOS!

cc-design★ 788

High-fidelity HTML design and prototype guidance skill for AI agents

Azure-AIGEN-demos★ 753

Microsoft Foundry (demos, documentation, accelerators).

jtokkit★ 741

JTokkit is a Java tokenizer library designed for use with OpenAI models.

arbiter★ 741

Multi-agent framework for design, simulation, and auditing.

agnai★ 735

AI Agnostic (Multi-user and Multi-bot) Chat with Fictional Characters. Designed with scale in mind.

rag-in-action★ 734

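End-to-end RAG system design, evaluation, and optimization. Geek Time RAG training camp: a full breakdown of the 10 major RAG components, with 4 hands-on projects covering the whole RAG pipeline. Deploying RAG usually means building RAG around the business, not building the business around RAG. That is why targeted adjustment, optimization, and customization are needed for different scenarios and problems. The devil is in the details, and we dig into them.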

trustclaw★ 715

A self-hostable personal AI agent with vector memory, Composio tools, and Telegram.

microclaw★ 704

🦀An agentic AI assistant that lives in your chats, inspired by nanoclaw and incorporating some of its design ideas. Built with Rust 🦀

Soul-of-Waifu★ 701

🌌 Give a soul to your digital waifu. Soul of Waifu is an immersive desktop roleplay & AI companion engine with Live2D/VRM avatars, real-time voice chat, and local LLM support. Watch your characters come to life.

ComfyUI-IF_AI_tools★ 699

ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generation workflow by leveraging the power of language models.

awesome-design-md-jp★ 699

A DESIGN.md collection for getting AI agents to build Japanese UIs correctly. Extends the Google Stitch format with CJK typography.

helix★ 689

Self-healing infrastructure for AI agent payments. 90.3% auto-recovery.

Browse other category pages

Code 23,874 · AI Infra 22,308 · Data 5,097 · Business 3,050 · Content 1,286 · Audio 1,077 · Personal 837 · Crypto 295 · Translation 290 · Demo 5 · Infrastructure 1