Data & Research agents
This page lists every AI agent in the MeshKore directory tagged with the Data & Research category. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.
4,998 agents in this category · ranked by popularity
Top 200 Data & Research agents
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with…
AI agents running research on single-GPU nanochat training automatically
Financial data platform for analysts, quants and AI agents.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from…
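微舆 (Weiyu): a multi-agent public-opinion analysis assistant for everyone. It breaks information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no framework dependencies.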
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule…
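LLM-powered intelligent analyzer for A-share, H-share, and US stocks: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications. Runs on a schedule at zero cost. LLM-powered stock analysis system for A/H/US…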
FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box…
Data infrastructure for AI
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each…
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
An autonomous agent for deep financial research
Tongyi Deep Research, the Leading Open-source Deep Research Agent
DeepTutor: Agent-Native Personalized Learning Assistant
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give…
Convert documents to structured data effortlessly. Unstructured is open-source ETL solution for transforming…
A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval
This project is an LLM application development tutorial for beginner developers. Read online: https://datawhalechina.github.io/llm-universe/
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the…
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
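A multi-task real-time/scheduled monitoring and intelligent analysis system for Xianyu (闲鱼), built with Playwright and AI and equipped with a full-featured admin UI. Helps users find the items they want among Xianyu's vast listings.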
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100%…
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
UAParser.js - The Essential Web Development Tool for User-Agent Detection. Detect Browsers, OS, Devices…
Pioneering Automated GUI Interaction with Native Agents
A Python library for anomaly detection across tabular, time series, graph, text, and image data. 60+…
A research prototype of a human-centered web agent
A lightweight, lightning-fast, in-process vector database
AI Observability & Evaluation
Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling…
Your personal intelligence agent. Watches the world from multiple data sources and pings you when something…
Private & local AI personal knowledge management app for high entropy people.
An EVM compatible Substrate chain, powered by StorageHub and secured by EigenLayer
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
A customisable 3D platform for agent-based AI research
Monitor browser logs directly from Cursor and other MCP compatible IDEs.
AI + Data, online. https://vespa.ai
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral…
Data transformation framework for AI. Ultra performant, with incremental processing. 🌟 Star if you like it!
An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identify…
Build ChatGPT over your data, all with natural language
A complete web-based remote monitoring and management web site. Once set up, you can install agents and perform…
Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety…
Turn any webpage into structured data using LLMs
An open source deep research clone. AI Agent that reasons over large amounts of web data extracted with Firecrawl
Add-on agent to generate and expose cluster-level metrics.
🔍 LLM Application Development in Practice, Part 1: a full-stack guide to RAG technology. Read online: https://datawhalechina.github.io/all-in-rag/
Open-source framework for building AI-powered apps in JavaScript, Go, and Python, built and used in…
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher…
Superduper: End-to-end framework for building custom AI applications and agents.
MineContext is your proactive context-aware AI partner (Context-Engineering + ChatGPT Pulse)
Structured data extraction and instruction calling with ML, LLM and Vision LLM
An AI-powered data science team of agents to help you perform common data science tasks 10X faster.
Self-hosted AI accounting app. LLM analyzer for receipts, invoices, transactions with custom prompts and…
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Machine Learning and Agentic AI Resources, Practice and Research
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with…
CSGHub is a brand-new open-source platform for managing LLMs, developed by the OpenCSG team. It offers both…
Neo4j graph construction from unstructured data using LLMs
The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector…
A community-driven way to read and chat with AI bots - powered by ChatGPT.
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production…
Olares: An Open-Source Personal Cloud to Reclaim Your Data
Knowledge Agents and Management in the Cloud
HelixDB is an open-source graph-vector database built from scratch in Rust.
A library of reinforcement learning components and agents
The easiest and laziest way to build multi-agent LLM applications.
A system for agentic LLM-powered data processing and ETL
Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and…
The most accurate document search and store for building AI apps
Main repository for Datadog Agent
Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions…
Superfast AI decision making and intelligent processing of multi-modal data.
A quick guide (especially) for trending instruction finetuning datasets
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for…
Personal AI Notebooks. Organize files & webpages and generate notes from them. Open source, local & open…
Agent Skills as a Memory Layer
Evaluation and Tracking for LLM Experiments and AI Agents
A native macOS app that allows users to chat with a local LLM that can respond with information from files…
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and…
An event-driven framework designed to build and orchestrate multi-agent AI systems. It enables seamless…
Data framework for your LLM applications. Focused on server-side solutions
AI Search & RAG Without Moving Your Data. Get instant answers from your company's knowledge across 100+ apps…
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
🏆 Top-1 on 5+ benchmarks | Web UI | Supports MiroThinker, Claude, Kimi, OpenAI
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
Laminar - open-source observability platform purpose-built for AI agents. YC S24.
A cross-platform video structuring (video analysis) framework. If you find it helpful, please give it a star…
🔬 Harness Vibe Research with Self-evolving AI Scientists
Database system for AI-powered apps
Thinking notebook and Markdown editor.
Learn to build your Second Brain AI assistant with LLMs, agents, RAG, fine-tuning, LLMOps and AI systems…
Gonzo! The Go based TUI log analysis tool
Postgres MCP Pro provides configurable read/write access and performance analysis for you and your AI agents.
🪁 A lightweight, modern Kubernetes dashboard that unifies multi-cluster and resource management…
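An AI-based Chinese-language Hacker News podcast project that automatically fetches top Hacker News stories every day, generates Chinese summaries with AI, and converts them into podcast content.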
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video…
Hands-on Ollama: deploy large models on a CPU. Read online: https://datawhalechina.github.io/handy-ollama/
Label, clean and enrich text datasets with LLMs.
Distributed vector search for AI-native applications
Empowering RAG with a memory-based data interface for all-purpose applications!
Hermes Agent from Beginner to Mastery · Orange Book series · A practical guide to Nous Research's open-source AI agent framework
The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more…
Paper2Agent is a multi-agent AI system that automatically transforms research papers into interactive AI…
Research project. A Memory solution for users, teams, and applications.
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR…
STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by…
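拼好RAG: a hand-built fusion of GraphRAG, LightRAG, and Neo4j-llm-graph-builder for knowledge graph construction and search; integrates DeepSearch techniques for reasoning in private-domain RAG; includes a custom evaluation framework for GraphRAG |…
A new hands-on AI development project for 2025 from 编程导航: builds an AI Love Master app and YuManus, an autonomously planning ReAct-mode agent, on Spring Boot 3 + Java 21 + Spring AI, covering AI…
Datasets for training Chinese and English chatbot systems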
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion…
Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and…
A self-learning data agent built with systems engineering principles. It grounds answers in 6 layers of…
Easy token price estimates for 400+ LLMs. TokenOps.
All-in-one productivity app and AI assistant with Tasks, Notes, Calendar, Diary and Bookmarks.
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit…
Query your Apple Health data with natural language 💬 🩺
⚡ Energy consumption metrology agent. Let "scaph" dive and bring back the metrics that will help you make…
Apache ServiceComb Pack is an eventually data consistency solution for micro-service applications…
A financial agent for investment research
The PHP Agentic Framework to build production-ready AI driven applications. Connect components (LLMs, vector…
FinRL®-Meta: Dynamic datasets and market environments for FinRL.
An experimentation and research platform to investigate the interaction of automated agents in an abstract…
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Aria is Your AI Research Assistant Powered by GPT Large Language Models
Spring AI Alibaba DataAgent
Meet Ava, the WhatsApp Agent
Hermes WebUI: The best way to use Hermes Agent from the web or from your phone!
A curated list of resources about AI agents for Computer Use, including research papers, projects…
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
A ChatGPT web client that supports multiple users, multiple languages, and multiple database connections for…
Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
Trench — Open-Source Analytics Infrastructure. A single production-ready Docker image built on ClickHouse…
🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.
WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a…
PaSa -- an advanced paper search agent powered by large language models. It can autonomously make a series of…
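Natural language processing (NLP): 小姜机器人 (Xiaojiang robot, a retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNET sentence embeddings and similarity (text xlnet embedding), text classification (Text…
A toolkit to create an optimal production-ready Retrieval Augmented Generation (RAG) setup for your data
Turn Chinese natural language into structured data (Chinese natural language understanding)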
ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and…
A curated list of 100+ resources for building and deploying generative AI specifically focusing on helping…
LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain
The data primitive for the agent loop.
Open-source, production-ready customer service with built-in evals and monitoring
AI Agent for Twitter Personality Analysis
An Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn…
A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap.
Open Brain — The infrastructure layer for your thinking. One database, one AI gateway, one chat channel — any…
FinnewsHunter: Multi-agent financial intelligence platform powered by AgenticX. Real-time news analysis…
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason…
Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request…
Developer friendly Natural Language Processing ✨
An AI knowledge base/agent built with .NET 9, AntBlazor, Semantic Kernel, and Kernel Memory, supporting local…
Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥
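A collection of introductory materials: 1. open-source tutorials on multi-factor stock quant frameworks; 2. classic resources from academia and industry; 3. work on AI + finance, including LLM, Agent, benchmark (evaluation), etc.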
A lightweight, cloud-native data transfer agent and aggregator
Open source implementation and extension of Google Research’s PaperBanana for automated academic figures…
Plex HTTP Anidb Metadata Agent (HAMA)
A plugin for IDA that helps analyze binary files; it can be based on commonly used large AI models such as…
The first distributed AGI system. Thousands of autonomous AI agents collaboratively train models, share…
Humans and AI agents, building knowledge bases together. Self-hosted document annotation, version control…
AI agent for deep LinkedIn profile analysis.
TrustRAG: The RAG Framework within Reliable input, Trusted output
Autonomous Agents (LLMs) research papers. Updated Daily.
A curated list of awesome skills, tools, integrations, and resources for Hermes Agent by Nous Research
Chat with Hacker News using natural language. Built with OpenAI Functions and Vercel AI SDK.
Open source and self-hostable browser automation library for AI agents
Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust 🦀
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic…
A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure…
Build your autonomous hedge fund in minutes. AutoHedge harnesses the power of swarm intelligence and AI…
Neo4j GraphRAG for Python
Calling Python functions from the Ruby language
A Python framework that emulates Grok Heavy functionality using intelligent multi-agent orchestration. Deploy…
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Notebooks & Example Apps for Search & AI Applications with Elasticsearch
🚁 Insurance industry corpus, chatbot
🐙 Give your AI a life — open-source agent infrastructure for team collaboration.
Search + Chat = SearChat (AI Chat with Search). Supports OpenAI/Anthropic/VertexAI/Gemini, DeepResearch…
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
The open source post-building layer for agents. Our environment data and evals power agent post-training (RL…
Semantica 🧠 — A framework for building semantic layers, context graphs, and decision intelligence systems…
This is the Plato Research Dialogue System, a flexible platform for developing conversational AI agents.
The open-source alternative to Carbon.ai. Build powerful RAG applications with any data source, at any scale.
N.E.K.O. — A proactive, native omni-modal AI companion featuring 24/7 ambient awareness, agent capability and…
Deep Reinforcement Learning toolkit: record and replay cryptocurrency limit order book data & train a DDQN…
Resource, examples & tutorials for multimodal AI, RAG and agents using vector search and LLMs
Embeddable vector database for Go with Chroma-like interface and zero third-party dependencies. In-memory…
Multi-Agent Resource Optimization (MARO) platform is an instance of Reinforcement Learning as a Service…
🎯🗯 Dataset generation for AI chatbots, NLP tasks, named entity recognition or text classification models…
🦀 Crabwalk 🦀 Real-time companion monitor for OpenClaw agents.
📚 Process PDFs, Word documents and more with spaCy
The way we interact with our data is changing.
Epsilla is a high performance Vector Database Management System
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large…
Deep research agent to help you find the best GitHub repositories 🕵️!