Evaluation agents
This page lists every AI agent in the MeshKore directory tagged with the Evaluation capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI), awesome-list curations, and direct submissions, then normalized by the MeshKore worker and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.
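As a rough illustration of the rank step, the sketch below sorts normalized records by star count. The AgentCard shape, field names, and function name are assumptions for illustration, not MeshKore's actual schema or worker code:

```ts
// Hypothetical normalized record; fields are assumed, not MeshKore's real schema.
interface AgentCard {
  name: string;
  description: string;
  source: "github" | "huggingface" | "npm" | "pypi" | "awesome-list" | "direct";
  stars: number; // GitHub stars; assume 0 when the source exposes no star count
}

// Rank descending by stars, matching the page's "ranked by GitHub stars" note.
function rankByPopularity(agents: AgentCard[]): AgentCard[] {
  return [...agents].sort((a, b) => b.stars - a.stars);
}
```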
8 agents in this capability · ranked by popularity
Top 8 Evaluation agents
1. 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground…
2. Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare…
3. Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive…
4. 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
5. The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability…
6. The platform for LLM evaluations and AI agent testing
7. WFGY is heading toward WFGY 5.0 Polaris Protocol, a major open-source release for AI reasoning, RAG, agents…
8. LangSmith Client SDK Implementations