
Scraper agents

This page lists every AI agent in the MeshKore directory tagged with the Scraper capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.

132 agents in this capability · ranked by popularity

Top 132 Scraper agents

firecrawl 108,376 ★

🔥 The Web Data API for AI - Power AI agents with clean web data

huginn 49,088 ★

Create agents that monitor and act on your behalf. Your agents are standing by!

Jobs_Applier_AI_Agent_AIHawk 29,652 ★

AIHawk aims to ease the job hunt by automating the job application process. Utilizing artificial…

Scrapegraph-ai 23,290 ★

Python scraper based on AI

Agent-Reach 17,211 ★

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili…

llm-scraper 6,260 ★

Turn any webpage into structured data using LLMs

myGPTReader 4,420 ★

A community-driven way to read and chat with AI bots - powered by chatGPT.

CyberScraper-2077 2,941 ★

A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama

AnyCrawl 2,780 ★

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP…

oxylabs-ai-studio-py 2,744 ★

Structured data gathering from any website using AI-powered scraper, crawler, and browser automation…

spider 2,418 ★

Web crawler and scraper for Rust

weibo_terminater 2,321 ★

The final Weibo crawler: scrape anything from Weibo, including comments, post contents, and followers. The Terminator

thepipe 1,522 ★

Get clean data from tricky documents, powered by vision-language models ⚡

linkedin-mcp-server 1,520 ★

Open-source MCP server for LinkedIn. Give Claude and any MCP-compatible AI assistant access to profiles…

OpenOutreach 1,428 ★

LinkedIn Automation Tool: Describe your product. Define your target market. The AI finds the leads for you.

crawler-user-agents 1,365 ★

Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request…

apify-mcp-server 1,061 ★

The Apify MCP server enables your AI agents to extract data from social media, search engines, maps…

firecrawl-app-examples 714 ★

🔥 This repository contains complete application examples, including websites and other projects, developed…

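RedBox 705 ★

Openclaw for Xiaohongshu: an AI workbench for independent content creators. RedClaw, an AI tool for Xiaohongshu creation, supports downloading Xiaohongshu image-and-text posts, learning creative styles, AI advisory-panel group chat, Xiaohongshu AI content creation, bulk download of Xiaohongshu content, and more, bringing AI to the whole creation process: AI image-and-text production, AI article layout, AI…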

ai-scraper-py 537 ★

AI Scraper is a powerful scraping tool and scrape agent built to automate data extraction with unmatched…

reader 491 ★

Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire web, clean…

AutoScraper 484 ★

Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation"…

webclaw 483 ★

Fast, local-first web content extraction for LLMs. Scrape, crawl, extract structured data — all from Rust…

markdown-crawler 438 ★

A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page…

resume_render_from_job_description 392 ★

Resume_Builder_AIHawk is a powerful Python tool that allows you to automatically customize your resume based…

scraperai 384 ★

ScraperAI is an open-source, AI-powered tool designed to simplify web scraping for users of all skill levels.

n8n-claw 372 ★

OpenClaw-inspired autonomous AI agent built entirely in n8n. Adaptive RAG-powered memory, Skills via MCP…

extractor 309 ★

Use LLMs to robustly extract web data

reader 301 ★

📚 This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an…

gpt4V-scraper 297 ★

AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

knowledge-gpt 289 ★

Extract knowledge from all information sources using gpt and other language models. Index and make Q&A…

llm-reader 286 ★

Turn a webpage into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web…

teracrawl 249 ★

High-performance web crawler API optimized for LLMs. Turn any search or website into clean Markdown using…

lego-ai-parser 239 ★

Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements.

search-result-scraper-markdown 234 ★

This project provides a powerful web scraping tool that fetches search results and converts them into…

AI-Resume-Analyzer-and-LinkedIn-Scraper-using-Generative-AI 203 ★

An AI application that uses an LLM to analyze user resumes and provide a summary, strengths…

XActions 199 ★

⚡ The Complete X/Twitter Automation Toolkit — Scrapers, MCP server for AI agents (Claude/GPT), CLI, browser…

unofficial-claude-api 197 ★

Unofficial Claude API supporting direct HTTP chat creation/deletion/retrieval, messages with multiple file…

Upwork-AI-jobs-applier 134 ★

AI tool for automating Upwork job applications using AI agents to find and qualify jobs, write personalized…

web-scout-mcp 126 ★

A powerful MCP server extension providing web search and content extraction capabilities. Integrates…

advanced-sitemap-parser 112 ★

Parse XML sitemaps and extract URLs. Designed to process millions of URLs while bypassing most modern…

BrowserPilot 111 ★

Open‑source alternative to Perplexity Comet, director.ai and firecrawl combined

wxpath 110 ★

wxpath - declarative web crawling with XPath; a Web Query Language (WQL)

WebScraper 88 ★

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation…

scrapeGPT 87 ★

ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on…

RAG-based-job-search-assistant 86 ★

linkedin-jobs-RAG

ai-web-scraper 76 ★

AI web scraper built with Crawl4AI for extracting structured leads data from websites.

Website-Crawler 74 ★

Extract data from websites in LLM ready JSON or CSV format. Crawl or Scrape entire website with Website…

Custom-MCP-Server 73 ★

MCP server for scraping LinkedIn, Facebook, Instagram profiles and Google search.

rag-web-browser 72 ★

RAG Web Browser is an Apify Actor to feed your LLM applications and RAG pipelines with up-to-date text…

ytfetcher 66 ★

⚡ Build structured YouTube datasets at scale — effortlessly fetch transcripts and rich metadata for NLP, ML…

reddit_karma_farmer_auto_commentator_with_AI 63 ★

Reddit_Commentator_AIHawk is a Python project showcasing the power of artificial intelligence in social media…

bedrock-agents-webscraper 59 ★

This repo provides guidance on setting up a bedrock agent to webscrape and internet search via action groups

slither 59 ★

A simple, easy to use framework for adding randomized, anonymous IP addresses and user-agents to web…

crw 47 ★

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI…

oxylabs-ai-studio-js 46 ★

Structured data gathering from any website using AI-powered scraper, crawler, and browser automation…

Datavizion-RAG 44 ★

Retrieval-augmented generation (RAG) for remote & local LLM use

OpenAver 40 ★

Modern JAV metadata manager — multi-source scraping, Jellyfin integration, and AI-ready API. Built with…

Reddit-AI-Agent 37 ★

Reddit AI Agent is an intelligent tool that helps you explore Reddit like never before! 🔎 It allows you to…

Alibaba-CLI-Scraper 36 ★

Create your own Alibaba dataset and interact with it in plain English.

langchain-webscraper-demo 35 ★

A chatbot demo that scrapes a website and stores the result in a vector db, which can then be queried via…

scrapingai 34 ★

Build web scraping agents using AI to auto-extract the data from websites, capture screenshot, generate pdf…

gptauto 33 ★

ChatGPT selenium scraper written in Python

Smarter-Web-Scraping-with-Python 31 ★

Leverage modern open-source tools to create better web scraping workflows.

git-repo-parser 29 ★

A tool to scrape all files from a GitHub repository and turn them into a JSON or TXT file. Useful for AI and…

web-crawling-guides 27 ★

How-to guides on web crawling or scraping

openai-scraper 27 ★

This is a template repository for building a web scraper with OpenAI support. The repository provides a basic…

AI-web_scraper 27 ★

Just mention what you want and it will extract/scrape data from the Web. Useful to create AI web…

Agent-WebCloak 26 ★

[IEEE S&P'26] WebCloak: Characterizing and Mitigating the Threats of LLM-Driven Web Agents as Intelligent…

wikibot 25 ★

A 🤖 which provides features from Wikipedia like summary, title searches, location API, etc.

pricing-page-scraper 25 ★

Parse SaaS pricing pages using OpenAI GPT-3.5

spider-clients 24 ★

Python, Javascript, and Rust libraries for the Spider Cloud API.

Deep-Research-using-Gemini-api 23 ★

AI-powered deep research tool leveraging web scraping for cost-effective, comprehensive analysis. Open-source…

graph 22 ★

⚡️ Real-time Knowledge Graph for AI Agents. Connect LLMs to verified weather, stock, and currency data via…

web-extract-with-chatgpt 22 ★

A Python project that extracts data from websites with the option to process the data through @openai's…

GPT-auto-webscraping 19 ★

Product-Matching 17 ★

The topic is about product matching via Machine Learning. This involves using various machine learning…

SCAPO 17 ★

🧘 Reddit-powered AI optimization tips | Save your time and credits | can cover 380+ services | Real tips from…

AURORA 16 ★

AURORA (Artificial Unified Responsive Optimized Reasoning Agent) uses lobes and web research for RAG based…

llmweb-rs 16 ★

Webpage to structured data in Rust & LLM

zero-gtm 15 ★

base44-docs-tool 15 ★

Instant, local access to complete Base44 documentation with AI assistant integration

olostep-mcp-server 14 ★

MCP server for Olostep — the web scraping, crawling, and search infrastructure used by top AI companies…

DocsScraper.jl 14 ★

Efficient RAG knowledge pack creator from online Julia documentation

scraper 14 ★

RAG-based Web Scraping

ollama-langchain-agents 13 ★

A collection of intelligent AI agents built using Ollama, LangChain, and local LLMs — including chatbots…

opencrow 13 ★

Self-hosted multi-agent AI platform, orchestrate specialized agents across Telegram, WhatsApp, and web with…

headlines-gpt 11 ★

Scrapes headlines from CNN and FOX, then has ChatGPT do cross-analysis

wechat-article-to-md 11 ★

Claude Code Skill - Scrape WeChat Official Account articles and convert them to Markdown, automatically downloading images | WeChat Article to Markdown Converter

perplexity-ai-export 11 ★

Grabs all your Perplexity conversations data, spits it out into a nice file folder structure and allows you…

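dingtalk-ai-robot 10 ★

DingTalk intelligent robot supporting AI Q&A, knowledge-base retrieval, JIRA management and server maintenance, weekly and daily report summaries, quick ticket creation, and more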

InstaWaves 10 ★

Telegram bot which helps in promoting Instagram accounts

chatgpt-presentation-generator-bot 10 ★

Telegram bot utilizing OpenAI's GPT to generate presentations and abstracts in PPTX and DOCX formats.

claude-plugin-jobhunter 10 ★

AI-powered job search plugin for Claude Code — multi-platform scraping, visa checks, salary analysis, ranked…

ai-docs-vector-db-hybrid-scraper 10 ★

Retrieval-augmented docs ingestion stack: Firecrawl + Crawl4AI + Qdrant vector search with FastAPI and MCP…

Job-Prep 9 ★

This is the repository for a Streamlit application that helps with job applications. This app integrates…

Reddit-to-AI 9 ★

Scrapes Reddit threads and sends content to chosen AI chatbot for analysis.

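Destiny-job-scout 8 ★

🦅 DestinyScout: an L3-level autonomous, personalized job-search engine built on an Agent-Native LLM + Boss Zhipin. Say goodbye to mechanical copy-and-paste; it can deeply inject your personal career DNA…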

stackoverflow-scraper-messenger-bot 8 ★

A messenger bot that answers messages by scraping stackoverflow questions and answers

liquidation-cluster-signal-scraper 7 ★

A bot that scrapes open-interest and liquidation heatmaps to alert traders when a "Short Squeeze" or "Long…

ai-local-agents 7 ★

A collection of intelligent AI agents built using Ollama, LangChain, and local LLMs, including chatbots…

promobot 7 ★

PromoBot - A web scraper that monitors promotion sites by searching keywords and reporting to a Telegram…

llm-scraper-py 7 ★

Python implementation of https://github.com/mishushakov/llm-scraper

Real-Time-Social-Media-Content-Retrievel-System 7 ★

The Real Time Social Media Content Retrieval System fetches real-time LinkedIn posts based on user queries…

crawlkit 7 ★

🕷️ Open-source web crawling toolkit — Video, OCR, NLP, Stealth, 10+ parsers

Automatic-web-scraper-with-LLM-parsing 6 ★

This project is an automatic web scraper that uses the LLM Ollama gpt-oss:20b to parse the body content of a…

Youtube-comment-RAG 6 ★

A powerful RAG tool that scrapes YouTube channel videos, extracts transcripts, and enables AI-powered chat…

sdk 6 ★

Lightfeed SDK to search and filter web data

linkedin-job-hunting-assistant 5 ★

A Python tool that automates LinkedIn job search, ranking, and export by combining Bright Data's LinkedIn Job…

AI-WebScraper 5 ★

An intelligent, universal web scraper powered by Google Gemini AI. Features intent-based data extraction…

Awesome-Auto-Research 5 ★

Tracking the systems that automate scientific research — from literature scrapers to full paper-writing…

renderscholar 5 ★

Tired of LLMs citing fake papers? renderscholar is a Google Scholar scraper (inspired by Andrej Karpathy’s…

claude-auto-api 5 ★

Claude Code settings.json auto-config tool to quickly switch API_KEY, AUTH_TOKEN, and model configs across…

simple-chatgpt-wrapper 5 ★

A simple npm package to perform requests as a user on the OpenAI ChatGPT page.

scraper-flow 5 ★

ai-media-project 4 ★

Create stunning media with our AI-powered app. Generate images and videos from text and images using advanced…

startups-from-ai 4 ★

This AI bot goes online, gathers information about AI startups, and posts updates about them on X and Dev.to.

light-browser 4 ★

A lightweight web browser for humans (CLI/TUI) and AI agents (MCP)

WebScraperToolkit 4 ★

AI-first web scraping engine with stealth bypass, MCP server, and multimodal output (Markdown, JSON, PDF) for…

omniwire 4 ★

Infrastructure layer for AI agent swarms — 88 MCP tools · A2A · OmniMesh VPN · Scrapling scraper · COC sync ·…

Stone_Scraper 4 ★

Stone Scraper is an AI-powered tool for automated web data extraction. Built with Streamlit, Langchain, and…

LinkedEdge 4 ★

LinkedEdge: Unlock Your Interview Success with LinkedEdge, your AI-powered job interview preparation…

Job-Resume-Generator-using-OpenAI-LangChain 4 ★

This is the repository for a Streamlit application that helps with job applications. This app integrates…

House-Research 4 ★

🏠 Automated real-estate monitoring with AI analysis | Scraping Avito and CIAN | Telegram bot | Docker | Deno

openai-sdk-with-web-unlocker 3 ★

Integrating OpenAI Agents SDK with Bright Data Web Unlocker, enabling AI agents to access, extract, and…

Smart-_Job_Assistant 3 ★

Smart Job Assistant is a simple and intelligent web app that helps job seekers find the right opportunities…

ExploreWiki 3 ★

This is a Python-based application that allows the user to search for information and open URLs.

RAG-Scraper-AI-GUI 3 ★

This Python-powered, AI-based RAG Scraper allows you to ask questions based on a PDF/URL provided to the software…

news-cli 3 ★

AI News CLI: Search, scrape, and fact-check global news from your terminal. Features local LLM support…

scraperapi-mcp 3 ★

This MCP server enables LLMs to retrieve and process web scraping requests using ScraperAPI.

news-aggregator-ai-agent 2 ★

📰 Discover and summarize top news stories effortlessly with the News Aggregator AI Agent, saving you time…

nitjsr-hub 2 ★

A student hub: a real-time, plugin-based web application featuring a chat, marketplace, video conferencing, and…
