Open Source | MIT Licensed

The fastest open-source web scraper for LLMs.

Distill the web.

Convert any web page into clean, LLM-ready Markdown. Built in Rust with intelligent HTTP-to-browser fallback. Self-hosted, no API keys, no rate limits.

498ms   Median scrape, across 35 real-world URLs
97%     Quality win rate, per LLM judge (vs alternatives)
1.8x    Faster than the nearest competitor
0       Dependencies: no Redis, no Postgres
Features

Everything you need to scrape the web

Built for developers who need fast, reliable web data extraction for LLM pipelines and AI agents.

Two-Tier Rendering

Starts with a fast HTTP fetch. If content density is too low, automatically escalates to full Chromium browser automation. Speed when possible, reliability when needed.
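The escalation decision can be sketched with a simple text-to-markup ratio. This is a minimal illustration only: the actual metric and threshold Essence uses for "content density" are not documented here, so both are assumptions.

```python
# Sketch of a content-density fallback check. The density metric
# (visible-text length / total document length) and the threshold
# are hypothetical, not Essence's actual values.
import re

DENSITY_THRESHOLD = 0.1  # hypothetical cutoff

def content_density(html: str) -> float:
    """Ratio of visible text length to total document length."""
    text = re.sub(r"<script.*?</script>|<style.*?</style>", "", html, flags=re.S)
    text = re.sub(r"<[^>]+>", " ", text)   # drop remaining tags
    text = re.sub(r"\s+", " ", text).strip()
    return len(text) / max(len(html), 1)

def needs_browser(html: str) -> bool:
    """Escalate to headless Chromium when the HTTP fetch is mostly markup."""
    return content_density(html) < DENSITY_THRESHOLD

# An empty SPA shell escalates; a text-heavy article does not.
spa_shell = "<html><body><div id='root'></div><script src='app.js'></script></body></html>"
article = "<html><body><p>" + "Readable prose. " * 50 + "</p></body></html>"
```

A JavaScript SPA shell returns almost no visible text from a plain HTTP fetch, which is exactly the case where a real browser pays off.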

LLM-Optimized Markdown

Strips navigation, ads, footers, cookie banners, and boilerplate. Preserves code blocks, heading hierarchy, and extracts metadata including OG tags.

Five Endpoints, One Server

Scrape a page, crawl a site, map URLs via sitemaps, search the web, or generate llms.txt files. All from a single lightweight server.

MCP Server for AI Agents

Built-in Model Context Protocol server. Connect Claude, Cursor, or any MCP client to give your agent web access.

Zero Dependencies

A single Rust binary. No Redis, no PostgreSQL, no external services. Run with cargo run or docker-compose up.

100% Open Source

MIT licensed. Self-hosted, no vendor lock-in, no API keys, no rate limits. Free forever.

How it works

URL to Markdown in milliseconds

Three steps from URL to LLM-ready Markdown.

01

Send a URL

POST to any of the five endpoints -- scrape, crawl, map, search, or llmstxt.

02

Smart rendering

HTTP first. If the page is a JavaScript SPA, automatic fallback to headless Chromium.

03

Get clean Markdown

Structured output with metadata, ready for your LLM pipeline.

Benchmarks

Better output. Faster.

Benchmarked across 35 real-world URLs spanning 7 content categories against leading alternatives. Quality evaluated by LLM judge (Claude).

Quality (LLM Judge)

Per-category win rate across 35 URLs

Category     Essence   Alternatives   Ties
Structured   5/5       0/5            0
News         4/5       0/5            1
Reference    5/5       0/5            0
Content      5/5       0/5            0
Dynamic      4/5       1/5            0
Docs         5/5       0/5            0
E-Commerce   4/5       1/5            0
Total        32/35     2/35           2

Speed Comparison

Average response time by category

Docs: 1.6x faster (Essence 334ms, next best 536ms)
News: 2.6x faster (Essence 449ms, next best 1152ms)
Dynamic: 2.2x faster (Essence 540ms, next best 1166ms)
Reference: 1.2x faster (Essence 826ms, next best 968ms)
Structured: 1.3x faster (Essence 929ms, next best 1178ms)

Benchmark conducted April 2026 against Firecrawl and Crawl4AI (both self-hosted via Docker). The LLM judge evaluated content relevance, noise removal, readability, structural coherence, and information completeness. See the full methodology for details.

Comparison

Why Essence

How Essence compares to other scraping tools. No spin -- just data.

Feature                Essence                     Alternatives
LLM-ready Markdown     Yes                         Yes
Open source license    MIT                         AGPL / Apache
Self-hosted            Single binary, zero deps    Redis + services / Docker
Browser fallback       Automatic (content-aware)   Manual / always-on
MCP server             Built-in                    Separate package
API key required       None                        Cloud tiers
Rate limits            None                        Tiered pricing
Quality (LLM judge)    97% win rate                Best alternative: 26%
Median speed           498ms                       908ms+
Built-in search        DuckDuckGo                  Varies
Language               Rust                        TypeScript / Python
Pricing                Free forever                Free tier + paid
Integration

Works with everything

A simple REST API. Use it from any language, any framework, or connect your AI agent via MCP.

curl -X POST http://localhost:8080/api/v1/scrape \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
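The same call from Python, using only the standard library. The request body and endpoint path mirror the curl example above; the shape of the JSON response (e.g. where the Markdown lives) is an assumption, so treat `scrape()` as a sketch.

```python
# Python equivalent of the curl call above, stdlib only.
import json
import urllib.request

def build_scrape_request(url: str, base: str = "http://localhost:8080") -> urllib.request.Request:
    """Build the POST request for /api/v1/scrape; send it with urlopen."""
    return urllib.request.Request(
        f"{base}/api/v1/scrape",
        data=json.dumps({"url": url}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def scrape(url: str) -> dict:
    """Fetch a page as LLM-ready Markdown (response schema assumed)."""
    with urllib.request.urlopen(build_scrape_request(url)) as resp:
        return json.load(resp)
```

Because it is a plain REST call, the same pattern works from any language or framework with an HTTP client.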
Scrape     Fetch a single page    /api/v1/scrape
Crawl      Traverse a site        /api/v1/crawl
Map        Discover URLs          /api/v1/map
Search     Search the web         /api/v1/search
LLMs.txt   Generate llms.txt      /api/v1/llmstxt
Get started

Ready to build?

Start scraping web data for free and scale seamlessly. Self-hosted, no credit card needed.

Terminal
git clone https://github.com/ruchit-p/essence.git
cd essence/backend
cp .env.example .env
cargo run --release
# Server running at http://localhost:8080