Introduction
Essence — the fastest open-source web scraper for LLMs
Essence is a web retrieval engine built in Rust that converts any web page into clean, LLM-ready Markdown. It uses an intelligent HTTP-to-browser fallback strategy for maximum speed and compatibility.
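To make the fallback idea concrete, here is a minimal Python sketch of one way an HTTP-first strategy can decide when to escalate to a browser. It is illustrative only and is not Essence's actual Rust implementation: the `MIN_TEXT_CHARS` threshold, the visible-text heuristic, and the `http_fetch` / `browser_fetch` callables are all hypothetical names introduced for this example.

```python
import re
from typing import Callable, Tuple

# Hypothetical threshold: below this much visible text, assume the page
# needs JavaScript rendering. Not a value taken from Essence.
MIN_TEXT_CHARS = 200


def visible_text_length(html: str) -> int:
    """Rough count of visible characters after dropping scripts, styles, and tags."""
    without_code = re.sub(r"(?is)<(script|style|noscript)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_code)
    return len(" ".join(text.split()))


def needs_browser(html: str, min_chars: int = MIN_TEXT_CHARS) -> bool:
    """Heuristic: an almost-empty body usually means a client-rendered SPA shell."""
    return visible_text_length(html) < min_chars


def fetch_with_fallback(
    url: str,
    http_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the cheap HTTP fetch first; escalate to the browser only if needed.

    Returns the HTML and the name of the tier that produced it.
    """
    html = http_fetch(url)
    if needs_browser(html):
        return browser_fetch(url), "browser"
    return html, "http"
```

The fetchers are passed in as plain callables, so the decision logic can be exercised without a network connection or a browser. A real engine would likely weigh more signals than text length, such as framework markers or response headers.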
Key Features
- Two-tier rendering — fast HTTP fetch for most pages, automatic Chromium fallback for SPAs
- Six REST endpoints — scrape, crawl, map, search, extract, and llms.txt generation
- Structured extraction — pull typed JSON from pages via CSS selectors or LLM-based extraction
- MCP server — native AI agent integration via Model Context Protocol
- OpenAPI spec — auto-generated spec at /api/docs/openapi.json
- Official SDKs — Python and TypeScript clients
- Zero dependencies — no Redis, no Postgres, no external services
- Self-hosted — MIT licensed, deploy anywhere
Quick Start
git clone https://github.com/ruchit-p/essence.git
cd essence/backend
cp .env.example .env
cargo run --release
# Server running at http://localhost:8080
Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/scrape | POST | Single-page capture with automatic engine selection |
| /api/v1/crawl | POST | Multi-page traversal with dedup and rate limiting |
| /api/v1/map | POST | URL discovery via sitemaps and in-page links |
| /api/v1/search | POST | Web search with optional result scraping |
| /api/v1/extract | POST | Structured data extraction (CSS selectors or LLM) |
| /api/v1/llmstxt | POST | Generate llms.txt and llms-full.txt files |
| /api/docs/openapi.json | GET | OpenAPI 3.1 specification |
| /mcp | GET | MCP server for AI agent tool use |
| /health | GET | Health check |
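As a rough illustration of calling one of these endpoints without the official SDKs, the sketch below builds a POST request for `/api/v1/scrape` using only the Python standard library. The path and the base URL come from this page. The request body shape (`{"url": ...}`) and the assumption that the response is JSON are guesses made for the example, so check the API Reference or the OpenAPI spec for the real schema.

```python
import json
import urllib.request

# Address printed by the Quick Start; adjust if the server runs elsewhere.
BASE_URL = "http://localhost:8080"


def build_scrape_request(page_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the scrape endpoint.

    The {"url": ...} body is an assumed shape, not a documented schema.
    """
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(page_url: str, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the response, assuming it is a JSON object."""
    request = build_scrape_request(page_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the Quick Start running, `scrape("https://example.com")` should return the decoded response. Keeping request construction separate from sending makes it easy to inspect the payload first.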
Next Steps
- Installation & Quickstart — get Essence running locally
- API Reference — full endpoint documentation with examples
- Python SDK / TypeScript SDK — official client libraries
- MCP Setup — connect Essence to Claude, Cursor, and other AI tools