Essence

Introduction

Essence — the fastest open-source web scraper for LLMs

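Essence is a web retrieval engine built in Rust that converts web pages into clean, LLM-ready Markdown. It tries a plain HTTP fetch first and falls back to a Chromium browser only when a page needs client-side rendering, which keeps most requests fast without giving up compatibility with JavaScript-heavy sites.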

Key Features

  • Two-tier rendering — fast HTTP fetch for most pages, automatic Chromium fallback for SPAs
  • Six REST endpoints — scrape, crawl, map, search, extract, and llms.txt generation
  • Structured extraction — pull typed JSON from pages via CSS selectors or LLM-based extraction
  • MCP server — native AI agent integration via Model Context Protocol
  • OpenAPI spec — auto-generated spec at /api/docs/openapi.json
  • Official SDKs — Python and TypeScript clients
  • Zero dependencies — no Redis, no Postgres, no external services
  • Self-hosted — MIT licensed, deploy anywhere
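
The two-tier rendering idea in the first bullet can be illustrated with a small sketch. This is not Essence's actual engine-selection logic, which this page does not describe. It assumes a simple heuristic: if the HTML returned by a plain HTTP fetch contains very little visible text, the page is probably a JavaScript-rendered shell and should be retried in a browser. The names `needs_browser`, `fetch_page`, and `MIN_TEXT_CHARS`, and the threshold value, are hypothetical.

```python
import re
from typing import Callable, Tuple

# Hypothetical threshold: pages with less visible text than this are
# treated as empty SPA shells. Essence's real criteria are not documented here.
MIN_TEXT_CHARS = 200


def visible_text_length(html: str) -> int:
    """Rough count of visible characters: drop script/style blocks, then tags."""
    without_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_code)
    # Collapse runs of whitespace so padding does not inflate the count.
    return len(" ".join(text.split()))


def needs_browser(html: str, min_chars: int = MIN_TEXT_CHARS) -> bool:
    """Assumed heuristic: too little visible text means JS rendering is needed."""
    return visible_text_length(html) < min_chars


def fetch_page(
    url: str,
    http_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the cheap HTTP tier first; fall back to the browser tier if needed.

    Returns the HTML and the name of the tier that produced it.
    """
    html = http_fetch(url)
    if needs_browser(html):
        return browser_fetch(url), "browser"
    return html, "http"
```

The two fetchers are passed in as callables so the fallback decision can be exercised without a network or a browser. A real implementation would likely use more signals than text length.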

Quick Start

git clone https://github.com/ruchit-p/essence.git
cd essence/backend
cp .env.example .env
cargo run --release
# Server running at http://localhost:8080

Endpoints

Endpoint                 Method  Description
/api/v1/scrape           POST    Single-page capture with automatic engine selection
/api/v1/crawl            POST    Multi-page traversal with dedup and rate limiting
/api/v1/map              POST    URL discovery via sitemaps and in-page links
/api/v1/search           POST    Web search with optional result scraping
/api/v1/extract          POST    Structured data extraction (CSS selectors or LLM)
/api/v1/llmstxt          POST    Generate llms.txt and llms-full.txt files
/api/docs/openapi.json   GET     OpenAPI 3.1 specification
/mcp                     GET     MCP server for AI agent tool use
/health                  GET     Health check
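
As a rough illustration of calling one of these endpoints, the sketch below builds a POST request for /api/v1/scrape using only the Python standard library. The request body is an assumption: this page does not document the schema, so the single `"url"` field is a guess, and the response is decoded as generic JSON without assuming its shape. The generated OpenAPI spec at /api/docs/openapi.json is the place to check the real field names. The helper names `build_scrape_request` and `scrape` are hypothetical and are not part of the official SDKs.

```python
import json
import urllib.request

# Address printed by the Quick Start server.
BASE_URL = "http://localhost:8080"


def build_scrape_request(page_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/scrape."""
    # Assumption: the endpoint accepts a JSON object with a "url" field.
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/scrape",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scrape(page_url: str, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (shape not assumed here)."""
    request = build_scrape_request(page_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from Quick Start running, `scrape("https://example.com")` would send the request and return whatever JSON the server responds with.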
