Overview
Use Crawl when you need content from multiple pages under one site. The endpoint streams Server-Sent Events as pages finish, so your application can process each page without waiting for the full crawl to complete. The public crawl endpoint currently returns markdown page content only.
Endpoint
POST /api/v2/crawl_stream
Quickstart
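A minimal sketch of a crawl request, using only the Python standard library. The path comes from the Endpoint section. The host (`https://api.example.com`), the `Authorization: Bearer` scheme, and the helper name `build_crawl_request` are assumptions made for illustration, so substitute the values from your own account settings.

```python
import json
import urllib.request

CRAWL_PATH = "/api/v2/crawl_stream"  # path taken from the Endpoint section


def build_crawl_request(base_url: str, api_key: str, seed_url: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the crawl stream endpoint."""
    body = {"url": seed_url, **options}  # option names are snake_case over HTTP
    return urllib.request.Request(
        base_url.rstrip("/") + CRAWL_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
            # Assumption: bearer-token auth. Confirm the real header for your key.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical host and key, for illustration only.
request = build_crawl_request(
    "https://api.example.com", "YOUR_API_KEY", "https://example.com/docs", max_pages=10
)
```

Passing `request` to `urllib.request.urlopen` and iterating over the response line by line would then give you the raw event stream as it arrives.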
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Seed URL |
| max_pages | integer | No | 25 | Maximum pages to return, hard limit 100 |
| max_depth | integer | No | 2 | Link depth from the seed URL |
| timeout | number or null | No | 60 | Total crawl time budget in seconds |
| include_subdomains | boolean | No | false | Include subdomains |
| include_links | boolean | No | true | Keep links in markdown content |
| include_images | boolean | No | true | Keep image references in markdown content |
| advanced_proxy | boolean | No | false | Use for protected sites |
| main_content_only | boolean | No | false | Reduce navigation and boilerplate |
| formats | ["markdown"] | No | ["markdown"] | Accepted for compatibility; only markdown is honored |
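The table above can be turned into a small client-side check that runs before a request is sent. This is a sketch, not part of any SDK: `build_crawl_payload` is a hypothetical helper, the defaults mirror the table, and the lower bound of 1 for `max_pages` is an assumption (the table only states the upper limit of 100).

```python
MAX_PAGES_HARD_LIMIT = 100  # hard limit stated in the parameter table

# Defaults as listed in the parameter table.
DEFAULTS = {
    "max_pages": 25,
    "max_depth": 2,
    "timeout": 60,
    "include_subdomains": False,
    "include_links": True,
    "include_images": True,
    "advanced_proxy": False,
    "main_content_only": False,
}


def build_crawl_payload(url: str, **overrides) -> dict:
    """Merge overrides into the documented defaults and validate max_pages."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"url": url, **DEFAULTS, **overrides, "formats": ["markdown"]}
    max_pages = payload["max_pages"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int):
        raise ValueError("max_pages must be an integer")
    # Assumption: at least one page must be requested.
    if not 1 <= max_pages <= MAX_PAGES_HARD_LIMIT:
        raise ValueError(f"max_pages must be between 1 and {MAX_PAGES_HARD_LIMIT}")
    return payload


payload = build_crawl_payload("https://example.com/docs", max_pages=50, main_content_only=True)
```

Validating locally avoids spending a round trip on a request that the server would reject with a 400 for an invalid `max_pages`.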
HTTP requests use snake_case. The TypeScript SDK uses camelCase, for example maxPages, maxDepth, and mainContentOnly.
Stream Events
Each SSE frame contains a JSON object under data:.
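As a rough illustration of reading such a stream, the sketch below groups `data:` lines into frames and decodes each frame as JSON, following the standard Server-Sent Events framing (a blank line ends a frame, lines starting with `:` are comments). The function name `iter_sse_json` and the field names in the sample payloads are hypothetical; the real event shapes are defined by the API, not by this example.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON object carried by each SSE frame."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current frame.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice: also flush a final frame that lacks a trailing blank line.
    if data_parts:
        yield json.loads("\n".join(data_parts))


# Sample frames with hypothetical field names, for illustration only.
sample = [
    'data: {"type": "page", "n": 1}\n',
    "\n",
    ": keep-alive\n",
    'data: {"type": "done"}\n',
    "\n",
]
events = list(iter_sse_json(sample))
```

Because the function accepts any iterable of text lines, it can be fed decoded lines from an HTTP response as they arrive, which keeps per-page processing incremental.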
Page
Usage
Done
Error
Pricing
Crawl reports usage as $0.001 per successfully scraped page in the usage event. Advanced proxy can improve success rates on protected sites.
Use the emitted usage.cost and your dashboard ledger as the billing source of truth.
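For budgeting before a crawl, the per-page rate can be applied directly. This is only an estimate based on the rate quoted above; the helper name `estimate_crawl_cost` is hypothetical, and the emitted usage.cost remains the figure to reconcile against.

```python
from decimal import Decimal

PRICE_PER_PAGE_USD = Decimal("0.001")  # rate quoted in the Pricing section


def estimate_crawl_cost(successful_pages: int) -> Decimal:
    """Estimate cost in USD for a number of successfully scraped pages."""
    if successful_pages < 0:
        raise ValueError("successful_pages must not be negative")
    return PRICE_PER_PAGE_USD * successful_pages


print(estimate_crawl_cost(25))  # 0.025
```

Decimal is used instead of float so that small per-page amounts add up without binary rounding error. At the 100-page hard limit, the estimate comes to $0.10 per crawl.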
When to Use Map First
Use Map before Crawl when you want to inspect or filter URLs before fetching page content.
Errors
| Status / event | Meaning |
|---|---|
| 400 | Invalid URL, invalid max_pages, or PDF URL |
| 401 | Missing or invalid LLMLayer API key |
| 500 event | Upstream crawl failure after streaming starts |
More Examples
Streaming Crawl Recipes
Persist pages, handle usage events, and retry failures.
Map API
Discover URLs before crawling content.
