POST /api/v2/scrape
Scraper API - Multi-format content extraction
curl --request POST \
  --url https://api.llmlayer.dev/api/v2/scrape \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "https://example.com/article",
  "formats": [
    "markdown",
    "screenshot"
  ],
  "include_images": true,
  "include_links": true,
  "advanced_proxy": false,
  "main_content_only": false
}
'
{
  "markdown": "<string>",
  "url": "<string>",
  "statusCode": 123,
  "html": "<string>",
  "pdf": "<string>",
  "screenshot": "<string>",
  "title": "<string>",
  "cost": 123,
  "metadata": {}
}

Authorizations

Authorization
string
header
required

Bearer token authentication using your LLMLayer API key. Include in Authorization header as: Bearer YOUR_LLMLAYER_API_KEY
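
As a rough illustration of the header format described above, the following Python sketch builds the request without sending it. The helper name `build_scrape_request` is hypothetical and not part of any LLMLayer SDK; it uses only the standard library, with the endpoint URL and header names taken from the curl sample.

```python
import json
import urllib.request

API_URL = "https://api.llmlayer.dev/api/v2/scrape"


def build_scrape_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer token.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_scrape_request(
    "YOUR_LLMLAYER_API_KEY",
    {"url": "https://example.com/article", "formats": ["markdown"]},
)
# To actually send it: urllib.request.urlopen(req), then json.load() the response.
```

Keeping construction separate from sending makes it easy to inspect the headers before any network call is made.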

Body

application/json
url
string<uri>
required

URL to scrape

Example:

"https://example.com/article"

formats
enum<string>[]
required

Output formats to generate. 'pdf' is accepted for backward compatibility but ignored in this endpoint.

Available options: markdown, html, screenshot, pdf

Example:

["markdown", "screenshot"]

include_images
boolean
default:true

Include images in markdown output

include_links
boolean

Include hyperlinks in markdown output

advanced_proxy
boolean | null
default:false

Enable advanced proxy for heavily protected sites.

main_content_only
boolean | null
default:false

Extract only the main content, excluding navigation and boilerplate.
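
The body parameters above could be assembled client-side along these lines. This is a sketch, not an official client: `build_scrape_body` is a hypothetical name, the up-front rejection of unknown formats is a local choice rather than documented server behavior, and the `include_links` default simply mirrors the sample request.

```python
# Format names listed under "Available options" for the `formats` field.
ALLOWED_FORMATS = {"markdown", "html", "screenshot", "pdf"}


def build_scrape_body(
    url,
    formats,
    include_images=True,
    include_links=True,        # assumption: mirrors the sample request
    advanced_proxy=False,
    main_content_only=False,
):
    """Return a JSON-serializable request body for POST /api/v2/scrape.

    Hypothetical helper; unknown format names raise ValueError locally.
    """
    unknown = [f for f in formats if f not in ALLOWED_FORMATS]
    if unknown:
        raise ValueError(f"unsupported formats: {unknown}")
    return {
        "url": url,
        "formats": list(formats),
        "include_images": include_images,
        "include_links": include_links,
        "advanced_proxy": advanced_proxy,
        "main_content_only": main_content_only,
    }


body = build_scrape_body("https://example.com/article", ["markdown", "screenshot"])
```

Passing `main_content_only=True` would request only the main content, without navigation and boilerplate.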

Response

Scraped content in requested formats

markdown
string
required

Content in markdown format (always returned)

url
string
required

Final URL after any redirects

statusCode
integer
required

HTTP status code (200 for success)

html
string | null

Content in HTML format (when 'html' in formats)

pdf
string | null

Legacy field. Present for backward compatibility; usually null/empty.

screenshot
string | null

Base64-encoded screenshot image (when 'screenshot' in formats)
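
Since the screenshot arrives as a Base64 string, it has to be decoded before it can be saved or displayed. A minimal sketch, assuming the field holds plain Base64 with no data-URI prefix (the image format is not stated in this reference, so no file extension is assumed); `decode_screenshot` is a hypothetical helper:

```python
import base64
from typing import Optional


def decode_screenshot(response: dict) -> Optional[bytes]:
    """Return the raw image bytes, or None when no screenshot was returned."""
    encoded = response.get("screenshot")
    if not encoded:
        # Field is null or absent when 'screenshot' was not in formats.
        return None
    return base64.b64decode(encoded)


# Stand-in response; a real one would come from the API.
sample = {"screenshot": base64.b64encode(b"fake-image-bytes").decode("ascii")}
image_bytes = decode_screenshot(sample)
```

The resulting bytes can be written to disk with `open(path, "wb")`.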

title
string | null

Page title extracted from metadata

cost
number

Cost in USD ($0.001 per format requested)
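
The per-format pricing can be turned into a quick pre-flight estimate. This sketch assumes every entry in `formats` is billed at the stated rate; whether the ignored 'pdf' option counts toward the total is not specified here, so the `cost` field in the response remains the authoritative figure. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rate quoted in the field description: $0.001 per format requested.
PRICE_PER_FORMAT_USD = Decimal("0.001")


def estimate_cost(formats) -> Decimal:
    """Estimate the request cost in USD as rate x number of formats."""
    return PRICE_PER_FORMAT_USD * len(formats)


print(estimate_cost(["markdown", "screenshot"]))  # 0.002
```

`Decimal` is used instead of `float` so that small currency amounts add up without binary rounding noise.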

metadata
object

Additional metadata extracted from the page
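
To show how the required and nullable response fields might be handled together, here is one possible way to parse the decoded JSON into a typed object. `ScrapeResult` and `parse_scrape_response` are hypothetical names; the field names and the required/optional split follow the response schema above.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional

# Fields marked "required" in the response schema.
REQUIRED_KEYS = ("markdown", "url", "statusCode")


@dataclass
class ScrapeResult:
    markdown: str
    url: str
    statusCode: int
    html: Optional[str] = None
    pdf: Optional[str] = None
    screenshot: Optional[str] = None
    title: Optional[str] = None
    cost: Optional[float] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_scrape_response(payload: dict) -> ScrapeResult:
    """Validate required fields and ignore keys this sketch does not know."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise KeyError(f"response is missing required fields: {missing}")
    known = {f.name for f in fields(ScrapeResult)}
    values = {k: v for k, v in payload.items() if k in known}
    if values.get("metadata") is None:
        values["metadata"] = {}
    return ScrapeResult(**values)


result = parse_scrape_response(
    {"markdown": "# Title", "url": "https://example.com/article", "statusCode": 200}
)
```

Optional fields default to `None`, so callers can check `result.html` or `result.screenshot` without guarding against missing keys.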