Overview
Use the Scraper API when you already have a URL and need page content. It supports:
markdown for LLM-ready text
html for raw rendered markup
screenshot for a base64 PNG capture
PDF URLs are not scraped by this endpoint. Use the PDF Content API for PDF text extraction.
Endpoint
POST /api/v2/scrape
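For callers that are not using an SDK, the endpoint takes a JSON POST. The sketch below only assembles the request. The host and the authentication header are not given in this section, so `baseUrl` and `authHeaders` are hypothetical inputs supplied by the caller, and the body fields use the snake_case names listed under Request Parameters.

```typescript
// Body fields, using the snake_case names from the Request Parameters table.
interface ScrapeRequestBody {
  url: string;
  formats: Array<'markdown' | 'html' | 'screenshot'>;
  include_images?: boolean;
  include_links?: boolean;
  advanced_proxy?: boolean;
  main_content_only?: boolean;
}

interface PreparedRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// ASSUMPTION: baseUrl and authHeaders come from the caller. This page does not
// state the API host or the name of the auth header, so neither is hard-coded.
function buildScrapeRequest(
  baseUrl: string,
  authHeaders: Record<string, string>,
  body: ScrapeRequestBody,
): PreparedRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, '')}/api/v2/scrape`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', ...authHeaders },
    body: JSON.stringify(body),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.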
Quickstart
import { LLMLayerClient } from 'llmlayer';

const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY,
});

const page = await client.scrape('https://www.ycombinator.com/blog', {
  formats: ['markdown'],
  mainContentOnly: true,
});

console.log(page.title);
console.log(page.markdown);
console.log(page.statusCode);
Format      Response field  Best for                         Cost
markdown    markdown        LLM input, summaries, retrieval  $0.001
html        html            Archival or custom parsing       $0.001
screenshot  screenshot      Visual verification              $0.001
You can request multiple formats in one call:
const page = await client.scrape({
  url: 'https://www.ycombinator.com',
  formats: ['markdown', 'html', 'screenshot'],
});
Some clients still accept pdf as a format value for backward compatibility, but this endpoint does not generate PDF output. Direct PDF URLs are rejected with a validation error (HTTP 400). For PDF text extraction, use /api/v2/get_pdf_content instead.
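Because PDF URLs are rejected here, a caller may want to pick the endpoint before sending anything. The sketch below is a heuristic under a stated assumption: it treats any URL whose path ends in `.pdf` as a PDF. That check is not part of the API, and it will miss PDFs served from paths without that extension. `chooseEndpoint` is a hypothetical helper name.

```typescript
type Endpoint = '/api/v2/scrape' | '/api/v2/get_pdf_content';

// ASSUMPTION: a ".pdf" path suffix is used as a stand-in for "this is a PDF".
// The server may apply a different check; this only avoids obvious 400s.
function chooseEndpoint(rawUrl: string): Endpoint {
  // Drop the query string and fragment so "?file=a.pdf" does not count.
  const withoutQuery = rawUrl.replace(/[?#].*$/, '');
  return /\.pdf$/i.test(withoutQuery) ? '/api/v2/get_pdf_content' : '/api/v2/scrape';
}
```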
Request Parameters
Parameter          Type      Required  Default               Description
url                string    Yes       -                     Public http or https page URL
formats            string[]  Yes       ["markdown"] in SDKs  Any of markdown, html, screenshot
include_images     boolean   No        true                  Include image references in markdown
include_links      boolean   No        true                  Include links in markdown
advanced_proxy     boolean   No        false                 Use for heavily protected sites
main_content_only  boolean   No        false                 Reduce navigation and boilerplate
HTTP requests use snake_case. The TypeScript SDK uses camelCase, for example advancedProxy and mainContentOnly.
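The naming difference can be bridged mechanically when code written against SDK-style options needs to produce a raw HTTP body. The helpers below are a minimal sketch, not the SDK's own implementation. `camelToSnake` and `toHttpBody` are hypothetical names.

```typescript
// Convert one camelCase key to snake_case, e.g. "mainContentOnly" -> "main_content_only".
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => `_${ch.toLowerCase()}`);
}

// Build an HTTP body from SDK-style options, skipping undefined values so that
// the server-side defaults still apply to parameters the caller did not set.
function toHttpBody(options: Record<string, unknown>): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  for (const key of Object.keys(options)) {
    const value = options[key];
    if (value !== undefined) {
      body[camelToSnake(key)] = value;
    }
  }
  return body;
}
```

Keys that are already lowercase, such as `url` and `formats`, pass through unchanged.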
Response
{
  "markdown": "# Article title\n\nArticle body...",
  "html": null,
  "screenshot": null,
  "pdf": null,
  "url": "https://www.ycombinator.com/blog",
  "title": "Article title",
  "statusCode": 200,
  "cost": 0.001,
  "metadata": {
    "description": "..."
  }
}
Field       Type            Description
markdown    string | null   Markdown content when available/requested
html        string | null   HTML content when requested
screenshot  string | null   Base64 PNG when requested
pdf         string | null   Legacy field; normally null
url         string          Final URL after redirects
title       string | null   Page title
statusCode  integer         Target status code
cost        number | null   Billed cost
metadata    object | null   Extracted metadata
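Since most fields are nullable, consumers usually need a null check before using the content. The sketch below transcribes the field list into a TypeScript interface and adds two small helpers. The interface name and helper names are hypothetical, and the SDK may export its own types under different names.

```typescript
// Hand-written shape based on the response field list; not imported from the SDK.
interface ScrapeResponse {
  markdown: string | null;
  html: string | null;
  screenshot: string | null; // base64-encoded PNG
  pdf: string | null;        // legacy field, normally null
  url: string;
  title: string | null;
  statusCode: number;
  cost: number | null;
  metadata: Record<string, unknown> | null;
}

// Prefer markdown; fall back to html; null when neither format was returned.
function textContent(page: ScrapeResponse): string | null {
  return page.markdown ?? page.html;
}

// Wrap the base64 PNG in a data URI, e.g. for an <img src> attribute.
function screenshotDataUri(page: ScrapeResponse): string | null {
  return page.screenshot === null ? null : `data:image/png;base64,${page.screenshot}`;
}
```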
Pricing
Base cost is $0.001 per requested supported format. Advanced proxy adds $0.004 when enabled.
markdown only: $0.001
markdown + screenshot: $0.002
markdown + html + screenshot: $0.003
markdown + proxy: $0.005
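The pricing rule above reduces to a small calculation, sketched here for budgeting purposes. `estimateCostUsd` is a hypothetical helper. It assumes duplicate format names are billed once and that unsupported values such as pdf are not billed, neither of which this page states explicitly. The `cost` field in the response remains the authoritative figure.

```typescript
const SUPPORTED_FORMATS = ['markdown', 'html', 'screenshot'];

// Estimate the cost of one scrape call in USD.
// Works in tenths of a cent so the arithmetic stays in integers until the end.
function estimateCostUsd(formats: string[], advancedProxy: boolean = false): number {
  // Count each supported format at most once (ASSUMPTION: duplicates are not billed twice).
  const billable = SUPPORTED_FORMATS.filter((f) => formats.some((r) => r === f)).length;
  const tenthsOfCent = billable * 1 + (advancedProxy ? 4 : 0);
  return tenthsOfCent / 1000;
}
```

Applied to the examples above, `estimateCostUsd(['markdown', 'screenshot'])` gives 0.002 and `estimateCostUsd(['markdown'], true)` gives 0.005.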
Errors
Status  Meaning
400     Invalid URL, unsupported scheme, DNS failure, or PDF URL sent to Scraper
401     Missing or invalid LLMLayer API key
403     Blocked private/unsafe target
500     Upstream scrape failure
See Errors & Refunds for the shared error format.
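Client code can turn these statuses into a simple handling decision. The sketch below copies the meanings from the status list in this section. The `retryable` flag is an assumption, not documented behavior: it treats only 500 as worth retrying, on the reasoning that the 4xx cases describe problems with the request itself. `describeScrapeError` is a hypothetical helper.

```typescript
// Meanings copied from the status list in this section.
const SCRAPE_ERROR_MEANINGS: Record<number, string> = {
  400: 'Invalid URL, unsupported scheme, DNS failure, or PDF URL sent to Scraper',
  401: 'Missing or invalid LLMLayer API key',
  403: 'Blocked private/unsafe target',
  500: 'Upstream scrape failure',
};

interface ScrapeErrorInfo {
  meaning: string;
  retryable: boolean;
}

function describeScrapeError(status: number): ScrapeErrorInfo {
  const meaning = SCRAPE_ERROR_MEANINGS[status] ?? 'Status not listed for this endpoint';
  // ASSUMPTION: only upstream failures (500) are worth retrying.
  return { meaning, retryable: status === 500 };
}
```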
More Examples
Search + Scrape Pipeline: Search the web, scrape pages, and answer from collected context.
Extract API: Use structured extraction when you need schema-shaped data.