Overview
Use Extract when you need one page transformed into application-ready data. A single request can combine modes, sharing the same page fetch:
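| Mode | Response field | Requires LLM | Cost |
| --- | --- | --- | --- |
| json | structured_data | Yes | $0.005 |
| summary | summary | Yes | $0.005 |
| qa | answer | Yes | $0.005 |
| links | links | No | $0.001 |
| brand | brand | No | $0.002 |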
Every response includes all result fields. Fields for modes you did not request are null.
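As a rough illustration of how the per-mode prices combine in a multi-mode request, the sketch below adds up the prices from the table. It assumes the total is a plain sum of the listed per-mode prices, which this page does not state outright. `MODE_PRICES` and `estimate_cost` are hypothetical names and not part of any SDK.

```python
from decimal import Decimal

# Per-mode prices copied from the table above (USD per request).
MODE_PRICES = {
    "json": Decimal("0.005"),
    "summary": Decimal("0.005"),
    "qa": Decimal("0.005"),
    "links": Decimal("0.001"),
    "brand": Decimal("0.002"),
}


def estimate_cost(modes):
    """Estimate the request cost, ASSUMING per-mode prices simply add up."""
    unknown = [m for m in modes if m not in MODE_PRICES]
    if unknown:
        raise ValueError(f"unknown modes: {unknown}")
    # Count each mode once even if it appears twice in the list.
    return sum((MODE_PRICES[m] for m in set(modes)), Decimal("0"))


print(estimate_cost(["summary", "links", "brand"]))  # 0.008
```

Decimal is used here so that small prices add up exactly. Treat the result as an estimate and rely on the cost field of the actual response for billing.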
Endpoint
POST /api/v2/extract
Quickstart: Structured Data
import { LLMLayerClient } from 'llmlayer';

const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY,
});

const program = await client.extract('https://www.ycombinator.com/about', {
  modes: ['json'],
  jsonSchema: {
    program: 'string',
    duration: 'string',
    funding: 'string',
    benefits: ['string'],
  },
  instructions: 'Return concise values and use null when a field is missing.',
});

console.log(program.structured_data);
console.log(program.summary); // null
Request Parameters
| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | Public http or https page URL |
| modes | string[] | No | ["json"] | Any of json, summary, qa, links, brand |
| json_schema | object or string | Conditional | null | Required when modes includes json |
| query | string | Conditional | null | Required when modes includes qa |
| instructions | string | No | null | Extra guidance for LLM modes: json, summary, qa |
| response_language | string | No | auto | Best for summary and Q&A text |
| advanced_proxy | boolean | No | false | Use for heavily protected pages |
| main_content_only | boolean or null | No | API-selected | Omit to let the API choose the best mode-specific default |
HTTP requests use snake_case. The TypeScript SDK uses camelCase, for example jsonSchema, responseLanguage, and advancedProxy.
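To make the naming difference concrete, here is a minimal sketch that turns camelCase option names into the snake_case keys a raw HTTP body would use. `to_snake_case` and `build_request_body` are hypothetical helpers written for illustration. They are not part of either SDK, and the sketch does not send the request or deal with authentication.

```python
import json
import re


def to_snake_case(name):
    """Convert a camelCase name such as 'jsonSchema' to 'json_schema'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_request_body(url, options):
    """Build a JSON body for POST /api/v2/extract from camelCase options."""
    body = {"url": url}
    for key, value in options.items():
        # Only top-level parameter names are renamed; keys inside the
        # schema itself describe your data and are left untouched.
        body[to_snake_case(key)] = value
    return json.dumps(body)


print(build_request_body(
    "https://www.ycombinator.com/about",
    {"modes": ["json"], "jsonSchema": {"program": "string"}, "advancedProxy": True},
))
```

The conversion covers top-level names only, so field names you define inside the schema reach the API exactly as written.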
Response
{
  "url": "https://www.ycombinator.com/about",
  "title": "What Happens at YC | Y Combinator",
  "metadata": {
    "description": "..."
  },
  "structured_data": {
    "program": "Y Combinator startup program",
    "duration": "3 months",
    "funding": "$500k per company",
    "benefits": ["office hours", "founder community", "Demo Day"]
  },
  "summary": null,
  "answer": null,
  "links": null,
  "brand": null,
  "cost": 0.005,
  "response_time": "3.42",
  "statusCode": 200
}
| Field | Type | Description |
| --- | --- | --- |
| url | string | Final URL after redirects |
| title | string or null | Page title |
| metadata | object or null | Page metadata found by the scraper |
| structured_data | object or null | Result of json mode |
| summary | string or null | Result of summary mode |
| answer | string or null | Result of qa mode |
| links | array or null | Result of links mode |
| brand | object or null | Result of brand mode |
| cost | number or null | Total cost for selected modes |
| response_time | string | Total processing time in seconds |
| statusCode | integer | 200 on success |
The structured result field is structured_data in the API, Python SDK, and TypeScript SDK.
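Because every result field is always present, a consumer typically checks which ones are non-null. The sketch below parses an abbreviated copy of the sample response and lists the modes that produced a result. `MODE_FIELDS` and `returned_modes` are hypothetical names for illustration, and the code works on the raw JSON rather than on an SDK object.

```python
import json

# Mode name -> response field, as listed in the Overview table.
MODE_FIELDS = {
    "json": "structured_data",
    "summary": "summary",
    "qa": "answer",
    "links": "links",
    "brand": "brand",
}

# Abbreviated copy of the sample response above.
raw = """
{
  "url": "https://www.ycombinator.com/about",
  "structured_data": {"program": "Y Combinator startup program", "duration": "3 months"},
  "summary": null,
  "answer": null,
  "links": null,
  "brand": null,
  "cost": 0.005,
  "response_time": "3.42",
  "statusCode": 200
}
"""


def returned_modes(response):
    """Return the modes whose result field is present and not null."""
    return [mode for mode, field in MODE_FIELDS.items() if response.get(field) is not None]


response = json.loads(raw)
print(returned_modes(response))          # ['json']
# response_time arrives as a string, so convert it before doing arithmetic.
print(float(response["response_time"]))  # 3.42
```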
Combining Modes
# Python
profile = client.extract(
    "https://www.ycombinator.com",
    modes=["summary", "links", "brand"],
)
print(profile.summary)
print(profile.links)
print(profile.brand)

// TypeScript
const profile = await client.extract('https://www.ycombinator.com', {
  modes: ['summary', 'links', 'brand'],
});
console.log(profile.summary);
console.log(profile.links);
console.log(profile.brand);
JSON Schema Guidance
json_schema can be:
A formal JSON schema
An example object
A plain object where values describe expected types
A plain-text description
For reliable extraction:
Keep schemas focused.
Use arrays only when the page clearly contains lists.
Put normalization rules in instructions.
Use nullable expectations when fields may be missing.
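The four accepted forms can be sketched side by side for the same small set of fields. This is an illustration only: the field names come from the Quickstart, the variable names are hypothetical, and how strictly the API interprets each form is not specified on this page.

```python
# 1. Formal JSON Schema. The ["string", "null"] type marks a nullable field.
formal_schema = {
    "type": "object",
    "properties": {
        "program": {"type": "string"},
        "funding": {"type": ["string", "null"]},
        "benefits": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["program"],
}

# 2. Example object with representative values.
example_object = {
    "program": "Y Combinator startup program",
    "funding": "$500k per company",
    "benefits": ["office hours", "founder community"],
}

# 3. Plain object whose values name the expected types (as in the Quickstart).
type_hints = {"program": "string", "funding": "string", "benefits": ["string"]}

# 4. Plain-text description.
text_description = (
    "The program name, the funding amount (null if not stated), "
    "and a list of benefits."
)

# Normalization rules go in instructions rather than in the schema.
instructions = "Return concise values and use null when a field is missing."

for schema in (formal_schema, example_object, type_hints, text_description):
    print(type(schema).__name__)
```

All four variants ask for the same three fields, so the schema stays focused and the only array is used for content the page presents as a list.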
Errors and Refunds
| Status | Common reason | Charged? |
| --- | --- | --- |
| 400 | Missing json_schema, missing query, invalid mode, PDF URL | No |
| 422 | Empty extractable content, or JSON output truncated | Depends on whether AI work ran |
| 500 | Page fetch failed | Refunded when failure happens before AI work |
| 502 | Brand fetch or model JSON failure | Depends on failure stage |
See Errors & Refunds for exact error codes and refund behavior.
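The common 400 causes listed above can be caught before a request is sent. The following is a minimal client-side sketch, not the API's own validation: `preflight_errors` is a hypothetical helper, the default for modes mirrors the documented `["json"]`, and the PDF check is a simple file-extension heuristic that may differ from how the API detects PDFs.

```python
from urllib.parse import urlparse

VALID_MODES = {"json", "summary", "qa", "links", "brand"}


def preflight_errors(url, modes=("json",), json_schema=None, query=None):
    """Return a list of problems that would likely cause a 400 response."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("url must be an http or https URL")
    # Heuristic only: checks the path extension, not the content type.
    if parsed.path.lower().endswith(".pdf"):
        problems.append("PDF URLs are rejected")
    invalid = sorted(set(modes) - VALID_MODES)
    if invalid:
        problems.append(f"invalid modes: {invalid}")
    if "json" in modes and json_schema is None:
        problems.append("json mode requires json_schema")
    if "qa" in modes and query is None:
        problems.append("qa mode requires query")
    return problems


print(preflight_errors("https://example.com/report.pdf", modes=["qa"]))
# ['PDF URLs are rejected', 'qa mode requires query']
```

An empty list means none of these checks failed. It does not guarantee that the request will succeed, since 422, 500, and 502 depend on the page and on processing.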
More Examples
Extract Recipes: Products, lists, Q&A, and brand enrichment.
API Reference: Full request and response schema.