Overview
The Answer API combines live web search with LLM generation in one request. Use it when you need:
- Current information from the web
- Source-backed responses
- Structured output (JSON) for downstream processing
- Streaming UX for chat and copilots
Endpoints
| Endpoint | Method | Best for | Supports JSON output |
|---|---|---|---|
| /api/v2/answer | POST | Standard request/response flows | Yes |
| /api/v2/answer_stream | POST (SSE) | Real-time streaming UX | No |
provider_key is deprecated and currently ignored by both endpoints. It is still accepted for backward compatibility.

Authentication

All requests require a valid LLMLayer API key.

Model Selection
| Model | Pricing model | Best for |
|---|---|---|
| llmlayer-web | Flat $0.007 × max_queries | Default recommendation |
| llmlayer-fast | Flat $0.009 × max_queries | Faster responses |
| openai/gpt-4o-mini | Token pricing + LLMLayer fee | Budget + quality |
| openai/gpt-5.1 | Token pricing + LLMLayer fee | Highest reasoning quality |
Quickstart
Non-streaming (/answer)
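A minimal Python sketch of a non-streaming call. The base URL (https://api.llmlayer.dev) and the Bearer-token Authorization header are assumptions, not confirmed by this page; check the API reference for the exact values.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.llmlayer.dev"  # assumption: confirm the real base URL
API_KEY = os.environ.get("LLMLAYER_API_KEY", "")


def build_answer_request(query: str, model: str = "llmlayer-web", **options) -> urllib.request.Request:
    """Build a POST request for /api/v2/answer using snake_case field names."""
    payload = {"query": query, "model": model, **options}
    return urllib.request.Request(
        f"{BASE_URL}/api/v2/answer",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumption: auth scheme
        },
        method="POST",
    )


req = build_answer_request("What did the ECB announce today?", return_sources=True)
# with urllib.request.urlopen(req) as resp:   # performs the live call
#     result = json.load(resp)
#     print(result["answer"])
```

Keep the API key server-side; never embed it in browser code.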
Streaming (/answer_stream)
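A sketch of decoding the SSE frames that /api/v2/answer_stream emits, assuming each event arrives as a data: line (the shapes match the Streaming Event Contract section; the transport call itself is omitted):

```python
import json


def parse_sse_line(line: str):
    """Decode one SSE line; returns the event dict, or None for non-data lines."""
    if not line.startswith("data:"):
        return None  # comments, blank keep-alives, etc.
    return json.loads(line[len("data:"):].strip())


# Illustrative frames in the documented event shapes:
raw = [
    'data: {"type": "answer", "content": "LLMLayer is "}',
    'data: {"type": "answer", "content": "an answer API."}',
    'data: {"type": "done", "response_time": "1.8"}',
]
events = [e for e in map(parse_sse_line, raw) if e]
answer = "".join(e["content"] for e in events if e["type"] == "answer")
```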
Request Parameters
This is the canonical request-body table for both endpoints.

| Parameter | Type | Required | Default | Applies to | Cost impact | Details |
|---|---|---|---|---|---|---|
| query | string | Yes | - | Both | - | User question/instruction |
| model | string | Yes | - | Both | Depends on model | Example: llmlayer-web, openai/gpt-4o-mini |
| search_type | string | No | general | Both | - | general or news |
| date_filter | string | No | anytime | Both | - | anytime, hour, day, week, month, year |
| location | string | No | us | Both | - | Country code for localized search |
| domain_filter | string[] | No | null | Both | - | Include domains, or exclude with - prefix |
| search_context_size | string | No | medium | Both | Indirect | low, medium, high |
| max_queries | integer | No | 1 | Both | Increases LLMLayer fee | Range 1-4 |
| max_tokens | integer | No | 1500 | Both | Increases model usage | Response length cap |
| temperature | number | No | 0.7 | Both | - | Range 0.0-2.0 |
| response_language | string | No | auto | Both | - | Example: en, fr, es |
| citations | boolean | No | false | Both | Indirect | Adds inline citation markers |
| return_sources | boolean | No | false | Both | - | Includes sources array |
| return_images | boolean | No | false | Both | + $0.001 | Includes images array |
| answer_type | string | No | markdown | /answer | - | markdown, html, json |
| json_schema | object or string | Conditional | null | /answer | - | Required when answer_type="json" |
| system_prompt | string | No | null | Both | - | Custom behavior instructions |
| provider_key | string | No | null | Both | - | Deprecated, accepted, ignored |
HTTP requests use snake_case field names; the JavaScript SDK examples use camelCase.

Parameter Rules
- max_queries must be between 1 and 4.
- answer_type="json" requires json_schema.
- /api/v2/answer_stream does not support structured JSON output.
- provider_key does not change routing or billing.
- Use -domain.com in domain_filter to exclude domains.
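These rules can be checked client-side before sending a request. A minimal sketch (the error messages are illustrative, not API output):

```python
def validate_answer_payload(payload: dict) -> list:
    """Check a request body against the parameter rules; returns a list of problems."""
    problems = []
    if not 1 <= payload.get("max_queries", 1) <= 4:
        problems.append("max_queries must be between 1 and 4")
    if payload.get("answer_type") == "json" and not payload.get("json_schema"):
        problems.append('answer_type="json" requires json_schema')
    if not 0.0 <= payload.get("temperature", 0.7) <= 2.0:
        problems.append("temperature must be in 0.0-2.0")
    return problems
```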
Non-streaming Response Contract (/answer)
| Field | Type | When present | Description |
|---|---|---|---|
| answer | string or object | Always | Generated answer (object in JSON mode) |
| sources | array | If return_sources=true | Source documents used |
| images | array | If return_images=true | Image search results |
| response_time | string | Always | Total processing time in seconds |
| input_tokens | integer | Always | Input token usage |
| output_tokens | integer | Always | Output token usage |
| model_cost | number or null | Usually | Model usage cost |
| llmlayer_cost | number | Always | LLMLayer infrastructure cost |
Source objects typically include title, link, snippet (plus provider-specific extras).
Image objects typically include title, imageUrl, thumbnailUrl, source, link.
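As an illustration, here is how a client might read costs and sources off a non-streaming response. The field names come from the table above; the values are invented:

```python
# Illustrative /answer response (values are made up):
response = {
    "answer": "The summit concluded on Friday.",
    "sources": [{"title": "Example", "link": "https://example.com", "snippet": "..."}],
    "response_time": "2.3",
    "input_tokens": 512,
    "output_tokens": 128,
    "model_cost": None,        # may be null, e.g. for flat-priced models
    "llmlayer_cost": 0.007,
}

# Treat a null model_cost as zero when summing billing figures.
total_cost = (response["model_cost"] or 0.0) + response["llmlayer_cost"]
source_links = [s["link"] for s in response.get("sources", [])]
```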
Streaming Event Contract (/answer_stream)
The stream is Server-Sent Events (text/event-stream). Each frame contains JSON under data:.
| Event type | Payload | Notes |
|---|---|---|
| sources | { "type": "sources", "data": Source[] } | Emitted when return_sources=true |
| images | { "type": "images", "data": Image[] } | Emitted when return_images=true |
| answer | { "type": "answer", "content": "..." } | Main text chunks |
| usage | { "type": "usage", "input_tokens": ..., "output_tokens": ..., "model_cost": ..., "llmlayer_cost": ... } | Billing and token usage |
| done | { "type": "done", "response_time": "..." } | Final event |
| error | { "type": "error", "error": "..." } | Runtime stream error |
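Putting the events together, a sketch of folding an entire stream into one result object (assumes each event arrives on its own data: line):

```python
import json


def consume_stream(lines):
    """Fold /answer_stream events into a single result dict."""
    result = {"answer": "", "sources": [], "images": [], "usage": None, "error": None}
    for line in lines:
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):].strip())
        kind = event["type"]
        if kind == "answer":
            result["answer"] += event["content"]
        elif kind in ("sources", "images"):
            result[kind] = event["data"]
        elif kind == "usage":
            result["usage"] = event
        elif kind == "error":
            result["error"] = event["error"]
        elif kind == "done":
            result["response_time"] = event["response_time"]
    return result
```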
Practical Examples
1) News summary with citations
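A request-body sketch for this case. All parameter names come from the table above; the query text is illustrative:

```python
# News summary with inline citation markers and resolvable sources.
payload = {
    "query": "Summarize today's developments in EU AI regulation",
    "model": "llmlayer-web",
    "search_type": "news",      # news vertical instead of general search
    "date_filter": "day",       # restrict to the last day
    "citations": True,          # add inline citation markers
    "return_sources": True,     # include the sources array to resolve markers
}
```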
2) Structured JSON extraction (/answer only)
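A sketch of a structured-output request. The schema below is a plain JSON-Schema-style object; this page does not pin down the exact schema dialect, so treat the shape as an assumption:

```python
# Example schema: extract a few facts into a fixed structure.
json_schema = {
    "type": "object",
    "properties": {
        "company": {"type": "string"},
        "ceo": {"type": "string"},
        "founded_year": {"type": "integer"},
    },
    "required": ["company", "ceo"],
}

payload = {
    "query": "Who is the CEO of OpenAI and when was the company founded?",
    "model": "openai/gpt-4o-mini",
    "answer_type": "json",       # /answer only; not supported on /answer_stream
    "json_schema": json_schema,  # required whenever answer_type="json"
}
```

In JSON mode the answer field of the response is an object rather than a string.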
3) Domain-constrained answer
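A sketch combining domain_filter includes and excludes (the - prefix excludes a domain); the domains chosen here are illustrative:

```python
# Constrain search to arxiv.org while excluding an unwanted domain.
payload = {
    "query": "Recent findings on sparse attention",
    "model": "llmlayer-web",
    "domain_filter": ["arxiv.org", "-pinterest.com"],
    "search_context_size": "high",  # widen context for research-style queries
    "return_sources": True,
}
```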
Error Handling
/answer error format
Common status codes
| Status | Category | Typical reason |
|---|---|---|
| 400 | Validation | Missing/invalid request parameters |
| 401 | Authentication | Missing or invalid LLMLayer API key |
| 429 | Rate limit/provider | Provider or account rate limiting |
| 500 | Internal/provider | Unexpected backend/provider failure |
| 502 | Provider | Upstream provider-specific failure |
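A sketch of the retry/backoff pattern for the transient statuses above. Here call is a hypothetical zero-argument function that performs one request and returns (status, body):

```python
import random
import time

RETRYABLE = {429, 500, 502}  # transient statuses worth retrying


def with_retries(call, max_attempts: int = 4):
    """Retry `call` on transient statuses with capped, jittered exponential backoff."""
    for attempt in range(max_attempts):
        status, body = call()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(min(2 ** attempt, 8) + random.random())
    return status, body  # give up after max_attempts
```

400 and 401 are not retried: they indicate a bad request or key, which retrying cannot fix.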
/answer_stream errors
- Runtime failures are emitted as stream events (type: error).
- Early validation failures can appear as an immediate single error frame.
Pricing
Standard token-priced models
LLMLayer fixed-price models
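A sketch estimating the LLMLayer fee for the flat-priced models, using the rates from the Model Selection table and the $0.001 return_images surcharge from the parameter table. Token-priced models add provider token costs that are not modeled here:

```python
# Flat per-request rates from the Model Selection table.
FLAT_RATES = {"llmlayer-web": 0.007, "llmlayer-fast": 0.009}


def estimate_llmlayer_fee(model: str, max_queries: int = 1, return_images: bool = False) -> float:
    """Estimate the LLMLayer fee (USD) for a flat-priced model request."""
    fee = FLAT_RATES[model] * max_queries
    if return_images:
        fee += 0.001  # image results surcharge
    return round(fee, 6)
```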
Use llmlayer_cost and model_cost from responses as the billing source of truth.

Implementation Checklist
- Start with /answer unless you need progressive rendering.
- Set max_queries=1 first; increase only for research-style queries.
- Enable return_sources=true for trust-sensitive use cases.
- Use answer_type="json" + json_schema for structured pipelines.
- Add retry/backoff logic for transient 429/500/502 paths.
- Keep keys server-side and log llmlayer_cost + token usage.
FAQ
When should I use /answer_stream instead of /answer?
Use /answer_stream for chat UIs and live typing effects. Use /answer for batch jobs, strict request/response flows, and structured JSON output.

Can I stream JSON structured output?
No. Streaming returns incremental text chunks and does not support JSON schema-constrained output.
How do I control source quality?
Use domain_filter, search_type, date_filter, and search_context_size together. For factual tasks, use a lower temperature.

Is provider_key still supported?
It is accepted to avoid breaking older clients, but currently ignored.
How do I minimize cost?
Start with llmlayer-web, keep max_queries=1, request images and sources only when needed, and tune max_tokens to the expected output length.

Next Steps
- Web Search API: Raw search results without LLM generation
- Scraper API: Extract full page content from URLs
- Answer Stream Endpoint: OpenAPI reference for the SSE endpoint
- Python SDK: Python package and usage examples
- TypeScript SDK: JS/TS package and usage examples
Need Help?
- Discord Community: Ask implementation questions
