POST /api/v2/answer_stream

Answer API - Stream responses in real-time
curl --request POST \
  --url https://api.llmlayer.dev/api/v2/answer_stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "query": "What are the latest developments in quantum computing?",
  "model": "openai/gpt-4o-mini",
  "location": "us",
  "provider_key": "<string>",
  "system_prompt": "<string>",
  "response_language": "auto",
  "answer_type": "markdown",
  "search_type": "general",
  "json_schema": "{\"type\":\"object\",\"properties\":{\"summary\":{\"type\":\"string\"},\"key_points\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}}}",
  "citations": false,
  "return_sources": false,
  "return_images": false,
  "date_filter": "anytime",
  "max_tokens": 1500,
  "temperature": 0.7,
  "domain_filter": [
    "wikipedia.org",
    "-reddit.com"
  ],
  "max_queries": 3,
  "search_context_size": "medium"
}
'
"data: {\"type\":\"sources\",\"data\":[{\"title\":\"Example\",\"link\":\"https://example.com\",\"snippet\":\"...\"}]}\n\n"

Authorizations

Authorization
string
header
required

Bearer token authentication using your LLMLayer API key. Include it in the Authorization header as: Bearer YOUR_LLMLAYER_API_KEY

Body

application/json
query
string
required

The search query or question to answer

Example:

"What are the latest developments in quantum computing?"

model
string
required

LLM model to use (e.g., openai/gpt-4o-mini, anthropic/claude-sonnet-4, groq/llama-3.3-70b-versatile)

Example:

"openai/gpt-4o-mini"

location
string
default:us

Country code for localized search results (us, uk, ca, etc.)

Example:

"us"

provider_key
string | null

Your own API key for the model provider. If provided, you pay the provider directly and LLMLayer only charges for search infrastructure.

system_prompt
string | null

Custom system prompt to override default behavior. Use this to customize how the LLM processes the search results.

response_language
string
default:auto

Language for the response. 'auto' detects from query, or specify language code (en, es, fr, etc.)

Example:

"auto"

answer_type
enum<string>
default:markdown

Format of the response. Use 'json' with json_schema for structured output.

Available options:
markdown,
html,
json
search_type
enum<string>
default:general

Type of web search to perform. 'news' provides recent news articles.

Available options:
general,
news
json_schema
string | null

JSON schema as string for structured responses. Required when answer_type='json'. The LLM will format its response according to this schema.

Example:

"{\"type\":\"object\",\"properties\":{\"summary\":{\"type\":\"string\"},\"key_points\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}}}"

citations
boolean
default:false

Include inline citations [1], [2] in the response text

return_sources
boolean
default:false

Return the source documents used for answer generation

return_images
boolean
default:false

Return relevant images from image search. Adds $0.001 to cost.

date_filter
enum<string>
default:anytime

Filter search results by recency. Useful for time-sensitive queries.

Available options:
anytime,
hour,
day,
week,
month,
year
max_tokens
integer
default:1500

Maximum tokens in the LLM response. Affects cost.

Required range: x >= 1
temperature
number
default:0.7

Controls response randomness. 0=deterministic, 2=very creative. Not supported by all models.

Required range: 0 <= x <= 2
domain_filter
string[] | null

Include or exclude specific domains. Use '-' prefix to exclude (e.g., ['-reddit.com', 'wikipedia.org'])

Example:
["wikipedia.org", "-reddit.com"]
max_queries
integer
default:1

Number of search queries to generate from the user query. More queries = broader search but higher cost. Each additional query costs $0.004.

Required range: 1 <= x <= 5
Example:

3

search_context_size
enum<string>
default:medium

Amount of search context to extract and pass to LLM. 'high' provides more context but uses more tokens.

Available options:
low,
medium,
high

Response

Server-Sent Events stream

The response is a Server-Sent Events (SSE) stream. Each event is a data: line carrying a JSON object with a type field; the event types are sources, images, answer, usage, done, and error.
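A minimal streaming client reads the data: lines and dispatches on the event type. A hedged Python sketch using only the standard library (urllib is my choice of HTTP client; any client that exposes the response line by line works, and event fields other than type are inferred from the sample event above):

```python
import json
import urllib.request

def iter_events(lines):
    """Yield the JSON object carried by each SSE 'data:' line.

    Accepts an iterable of str or bytes lines, as produced by
    iterating an HTTP response body.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.rstrip("\r\n")
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

def stream_answer(payload, api_key):
    """POST to /api/v2/answer_stream and yield parsed events.

    Stops after a 'done' event, per the event types listed above.
    """
    req = urllib.request.Request(
        "https://api.llmlayer.dev/api/v2/answer_stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for event in iter_events(resp):
            yield event
            if event.get("type") == "done":
                break
```

With return_sources enabled, a sources event like the sample above arrives before the answer events, so a caller can render citations as the answer streams in.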