POST /api/v2/answer_stream

Answer API - Stream responses in real-time
curl --request POST \
  --url https://api.llmlayer.dev/api/v2/answer_stream \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "query": "What are the latest developments in quantum computing?",
  "model": "openai/gpt-4o-mini",
  "location": "us",
  "provider_key": "<string>",
  "system_prompt": "<string>",
  "response_language": "auto",
  "answer_type": "markdown",
  "search_type": "general",
  "json_schema": "{\"type\":\"object\",\"properties\":{\"summary\":{\"type\":\"string\"},\"key_points\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}}}",
  "citations": false,
  "return_sources": false,
  "return_images": false,
  "date_filter": "anytime",
  "max_tokens": 1500,
  "temperature": 0.7,
  "domain_filter": [
    "wikipedia.org",
    "-reddit.com"
  ],
  "max_queries": 3,
  "search_context_size": "medium"
}
'
"data: {\"type\":\"sources\",\"data\":[{\"title\":\"Example\",\"link\":\"https://example.com\",\"snippet\":\"...\"}]}\n\n"

Authorizations

Authorization
string
header
required

Bearer token authentication using your LLMLayer API key. Include it in the Authorization header as: Bearer YOUR_LLMLAYER_API_KEY

Body

application/json
query
string
required

The search query or question to answer

Example:

"What are the latest developments in quantum computing?"

model
string
required

LLM model to use (e.g., openai/gpt-4o-mini, anthropic/claude-sonnet-4, groq/llama-3.3-70b-versatile)

Example:

"openai/gpt-4o-mini"

location
string
default:us

Country code for localized search results (us, uk, ca, etc.)

Example:

"us"

provider_key
string | null

Your own API key for the model provider. If provided, you pay the provider directly and LLMLayer only charges for search infrastructure.

system_prompt
string | null

Custom system prompt to override default behavior. Use this to customize how the LLM processes the search results.

response_language
string
default:auto

Language for the response. 'auto' detects from query, or specify language code (en, es, fr, etc.)

Example:

"auto"

answer_type
enum<string>
default:markdown

Format of the response. Use 'json' with json_schema for structured output.

Available options:
markdown,
html,
json
search_type
enum<string>
default:general

Type of web search to perform. 'news' provides recent news articles.

Available options:
general,
news
json_schema
string | null

JSON schema as string for structured responses. Required when answer_type='json'. The LLM will format its response according to this schema.

Example:

"{\"type\":\"object\",\"properties\":{\"summary\":{\"type\":\"string\"},\"key_points\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}}}"

citations
boolean
default:false

Include inline citations [1], [2] in the response text

return_sources
boolean
default:false

Return the source documents used for answer generation

return_images
boolean
default:false

Return relevant images from image search. Adds $0.001 to cost.

date_filter
enum<string>
default:anytime

Filter search results by recency. Useful for time-sensitive queries.

Available options:
anytime,
hour,
day,
week,
month,
year
max_tokens
integer
default:1500

Maximum tokens in the LLM response. Affects cost.

Required range: x >= 1
temperature
number
default:0.7

Controls response randomness. 0=deterministic, 2=very creative. Not supported by all models.

Required range: 0 <= x <= 2
domain_filter
string[] | null

Include or exclude specific domains. Use '-' prefix to exclude (e.g., ['-reddit.com', 'wikipedia.org'])

Example:
["wikipedia.org", "-reddit.com"]
max_queries
integer
default:1

Number of search queries to generate from the user query. More queries = broader search but higher cost. Each additional query costs $0.004.

Required range: 1 <= x <= 5
Example:

3

search_context_size
enum<string>
default:medium

Amount of search context to extract and pass to LLM. 'high' provides more context but uses more tokens.

Available options:
low,
medium,
high

Response

Server-Sent Events stream

The response is a Server-Sent Events (SSE) stream. Each event is a data: line carrying a JSON object with a type field; the event types are sources, images, answer, usage, done, and error.
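A minimal streaming client reads the data: lines and dispatches on the event type. A hedged Python sketch using only the standard library (urllib is my choice of HTTP client; any client that exposes the response line by line works, and event fields other than type are inferred from the sample event above):

```python
import json
import urllib.request

def iter_events(lines):
    """Yield the JSON object carried by each SSE 'data:' line.

    Accepts an iterable of str or bytes lines, as produced by
    iterating an HTTP response body.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.rstrip("\r\n")
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

def stream_answer(payload, api_key):
    """POST to /api/v2/answer_stream and yield parsed events.

    Stops after a 'done' event, per the event types listed above.
    """
    req = urllib.request.Request(
        "https://api.llmlayer.dev/api/v2/answer_stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for event in iter_events(resp):
            yield event
            if event.get("type") == "done":
                break
```

With return_sources enabled, a sources event like the sample above arrives before the answer events, so a caller can render citations as the answer streams in.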