Stream search results and AI responses in real time using Server-Sent Events (SSE). This endpoint does not support JSON structured output.
Bearer token authentication using your LLMLayer API key. Include it in the Authorization header as: Bearer YOUR_LLMLAYER_API_KEY
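A minimal sketch of how a client might consume this stream. The endpoint URL is an assumption (not confirmed by this reference); the `query` and `model` fields come from the parameters documented below. The SSE parsing helper is generic and standalone:

```python
import json

def parse_sse_events(lines):
    """Parse raw SSE lines into (event, data) tuples.

    An SSE frame looks like:
        event: answer
        data: {"text": "..."}
        (a blank line terminates the frame)
    """
    event, data = None, []
    for line in lines:
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and event is not None:
            yield event, "\n".join(data)
            event, data = None, []

# Hypothetical request (the URL path below is an assumption):
#
# import requests
# resp = requests.post(
#     "https://api.llmlayer.dev/api/v1/search_stream",  # assumed endpoint
#     headers={"Authorization": "Bearer YOUR_LLMLAYER_API_KEY"},
#     json={
#         "query": "What are the latest developments in quantum computing?",
#         "model": "openai/gpt-4o-mini",
#     },
#     stream=True,
# )
# for event, data in parse_sse_events(resp.iter_lines(decode_unicode=True)):
#     print(event, data)
```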
The search query or question to answer
"What are the latest developments in quantum computing?"
LLM model to use (e.g., openai/gpt-4o-mini, anthropic/claude-sonnet-4, groq/llama-3.3-70b-versatile)
"openai/gpt-4o-mini"
Country code for localized search results (us, uk, ca, etc.)
"us"
Your own API key for the model provider. If provided, you pay the provider directly and LLMLayer only charges for search infrastructure.
Custom system prompt to override default behavior. Use this to customize how the LLM processes the search results.
Language for the response. 'auto' detects the language from the query; otherwise specify a language code (en, es, fr, etc.)
"auto"
Format of the response. Use 'json' with json_schema for structured output.
Allowed values: markdown, html, json
Type of web search to perform. 'news' provides recent news articles.
Allowed values: general, news
JSON schema as a string for structured responses. Required when answer_type='json'. The LLM will format its response according to this schema.
"{\"type\":\"object\",\"properties\":{\"summary\":{\"type\":\"string\"},\"key_points\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}}}}"
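Since this parameter takes the schema as a string, building it as a dict and serializing with `json.dumps` avoids hand-escaping quotes, as in this sketch:

```python
import json

# Build the schema as a plain dict, then serialize it to the string
# form expected by the json_schema parameter.
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "key_points": {"type": "array", "items": {"type": "string"}},
    },
}
json_schema_param = json.dumps(schema)  # pass this string as json_schema
```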
Include inline citations [1], [2] in the response text
Return the source documents used for answer generation
Return relevant images from image search. Adds $0.001 to cost.
Filter search results by recency. Useful for time-sensitive queries.
Allowed values: anytime, hour, day, week, month, year
Maximum tokens in the LLM response. Affects cost.
Required range: x >= 1
Controls response randomness. 0=deterministic, 2=very creative. Not supported by all models.
Required range: 0 <= x <= 2
Include or exclude specific domains. Use '-' prefix to exclude (e.g., ['-reddit.com', 'wikipedia.org'])
["wikipedia.org", "-reddit.com"]
Number of search queries to generate from the user query. More queries = broader search but higher cost. Each additional query costs $0.004.
Required range: 1 <= x <= 5
3
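The per-query pricing above can be sketched as a small helper (the function name and range check are illustrative, not part of the API):

```python
def extra_query_cost(num_queries, per_query_usd=0.004):
    """Cost added by search queries beyond the first, per the pricing above.

    Enforces the documented 1-5 range for the number of generated queries.
    """
    if not 1 <= num_queries <= 5:
        raise ValueError("num_queries must be between 1 and 5")
    return round((num_queries - 1) * per_query_usd, 6)
```

For example, generating 3 queries adds (3 - 1) x $0.004 = $0.008 on top of the base search cost.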
Amount of search context to extract and pass to LLM. 'high' provides more context but uses more tokens.
Allowed values: low, medium, high
Server-Sent Events stream
SSE stream with event types: sources, images, answer, usage, done, error
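A sketch of dispatching on these event types while accumulating a response. The payload shapes (JSON for sources/images/usage, plain text chunks for answer) are assumptions for illustration:

```python
import json

def handle_event(event, data, state):
    """Fold one SSE event into an accumulating `state` dict.

    Event types follow the stream above: sources, images, answer,
    usage, done, error. Payload shapes here are illustrative.
    """
    if event == "sources":
        state["sources"] = json.loads(data)
    elif event == "images":
        state["images"] = json.loads(data)
    elif event == "answer":
        # Answer text arrives in incremental chunks; concatenate them.
        state["answer"] = state.get("answer", "") + data
    elif event == "usage":
        state["usage"] = json.loads(data)
    elif event == "error":
        raise RuntimeError(f"stream error: {data}")
    elif event == "done":
        state["done"] = True
    return state
```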