What is the Answer API?
The Answer API is like having a research assistant that:

- Searches the web for current information about your question
- Reads and understands the most relevant sources
- Generates a complete answer using state-of-the-art AI models
- Cites sources so you can verify the information
Two Ways to Get Answers
Get Complete Answer

POST /api/v2/answer

Get the full answer in one response. Best when you need the complete result before proceeding.

✅ Simpler to use
✅ Get everything at once
✅ Supports JSON output

Stream Answer Live

POST /api/v2/answer_stream

Receive the answer word-by-word as it’s generated. Best for chat interfaces and real-time user feedback.

✅ Lower perceived latency
✅ Progressive rendering
✅ Better user experience

Before You Start
Get Your API Key
Sign up and create your free account at llmlayer.dev
Authentication
All requests require your API key in the Authorization header:
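For example (the Bearer scheme shown here is the common convention for API keys; confirm the exact format in your dashboard):

```
Authorization: Bearer YOUR_API_KEY
```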
Your First Answer (5-Minute Start)
Let’s make your first API call. Choose your language.

That’s it! You just made your first AI-powered search. The API:

- Searched the web for information about France’s capital
- Used GPT-4o-mini to generate a clear answer
- Returned the result in ~1-2 seconds
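As a concrete sketch, here is that first call using only Python’s standard library. The base URL is a placeholder (use the real host from your dashboard), and the shape of the parsed response should be checked against the Response Format section below.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.llmlayer.example"  # placeholder host; use the one from your dashboard

def build_request(query: str, model: str = "openai/gpt-4o-mini") -> urllib.request.Request:
    """Assemble the POST /api/v2/answer request without sending it."""
    body = json.dumps({"query": query, "model": model}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/v2/answer",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

def ask(query: str, model: str = "openai/gpt-4o-mini") -> dict:
    """Send the request and return the parsed JSON answer."""
    with urllib.request.urlopen(build_request(query, model), timeout=30) as resp:
        return json.loads(resp.read())

# answer = ask("What is the capital of France?")
```

Splitting request construction from sending keeps the network call in one place, which makes it easy to add retries later.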
How Pricing Works
Simple & Transparent Pricing
Total Cost = LLMLayer Fee + Model Usage (if not using your own key)
Cost Breakdown
- LLMLayer fee: $0.004 (1 query)
- Model input: 500 tokens × $0.15/M = $0.000075
- Model output: 200 tokens × $0.60/M = $0.00012
- Total: ~$0.004195 (less than half a cent!)
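That arithmetic can be checked directly, using the gpt-4o-mini rates from the pricing table below:

```python
# Worked cost estimate for one gpt-4o-mini request.
LLMLAYER_FEE = 0.004             # per search query
INPUT_RATE = 0.15 / 1_000_000    # dollars per input token (gpt-4o-mini)
OUTPUT_RATE = 0.60 / 1_000_000   # dollars per output token (gpt-4o-mini)

input_cost = 500 * INPUT_RATE    # $0.000075
output_cost = 200 * OUTPUT_RATE  # $0.00012
total = LLMLAYER_FEE + input_cost + output_cost
print(f"${total:.6f}")           # $0.004195
```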
Monitor Your Costs
Every response includes cost information.

Choose Your Model
Zero Markup Policy: We pass through provider pricing at cost. You only pay the official provider rate + our small infrastructure fee.
Quick Model Guide
For beginners, we recommend:

- 🎯 General use: openai/gpt-4o-mini - Fast, cheap, great quality
- 🧠 Complex reasoning: openai/gpt-4.1-mini - Better at hard questions
- 🏆 Premium quality: anthropic/claude-sonnet-4 - Best for creative/nuanced tasks
Model Pricing (per 1M tokens)
OpenAI Models
| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| openai/gpt-5.1 | $1.25 | $10.00 | Advanced complex reasoning & analysis |
| openai/gpt-5 | $1.25 | $10.00 | Complex reasoning & analysis |
| openai/gpt-5-mini | $0.25 | $2.00 | Cost-effective reasoning |
| openai/gpt-5-nano | $0.05 | $0.40 | Balanced performance |
| openai/gpt-4.1 | $2.00 | $8.00 | Advanced tasks |
| openai/gpt-4.1-mini | $0.40 | $1.60 | Efficient advanced tasks |
| openai/gpt-4o | $2.50 | $10.00 | Multimodal & complex queries |
| openai/gpt-4o-mini | $0.15 | $0.60 | Fast, affordable searches |
Groq Models
| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| groq/openai-gpt-oss-120b | $0.15 | $0.75 | High-performance search |
| groq/openai-gpt-oss-20b | $0.10 | $0.50 | Budget-friendly quality |
| groq/kimi-k2 | $1.00 | $3.00 | High-performance search |
| groq/llama-3.3-70b-versatile | $0.59 | $0.79 | Versatile applications |
| groq/llama-4-maverick-17b-128e-instruct | $0.20 | $0.60 | Fast, efficient searches |
Anthropic Models
| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| anthropic/claude-sonnet-4 | $3.00 | $15.00 | Creative writing & nuanced tasks |
| anthropic/claude-sonnet-4-5 | $3.00 | $15.00 | Creative writing & nuanced tasks |
Model Not Working? If you get error_code: "invalid_model", the model may not be available in your region or plan. Try openai/gpt-4o-mini instead.

Complete Answer API
Get the full answer in a single response. Endpoint: POST /api/v2/answer
Required Parameters
Your question or instruction. Be specific for better results.

Examples:
- ✅ “What are the latest developments in quantum computing in 2025?”
- ✅ “Compare the populations of Tokyo, New York, and London”
- ❌ “quantum” (too vague)

The AI model to use. Format: provider/model-name

Recommended:
- openai/gpt-4o-mini - Fast and cheap (start here!)
- openai/gpt-4.1-mini - Better reasoning
- anthropic/claude-sonnet-4 - Creative tasks
Optional Parameters (Commonly Used)
Include the sources used to generate the answer. Use when: You need citations or want users to verify information.

Maximum length of the answer (in tokens). Roughly: 1 token ≈ 0.75 words.

Guide:
- 500 - Short answer (1-2 paragraphs)
- 1500 - Medium answer (default, 3-5 paragraphs)
- 3000 - Long answer (multiple sections)

Controls randomness. Range: 0.0 to 2.0.

Guide:
- 0.0-0.3 - Factual, deterministic (news, data)
- 0.5-0.8 - Balanced (default, general use)
- 0.9-2.0 - Creative, varied (stories, brainstorming)
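A request body combining these commonly used options might look like this (parameter names follow the snake_case forms used elsewhere on this page; confirm them against your SDK):

```python
import json

payload = {
    "query": "What are the latest developments in quantum computing in 2025?",
    "model": "openai/gpt-4o-mini",
    "return_sources": True,   # get citations back
    "max_tokens": 1500,       # medium-length answer
    "temperature": 0.3,       # stay factual and deterministic
}
print(json.dumps(payload, indent=2))
```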
Optional Parameters (Advanced)
Use your own OpenAI/Anthropic/Groq/DeepSeek API key instead of LLMLayer’s. Format: sk-... (OpenAI), sk-ant-... (Anthropic), etc.

Benefits:
- You’re billed directly by the provider
- model_cost will be null in responses
- Good for high-volume usage

Override the default instructions given to the AI. Use this to customize tone, format, or expertise.
Output format: markdown | html | json
- markdown - Clean text with formatting (default)
- html - Styled HTML content
- json - Structured data (requires json_schema)
Required when answer_type="json". Defines the structure of the JSON response. The SDK accepts an object and converts it to a string automatically. With cURL, send it as a JSON string.
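For illustration, here is a hypothetical schema requesting structured facts about a city, serialized to a string as a raw HTTP caller would send it:

```python
import json

# Hypothetical schema -- adapt the fields to your own use case.
json_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "country": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "country"],
}

payload = {
    "query": "Give key facts about Tokyo",
    "model": "openai/gpt-4o-mini",
    "answer_type": "json",
    "json_schema": json.dumps(json_schema),  # string form, as required over raw HTTP
}
```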
Type of search: general | news
- general - Regular web search
- news - Recent news articles only

Filter results by recency: anytime | hour | day | week | month | year

Add inline citation markers like [1], [2] in the answer text.

Include relevant images in the response.

Country code for localized search results. Common: us, uk, ca, de, fr, jp, au, in

Language for the answer. Use auto to detect from the query, or specify: en, es, fr, de, ja, etc.

Include or exclude specific domains. Use the - prefix to exclude.

Number of search queries to generate (1-5). More queries = broader coverage but higher cost. Cost: Each additional query adds $0.004 to the LLMLayer fee.

Amount of search content to feed the AI: low | medium | high
- low - Quick answers, saves tokens
- medium - Default balance
- high - Maximum context, best quality
Response Format
The generated answer. String for markdown/HTML, object for JSON output.

Source documents (only when return_sources=true).

Image results (only when return_images=true).

Cost of AI model usage. null when using your own provider key.

LLMLayer infrastructure fee ($0.004 per query).

Number of tokens sent to the model.

Number of tokens generated by the model.

Time taken to generate the answer (seconds).
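To make the fields concrete, here is a response-shaped dictionary with illustrative values and the total-cost arithmetic. The exact field names (in particular the answer field) are assumptions; check them against a real response.

```python
# Illustrative response; values and some field names are assumptions.
response = {
    "answer": "Paris is the capital of France...",
    "sources": [{"url": "https://example.com", "title": "Example"}],
    "model_cost": 0.000195,   # None when using your own provider_key
    "llmlayer_cost": 0.004,
    "input_tokens": 500,
    "output_tokens": 200,
    "response_time": 1.4,     # seconds
}

# Total spend for this request; model_cost may be None with provider_key.
total_cost = response["llmlayer_cost"] + (response["model_cost"] or 0.0)
print(f"${total_cost:.6f}")  # $0.004195
```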
Usage Examples
Example 1: Basic Question
Simple factual query with minimal configuration.

Example 2: Get Recent News
Search for current events with time filters.

Example 3: Extract Structured Data (JSON)
Get answers in a structured format for easy parsing.

Example 4: Filter by Domain
Control which websites are used as sources.

Example 5: Use Your Own API Key
For high-volume usage, use your own provider API key to get billed directly.

Why use your own key?
- You already have credits with the provider
- You want consolidated billing from one provider
- You’re doing high-volume requests
- You want to track usage in your provider dashboard
Streaming Answer API
Stream the answer word-by-word as it’s generated. Perfect for chat interfaces! Endpoint: POST /api/v2/answer_stream
How Streaming Works
Instead of waiting for the complete answer, you get events in real time:

- sources - List of sources found
- images - Relevant images (if requested)
- answer - Text chunks as they’re generated (the main content!)
- usage - Token and cost information
- done - Completion signal with timing
Event Format
Each event is a JSON object with a type field:
Basic Streaming Example
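In lieu of a full client, here is a minimal consumer sketch. It assumes each event arrives as one JSON object with a type field and that answer events carry their text in a content key (the chunk key name is an assumption):

```python
import json

def consume(event_lines):
    """Fold streamed JSON events into the final answer text."""
    chunks, types_seen = [], []
    for line in event_lines:
        event = json.loads(line)
        types_seen.append(event["type"])
        if event["type"] == "answer":
            chunks.append(event["content"])  # assumed chunk key
        elif event["type"] == "done":
            break
    return "".join(chunks), types_seen

# Simulated stream, matching the event order described above.
sample = [
    '{"type": "sources", "sources": []}',
    '{"type": "answer", "content": "Paris is "}',
    '{"type": "answer", "content": "the capital of France."}',
    '{"type": "done"}',
]
text, types_seen = consume(sample)
print(text)  # Paris is the capital of France.
```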
Build a Chat UI
Here’s how to build a real-time chat interface.

Error Handling
The API returns clear, structured errors. Here’s how to handle them.

Error Format
All errors use this structure:

Common Errors
401 - Authentication Errors
Missing API Key
Fix: Add your API key to the Authorization header.

Provider Key Invalid
Fix: Verify your provider_key is correct.
400 - Validation Errors
Missing Query
Fix: Provide a query string.

Invalid Model
Fix: Use a valid model from the pricing table.

Missing JSON Schema
Fix: Provide json_schema when answer_type is "json".
429 - Rate Limits
Rate Limit Exceeded
Fix: Wait a moment and retry with exponential backoff.
500 - Server Errors
Search Failed
Fix: Retry the request. If it persists, contact support.

Provider Error
Fix: The provider is having issues. Try a different model or wait and retry.
Robust Error Handling Code
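As a sketch of the pattern the sections above describe: classify the error code, then retry retryable codes with exponential backoff. The specific error-code strings and the ApiError wrapper are assumptions for illustration, not the API’s exact values.

```python
import time

class ApiError(Exception):
    """Assumed wrapper around the API's structured error response."""
    def __init__(self, code, message=""):
        super().__init__(message or code)
        self.code = code

# Codes worth retrying, inferred from the error sections above (assumed names).
RETRYABLE = {"rate_limited", "search_failed", "provider_error"}

def call_with_retry(fn, max_attempts=4, base_delay=1.0):
    """Run fn(), retrying retryable errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ApiError as err:
            if err.code not in RETRYABLE or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Demo: a call that fails twice with a rate limit, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ApiError("rate_limited")
    return "ok"

result = call_with_retry(flaky, base_delay=0)  # base_delay=0 keeps the demo instant
print(result, attempts["n"])  # ok 3
```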
Best Practices
💰 Reduce Cost
Choose the right model
- Start with gpt-4o-mini - it’s fast and cheap
- Only upgrade to premium models when needed
- Use provider_key for high volume

Tune request parameters
- Keep max_queries=1 for most tasks
- Set max_tokens to your expected length
- Only enable return_images when needed
- Use searchContextSize: 'low' for simple questions
⚡ Improve Speed
Use streaming
- /answer_stream feels 3-5x faster for users
- Show progress instead of loading spinners

Keep requests lean
- Use searchContextSize: 'low' when possible
- Filter domains to focus search
- Keep queries specific and focused
✨ Enhance Quality
Better search
- Use maxQueries: 2-3 for research tasks
- Use searchContextSize: 'high' for complex topics
- Add domain filters for authoritative sources
- Use dateFilter for time-sensitive info

Better prompts
- Provide clear, specific queries
- Use custom systemPrompt for specialized needs
- Set appropriate temperature (low=factual, high=creative)
- Enable citations for verifiable content
🛡️ Build Reliable Apps
Error handling
- Always catch and handle errors
- Use exponential backoff for rate limits
- Have fallback models ready

Cost monitoring
- Track model_cost and llmlayer_cost
- Log input_tokens and output_tokens
- Set up cost alerts

Security
- Never expose API keys in the frontend
- Always call from your backend
- Use environment variables
Quick Tips
Frequently Asked Questions
How is this different from using ChatGPT API directly?
The Answer API combines web search + AI in one call. Benefits:
- Current information - Searches the web for latest data
- Verified sources - Returns citations you can check
- One API call - No need to search, scrape, and then call AI separately
- Optimized - We handle search ranking, content extraction, and context building
Without it, you’d need to:

- Search the web yourself
- Scrape and clean content
- Build the prompt
- Call the API
- Parse the response
Can I use my own OpenAI/Anthropic key?
Yes! Use the provider_key parameter. When you do this:

- You’re billed directly by the provider
- model_cost will be null in responses
- You only pay LLMLayer’s infrastructure fee (~$0.004)
- Good for high-volume usage
What's the difference between max_queries and max_tokens?
- max_queries: How many search queries to generate (1-5)
  - 1 query = $0.004, searches with one query
  - 3 queries = $0.012, searches with three different queries for broader coverage
- max_tokens: Maximum length of the AI’s answer
  - 500 = short answer (~375 words)
  - 1500 = medium answer (~1125 words)
  - 3000 = long answer (~2250 words)
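The word estimates above follow directly from the 1 token ≈ 0.75 words rule of thumb:

```python
def estimated_words(max_tokens: int) -> int:
    """Rough answer length: 1 token is about 0.75 words."""
    return int(max_tokens * 0.75)

for tokens in (500, 1500, 3000):
    print(tokens, "tokens ->", estimated_words(tokens), "words")
```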
Why use streaming vs regular endpoint?
Use streaming (/answer_stream) when:
- Building a chat interface
- Users need to see progress
- Answers are long (better UX)

Use the regular endpoint (/answer) when:
- You need the complete answer before proceeding
- You need JSON structured output
- You’re processing in the background
- You want the simpler implementation
How do I control which websites are used?
Use domainFilter:

What happens if the model I choose is unavailable?
You’ll get an error_code: "invalid_model" error. The model may not be available in your region or plan; fall back to a widely available model such as openai/gpt-4o-mini.

How can I estimate my costs?
Use this formula:

Total ≈ LLMLayer fee + (input tokens × input rate) + (output tokens × output rate)

Example with gpt-4o-mini:
- LLMLayer: $0.004 (1 query)
- Input: 500 tokens × $0.15/M = $0.000075
- Output: 200 tokens × $0.60/M = $0.00012
- Total: ~$0.004195 per request
Can I get both JSON output and streaming?
No, streaming does NOT support JSON structured output. Here’s why:

- JSON needs the complete response to validate against the schema
- Streaming sends partial chunks
- These are incompatible

Use the /answer endpoint for JSON output.

Next Steps
Web Search API
Search without AI processing
Scraper API
Extract content from any URL
Supported Models
Full model list with pricing
Need Help?
Found a bug or have a feature request? We’d love to hear from you! Join our Discord or email us at [email protected]
