
What is the Answer API?

The Answer API is like having a research assistant that:
  1. Searches the web for current information about your question
  2. Reads and understands the most relevant sources
  3. Generates a complete answer using state-of-the-art AI models
  4. Cites sources so you can verify the information
Perfect for: chatbots, research tools, content generation, data extraction, and any application that needs accurate, up-to-date information.

Two Ways to Get Answers

You can get answers two ways: the Complete Answer API (POST /api/v2/answer) returns the full answer in a single response, while the Streaming Answer API (POST /api/v2/answer_stream) streams it word-by-word. Both are covered in detail below.

Before You Start

Get Your API Key

  1. Sign up - Create your free account at llmlayer.dev
  2. Copy your API key - Find it in the dashboard
  3. Keep it secure - Never expose your API key in client-side code! Always call the API from your server.

Authentication

All requests require your API key in the Authorization header:
Authorization: Bearer YOUR_LLMLAYER_API_KEY
Security Alert: Missing or invalid API keys return 401 Unauthorized. Always use environment variables to store your key, never hard-code it.

Your First Answer (5-Minute Start)

Let’s make your first API call. The example below uses the TypeScript SDK:
import { LLMLayerClient } from 'llmlayer';

// 1. Create a client with your API key
const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY
});

// 2. Ask a question
const response = await client.answer({
  query: 'What is the capital of France?',
  model: 'openai/gpt-4o-mini'  // Fast and affordable
});

// 3. Get your answer
console.log(response.answer);
// Output: "The capital of France is Paris..."

// 4. Check the cost
console.log(`Cost: $${response.llmlayer_cost}`);
// Output: "Cost: $0.004"
That’s it! You just made your first AI-powered search. The API:
  • Searched the web for information about France’s capital
  • Used GPT-4o-mini to generate a clear answer
  • Returned the result in ~1-2 seconds

How Pricing Works

Simple & Transparent Pricing: Total Cost = LLMLayer Fee + Model Usage (if not using your own key)

Cost Breakdown

Total Cost = ($0.004 × max_queries)
           + (Input Tokens × Model Input Price)
           + (Output Tokens × Model Output Price)
           + ($0.001 if return_images=true)
Example: Basic question with GPT-4o-mini
  • LLMLayer fee: $0.004 (1 query)
  • Model input: 500 tokens × $0.15/1M = $0.000075
  • Model output: 200 tokens × $0.60/1M = $0.00012
  • Total: ~$0.004195 (less than half a cent!)
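The breakdown above can be sketched as a small calculator (fees and per-token prices as quoted on this page; real token counts come back in the response):

```typescript
// Sketch of the cost formula above; prices are per 1M tokens.
interface CostInputs {
  maxQueries: number;
  inputTokens: number;
  outputTokens: number;
  inputPricePerM: number;   // e.g. 0.15 for gpt-4o-mini input
  outputPricePerM: number;  // e.g. 0.60 for gpt-4o-mini output
  returnImages?: boolean;
}

function estimateCost(c: CostInputs): number {
  return (
    0.004 * c.maxQueries +                             // LLMLayer fee per query
    (c.inputTokens * c.inputPricePerM) / 1_000_000 +   // model input cost
    (c.outputTokens * c.outputPricePerM) / 1_000_000 + // model output cost
    (c.returnImages ? 0.001 : 0)                       // image surcharge
  );
}

// The worked example: 500 input + 200 output tokens on gpt-4o-mini
const total = estimateCost({
  maxQueries: 1,
  inputTokens: 500,
  outputTokens: 200,
  inputPricePerM: 0.15,
  outputPricePerM: 0.6,
});
console.log(total.toFixed(6)); // "0.004195"
```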

Monitor Your Costs

Every response includes cost information:
const response = await client.answer({
  query: 'Your question here',
  model: 'openai/gpt-4o-mini'
});

console.log('LLMLayer infrastructure:', response.llmlayer_cost);  // $0.004
console.log('Model usage:', response.model_cost);                 // ~$0.0002
console.log('Input tokens:', response.input_tokens);              // 500
console.log('Output tokens:', response.output_tokens);            // 200

const total = response.model_cost + response.llmlayer_cost;
console.log(`Total: $${total.toFixed(6)}`);                       // $0.004195

Choose Your Model

Zero Markup Policy: We pass through provider pricing at cost. You only pay the official provider rate + our small infrastructure fee.

Quick Model Guide

For beginners, we recommend:
  • 🎯 General use: openai/gpt-4o-mini - Fast, cheap, great quality
  • 🧠 Complex reasoning: openai/gpt-4.1-mini - Better at hard questions
  • 🏆 Premium quality: anthropic/claude-sonnet-4 - Best for creative/nuanced tasks

Model Pricing (per 1M tokens)

OpenAI Models

| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| openai/gpt-5.1 | $1.25 | $10.00 | Most advanced reasoning & analysis |
| openai/gpt-5 | $1.25 | $10.00 | Complex reasoning & analysis |
| openai/gpt-5-mini | $0.25 | $2.00 | Cost-effective reasoning |
| openai/gpt-5-nano | $0.05 | $0.40 | Balanced performance |
| openai/gpt-4.1 | $2.00 | $8.00 | Advanced tasks |
| openai/gpt-4.1-mini | $0.40 | $1.60 | Efficient advanced tasks |
| openai/gpt-4o | $2.50 | $10.00 | Multimodal & complex queries |
| openai/gpt-4o-mini | $0.15 | $0.60 | Fast, affordable searches |

Groq Models

| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| groq/openai-gpt-oss-120b | $0.15 | $0.75 | High-performance search |
| groq/openai-gpt-oss-20b | $0.10 | $0.50 | Budget-friendly quality |
| groq/kimi-k2 | $1.00 | $3.00 | High-performance search |
| groq/llama-3.3-70b-versatile | $0.59 | $0.79 | Versatile applications |
| groq/llama-4-maverick-17b-128e-instruct | $0.20 | $0.60 | Fast, efficient searches |

Anthropic Models

| Model | Input ($/M tokens) | Output ($/M tokens) | Best For |
|---|---|---|---|
| anthropic/claude-sonnet-4 | $3.00 | $15.00 | Highly creative writing; strong general intelligence |
| anthropic/claude-sonnet-4-5 | $3.00 | $15.00 | Highly creative writing; strong general intelligence |
Model Not Working? If you get error_code: "invalid_model", the model may not be available in your region or plan. Try openai/gpt-4o-mini instead.

Complete Answer API

Get the full answer in a single response. Endpoint: POST /api/v2/answer

Required Parameters

query
string
required
Your question or instruction. Be specific for better results. Examples:
  • ✅ “What are the latest developments in quantum computing in 2025?”
  • ✅ “Compare the populations of Tokyo, New York, and London”
  • ❌ “quantum” (too vague)
model
string
required
The AI model to use. Format: provider/model-name. Recommended:
  • openai/gpt-4o-mini - Fast and cheap (start here!)
  • openai/gpt-4.1-mini - Better reasoning
  • anthropic/claude-sonnet-4 - Creative tasks
Not sure which model? Start with openai/gpt-4o-mini - it’s fast, affordable, and handles most tasks well.

Optional Parameters (Commonly Used)

return_sources
boolean
default:"false"
Include the sources used to generate the answer. Use when: You need citations or want users to verify information.
returnSources: true  // Returns array of source objects
max_tokens
integer
default:"1500"
Maximum length of the answer (in tokens). Roughly: 1 token ≈ 0.75 words. Guide:
  • 500 - Short answer (1-2 paragraphs)
  • 1500 - Medium answer (default, 3-5 paragraphs)
  • 3000 - Long answer (multiple sections)
Higher values = higher cost. Only use what you need!
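A rough budgeting helper based on the 1 token ≈ 0.75 words rule of thumb above (a heuristic only; actual tokenization varies by model and language):

```typescript
// Heuristic conversion between tokens and words: 1 token ≈ 0.75 words.
function approxWords(tokens: number): number {
  return Math.round(tokens * 0.75);
}

// Inverse: how many tokens to budget for a target word count
function tokensForWords(words: number): number {
  return Math.ceil(words / 0.75);
}

console.log(approxWords(1500));   // 1125 — the default budget in words
console.log(tokensForWords(300)); // 400 — tokens for a ~300-word answer
```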
temperature
float
default:"0.7"
Controls randomness. Range: 0.0 to 2.0. Guide:
  • 0.0-0.3 - Factual, deterministic (news, data)
  • 0.5-0.8 - Balanced (default, general use)
  • 0.9-2.0 - Creative, varied (stories, brainstorming)
temperature: 0.3  // For factual Q&A
temperature: 0.8  // For creative writing

Optional Parameters (Advanced)

provider_key
string
Use your own OpenAI/Anthropic/Groq/DeepSeek API key instead of LLMLayer’s. Benefits:
  • You’re billed directly by the provider
  • model_cost will be null in responses
  • Good for high-volume usage
Format: sk-... (OpenAI), sk-ant-... (Anthropic), etc.
providerKey: process.env.OPENAI_API_KEY  // Use your OpenAI key
system_prompt
string
Override the default instructions given to the AI. Use this to customize tone, format, or expertise. Examples:
systemPrompt: "You are a medical expert. Use technical terminology."
systemPrompt: "Explain everything in simple terms for a 10-year-old."
systemPrompt: "Be concise. Maximum 3 sentences per answer."
Only works with markdown and html output. Ignored for JSON responses.
answer_type
string
default:"markdown"
Output format: markdown | html | json
  • markdown - Clean text with formatting (default)
  • html - Styled HTML content
  • json - Structured data (requires json_schema)
answerType: 'markdown'  // Default
answerType: 'json'      // Must provide jsonSchema!
json_schema
object | string
Required when answer_type="json". Defines the structure of the JSON response. Example:
jsonSchema: {
  type: "object",
  properties: {
    summary: { type: "string" },
    key_points: { type: "array", items: { type: "string" } }
  },
  required: ["summary", "key_points"]
}
The SDK accepts an object and converts it to a string automatically. With cURL, send as a JSON string.
search_type
string
default:"general"
Type of search: general | news
  • general - Regular web search
  • news - Recent news articles only
searchType: 'news'  // For current events
date_filter
string
default:"anytime"
Filter results by recency: anytime | hour | day | week | month | year
dateFilter: 'week'  // Only results from the past week
citations
boolean
default:"false"
Add inline citation markers like [1], [2] in the answer text.
citations: true      // "Paris is the capital of France [1]..."
returnSources: true  // Get the actual source list
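With both options on, the [n] markers can be joined to the source list client-side. A hypothetical helper (not part of the SDK, and assuming each marker number matches a source's position field — an assumption, not stated explicitly here) might look like:

```typescript
// Minimal shape of a source object, per the Response Format section
interface Source { title: string; link: string; position: number }

// Hypothetical helper: turn "[1]"-style markers into markdown links
function linkCitations(answer: string, sources: Source[]): string {
  return answer.replace(/\[(\d+)\]/g, (match, n) => {
    const src = sources.find((s) => s.position === Number(n));
    // Leave markers without a matching source untouched
    return src ? `[${match}](${src.link})` : match;
  });
}

const sources: Source[] = [
  { title: 'Paris - Wikipedia', link: 'https://en.wikipedia.org/wiki/Paris', position: 1 },
];
console.log(linkCitations('Paris is the capital of France [1].', sources));
// "Paris is the capital of France [[1]](https://en.wikipedia.org/wiki/Paris)."
```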
return_images
boolean
default:"false"
Include relevant images in the response.
Adds $0.001 to your LLMLayer cost per request.
returnImages: true  // Returns array of image objects
location
string
default:"us"
Country code for localized search results. Common: us, uk, ca, de, fr, jp, au, in
location: 'de'  // Search German sources
response_language
string
default:"auto"
Language for the answer. Use auto to detect from query, or specify: en, es, fr, de, ja, etc.
responseLanguage: 'es'  // Answer in Spanish
domain_filter
array
Include or exclude specific domains. Use the - prefix to exclude. Examples:
domainFilter: ['nature.com', 'science.org']           // Only these
domainFilter: ['-reddit.com', '-pinterest.com']       // Exclude these
domainFilter: ['arxiv.org', '-wikipedia.org']         // Mix both
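If you build filter lists dynamically, the prefix convention can be handled with a tiny hypothetical helper (client-side convenience only, not part of the SDK):

```typescript
// Split a domainFilter list into includes and excludes;
// the "-" prefix marks an exclusion, per the docs above.
function splitDomainFilter(filter: string[]): { include: string[]; exclude: string[] } {
  const include: string[] = [];
  const exclude: string[] = [];
  for (const d of filter) {
    if (d.startsWith('-')) exclude.push(d.slice(1));
    else include.push(d);
  }
  return { include, exclude };
}

const { include, exclude } = splitDomainFilter(['arxiv.org', '-wikipedia.org']);
// include = ['arxiv.org'], exclude = ['wikipedia.org']
```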
max_queries
integer
default:"1"
Number of search queries to generate (1-5). More queries = broader coverage but higher cost. Cost: Each additional query adds $0.004 to the LLMLayer fee.
maxQueries: 1  // Default - most cases
maxQueries: 3  // Research tasks, complex questions
Start with 1. Only increase for complex research questions where you need multiple perspectives.
search_context_size
string
default:"medium"
Amount of search content to feed the AI: low | medium | high
  • low - Quick answers, saves tokens
  • medium - Default balance
  • high - Maximum context, best quality
searchContextSize: 'high'  // For detailed research

Response Format

answer
string | object
The generated answer. String for markdown/HTML, Object for JSON output.
"answer": "Paris is the capital of France..."
sources
array
Source documents (only when return_sources=true).
"sources": [
  {
    "title": "Paris - Wikipedia",
    "link": "https://en.wikipedia.org/wiki/Paris",
    "snippet": "Paris is the capital and most populous city of France...",
    "position": 1,
    "favicon": "https://wikipedia.org/favicon.ico",
    "date": "2025-01-15"
  }
]
images
array
Image results (only when return_images=true).
"images": [
  {
    "title": "Eiffel Tower",
    "image_url": "https://example.com/image.jpg",
    "thumbnail_url": "https://example.com/thumb.jpg",
    "source": "example.com"
  }
]
model_cost
float
Cost of AI model usage. null when using your own provider key.
"model_cost": 0.000195
llmlayer_cost
float
LLMLayer infrastructure fee ($0.004 per query).
"llmlayer_cost": 0.004
input_tokens
integer
Number of tokens sent to the model.
"input_tokens": 500
output_tokens
integer
Number of tokens generated by the model.
"output_tokens": 200
response_time
string
Time taken to generate the answer (seconds).
"response_time": "1.23"
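Pulling the fields above together, here is a sketch of the response shape as a TypeScript interface (illustrative typings based on this page; the SDK’s own types may differ):

```typescript
// Field names follow the Response Format section above
interface AnswerResponse {
  answer: string | object;          // string for markdown/html, object for JSON
  sources?: Array<{
    title: string;
    link: string;
    snippet: string;
    position: number;
    favicon?: string;
    date?: string;
  }>;
  images?: Array<{
    title: string;
    image_url: string;
    thumbnail_url: string;
    source: string;
  }>;
  model_cost: number | null;        // null when using your own provider key
  llmlayer_cost: number;            // $0.004 per query
  input_tokens: number;
  output_tokens: number;
  response_time: string;            // seconds, as a string
}

// Example: total spend for a response (0 model cost if you bring your own key)
function totalCost(r: AnswerResponse): number {
  return (r.model_cost ?? 0) + r.llmlayer_cost;
}

const sample: AnswerResponse = {
  answer: 'Paris is the capital of France...',
  model_cost: 0.000195,
  llmlayer_cost: 0.004,
  input_tokens: 500,
  output_tokens: 200,
  response_time: '1.23',
};
console.log(totalCost(sample).toFixed(6)); // "0.004195"
```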

Usage Examples

Example 1: Basic Question

Simple factual query with minimal configuration.
const response = await client.answer({
  query: 'What is the population of Tokyo?',
  model: 'openai/gpt-4o-mini',
  returnSources: true
});

console.log(response.answer);
console.log('\nSources:');
response.sources?.forEach(source => {
  console.log(`- ${source.title}`);
  console.log(`  ${source.link}`);
});
console.log(`\nTotal cost: $${response.model_cost + response.llmlayer_cost}`);
Output:
Tokyo has a population of approximately 14 million people in the prefecture, 
or about 37-38 million in the greater Tokyo metropolitan area, making it the 
world's most populous metropolitan area.

Sources:
- Tokyo Population 2025
  https://worldpopulationreview.com/world-cities/tokyo-population

Total cost: $0.0042

Example 2: Get Recent News

Search for current events with time filters.
const response = await client.answer({
  query: 'Latest developments in renewable energy',
  model: 'openai/gpt-4.1-mini',     // Better for complex topics
  searchType: 'news',               // Search news articles
  dateFilter: 'week',               // Past week only
  citations: true,                  // Add citation markers
  returnSources: true,              // Get source list
  maxQueries: 2,                    // Broader coverage ($0.008 total)
  temperature: 0.5                  // Balanced
});

console.log(response.answer);
console.log('\nSources:', response.sources?.length);
console.log('LLMLayer cost:', response.llmlayer_cost);  // $0.008 (2 queries)

Example 3: Extract Structured Data (JSON)

Get answers in a structured format for easy parsing.
// Define what data structure you want
const schema = {
  type: 'object',
  properties: {
    summary: {
      type: 'string',
      description: 'Brief 2-sentence summary'
    },
    key_findings: {
      type: 'array',
      items: { type: 'string' },
      description: 'List of 3-5 key points'
    },
    companies_mentioned: {
      type: 'array',
      items: { type: 'string' }
    }
  },
  required: ['summary', 'key_findings']
};

const response = await client.answer({
  query: 'What are the major AI companies and their latest products?',
  model: 'openai/gpt-4o',
  answerType: 'json',           // JSON output
  jsonSchema: schema,           // Define structure
  maxQueries: 2,                // Better coverage
  searchContextSize: 'high'     // More information
});

// Parse the JSON response
const data = typeof response.answer === 'string'
  ? JSON.parse(response.answer)
  : response.answer;

console.log('Summary:', data.summary);
console.log('\nKey Findings:');
data.key_findings.forEach((finding, i) => {
  console.log(`${i + 1}. ${finding}`);
});
console.log('\nCompanies:', data.companies_mentioned.join(', '));
Output:
{
  "summary": "Major AI companies including OpenAI, Anthropic, Google, and Microsoft have released powerful new models in 2024-2025. These companies are competing on model capabilities, pricing, and specialized use cases.",
  "key_findings": [
    "OpenAI released GPT-5 and o3 models with improved reasoning",
    "Anthropic's Claude Sonnet 4 excels at creative writing tasks",
    "Google's Gemini 2.0 focuses on multimodal capabilities",
    "Microsoft integrated AI across Azure and Office products",
    "Competition is driving prices down while quality improves"
  ],
  "companies_mentioned": ["OpenAI", "Anthropic", "Google", "Microsoft", "DeepSeek"]
}

Example 4: Filter by Domain

Control which websites are used as sources.
// Research medical information from trusted sources only
const response = await client.answer({
  query: 'What are the latest treatments for type 2 diabetes?',
  model: 'anthropic/claude-sonnet-4',  // Best for nuanced medical info
  domainFilter: [
    'pubmed.gov',           // Include: Medical database
    'nih.gov',              // Include: National health institute
    'mayoclinic.org',       // Include: Trusted medical source
    '-reddit.com',          // Exclude: User opinions
    '-pinterest.com'        // Exclude: Non-medical content
  ],
  searchContextSize: 'high',  // Maximum context
  temperature: 0.3,           // Factual
  returnSources: true
});

console.log(response.answer);
console.log('\nSources used:');
response.sources?.forEach(source => {
  const domain = new URL(source.link).hostname;
  console.log(`- ${source.title} (${domain})`);
});

Example 5: Use Your Own API Key

For high-volume usage, use your own provider API key to get billed directly.
const response = await client.answer({
  query: 'Explain the theory of relativity',
  model: 'openai/gpt-4o',
  providerKey: process.env.OPENAI_API_KEY,  // Use your OpenAI key
  maxTokens: 2000,
  temperature: 0.7
});

// When using your own key:
console.log('LLMLayer cost:', response.llmlayer_cost);  // $0.004 only
console.log('Model cost:', response.model_cost);        // null (you're billed by OpenAI)
console.log('Total from LLMLayer:', response.llmlayer_cost);
Why use your own key?
  • You already have credits with the provider
  • You want consolidated billing from one provider
  • You’re doing high-volume requests
  • You want to track usage in your provider dashboard

Streaming Answer API

Stream the answer word-by-word as it’s generated. Perfect for chat interfaces! Endpoint: POST /api/v2/answer_stream
Important: Streaming does NOT support answer_type="json". For structured output, use the regular /answer endpoint.

How Streaming Works

Instead of waiting for the complete answer, you get events in real-time:
  1. sources - List of sources found
  2. images - Relevant images (if requested)
  3. answer - Text chunks as they’re generated (the main content!)
  4. usage - Token and cost information
  5. done - Completion signal with timing

Event Format

Each event is a JSON object with a type field:
// Sources found
{"type": "sources", "data": [{...source objects...}]}

// Images found
{"type": "images", "data": [{...image objects...}]}

// Content being generated (this is the main one!)
{"type": "answer", "content": "The capital"}
{"type": "answer", "content": " of France"}
{"type": "answer", "content": " is Paris"}

// Final metrics
{"type": "usage", "input_tokens": 500, "output_tokens": 200, "model_cost": 0.0002, "llmlayer_cost": 0.004}

// Stream complete
{"type": "done", "response_time": "2.45"}
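The events above can be sketched as a TypeScript discriminated union (illustrative typings based on this page, not the SDK’s own):

```typescript
// One variant per event type described above
type StreamEvent =
  | { type: 'sources'; data: object[] }
  | { type: 'images'; data: object[] }
  | { type: 'answer'; content: string }
  | { type: 'usage'; input_tokens: number; output_tokens: number; model_cost: number | null; llmlayer_cost: number }
  | { type: 'done'; response_time: string };

// Accumulate the answer text from a sequence of events
function collectAnswer(events: StreamEvent[]): string {
  let out = '';
  for (const e of events) {
    if (e.type === 'answer') out += e.content; // narrowed to the answer variant
  }
  return out;
}

const events: StreamEvent[] = [
  { type: 'answer', content: 'The capital' },
  { type: 'answer', content: ' of France' },
  { type: 'answer', content: ' is Paris' },
  { type: 'done', response_time: '2.45' },
];
console.log(collectAnswer(events)); // "The capital of France is Paris"
```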

Basic Streaming Example

import { LLMLayerClient } from 'llmlayer';

const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY
});

// Start the stream
const stream = client.streamAnswer({
  query: 'Explain how photosynthesis works',
  model: 'openai/gpt-4o-mini',
  returnSources: true,
  temperature: 0.7
});

// Handle each event
for await (const event of stream) {
  switch (event.type) {
    case 'answer':
      // Print each chunk immediately (like ChatGPT!)
      process.stdout.write(event.content || '');
      break;

    case 'sources':
      console.log('\n\nSources:', event.data?.length);
      break;

    case 'usage':
      const total = (event.model_cost || 0) + (event.llmlayer_cost || 0);
      console.log(`\n\nCost: $${total.toFixed(4)}`);
      break;

    case 'done':
      console.log(`\nCompleted in ${event.response_time}s`);
      break;

    case 'error':
      console.error('Error:', event.error);
      break;
  }
}

Build a Chat UI

Here’s how to build a real-time chat interface:
import { useState } from 'react';
import { LLMLayerClient } from 'llmlayer';

function ChatInterface() {
  const [messages, setMessages] = useState([]);
  const [currentAnswer, setCurrentAnswer] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  const client = new LLMLayerClient({
    apiKey: process.env.LLMLAYER_API_KEY
  });

  async function askQuestion(question: string) {
    setIsStreaming(true);
    setCurrentAnswer('');

    // Add user message
    setMessages(prev => [...prev, { role: 'user', content: question }]);

    try {
      const stream = client.streamAnswer({
        query: question,
        model: 'openai/gpt-4o-mini',
        temperature: 0.7
      });

      let fullAnswer = '';

      for await (const event of stream) {
        if (event.type === 'answer') {
          fullAnswer += event.content;
          setCurrentAnswer(fullAnswer);  // Update UI in real-time
        }

        if (event.type === 'done') {
          // Add complete answer to history
          setMessages(prev => [...prev, {
            role: 'assistant',
            content: fullAnswer
          }]);
          setCurrentAnswer('');
        }
      }
    } catch (error) {
      console.error('Stream error:', error);
    } finally {
      setIsStreaming(false);
    }
  }

  return (
    <div>
      {messages.map((msg, i) => (
        <div key={i} className={msg.role}>
          {msg.content}
        </div>
      ))}
      {currentAnswer && (
        <div className="assistant streaming">
          {currentAnswer}
          <span className="cursor"></span>
        </div>
      )}
    </div>
  );
}

Error Handling

The API returns clear, structured errors. Here’s how to handle them:

Error Format

All errors use this structure:
{
  "detail": {
    "error_type": "validation_error",
    "error_code": "missing_query",
    "message": "Query parameter cannot be empty",
    "details": {
      "additional": "context"
    }
  }
}

Common Errors

Missing API Key
{
  "error_code": "missing_llmlayer_api_key",
  "message": "Provide LLMLayer API key via 'Authorization: Bearer <token>'"
}
Fix: Add your API key to the Authorization header.

Provider Key Invalid
{
  "error_code": "openai_auth_error",
  "message": "Authentication failed for openai. Please check your provider API key"
}
Fix: Verify your provider_key is correct.
Missing Query
{
  "error_code": "missing_query",
  "message": "Query parameter cannot be empty"
}
Fix: Provide a query string.

Invalid Model
{
  "error_code": "invalid_model",
  "message": "Model 'gpt-99' is not supported by LLMLAYER"
}
Fix: Use a valid model from the pricing table.

Missing JSON Schema
{
  "error_code": "missing_json_schema",
  "message": "JSON schema is required for JSON response type"
}
Fix: Provide json_schema when answer_type is “json”.
Rate Limit Exceeded
{
  "error_code": "anthropic_rate_limit",
  "message": "Rate limit exceeded for anthropic. Please try again later"
}
Fix: Wait a moment and retry with exponential backoff.
Search Failed
{
  "error_code": "search_context_error",
  "message": "Failed to retrieve search results"
}
Fix: Retry the request. If it persists, contact support.

Provider Error
{
  "error_code": "openai_500",
  "message": "Error from openai provider"
}
Fix: The provider is having issues. Try a different model or wait and retry.

Robust Error Handling Code

import {
  LLMLayerClient,
  AuthenticationError,
  InvalidRequest,
  RateLimitError,
  ProviderError
} from 'llmlayer';

const client = new LLMLayerClient({
  apiKey: process.env.LLMLAYER_API_KEY
});

async function robustAnswer(query: string, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.answer({
        query,
        model: 'openai/gpt-4o-mini'
      });

    } catch (error) {
      // Authentication errors - don't retry
      if (error instanceof AuthenticationError) {
        console.error('❌ Fix your API key:', error.message);
        throw error;
      }

      // Validation errors - don't retry
      if (error instanceof InvalidRequest) {
        console.error('❌ Fix your request:', error.message);
        throw error;
      }

      // Rate limit - wait and retry
      if (error instanceof RateLimitError) {
        const waitTime = Math.pow(2, attempt) * 1000;  // Exponential backoff
        console.log(`⏳ Rate limited. Waiting ${waitTime}ms...`);
        await new Promise(resolve => setTimeout(resolve, waitTime));
        continue;  // Retry
      }

      // Provider error - try different model
      if (error instanceof ProviderError) {
        console.log('⚠️ Provider error, trying fallback model...');
        return await client.answer({
          query,
          model: 'groq/llama-3.3-70b-versatile'  // Fallback
        });
      }

      // Unknown error on last attempt
      if (attempt === maxRetries - 1) {
        console.error('❌ Max retries exceeded:', error);
        throw error;
      }

      // Wait before retry
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }
}

// Usage
try {
  const response = await robustAnswer('What is machine learning?');
  console.log(response.answer);
} catch (error) {
  console.error('Failed after all retries:', error);
}

Best Practices

💰 Reduce Cost

Choose the right model
  • Start with gpt-4o-mini - it’s fast and cheap
  • Only upgrade to premium models when needed
  • Use provider_key for high volume
Optimize parameters
  • Keep max_queries=1 for most tasks
  • Set max_tokens to your expected length
  • Only enable return_images when needed
  • Use searchContextSize: 'low' for simple questions

⚡ Improve Speed

Use streaming
  • /answer_stream feels 3-5x faster for users
  • Show progress instead of loading spinners
Reduce context
  • Use searchContextSize: 'low' when possible
  • Filter domains to focus search
  • Keep queries specific and focused

✨ Enhance Quality

Better search
  • Use maxQueries: 2-3 for research tasks
  • Use searchContextSize: 'high' for complex topics
  • Add domain filters for authoritative sources
  • Use dateFilter for time-sensitive info
Better AI output
  • Provide clear, specific queries
  • Use custom systemPrompt for specialized needs
  • Set appropriate temperature (low=factual, high=creative)
  • Enable citations for verifiable content

🛡️ Build Reliable Apps

Error handling
  • Always catch and handle errors
  • Use exponential backoff for rate limits
  • Have fallback models ready
Monitor usage
  • Track model_cost and llmlayer_cost
  • Log input_tokens and output_tokens
  • Set up cost alerts
Security
  • Never expose API keys in frontend
  • Always call from your backend
  • Use environment variables

Quick Tips

Starting out? Use this config for most tasks:
{
  model: 'openai/gpt-4o-mini',
  temperature: 0.7,
  maxTokens: 1500,
  returnSources: true
}
Need citations? Enable both options:
{
  citations: true,      // Adds [1], [2] markers in text
  returnSources: true   // Returns the actual source list
}
Working with news? Use these settings:
{
  searchType: 'news',
  dateFilter: 'week',   // Recent only
  maxQueries: 2         // Broader coverage
}
Building a chat UI? Always use streaming:
const stream = client.streamAnswer({
  query: userMessage,
  model: 'openai/gpt-4o-mini'
});
// Users see results immediately!
Need JSON output? Define your schema clearly:
{
  answerType: 'json',
  jsonSchema: {
    type: 'object',
    properties: { /* your structure */ },
    required: [/* required fields */]
  }
}

Frequently Asked Questions

How is this different from calling the ChatGPT API directly?

The Answer API combines web search + AI in one call. Benefits:
  1. Current information - Searches the web for latest data
  2. Verified sources - Returns citations you can check
  3. One API call - No need to search, scrape, and then call AI separately
  4. Optimized - We handle search ranking, content extraction, and context building
With ChatGPT API directly, you’d need to:
  • Search the web yourself
  • Scrape and clean content
  • Build the prompt
  • Call the API
  • Parse the response
We do all of that for you!
Can I use my own OpenAI/Anthropic/Groq API key?

Yes! Use the provider_key parameter:
providerKey: process.env.OPENAI_API_KEY
When you do this:
  • You’re billed directly by the provider
  • model_cost will be null in responses
  • You only pay LLMLayer’s infrastructure fee (~$0.004)
  • Good for high-volume usage
What is the difference between max_queries and max_tokens?

  • max_queries: How many search queries to generate (1-5)
    • 1 query = $0.004, searches with one query
    • 3 queries = $0.012, searches with three different queries for broader coverage
  • max_tokens: Maximum length of the AI’s answer
    • 500 = short answer (~375 words)
    • 1500 = medium answer (~1125 words)
    • 3000 = long answer (~2250 words)
When should I use streaming instead of the regular endpoint?

Use streaming (/answer_stream) when:
  • Building a chat interface
  • User needs to see progress
  • Long answers (better UX)
Use regular (/answer) when:
  • Need complete answer before proceeding
  • Need JSON structured output
  • Processing in background
  • Simpler to implement
How do I restrict which websites are used as sources?

Use domainFilter:
// Only use these domains
domainFilter: ['nature.com', 'science.org']

// Exclude these domains (note the - prefix)
domainFilter: ['-reddit.com', '-pinterest.com']

// Mix both
domainFilter: ['arxiv.org', '-wikipedia.org']
What happens if I request an unsupported model?

The API returns a 400 error with:
{
  "error_code": "invalid_model",
  "message": "Model 'X' is not supported"
}
Solution: Use a different model. We recommend:
  • openai/gpt-4o-mini - Works well for almost everyone
  • groq/llama-3.3-70b-versatile - Budget alternative
  • anthropic/claude-sonnet-4 - Premium option
How do I estimate my costs?

Use this formula:
Cost per request = ($0.004 × max_queries)
                 + (input_tokens × model_input_price / 1M)
                 + (output_tokens × model_output_price / 1M)
                 + ($0.001 if return_images=true)
Example with gpt-4o-mini:
  • LLMLayer: $0.004 (1 query)
  • Input: 500 tokens × $0.15/1M = $0.000075
  • Output: 200 tokens × $0.60/1M = $0.00012
  • Total: ~$0.004195 per request
For 1000 requests: ~$4.20
Does streaming support JSON output?

No, streaming does NOT support JSON structured output. Here’s why:
  • JSON needs the complete response to validate against schema
  • Streaming sends partial chunks
  • These are incompatible
Solution: Use regular /answer endpoint for JSON output.


Need Help?

Found a bug or have a feature request? We’d love to hear from you! Join our Discord or email us at [email protected]