What is the Map API?
The Map API is like a website explorer - it discovers all URLs on a domain and returns them with their page titles. Think of it as creating a sitemap or table of contents for any website.
URL Discovery
Find all pages on a website automatically
Site Structure
Understand website hierarchy and organization
Lightweight & Fast: Returns only URLs and titles - no content extraction. This makes it 10x faster than scraping each page individually.
Why Use Map Before Crawling?
1. Map: Discover URLs
Use the Map API to find all pages. Cost: $0.002 (one-time). Speed: 1-5 seconds.
2. Filter URLs
Pick which pages you actually need. Filter by keyword, path, or pattern.
3. Crawl: Get Content
Use the Crawl API on selected pages only. Cost: $0.001 per page. Save money: only crawl what you need!
Pricing (Super Affordable)
Flat Fee Per Site
$0.002 per map request = $2 for 1,000 sites mapped
Cost is the same whether you discover 10 URLs or 5,000 URLs. It’s a flat fee per website, not per URL discovered.
Before You Start
Authentication
All requests require your API key in the Authorization header:
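For example, in Python with the requests library (a minimal sketch: the Bearer token scheme and the api.yourprovider.com host are assumptions for illustration, since this guide only specifies the header name and the /api/v2/map path):

```python
import os
import requests

# Placeholder host; substitute your actual API base URL.
MAP_ENDPOINT = "https://api.yourprovider.com/api/v2/map"

# Assumption: Bearer-style token in the Authorization header.
HEADERS = {
    "Authorization": f"Bearer {os.environ['MAP_API_KEY']}",
    "Content-Type": "application/json",
}
```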
Your First Map (2-Minute Start)
Let’s discover all pages on a website! Run the request below and you’re done: you’ve just discovered every page on a site in seconds, and you know exactly what’s there before crawling or scraping.
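Here is that first request as a Python sketch (same assumptions as above: placeholder host, Bearer-style auth):

```python
import os
import requests

resp = requests.post(
    "https://api.yourprovider.com/api/v2/map",  # placeholder host
    headers={"Authorization": f"Bearer {os.environ['MAP_API_KEY']}"},
    json={"url": "https://example.com"},
)
data = resp.json()
print(f"Discovered {len(data['links'])} URLs")
for link in data["links"][:5]:
    print(link["title"], "->", link["url"])
```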
How It Works
The Map API discovers URLs using multiple strategies:
1. Check Sitemap
First looks for sitemap.xml (fastest method)
2. Crawl Links
If no sitemap, crawls the site following links
3. Extract Titles
Gets page titles without downloading full content
4. Return List
Returns the complete URL list with titles
Smart discovery: The API automatically chooses the best method. If a sitemap exists, it uses that (super fast). Otherwise, it crawls to find pages.
Basic Usage
Map a Website (All Pages)
Discover all pages on a domain:
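A sketch of the request body - just the target URL:

```json
{
  "url": "https://example.com"
}
```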
Advanced Options
Filter URLs by Keyword
Only discover URLs containing specific keywords:
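For instance, to keep only URLs containing "blog" (a sketch of the request body using the search parameter documented below):

```json
{
  "url": "https://example.com",
  "search": "blog"
}
```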
Include Subdomains
Discover URLs across all subdomains (blog.example.com, docs.example.com, etc.):
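A sketch of the request body:

```json
{
  "url": "https://example.com",
  "includeSubdomains": true
}
```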
Ignore Sitemap (Force Crawling)
Force the API to crawl instead of using sitemap.xml (see the sketch after this list). When to use this:
- Sitemap is outdated or incomplete
- You want to discover hidden pages
- Testing actual site structure vs sitemap
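A sketch of the request body:

```json
{
  "url": "https://example.com",
  "ignoreSitemap": true
}
```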
Set URL Limit
Limit the number of URLs discovered (default: 5000):
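For example, to stop after 100 URLs (a sketch):

```json
{
  "url": "https://example.com",
  "limit": 100
}
```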
Set Timeout
Control how long to wait (default: 15000 ms / 15 seconds):
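For example, to allow up to 30 seconds (a sketch):

```json
{
  "url": "https://example.com",
  "timeout": 30000
}
```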
Real-World Examples
Example 1: Plan a Bulk Scrape
Map a site, filter URLs, then scrape only what you need:
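A Python sketch of the whole workflow (placeholder host; the /api/v2/crawl path is a hypothetical stand-in for the Crawl API, used only for illustration):

```python
import os
import requests

BASE = "https://api.yourprovider.com"  # placeholder host
HEADERS = {"Authorization": f"Bearer {os.environ['MAP_API_KEY']}"}

# 1. Map: discover every URL on the site (one flat fee).
links = requests.post(
    f"{BASE}/api/v2/map", headers=HEADERS, json={"url": "https://example.com"}
).json()["links"]

# 2. Filter: keep only the pages you actually need.
wanted = [link["url"] for link in links if "/blog/" in link["url"]]

# 3. Crawl: fetch content for the selected pages only.
#    Assumption: a hypothetical /api/v2/crawl endpoint taking a list of URLs.
pages = requests.post(
    f"{BASE}/api/v2/crawl", headers=HEADERS, json={"urls": wanted}
).json()
print(f"Crawled {len(wanted)} of {len(links)} discovered pages")
```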
Example 2: Build a Sitemap
Generate a sitemap from any website:
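A Python sketch that writes the discovered links to a minimal sitemap.xml (placeholder host, as before):

```python
import os
from xml.sax.saxutils import escape

import requests

links = requests.post(
    "https://api.yourprovider.com/api/v2/map",  # placeholder host
    headers={"Authorization": f"Bearer {os.environ['MAP_API_KEY']}"},
    json={"url": "https://example.com"},
).json()["links"]

# Write a minimal sitemap.xml from the discovered URLs.
with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for link in links:
        f.write(f"  <url><loc>{escape(link['url'])}</loc></url>\n")
    f.write("</urlset>\n")
```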
Example 3: Content Audit Tool
Analyze website structure and find pages by category:
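A Python sketch that groups discovered URLs by their first path segment (placeholder host, as before):

```python
import os
from collections import Counter
from urllib.parse import urlparse

import requests

links = requests.post(
    "https://api.yourprovider.com/api/v2/map",  # placeholder host
    headers={"Authorization": f"Bearer {os.environ['MAP_API_KEY']}"},
    json={"url": "https://example.com"},
).json()["links"]

# Count pages per top-level section, e.g. /blog/, /docs/, /products/.
sections = Counter(
    urlparse(link["url"]).path.strip("/").split("/")[0] or "(root)"
    for link in links
)
for section, count in sections.most_common():
    print(f"{section}: {count} pages")
```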
Request Parameters (Complete Reference)
Endpoint: POST /api/v2/map
Required Parameters
url (string, required)
The website URL to map. Must be a valid HTTP(S) URL. Examples:
- ✅ https://example.com
- ✅ https://docs.example.com
- ❌ example.com (missing protocol)
Optional Parameters
ignoreSitemap (boolean, optional)
Skip sitemap.xml and force crawling to discover URLs. Use when:
- Sitemap is outdated
- You want actual site structure
- Testing completeness
includeSubdomains (boolean, optional)
Include URLs from all subdomains (blog.*, docs.*, api.*, etc.)
search (string, optional)
Filter discovered URLs by keyword. Only returns URLs containing this string. Example: search: "blog" returns only URLs that contain "blog".
limit (integer, optional)
Maximum number of URLs to discover. Stops after reaching this limit. Range: 1-5000. Default: 5000.
timeout (integer, optional)
Timeout in milliseconds (how long to wait). Default: 15000 (15 seconds).
This is the operation timeout, not the HTTP request timeout. It controls how long the mapping operation runs.
Response Format
Response Structure
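A sketch of the response shape, assembled from the fields documented below (the exact envelope may differ):

```json
{
  "links": [
    { "url": "https://example.com/", "title": "Example Home" },
    { "url": "https://example.com/about", "title": "About Us" }
  ],
  "statusCode": 200,
  "cost": 0.002
}
```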
Response Fields
links (array)
Array of discovered URLs with titles. Each link object contains:
- url (string): Full URL
- title (string): Page title
statusCode (number)
HTTP status code (200 for success)
cost (number)
Cost in USD ($0.002 per request)
Error Handling
Error Format
All errors use this structure:
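One plausible shape, inferred from the fields used elsewhere in this guide (verify against a live error response):

```json
{
  "statusCode": 401,
  "error": "Missing or invalid API key"
}
```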
Common Errors
401 - Authentication Error
Missing or invalid API key. Fix: Add your API key to the Authorization header.
400 - Invalid URL
Invalid or malformed URL. Fix: Ensure the URL includes the protocol (https://) and is properly formatted.
500 - Map Failed
Failed to map the website. Common causes:
- Website is down
- Website blocks crawlers
- Network timeout
- Invalid site structure
504 - Timeout
Operation timed out. Fix: Increase the timeout or try mapping with a limit.
Robust Error Handling
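A Python sketch with simple retries and result checks (placeholder host; the retry policy is illustrative, not prescribed by the API):

```python
import os
import time

import requests

MAP_ENDPOINT = "https://api.yourprovider.com/api/v2/map"  # placeholder host

def map_site(url: str, retries: int = 3) -> list:
    """Map a site, retrying transient failures with exponential backoff."""
    headers = {"Authorization": f"Bearer {os.environ['MAP_API_KEY']}"}
    for attempt in range(1, retries + 1):
        try:
            resp = requests.post(
                MAP_ENDPOINT,
                headers=headers,
                json={"url": url, "timeout": 30000},
            )
            if resp.status_code == 200:
                links = resp.json().get("links", [])
                if links:  # guard against empty results
                    return links
            print(f"Attempt {attempt}: HTTP {resp.status_code}, retrying...")
        except requests.RequestException as exc:
            print(f"Attempt {attempt}: network error: {exc}")
        time.sleep(2 ** attempt)  # exponential backoff: 2s, 4s, 8s...
    raise RuntimeError(f"Failed to map {url} after {retries} attempts")

links = map_site("https://example.com")
print(f"Discovered {len(links)} URLs")
```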
Best Practices
💰 Cost Optimization
Use Map before Crawl
- Map the entire site once ($0.002)
- Filter URLs to only what you need
- Crawl selected pages ($0.001 each)
- Save money on unnecessary scraping
Cache map results
- Site structures don’t change often
- Cache map results for hours/days
- Re-map only when site updates
⚡ Performance Tips
Use appropriate limits
- Don’t map 5000 URLs if you only need 50
- Set limit to control discovery
- Use search to filter early
Choose the right discovery mode
- Default (sitemap): Fastest
- ignoreSitemap: true: More complete
- includeSubdomains: Comprehensive
✨ Better Results
Filter effectively
- Use the search parameter to narrow results
- Filter by path, keyword, or pattern
- Process results programmatically
- Check links.length before processing
- Validate URLs before scraping
- Group by subdomain or path
🛡️ Reliability
Handle errors gracefully
- Some sites block mapping
- Network issues happen
- Implement retry logic
- Check statusCode === 200
- Verify links is not empty
- Handle partial results
- Track discovered URL counts
- Log failed mappings
- Alert on anomalies
Important Limitations
Maximum limits:
- Up to 5,000 URLs per request
- 15-second default timeout (configurable)
- Flat fee regardless of URLs found
What works best:
- Public websites with sitemaps
- Documentation sites
- Blogs and news sites
- E-commerce product catalogs
- Company websites
Frequently Asked Questions
What's the difference between Map and Crawl?
Map API:
- Discovers URLs only (no content)
- Returns titles but no page content
- Super fast (1-5 seconds)
- Very cheap ($0.002)
- Use for: Discovery, planning, sitemaps
Crawl API:
- Gets full content from each page
- Returns markdown, HTML, PDF, screenshots
- Slower (depends on page count)
- More expensive ($0.001 per page)
- Use for: Content extraction, archiving
How does the API discover URLs?
The API uses multiple strategies:
1. Sitemap first (fastest)
- Checks for /sitemap.xml
- Uses sitemap index if available
- Parses all URLs from the sitemap
2. Crawling fallback (if no sitemap or ignoreSitemap: true)
- Starts from the homepage
- Follows internal links
- Discovers pages organically
3. Title extraction
- Fetches page metadata only
- Gets titles without full content
- Much faster than scraping
Is the limit per domain or total?
The limit parameter is the total number of URLs returned, not per subdomain. If you set, say, limit: 100 with includeSubdomains enabled, those 100 URLs can come from any subdomain.
Can I map a specific subdomain only?
Yes! Just provide the subdomain URL, as in the sketch below. This will only map that specific subdomain, not the main domain or other subdomains.
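For example:

```json
{
  "url": "https://docs.example.com"
}
```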
What if the sitemap is outdated?
Use ignoreSitemap: true to force actual crawling. This discovers the actual site structure by following links, which may find pages not in the sitemap.
How can I filter results after mapping?
Process the links array with JavaScript or Python:
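For example, in Python (the links list here is sample data standing in for a real map response):

```python
# links: the "links" array from a map response.
links = [
    {"url": "https://example.com/blog/hello", "title": "Hello"},
    {"url": "https://example.com/docs/start", "title": "Getting Started Guide"},
]

# Keep only blog posts.
blog_posts = [link for link in links if "/blog/" in link["url"]]

# Keep only pages whose title mentions a keyword.
guides = [link for link in links if "guide" in link["title"].lower()]

print(len(blog_posts), "blog posts,", len(guides), "guides")
```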
Can I map password-protected sites?
No, the Map API cannot access:
- Password-protected pages
- Pages behind authentication
- Private intranets
- Paywalled content
What's the timeout for?
The timeout parameter controls how long the mapping operation runs. If the operation takes longer:
- It will stop and return what was found so far
- Or return an error if nothing was discovered
Next Steps
Crawl API
Scrape multiple pages after mapping
Scraper API
Scrape individual pages
Answer API
Search + AI-powered answers
Need Help?
Found a bug or have a feature request? We’d love to hear from you! Join our Discord or email us at [email protected]
