Program Data

const program = await client.extract('https://www.ycombinator.com/about', {
  modes: ['json'],
  jsonSchema: {
    program: 'string',
    duration: 'string',
    funding: 'string',
    benefits: ['string'],
  },
  instructions: 'Return concise values. Use null when a field is missing.',
});

console.log(program.structured_data);
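
Because the instructions above ask for null when a field is missing, downstream code should expect empty values. The following is a minimal Python sketch of one way to check for them. It assumes the structured data arrives as a plain dict keyed by the schema fields; `missing_fields` and the sample payload are hypothetical and not part of the client library.

```python
from typing import Any

# Keys copied from the schema in the example above.
EXPECTED_FIELDS = ("program", "duration", "funding", "benefits")


def missing_fields(structured_data: dict[str, Any]) -> list[str]:
    """Return the expected keys that are absent, null, or empty."""
    return [
        key
        for key in EXPECTED_FIELDS
        if structured_data.get(key) in (None, "", [])
    ]


# Illustrative payload only; this is not real API output.
sample = {"program": "Example", "duration": None, "benefits": []}
print(missing_fields(sample))  # ['duration', 'funding', 'benefits']
```

A check like this makes it easier to decide whether to retry with a narrower schema or accept a partial result.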

List Extraction

Use an array in your schema when the page contains repeated items, such as a list of blog posts.

posts = client.extract(
    "https://www.ycombinator.com/blog",
    modes=["json"],
    json_schema={
        "posts": [
            {
                "title": "string",
                "url": "string",
                "date": "string",
            }
        ]
    },
    instructions="Preserve each article URL when available.",
)

for post in posts.structured_data.get("posts", []):
    print(post["title"], post.get("date"), post.get("url"))
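
Extracted URLs may be relative or repeated, so a small clean-up pass can help before storing them. The sketch below assumes each post is a dict shaped like the schema above; `normalize_posts` is a hypothetical helper, not an SDK function, and it relies only on the standard library.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.ycombinator.com/blog"


def normalize_posts(posts: list[dict], base_url: str = BASE_URL) -> list[dict]:
    """Resolve relative URLs against base_url and drop posts with a repeated URL."""
    seen: set[str] = set()
    cleaned = []
    for post in posts:
        raw_url = post.get("url")
        # Posts without a URL are kept, with url set to None.
        url = urljoin(base_url, raw_url) if raw_url else None
        if url is not None:
            if url in seen:
                continue  # same article listed twice
            seen.add(url)
        cleaned.append({**post, "url": url})
    return cleaned
```

For example, `normalize_posts(posts.structured_data.get("posts", []))` could be used in place of the raw list in the loop above.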

Page Summary + Q&A

const result = await client.extract('https://www.ycombinator.com/blog', {
  modes: ['summary', 'qa'],
  query: 'What are the three most actionable recommendations?',
  responseLanguage: 'en',
});

console.log(result.summary);
console.log(result.answer);

Brand Enrichment

brand = client.extract(
    "https://www.ycombinator.com",
    modes=["brand", "links"],
)

print(brand.brand)
print(brand.links[:10] if brand.links else [])
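
When the link list is long, it can help to keep only links on the site's own host. The sketch below assumes `links` is a list of absolute URL strings, which this page does not confirm; `same_host_links` is a hypothetical helper built on the standard library.

```python
from typing import Optional
from urllib.parse import urlparse


def same_host_links(
    links: Optional[list[str]], host: str = "www.ycombinator.com"
) -> list[str]:
    """Keep links whose hostname equals host, in order, without duplicates."""
    kept: list[str] = []
    for link in links or []:  # tolerate a missing (None) link list
        if urlparse(link).hostname == host and link not in kept:
            kept.append(link)
    return kept
```

If the API returns link objects rather than strings, the hostname check would need to read the URL from the appropriate field instead.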

Combined Workflow

const site = await client.extract('https://www.ycombinator.com', {
  modes: ['summary', 'brand', 'links'],
});

const posts = await client.extract('https://www.ycombinator.com/blog', {
  modes: ['json'],
  jsonSchema: {
    posts: [
      {
        title: 'string',
        url: 'string',
        date: 'string',
      },
    ],
  },
});

console.log(site.summary);
console.log(site.brand);
console.log(posts.structured_data);

Reliability Tips

  • Keep json_schema (jsonSchema in the JavaScript client) limited to fields that are visible on the page.
  • Use instructions for normalization rules, not for unrelated business logic.
  • When extraction output gets too large, split very large pages into scrape/search workflows.
  • Read JSON results only from structured_data; treat it as the only structured extraction result field.
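
One way to act on these tips is to check the returned data against the schema before using it. The sketch below assumes the shorthand used on this page ('string', a one-element array, and nested objects) and treats null or missing values as acceptable; `shape_errors` is a hypothetical helper, not part of the API.

```python
from typing import Any


def shape_errors(schema: Any, data: Any, path: str = "$") -> list[str]:
    """List the places where data does not match the shorthand schema."""
    if data is None:
        return []  # assumption: null or missing fields are allowed
    if schema == "string":
        return [] if isinstance(data, str) else [f"{path}: expected string"]
    if isinstance(schema, list):
        if not isinstance(data, list):
            return [f"{path}: expected array"]
        if not schema:
            return []  # empty array schema: nothing to check per item
        errors: list[str] = []
        for i, item in enumerate(data):
            errors += shape_errors(schema[0], item, f"{path}[{i}]")
        return errors
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [f"{path}: expected object"]
        errors = []
        for key, sub_schema in schema.items():
            errors += shape_errors(sub_schema, data.get(key), f"{path}.{key}")
        return errors
    return [f"{path}: unsupported schema node"]


posts_schema = {"posts": [{"title": "string", "url": "string", "date": "string"}]}
print(shape_errors(posts_schema, {"posts": [{"title": 1}]}))
# ['$.posts[0].title: expected string']
```

An empty list means the data matches the expected shape; any entries point to the exact path that needs attention.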

Extract API

Main guide and response contract.

Errors & Refunds

Error codes and refund behavior.