The web, as an API.
- No credit card
- Bot-proof fetching
- Schema-validated JSON
Drops into your stack — pick your weapon
Built for developers who need reliable data
AI Agent Extraction
Describe what you want in plain English. Our agents understand the page semantically and return exactly the structured data you asked for.
user_prompt: "Extract top 5 stories with title, url, points"
Bot-Proof Browser
Our own Chromium fork, fingerprint-patched at the source — not a bolt-on script anti-bot can spot.
Schema Enforcement
Validate, repair, guarantee. JSON Schema in, conforming data out.
validate schema ✓ 5/5 fields matched ✓ no repair needed
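The validate-then-repair idea can be illustrated on the client side. The sketch below is a simplified stand-in, not the service's implementation and not a full JSON Schema validator: it checks only required keys and primitive types. `STORY_FIELDS`, `validate_item`, and `repair_item` are hypothetical names chosen for this example.

```python
from typing import Any

# Hypothetical per-item schema for the story example: field name -> expected type.
STORY_FIELDS: dict[str, type] = {"title": str, "url": str, "points": int}


def validate_item(item: dict[str, Any], fields: dict[str, type]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the item conforms."""
    errors = []
    for name, expected in fields.items():
        if name not in item:
            errors.append(f"{name}: missing")
        elif not isinstance(item[name], expected):
            got = type(item[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {got}")
    return errors


def repair_item(item: dict[str, Any], fields: dict[str, type]) -> dict[str, Any]:
    """Drop unknown fields and coerce values to the expected type where possible."""
    repaired = {}
    for name, expected in fields.items():
        if name not in item:
            raise ValueError(f"missing required field: {name}")
        value = item[name]
        if not isinstance(value, expected):
            try:
                value = expected(value)  # e.g. "342" -> 342
            except (TypeError, ValueError) as exc:
                raise ValueError(f"field {name!r} is not a valid {expected.__name__}") from exc
        repaired[name] = value
    return repaired
```

Running `validate_item` first and calling `repair_item` only when it reports problems mirrors the "no repair needed" fast path shown above.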
Intelligent Chunking
Long pages split, processed in parallel, deduped on merge.
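As a rough illustration of split, parallel processing, and dedupe-on-merge, here is a minimal sketch. It assumes the page has already been reduced to a list of text lines and that each extracted record has some stable key (a URL in the usage below). `split_lines`, `extract_parallel`, and the `extract` callback are hypothetical; how the service itself chunks and merges is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Hashable


def split_lines(lines: list[str], chunk_size: int, overlap: int = 0) -> list[list[str]]:
    """Group lines into chunks of at most chunk_size, repeating `overlap` lines between chunks."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [lines[i:i + chunk_size] for i in range(0, len(lines), step)]


def extract_parallel(
    chunks: list[list[str]],
    extract: Callable[[list[str]], list[dict]],
    key: Callable[[dict], Hashable],
    max_workers: int = 4,
) -> list[dict]:
    """Run `extract` on every chunk in parallel, then merge and drop duplicate records."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_chunk = list(pool.map(extract, chunks))  # map() preserves chunk order

    merged: list[dict] = []
    seen: set[Hashable] = set()
    for records in per_chunk:
        for record in records:
            record_key = key(record)
            if record_key not in seen:  # first occurrence wins
                seen.add(record_key)
                merged.append(record)
    return merged
```

Overlapping chunks reduce the chance of cutting a record in half at a boundary, at the cost of duplicates, which is why the merge step dedupes by key.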
Any Input Source
URLs, raw HTML, Markdown, or PDFs — one API for all of them.
Content Reduction
A local NLP layer strips navbars, ads, and noise before extraction — cutting cost and latency by 50–80% with no accuracy loss.
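To make the idea of pre-extraction reduction concrete, the sketch below drops text inside obviously non-content tags using only Python's standard `html.parser`. This is a tag-based heuristic for illustration, not the NLP layer described above; the `SKIPPED_TAGS` set and the `reduce_html` helper are assumptions of this example.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as page chrome or non-content (an assumption).
SKIPPED_TAGS = {"nav", "header", "footer", "aside", "script", "style", "noscript"}


class MainTextExtractor(HTMLParser):
    """Collect visible text, ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._parts.append(text)

    def text(self) -> str:
        return "\n".join(self._parts)


def reduce_html(html: str) -> str:
    """Return the text of the page with navigation, scripts, and footers removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Comparing `len(reduce_html(page))` with `len(page)` gives a quick feel for how much of a given page is chrome rather than content.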
One request. Six stages. Clean JSON.
What you can build
E-commerce monitoring
Track prices, stock, and reviews across thousands of product pages with consistent JSON schemas.
Lead generation
Extract names, emails, titles, and company info from directories and profiles at scale.
News & content intel
Pull articles, authors, dates, and entities from any publisher into clean, queryable data.
AI agent tooling
Plug structured web data into LangChain, n8n, or your own agents — no scrapers to maintain.
Stop fighting with selectors and broken scripts
| Feature | SmartScraper | DIY scraper | Headless browser |
|---|---|---|---|
| Works on any site without writing selectors | | | |
| Schema-validated structured output | | | |
| Bot-proof browser | | | |
| Automatic chunking for long pages | | | |
| 50–80% noise removed before processing | | | |
| PDF + HTML + Markdown input | | | |
| Zero maintenance when sites change | | | |
| No infrastructure to run | | | |
One endpoint. Any language.
# fetch the top 5 HN stories
curl -X POST https://api.webscrape.ai/v1/smartscraper \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
"website_url": "https://news.ycombinator.com",
"user_prompt": "Extract the top 5 stories with title, url, and points"
}'
- fetch 12ms
- clean 3ms
- reduce 23ms
- extract 480ms
- validate 2ms
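The curl request above can be sketched in Python with only the standard library. The endpoint, headers, and body fields are taken from the curl example; the function names `build_request` and `smartscrape` and the 30-second timeout are assumptions of this sketch, and error handling is omitted.

```python
import json
import urllib.request

API_URL = "https://api.webscrape.ai/v1/smartscraper"  # endpoint from the curl example


def build_request(website_url: str, user_prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the smartscraper endpoint."""
    body = json.dumps({"website_url": website_url, "user_prompt": user_prompt})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def smartscrape(website_url: str, user_prompt: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(website_url, user_prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `smartscrape("https://news.ycombinator.com", "Extract the top 5 stories with title, url, and points", api_key)` would be expected to return a dictionary shaped like the sample response, with the key read from an environment variable such as `API_KEY` as in the curl example.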
// 200 OK · extracted in ~325ms
{
"stories": [
{
"title": "Show HN: I built a real-time code editor",
"url": "https://example.com/editor",
"points": 342
},
{
"title": "Why Rust is the future of systems programming",
"url": "https://example.com/rust",
"points": 281
},
{
"title": "PostgreSQL 18 released with major improvements",
"url": "https://postgresql.org/18",
"points": 256
},
{
"title": "A deep dive into WebAssembly garbage collection",
"url": "https://example.com/wasm-gc",
"points": 198
},
{
"title": "Open source alternative to Figma",
"url": "https://example.com/penpot",
"points": 175
}
]
}
Simple, transparent pricing
Free
Try the API with no commitment.
- 500 starting credits
- 300 credits / month thereafter
- 1 concurrent request
- 10 requests / minute
- 7-day data retention
- Limited SmartBrowse
Hobby
For side projects and prototypes.
- 5,000 credits / month
- 10 concurrent requests
- 100 requests / minute
- Standard proxy rotation
- 30-day data retention
- 20% off extra credits
Startup
Cost Efficient
For growing teams in production.
- 30,000 credits / month
- 50 concurrent requests
- 500 requests / minute
- Residential proxies
- 30-day data retention
- Priority support
- 40% off extra credits
Enterprise
Tailored solutions for large organizations.
- Unlimited credits
- Custom rate limits
- Dedicated infrastructure
- Premium proxy pool
- 99.9% SLA guarantee
- Dedicated account manager
- On-premise deployment
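The per-minute request limits in the plans above (10, 100, and 500 requests / minute) can be respected on the client side with a small throttle. The sketch below is one possible approach, a single-threaded sliding-window limiter; `SlidingWindowLimiter` is a hypothetical helper, and how the API itself responds when a limit is exceeded is not covered here.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling `period` seconds (not thread-safe)."""

    def __init__(
        self,
        max_calls: int,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: deque = deque()  # timestamps of calls inside the current window

    def _evict(self, now: float) -> None:
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def acquire(self) -> float:
        """Block until another call is allowed; return the seconds spent waiting."""
        now = self._clock()
        self._evict(now)
        waited = 0.0
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest call in the window expires.
            waited = self._calls[0] + self.period - now
            self._sleep(waited)
            now = self._clock()
            self._evict(now)
        self._calls.append(now)
        return waited
```

On the Free plan, for instance, a client could create `SlidingWindowLimiter(max_calls=10)` and call `acquire()` before each request. The injectable `clock` and `sleep` arguments exist so the limiter can be tested without real delays.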
AI agent? Read the plain-text version at /pricing.md.
Ready to extract structured data?
Try the live playground or integrate the API in minutes. No credit card required.