Errors
All errors return a JSON body with error and status fields:
{
"error": "Description of what went wrong",
"status": 400
}
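As a minimal sketch of consuming this shape on the client side, the body can be parsed into an exception that carries both fields. The names `ApiError` and `parse_error` are hypothetical and not part of the API.

```python
import json


class ApiError(Exception):
    """Hypothetical client-side wrapper for the documented error body."""

    def __init__(self, message: str, status: int):
        super().__init__(f"{status}: {message}")
        self.message = message
        self.status = status


def parse_error(body: str) -> ApiError:
    """Turn a JSON error body with `error` and `status` fields into an ApiError."""
    data = json.loads(body)
    return ApiError(message=data["error"], status=int(data["status"]))
```

A caller would typically raise the returned object when the HTTP status is 400 or above.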
Error Codes
400 — Bad Request
| Error | Cause | Fix |
|---|---|---|
| Invalid JSON body | Request body isn't valid JSON | Check your JSON syntax |
| Missing or invalid 'url' field | No url in the body, or it's not a string | Send {"url": "https://..."} |
| Invalid URL format | The URL can't be parsed | Use a full URL with https:// |
| URL exceeds maximum length of 2048 | URL is longer than 2048 characters | Shorten the URL or use a redirect |
| Unsupported URL scheme: <scheme>: | URL uses a scheme other than http: or https: | Only http:// and https:// are supported |
| URL targets a private IPv4 address / URL resolves to a private host | URL points to a private, loopback, or link-local address (e.g. localhost, 127.0.0.1, 10.x, 192.168.x, 169.254.x) | Use a publicly reachable URL |
| DNS resolution failed | The hostname could not be resolved | Verify the domain exists and is spelled correctly |
| Cannot revoke the key you are currently using | Trying to delete your own active key | Use a different key to authenticate |
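Several of the URL checks in this table can be approximated locally before a request is sent. The sketch below is an assumption about how a client might do that, not the server's actual validation: the function name `precheck_url`, the order of the checks, and the mapping of each check to a documented message are all hypothetical. It performs no DNS lookup, so hostnames that resolve to private addresses are not caught.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048  # limit stated in the 400 table


def precheck_url(url) -> Optional[str]:
    """Return the likely 400 message for `url`, or None if no problem is found."""
    if not isinstance(url, str):
        return "Missing or invalid 'url' field"
    if len(url) > MAX_URL_LENGTH:
        return "URL exceeds maximum length of 2048"
    try:
        parts = urlsplit(url)
    except ValueError:
        return "Invalid URL format"
    if not parts.scheme or not parts.hostname:
        return "Invalid URL format"
    if parts.scheme not in ("http", "https"):
        return f"Unsupported URL scheme: {parts.scheme}:"
    if parts.hostname == "localhost":
        return "URL resolves to a private host"
    try:
        addr = ipaddress.ip_address(parts.hostname)
    except ValueError:
        return None  # a hostname, not an IP literal; resolution is left to the API
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        if addr.version == 4:
            return "URL targets a private IPv4 address"
        return "URL resolves to a private host"
    return None
```

A local check like this only saves a round trip; the API's response remains the authority on whether a URL is accepted.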
401 — Unauthorized
| Error | Cause | Fix |
|---|---|---|
| Missing or invalid Authorization header | No Authorization header or wrong format | Add Authorization: Bearer YOUR_KEY |
| API key is empty | Header present but key is blank | Include the key after Bearer |
| Invalid API key | Key doesn't match any active key | Check the key, or it may have been revoked |
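The first two 401 cases can be avoided by building the header in one place. This is a minimal sketch; the helper name `auth_headers` is hypothetical, and the only detail taken from the table is the `Authorization: Bearer YOUR_KEY` format.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization header, rejecting a blank key before any request is made."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {key}"}
```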
404 — Not Found
| Error | Cause | Fix |
|---|---|---|
| Key not found or already revoked | Key ID doesn't exist or is already revoked | Check the key ID |
| Page not found at the given URL | The target URL returned a 404 | Verify the URL is correct |
422 — Unprocessable
| Error | Cause | Fix |
|---|---|---|
| Domain not found. Check the URL. | DNS resolution failed | Verify the domain exists |
429 — Rate Limited
| Error | Cause | Fix |
|---|---|---|
| Monthly usage limit exceeded | You've hit your plan's monthly page limit | Upgrade your plan or wait for the next billing cycle |
| Rate limited by Cloudflare | Too many concurrent scraping requests | Wait a few seconds and retry |
The 429 response includes extra fields for your plan, its limit, and your current usage:
{
"error": "Monthly usage limit exceeded",
"status": 429,
"limit": 50,
"usage": 50,
"plan": "free"
}
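A client could use these fields to tell an exhausted monthly quota apart from a short-lived rate limit. The sketch below assumes the extra fields may be absent on some 429 responses, which the text above does not state either way, so it treats them as optional. The function name `quota_summary` is hypothetical.

```python
from typing import Optional


def quota_summary(body: dict) -> Optional[str]:
    """Summarize plan usage from a 429 body, or return None if the fields are missing."""
    if body.get("status") != 429:
        return None
    if not all(key in body for key in ("limit", "usage", "plan")):
        return None  # assumption: not every 429 carries the extra fields
    remaining = max(body["limit"] - body["usage"], 0)
    return (
        f"{body['plan']} plan: {body['usage']}/{body['limit']} pages used, "
        f"{remaining} remaining"
    )
```

When a summary comes back with nothing remaining, waiting a few seconds will not help; the fix from the table (upgrade, or wait for the next billing cycle) applies instead.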
502 — Bad Gateway
| Error | Cause | Fix |
|---|---|---|
| Connection refused by the target server | Target server blocked the connection | The site may block automated access |
| SSL/TLS error connecting to the target server | SSL certificate issue on target | The target site has a certificate problem |
| Scraping failed: Rate limiter timeout: too many concurrent requests | Our upstream scraping provider is saturated and we couldn't get a slot within 30s | Retry after a short delay; this is rare |
504 — Gateway Timeout
| Error | Cause | Fix |
|---|---|---|
| Page took too long to load | The target page didn't finish loading in time | The site may be very slow or heavily JS-dependent. The API already retries automatically, so wait a moment before trying again. |
Automatic Retries
The API automatically retries on transient errors (408, 429, 500, 502, 503, 504) with exponential backoff:
- Attempt 1: Immediate
- Attempt 2: After 1 second
- Attempt 3: After 3 seconds
If all retries fail, the error message includes (retried) to indicate that retries were attempted. You generally don't need to implement your own retry logic for scraping failures.
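On the client side, the `(retried)` marker can be used to decide how to react to a failure. The sketch below is one possible policy, not part of the API: the function name and the three labels are hypothetical, and it assumes the marker appears somewhere in the `error` string, since its exact position is not specified.

```python
# Status codes the text above lists as transient.
TRANSIENT_STATUSES = frozenset({408, 429, 500, 502, 503, 504})
RETRIED_MARKER = "(retried)"


def classify_failure(status: int, message: str) -> str:
    """Label a failed response so the caller can choose what to do next.

    "retried_by_api": the API already exhausted its own retries.
    "transient":      a transient status with no sign of server-side retries.
    "permanent":      anything else, such as a 400 or 401.
    """
    if RETRIED_MARKER in message:
        return "retried_by_api"
    if status in TRANSIENT_STATUSES:
        return "transient"
    return "permanent"
```

A caller might surface `retried_by_api` failures to the user right away and reserve any client-side delay-and-retry for the `transient` label. A 429 for an exceeded monthly limit also falls under `transient` here, even though retrying it will not succeed until the quota resets.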
Thin Content Warning
When the content extracted from a page is under 200 words, the response includes a warning field:
{
"markdown": "...",
"metadata": { ... },
"warning": "Content appears thin (under 200 words). The page may require JavaScript rendering or have limited content."
}
This is not an error — you still get the content. It's a signal that the extraction may be incomplete, often because:
- The page is behind a login wall
- The page uses heavy client-side rendering
- The page genuinely has little content
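Because the warning arrives alongside normal content, a client can pass the content through and report the warning separately. This is a minimal sketch; the function name `extract_markdown` and the `on_warning` callback are hypothetical, and only the `markdown` and `warning` field names come from the example above.

```python
from typing import Callable, Optional


def extract_markdown(
    response: dict,
    on_warning: Optional[Callable[[str], None]] = None,
) -> str:
    """Return the markdown content, reporting any thin-content warning via a callback."""
    warning = response.get("warning")
    if warning and on_warning is not None:
        on_warning(warning)
    return response["markdown"]
```

Passing `print` or a logger method as `on_warning` is enough to flag pages that may need a second look.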