Crawlfox API

Authentication

Every request is authenticated with your secret API key, sent as a Bearer token in the Authorization header. Create and reveal your key from the dashboard.

Authorization: Bearer $CRAWLFOX_API_KEY

Keep it secret

Treat the key like a password — never ship it in client-side code. If it leaks, rotate it from the API Keys page; the old secret is invalidated immediately.
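
As a minimal sketch of the header format above, the following Python builds an authenticated request with only the standard library. The helper names (build_request, request_from_env) are illustrative, not part of any official client, and the base URL is taken from the endpoint examples in this reference.

```python
import json
import os
import urllib.request

# Base URL as it appears in the endpoint examples of this reference.
API_BASE = "https://api.crawlfox.io/v1"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    if not api_key:
        raise ValueError("missing API key; set CRAWLFOX_API_KEY on the server")
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def request_from_env(path: str, payload: dict) -> urllib.request.Request:
    """Read the key from the environment so it never lives in source code."""
    return build_request(path, payload, os.environ.get("CRAWLFOX_API_KEY", ""))
```

Passing the returned object to urllib.request.urlopen sends it. Run this only on a server you control, so the key never reaches a browser.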

Rate Limits

Each key has a requests-per-minute burst limit and a monthly credit allowance. Requests over the burst limit get HTTP 429; slow down and retry with backoff.
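
One way to "retry with backoff" on a 429 is exponential backoff with jitter. The sketch below is an assumption about a reasonable client policy, not documented Crawlfox behavior: the retry count, base delay, and cap are illustrative defaults, and send stands in for whatever function performs the HTTP call and returns a (status, body) pair.

```python
import random
import time
from typing import Callable, List, Tuple


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Upper bounds for each wait: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() again while it returns HTTP 429, until retries run out."""
    status, body = send()
    for delay in backoff_delays(max_retries, base, cap):
        if status != 429:
            break
        # "Full jitter": wait a random time up to the current bound so that
        # many clients hitting the limit together do not retry in lockstep.
        sleep(random.uniform(0, delay))
        status, body = send()
    return status, body
```

If every attempt is rate-limited, the last 429 is returned to the caller rather than raised, so the caller decides what to do next.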

Credits

Successful requests consume credits: 1 credit per page scraped (requesting multiple formats for one page is still 1 credit), 1 credit per 10 requested search results (based on num), and 1 credit per crawled page. Failed calls are free; cache hits are billed at a reduced 0.1 credit.
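
The pricing rules above can be turned into a small estimator. This is a sketch under two assumptions that the text does not state: that a search num which is not a multiple of 10 is rounded up to the next credit, and that cache hits replace the full page price rather than add to it. Decimal is used so that fractional cache-hit credits add up exactly.

```python
import math
from decimal import Decimal

PAGE_CREDIT = Decimal("1")         # per scraped or crawled page
CACHE_HIT_CREDIT = Decimal("0.1")  # reduced price for a cache hit


def page_credits(pages: int, cache_hits: int = 0) -> Decimal:
    """Credits for successful scraped/crawled pages, some served from cache."""
    if not 0 <= cache_hits <= pages:
        raise ValueError("cache_hits must be between 0 and pages")
    return (pages - cache_hits) * PAGE_CREDIT + cache_hits * CACHE_HIT_CREDIT


def search_credits(num: int) -> int:
    """Credits for one search call: 1 per 10 requested results.

    Rounding up for values such as 15 is an assumption, not documented.
    """
    if num < 1:
        raise ValueError("num must be at least 1")
    return math.ceil(num / 10)
```

For example, page_credits(10, cache_hits=3) covers 7 fresh pages and 3 cache hits, and search_credits(20) matches the documented cost of 2 credits.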

Status Codes & Errors

Errors return a JSON envelope containing a stable machine-readable code, the HTTP status, a flag indicating whether the request is safe to retry, a short title and message, and a human-readable remediation hint.

{
  "code": "UPSTREAM_TIMEOUT",
  "status": 504,
  "retryable": true,
  "title": "Request timed out",
  "message": "The target site did not respond in time.",
  "remediation": "Retry. If a URL consistently times out, try a more specific path."
}

Error codes

Switch on the stable `code` — statuses and messages may change, codes won't. Retryable errors are safe to retry with backoff.

Code	Retryable	When
MISSING_URL	No	The request body has no url.
INVALID_URL	No	The URL isn't fully-qualified (https://…) or is malformed.
UPSTREAM_UNREACHABLE	Yes	The target site couldn't be reached.
UPSTREAM_NOT_FOUND	No	The target site returned 404 for the URL.
UPSTREAM_TIMEOUT	Yes	The target site didn't respond in time.
UPSTREAM_RATE_LIMITED	Yes	The target site is rate-limiting requests.
UPSTREAM_SERVER_ERROR	Yes	The target site returned a server error.
UPSTREAM_GEO_BLOCKED	No	The target site refused the request on regional or legal grounds.
BOT_WALL	No	The page could not be retrieved.
NO_PUBLIC_CONTENT	No	The page loaded but had no readable content.
INTERNAL_ERROR	Yes	An unexpected error on our side — retry, then contact support with the request ID.
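
A client can branch on the stable code roughly as follows. This is a sketch, not an official SDK: CrawlfoxError and next_action are hypothetical names, and the three action labels are illustrative. The retryable set is copied from the table above and is used only as a fallback when the envelope has no boolean retryable field.

```python
import json

# Codes marked "Yes" in the Retryable column of the table above.
RETRYABLE_CODES = frozenset({
    "UPSTREAM_UNREACHABLE",
    "UPSTREAM_TIMEOUT",
    "UPSTREAM_RATE_LIMITED",
    "UPSTREAM_SERVER_ERROR",
    "INTERNAL_ERROR",
})


class CrawlfoxError(Exception):
    """Hypothetical exception type wrapping the JSON error envelope."""

    def __init__(self, envelope: dict):
        self.code = envelope.get("code", "UNKNOWN")
        self.status = envelope.get("status")
        self.remediation = envelope.get("remediation", "")
        flag = envelope.get("retryable")
        # Prefer the envelope's own flag; fall back to the documented table.
        self.retryable = flag if isinstance(flag, bool) else self.code in RETRYABLE_CODES
        super().__init__(f"{self.code}: {envelope.get('message', '')}")


def parse_error(raw: str) -> CrawlfoxError:
    """Turn an error response body into a CrawlfoxError."""
    return CrawlfoxError(json.loads(raw))


def next_action(error: CrawlfoxError) -> str:
    """Decide what to do using the code, never the status or message text."""
    if error.retryable:
        return "retry"
    if error.code in ("MISSING_URL", "INVALID_URL"):
        return "fix-request"
    return "give-up"
```

Keying the logic on code rather than on status or message follows the guidance above that only codes are guaranteed to stay stable.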

Scrape

Scrape a URL

POST https://api.crawlfox.io/v1/scrape

Consumes credits. 1 credit per page scraped — any number of formats for one page is still 1 credit. Cache hits are billed at 0.1 credit; failed calls are free.

Fetch a single URL and return clean, structured data. Crawlfox handles proxy rotation, bot-defense clearing, and a headless-browser fallback, so you get a result instead of a block. Choose one or more output formats.

Body params

url (string, required): The URL to scrape. Must include the scheme (https://).
formats (string[]): Any of markdown, html, rawHtml, text, json, links, images, emails. Defaults to markdown.
extractMainContent (boolean): Strip nav/header/footer/sidebar and return just the article body.
skipCache (boolean): Force a fresh fetch instead of serving a cached result.

request.sh

curl --request POST \
  --url https://api.crawlfox.io/v1/scrape \
  --header "Authorization: Bearer $CRAWLFOX_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"url":"https://example.com/","formats":["markdown","html","links"]}'

200 OK response.json

{
  "success": true,
  "data": {
    "markdown": "# Example Domain\n...",
    "html": "<html>...</html>",
    "links": ["https://www.iana.org/domains/example"],
    "metadata": {
      "title": "Example Domain",
      "description": "Example Domain for use in documents.",
      "language": "en",
      "sourceURL": "https://example.com/",
      "statusCode": 200
    }
  }
}
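
Validating the body before sending can catch MISSING_URL and INVALID_URL locally. The sketch below builds a scrape body from the documented parameters; build_scrape_body is a hypothetical helper, and accepting http:// as well as https:// is an assumption, since the reference only shows https://.

```python
from typing import Optional, Sequence
from urllib.parse import urlparse

# Output formats listed under "Body params" for the scrape endpoint.
ALLOWED_FORMATS = (
    "markdown", "html", "rawHtml", "text", "json", "links", "images", "emails",
)


def build_scrape_body(
    url: str,
    formats: Optional[Sequence[str]] = None,
    extract_main_content: Optional[bool] = None,
    skip_cache: Optional[bool] = None,
) -> dict:
    """Return the JSON body for POST /v1/scrape, omitting unset options."""
    parsed = urlparse(url)
    # Assumption: both http and https count as "fully-qualified".
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must include a scheme and host: {url!r}")
    body = {"url": url}
    if formats is not None:
        unknown = [f for f in formats if f not in ALLOWED_FORMATS]
        if unknown:
            raise ValueError(f"unsupported formats: {unknown}")
        body["formats"] = list(formats)
    if extract_main_content is not None:
        body["extractMainContent"] = extract_main_content
    if skip_cache is not None:
        body["skipCache"] = skip_cache
    return body
```

Leaving formats unset lets the server apply its documented default of markdown.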

Search the web (SERP)

POST https://api.crawlfox.io/v1/serp/{engine}

Consumes credits. 1 credit per 10 requested results (num) — num: 20 costs 2 credits, charged on the requested count whether or not that many are found. Failed calls are free.

Run a search on Google, Bing, or DuckDuckGo and get ranked organic results as structured JSON. Swap the engine in the path: /v1/serp/google, /v1/serp/bing, or /v1/serp/duckduckgo. For large result sets, POST to /v1/serp/{engine}/stream instead to receive newline-delimited JSON (NDJSON) — one result page per line, streamed as each page lands.

Body params

q (string, required): The search query.
num (integer): Number of results to return (up to 100 on Google; Bing caps at about 10).
start (integer): Result offset for pagination; for example, start: 10 returns the next page after the first 10. Google supports offsets up to 90.
country (string): Two-letter country code to localize results, e.g. us.
language (string): Two-letter language code, e.g. en.
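
Pagination with num and start can be sketched as a list of request bodies, one per page. The helper names and the page size of 10 are illustrative. The sketch assumes offsets are zero-based, so it omits start on the first page, and it applies the documented Google limit of 90 as the largest offset.

```python
from typing import List


def page_starts(total: int, page_size: int = 10, max_start: int = 90) -> List[int]:
    """Offsets needed to cover `total` results without exceeding max_start."""
    return [s for s in range(0, total, page_size) if s <= max_start]


def page_bodies(query: str, total: int, page_size: int = 10) -> List[dict]:
    """One request body per page; the last page asks only for the remainder."""
    bodies = []
    for start in page_starts(total, page_size):
        body = {"q": query, "num": min(page_size, total - start)}
        if start:
            # The first page relies on the default offset.
            body["start"] = start
        bodies.append(body)
    return bodies
```

Each body is billed separately on its own num, so several small pages may cost more credits than a single request with a larger num.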

request.sh

curl --request POST \
  --url https://api.crawlfox.io/v1/serp/google \
  --header "Authorization: Bearer $CRAWLFOX_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"q":"rust async tutorial","num":20}'

200 OK response.json

{
  "success": true,
  "data": {
    "web": [
      {
        "position": 1,
        "url": "https://rust-lang.github.io/async-book/",
        "title": "Asynchronous Programming in Rust",
        "description": "An introduction to async programming in Rust..."
      }
    ]
  },
  "creditsUsed": 2
}
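
For the /stream variant, NDJSON can be consumed one line at a time instead of waiting for the whole body. The sketch below parses lines as they arrive. The shape of each streamed line is not specified in this reference, so collect_results assumes every line mirrors the non-streaming response (a data.web array); treat that as a guess to check against real output.

```python
import json
from typing import Iterable, Iterator, List, Union


def iter_ndjson(lines: Iterable[Union[bytes, str]]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue  # tolerate blank keep-alive or trailing lines
        yield json.loads(line)


def collect_results(pages: Iterable[dict]) -> List[dict]:
    """Flatten result pages into one list.

    Assumption: each page looks like the non-streaming body,
    with results under data.web.
    """
    results = []
    for page in pages:
        results.extend(page.get("data", {}).get("web", []))
    return results
```

The response object returned by urllib.request.urlopen can be iterated line by line, so it can be passed to iter_ndjson directly.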

Batch

Batch scrape

POST https://api.crawlfox.io/v1/batch

Consumes credits. 1 credit per page scraped, billed per URL. Cache hits are 0.1 credit; failed URLs are free.

Scrape many URLs in one call. Each URL is processed independently with the same formats, and results come back in the same order as the input array.

Body params

urls (string[], required): The list of URLs to scrape, up to 100 per request.
formats (string[]): Output formats applied to every URL. Defaults to markdown.
extractMainContent (boolean): Strip nav/header/footer/sidebar from every page and return just the article body.
skipCache (boolean): Force a fresh fetch for every URL instead of serving cached results.

request.sh

curl --request POST \
  --url https://api.crawlfox.io/v1/batch \
  --header "Authorization: Bearer $CRAWLFOX_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"urls":["https://example.com/","https://example.org/"],"formats":["markdown"]}'

200 OK response.json

{
  "success": true,
  "data": [
    { "markdown": "# Example Domain\n...", "metadata": { "sourceURL": "https://example.com/", "statusCode": 200 } },
    { "markdown": "# Example Domain\n...", "metadata": { "sourceURL": "https://example.org/", "statusCode": 200 } }
  ]
}
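
Because a batch accepts up to 100 URLs and returns results in input order, a client with a longer list can split it into chunks and match results back by position. The sketch below shows both steps with hypothetical helper names. How a failed URL is represented in the data array is not described here, so pair_results only checks that the lengths match.

```python
from typing import List, Sequence, Tuple

MAX_BATCH_URLS = 100  # documented per-request limit


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_URLS) -> List[List[str]]:
    """Split a URL list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(urls[i:i + size]) for i in range(0, len(urls), size)]


def pair_results(urls: Sequence[str], response: dict) -> List[Tuple[str, dict]]:
    """Match each input URL to its result, relying on same-order output."""
    data = response.get("data", [])
    if len(data) != len(urls):
        raise ValueError(f"expected {len(urls)} results, got {len(data)}")
    return list(zip(urls, data))
```

Pairing by position avoids relying on metadata.sourceURL, which may not be identical to the string that was submitted.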