Rate Limiting

Per-token rate limiting protects the API from abuse and ensures fair usage across all consumers. Each API token has configurable rate limits that control how many requests can be made within a given time window.

Default Limits

100 requests per 1 hour per token

This is the default limit applied to all new tokens. Limits are configurable per-token from the dashboard at /dashboard/tokens.

Window Formats

Rate limit windows are expressed as a number followed by a time unit suffix. The following formats are supported:

Supported Formats

Format   Unit      Example
Ns       Seconds   30s
Nm       Minutes   15m
Nh       Hours     1h
Nd       Days      1d

Examples: 1h (1 hour), 15m (15 minutes), 24h (24 hours), 30s (30 seconds).
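A client that stores or displays these windows may want to convert them to seconds. The following is a minimal sketch, assuming the four suffixes listed above are the only ones accepted. The function name `parse_window` is hypothetical, and rejecting a zero-length window is an assumption rather than documented API behaviour.

```python
import re

# Seconds per unit suffix, following the formats listed above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_WINDOW_RE = re.compile(r"(\d+)([smhd])")


def parse_window(window: str) -> int:
    """Convert a window string such as '15m' into a number of seconds."""
    match = _WINDOW_RE.fullmatch(window.strip())
    if match is None:
        raise ValueError(f"unsupported window format: {window!r}")
    count, unit = int(match.group(1)), match.group(2)
    if count <= 0:
        # Assumption: a zero-length window is not meaningful.
        raise ValueError("window must be greater than zero")
    return count * _UNIT_SECONDS[unit]
```

For instance, `parse_window("15m")` yields 900 and `parse_window("1h")` yields 3600.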

Response Headers

Every OCR response includes rate limit headers so your application can track usage in real time.

Rate Limit Headers

X-RateLimit-Limit       Maximum requests allowed in the current window
X-RateLimit-Remaining   Requests remaining in the current window
X-RateLimit-Reset       ISO 8601 timestamp when the window resets

Example Response Headers (HTTP)

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 2026-04-05T15:00:00.000Z

{
  "success": true,
  "data": {
    "documentType": "sa-id-smart-front",
    "surname": "NKOSI",
    "names": "THABO MICHAEL",
    "idNumber": "9201015800086",
    ...
  }
}
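One way to turn these headers into values an application can act on is sketched below. It assumes the three headers are always present and that X-RateLimit-Reset uses the UTC form shown in the example. The names `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical and not part of any official client.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int          # X-RateLimit-Limit
    remaining: int      # X-RateLimit-Remaining
    reset_at: datetime  # X-RateLimit-Reset, parsed as an aware datetime


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Extract the rate limit state from a response's headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    reset_raw = lowered["x-ratelimit-reset"].replace("Z", "+00:00")
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=datetime.fromisoformat(reset_raw),
    )
```

Applied to the example response above, this gives a limit of 100, 87 requests remaining, and a reset time of 15:00 UTC on 5 April 2026.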

When Rate Limited

When your token exceeds its rate limit, the API responds with HTTP 429 Too Many Requests and includes a Retry-After header indicating how many seconds to wait before retrying.

Error code: RATE_LIMITED

429 Rate Limited Response (HTTP)

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-04-05T15:00:00.000Z

{
  "success": false,
  "error": {
    "code": "RATE_LIMITED",
    "message": "Too many requests"
  }
}
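A client might work out how long to pause from a response like this one. The sketch below prefers Retry-After and falls back to X-RateLimit-Reset. The fallback, the one-second default, and the function name `seconds_to_wait` are all assumptions for illustration, not documented behaviour.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def seconds_to_wait(
    status_code: int,
    headers: Mapping[str, str],
    now: Optional[datetime] = None,
) -> float:
    """Return how long to pause before retrying; 0.0 if not rate limited."""
    if status_code != 429:
        return 0.0
    lowered = {name.lower(): value for name, value in headers.items()}

    # Preferred source: Retry-After, given in whole seconds.
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        return float(retry_after)

    # Assumed fallback: wait until the window resets.
    reset_raw = lowered.get("x-ratelimit-reset")
    if reset_raw:
        now = now or datetime.now(timezone.utc)
        reset_at = datetime.fromisoformat(reset_raw.replace("Z", "+00:00"))
        return max(0.0, (reset_at - now).total_seconds())

    # Assumed default when neither header is usable.
    return 1.0
```

With the example response above, `seconds_to_wait(429, headers)` returns 60.0 because Retry-After takes precedence.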

Best Practices

Check remaining requests

Read the X-RateLimit-Remaining header on each response and slow down or pause as it approaches zero, rather than waiting for a 429.

Implement exponential backoff

When you receive a 429 response, wait at least the number of seconds given in Retry-After, then retry with increasing delays (e.g. 1s, 2s, 4s, 8s) if the request is still rejected.
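The retry schedule can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `send` is a hypothetical callable that performs one request and returns the status code together with the Retry-After value in seconds (or `None`), and the base delay, factor, and cap are arbitrary choices.

```python
import time
from typing import Callable, List, Optional, Tuple


def backoff_delays(
    base: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    attempts: int = 4,
) -> List[float]:
    """Return exponentially growing delays, capped at max_delay."""
    return [min(max_delay, base * factor ** i) for i in range(attempts)]


def call_with_backoff(
    send: Callable[[], Tuple[int, Optional[float]]],
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it stops returning 429 or retries run out."""
    for delay in backoff_delays(attempts=max_retries):
        status, retry_after = send()
        if status != 429:
            return status
        # Never wait less than the server asked for.
        sleep(max(delay, retry_after or 0.0))
    # Final attempt after the last delay; its status is returned as-is.
    status, _ = send()
    return status
```

With the defaults, `backoff_delays()` produces 1, 2, 4, and 8 seconds. Passing `sleep` as a parameter keeps the loop testable without real waiting. Production clients often add random jitter to each delay so that many clients do not retry at the same instant.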

Cache OCR results client-side

Store extracted data locally to avoid rescanning the same document multiple times.
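A minimal in-memory sketch of such a cache is shown below. It assumes that identical image bytes always produce the same OCR result, and it keys entries by a SHA-256 digest of the upload. The class name `OcrCache` and the `scan` callable, which stands in for whatever function calls the OCR endpoint, are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class OcrCache:
    """Cache OCR results keyed by a hash of the document image."""

    def __init__(self, scan: Callable[[bytes], dict]) -> None:
        self._scan = scan                     # performs the real API call
        self._results: Dict[str, dict] = {}   # digest -> extracted data

    def get(self, image_bytes: bytes) -> dict:
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._results:
            # Only a cache miss consumes a request from the rate limit.
            self._results[key] = self._scan(image_bytes)
        return self._results[key]
```

Scanning the same image twice through `OcrCache.get` issues a single API request. A persistent store could replace the dictionary if results need to survive restarts.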

Use separate tokens per project

Each token has its own independent rate limit. Isolate projects with dedicated tokens.

Request a higher limit

If your use case requires more throughput, raise the limit on a per-token basis from the dashboard at /dashboard/tokens.