Rate Limiting
Per-token rate limiting protects the API from abuse and ensures fair usage across all consumers. Each API token has configurable rate limits that control how many requests can be made within a given time window.
Default Limits
100 requests per 1 hour per token
This is the default limit applied to all new tokens. Limits are configurable per-token from the dashboard at /dashboard/tokens.
Window Formats
Rate limit windows are expressed as a number followed by a time unit suffix. The following formats are supported:
Supported Formats
Format   Unit      Example
Ns       Seconds   30s
Nm       Minutes   15m
Nh       Hours     1h
Nd       Days      1d
Examples: 1h (1 hour), 15m (15 minutes), 24h (24 hours), 30s (30 seconds).
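The window formats above can be turned into a duration on the client side, for example to schedule requests. The following is a minimal sketch, not the API's own parser: the function name parse_window is hypothetical, and it assumes N is a positive integer written without whitespace and with a lowercase suffix.

```python
import re
from datetime import timedelta

# Suffixes from the supported formats; the mapping targets timedelta keywords.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

# Assumption: N is a positive integer with no leading zero, e.g. "15m".
_WINDOW_RE = re.compile(r"([1-9][0-9]*)([smhd])")


def parse_window(window: str) -> timedelta:
    """Convert a window string such as '15m' into a timedelta."""
    match = _WINDOW_RE.fullmatch(window)
    if match is None:
        raise ValueError(f"unsupported window format: {window!r}")
    count, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(count)})
```

For instance, parse_window("15m") yields a 15-minute timedelta, and an unrecognised suffix such as "1w" raises ValueError.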
Response Headers
Every OCR response includes rate limit headers so your application can track usage in real time.
Rate Limit Headers
X-RateLimit-Limit       Maximum requests allowed in the current window
X-RateLimit-Remaining   Requests remaining in the current window
X-RateLimit-Reset       ISO 8601 timestamp when the window resets
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 2026-04-05T15:00:00.000Z
{
"success": true,
"data": {
"documentType": "sa-id-smart-front",
"surname": "NKOSI",
"names": "THABO MICHAEL",
"idNumber": "9201015800086",
...
}
}
When Rate Limited
When your token exceeds its rate limit, the API responds with HTTP 429 Too Many Requests. The response includes a Retry-After header giving the number of seconds to wait before retrying.
Error code: RATE_LIMITED
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-04-05T15:00:00.000Z
{
"success": false,
"error": {
"code": "RATE_LIMITED",
"message": "Too many requests"
}
}
Best Practices
Check remaining requests
Track the X-RateLimit-Remaining value from the most recent response. When it reaches 0, hold further requests until the time given in X-RateLimit-Reset.
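One way to act on these headers is to compute, after each response, how long the client should pause before its next request. The sketch below is illustrative only: seconds_until_allowed is a hypothetical helper, and it assumes the headers are available as a mapping whose keys are spelled exactly as documented (many HTTP libraries provide a case-insensitive mapping instead).

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def seconds_until_allowed(
    headers: Mapping[str, str], now: Optional[datetime] = None
) -> float:
    """Return how long to pause before the next request (0.0 if no pause is needed)."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is an ISO 8601 timestamp such as 2026-04-05T15:00:00.000Z.
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    reset_at = datetime.fromisoformat(
        headers["X-RateLimit-Reset"].replace("Z", "+00:00")
    )
    now = now or datetime.now(timezone.utc)
    # Clamp at zero in case the window has already reset.
    return max(0.0, (reset_at - now).total_seconds())
```

A caller could pass the result to time.sleep before sending the next request.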
Implement exponential backoff
When you receive a 429 response, wait at least as long as the Retry-After header specifies, and increase the delay on each consecutive retry (e.g. 1s, 2s, 4s, 8s).
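A retry loop along these lines might look like the following sketch. It is written against a hypothetical send callable that returns a (status code, headers, body) tuple, so that it does not depend on any particular HTTP library. The names backoff_delay and send_with_retry, the 60-second cap, and the default of five retries are all assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def backoff_delay(
    attempt: int, retry_after: Optional[str], base: float = 1.0, cap: float = 60.0
) -> float:
    """Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, ... up to `cap`."""
    delay = min(cap, base * (2 ** attempt))
    if retry_after is not None and retry_after.strip().isdigit():
        # Never retry sooner than the server asked for.
        delay = max(delay, float(retry_after))
    return delay


def send_with_retry(
    send: Callable[[], Response],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying on HTTP 429 up to `max_retries` times."""
    response = send()
    for attempt in range(max_retries):
        status, headers, _body = response
        if status != 429:
            break
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
        response = send()
    # Either a non-429 response or the last 429 after exhausting retries.
    return response
```

Passing sleep as a parameter keeps the loop easy to test; production callers can leave it at the default time.sleep.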
Cache OCR results client-side
Store extracted data locally to avoid rescanning the same document multiple times.
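As a rough sketch of such a cache, the class below keys results by the SHA-256 digest of the document bytes, so an identical file is sent to the API only once per process. OcrCache and the scan callable are hypothetical names, scan stands in for whatever function performs the OCR request, and the in-memory dictionary is an assumption (a persistent store would follow the same pattern).

```python
import hashlib
from typing import Callable, Dict


class OcrCache:
    """In-memory cache keyed by the SHA-256 digest of the document bytes."""

    def __init__(self, scan: Callable[[bytes], dict]) -> None:
        self._scan = scan  # hypothetical function that calls the OCR API
        self._results: Dict[str, dict] = {}

    def get(self, document: bytes) -> dict:
        """Return the cached result, calling `scan` only on a cache miss."""
        key = hashlib.sha256(document).hexdigest()
        if key not in self._results:
            self._results[key] = self._scan(document)
        return self._results[key]
```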
Use separate tokens per project
Each token has its own independent rate limit. Isolate projects with dedicated tokens.
Request a higher limit
If your use case requires more throughput, raise the limit on a per-token basis from the dashboard at /dashboard/tokens.