Rate Limits

WAYSCloud implements rate limits to ensure fair usage and maintain service quality. This guide covers rate limits for each service, how to handle rate limiting, and strategies to optimize your usage.

Overview

Rate limits are applied per API key and are measured in requests per time window. When you exceed a rate limit, you'll receive a 429 Too Many Requests response.

Rate Limits by Service

Storage API

| Resource | Limit | Window |
| --- | --- | --- |
| API Requests | 1000 requests | per minute |
| Burst Allowance | +100 additional requests | per minute |
| Max File Size | 50GB | per upload |
| Concurrent Connections | 100 | per API key |

Notes:

  • The burst allowance accommodates temporary traffic spikes above the base limit
  • Multipart uploads count as one request per part
  • List operations are paginated (max 1000 objects per page)
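
Because each part of a multipart upload counts as one request, the part size decides how much of the per-minute budget a large upload consumes. The sketch below estimates that cost. The 64 MiB part size in the usage note is a hypothetical value (this guide does not state allowed part sizes), and the sketch assumes only part uploads are counted, not any calls that start or complete the upload.

```python
REQUESTS_PER_MINUTE = 1000  # Storage API request limit from the table above


def multipart_request_count(file_size_bytes: int, part_size_bytes: int) -> int:
    """Number of part-upload requests needed to send one file."""
    if file_size_bytes < 0 or part_size_bytes <= 0:
        raise ValueError("file size must be >= 0 and part size must be > 0")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return max(1, -(-file_size_bytes // part_size_bytes))


def minutes_of_budget(request_count: int, limit: int = REQUESTS_PER_MINUTE) -> float:
    """Share of per-minute windows the requests would use at the full rate."""
    return request_count / limit
```

For example, `multipart_request_count(50 * 1024**3, 64 * 1024**2)` gives 800 parts, so a single upload at the maximum file size would use most of one minute's request budget if sent without pacing.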

LLM API

| Resource | Limit | Window |
| --- | --- | --- |
| API Requests | 1000 requests | per minute |
| Burst Allowance | +100 additional requests | per minute |
| Max Tokens (input + output) | Varies by model | per request |
| Concurrent Requests | 50 | per API key |

Token Limits by Model:

| Model | Max Context | Max Output Tokens |
| --- | --- | --- |
| mixtral-8x7b | 32K tokens | 4K tokens |
| qwen3-80b-instruct | 32K tokens | 4K tokens |
| qwen3-80b-thinking | 32K tokens | 4K tokens |
| qwen3-235b-thinking | 32K tokens | 4K tokens |
| deepseek-v3 | 64K tokens | 8K tokens |
| deepseek-r1 | 64K tokens | 8K tokens |
| kimi-k2 | 200K tokens | 8K tokens |
| llama-3.1-405b | 128K tokens | 8K tokens |
| qwen3-coder-480b | 32K tokens | 4K tokens |
| llamaguard-4 | 8K tokens | 1K tokens |

Notes:

  • Token limits include both input (prompt) and output (completion)
  • Streaming requests count as one request
  • Rate limits apply to all models combined
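
Since the context limit covers prompt and completion together, a long prompt reduces how many output tokens can be requested. The following is a minimal sketch of that check for a few models from the table. It assumes "K" means 1,000 tokens (the guide does not say whether it means 1,000 or 1,024), and the function name is illustrative rather than part of any WAYSCloud SDK.

```python
# (max context, max output) in tokens, copied from the table above.
# Assumption: "32K" is read as 32,000, the more conservative interpretation.
MODEL_LIMITS = {
    "mixtral-8x7b": (32_000, 4_000),
    "deepseek-v3": (64_000, 8_000),
    "deepseek-r1": (64_000, 8_000),
    "kimi-k2": (200_000, 8_000),
    "llama-3.1-405b": (128_000, 8_000),
}


def max_completion_tokens(model: str, prompt_tokens: int) -> int:
    """Largest completion size that still fits the model's context window."""
    context, max_output = MODEL_LIMITS[model]
    remaining = context - prompt_tokens
    if remaining <= 0:
        raise ValueError(f"prompt of {prompt_tokens} tokens does not fit {model}")
    # The completion is capped by both the output limit and the space left.
    return min(max_output, remaining)
```

With a 60,000-token prompt on deepseek-v3, only 4,000 tokens remain for the completion even though the model's output limit is 8K.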

Database API

| Resource | Limit | Window |
| --- | --- | --- |
| API Requests | 500 requests | per minute |
| Database Creation | 10 databases | per hour |
| Snapshot Creation | 20 snapshots | per hour |
| Max Databases | 100 | per account |
| Max Connections | 100 | per database |
| Firewall Rules | 50 | per database |

Notes:

  • Database operations are more resource-intensive, so their limits are lower
  • Snapshot operations don't count toward the general API request limit
  • Max database size: 1TB

DNS API

| Resource | Limit | Window |
| --- | --- | --- |
| Zone Creation | 10 zones | per hour |
| Record Creation | 100 records | per minute |
| Record Updates | 100 records | per minute |
| Record Deletion | 100 records | per minute |
| Zone Queries (GET) | 1000 requests | per minute |
| Record Queries (GET) | 1000 requests | per minute |
| Batch Operations | 1000 records | per request |

Notes:

  • Zone creation is rate-limited to prevent abuse
  • Batch operations count as one request regardless of record count
  • DNSSEC operations have the same limits as regular operations
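
A batch request carries at most 1000 records, so larger record sets need to be split before submission. This is a small sketch of that chunking step; the helper name is illustrative and not part of a documented client.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_RECORDS = 1000  # Batch Operations limit from the table above


def chunk_records(records: Sequence[T], size: int = MAX_BATCH_RECORDS) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield records[start:start + size]
```

Each yielded chunk can then be passed to one batch call, so 2,500 records cost three requests rather than 2,500.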

GPU API

| Resource | Limit | Window |
| --- | --- | --- |
| API Requests | 100 requests | per minute |
| Concurrent Jobs | 10 jobs | per API key |
| Job Status Checks | 1000 requests | per minute |

Job-Specific Limits:

| Job Type | Max Duration | Max Output |
| --- | --- | --- |
| Video Generation | 10 minutes | 30 seconds |
| Text-to-Speech | 5 minutes | 10MB |
| Audio Transcription | 30 minutes | 1 hour audio |
| Image Generation | 5 minutes | 4K resolution |

Notes:

  • Jobs run asynchronously; once submitted, a running job doesn't count toward the rate limit
  • Status checks have a separate, higher limit
  • Webhook deliveries don't count toward limits

Rate Limit Headers

All API responses include rate limit headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 742
X-RateLimit-Reset: 1699123456

Headers:

  • X-RateLimit-Limit - Maximum requests allowed in window
  • X-RateLimit-Remaining - Requests remaining in current window
  • X-RateLimit-Reset - Unix timestamp when limit resets
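
The three headers can be read into one small value object so the rest of a client does not repeat string parsing. The sketch below is one way to do that; the class and function names are illustrative. Note that a plain `dict` is case-sensitive, whereas the headers object returned by the `requests` library is not.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (Unix timestamp)

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets, never negative."""
        current = time.time() if now is None else now
        return max(0.0, self.reset_at - current)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed status, or None if a header is missing or malformed."""
    try:
        return RateLimitStatus(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_at=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None
```

Returning `None` instead of defaulting to zero keeps a missing header from being mistaken for an exhausted quota.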

Handling Rate Limits

429 Too Many Requests Response

When you exceed a rate limit, you'll receive:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1699123456

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded",
    "retry_after": 60,
    "details": {
      "limit": "1000 requests/minute",
      "window": "60 seconds"
    }
  }
}

Response Headers:

  • Retry-After - Seconds to wait before retrying
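
This guide shows `Retry-After` as a number of seconds. The HTTP specification also permits an HTTP date in that header, so a defensive client may want to accept both forms in case a proxy or gateway in front of the API uses the date form. A minimal sketch follows; the function name and the 60-second fallback are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 60.0) -> float:
    """Convert a Retry-After header value into a wait time in seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Delta-seconds form, as shown in this guide.
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```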

Exponential Backoff

Implement exponential backoff for 429 responses:

import time
import requests

def api_call_with_backoff(url, headers, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 200:
            return response.json()

        if response.status_code == 429:
            # Exponential backoff: 1s, 2s, 4s, 8s, 16s
            wait_time = 2 ** attempt

            # Never retry sooner than the server's Retry-After (seconds)
            retry_after = response.headers.get('Retry-After')
            if retry_after is not None:
                wait_time = max(wait_time, int(retry_after))

            print(f"Rate limited. Waiting {wait_time}s before retry {attempt + 1}/{max_retries}")
            time.sleep(wait_time)
            continue

        # Handle other errors
        response.raise_for_status()

    raise Exception("Max retries exceeded")

JavaScript/Node.js Example

async function apiCallWithBackoff(url, headers, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers });

    if (response.ok) {
      return await response.json();
    }

    if (response.status === 429) {
      // Exponential backoff: 1s, 2s, 4s, 8s, 16s
      let waitTime = Math.pow(2, attempt);

      // Never retry sooner than the server's Retry-After (seconds)
      const retryAfter = parseInt(response.headers.get('Retry-After'), 10);
      if (!Number.isNaN(retryAfter)) {
        waitTime = Math.max(waitTime, retryAfter);
      }

      console.log(`Rate limited. Waiting ${waitTime}s (attempt ${attempt + 1}/${maxRetries})`);
      await new Promise(resolve => setTimeout(resolve, waitTime * 1000));
      continue;
    }

    throw new Error(`HTTP ${response.status}: ${await response.text()}`);
  }

  throw new Error('Max retries exceeded');
}
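
When many clients are throttled at the same moment, identical backoff schedules make them all retry together. A common refinement, not described in this guide, is to add random jitter to each delay. The sketch below uses the "full jitter" variant; the function name, the 60-second cap, and the injectable random source are assumptions made for illustration and testing.

```python
import random
from typing import Optional


def backoff_delay(attempt: int,
                  base: float = 1.0,
                  cap: float = 60.0,
                  retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Pick a randomized delay for the given zero-based retry attempt."""
    if rng is None:
        rng = random.Random()
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: choose uniformly between 0 and the ceiling.
    delay = rng.uniform(0.0, ceiling)
    # Never retry sooner than the server asked for.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

The returned value can replace the fixed `2 ** attempt` wait in a retry loop; passing the parsed `Retry-After` value keeps the server's minimum wait in force.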

Optimization Strategies

1. Monitor Rate Limit Headers

Track remaining requests to avoid hitting limits:

import time
import requests

def smart_api_call(url, headers):
    response = requests.get(url, headers=headers)

    # Check remaining requests
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
    reset_time = int(response.headers.get('X-RateLimit-Reset', 0))

    if remaining < 10:
        # Close to limit - slow down
        current_time = time.time()
        wait_until_reset = reset_time - current_time

        if wait_until_reset > 0:
            # Space out remaining requests across the rest of the window
            pause = wait_until_reset / 10
            print(f"Approaching rate limit. {remaining} requests remaining.")
            print(f"Slowing down for {pause:.1f}s")
            time.sleep(pause)

    return response.json()

2. Batch Operations

Use batch endpoints when available:

# Bad - Multiple individual requests (uses 100 API calls for 100 records)
for record in records:
    api.create_dns_record(zone_id, record)

# Good - Single batch request (uses 1 API call)
api.create_dns_records_batch(zone_id, records)

Services with Batch Support:

  • DNS API: Create/update/delete multiple records
  • Storage API: Use multipart upload for large files
  • Database API: Manage multiple firewall rules

3. Cache Responses

Cache responses when data doesn't change frequently:

import time
from functools import lru_cache

@lru_cache(maxsize=128)
def get_dns_records(zone_id):
    """Cache DNS records per zone (lru_cache entries never expire on their own)"""
    return api.list_dns_records(zone_id)

# Or use time-based caching
cache = {}
CACHE_TTL = 60  # seconds

def get_cached_data(key):
    if key in cache:
        data, timestamp = cache[key]
        if time.time() - timestamp < CACHE_TTL:
            return data

    # Fetch fresh data
    data = api.get_data(key)
    cache[key] = (data, time.time())
    return data

4. Use Webhooks (GPU API)

Instead of polling job status, use webhooks:

# Bad - Polling (uses many API calls)
while True:
    status = api.get_job_status(job_id)
    if status['status'] in ['completed', 'failed']:
        break
    time.sleep(5)  # Poll every 5 seconds

# Good - Webhooks (uses 1 API call)
job = api.create_job(
    job_type='video_generation',
    webhook_url='https://myapp.com/webhook'
)
# Job status delivered to your webhook when complete

5. Distribute Load

Spread requests evenly over time:

import time

def rate_limited_loop(items, requests_per_second=10):
    """Process items with rate limiting"""
    interval = 1.0 / requests_per_second

    for item in items:
        start_time = time.time()

        # Process item
        process_item(item)

        # Wait to maintain rate
        elapsed = time.time() - start_time
        if elapsed < interval:
            time.sleep(interval - elapsed)
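
A fixed interval between requests cannot take advantage of a burst allowance. A client-side token bucket is one common way to allow short bursts while holding a steady average rate. The sketch below is an illustration only: this guide does not describe the algorithm the server uses, and the class name, parameters, and injectable clock are assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side limiter: steady refill rate plus a bounded burst."""

    def __init__(self, rate_per_minute: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_minute / 60.0   # tokens added per second
        self.capacity = float(burst)         # most tokens that can be saved up
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        gained = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + gained)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; return whether a request may be sent."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

A caller would check `try_acquire()` before each request and sleep for `wait_time()` when it returns `False`. For the Storage API figures above, `TokenBucket(rate_per_minute=1000, burst=100)` would be a starting point to tune.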

6. Use Multiple API Keys

For high-volume applications, use multiple API keys:

from itertools import cycle

import requests

# Multiple API keys
api_keys = [
    'wayscloud_storage_key1_secret',
    'wayscloud_storage_key2_secret',
    'wayscloud_storage_key3_secret'
]

# Round-robin through keys
key_cycle = cycle(api_keys)

def make_request(url):
    api_key = next(key_cycle)
    headers = {'Authorization': f'Bearer {api_key}'}
    return requests.get(url, headers=headers)

Warning: Ensure compliance with the Terms of Service when using multiple keys. Contact support for high-volume use cases.

Quota Limits

In addition to rate limits, some resources have quota limits:

Storage Quotas

| Plan | Storage Quota | Bandwidth |
| --- | --- | --- |
| Free | 10GB | 50GB/month |
| Basic | 100GB | 500GB/month |
| Pro | 1TB | 5TB/month |
| Enterprise | Custom | Custom |

Database Quotas

| Plan | Max Databases | Max DB Size | Snapshots |
| --- | --- | --- | --- |
| Free | 3 | 1GB | 5 per DB |
| Basic | 10 | 10GB | 10 per DB |
| Pro | 50 | 100GB | 30 per DB |
| Enterprise | Custom | Custom | Custom |

DNS Quotas

| Plan | Max Zones | Records per Zone | Queries/Month |
| --- | --- | --- | --- |
| Free | 3 | 100 | 1M |
| Basic | 10 | 500 | 10M |
| Pro | 50 | 2000 | 100M |
| Enterprise | Custom | Custom | Custom |

Monitoring Usage

Via Dashboard

Monitor API usage at my.wayscloud.services/usage:

  • Real-time request counts
  • Rate limit violations
  • Quota usage
  • Historical data

Via API

Check usage programmatically:

curl -X GET "https://provision.wayscloud.net/api/v1/dashboard/usage" \
  -H "Authorization: Bearer {keycloak_token}"

Response:

{
  "period": "2025-11-04",
  "services": {
    "storage": {
      "requests": 45234,
      "rate_limit_hits": 3,
      "quota_used": "45.2GB",
      "quota_limit": "100GB"
    },
    "llm": {
      "requests": 12456,
      "tokens_used": 5234567,
      "rate_limit_hits": 0
    },
    "database": {
      "requests": 3421,
      "databases_active": 5,
      "rate_limit_hits": 0
    }
  }
}
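
The usage response reports quota figures as strings such as "45.2GB", so they need parsing before they can be compared. Below is a minimal sketch that flags services nearing their quota. It relies only on the fields shown in the sample response, assumes decimal units (1TB = 1000GB), and uses an illustrative 80% threshold; the helper names are not part of the API.

```python
import re
from typing import List, Optional

# Assumption: decimal units, since this guide does not define them.
_UNIT_BYTES = {"MB": 10**6, "GB": 10**9, "TB": 10**12}
_SIZE_PATTERN = re.compile(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(MB|GB|TB)\s*")


def parse_size(text: str) -> float:
    """Convert a string like '45.2GB' into bytes."""
    match = _SIZE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * _UNIT_BYTES[match.group(2)]


def quota_fraction(service_usage: dict) -> Optional[float]:
    """Used share of the quota, or None if the service reports no quota."""
    used = service_usage.get("quota_used")
    limit = service_usage.get("quota_limit")
    if used is None or limit is None:
        return None
    return parse_size(used) / parse_size(limit)


def services_near_quota(usage: dict, threshold: float = 0.8) -> List[str]:
    """Names of services whose quota usage is at or above the threshold."""
    flagged = []
    for name, stats in usage.get("services", {}).items():
        fraction = quota_fraction(stats)
        if fraction is not None and fraction >= threshold:
            flagged.append(name)
    return flagged
```

Applied to the sample response above, storage is at roughly 45% of its quota, so nothing is flagged at the default threshold.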

Upgrading Limits

Need higher limits? Contact us:

  • Email: sales@wayscloud.no
  • Subject: Rate Limit Increase Request
  • Include:
    • Current usage patterns
    • Required limits
    • Use case description
    • Expected growth

Enterprise plans offer custom rate limits and dedicated capacity.

Best Practices Summary

  1. Monitor headers - Check X-RateLimit-Remaining
  2. Implement backoff - Use exponential backoff for 429 errors
  3. Use batch operations - Reduce API calls with batch endpoints
  4. Cache responses - Cache data that doesn't change frequently
  5. Use webhooks - Avoid polling with webhook notifications
  6. Distribute load - Spread requests evenly over time
  7. Log violations - Track and investigate rate limit hits
  8. Plan capacity - Monitor usage trends and upgrade proactively

Support

Questions about rate limits?