Rate Limits
Understanding API rate limits and how to work within them
Overview
Rate limits protect the API from abuse and ensure fair usage for all users. Moknah enforces limits on both request frequency and credit consumption.
Moknah processes one request per user at a time. Wait for the current request to complete before sending another. Concurrent requests will receive a 409 Conflict error.
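
One way to respect the one-request-at-a-time rule from a multithreaded client is to hold a lock around each call and retry briefly if a 409 still comes back. The sketch below is illustrative only: `send_one_at_a_time`, its parameters, and the one-second pause are assumptions, not part of the Moknah API.

```python
import threading
import time

# Hypothetical helper: a single lock shared by every caller in this process.
_request_lock = threading.Lock()


def send_one_at_a_time(send, max_attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call send() while holding a lock; retry if the API answers 409 Conflict.

    `send` is any zero-argument callable that returns an object with a
    `status_code` attribute (for example a requests.Response).
    """
    with _request_lock:
        for attempt in range(1, max_attempts + 1):
            response = send()
            if response.status_code != 409:
                return response
            if attempt < max_attempts:
                # A request for this user is still running somewhere else
                # (another process or machine); pause before trying again.
                sleep(wait_seconds)
        # Still 409 after the last attempt: hand it back to the caller.
        return response
```

The lock only serializes calls inside one process. If several processes share an API key, the 409 retry is what keeps them from failing outright.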
Default Limits
| Limit Type | Value | Description |
|---|---|---|
| Requests per Minute (RPM) | 60 | Maximum number of API requests per minute |
| Credits per Minute (CPM) | 10,000 | Maximum credits consumed per minute (TTS) |
| Concurrency | 1 | One request processed at a time per user |
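
A client can approximate these limits locally and pause before the server has to reject anything. The following is a minimal sketch that assumes a sliding 60-second window; the server may count differently (the `RateLimit-Reset` header suggests fixed windows), so treat it as a conservative estimate. `MinuteBudget` and its methods are hypothetical names.

```python
import time
from collections import deque


class MinuteBudget:
    """Client-side sliding-window tracker for requests and credits per minute."""

    def __init__(self, rpm=60, cpm=10_000, clock=time.monotonic):
        self.rpm = rpm
        self.cpm = cpm
        self.clock = clock
        self.events = deque()  # (timestamp, credits) for each request sent

    def _prune(self, now):
        # Forget requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()

    def wait_time(self, credits):
        """Seconds to wait before re-checking a request costing `credits`."""
        if credits > self.cpm:
            raise ValueError("request costs more credits than the per-minute limit")
        now = self.clock()
        self._prune(now)
        used = sum(c for _, c in self.events)
        if len(self.events) < self.rpm and used + credits <= self.cpm:
            return 0.0
        # Over budget: wait until the oldest recorded request leaves the window,
        # then call wait_time() again, since one expiry may not free enough room.
        return max(0.0, 60 - (now - self.events[0][0]))

    def record(self, credits):
        """Note that a request costing `credits` was just sent."""
        self.events.append((self.clock(), credits))
```

Call `wait_time()` before each request, sleep for the returned duration if it is non-zero, and call `record()` once the request has been sent.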
Rate Limit Headers
Every API response includes headers to help you track your rate limit status:
| Header | Description | Example |
|---|---|---|
| RateLimit-Limit | Your RPM limit | 60 |
| RateLimit-Remaining | Requests remaining this minute | 45 |
| RateLimit-Reset | Unix timestamp when limits reset | 1699574460 |
| Moknah-Credits-Remaining | Credits remaining this minute | 10000 |
| Moknah-Credits-Used | Credits used this minute | 5000 |
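
To make these headers easier to work with, they can be collected into a small structure. This is a sketch, not an official client: `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, and missing or malformed headers are returned as `None` rather than guessed.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitStatus:
    limit: Optional[int]
    remaining: Optional[int]
    reset_at: Optional[int]  # Unix timestamp
    credits_remaining: Optional[int]
    credits_used: Optional[int]

    def seconds_until_reset(self, now=None):
        """Seconds until the window resets, or None if the header was absent."""
        if self.reset_at is None:
            return None
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}

    def as_int(name):
        try:
            return int(lowered[name.lower()])
        except (KeyError, TypeError, ValueError):
            return None

    return RateLimitStatus(
        limit=as_int("RateLimit-Limit"),
        remaining=as_int("RateLimit-Remaining"),
        reset_at=as_int("RateLimit-Reset"),
        credits_remaining=as_int("Moknah-Credits-Remaining"),
        credits_used=as_int("Moknah-Credits-Used"),
    )
```

With the `requests` library, `parse_rate_limit_headers(response.headers)` should work directly, since `response.headers` behaves like a mapping.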
When Rate Limited
If you exceed the rate limit, you'll receive a 429 Too Many Requests response with these additional headers:
| Header | Description | Example |
|---|---|---|
| Retry-After | Seconds to wait before retrying | 45 |
| RateLimit-Reset | Unix timestamp when you can retry | 1699574460 |
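
Either header can be turned into a wait time. The sketch below prefers `Retry-After` and falls back to `RateLimit-Reset`; the helper name `retry_delay` and the 60-second default used when neither header is usable are assumptions.

```python
import time


def retry_delay(headers, now=None, default=60.0):
    """Seconds to wait after a 429, taken from Retry-After or RateLimit-Reset."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Not a number of seconds; fall through to RateLimit-Reset.

    reset = headers.get("RateLimit-Reset")
    if reset is not None:
        try:
            now = time.time() if now is None else now
            return max(0.0, float(reset) - now)
        except ValueError:
            pass

    # Neither header was usable (assumed fallback of one full minute).
    return default
```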
Credit Calculation
Credits are calculated based on text length and processing options:
| Factor | Multiplier | Description |
|---|---|---|
| Base cost | 1x | 1 credit per character |
| AI-Enhanced Normalization | 2x | Advanced Arabic processing with diacritics |
| Premium Voice | +% | Additional percentage for premium voices |
Example: A 500-character text with AI-Enhanced normalization:
500 characters × 2 (AI-Enhanced) = 1,000 credits
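
The same arithmetic can be sketched as a helper for estimating cost before sending a request. Several details here are assumptions: the function name, the use of `len()` to count characters (Moknah may count diacritics or whitespace differently), the premium surcharge being supplied by the caller as a whole-number percentage, and rounding up to a whole credit.

```python
def estimate_credits(text, ai_enhanced=False, premium_surcharge_percent=0):
    """Rough credit estimate for one TTS request.

    premium_surcharge_percent is a whole-number percentage supplied by the
    caller; the actual premium rate is not stated in this document.
    """
    base = len(text)                       # 1 credit per character
    multiplier = 2 if ai_enhanced else 1   # AI-Enhanced normalization doubles the cost
    scaled = base * multiplier * (100 + premium_surcharge_percent)
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-scaled // 100)


# The example above: 500 characters with AI-Enhanced normalization.
print(estimate_credits("a" * 500, ai_enhanced=True))  # 1000
```

Comparing the estimate against the 10,000 CPM limit before sending helps avoid a 429 on long texts.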
Best Practices
1. Implement Request Queuing
Since Moknah doesn't support concurrent requests, queue your requests and process them sequentially:

# Python
import queue
import threading


class TTSQueue:
    """Runs TTS requests one at a time on a background worker thread."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.queue = queue.Queue()
        self.worker = threading.Thread(target=self._process_queue, daemon=True)
        self.worker.start()

    def _process_queue(self):
        while True:
            # Block until a job is available, then run it to completion.
            text, voice_id, callback = self.queue.get()
            try:
                result = self._generate(text, voice_id)
                callback(result, None)
            except Exception as e:
                callback(None, e)
            finally:
                self.queue.task_done()

    def _generate(self, text, voice_id):
        # Your API call here
        pass

    def add(self, text, voice_id, callback):
        self.queue.put((text, voice_id, callback))


# Usage
tts = TTSQueue("your_api_key")
tts.add("Hello world", "voice_123", lambda r, e: print(r or e))
tts.queue.join()  # Wait for queued jobs; the daemon worker stops when the program exits

// JavaScript
class TTSQueue {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.queue = [];
    this.processing = false;
  }

  async add(text, voiceId) {
    return new Promise((resolve, reject) => {
      this.queue.push({ text, voiceId, resolve, reject });
      this.processNext();
    });
  }

  async processNext() {
    if (this.processing || this.queue.length === 0) return;
    this.processing = true;
    const { text, voiceId, resolve, reject } = this.queue.shift();
    try {
      const result = await this.generate(text, voiceId);
      resolve(result);
    } catch (error) {
      reject(error);
    } finally {
      this.processing = false;
      this.processNext(); // Process next in queue
    }
  }

  async generate(text, voiceId) {
    // Your API call here
  }
}

// Usage
const tts = new TTSQueue("your_api_key");
const audio = await tts.add("Hello world", "voice_123");

2. Monitor Rate Limit Headers
Check the response headers and slow down before hitting the limit:

# Python
import requests
import time


def generate_with_rate_limit(text, voice_id, api_key):
    response = requests.post(
        "https://moknah.io/api/v1/tts/generate",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"text": text, "voice_id": voice_id}
    )

    # Check remaining requests
    remaining_rpm = int(response.headers.get("RateLimit-Remaining", 60))
    remaining_cpm = int(response.headers.get("Moknah-Credits-Remaining", 10000))

    # Slow down if running low
    if remaining_rpm < 10:
        print(f"Warning: Only {remaining_rpm} requests remaining this minute")
        time.sleep(1)  # Add delay between requests
    if remaining_cpm < 1000:
        print(f"Warning: Only {remaining_cpm} credits remaining this minute")

    return response

// JavaScript
async function generateWithRateLimit(text, voiceId, apiKey) {
  const response = await fetch("https://moknah.io/api/v1/tts/generate", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ text, voice_id: voiceId })
  });

  // Check remaining requests (fall back to the default limits if a header is missing)
  const remainingRPM = parseInt(response.headers.get("RateLimit-Remaining") ?? "60", 10);
  const remainingCPM = parseInt(response.headers.get("Moknah-Credits-Remaining") ?? "10000", 10);

  // Slow down if running low
  if (remainingRPM < 10) {
    console.warn(`Warning: Only ${remainingRPM} requests remaining this minute`);
    await new Promise(r => setTimeout(r, 1000)); // Add delay
  }
  if (remainingCPM < 1000) {
    console.warn(`Warning: Only ${remainingCPM} credits remaining this minute`);
  }

  return response;
}

3. Implement Exponential Backoff
When rate limited, wait for the Retry-After interval (plus a little jitter) before retrying, and use exponential backoff for other errors:

# Python
import time
import random


def request_with_backoff(func, max_retries=5):
    retries = 0
    while retries < max_retries:
        try:
            response = func()
            if response.status_code == 429:
                retry_after = int(response.headers.get("Retry-After", 60))
                # Add jitter to prevent thundering herd
                wait_time = retry_after + random.uniform(0, 1)
                print(f"Rate limited. Waiting {wait_time:.1f}s...")
                time.sleep(wait_time)
                retries += 1
                continue
            return response
        except Exception as e:
            # Exponential backoff for other errors
            wait_time = (2 ** retries) + random.uniform(0, 1)
            print(f"Error: {e}. Retrying in {wait_time:.1f}s...")
            time.sleep(wait_time)
            retries += 1
    raise Exception("Max retries exceeded")

// JavaScript
async function requestWithBackoff(func, maxRetries = 5) {
  let retries = 0;
  while (retries < maxRetries) {
    try {
      const response = await func();
      if (response.status === 429) {
        const retryAfter = parseInt(response.headers.get("Retry-After") ?? "60", 10);
        // Add jitter to prevent thundering herd
        const waitTime = retryAfter + Math.random();
        console.log(`Rate limited. Waiting ${waitTime.toFixed(1)}s...`);
        await new Promise(r => setTimeout(r, waitTime * 1000));
        retries++;
        continue;
      }
      return response;
    } catch (error) {
      // Exponential backoff for other errors
      const waitTime = Math.pow(2, retries) + Math.random();
      console.log(`Error: ${error.message}. Retrying in ${waitTime.toFixed(1)}s...`);
      await new Promise(r => setTimeout(r, waitTime * 1000));
      retries++;
    }
  }
  throw new Error("Max retries exceeded");
}

Summary
Default limits: 60 RPM, 10,000 CPM, and one request at a time per user (no concurrent requests)
Need more? Email sales@moknah.io
For API-related questions or issues, contact us at api@moknah.io.