Rate Limits

AG2Trust implements rate limiting to ensure fair usage and system stability.

Rate Limit Overview

Scope             Default Limit           Configurable
Organization      60 requests/minute      Yes (per plan)
Per Agent         50 requests/minute      No
Webhook delivery  1/second per customer   No

Rate Limit Headers

Every API response includes rate limit information:

HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1700308800

Header                 Description
X-RateLimit-Limit      Maximum requests per minute
X-RateLimit-Remaining  Requests left in the current window
X-RateLimit-Reset      Unix timestamp (seconds) at which the limit resets
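
As a minimal sketch of reading these headers, the helper below turns a header mapping into typed values and computes how many seconds remain in the current window. The names `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical, and the code assumes all three headers are present and hold integers, as in the example response above.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (Unix timestamp, seconds)

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds left in the current window, never negative."""
        current = time.time() if now is None else now
        return max(0.0, self.reset_at - current)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Build a RateLimitInfo from response headers (KeyError if one is missing)."""
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
    )
```

With the `requests` library, `response.headers` can be passed in directly because it is a case-insensitive mapping; a plain `dict` must use the exact header spelling shown here.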

Rate Limit Exceeded (429)

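When you exceed the rate limit, the API responds with 429 Too Many Requests. The Retry-After header gives the number of seconds to wait before retrying: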

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700308800
Retry-After: 15

{
  "error": "Rate limit exceeded",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "details": {
    "limit": 60,
    "reset_at": "2023-11-18T12:00:00Z",
    "retry_after": 15
  }
}
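
The 429 response carries the wait time in two places: the Retry-After header and `details.retry_after` in the JSON body. The sketch below shows one way to pick a wait time from either source. The function name `wait_seconds_from_429`, the fallback order, and the 5-second default are assumptions for illustration, not documented AG2Trust behavior.

```python
import json
from typing import Mapping


def wait_seconds_from_429(headers: Mapping[str, str], body_text: str,
                          default: float = 5.0) -> float:
    """Pick how long to wait after a 429 response."""
    # 1. Prefer the Retry-After header when it holds a number of seconds.
    header_value = headers.get("Retry-After")
    if header_value is not None:
        try:
            return max(0.0, float(header_value))
        except ValueError:
            pass  # Not a plain number; fall through to the body.

    # 2. Fall back to details.retry_after in the JSON body.
    try:
        details = json.loads(body_text).get("details", {})
        return max(0.0, float(details["retry_after"]))
    except (ValueError, KeyError, TypeError, AttributeError):
        # 3. Neither source was usable.
        return default
```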

Handling Rate Limits

Basic Retry Logic

import time
import requests

def send_with_retry(url, payload, headers, max_retries=3):
    for attempt in range(max_retries):
        response = requests.post(url, json=payload, headers=headers)

        if response.status_code == 429:
            retry_after = int(response.headers.get("Retry-After", 5))
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue

        return response

    raise Exception("Max retries exceeded")

async function sendWithRetry(url, payload, headers, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, {
      method: 'POST',
      headers,
      body: JSON.stringify(payload)
    });

    if (response.status === 429) {
      const retryAfter = parseInt(
        response.headers.get('Retry-After') || '5',
        10
      );
      console.log(`Rate limited. Waiting ${retryAfter}s...`);
      await new Promise(r => setTimeout(r, retryAfter * 1000));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

Exponential Backoff

import time
import random
import requests

def exponential_backoff(attempt, base=1, max_delay=60):
    """Calculate delay with jitter."""
    delay = min(base * (2 ** attempt), max_delay)
    jitter = random.uniform(0, delay * 0.1)
    return delay + jitter

def send_with_backoff(url, payload, headers, max_retries=5):
    for attempt in range(max_retries):
        response = requests.post(url, json=payload, headers=headers)

        if response.status_code == 429:
            delay = exponential_backoff(attempt)
            print(f"Rate limited. Backing off {delay:.1f}s...")
            time.sleep(delay)
            continue

        return response

    raise Exception("Max retries exceeded")
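
The `send_with_backoff` function above computes its delay from the attempt number alone and does not look at the Retry-After header. A possible refinement, sketched here under that assumption, treats the server's value as a lower bound on the computed delay. The helper name `backoff_delay` and its parameters are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, max_delay: float = 60.0) -> float:
    """Exponential delay with up to 10% jitter, never shorter than retry_after."""
    delay = min(base * (2 ** attempt), max_delay)
    delay += random.uniform(0, delay * 0.1)
    if retry_after is not None:
        # The server's hint is a lower bound: retrying sooner would
        # most likely produce another 429.
        delay = max(delay, retry_after)
    return delay
```

Inside `send_with_backoff`, this would replace the `exponential_backoff(attempt)` call, with `retry_after` parsed from the response's Retry-After header when it is present.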

Using Libraries

import requests
from tenacity import (
    retry,
    retry_if_result,
    wait_exponential,
    stop_after_attempt
)

def is_rate_limited(response):
    return response.status_code == 429

@retry(
    retry=retry_if_result(is_rate_limited),
    wait=wait_exponential(multiplier=1, max=60),
    stop=stop_after_attempt(5)
)
def send_message(url, payload, headers):
    return requests.post(url, json=payload, headers=headers)

import axios from 'axios';
import axiosRetry from 'axios-retry';

const client = axios.create({
  baseURL: 'https://agents.ag2trust.com'
});

axiosRetry(client, {
  retries: 3,
  retryCondition: (error) =>
    error.response?.status === 429,
  retryDelay: (retryCount, error) => {
    // Retry-After arrives as a string; convert it before doing arithmetic.
    const retryAfter = Number(error.response?.headers['retry-after']);
    return retryAfter > 0 ? retryAfter * 1000 : retryCount * 1000;
  }
});

Proactive Rate Limit Management

Monitor Remaining Requests

import time
import requests

class RateLimitAwareClient:
    def __init__(self, api_key):
        self.api_key = api_key
        self.remaining = None
        self.reset_at = None

    def send(self, endpoint, payload):
        # Check if we should wait
        if self.remaining is not None and self.remaining < 5:
            wait_time = max(0, self.reset_at - time.time())
            if wait_time > 0:
                print(f"Approaching limit, waiting {wait_time:.0f}s")
                time.sleep(wait_time)

        response = requests.post(
            f"https://agents.ag2trust.com{endpoint}",
            json=payload,
            headers={"X-API-Key": self.api_key}
        )

        # Update rate limit info
        self.remaining = int(
            response.headers.get("X-RateLimit-Remaining", 60)
        )
        self.reset_at = int(
            response.headers.get("X-RateLimit-Reset", 0)
        )

        return response

Request Queuing

import asyncio
import time
from collections import deque

class RequestQueue:
    def __init__(self, rate_limit=60):
        self.rate_limit = rate_limit
        self.queue = deque()
        self.tokens = rate_limit
        self.last_refill = time.time()

    async def execute(self, func, *args, **kwargs):
        await self._acquire_token()
        return await func(*args, **kwargs)

    async def _acquire_token(self):
        while True:
            self._refill_tokens()
            if self.tokens > 0:
                self.tokens -= 1
                return
            await asyncio.sleep(0.1)

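    def _refill_tokens(self):
        now = time.time()
        elapsed = now - self.last_refill
        per_second = self.rate_limit / 60
        refill = int(elapsed * per_second)
        if refill > 0:
            self.tokens = min(self.rate_limit, self.tokens + refill)
            # Advance only by the time the whole tokens account for, so
            # fractional progress carries over to the next call.
            self.last_refill += refill / per_second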

Rate Limits by Endpoint

Endpoint                         Limit     Notes
POST /api/v1/ask/*               60/min    Per organization
POST /api/v1/agents/*/messages   50/min    Per agent
GET /api/v1/agents               120/min   Higher for read operations
GET /api/v1/usage                120/min   Higher for read operations
PUT /api/v1/webhook              10/min    Configuration endpoint
POST /api/v1/webhook/test        10/min    Testing endpoint
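
A client that throttles itself needs to know which limit applies to a given request. The sketch below encodes the table above as data and looks up the limit for a method and path. The name `limit_for` is hypothetical, and treating `*` as a shell-style wildcard that matches any run of characters is an assumption about how the patterns are meant to be read.

```python
from fnmatch import fnmatchcase
from typing import Optional

# (method, path pattern, requests per minute), copied from the table above.
ENDPOINT_LIMITS = [
    ("POST", "/api/v1/ask/*", 60),
    ("POST", "/api/v1/agents/*/messages", 50),
    ("GET", "/api/v1/agents", 120),
    ("GET", "/api/v1/usage", 120),
    ("PUT", "/api/v1/webhook", 10),
    ("POST", "/api/v1/webhook/test", 10),
]


def limit_for(method: str, path: str) -> Optional[int]:
    """Return the documented per-minute limit for a request, or None if unlisted."""
    for limit_method, pattern, per_minute in ENDPOINT_LIMITS:
        if method.upper() == limit_method and fnmatchcase(path, pattern):
            return per_minute
    return None
```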

Agent-Level Rate Limits

Individual agents have their own internal rate limits:

Operation       Limit      Purpose
Tool calls      5/minute   Prevent runaway tools
HTTP requests   3/minute   Limit external calls
Web search      3/minute   API cost control
Git push        10/hour    Prevent spam commits

These limits are per-agent and cannot be changed via API.

Enterprise Rate Limits

Higher organization-level limits are available on the Professional and Enterprise plans:

Plan           Rate Limit
Starter        60/min
Professional   300/min
Enterprise     Custom

Contact sales for Enterprise pricing.
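
To relate a plan's per-minute limit to request pacing, the following sketch computes the smallest gap between evenly spaced requests that stays within the limit. The names `PLAN_LIMITS_PER_MINUTE` and `min_interval_seconds` are hypothetical. Enterprise is left out because its limit is custom.

```python
# Per-minute limits from the plan table; Enterprise is custom, so it is omitted.
PLAN_LIMITS_PER_MINUTE = {"Starter": 60, "Professional": 300}


def min_interval_seconds(requests_per_minute: int) -> float:
    """Smallest gap between evenly spaced requests that stays within the limit."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60.0 / requests_per_minute
```

For example, `min_interval_seconds(PLAN_LIMITS_PER_MINUTE["Starter"])` gives one request per second, and the Professional limit allows one request every 0.2 seconds.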

Best Practices

1. Implement Retry Logic

Always handle 429 responses gracefully:

# Don't: Crash on rate limit
response = requests.post(url, json=payload)
response.raise_for_status()

# Do: Handle gracefully (handle_rate_limit stands in for your own retry logic)
response = requests.post(url, json=payload)
if response.status_code == 429:
    handle_rate_limit(response)

2. Use Backoff with Jitter

Prevent thundering herd:

delay = base_delay * (2 ** attempt)
jitter = random.uniform(0, delay * 0.1)
time.sleep(delay + jitter)

3. Monitor Rate Limit Headers

Track your usage proactively:

remaining = response.headers.get("X-RateLimit-Remaining")
# Guard against a missing header before converting to int.
if remaining is not None and int(remaining) < 10:
    alert_operations_team()

4. Batch When Possible

Reduce request count by batching operations where supported.

5. Use Async for Long Tasks

Instead of polling, use webhooks for async operations.

Debugging Rate Limits

Check Current Usage

curl https://agents.ag2trust.com/api/v1/usage \
  -H "X-API-Key: cust_your_api_key"

Common Issues

Issue           Cause                            Solution
Constant 429s   Too many requests                Implement queuing
Burst 429s      Sudden traffic spike             Add rate limiting client-side
Slow reset      Multiple clients sharing limit   Coordinate requests
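
For the "add rate limiting client-side" and "coordinate requests" solutions, one option inside a single process is a limiter object that every thread shares. The sketch below uses a sliding window of request timestamps. The class name `SharedRateLimiter` and its parameters are hypothetical, and the sliding-window approach is a choice made for this example, not a description of how AG2Trust counts requests.

```python
import threading
import time
from collections import deque


class SharedRateLimiter:
    """Sliding-window limiter that several threads can share."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()           # timestamps of recent requests
        self._lock = threading.Lock()

    def try_acquire(self):
        """Return 0.0 and record a request if allowed, else seconds to wait."""
        with self._lock:
            now = self._clock()
            # Forget requests that have left the window.
            while self._sent and now - self._sent[0] >= self.window_seconds:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return 0.0
            return self.window_seconds - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free."""
        while True:
            wait = self.try_acquire()
            if wait == 0.0:
                return
            time.sleep(wait)
```

Each worker would call `limiter.acquire()` before sending a request. Clients running in separate processes or on separate machines cannot share this object and would need an external coordination point instead.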

Next Steps