orchagent enforces rate limits to ensure fair usage across the platform.

Platform Limits

| Tier | Daily Calls | Concurrent Requests |
| --- | --- | --- |
| Free | 1,000 | 10 |
| Pro | 10,000 | 50 |
| Enterprise | Custom | Custom |

Sandbox Agent Limits

Code runtime and managed loop agents have separate limits for compute time (sandbox execution):
| Tier | Daily Calls | Max Timeout | Compute Hours |
| --- | --- | --- | --- |
| Free | 50 | 30s | Included |
| Pro | 500 | 5min | Included |
| Enterprise | Custom | Custom | Custom |
Sandbox agent limits are separate from direct LLM agent limits. You can use both with their respective quotas.
Check your compute usage:
# Via CLI
orch usage --compute

# Via API
GET /usage/compute
Paid agent calls (where credits are charged) bypass free tier daily limits and use a separate abuse protection cap of 100,000 calls/day. This prevents paid usage from being blocked by free tier limits.

How Limits Are Counted

Top-Level Calls

Each call to an agent counts as 1 call against your daily limit:
orch run acme/summarizer --data '{"text": "..."}'  # +1 call

Orchestrator Calls

When you call an orchestrator that calls other agents, only the top-level call counts:
You → security-review → leak-finder
                     → vuln-scanner
                     → license-checker
Your daily count: +1 (not +4)
The orchestrator’s sub-calls are handled internally and don’t count against your limit.
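The counting rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not platform code: the `Call` class and both helper functions are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """Hypothetical model of one agent invocation and the sub-calls it makes."""
    agent: str
    sub_calls: list["Call"] = field(default_factory=list)


def total_invocations(call: Call) -> int:
    """Count every agent invocation in the tree, sub-calls included."""
    return 1 + sum(total_invocations(sub) for sub in call.sub_calls)


def counted_calls(top_level_calls: list[Call]) -> int:
    """Only top-level calls count against the daily limit."""
    return len(top_level_calls)


review = Call(
    "security-review",
    [Call("leak-finder"), Call("vuln-scanner"), Call("license-checker")],
)

print(total_invocations(review))  # 4 agents ran in total
print(counted_calls([review]))    # 1 call counted against the daily limit
```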

Rate Limit Headers

Every response includes headers showing your limit status:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1704067200
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Your daily limit |
| X-RateLimit-Remaining | Remaining calls today |
| X-RateLimit-Reset | Unix timestamp when limit resets |
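As a rough sketch, these headers could be turned into a small status summary on the client side. The function name `parse_rate_limit` and the derived `used` field are assumptions made for this example; only the three header names come from the table above.

```python
from datetime import datetime, timezone


def parse_rate_limit(headers):
    """Summarise the X-RateLimit-* headers of a response (hypothetical helper)."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # The reset value is a Unix timestamp, so convert it to an aware UTC datetime.
    reset_at = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc)
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,  # derived here, not sent by the server
        "reset_at": reset_at,
    }


status = parse_rate_limit({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "995",
    "X-RateLimit-Reset": "1704067200",
})
print(status["used"])                  # 5
print(status["reset_at"].isoformat())  # 2024-01-01T00:00:00+00:00
```

A plain dict is case-sensitive; header objects from HTTP clients such as httpx are case-insensitive and can be passed in the same way.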

Rate Limit Errors

When you exceed your limit:
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Try again after 2024-01-16T00:00:00Z",
    "is_retryable": true,
    "suggested_wait_time": 3600
  },
  "metadata": {
    "request_id": "req_abc123"
  }
}
HTTP Status: 429 Too Many Requests
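A client might read this error body to decide how long to pause before retrying. The sketch below assumes that `suggested_wait_time` is expressed in seconds, which the example response does not state explicitly; `retry_delay` and its `default` parameter are hypothetical.

```python
import json


def retry_delay(status_code, body, default=60):
    """Return seconds to wait before retrying, or None if a retry makes no sense.

    Assumes suggested_wait_time is in seconds (not confirmed by the docs).
    """
    if status_code != 429:
        return None
    error = json.loads(body).get("error", {})
    if error.get("code") != "RATE_LIMITED" or not error.get("is_retryable", False):
        return None
    return error.get("suggested_wait_time", default)


body = json.dumps({
    "error": {
        "code": "RATE_LIMITED",
        "message": "Rate limit exceeded. Try again after 2024-01-16T00:00:00Z",
        "is_retryable": True,
        "suggested_wait_time": 3600,
    },
    "metadata": {"request_id": "req_abc123"},
})
print(retry_delay(429, body))  # 3600
```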

Timeouts

Request Timeouts

| Setting | Default | Maximum |
| --- | --- | --- |
| Author-configured | 60s | 300s |
| Platform-enforced | - | 300s |

Authors set the timeout in their manifest:
{
  "timeout_seconds": 120
}

Timeout Propagation

For orchestrators, timeouts propagate through the call chain:
remaining_time = original_deadline - elapsed_time
If a sub-call would exceed the remaining time, it fails fast with TIMEOUT.
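The propagation rule can be illustrated with a short sketch. It assumes that "would exceed the remaining time" means the sub-call's own configured timeout is larger than what is left; the function and exception names are hypothetical, and all values are in seconds.

```python
class SubCallTimeout(Exception):
    """Raised when a sub-call cannot fit in the remaining time budget."""


def remaining_time(original_deadline, elapsed_time):
    """remaining_time = original_deadline - elapsed_time"""
    return original_deadline - elapsed_time


def budget_for_sub_call(original_deadline, elapsed_time, sub_call_timeout):
    """Return the remaining budget, or fail fast if the sub-call cannot fit."""
    remaining = remaining_time(original_deadline, elapsed_time)
    if sub_call_timeout > remaining:
        raise SubCallTimeout(
            f"TIMEOUT: sub-call needs {sub_call_timeout}s but only {remaining}s remain"
        )
    return remaining


# 120s deadline, 45s already spent: a 60s sub-call still fits.
print(budget_for_sub_call(120, 45, 60))  # 75

# 90s already spent: the same sub-call is rejected before it starts.
try:
    budget_for_sub_call(120, 90, 60)
except SubCallTimeout as exc:
    print(exc)
```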

Composition Limits

Max Hops

Limits how deep agent-to-agent calls can go:
Caller → Agent A → Agent B → Agent C
              ↑         ↑         ↑
            hop 1     hop 2     hop 3
Effective limit: min(caller's max_hops, agent's max_hops)
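A minimal sketch of the effective-limit rule follows. How the platform numbers hops internally is not documented here, so the check `hop <= limit` is an assumption, as are the function names.

```python
def effective_max_hops(caller_max_hops, agent_max_hops):
    """The stricter of the two limits applies."""
    return min(caller_max_hops, agent_max_hops)


def hop_allowed(hop, caller_max_hops, agent_max_hops):
    """Assumed check: a hop is allowed while its number is within the effective limit."""
    return hop <= effective_max_hops(caller_max_hops, agent_max_hops)


# The caller allows 5 hops but the agent allows only 2, so the limit is 2.
print(effective_max_hops(5, 2))  # 2
print(hop_allowed(2, 5, 2))      # True
print(hop_allowed(3, 5, 2))      # False
```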

Downstream Cap

Controls the budget passed to each downstream dependency call. This limits what each called agent can spend on further downstream calls; it does not limit the current agent’s own call count:
{
  "manifest": {
    "per_call_downstream_cap": 100
  }
}

Handling Rate Limits

Check Before Calling

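import os

import httpx

# Read the API key from the environment (the variable name is only an example).
api_key = os.environ["ORCHAGENT_API_KEY"]

response = httpx.get(
    "https://api.orchagent.io/usage",
    headers={"Authorization": f"Bearer {api_key}"},
)
response.raise_for_status()  # surface auth or server errors before reading the body
usage = response.json()
remaining = usage["calls_remaining"]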
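Once the remaining call count is known, a batch job could be trimmed to fit it. The helper below is a hypothetical sketch: `split_batch` and its `reserve` parameter are not part of any orchagent API, and the count passed in stands for the `calls_remaining` value read from the usage response.

```python
def split_batch(items, calls_remaining, reserve=0):
    """Split a batch into items to run now and items to defer.

    `reserve` keeps some calls back for other work (an illustrative option).
    """
    usable = max(calls_remaining - reserve, 0)
    return items[:usable], items[usable:]


documents = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]
run_now, deferred = split_batch(documents, calls_remaining=3)
print(run_now)   # ['a.txt', 'b.txt', 'c.txt']
print(deferred)  # ['d.txt', 'e.txt']
```

Deferred items can then be scheduled for after the time given by the X-RateLimit-Reset header.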

Implement Backoff

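import time
import httpx

def call_with_retry(url, data, max_retries=3):
    """POST with retries on 429, backing off exponentially when no hint is given."""
    for attempt in range(max_retries):
        response = httpx.post(url, json=data)

        if response.status_code != 429:
            return response

        if attempt == max_retries - 1:
            break  # last attempt: give up without sleeping first

        # Prefer a numeric Retry-After header if one is present;
        # otherwise back off exponentially (1s, 2s, 4s, ...).
        retry_after = response.headers.get("Retry-After", "")
        wait_time = int(retry_after) if retry_after.isdigit() else 2 ** attempt
        time.sleep(wait_time)

    raise Exception("Max retries exceeded")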

JavaScript Example

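async function callWithRetry(url, data, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(data),
    });

    if (response.status !== 429) {
      return response;
    }

    if (attempt === maxRetries - 1) {
      break; // last attempt: give up without waiting first
    }

    // Prefer a numeric Retry-After header if one is present;
    // otherwise back off exponentially (1s, 2s, 4s, ...).
    const retryAfter = Number.parseInt(response.headers.get("Retry-After") ?? "", 10);
    const waitTime = Number.isNaN(retryAfter) ? 2 ** attempt : retryAfter;
    await new Promise((resolve) => setTimeout(resolve, waitTime * 1000));
  }

  throw new Error("Max retries exceeded");
}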

Upgrading Limits

Pro Plan

  • 10,000 calls/day
  • 50 concurrent requests
  • Priority support

Enterprise

  • Custom limits
  • SLA guarantees
  • Dedicated support
Contact [email protected] for enterprise pricing.

Service Limits

Always-on services have separate limits from on-demand agent runs:
| Tier | Concurrent Services | Max Instances per Service |
| --- | --- | --- |
| Pro | 5 | 3 |
| Team | 20 | 10 |
| Enterprise | Custom | Custom |
Service compute time is metered by runtime minutes and counts toward your workspace usage. See Billing for details.

Best Practices

  1. Check remaining calls before batch operations
  2. Implement exponential backoff for 429 responses
  3. Cache responses when appropriate
  4. Use webhooks instead of polling when available
  5. Monitor usage in the dashboard