orchagent enforces rate limits to ensure fair usage across the platform.
| Tier | Daily Calls | Concurrent Requests |
|---|---|---|
| Free | 1,000 | 10 |
| Pro | 10,000 | 50 |
| Enterprise | Custom | Custom |
## Sandbox Agent Limits
Code runtime and managed loop agents have separate limits for compute time (sandbox execution):
| Tier | Daily Calls | Max Timeout | Compute Hours |
|---|---|---|---|
| Free | 50 | 30s | Included |
| Pro | 500 | 5min | Included |
| Enterprise | Custom | Custom | Custom |
Sandbox agent limits are separate from direct LLM agent limits. You can use both with their respective quotas.
Check your compute usage:
```bash
# Via CLI
orch usage --compute

# Via API
GET /usage/compute
```
## Paid Calls
Paid agent calls (where credits are charged) bypass free tier daily limits and use a separate abuse protection cap of 100,000 calls/day. This prevents paid usage from being blocked by free tier limits.
## How Limits Are Counted
### Top-Level Calls
Each call to an agent counts as 1 call against your daily limit:
```bash
orch run acme/summarizer --data '{"text": "..."}'  # +1 call
```
### Orchestrator Calls
When you call an orchestrator that calls other agents, only the top-level call counts:
```
You → security-review → leak-finder
                      → vuln-scanner
                      → license-checker

Your daily count: +1 (not +4)
```
The orchestrator’s sub-calls are handled internally and don’t count against your limit.
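The counting rule above can be illustrated with a minimal sketch (the structure and names here are hypothetical, not the platform's internal representation):

```python
# One top-level call fans out to three sub-agents. Only the
# top-level entry is billed to the caller; the orchestrator's
# internal sub-calls are not.
tree = {"security-review": ["leak-finder", "vuln-scanner", "license-checker"]}

total_calls = 1 + sum(len(subs) for subs in tree.values())  # calls executed: 4
billed = len(tree)                                          # calls billed to you: 1
```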
## Rate Limit Headers
Every response includes headers showing your limit status:
```
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1704067200
```
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Your daily limit |
| `X-RateLimit-Remaining` | Remaining calls today |
| `X-RateLimit-Reset` | Unix timestamp when the limit resets |
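As a sketch, these headers can be turned into a usable summary client-side (the helper name and return shape here are illustrative, not part of the API):

```python
from datetime import datetime, timezone

def parse_rate_limit(headers: dict) -> dict:
    """Summarize X-RateLimit-* headers: limit, remaining, calls used,
    and the reset time as an aware UTC datetime."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc)
    return {"limit": limit, "remaining": remaining, "used": limit - remaining, "reset_at": reset_at}

info = parse_rate_limit({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "995",
    "X-RateLimit-Reset": "1704067200",
})
```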
## Rate Limit Errors
When you exceed your limit:
```json
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded. Try again after 2024-01-16T00:00:00Z",
    "is_retryable": true,
    "suggested_wait_time": 3600
  },
  "metadata": {
    "request_id": "req_abc123"
  }
}
```
HTTP status: `429 Too Many Requests`
## Timeouts
### Request Timeouts
| Setting | Default | Maximum |
|---|---|---|
| Author-configured | 60s | 300s |
| Platform-enforced | — | 300s |
Authors set timeout in their manifest:
```json
{
  "timeout_seconds": 120
}
```
### Timeout Propagation
For orchestrators, timeouts propagate through the call chain:
```
remaining_time = original_deadline - elapsed_time
```
If a sub-call would exceed the remaining time, it fails fast with `TIMEOUT`.
## Composition Limits
### Max Hops
Limits how deep agent-to-agent calls can go:
```
Caller → Agent A → Agent B → Agent C
         ↑         ↑         ↑
        hop 1     hop 2     hop 3
```
Effective limit: `min(caller's max_hops, agent's max_hops)`
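In code, the effective limit and its enforcement look roughly like this (an illustrative sketch; the function names are not part of the platform API):

```python
def effective_max_hops(caller_max_hops: int, agent_max_hops: int) -> int:
    """The tighter of the caller's and the agent's limits wins."""
    return min(caller_max_hops, agent_max_hops)

def check_hop_depth(depth: int, caller_max_hops: int, agent_max_hops: int) -> None:
    """Reject a call whose hop depth exceeds the effective limit."""
    limit = effective_max_hops(caller_max_hops, agent_max_hops)
    if depth > limit:
        raise RuntimeError(f"max_hops exceeded: depth {depth} > limit {limit}")
```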
### Downstream Cap
Controls the budget passed to each downstream dependency call. This limits what each called agent can spend in further downstream calls — it does not limit the current agent’s own call count:
```json
{
  "manifest": {
    "per_call_downstream_cap": 100
  }
}
```
## Handling Rate Limits
### Check Before Calling
```python
import httpx

response = httpx.get(
    "https://api.orchagent.io/usage",
    headers={"Authorization": f"Bearer {api_key}"},
)
usage = response.json()
remaining = usage["calls_remaining"]
```
### Implement Backoff
```python
import time
import httpx

def call_with_retry(url, data, max_retries=3):
    for attempt in range(max_retries):
        response = httpx.post(url, json=data)
        if response.status_code == 429:
            wait_time = int(response.headers.get("Retry-After", 60))
            time.sleep(wait_time)
            continue
        return response
    raise Exception("Max retries exceeded")
```
### JavaScript Example
```javascript
async function callWithRetry(url, data, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(data),
    });
    if (response.status === 429) {
      const waitTime = parseInt(response.headers.get("Retry-After") || "60", 10);
      await new Promise((resolve) => setTimeout(resolve, waitTime * 1000));
      continue;
    }
    return response;
  }
  throw new Error("Max retries exceeded");
}
```
## Upgrading Limits
### Pro Plan
- 10,000 calls/day
- 50 concurrent requests
- Priority support
### Enterprise
- Custom limits
- SLA guarantees
- Dedicated support
Contact [email protected] for enterprise pricing.
## Service Limits
Always-on services have separate limits from on-demand agent runs:
| Tier | Concurrent Services | Max Instances per Service |
|---|---|---|
| Pro | 5 | 3 |
| Team | 20 | 10 |
| Enterprise | Custom | Custom |
Service compute time is metered by runtime minutes and counts toward your workspace usage. See Billing for details.
## Best Practices
- Check remaining calls before batch operations
- Implement exponential backoff for 429 responses
- Cache responses when appropriate
- Use webhooks instead of polling when available
- Monitor usage in the dashboard