March 10, 2026 · 7 min read

How to Set Budget Limits on the OpenAI API

OpenAI gives you account-level spending caps, but nothing stops one rogue agent from burning through your entire monthly budget. Here's how to enforce per-agent, per-user spending limits with two small changes to your application code: a new base URL and one request header.

The Problem: OpenAI Has No Per-Agent Budget Enforcement

If you're building with the OpenAI API, you've probably discovered a frustrating gap: OpenAI lets you set a hard limit on your overall account spend, but it has no concept of per-agent or per-user limits. Every API key you issue has access to your full account balance.

This becomes a serious problem as soon as you have more than one agent or customer using your API key. A single runaway loop — an agent stuck retrying a failed tool call, or a customer with unusual usage — can exhaust your monthly budget before you even notice.

Consider a typical scenario: you've built a customer-facing AI assistant. You have 500 paying customers, each generating roughly 10,000 tokens per day. Your OpenAI budget is set to $500/month. But what happens when one customer's account gets compromised and someone starts hammering your API? Or when a new feature introduces an infinite loop in your agent code? Without per-client limits, the answer is: your entire $500 budget disappears overnight.
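
To put rough numbers on that scenario, here is a back-of-the-envelope sketch. The customer count, token volume, and budget come from the example above; the blended price per million tokens is a made-up parameter (not an OpenAI rate), so substitute your own.

```typescript
// Back-of-the-envelope numbers for the scenario above.
// ASSUMPTION: pricePerMillionTokensUsd is a hypothetical blended rate.
interface Scenario {
  customers: number;
  tokensPerCustomerPerDay: number;
  monthlyBudgetUsd: number;
  pricePerMillionTokensUsd: number;
}

function summarize(s: Scenario, daysInMonth = 30) {
  const tokensPerMonth = s.customers * s.tokensPerCustomerPerDay * daysInMonth;
  const expectedSpendUsd = (tokensPerMonth / 1_000_000) * s.pricePerMillionTokensUsd;
  // Each customer's even share of the account-level budget.
  const fairSharePerCustomerUsd = s.monthlyBudgetUsd / s.customers;
  // Tokens a single runaway client would need to consume the whole budget alone.
  const tokensToExhaustBudget = (s.monthlyBudgetUsd / s.pricePerMillionTokensUsd) * 1_000_000;
  return { tokensPerMonth, expectedSpendUsd, fairSharePerCustomerUsd, tokensToExhaustBudget };
}

const summary = summarize({
  customers: 500,
  tokensPerCustomerPerDay: 10_000,
  monthlyBudgetUsd: 500,
  pricePerMillionTokensUsd: 2, // hypothetical
});
```

With these inputs, each customer's even share of the budget works out to $1 per month, which is why one unbounded client can matter so much.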

OpenAI's own documentation acknowledges this gap. Their recommended workaround is to issue separate API keys per customer or agent — but that means managing hundreds of keys, tracking usage per key via the OpenAI dashboard, and writing your own enforcement logic. It's a maintenance nightmare.

The Right Solution: A Gateway That Enforces Budgets

The clean solution is to route all your LLM traffic through a proxy gateway that understands the concept of a *client identity* — and enforces per-client spend limits at the gateway layer, before requests ever reach OpenAI.

Proxide does exactly this. You keep a single Proxide API key. Each request includes an x-proxide-client-id header that identifies the agent or user making the request. Proxide tracks spend per client ID and returns a 402 Payment Required response the moment a client exceeds its configured limit.
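
To make the idea concrete, here is a minimal in-memory sketch of what "track spend per client ID and answer 402 when over the limit" could look like. It is an illustration of the concept only, not Proxide's implementation; the `BudgetGate` class and its method names are invented for this example.

```typescript
// Illustrative in-memory budget gate. NOT Proxide's implementation.
type GateResult =
  | { status: 200 }
  | { status: 402; clientId: string; spentUsd: number; limitUsd: number };

class BudgetGate {
  private readonly spent = new Map<string, number>(); // client ID -> USD spent
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  // Record the cost of a completed request for this client.
  record(clientId: string, costUsd: number): void {
    this.spent.set(clientId, (this.spent.get(clientId) ?? 0) + costUsd);
  }

  // Decide whether the client's next request may be forwarded upstream.
  check(clientId: string): GateResult {
    const spentUsd = this.spent.get(clientId) ?? 0;
    if (spentUsd > this.limitUsd) {
      return { status: 402, clientId, spentUsd, limitUsd: this.limitUsd };
    }
    return { status: 200 };
  }
}
```

A real gateway would persist these counters and reset them per billing period; the point here is only that enforcement is keyed by client ID rather than by API key.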

Setting Up Per-Agent Budget Limits

Step 1: Change Your Base URL

The first change is to point your OpenAI client at Proxide:

typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "prox-your-key-here", // Your Proxide key
  baseURL: "https://gateway.proxide.ai/openai/v1",
});

That's it. Every existing OpenAI SDK call — chat.completions.create, embeddings.create, function calling, streaming — works without any other changes.

Step 2: Identify Your Agent or User

Add the x-proxide-client-id header to each request to tell Proxide which agent or user is making the call:

typescript
const response = await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: userMessage }],
  },
  {
    headers: {
      "x-proxide-client-id": `user-${userId}`, // e.g. "user-12345"
    },
  }
);

The client ID can be any string — a user ID, an agent name, a tenant slug. Proxide will track spend independently for each unique ID.
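
If you attach the header in many places, a small helper keeps the ID format consistent. The sketch below is a hypothetical convenience (the function name and the `kind-id` naming scheme are assumptions that mirror the `user-12345` example above); only the header name comes from this post.

```typescript
// Hypothetical helper: builds per-request options carrying the client ID header.
type ClientKind = "user" | "agent" | "tenant";

function proxideRequestOptions(kind: ClientKind, id: string | number) {
  const trimmed = String(id).trim();
  if (trimmed.length === 0) {
    // An empty ID would lump unrelated callers into one budget bucket.
    throw new Error("client id must not be empty");
  }
  return { headers: { "x-proxide-client-id": `${kind}-${trimmed}` } };
}
```

The result can be passed as the second argument to an SDK call, for example `client.chat.completions.create(params, proxideRequestOptions("user", userId))`.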

Step 3: Configure Budget Limits in the Dashboard

In your Proxide dashboard, navigate to Agents → Budget Rules and set:

  • Daily limit: Maximum spend per day per client ID (e.g. $0.50/day)
  • Monthly limit: Maximum spend per calendar month (e.g. $5.00/month)
  • Action on exceed: Block requests (returns 402) or just alert
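
One way to picture how these three settings interact is the sketch below. The field names and the order in which limits are checked (daily first, then monthly) are assumptions made for illustration; the dashboard's actual evaluation logic is not documented in this post.

```typescript
// Sketch of how the three dashboard settings could combine. Names are hypothetical.
interface BudgetRule {
  dailyLimitUsd: number;
  monthlyLimitUsd: number;
  onExceed: "block" | "alert";
}

interface Spend {
  todayUsd: number;
  thisMonthUsd: number;
}

type Decision =
  | { allow: true; alert: false }
  | { allow: true; alert: true; limitType: "daily" | "monthly" }
  | { allow: false; status: 402; limitType: "daily" | "monthly" };

function evaluate(rule: BudgetRule, spend: Spend): Decision {
  // Find the first limit that has been exceeded, if any.
  const limitType =
    spend.todayUsd > rule.dailyLimitUsd ? "daily"
    : spend.thisMonthUsd > rule.monthlyLimitUsd ? "monthly"
    : null;
  if (limitType === null) return { allow: true, alert: false };
  // "block" rejects the request with 402; "alert" lets it through but flags it.
  return rule.onExceed === "block"
    ? { allow: false, status: 402, limitType }
    : { allow: true, alert: true, limitType };
}
```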

You can also set limits via the Proxide API for programmatic configuration:

bash
curl -X POST https://api.proxide.ai/v1/budgets \
  -H "Authorization: Bearer prox-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "client_id": "user-12345",
    "daily_limit_usd": 0.50,
    "monthly_limit_usd": 5.00
  }'
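
If you would rather configure limits from TypeScript than from the shell, the same request can be assembled in code. The endpoint, headers, and JSON field names below are copied from the curl example; `buildBudgetRequest` itself is a hypothetical helper, and the input validation is an extra check added here rather than documented API behavior.

```typescript
// Builds the same request as the curl example above.
// buildBudgetRequest is a hypothetical helper, not part of any SDK.
interface BudgetConfig {
  clientId: string;
  dailyLimitUsd: number;
  monthlyLimitUsd: number;
}

function buildBudgetRequest(apiKey: string, cfg: BudgetConfig) {
  // Local sanity check: reject NaN, Infinity, and negative limits before sending.
  for (const limit of [cfg.dailyLimitUsd, cfg.monthlyLimitUsd]) {
    if (!isFinite(limit) || limit < 0) {
      throw new Error("limits must be finite, non-negative numbers");
    }
  }
  return {
    url: "https://api.proxide.ai/v1/budgets",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      client_id: cfg.clientId,
      daily_limit_usd: cfg.dailyLimitUsd,
      monthly_limit_usd: cfg.monthlyLimitUsd,
    }),
  };
}
```

The returned object can be handed to any HTTP client; with the Fetch API that would be `fetch(req.url, req)`. Keeping request construction separate from sending also makes it easy to unit test.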

What Happens When a Limit Is Exceeded

When a client exceeds its budget limit, Proxide returns an HTTP 402 Payment Required response with a JSON body like this:

json
{
  "error": {
    "type": "budget_exceeded",
    "message": "Daily budget limit of $0.50 exceeded for client user-12345",
    "client_id": "user-12345",
    "limit_type": "daily",
    "limit_usd": 0.50,
    "spent_usd": 0.52
  }
}
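
In TypeScript it can help to give that error body a type and a runtime guard, so the rest of your code does not have to poke at untyped JSON. The interface below mirrors the fields in the example response; the guard and the `overageUsd` helper are illustrative additions, and `limit_type` values other than `"daily"` are an assumption.

```typescript
// Type for the "error" object shown above, plus a runtime guard.
interface BudgetExceededError {
  type: "budget_exceeded";
  message: string;
  client_id: string;
  limit_type: string; // "daily" in the example; other values are assumed to exist
  limit_usd: number;
  spent_usd: number;
}

function isBudgetExceeded(value: unknown): value is BudgetExceededError {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v["type"] === "budget_exceeded" &&
    typeof v["client_id"] === "string" &&
    typeof v["limit_usd"] === "number" &&
    typeof v["spent_usd"] === "number"
  );
}

// How far over the limit the client is, in USD (never negative).
function overageUsd(e: BudgetExceededError): number {
  return Math.max(0, e.spent_usd - e.limit_usd);
}
```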

You can handle this gracefully in your application:

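typescript
import OpenAI from "openai";

// Assumes `client` is the Proxide-configured OpenAI client from Step 1.
async function askAssistant(client: OpenAI, userId: string, userMessage: string): Promise<string> {
  try {
    const response = await client.chat.completions.create(
      { model: "gpt-4o", messages: [{ role: "user", content: userMessage }] },
      { headers: { "x-proxide-client-id": `user-${userId}` } }
    );
    return response.choices[0]?.message.content ?? "";
  } catch (error) {
    // The SDK raises APIError with the HTTP status and the parsed "error" object from the body.
    if (error instanceof OpenAI.APIError && error.status === 402) {
      const detail = error.error as { type?: string } | undefined;
      if (detail?.type === "budget_exceeded") {
        // Show the user a friendly message or prompt them to upgrade
        return "You've reached your daily AI usage limit. Upgrade your plan for more.";
      }
    }
    throw error;
  }
}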

Beyond Simple Limits: Loop Detection

Budget limits solve the spending problem, but Proxide goes further. It also detects when an agent appears to be looping — making the same or very similar requests in rapid succession. When loop detection fires, it stops the requests immediately, even before the budget limit is hit. This is especially useful for agentic systems where a bug can cause thousands of nearly-identical tool calls before the daily budget ceiling is reached.
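
For intuition, a very naive loop detector might look like the sketch below: it counts how often the same client sends the same normalized prompt within a sliding time window. This is an illustration only. Proxide's actual detection logic is not described in this post, and the class name, thresholds, and normalization rule are all assumptions.

```typescript
// Naive loop detector: flags a client that repeats the same normalized prompt
// too many times within a sliding window. A sketch, not Proxide's algorithm.
class LoopDetector {
  private readonly seen = new Map<string, number[]>(); // key -> request timestamps (ms)
  private readonly maxRepeats: number;
  private readonly windowMs: number;

  constructor(maxRepeats = 5, windowMs = 10_000) {
    this.maxRepeats = maxRepeats;
    this.windowMs = windowMs;
  }

  // Returns true when this request looks like part of a loop.
  observe(clientId: string, prompt: string, nowMs: number): boolean {
    // Collapse case and whitespace so trivially different prompts compare equal.
    const normalized = prompt.toLowerCase().replace(/\s+/g, " ").trim();
    const key = JSON.stringify([clientId, normalized]);
    // Keep only timestamps still inside the window, then add the current one.
    const recent = (this.seen.get(key) ?? []).filter((t) => nowMs - t < this.windowMs);
    recent.push(nowMs);
    this.seen.set(key, recent);
    return recent.length > this.maxRepeats;
  }
}
```

Exact-match counting like this only catches identical requests; catching "very similar" ones would need some fuzzier comparison, which is beyond this sketch.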

Per-Tenant Billing

If you're building a SaaS product on top of LLMs, you can use client IDs to track exactly how much each customer costs you. The Proxide dashboard shows per-client spend breakdowns, letting you calculate your actual per-customer AI cost and make informed pricing decisions.
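
Once you have per-client spend figures, turning them into per-customer cost and margin is simple arithmetic. The sketch below assumes you have usage records in the hypothetical shape `{ clientId, costUsd }` (for example, exported from your own logs) and a flat monthly subscription price; neither the record format nor the pricing model comes from Proxide.

```typescript
// Aggregates hypothetical usage records into per-client cost and margin.
interface UsageRecord {
  clientId: string;
  costUsd: number;
}

function costByClient(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.clientId, (totals.get(r.clientId) ?? 0) + r.costUsd);
  }
  return totals;
}

// Gross margin per client under a flat monthly price (hypothetical pricing model).
function marginByClient(records: UsageRecord[], monthlyPriceUsd: number): Map<string, number> {
  const margins = new Map<string, number>();
  costByClient(records).forEach((cost, clientId) => {
    margins.set(clientId, monthlyPriceUsd - cost);
  });
  return margins;
}
```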

Summary

OpenAI's account-level limits aren't enough for production multi-agent or multi-tenant systems. By routing through Proxide and setting per-client budget rules, you get:

  • Hard daily and monthly spend limits per agent or user
  • Automatic 402 responses when limits are exceeded — no more surprise bills
  • Loop detection to catch runaway agents before they reach the limit
  • Per-client spend analytics for cost attribution

Sign up for Proxide: the free plan includes budget enforcement for one agent. Pro ($49/month) supports unlimited agents with granular per-client limits.
