05 – Spend Caps
Without spend caps, a single bug or runaway loop can drain a user's entire credit balance in seconds. Spend caps act as safety valves — they limit how many credits a user can consume in a given period, protecting both the user and the platform operator from unexpected costs.
ducto supports three cap behaviors: deny (hard block — the operation is rejected when the cap is exceeded), warn (soft alert — the operation proceeds but the overage is flagged), and notify (passive monitoring — the operation proceeds and a notification event is triggered). The deny action is the most common choice for production deployments, while warn and notify are useful for gradual rollouts or monitoring-only scenarios.
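The three actions can be pictured as a small decision function. The sketch below is not ducto's implementation: `CapDecision` and `evaluate_cap` are hypothetical names, and it assumes a cap counts as exceeded only when cumulative spend would go strictly above the limit.

```python
from dataclasses import dataclass


@dataclass
class CapDecision:
    allowed: bool  # whether the deduction may proceed
    flagged: bool  # whether the overage is recorded as a warning
    notify: bool   # whether a notification event should be emitted


def evaluate_cap(current_spend: int, amount: int, limit: int, action: str) -> CapDecision:
    """Hypothetical decision logic for deny / warn / notify caps."""
    # Assumption: "exceeded" means strictly greater than the limit.
    over = current_spend + amount > limit
    if not over:
        return CapDecision(allowed=True, flagged=False, notify=False)
    if action == "deny":
        return CapDecision(allowed=False, flagged=False, notify=False)
    if action == "warn":
        return CapDecision(allowed=True, flagged=True, notify=False)
    if action == "notify":
        return CapDecision(allowed=True, flagged=False, notify=True)
    raise ValueError(f"unknown cap action: {action!r}")


# 3,000 already spent plus 3,000 requested against a 5,000 deny cap is blocked
print(evaluate_cap(3_000, 3_000, 5_000, "deny").allowed)  # False
```

Only deny changes whether the operation proceeds; warn and notify differ in how the overage is surfaced.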
Caps can be configured per user and per period type. ducto supports daily caps, which reset every calendar day, and monthly caps, which reset every calendar month. You can set different limits for different users — for example, a 5,000-credit daily cap for free-tier users and a 50,000-credit daily cap for enterprise customers.
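To make the reset periods concrete, here is a minimal sketch of how the start of the current daily or monthly period could be computed. `period_start` and `TIER_DAILY_LIMITS` are hypothetical names, and the sketch assumes calendar boundaries are taken in the timezone of the timestamp passed in (UTC in the example); the text above does not say which timezone ducto uses.

```python
from datetime import datetime, timezone

# Per-tier daily limits from the example above; the dict itself is hypothetical.
TIER_DAILY_LIMITS = {"free": 5_000, "enterprise": 50_000}


def period_start(now: datetime, cap_type: str) -> datetime:
    """Return the start of the cap period that contains `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if cap_type == "daily":
        return midnight
    if cap_type == "monthly":
        return midnight.replace(day=1)
    raise ValueError(f"unknown cap type: {cap_type!r}")


now = datetime(2024, 3, 15, 13, 45, tzinfo=timezone.utc)
print(period_start(now, "daily"))    # 2024-03-15 00:00:00+00:00
print(period_start(now, "monthly"))  # 2024-03-01 00:00:00+00:00
```

Spend recorded at or after the returned timestamp would count toward the current period; anything earlier belongs to a period that has already reset.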
In this notebook we will use the MemoryStore implementation (since PostgresStore does not ship a built-in set_spend_cap method) and walk through each cap action type. You will see what happens when a deduction stays under the cap, when it exceeds a deny cap, and when it exceeds a warn cap. By the end of this notebook you will understand how to configure and enforce spend controls in your own ducto deployment.
Setup
As noted above, this notebook uses MemoryStore instead of PostgresStore because spend cap management is a store-specific feature. The PostgresStore backend does not include a set_spend_cap implementation: the database schema for cap tracking varies between deployments and is left to the implementer. MemoryStore ships with full cap support out of the box, which makes it a convenient choice for learning and prototyping.
The setup cell below imports the SpendCap model (which defines cap configurations), creates a fresh MemoryStore instance, and calls store.setup() to initialize the store's internal tables. All of our spend cap examples will run against this in-memory store.
import uuid
from datetime import datetime, timedelta
from ducto.interface.memory import MemoryStore
from ducto.manager import CreditManager
from ducto.engine import PricingEngine
from ducto.metrics import UsageMetrics, ToolCall
from ducto.interface.models import (
    PricingConfigData, PlanDefinition,
    CreditMetadata, SpendCap,
)
store = MemoryStore()
store.setup()
print("✔ MemoryStore ready.")
Set a daily deny cap of 5,000
A deny cap is the strictest form of spend control: if a deduction would push the user's cumulative spend for the period past the limit, the attempt is rejected with an error. No credits leave the account, and the calling application must handle the rejection gracefully.
In this section we create a user, seed their account with 50,000 credits, and then configure a daily deny cap of 5,000 credits. The SpendCap model takes four parameters: the user identifier, the cap type ("daily" or "monthly"), the credit limit (in whole credits), and the action to take when the limit is reached ("deny", "warn", or "notify"). This sets up the conditions for the next two sections, where we test the cap from both sides: under and over.
# Create a new user for this spend cap demonstration
user = str(uuid.uuid4())
# Seed the user with a generous starting balance of 50,000 credits
store.add_credits(user, 50_000, type="seed")
print(f" Initial balance: {store.get_balance(user).balance}")
# Define a daily deny-cap: maximum 5,000 credits consumed per day, hard block
cap = SpendCap(user_id=user, cap_type="daily", limit=5_000, action="deny")
# Register the cap with the store
store.set_spend_cap(cap)
print(f" Cap registered: type=daily, limit={cap.limit}, action={cap.action}")
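The four fields described above can be mirrored in a stand-in dataclass to show the kind of validation a cap configuration might need. This is only a sketch: `CapConfig` is a hypothetical class, not ducto's SpendCap, and the rule that the limit must be a positive integer is an assumption.

```python
from dataclasses import dataclass

VALID_CAP_TYPES = {"daily", "monthly"}
VALID_ACTIONS = {"deny", "warn", "notify"}


@dataclass(frozen=True)
class CapConfig:
    """Stand-in for illustration only; not ducto's SpendCap model."""
    user_id: str
    cap_type: str
    limit: int
    action: str

    def __post_init__(self):
        if self.cap_type not in VALID_CAP_TYPES:
            raise ValueError(f"cap_type must be one of {sorted(VALID_CAP_TYPES)}")
        if self.action not in VALID_ACTIONS:
            raise ValueError(f"action must be one of {sorted(VALID_ACTIONS)}")
        # Assumption: limits are whole, positive credit amounts.
        if not isinstance(self.limit, int) or self.limit <= 0:
            raise ValueError("limit must be a positive integer")


cfg = CapConfig(user_id="demo-user", cap_type="daily", limit=5_000, action="deny")
print(cfg.limit, cfg.action)  # 5000 deny
```

Validating at construction time means a typo such as "dialy" fails immediately rather than producing a cap that never matches a period.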
Deduct under cap (succeeds)
With the 5,000-credit daily deny cap in place, we first test a deduction that stays under the limit. A 3,000-credit charge is well within the 5,000-credit cap, so the reserve-and-deduct sequence should complete successfully.
After the deduction, we call check_spend_cap() to inspect the user's current spend tracking. This method returns a CapCheckResult object that tells us whether the user is currently capped, how much they have spent in the current period, and what their limit is. After a 3,000-credit deduction against a 5,000-credit cap, the current spend should be 3,000 and the user should not be flagged as capped.
# Reserve 3,000 credits — this is well under the 5,000 daily cap
res = store.reserve_credits(user, 3_000, operation_type="inference")
# Deduct the reserved credits — this should succeed since 3,000 is less than 5,000
ded = store.deduct_credits(user, res.reservation_id, 3_000)
print(f" Deduction succeeded: balance after = {ded.balance_after}")
# Check the user's current spend against their configured cap
check = store.check_spend_cap(user)
print(f" Cap status: capped={check.capped}, current spend={check.current_spend}")
Exceed cap (denied)
Now we attempt a second deduction that would push the user over the 5,000-credit daily limit. The user has already spent 3,000 credits, and a further 3,000-credit deduction would bring the total to 6,000 — exceeding the cap by 1,000.
With the deny action, the store rejects the deduction and returns an error message explaining why. The credits stay in the user's account and the reservation is released. This is the expected behavior for a hard cap: the application receives the error and can decide how to respond — perhaps by showing the user an upgrade prompt, logging the event for admin review, or retrying with a smaller operation.
This safety net prevents runaway costs from a single misconfigured loop or a compromised API key. Without it, the same 3,000-credit deduction would succeed and leave the platform operator to detect the overage retroactively. In this section we attempt the over-cap deduction and observe the deny behavior in action.
# Attempt a second deduction of 3,000 credits — more than the 2,000 credits still available under the cap
res2 = store.reserve_credits(user, 3_000, operation_type="inference")
# Try to deduct — the store will reject this because 3,000 already spent + 3,000 new > 5,000 cap
ded2 = store.deduct_credits(user, res2.reservation_id, 3_000)
# Check the result: with a deny cap, the error field contains a descriptive message
if ded2.error:
    print(f" Deduction denied: {ded2.error}")
else:
    print(f" Deduction allowed: balance after = {ded2.balance_after}")
print(" Explanation: daily cap = 5,000, already spent 3,000 today, so another 3,000 would exceed it")
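How an application reacts to a denial is a policy choice. One possible policy, sketched below, is to retry with whatever headroom remains under the cap and fall back to an upgrade prompt when nothing useful is left. `respond_to_denial` and its `min_useful` threshold are hypothetical and not part of ducto.

```python
def respond_to_denial(requested: int, current_spend: int, limit: int, min_useful: int = 1):
    """Pick a follow-up after a denied deduction (hypothetical policy)."""
    # Credits that can still be spent this period without exceeding the cap
    headroom = max(limit - current_spend, 0)
    if headroom >= min_useful:
        # Retry with the largest amount that still fits under the cap
        return ("retry_smaller", min(requested, headroom))
    # Nothing useful fits: surface an upgrade prompt instead
    return ("upgrade_prompt", 0)


print(respond_to_denial(3_000, 3_000, 5_000))  # ('retry_smaller', 2000)
print(respond_to_denial(3_000, 5_000, 5_000))  # ('upgrade_prompt', 0)
```

With the numbers from this section (3,000 spent against a 5,000 cap), the policy would suggest retrying with at most 2,000 credits.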
Cap with warn action
Not every cap breach needs to be a hard block. Sometimes you want to allow the operation but flag it for attention. The warn action does exactly that: the deduction proceeds normally, but the system records a warning that the user has exceeded their configured limit.
This is useful for gradual rollouts of spend controls. You might set a warn cap first to observe how many users would be affected, then switch to deny once you are confident in the limits. It is also appropriate for internal tools or trusted users where you want visibility into spending without disrupting their workflow.
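One way to judge a candidate limit during such a rollout is to count how many users would have been blocked had the cap been set to deny. The sketch below assumes you already have per-user daily spend totals collected during the warn phase; `users_over_limit` is a hypothetical helper and the sample figures are made up for illustration.

```python
def users_over_limit(daily_spend: dict, limit: int) -> list:
    """Return the users whose recorded daily spend exceeds `limit`, sorted by name."""
    return sorted(user for user, spent in daily_spend.items() if spent > limit)


# Made-up sample totals, for illustration only
sample = {"alice": 300, "bob": 750, "carol": 1_200}
print(users_over_limit(sample, 500))    # ['bob', 'carol']
print(users_over_limit(sample, 1_000))  # ['carol']
```

Comparing the result across a few candidate limits gives a rough sense of how disruptive a switch from warn to deny would be.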
In this section we create a second user with a much lower daily cap of 500 credits and the warn action. We then attempt a 1,000-credit deduction and observe that it succeeds even though it exceeds the cap. The check_spend_cap() response confirms the overage but does not block the operation.
# Create a second user for the warn cap demonstration
user2 = str(uuid.uuid4())
# Seed the user with 50,000 credits, same as the first user
store.add_credits(user2, 50_000, type="seed")
# Set a very low daily cap of 500 credits with the warn action (not deny)
store.set_spend_cap(SpendCap(user_id=user2, cap_type="daily", limit=500, action="warn"))
# Attempt to deduct 1,000 credits — exceeds the 500-credit warn cap
res3 = store.reserve_credits(user2, 1_000, operation_type="inference")
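ded3 = store.deduct_credits(user2, res3.reservation_id, 1_000)
# With a warn cap the deduction should go through; print the result to confirm
print(f" Deduction result: balance after = {ded3.balance_after}, error = {ded3.error}")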
# Check cap status — the action field confirms "warn" and current spend exceeds the limit
check2 = store.check_spend_cap(user2)
print(f" Current spend: {check2.current_spend} Cap limit: {check2.cap_limit} Action: {check2.action}")
print(" Key insight: warn action allows the deduction but flags the overage")