Concurrency & Quota
Concurrency Control
Limit concurrent tasks per Agent when using rate-limited AI services:
apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: rate-limited-agent
spec:
  profile: "Rate-limited agent for API-quota-constrained backends"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  maxConcurrentTasks: 3  # Only 3 Tasks can run at once
When the limit is reached:
- New Tasks enter the Queued phase instead of Running
- Queued Tasks automatically transition to Running when capacity becomes available
- Tasks are processed in approximate FIFO order
Quota (Rate Limiting)
In addition to maxConcurrentTasks (which limits how many Tasks run simultaneously), you can configure quota to limit how many Tasks may start within a sliding time window:
apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: rate-limited-agent
spec:
  profile: "Rate-limited agent with sliding window quota"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  quota:
    maxTaskStarts: 10    # Maximum 10 task starts
    windowSeconds: 3600  # Per hour (sliding window)
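To make the sliding window concrete, here is a hypothetical variant with a much shorter window. The Agent name, the quota values, and the timeline in the comments are illustrative only. The timeline assumes the conventional sliding-window meaning, where a Task start counts against the quota until it is windowSeconds old; exact boundary behavior may differ.

```yaml
apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: burst-limited-agent  # hypothetical name
spec:
  profile: "Agent limited to short bursts of Task starts"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  quota:
    maxTaskStarts: 2   # illustrative value: at most 2 starts...
    windowSeconds: 60  # ...in any 60-second window
    # Assumed timeline under these values:
    #   t=0s   Task A starts (1 start in the last 60s)
    #   t=10s  Task B starts (2 starts in the last 60s, quota is full)
    #   t=30s  Task C arrives and is Queued with reason QuotaExceeded
    #   t=60s  A's start ages out of the window, so C can start
    #   t=70s  B's start ages out, so one more start is allowed
```

Unlike a fixed window that resets on the hour, a sliding window frees capacity one start at a time as each earlier start ages out.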
Quota vs MaxConcurrentTasks
| Feature | maxConcurrentTasks | quota |
|---|---|---|
| What it limits | Simultaneous running Tasks | Rate of new Task starts |
| Time component | No (instant check) | Yes (sliding window) |
| Queued Reason | AgentAtCapacity | QuotaExceeded |
| Use case | Limit resource usage | API rate limiting |
Both settings can be used together to cap concurrency and start rate at the same time. When the quota is exceeded, new Tasks enter the Queued phase with reason QuotaExceeded.
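As a sketch of the combined setup, the manifest below sets both limits on a single Agent. It uses only the fields already shown in this section; the Agent name and the chosen values are illustrative, and it is an assumption that a Task starts only when both limits allow it.

```yaml
apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: combined-limits-agent  # hypothetical name
spec:
  profile: "Agent with both a concurrency cap and a start-rate quota"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  maxConcurrentTasks: 3  # at most 3 Tasks running at once
  quota:
    maxTaskStarts: 10    # at most 10 Task starts...
    windowSeconds: 3600  # ...in any 3600-second sliding window
```

With these values, a Task that arrives while 3 Tasks are already running would be queued with reason AgentAtCapacity, and a Task that arrives after 10 starts within the past hour would be queued with reason QuotaExceeded.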