Concurrency & Quota

Concurrency Control

Limit concurrent tasks per Agent when using rate-limited AI services:

apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: rate-limited-agent
spec:
  profile: "Rate-limited agent for API-quota-constrained backends"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  maxConcurrentTasks: 3  # Only 3 Tasks can run at once

When the limit is reached:

  • New Tasks enter the Queued phase instead of Running
  • Queued Tasks transition to Running automatically when capacity becomes available
  • Queued Tasks are processed in approximately FIFO order

Quota (Rate Limiting)

In addition to maxConcurrentTasks (which limits how many Tasks run simultaneously), you can configure quota to limit how many Tasks may start within a sliding time window:

apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: rate-limited-agent
spec:
  profile: "Rate-limited agent with sliding window quota"
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  quota:
    maxTaskStarts: 10    # Maximum 10 task starts
    windowSeconds: 3600  # Per hour (sliding window)

Quota vs MaxConcurrentTasks

| Feature | maxConcurrentTasks | quota |
| --- | --- | --- |
| What it limits | Simultaneous running Tasks | Rate of new Task starts |
| Time component | No (instant check) | Yes (sliding window) |
| Queued Reason | AgentAtCapacity | QuotaExceeded |
| Use case | Limit resource usage | API rate limiting |

Both can be used together for comprehensive control. When the quota is exceeded, new Tasks enter the Queued phase with reason QuotaExceeded.
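
As an illustration of using both settings together, the sketch below merges the two examples above into a single Agent. It uses only fields already shown on this page. The name combined-limits-agent and the profile text are hypothetical, and the limit values are copied from the earlier examples rather than recommended.

```yaml
apiVersion: kubeopencode.io/v1alpha1
kind: Agent
metadata:
  name: combined-limits-agent  # hypothetical name, for illustration only
spec:
  profile: "Agent limited by both concurrency and start rate"  # illustrative text
  agentImage: ghcr.io/kubeopencode/kubeopencode-agent-opencode:latest
  executorImage: ghcr.io/kubeopencode/kubeopencode-agent-devbox:latest
  workspaceDir: /workspace
  serviceAccountName: kubeopencode-agent
  maxConcurrentTasks: 3    # At most 3 Tasks running at the same time
  quota:
    maxTaskStarts: 10      # At most 10 Task starts...
    windowSeconds: 3600    # ...within any sliding one-hour window
```

With a configuration like this, a new Task would presumably need to pass both checks before it can run. For example, it could stay Queued because the quota is used up even while fewer than 3 Tasks are running. Which Queued reason is reported when both limits are hit at once is not stated here, so check the Task status rather than assuming one.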