Agentina

Scaling to a fleet

One agent is the starting line. A fleet is the point. This page covers the operational choices you face when you go from 1 → 5 → 20 → 100 agents.

When to add the next agent

Three honest signals you're ready:

  1. You trust the output. If you're still inspecting every diff the first agent produces, stop. Get to confidence first.
  2. The first agent has a steady queue. Check /admin/jobs. If it sits idle more than 25% of the time, more agents won't help; more work will.
  3. You hit max_concurrent_tasks_per_agent. The Control Plane starts queuing. Time for more concurrency, which means more agents.

How many agents do I need?

Two limits matter:

  • max_agents on your license: the hard cap. Run agentina status on any host to see it.
  • Hardware: each agent uses ~200 MB of RAM idle, plus whatever its workload needs (a tester running Playwright wants 1–2 GB available). Budget host memory accordingly.
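
As a rough worked example of the memory arithmetic above, the sketch below adds the idle footprint of every agent to extra headroom for testers. The function name, the choice to give only testers extra headroom, and the use of the upper end of the 1–2 GB range are all assumptions for illustration, not Agentina defaults.

```bash
#!/usr/bin/env bash
# Rough RAM budget for agents packed onto one host.
# Assumptions (illustrative only): testers are the only role that needs
# extra workload headroom, and we budget the upper end of the 1-2 GB range.

IDLE_MB_PER_AGENT=200      # idle footprint quoted in the text
TESTER_HEADROOM_MB=2048    # upper end of the Playwright figure

# usage: fleet_ram_mb <total_agents> <testers_among_them>
fleet_ram_mb() {
  local agents=$1 testers=$2
  echo $(( agents * IDLE_MB_PER_AGENT + testers * TESTER_HEADROOM_MB ))
}

fleet_ram_mb 8 2   # 8 agents, 2 of them testers: prints 5696
```

Coders and reviewers will also need memory beyond the idle figure; measure a real workload before committing to a host size.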

Practical sizing for a coder fleet on a typical repo:

Repo size       | Coders | Reviewers | Testers
< 50k LOC       | 1–2    | 1         | 1
50k – 500k LOC  | 3–5    | 1–2       | 1–2
> 500k LOC      | 5–15   | 2–3       | 2–4

Always run exactly one indexer per repo, regardless of fleet size. Multiple indexers fight each other; one is enough.
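
The sizing guidance above can be expressed as a small lookup, which is handy in provisioning scripts. This is only a sketch: suggest_fleet is a hypothetical helper, not part of the agentina CLI, and how you count lines of code is up to you.

```bash
#!/usr/bin/env bash
# Map a repo's size (lines of code) to the fleet ranges from the sizing table.
# suggest_fleet is a hypothetical helper name used for illustration.

# usage: suggest_fleet <lines_of_code>
suggest_fleet() {
  local loc=$1
  if (( loc < 50000 )); then
    echo "coders=1-2 reviewers=1 testers=1 indexers=1"
  elif (( loc <= 500000 )); then
    echo "coders=3-5 reviewers=1-2 testers=1-2 indexers=1"
  else
    echo "coders=5-15 reviewers=2-3 testers=2-4 indexers=1"
  fi
}

suggest_fleet 120000
```

The indexer count is fixed at one in every branch, matching the one-indexer-per-repo rule.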

One host or many?

You can run multiple agents on one host (each with its own systemd unit and state dir) or one agent per host. Both work. The trade-offs:

Pattern                   | Pro                                                               | Con
One big host, many agents | Cheaper, simpler to monitor                                       | One bad agent can starve neighbors; single point of failure
One agent per host        | Isolation, easier capacity planning                               | More installs to keep current
Hybrid (typical)          | indexer on its own host, coder fleet packed onto 1–2 bigger hosts | Two patterns to operate

Installing N agents

Each install needs its own activation token. Mint one per intended agent in /portal/tokens; the first machine to redeem each token claims it.

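To run multiple agents on the same host, override the state dir and systemd unit name:

bash
# Second agent on the same box.
# The overrides go after sudo so they reach the installer; variables placed
# in front of curl would only be set for curl itself.
curl -fsSL https://dl.agentina.io/install.sh | sudo \
  AGENTINA_STATE_DIR=/var/lib/agentina-2 \
  AGENTINA_SYSTEMD_UNIT=agentina-2 \
  AGENTINA_TOKEN=act_… \
  bash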
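
When packing several agents onto one host, it can help to generate the per-agent overrides rather than type them. The dry-run sketch below only prints the settings for agents 2 through N, following the -2 suffix pattern from the example above. It assumes the first agent on the host uses the installer defaults, and agent_slots is a hypothetical helper; each install still needs its own activation token.

```bash
#!/usr/bin/env bash
# Dry run: print state dir and unit name overrides for agents 2..N on one host.
# Assumption: agent 1 was installed with the defaults, so numbering starts at 2.
# agent_slots is a hypothetical helper, not an Agentina command.

# usage: agent_slots <agents_on_this_host>
agent_slots() {
  local count=$1 i
  for (( i = 2; i <= count; i++ )); do
    printf 'AGENTINA_STATE_DIR=/var/lib/agentina-%d AGENTINA_SYSTEMD_UNIT=agentina-%d\n' "$i" "$i"
  done
}

agent_slots 4
```

Review the printed lines, then pair each one with a freshly minted token when you run the installer.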

Monitoring a fleet

The four numbers that matter, in order:

  1. Online ratio: agents that sent a heartbeat in the last 3 min ÷ total agents. See /portal. Healthy fleet: >95%.
  2. Job throughput: completed jobs per hour. See /admin/jobs. If it drops while the workload stays the same, something's wrong upstream.
  3. Anomalies: unresolved findings. See /admin/anomalies. Aim for 0 unresolved at end of day.
  4. Version skew: how many agents are not on the latest release. See /admin/agents. If any agent is more than one minor version behind, plan an upgrade.
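
The online-ratio threshold is easy to check in a script once you have the two counts. The sketch below assumes you have already read the online and total agent counts from /portal by some means; online_ratio_ok is a hypothetical helper, and it uses integer arithmetic so no floating point is needed.

```bash
#!/usr/bin/env bash
# Check the ">95% online" health threshold using integer arithmetic.
# online_ratio_ok is a hypothetical helper; how you obtain the two counts
# is outside this sketch.

# usage: online_ratio_ok <online_agents> <total_agents>
# Exit status 0 when strictly more than 95% of agents are online.
online_ratio_ok() {
  local online=$1 total=$2
  (( total > 0 )) || return 1
  # online/total > 95/100, rearranged to avoid division.
  (( online * 100 > total * 95 ))
}

if online_ratio_ok 97 100; then
  echo "fleet healthy"
else
  echo "investigate offline agents"
fi
```

Note that exactly 95% (for example 19 of 20) does not pass, because the threshold is strict.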

Upgrading a fleet

The Control Plane sends every agent an update_available hint on heartbeat. You decide when to apply it.

The safe pattern, every time:

  1. Upgrade one agent. Let it run a full day.
  2. Inspect: did its job throughput drop? Did agentina status stay healthy?
  3. If throughput held and agentina status stayed healthy, upgrade the rest in batches of 25%. Watch each batch for an hour before starting the next.
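
To see what the canary-then-batches pattern means for a given fleet size, the sketch below prints a rollout plan. It assumes "25%" means a quarter of the agents left after the canary, rounded up; upgrade_batches is a hypothetical helper and does not perform any upgrade.

```bash
#!/usr/bin/env bash
# Plan a rollout: one canary, then the rest in batches of roughly 25%.
# Assumption: each batch is a quarter of the post-canary fleet, rounded up.
# upgrade_batches is a hypothetical helper that only prints the plan.

# usage: upgrade_batches <total_agents>
upgrade_batches() {
  local total=$1
  (( total >= 1 )) || return 1
  echo "canary 1"
  local rest=$(( total - 1 ))
  local size=$(( (rest + 3) / 4 )) n=1 take
  while (( rest > 0 )); do
    take=$(( rest < size ? rest : size ))   # last batch may be smaller
    echo "batch $n $take"
    rest=$(( rest - take ))
    n=$(( n + 1 ))
  done
}

upgrade_batches 20   # prints: canary 1, then batches of 5, 5, 5, 4
```

With the one-day canary soak and an hour of observation per batch, a 20-agent fleet takes a little over a day to upgrade under this plan.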

Rollback is automatic on smoke-check failure and on systemd start failure — see /docs/updates for the full mechanism.

Anti-patterns

  • Upgrading the whole fleet at once. A bad release takes everyone down. Two-level rollback helps but doesn't replace canarying.
  • Identical hosts. When something OS-level breaks, you lose every agent. Mix hosts across at least two AZs or regions for anything you can't afford to lose.
  • No indexer in the fleet. Every repo needs exactly one, as covered under sizing. Worth repeating.
  • Sharing one activation token across machines. Tokens are single-use; the second machine's install will fail. Mint one per agent.
