Scaling to a fleet

One agent is the starting line. A fleet is the point. This page covers the operational choices you face when you go from 1 → 5 → 20 → 100 agents.

When to add the next agent

Three honest signals you're ready:

You trust the output. If you're still inspecting every diff the first agent produces, stop. Get to confidence first.
The first agent has a steady queue. Check /admin/jobs. If >25% of the time it's idle, more agents won't help — more work will.
You hit max_concurrent_tasks_per_agent. The Control Plane starts queuing. Time for more concurrency, which means more agents.

How many agents do I need?

Two limits matter:

max_agents on your license — the hard cap. Visible in agentina status on any host.
Hardware — each agent uses ~200 MB of RAM idle, plus whatever its workload needs (a tester running Playwright wants 1–2 GB available). Pack budget accordingly.
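
The memory arithmetic above can be sketched as a small helper. This is a rough estimate only: the function name `agents_per_host`, the reserved-memory figure, and the per-workload figure are illustrative assumptions, and only the ~200 MB idle footprint comes from this page. The result is still capped by max_agents on your license.

```bash
# agents_per_host HOST_MB RESERVED_MB WORKLOAD_MB
# Prints how many agents fit on a host, assuming each agent needs
# ~200 MB idle (figure from this page) plus WORKLOAD_MB for its work.
# RESERVED_MB is memory held back for the OS (an assumption, tune it).
agents_per_host() {
  idle_mb=200
  usable=$(( $1 - $2 ))
  per_agent=$(( idle_mb + $3 ))
  if [ "$usable" -le 0 ]; then
    echo 0
    return 0
  fi
  echo $(( usable / per_agent ))   # integer division rounds down
}

# 16 GB host, 2 GB reserved, testers budgeted at 2 GB each
agents_per_host 16384 2048 2048    # prints 6
```

Rounding down is deliberate: an agent that only partly fits is the one that starves its neighbors.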

Practical sizing for a coder fleet on a typical repo:

Repo size	Coders	Reviewers	Testers
< 50k LOC	1–2	1	1
50k – 500k LOC	3–5	1–2	1–2
> 500k LOC	5–15	2–3	2–4

Always run exactly one indexer per repo, regardless of fleet size. Multiple indexers fight each other; one is enough.

One host or many?

You can run multiple agents on one host (each is a separate systemd unit + state dir) or one agent per host. Both work. The trade-offs:

Pattern	Pro	Con
One big host, many agents	Cheaper, simpler to monitor	One bad agent can starve neighbors; single point of failure
One agent per host	Isolation, easier capacity planning	More installs to keep current
Hybrid (typical)	indexer on its own host, coder fleet packed onto 1–2 bigger hosts	Two patterns to operate

Installing N agents

Each install needs its own activation token. Mint one per intended agent in /portal/tokens; the first machine to redeem each token claims it.

To run multiple agents on the same host, override the state dir + systemd unit name:

bash

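# Second agent on the same box.
# The overrides go after sudo so the installer (not curl) receives them;
# variables placed before curl would only be set for the curl process.
curl -fsSL https://dl.agentina.io/install.sh | sudo \
  AGENTINA_STATE_DIR=/var/lib/agentina-2 \
  AGENTINA_SYSTEMD_UNIT=agentina-2 \
  AGENTINA_TOKEN=act_… bash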
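
For more than two agents on one host, the same command can be generated per agent. The sketch below is a dry run that only prints the commands. It assumes a hypothetical token file with one activation token per line, and it assumes the `agentina-<n>` naming from the example above carries on to higher numbers. The overrides are placed after sudo so that the installer process, rather than curl, receives them.

```bash
# install_cmd INDEX TOKEN
# Prints (does not run) the install command for agent number INDEX.
install_cmd() {
  printf 'curl -fsSL https://dl.agentina.io/install.sh | sudo'
  printf ' AGENTINA_STATE_DIR=/var/lib/agentina-%s' "$1"
  printf ' AGENTINA_SYSTEMD_UNIT=agentina-%s' "$1"
  printf ' AGENTINA_TOKEN=%s bash\n' "$2"
}

# plan_installs TOKEN_FILE
# Reads one token per line (hypothetical file format) and prints one
# command per token, numbering from 2 because the first agent on the
# host is assumed to be installed already with the default names.
plan_installs() {
  n=2
  while IFS= read -r token || [ -n "$token" ]; do
    [ -n "$token" ] || continue      # skip blank lines
    install_cmd "$n" "$token"
    n=$(( n + 1 ))
  done < "$1"
}
```

Review the printed commands before running them. Each token is single-use, so a typo in the file costs a token.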

Monitoring a fleet

The four numbers that matter, in order:

Online ratio — agents heartbeating in the last 3 min ÷ total. See /portal. Healthy fleet: >95%.
Job throughput — completed jobs per hour. See /admin/jobs. If it's dropping with the same workload, something's wrong upstream.
Anomalies — unresolved findings. See /admin/anomalies. Aim for 0 unresolved at end of day.
Version skew — how many agents are not on the latest release. See /admin/agents. Anything >1 minor version behind, plan an upgrade.
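
The online-ratio threshold is easy to get wrong at the boundary, so here is a minimal sketch of the check. The function name `fleet_online_ok` is hypothetical, and the two counts are assumed to be read off /portal by hand or by whatever tooling you already have. No Agentina API is implied.

```bash
# fleet_online_ok ONLINE TOTAL
# Succeeds only when strictly more than 95% of agents are online.
# Uses integer math (ONLINE*100 > TOTAL*95) to avoid rounding errors.
fleet_online_ok() {
  [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -gt $(( $2 * 95 )) ]
}

# 19 of 20 online is exactly 95%, which does not clear the ">95%" bar
if fleet_online_ok 19 20; then echo healthy; else echo degraded; fi   # prints degraded
```

On a 20-agent fleet, a single offline agent already puts you at the threshold, so small fleets should treat any offline agent as worth a look.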

Upgrading a fleet

The Control Plane sends every agent an update_available hint on heartbeat. You decide when to apply it.

The safe pattern, every time:

Upgrade one agent. Let it run a full day.
Inspect: did its job throughput hold steady? Did agentina status stay healthy?
If both answers are yes, upgrade the rest in batches of 25%. Watch each batch for an hour before the next.
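
The rollout steps above can be sketched as a plan generator. This is an illustration, not a tool that ships with the product: `rollout_batches` is a hypothetical name, and it assumes "25%" means a quarter of the agents remaining after the canary, rounded up.

```bash
# rollout_batches FLEET_SIZE
# Prints one canary step, then batches of ~25% of the remaining agents.
rollout_batches() {
  [ "$1" -ge 1 ] || return 1
  rest=$(( $1 - 1 ))
  echo "canary: 1 agent, soak 1 day"
  [ "$rest" -gt 0 ] || return 0
  batch=$(( (rest + 3) / 4 ))        # ceil(rest / 4)
  i=1
  while [ "$rest" -gt 0 ]; do
    size=$batch
    [ "$size" -le "$rest" ] || size=$rest   # last batch may be smaller
    echo "batch $i: $size agent(s), watch 1 hour"
    rest=$(( rest - size ))
    i=$(( i + 1 ))
  done
}

rollout_batches 20
```

For a 20-agent fleet this prints the canary step followed by batches of 5, 5, 5, and 4 agents.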

Rollback is automatic on smoke-check failure and on systemd start failure — see /docs/updates for the full mechanism.

Anti-patterns

Upgrading the whole fleet at once. A bad release takes everyone down. Two-level rollback helps but doesn't replace canarying.
Identical hosts. When something OS-level breaks, you lose every agent. Mix hosts across at least two AZs or regions for anything you can't survive losing.
No indexer in the fleet. Every repo needs exactly one, as covered under sizing above. Worth repeating.
Sharing one activation token across machines. Tokens are single-use; the second machine's install will fail. Mint one per agent.

Update lifecycle — the full upgrade mechanism, channels, rollback.
Troubleshooting — every error message + what it means.
Security model — read this before scaling past your own laptop.