Scale

The scale block is defined inside an agent block and configures auto-scaling for that agent. When the workload increases, a scaled agent spawns additional instances to handle demand in parallel.

Basic Syntax

agent worker {
  model: sonnet
  tools: [Read, Write]

  scale {
    mode: auto
    by: pending-tasks
    min: 1
    max: 5
    batch-size: 3
  }
}

Field Reference

Field	Type	Required	Description
`mode`	identifier	Yes	Scaling mode (`auto` is currently the only supported value)
`by`	identifier	No	Metric to scale by (e.g. `pending-tasks`)
`min`	number	No	Minimum number of instances (default: 1)
`max`	number	No	Maximum number of instances
`batch-size`	number	No	Number of items each instance processes
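
To make the table concrete, the fields can be modeled as a small record. The following is a minimal Python sketch, not part of AgenTopology: the `ScaleConfig` class, its attribute names, and its `validate` method are hypothetical, and the checks (for example, that `max` must not be smaller than `min`) are assumptions inferred from the table rather than documented validation rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScaleConfig:
    """Hypothetical in-memory mirror of a scale block (not an AgenTopology API)."""
    mode: str                            # required; `auto` is the only documented value
    by: Optional[str] = None             # optional metric, e.g. "pending-tasks"
    min_instances: int = 1               # maps to `min`; documented default is 1
    max_instances: Optional[int] = None  # maps to `max`; no documented default
    batch_size: Optional[int] = None     # maps to `batch-size`

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means nothing was flagged."""
        problems = []
        if self.mode != "auto":
            problems.append(f"unsupported mode: {self.mode!r}")
        if self.min_instances < 0:
            problems.append("min must not be negative")
        # Assumed rule: the upper bound cannot sit below the lower bound.
        if self.max_instances is not None and self.max_instances < self.min_instances:
            problems.append("max must be greater than or equal to min")
        if self.batch_size is not None and self.batch_size < 1:
            problems.append("batch-size must be at least 1")
        return problems


# The values from the Basic Syntax example.
basic = ScaleConfig(mode="auto", by="pending-tasks",
                    min_instances=1, max_instances=5, batch_size=3)
print(basic.validate())
```

Only `mode` is required in this model, matching the Required column; every other attribute falls back to the default stated in the table, or to `None` where the table gives none.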

Scaling Mode

Currently, auto is the only supported mode. In auto mode, AgenTopology monitors the metric named in the by field and adjusts the instance count within the bounds set by min and max.

scale {
  mode: auto
  by: pending-tasks
  min: 1
  max: 10
}

Scaling Metric

The by field determines what drives scaling decisions:

Metric	Description
`pending-tasks`	Number of queued tasks waiting for this agent

scale {
  mode: auto
  by: pending-tasks
  min: 2
  max: 8
}

Batch Size

Control how many items each agent instance processes at a time:

agent file-processor {
  model: sonnet
  tools: [Read, Write]

  scale {
    mode: auto
    by: pending-tasks
    min: 1
    max: 5
    batch-size: 10
  }
}

With batch-size: 10, each instance handles up to 10 items. Items beyond that cause a new instance to be spawned, up to the max limit.
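
The arithmetic behind the batch-size example above can be sketched as follows. This is an illustrative Python model, not AgenTopology's actual scheduler: the function `target_instances` is hypothetical, and it assumes the instance count equals the number of batches needed to cover the pending items, clamped to `min` and `max`.

```python
import math


def target_instances(pending: int, batch_size: int,
                     min_instances: int, max_instances: int) -> int:
    """Hypothetical model: one instance per batch of pending items, kept within bounds."""
    # Batches needed to cover every pending item (assumed behavior).
    needed = math.ceil(pending / batch_size)
    # Never drop below min and never exceed max.
    return max(min_instances, min(needed, max_instances))


# Values from the file-processor example: min 1, max 5, batch-size 10.
for pending in (0, 10, 25, 80):
    count = target_instances(pending, batch_size=10, min_instances=1, max_instances=5)
    print(pending, count)
```

Under these assumptions, 25 pending items map to 3 instances, while 80 pending items would need 8 batches and are capped at the max of 5. An empty queue still keeps 1 instance running because of min: 1.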

Full Example

topology processing-pipeline : [fan-out] {
  agent dispatcher {
    model: sonnet
    role: "Splits large workloads into individual tasks"
    tools: [Read]
  }

  agent processor {
    model: sonnet
    role: "Processes individual items"
    tools: [Read, Write]

    scale {
      mode: auto
      by: pending-tasks
      min: 1
      max: 10
      batch-size: 5
    }
  }

  agent aggregator {
    model: sonnet
    role: "Combines processed results"
    tools: [Read, Write]
  }

  flow {
    dispatcher -> processor -> aggregator
  }
}

Tips

Scale blocks go inside agent blocks, not at the topology level.
Set max to prevent runaway costs from unbounded scaling.
batch-size is useful for file processing, data transformation, and other item-oriented workloads.
Scaling works best with the fan-out and orchestrator-worker patterns.
min: 1 is the default. Set min: 0 if the agent should only run when there is work.
