# Scheduled Monitor
A monitoring topology with schedule blocks, notification interfaces, sandbox agents, and fallback chains
This example demonstrates a monitoring topology that runs on a schedule, checks system health, and sends alerts through Slack and PagerDuty. It uses sandbox agents for safe execution and fallback chains for resilience.
## The Complete Topology
```
topology system-monitor : [pipeline, fan-out] {
  meta {
    version: "1.0.0"
    description: "Scheduled system health monitoring with alerts"
  }

  orchestrator {
    model: sonnet
    handles: [intake, report]
  }

  schedule {
    cron: "*/15 * * * *"
    timezone: "UTC"
    on-overlap: skip
  }

  interfaces {
    interface slack {
      type: webhook
      url: "$SLACK_WEBHOOK_URL"
      format: markdown
    }

    interface pagerduty {
      type: api
      url: "https://events.pagerduty.com/v2/enqueue"
      auth: "$PAGERDUTY_API_KEY"
    }
  }

  agent health-checker {
    model: sonnet
    phase: 1
    sandbox: true
    tools: [Bash, Read]
    outputs: { status: healthy | degraded | critical }
    prompt {
      "Run system health checks: CPU usage, memory, disk space, and
       service endpoints. Report overall status."
    }
  }

  agent log-analyzer {
    model: sonnet
    phase: 1
    sandbox: true
    tools: [Read, Grep, Glob]
    outputs: { anomalies-found: yes | no, severity: low | medium | high }
    prompt {
      "Analyze recent log files for error patterns, unusual spikes,
       and anomalies. Report findings with severity."
    }
  }

  agent reporter {
    model: haiku
    phase: 2
    tools: [Read, Write]
    outputs: { alert-level: none | warning | critical }
    prompt {
      "Synthesize health check and log analysis results into a concise
       status report. Determine if alerts should be sent."
    }
  }

  agent notifier {
    model: haiku
    phase: 3
    tools: [Read]
    prompt {
      "Send notifications based on alert level. Use Slack for warnings
       and PagerDuty for critical alerts."
    }
    fallback-chain: [slack, pagerduty]
  }

  flow {
    intake -> [health-checker, log-analyzer]
    health-checker -> reporter
    log-analyzer -> reporter
    reporter -> notifier [when reporter.alert-level == warning]
    reporter -> notifier [when reporter.alert-level == critical]
    reporter -> report [when reporter.alert-level == none]
    notifier -> report
  }
}
```

## Walkthrough
### Schedule Block
```
schedule {
  cron: "*/15 * * * *"
  timezone: "UTC"
  on-overlap: skip
}
```

The `schedule` block runs the topology automatically on a cron schedule:
| Property | Value | Purpose |
|---|---|---|
| `cron` | `*/15 * * * *` | Run every 15 minutes |
| `timezone` | `UTC` | Interpret the cron expression in UTC |
| `on-overlap` | `skip` | If a previous run is still active, skip this execution |
The `on-overlap` property prevents concurrent runs from piling up. Other options include `queue` (wait for the previous run to finish) and `cancel` (stop the previous run and start fresh).
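
If a check cycle occasionally runs long and you would rather defer a run than drop it, a minimal variation (same properties as above, with the `queue` option just mentioned) looks like this:

```
schedule {
  cron: "*/15 * * * *"
  timezone: "UTC"
  on-overlap: queue
}
```

With `queue`, a new run waits for the previous one to finish instead of being skipped, so no cycle is silently lost.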
### Interfaces
```
interfaces {
  interface slack {
    type: webhook
    url: "$SLACK_WEBHOOK_URL"
    format: markdown
  }

  interface pagerduty {
    type: api
    url: "https://events.pagerduty.com/v2/enqueue"
    auth: "$PAGERDUTY_API_KEY"
  }
}
```

Interfaces define external communication channels. Each interface has:
| Property | Description |
|---|---|
| `type` | Communication method: `webhook`, `api`, or `email` |
| `url` | Endpoint URL (supports environment variables) |
| `format` | Output format for the interface |
| `auth` | Authentication credential (supports environment variables) |
Agents reference interfaces by name. The `notifier` agent can send messages to `slack` or `pagerduty` based on the alert severity.
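
The table above also lists `email` as an interface type, though this example does not define one. Here is a sketch of what one might look like, assuming `email` accepts the same `url` and `auth` properties as the other types (the `$SMTP_ENDPOINT` and `$SMTP_CREDENTIALS` variables are hypothetical):

```
interface email {
  type: email
  url: "$SMTP_ENDPOINT"
  auth: "$SMTP_CREDENTIALS"
}
```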
### Sandbox Agents
```
agent health-checker {
  sandbox: true
  tools: [Bash, Read]
}

agent log-analyzer {
  sandbox: true
  tools: [Read, Grep, Glob]
}
```

The `sandbox: true` property runs the agent in an isolated environment. This is critical for monitoring agents that execute system commands — the sandbox prevents accidental writes or destructive operations.
Sandboxed agents can read system state but cannot modify it. If the health checker's Bash commands attempt to write files or change configurations, the sandbox blocks those operations.
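
By contrast, an agent that is supposed to change system state must run outside the sandbox. A hypothetical `remediator` agent (not part of this example) would simply omit `sandbox: true` and carry write-capable tools:

```
agent remediator {
  model: sonnet
  phase: 2
  tools: [Bash, Write]
  prompt {
    "Restart services that the health checker reported as failed."
  }
}
```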
### Fallback Chain
```
agent notifier {
  fallback-chain: [slack, pagerduty]
}
```

The `fallback-chain` property defines a list of interfaces to try in order. If Slack is unreachable (webhook fails), the notifier automatically falls back to PagerDuty. This ensures critical alerts always reach someone.
Fallback chains try each interface in order until one succeeds. If all interfaces fail, the agent reports the failure to the orchestrator.
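
Appending the hypothetical `email` interface sketched earlier would add a third escalation tier, matching the suggestion under "Adapting This Example" below:

```
agent notifier {
  fallback-chain: [slack, pagerduty, email]
}
```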
### Phase 1: Parallel Health Checks (Fan-Out)
```
flow {
  intake -> [health-checker, log-analyzer]
}
```

Both monitoring agents run simultaneously at phase 1. The health checker runs system commands while the log analyzer scans log files. Running them in parallel reduces the total monitoring cycle time.
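
The fan-out extends naturally. Here is a sketch of a third phase 1 checker, borrowing the `certificate-checker` name from the adaptation ideas at the end (its tools and prompt are illustrative):

```
agent certificate-checker {
  model: sonnet
  phase: 1
  sandbox: true
  tools: [Bash, Read]
  prompt {
    "Check TLS certificate expiry for all service endpoints."
  }
}

flow {
  intake -> [health-checker, log-analyzer, certificate-checker]
  certificate-checker -> reporter
}
```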
### Phase 2: Report Synthesis
```
agent reporter {
  model: haiku
  phase: 2
  outputs: { alert-level: none | warning | critical }
}
```

The reporter merges results from both phase 1 agents and determines the overall alert level. Using `haiku` keeps costs low for a task that requires synthesis but not deep reasoning.
### Phase 3: Conditional Notification
```
flow {
  reporter -> notifier [when reporter.alert-level == warning]
  reporter -> notifier [when reporter.alert-level == critical]
  reporter -> report [when reporter.alert-level == none]
}
```

Notifications are only sent when there is something to report. If the alert level is `none`, the flow goes directly to the final report without bothering the notifier. This prevents alert fatigue from routine "all clear" messages.
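
Because the `when` conditions already distinguish warnings from critical alerts, one possible variation routes each severity to its own dedicated agent rather than letting a single prompt decide. This sketch uses only the flow syntax shown above; `slack-notifier` and `pager-notifier` are hypothetical agents:

```
flow {
  reporter -> slack-notifier [when reporter.alert-level == warning]
  reporter -> pager-notifier [when reporter.alert-level == critical]
}
```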
## Flow Diagram
```
              +--> health-checker --+
              |                     |
[schedule] ---+                     +--> reporter --+--> notifier -> report
              |                     |               |
              +--> log-analyzer ----+               +--> report
                                                        (none)
```

## Adapting This Example
- **Add more checkers** — include a `dependency-checker` or `certificate-checker` at phase 1
- **Add escalation** — if PagerDuty also fails, route to an `email` interface
- **Add metering** — set a daily budget to control costs for frequent scheduled runs
- **Change the schedule** — use `cron: "0 * * * *"` for hourly checks or `cron: "0 9 * * 1-5"` for weekday mornings (see the sketch after this list)
- **Add a human gate** — require human confirmation before sending PagerDuty alerts to reduce false alarms