Distributed Canaries
Distributed canaries allow you to define a check once and have it automatically run on multiple agents. This is useful for monitoring services from different locations, clusters, or network segments.
How It Works
When you specify an agentSelector on a canary:
- The canary does not run locally on the server, unless `local` is included in the selector
- A copy of the canary is created for each matched agent
- Each agent runs the check independently and reports results back
- The copies are kept in sync with the parent canary
A background job syncs agent selector canaries every 5 minutes. When agents are added or removed, the derived canaries are automatically created or cleaned up.
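
The create-and-clean-up step described above can be pictured as a set comparison. The sketch below is illustrative only and is not the actual sync job: the function name `plan_sync` and its arguments are hypothetical, and it assumes the set of matched agent names and the set of agents that already have a derived canary are both known when the job runs (every 5 minutes, per the text).

```python
def plan_sync(matched_agents: set[str], existing_copies: set[str]) -> tuple[set[str], set[str]]:
    """Compare agents matched by the selector with agents that already have a copy.

    Returns (to_create, to_delete). Hypothetical helper, for illustration only.
    """
    to_create = matched_agents - existing_copies  # newly matched agents need a copy
    to_delete = existing_copies - matched_agents  # copies for agents that no longer match
    return to_create, to_delete


# Agent "eu-2" was added and "us-old" was removed since the last run.
to_create, to_delete = plan_sync({"eu-1", "eu-2"}, {"eu-1", "us-old"})
print(sorted(to_create))  # ['eu-2']
print(sorted(to_delete))  # ['us-old']
```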
Agent Selector Patterns
The agentSelector field accepts a list of patterns to match agent names:
| Pattern | Description |
|---|---|
| `agent-1` | Exact match |
| `eu-west-*` | Prefix match (glob) |
| `*-prod` | Suffix match (glob) |
| `!staging` | Exclude agents matching this pattern |
| `team-*`, `!team-b` | Match all `team-*` except `team-b` |
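
To make the pattern rules concrete, here is a minimal Python sketch of how such a selector could be evaluated against an agent name. It is not the project's implementation. The function name `matches` is hypothetical, and the sketch assumes that exclusions always win and that a selector made only of exclusions selects every other agent. It uses `fnmatch`, which also understands `?` and `[...]`, neither of which is mentioned above. The special `local` entry is treated like any other name here.

```python
from fnmatch import fnmatchcase


def matches(agent: str, patterns: list[str]) -> bool:
    """Return True if `agent` is selected by `patterns` (assumed semantics)."""
    if not patterns:
        return False  # no selector: nothing is selected in this sketch
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    # Assumption: an exclusion always wins over an inclusion.
    if any(fnmatchcase(agent, p) for p in excludes):
        return False
    # Assumption: with only exclusions, every remaining agent is selected.
    if not includes:
        return True
    return any(fnmatchcase(agent, p) for p in includes)


agents = ["agent-1", "eu-west-1", "team-a", "team-b", "us-prod", "staging"]
print([a for a in agents if matches(a, ["team-*", "!team-b"])])  # ['team-a']
print([a for a in agents if matches(a, ["*-prod", "eu-west-*"])])  # ['eu-west-1', 'us-prod']
```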
Example: HTTP Check on All Agents
This example defines an HTTP check against a Kubernetes service and runs it on every registered agent:
```yaml
# distributed-http-check.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: api-health
  namespace: monitoring
spec:
  schedule: '@every 1m'
  http:
    - name: api-endpoint
      url: http://api-service.default.svc.cluster.local:8080/health
      responseCodes: [200]
      test:
        expr: json.status == 'healthy'
  agentSelector:
    - '*' # Run on all agents
```
When this canary is created:
- The check runs locally only if `local` is included in the selector
- A derived canary is created for each registered agent
- Each agent executes the HTTP check against `api-service.default.svc.cluster.local:8080/health` in its own cluster
- Results from all agents are aggregated and visible in the UI
Example: Regional Monitoring
Monitor an external API from specific regions:
```yaml
# regional-monitoring.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: external-api-latency
spec:
  schedule: '@every 5m'
  http:
    - name: payment-gateway
      url: https://api.payment-provider.com/health
      responseCodes: [200]
      maxResponseTime: 500
  agentSelector:
    - 'eu-*' # All EU agents
    - 'us-*' # All US agents
    - '!us-test' # Exclude test agent
    - 'local' # Run on local instance as well
```
Example: Exclude Specific Agents
Run checks on all agents except those in specific environments (here, dev and staging):
```yaml
# production-only.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: production-checks
spec:
  schedule: '@every 2m'
  http:
    - name: internal-service
      url: http://internal.example.com/status
  agentSelector:
    - '!*-dev' # Exclude all dev agents
    - '!*-staging' # Exclude all staging agents
```