2026-03-05 | 11 min | gitops, fleet, jenkins, vmalert, alerting

Deploying Alert Rules at Scale with Fleet and Jenkins

The Problem with Fleet and Alert Rules

Rancher Fleet is great for GitOps. You push configs to a Git branch, and Fleet syncs them to your clusters. Simple, reliable, no manual deployments.

But Fleet has a limitation that caught me off guard: no pre-sync hooks.

ArgoCD has resource hooks with a PreSync phase - you can run a Kubernetes Job before the actual sync happens. This is useful when your source files need transformation before deployment. Fleet doesn't have this concept. It uses Helm's native hooks (pre-install, pre-upgrade) but those run at deploy time inside the cluster, not at the Git level before sync. Fleet syncs exactly what's in your Git repository, as-is.

This became a problem when I needed to deploy alert rules to vmalert. The format I wanted engineers to write in wasn't the format vmalert understands. I needed a transformation step between "engineer writes a template" and "vmalert receives the rule."

The Repo Structure

The alert rules live in a dedicated branch of the Fleet GitOps repository. Here's the layout:

vmalert-apps/
├── alert-templates/             # Engineers edit these
│   ├── app-rules/               # Application-specific alerts
│   │   ├── service-a/alerts.yaml
│   │   ├── service-b/alerts.yaml
│   │   └── ...
│   └── infra-rules/             # Infrastructure alerts
│       ├── kubernetes/alerts.yaml
│       ├── database/alerts.yaml
│       └── ...
├── rules-bundle/                # Auto-generated - don't edit
│   └── rules/
│       ├── app-rules/           # Generated from templates
│       ├── infra-rules/         # Generated from templates
│       └── recording-rules/     # Edited directly (no templates)
├── scripts/
│   └── generate_from_template.py
├── fleet.yaml                   # Fleet targeting config
└── validation_Jenkinsfile       # CI pipeline

The alert-templates/ directory is where engineers work. The rules-bundle/rules/ directory is what Fleet deploys to vmalert. The recording-rules/ subdirectory is an exception - those are edited directly since they don't need severity splitting.
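
To make the relationship between the two directories concrete, here is a minimal sketch of the path mapping. It assumes each generated file keeps the template's relative path and file name, which the layout above suggests but does not state. The function name output_path_for is hypothetical and is not taken from generate_from_template.py.

```python
from pathlib import PurePosixPath

# Directory names come from the repo layout; paths are relative to vmalert-apps/.
TEMPLATE_ROOT = PurePosixPath("alert-templates")
OUTPUT_ROOT = PurePosixPath("rules-bundle/rules")


def output_path_for(template_path: str) -> PurePosixPath:
    """Map a template file to a generated rule file with the same relative path.

    Assumption: output mirrors the template tree one-to-one.
    Raises ValueError if the path is not under alert-templates/.
    """
    relative = PurePosixPath(template_path).relative_to(TEMPLATE_ROOT)
    return OUTPUT_ROOT / relative


print(output_path_for("alert-templates/infra-rules/kubernetes/alerts.yaml"))
# prints: rules-bundle/rules/infra-rules/kubernetes/alerts.yaml
```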

Why Templates

Standard vmalert alert rules look like this:

groups:
  - name: kubernetes.rules
    rules:
      - alert: CPUCloseToLimits
        expr: |
          (sum by (namespace,pod,container,cluster)(
            rate(container_cpu_usage_seconds_total[5m])
          ) / sum by(namespace,pod,container,cluster)(
            kube_pod_container_resource_limits{resource="cpu"}
          )) * 100 > 95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "CPU usage close to limits"

That's one rule, one severity. If I want critical at 95%, warning at 90%, and low at 80%, I write three separate rules with the same expression but different thresholds. Three rules to maintain. Three places to update when the query changes. Three chances to make a mistake.

With 500+ rules across infrastructure, databases, applications, and health checks, this doesn't scale. I designed a template format where you define one alert with a severities block containing all levels. The expressions can be completely different between levels - not just different thresholds. A critical might check for complete failure while a warning checks for degradation using a different query entirely.

But vmalert doesn't understand severities: blocks. Something needs to transform this into standard rules before deployment.

The Transformation: Input vs Output

Here's what the generator actually produces. One template in, multiple vmalert rules out:

What the engineer writes (template):

- name: CPU Close to Limits
  annotations:
    summary: "CPU usage close to limits"
    runbook: /docs/runbooks/cpu-limits.md
  labels:
    team: infrastructure
  severities:
    - level: critical
      expr: |
        (sum by (namespace,pod,container,cluster)(
          rate(container_cpu_usage_seconds_total[5m])
        ) / sum by(namespace,pod,container,cluster)(
          kube_pod_container_resource_limits{resource="cpu"}
        )) * 100 > 95
      for: 5m
    - level: warning
      expr: |
        ...same query... > 90
      for: 5m

What vmalert receives (generated):

- alert: CPU Close to Limits
  expr: |
    (sum by (namespace,pod,container,cluster)(
      rate(container_cpu_usage_seconds_total[5m])
    ) / sum by(namespace,pod,container,cluster)(
      kube_pod_container_resource_limits{resource="cpu"}
    )) * 100 > 95
  for: 5m
  labels:
    team: infrastructure
    severity: critical
    severity_order: "1"
  annotations:
    summary: "CPU usage close to limits"
    runbook: /docs/runbooks/cpu-limits.md

- alert: CPU Close to Limits
  expr: |
    ...same query... > 90
  for: 5m
  labels:
    team: infrastructure
    severity: warning
    severity_order: "2"
  annotations:
    summary: "CPU usage close to limits"
    runbook: /docs/runbooks/cpu-limits.md

The key details: both generated rules share the same alert name (for inhibition matching), each gets a severity and severity_order label automatically, and all shared labels/annotations from the template are copied to every rule.

The Generator Logic

The core of the Python script is straightforward. For each template rule that has a severities block, it generates one standard alert rule per severity:

SEVERITY_ORDER = {
    'critical': '1',
    'warning': '2',
    'low': '3',
    'info': '4'
}

def generate_alert_rule(template_rule, severity):
    """Generate a single vmalert rule from a template + severity."""
    alert_rule = {
        'alert': template_rule['name'],
        'expr': severity['expr'],
        'labels': {},
        'annotations': {}
    }

    if severity.get('for'):
        alert_rule['for'] = severity['for']

    # Copy shared labels from template, then add severity
    if 'labels' in template_rule:
        alert_rule['labels'].update(template_rule['labels'])

    alert_rule['labels']['severity'] = severity['level']
    alert_rule['labels']['severity_order'] = SEVERITY_ORDER.get(severity['level'], '0')

    # Copy shared annotations
    if 'annotations' in template_rule:
        alert_rule['annotations'].update(template_rule['annotations'])

    return alert_rule

def process_template_group(template_group):
    """Expand all templates in a group into individual rules."""
    generated_rules = []

    for template_rule in template_group.get('rules', []):
        # Skip disabled alerts
        if not template_rule.get('enabled', True):
            continue

        for severity in template_rule.get('severities', []):
            generated_rules.append(
                generate_alert_rule(template_rule, severity)
            )

    return generated_rules

The script also validates that every severity has both expr and level fields, preserves multi-line PromQL as YAML block scalars, and supports an enabled: false flag to disable alerts without deleting them.
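
The required-field validation mentioned above could look roughly like the sketch below. The function name validate_template_rule and the error message wording are assumptions for illustration. Only the rule itself (every severity needs level and expr) comes from the description.

```python
def validate_template_rule(template_rule):
    """Return a list of problems; an empty list means the rule passed.

    Hypothetical helper: checks that each severity entry has a non-empty
    'level' and 'expr', the two fields the generator reads unconditionally.
    """
    errors = []
    name = template_rule.get('name', '<unnamed>')

    for index, severity in enumerate(template_rule.get('severities', []), start=1):
        for field in ('level', 'expr'):
            if not severity.get(field):
                errors.append(f"{name}: severity #{index} is missing '{field}'")

    return errors


broken = {
    'name': 'CPU Close to Limits',
    'severities': [
        {'level': 'critical', 'expr': 'up == 0'},
        {'level': 'warning'},  # no expr
    ],
}
print(validate_template_rule(broken))
# prints: ["CPU Close to Limits: severity #2 is missing 'expr'"]
```

Collecting every problem before failing, instead of stopping at the first one, lets a PR author fix all of them in one pass.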

The Pipeline Flow

Engineers edit templates in alert-templates/, and Fleet deploys the generated vmalert rules from rules-bundle/. A single Jenkinsfile handles both validation and generation, depending on which branch it runs on.

Two pipelines in one Jenkinsfile:

Feature Branch: Validate Only

When an engineer opens a PR from a feature branch, Jenkins:

Checks for direct rule edits - if someone edited the generated rules directory instead of the templates, the pipeline fails immediately with a clear error. The generated directory is output-only.
Generates rules locally - runs the Python generator to transform templates into vmalert format.
Validates with vmalert dry-run - spins up a vmalert container and validates every generated rule file:

docker run --rm \
  -v "$(pwd)/rules:/rules" \
  victoriametrics/vmalert:v1.123.0 \
  -rule="/rules/**/*.yaml" \
  -dryRun

If any expression has a syntax error, the pipeline fails before the PR can merge. No broken rules reach production.
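
The first step, rejecting direct edits to generated files, comes down to a path filter over the files changed in the PR. This is a minimal sketch, assuming the changed paths are already available as a list (for example from git diff --name-only against the target branch) and are relative to vmalert-apps/. The function name and the recording-rules file name in the example are hypothetical.

```python
GENERATED_PREFIX = "rules-bundle/rules/"
# recording-rules/ is edited by hand, so it is exempt from the check.
HAND_EDITED_PREFIX = "rules-bundle/rules/recording-rules/"


def find_direct_rule_edits(changed_files):
    """Return changed paths that are generated output and must not be hand-edited."""
    return [
        path for path in changed_files
        if path.startswith(GENERATED_PREFIX)
        and not path.startswith(HAND_EDITED_PREFIX)
    ]


changed = [
    "alert-templates/app-rules/service-a/alerts.yaml",
    "rules-bundle/rules/app-rules/service-a/alerts.yaml",
    "rules-bundle/rules/recording-rules/cpu.yaml",  # hypothetical file name
]
print(find_direct_rule_edits(changed))
# prints: ['rules-bundle/rules/app-rules/service-a/alerts.yaml']
```

A pipeline step would fail the build whenever the returned list is non-empty and print the offending paths.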

Main Branch: Generate and Push

After the PR merges to the main alerting branch, Jenkins:

Detects template changes - inspects the merge commit to see whether any template files changed. If only non-template files changed, it skips generation entirely.
Generates rules - the Python script reads every template, expands the severities blocks into individual vmalert rules, adds severity and severity_order labels automatically.
Commits and pushes - the generated rules get committed back to the same branch with a [jenkins] tag. Fleet detects the new commit and syncs to vmalert.

The push has retry logic with rebase - if someone else pushed to the branch between generation and push, Jenkins rebases and retries up to 3 times.
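
The retry-with-rebase behaviour can be sketched as a small loop. This is an illustration rather than the actual pipeline code, which lives in the Jenkinsfile. It assumes the remote is named origin, reads "up to 3 times" as three push attempts in total, and uses a hypothetical function name. The run parameter exists only so the loop can be exercised without a real repository.

```python
import subprocess


def push_with_retry(branch, max_attempts=3, run=subprocess.run):
    """Push HEAD to origin/<branch>, rebasing after each rejected push.

    Returns the attempt number that succeeded; raises RuntimeError if
    every attempt is rejected.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(["git", "push", "origin", f"HEAD:{branch}"])
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Someone else pushed first: replay the local commit on top of theirs.
            run(["git", "pull", "--rebase", "origin", branch], check=True)
    raise RuntimeError(f"push to {branch} failed after {max_attempts} attempts")
```

Passing check=True on the rebase makes a conflict abort the job immediately instead of looping on a broken working tree.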

The Generator

The Python script does the actual transformation. For each template rule with a severities block, it generates one standard alert rule per severity level:

name becomes alert (the alert name)
Each severity's expr and for become the rule's expression and pending duration
severity label is added automatically (critical, warning, low, info)
severity_order label is added for sorting (1, 2, 3, 4)
Shared annotations and labels from the template are copied to each generated rule
Multi-line PromQL expressions are preserved as YAML block scalars

Rules in the recording-rules/ directory are excluded from generation - those are edited directly since they don't need severity splitting.

The script also handles an enabled: false flag per alert. Engineers can disable a specific alert without deleting it, keeping the definition for reference.

Why This Matters

The generated rules share the same alertname label across severities. This is what makes AlertManager's inhibition work:

inhibit_rules:
  - source_matchers:
      - severity = critical
    target_matchers:
      - severity =~ warning|info|low
    equal:
      - alertname

When "CPU Close to Limits" fires as critical (>95%), AlertManager automatically suppresses the warning (>90%) and low (>80%) for the same alert name. The on-call engineer sees one alert at the highest severity, not three.

Without the template system generating consistent alert names across severities, this inhibition wouldn't work. Engineers would have to manually ensure naming consistency across separately maintained rules.
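
To see why the shared name matters, here is a small simulation of how the inhibit rule above is evaluated. It is a simplified model of the matching semantics (anchored regex matchers plus equality on the listed labels), not AlertManager's implementation. The data structure and function names are made up for the example.

```python
import re

# Mirrors the inhibit rule above: critical mutes warning/info/low
# when both alerts carry the same alertname.
INHIBIT_RULE = {
    "source": {"severity": re.compile(r"critical")},
    "target": {"severity": re.compile(r"warning|info|low")},
    "equal": ["alertname"],
}


def _matches(labels, matchers):
    """True if every matcher fully matches the corresponding label value."""
    return all(
        name in labels and pattern.fullmatch(labels[name]) is not None
        for name, pattern in matchers.items()
    )


def is_inhibited(target, firing_alerts, rule=INHIBIT_RULE):
    """True if some firing source alert mutes `target` under `rule`."""
    if not _matches(target, rule["target"]):
        return False
    return any(
        _matches(source, rule["source"])
        and all(source.get(label) == target.get(label) for label in rule["equal"])
        for source in firing_alerts
    )


critical = {"alertname": "CPU Close to Limits", "severity": "critical"}
warning = {"alertname": "CPU Close to Limits", "severity": "warning"}
unrelated = {"alertname": "Disk Almost Full", "severity": "warning"}

print(is_inhibited(warning, [critical]))    # prints: True
print(is_inhibited(unrelated, [critical]))  # prints: False
```

The second call returns False only because the names differ, which is exactly the consistency the generator guarantees for rules expanded from one template.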

What's Still Not Solved

There's a pattern I keep running into: multiple different warning conditions should be suppressed by a single critical condition. For example, I might have three warning-level alerts checking different aspects of database health, and one critical alert for "database completely down." When the critical fires, all three warnings should be suppressed.

Current inhibition rules match on alertname, so the critical needs the same name as the warnings. But it's a different alert checking a different thing.

I haven't found a clean solution for this yet. Options I'm considering:

A group label that links related alerts across different names
Expanding the template format to support alert families
Using AlertManager's target_matchers with regex patterns

If you've solved this problem, I'd like to hear about it.

The Full Stack

The template system is one piece of a larger alerting pipeline:

Templates define alerts with multiple severities
Jenkins transforms and validates before deployment
Fleet syncs generated rules to vmalert across all clusters
vmalert evaluates rules against VictoriaMetrics
AlertManager handles inhibition, grouping, and routing
OpsGenie receives alerts with priority based on severity and cluster state
Silence Manager auto-suppresses alerts for non-live environments

Each piece is managed via GitOps from a single repository. Template changes go through PR review, get validated by CI, and deploy automatically. No SSH, no manual kubectl, no "I forgot to apply the new rules to cluster 47."

Migrating Existing Rules

If you already have hundreds of standard Prometheus/vmalert rules and want to adopt this template format, rewriting them by hand isn't practical. I built a migration tool that converts existing rules into the template format automatically.

It reads your current rule files, groups alerts by name, detects rules that share the same name but have different severity labels, and merges them into a single template with a severities block. Rules that don't have severity labels get converted to single-severity templates.

python3 convert_to_template.py \
  --input ./existing-rules \
  --output ./templates

This bootstraps the migration. You'll still want to review the output and adjust thresholds, but it saves hours of manual conversion.
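
The grouping step described above can be sketched as follows. This is not the code from convert_to_template.py. It assumes shared labels and annotations are taken from the first rule seen for each name, and that rules without a severity label fall back to a hypothetical default_level, since the text does not say which level they receive.

```python
from collections import defaultdict


def rules_to_templates(rules, default_level="warning"):
    """Merge standard alert rules that share a name into one template each."""
    grouped = defaultdict(list)
    for rule in rules:
        if "alert" in rule:  # recording rules have no 'alert' key; skip them
            grouped[rule["alert"]].append(rule)

    templates = []
    for name, variants in grouped.items():
        first = variants[0]
        template = {
            "name": name,
            "annotations": dict(first.get("annotations", {})),
            # severity labels are regenerated later, so drop them here
            "labels": {
                key: value
                for key, value in first.get("labels", {}).items()
                if key not in ("severity", "severity_order")
            },
            "severities": [],
        }
        for rule in variants:
            severity = {
                "level": rule.get("labels", {}).get("severity", default_level),
                "expr": rule["expr"],
            }
            if "for" in rule:
                severity["for"] = rule["for"]
            template["severities"].append(severity)
        templates.append(template)
    return templates


existing = [
    {"alert": "HighCPU", "expr": "cpu > 95", "for": "5m",
     "labels": {"severity": "critical", "team": "infrastructure"}},
    {"alert": "HighCPU", "expr": "cpu > 90", "for": "5m",
     "labels": {"severity": "warning", "team": "infrastructure"}},
]
templates = rules_to_templates(existing)
print(len(templates), [s["level"] for s in templates[0]["severities"]])
# prints: 1 ['critical', 'warning']
```

Taking shared fields from the first variant is the part most likely to need manual review, since annotations can differ between severities in hand-written rules.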

Open Source

Both the generator and migration tool are available on GitHub: alert-template-generator

For more on the alerting pipeline, OpsGenie routing, and silence automation, see From Health Logs to Grouped OpsGenie Alerts.