Progressive Delivery with GitOps: Safer Deployments Using Argo Rollouts and Flagger

Beyond All-or-Nothing: The Case for Gradual Rollouts

You’ve adopted GitOps. Your infrastructure is declarative, version-controlled, and automatically reconciled. But when it comes to deploying application changes, are you still flipping a switch and hoping for the best?

Progressive delivery bridges this gap. Instead of instant cutover, traffic shifts gradually — 5% → 25% → 100% — with automated checks at every step. If metrics degrade, instant rollback. If health checks pass, automatic promotion. The result: safer deployments without sacrificing velocity.

The Progressive Delivery Stack

At its core, progressive delivery combines three capabilities:

  1. Traffic Shifting — Gradually move users from old to new version
  2. Automated Analysis — Continuously evaluate SLOs and business metrics
  3. Automatic Promotion/Rollback — Decisions based on data, not gut feeling

The two leading implementations in the Kubernetes ecosystem are Argo Rollouts and Flagger. Both integrate with existing GitOps workflows but approach progressive delivery differently.

Argo Rollouts: Native Kubernetes Experience

Argo Rollouts extends the Deployment concept with custom resources. You get canaries, blue-green deployments, and experiments using familiar Kubernetes primitives.

Architecture Overview

┌─────────────────────────────────────────┐
│           Argo Rollouts Controller      │
│  (manages Rollout CRD, traffic shaping) │
├─────────────────────────────────────────┤
│              Service Mesh               │
│    (Istio, Linkerd, NGINX, ALB, SMI)    │
├─────────────────────────────────────────┤
│           Prometheus/OTel               │
│         (metric queries for analysis)   │
└─────────────────────────────────────────┘

Example: Canary Deployment

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-service
spec:
  replicas: 10
  strategy:
    canary:
      canaryService: payment-service-canary
      stableService: payment-service-stable
      trafficRouting:
        istio:
          virtualService:
            name: payment-service-vs
            routes:
            - primary
      steps:
      - setWeight: 5
      - pause: {duration: 10m}
      - setWeight: 20
      - pause: {duration: 10m}
      - analysis:
          templates:
          - templateName: success-rate
      - setWeight: 50
      - pause: {duration: 10m}
      - setWeight: 100
      - analysis:
          templates:
          - templateName: success-rate
          - templateName: latency
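
The trafficRouting block above points at an Istio VirtualService that the controller rewrites as weights change. A minimal sketch of what that resource could look like — the host is illustrative, but the route name and service names must match the Rollout:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: payment-service-vs
spec:
  hosts:
  - payment-service
  http:
  - name: primary          # referenced by trafficRouting.istio.routes
    route:
    - destination:
        host: payment-service-stable
      weight: 100          # Argo Rollouts adjusts these weights at each step
    - destination:
        host: payment-service-canary
      weight: 0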

Analysis Template

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
  - name: success-rate
    interval: 5m
    count: 3
    successCondition: result[0] >= 0.95
    provider:
      prometheus:
        address: http://prometheus:9090
        query: |
          sum(rate(http_requests_total{service="payment-service",status=~"2.."}[5m]))
          /
          sum(rate(http_requests_total{service="payment-service"}[5m]))
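
While a rollout is in progress, the kubectl-argo-rollouts plugin is the easiest way to follow the steps and analysis verdicts. A few commonly used commands, using the service name from the example above:

# Watch steps, weights, and analysis runs live
kubectl argo rollouts get rollout payment-service --watch

# Skip a pause and move to the next step manually
kubectl argo rollouts promote payment-service

# Abort: shift all traffic back to the stable version
kubectl argo rollouts abort payment-service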

Flagger: GitOps-Native Approach

Flagger takes a different approach. Instead of replacing Deployments, it watches them: Flagger generates a primary copy of the workload, shifts traffic between primary and canary through the service mesh or ingress controller, and promotes by copying the canary spec over to the primary once analysis passes.

Architecture Overview

┌─────────────────────────────────────────┐
│                 Flagger                 │
│  (watches Deployments, manages canary)  │
├─────────────────────────────────────────┤
│         Service Mesh / Ingress          │
│ (Istio, Linkerd, NGINX, Gloo, Contour)  │
├─────────────────────────────────────────┤
│         Prometheus/CloudWatch           │
│       (metrics for canary checks)       │
└─────────────────────────────────────────┘

Example: Automated Canary

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: payment-service
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  service:
    port: 8080
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      thresholdRange:
        max: 500
      interval: 1m
    webhooks:
    - name: load-test
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://payment-service-canary/"
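
Flagger reports progress through the Canary status and Kubernetes events, so plain kubectl is enough to follow a run:

# Canary status, current weight, and last transition
kubectl get canaries --all-namespaces

# Detailed events: weight changes, failed metric checks, promotion
kubectl describe canary/payment-service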

Argo Rollouts vs Flagger: Quick Comparison

Aspect              Argo Rollouts                         Flagger
Deployment Model    Replaces Deployment with Rollout CRD  Watches existing Deployments
GitOps Integration  Argo CD native (same project)         Works with any GitOps tool
Traffic Control     Multiple meshes + ALB/NLB             Multiple meshes + ingress controllers
Experimentation     Built-in A/B/n testing                A/B testing via webhooks
Analysis            AnalysisTemplate/AnalysisRun CRDs     Inline metric thresholds
Rollback            Automatic on failed analysis          Automatic on threshold breach

Metric-Driven Promotion

The magic happens when deployment decisions are based on actual system behavior, not time-based guesses.

Key Metrics to Watch

  • Golden Signals: Latency, traffic, errors, saturation
  • Business Metrics: Conversion rates, checkout completion
  • Infrastructure Metrics: CPU, memory, disk I/O

Prometheus Integration Example

# Argo Rollouts: P99 latency check
- name: p99-latency
  interval: 5m
  successCondition: result[0] <= 0.2  # 200 ms; histogram_quantile here returns seconds
  provider:
    prometheus:
      address: http://prometheus.monitoring
      query: |
        histogram_quantile(0.99,
          sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
        )

# Flagger: Error rate check
metrics:
- name: request-success-rate
  thresholdRange:
    min: 99.0
  interval: 1m

Adoption Path: From GitOps to Progressive Delivery

For teams already running Argo CD or Flux, the transition is gradual:

Phase 1: Observability Foundation

  • Ensure metrics are flowing (Prometheus/Grafana operational)
  • Define SLOs and error budgets
  • Set up alerting on key services
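
One convenient way to make an SLO queryable is a Prometheus recording rule that precomputes the success ratio your later analysis templates will consume. A sketch reusing the payment-service query from above (the group and rule names are illustrative):

groups:
- name: payment-service-slo
  rules:
  - record: service:http_requests:success_ratio_5m
    expr: |
      sum(rate(http_requests_total{service="payment-service",status=~"2.."}[5m]))
      /
      sum(rate(http_requests_total{service="payment-service"}[5m]))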

Phase 2: First Canary

  • Pick a non-critical service with good metrics coverage
  • Install Argo Rollouts or Flagger controller
  • Convert one Deployment to a Rollout or Canary, keeping the blast radius to a single team
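
With Argo Rollouts, the conversion does not have to mean moving the pod template: a Rollout can reference an existing Deployment via workloadRef, which keeps the Git diff small. A sketch, assuming a Deployment named payment-service already exists:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-service
spec:
  replicas: 10
  workloadRef:             # adopt the pod template from the Deployment
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 10m}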

Phase 3: Expand Coverage

  • Roll out to more services
  • Refine analysis templates based on learnings
  • Add automated load testing in canary phase

Phase 4: Advanced Patterns

  • A/B/n testing for feature validation
  • Multi-region progressive rollouts
  • Chaos engineering integration

Integration with Argo CD

Argo Rollouts shines here because it's part of the same ecosystem:

# Application manifest with Rollout
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-service
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/org/gitops-repo
    targetRevision: HEAD
    path: apps/payment-service
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

The Rollout resource is just another Kubernetes object — Argo CD manages it like any Deployment.

Common Pitfalls and How to Avoid Them

Insufficient Metrics Coverage

Problem: Canary proceeds based on partial data.
Solution: Require minimum metric samples before promotion decision.
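
In Argo Rollouts terms, this means leaning on the sampling fields of an analysis metric rather than trusting a single measurement. A sketch of the relevant knobs, applied to the success-rate metric from earlier:

metrics:
- name: success-rate
  interval: 1m
  count: 5                 # take five samples before the final verdict
  failureLimit: 1          # tolerate one bad sample, fail on the second
  successCondition: result[0] >= 0.95
  provider:
    prometheus:
      address: http://prometheus:9090
      query: |
        sum(rate(http_requests_total{service="payment-service",status=~"2.."}[5m]))
        /
        sum(rate(http_requests_total{service="payment-service"}[5m]))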

Overly Aggressive Traffic Shifts

Problem: 50% traffic jump exposes too many users to issues.
Solution: Use smaller steps (5% → 10% → 25% → 50% → 100%).

Ignoring Cold Start Effects

Problem: New pods show artificially high latency initially.
Solution: Add warmup period or exclude initial metrics from analysis.
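
In Argo Rollouts this maps to the initialDelay field on an analysis metric, which defers the first sample until the canary pods have warmed up. A sketch combining it with the latency check above:

metrics:
- name: p99-latency
  initialDelay: 2m         # skip the warmup window entirely
  interval: 1m
  count: 3
  successCondition: result[0] <= 0.2
  provider:
    prometheus:
      address: http://prometheus.monitoring
      query: |
        histogram_quantile(0.99,
          sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
        )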

When to Choose Which

Choose Argo Rollouts if:

  • You're already using Argo CD
  • You want tight integration with your GitOps workflow
  • You need sophisticated experimentation (A/B/n testing)

Choose Flagger if:

  • You use Flux or another GitOps tool
  • You prefer keeping native Deployments
  • You want simpler, less invasive setup

Conclusion

Progressive delivery isn't just a safety net — it's a competitive advantage. Teams that deploy confidently multiple times per day recover faster from incidents, validate features with real traffic, and reduce the blast radius of bad changes.

The tooling is mature, the patterns are proven, and the integration with existing GitOps workflows is seamless. Whether you choose Argo Rollouts or Flagger, the important step is starting: pick a service, set up your first canary, and let data drive your deployment decisions.


GitOps gave us declarative infrastructure. Progressive delivery gives us declarative confidence in our deployments.