AI in the CI/CD loop: let the gates do the trusting
- Kaur Kallas
- AI
- June 18, 2026
AI coding agents are good enough now that writing the code is no longer the bottleneck — trusting what gets shipped is. “It compiles” and “it looks right to the model” are not quality gates. If you let an agent merge and deploy on its own judgement, you’ve quietly replaced code review and testing with vibes. The fix isn’t to keep AI out of production. It’s to make it pass exactly the same gates a human would.
The gates are the trust boundary — not the AI
Here’s the mental shift that makes this work: the agent proposes, the pipeline decides.
GitOps already gives you a deterministic, auditable trust boundary. Git is the source of truth, CI is the test gate, Argo CD reconciles the cluster to what’s declared, and Kargo promotes changes through environments behind verification. An AI agent is just a very fast new contributor that has to clear those same gates. It never gets a “trust me” exemption, and it never grades its own homework.
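As a rough illustration of "the agent proposes, the pipeline decides", here is a minimal Python sketch. Every name in it (`GateResult`, `Proposal`, `decide`) is hypothetical and not part of Argo CD or Kargo. The only point is that the decision reads gate results and never looks at who wrote the change or how confident they claim to be.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GateResult:
    name: str     # illustrative gate name, e.g. "ci-tests" or "stage-verification"
    passed: bool


@dataclass(frozen=True)
class Proposal:
    author: str                # "human" or "agent"; kept for the audit trail only
    commit: str
    self_assessment: str = ""  # whatever the author claims; never consulted


def decide(proposal: Proposal, gates: List[GateResult]) -> bool:
    """Promote only if at least one gate ran and every gate passed.

    proposal.author and proposal.self_assessment are deliberately not read:
    the verdict depends on the gates alone.
    """
    return bool(gates) and all(g.passed for g in gates)
```

An empty gate list counts as a refusal here, on the assumption that "no gates ran" should never be read as "all gates passed".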
What the loop actually looks like
I run the agent — Claude — directly against Git, and drive it through ChatOps. The loop looks like this:
- Local fast feedback. Where it’s cheap and quick, the agent builds and runs tests locally first — lint, unit tests, a throwaway kind cluster — and fixes the obvious things before anyone else has to look.
- Push to Git. The change lands as a branch and commit, same as any contributor. Now the real gates take over.
- CI is the real test gate. The pipeline builds the image and runs the full suite. The agent’s local run was a courtesy; CI is the verdict.
- GitOps picks it up. Argo CD reconciles, or Kargo’s Warehouse spots the new image and mints Freight, which is then promoted stage by stage, each step behind verification.
- End-to-end check. The agent then looks at the running system: kubectl for rollout and resource status, Grafana and metrics for the golden signals, traces for request health. Increasingly this goes through a kubectl / observability MCP, so it can interrogate the cluster the way I would.
- Iterate or promote. Gate red, the agent sees the failure and goes again. Gate green, promotion continues, with a human approval at the production boundary.
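The steps above can be sketched as a small control loop. This is a simplified model under stated assumptions, not how Argo CD or Kargo are actually wired: `run_loop`, `propose`, `approve_prod` and the gate callables are all hypothetical stand-ins for the agent, the pipeline stages and the human approver.

```python
from typing import Callable, List, Tuple

# A gate takes a commit id and returns (passed, detail). In practice these
# would wrap CI, stage verification and rollout checks; here they are
# plain callables so the control flow is easy to see.
Gate = Callable[[str], Tuple[bool, str]]


def run_loop(
    propose: Callable[[List[str]], str],  # agent: failure history -> new commit id
    gates: List[Tuple[str, Gate]],        # ordered (name, gate) pairs
    approve_prod: Callable[[str], bool],  # human approval at the production boundary
    max_attempts: int = 3,                # assumed cap so the agent cannot loop forever
) -> str:
    if not gates:
        raise ValueError("refusing to run without any gates")
    failures: List[str] = []
    for _ in range(max_attempts):
        commit = propose(failures)
        for name, gate in gates:
            passed, detail = gate(commit)
            if not passed:
                # Gate red: record the failure and let the agent try again.
                failures.append(f"{name}: {detail}")
                break
        else:
            # Every gate is green; production still needs a human.
            return "promoted" if approve_prod(commit) else "awaiting-approval"
    return "gave-up"
```

The agent only ever sees the failure details the gates hand back, and the only path to "promoted" runs through every gate plus the human approval.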
ChatOps is the interface: I manage the whole thing conversationally, but every action flows through Git and the pipeline. Auditable, reversible, no side-channel to prod.
Why it’s safe — and why it’s fast
The speed comes from the agent compressing the inner loop — write, test locally, fix, push — into minutes. The safety comes from determinism: tests, progressive delivery, verification and observability that don’t care whether a human or a model produced the change. The AI never decides something is good. The gates do.
That’s also why this matters most in the environments people assume AI can’t go near — regulated banking and health-tech. You don’t earn trust there by trusting the model; you earn it by making the model auditable and gating it behind the same controls as everyone else.
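The verification half of that determinism can be as plain as fixed thresholds on the golden signals. The sketch below is illustrative only: the `Signals` and `Thresholds` types and the default limits are assumptions, and real limits would come from your own SLOs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float   # 99th percentile request latency
    saturation: float       # fraction of capacity in use, 0.0 to 1.0


@dataclass(frozen=True)
class Thresholds:
    # Illustrative defaults only; not recommendations.
    max_error_rate: float = 0.01
    max_p99_latency_ms: float = 500.0
    max_saturation: float = 0.8


def is_healthy(s: Signals, t: Thresholds = Thresholds()) -> bool:
    """Same inputs, same verdict, regardless of who authored the change."""
    return (
        s.error_rate <= t.max_error_rate
        and s.p99_latency_ms <= t.max_p99_latency_ms
        and s.saturation <= t.max_saturation
    )
```

Nothing in the function knows whether a human or a model produced the rollout it is judging, which is what makes the verdict auditable.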
It only works if the gates are real
The uncomfortable corollary: this is only ever as good as your gates. An AI shipping through a pipeline with thin tests and no verification just lets you ship bad code faster. Before you put an agent in the loop, the pipeline has to be worth trusting:
- Tests and verification that genuinely catch regressions.
- Progressive delivery so a bad change is contained, not global — Kargo handles those promotion gates.
- Observability good enough to answer “is it healthy?” automatically. eBPF-based auto-instrumentation such as OBI (OpenTelemetry eBPF Instrumentation) gets you there without instrumenting everything by hand.
- Tight guardrails on the agent itself: scoped credentials, no force-pushing past gates, human approval at prod.
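The agent guardrails in the last bullet can be written down as a deny-by-default policy. This is a hypothetical sketch: `AgentAction`, `is_allowed` and the single protected branch are assumptions for illustration. In practice I would expect these rules to be enforced by the Git host's branch protection and by credential scopes, not by code the agent itself runs.

```python
from dataclasses import dataclass

# Assumption for illustration: a single protected branch.
PROTECTED_BRANCHES = frozenset({"main"})


@dataclass(frozen=True)
class AgentAction:
    kind: str                     # "push", "force_push", or "promote"
    target: str                   # branch name for pushes, stage name for promotions
    human_approved: bool = False  # set only by a human approval step


def is_allowed(action: AgentAction) -> bool:
    if action.kind == "force_push":
        return False  # no force-pushing past gates, ever
    if action.kind == "push":
        # The agent works on its own branches, like any other contributor.
        return action.target not in PROTECTED_BRANCHES
    if action.kind == "promote":
        # Lower stages promote on green gates; prod also needs a human.
        return action.target != "prod" or action.human_approved
    return False  # unknown action kinds are denied by default
```

Denying unknown action kinds keeps the policy safe when someone gives the agent a new capability and forgets to update the rules.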
Get that right and an AI agent stops being a risk you’re nervous about and becomes the fastest, most tireless contributor on the team — one that still has to earn every promotion.
Curious what an AI-assisted, properly-gated delivery loop looks like on your platform? Get in touch.