Drift Monitoring

This page describes how to run tazuna state diff and tazuna state drift periodically so that drift becomes visible. For the command specs, see tazuna state diff and tazuna state drift; for the contents of State, see Internal Structure of State.

Two kinds of drift

Tazuna can detect two kinds of drift, which differ in what is compared against State.

Name	What it diffs	Command used for detection
Declared drift	Resources that should be generated from `tazuna.yaml` vs State	`tazuna state diff`
Live drift	State vs the actual objects on the live cluster	`tazuna state drift`

Declared drift captures “you updated tazuna.yaml but haven’t applied it” and “you removed a Manifest but it still remains on the cluster.” Live drift captures “you didn’t change tazuna.yaml, but someone ran kubectl apply directly” and “it was deleted by hand on the cluster side.”

What we call drift (declared drift)

Drift here is the difference between the resource set that should be generated from tazuna.yaml (the Build result) and the resource set recorded in the in-cluster State. This corresponds exactly to the output of tazuna state diff.

Diff type	Cases Detected	Typical example of drift
`added`	Present in the Build result, absent from State	Updated `tazuna.yaml` to add a Manifest, but it has not yet been applied
`modified`	Present in both, but with different content	Helm values change, kustomize overlay change, image tag update not yet reflected
`removed`	Present in State, absent from the Build result	Removed a Manifest from `tazuna.yaml`, but the resource is still in the cluster
`always-sync`	Always treated as synchronized	Secrets originating from GenesisSecret. Not drift but “places to check every time.”

tazuna state diff does not look at the cluster’s actual state. Resources created by running kubectl apply directly against the cluster are not recorded in State, so they are not detected here; Tazuna treats them as outside its management.

Output Format

tazuna state diff emits output like the following on a per-Manifest basis.

Manifest: ingress-nginx
  STATUS         RESOURCE                                                     HASH
  modified       ingress-nginx/apps/v1/Deployment/ingress-nginx/controller    abc123... -> def456...

Manifest: aws-credentials
  STATUS         RESOURCE                                                     HASH
  always-sync    aws-credentials//v1/Secret/default/aws-credentials           xyz789...

If there are no differences, only the following single line is emitted.

No changes detected.

Today, the most straightforward way to confirm that there is no drift is to check whether the output contains the line No changes detected. tazuna state diff does not change its exit code based on whether differences exist, because differences are not treated as an error.
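
Because the exit code does not say whether differences exist, a monitoring script has to combine the exit status with the output text. The following is a minimal sketch of that decision. The function name classify_diff is hypothetical, and treating a non-zero exit status as a command failure (for example, a connection error) is an assumption rather than documented behavior.

```shell
# classify_diff FILE STATUS
#   FILE:   saved output of `tazuna state diff`
#   STATUS: exit status of that command
# Prints one of: error, clean, drift.
# Assumption: a non-zero exit status means the command itself failed.
classify_diff() {
  if [ "$2" -ne 0 ]; then
    echo error
  # -F matches the text literally, so the trailing "." is not a regex wildcard.
  elif grep -qF 'No changes detected.' "$1"; then
    echo clean
  else
    echo drift
  fi
}
```

For example, run tazuna state diff -f tazuna.yaml > diff.txt, then call classify_diff diff.txt $? on the next line. Keeping the error case separate avoids reporting a failed run as drift.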

Shapes of Monitoring

In practice, drift monitoring takes one of the following shapes, or a combination of them.

a. Run a CI Job Periodically

Run tazuna state diff a few times a day with GitHub Actions’ schedule and save the output.

Pro: Reuses existing CI credentials. Easy to post to Slack or similar when differences appear.
Con: Cluster connection info must be brought into CI. Not suitable for short intervals.

Points to note:

The job only needs read access to the cluster (tazuna state diff does not modify the cluster).
Dump the output to a file with tazuna state diff -f path/to/tazuna.yaml > diff.txt and only send a notification when it does not contain No changes detected., which eliminates noise during quiet periods.

b. Run as an In-cluster Job

Build a container image including the tazuna binary and run it periodically as a CronJob.

Pro: Authentication is confined to a ServiceAccount. Easy to use short intervals.
Con: You need to build and distribute the image. The job side also needs access to the same tazuna.yaml repository as CI.

If you distribute the full tazuna.yaml set as an OCI artifact via type: oras, the job side does not need to clone the repository. Combined with tazuna apply --offline, the registry also becomes unnecessary.

Wiring Up Notifications

A notification should carry the following three pieces of information.

Which Manifest has differences
Which Diff type it is (removed deserves special attention)
Which resource it is (in State key form)

The State key format is fixed as manifest/group/version/kind/namespace/name (cluster-scoped resources omit namespace), so grep-based post-processing is sufficient. See Internal Structure of State - State key for details.
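
As an illustration of such post-processing, the sketch below splits each resource row of the tazuna state diff output into tab-separated fields. The function name parse_state_keys and the chosen output columns are hypothetical. It assumes the row layout shown under Output Format (status in the first column, State key in the second) and that no key component contains a slash.

```shell
# parse_state_keys [FILE...]
# Reads `tazuna state diff` output and prints, per resource row:
#   status <TAB> manifest <TAB> kind <TAB> namespace <TAB> name
# Cluster-scoped resources (5-part keys) get "-" as the namespace.
parse_state_keys() {
  awk '
    {
      # A single-character separator keeps empty fields, so the empty
      # group of core resources ("manifest//v1/...") still counts.
      n = split($2, k, "/")
      if (n == 6)      { ns = k[5]; name = k[6] }
      else if (n == 5) { ns = "-";  name = k[5] }
      else             { next }   # headers, blank lines, summary lines
      printf "%s\t%s\t%s\t%s\t%s\n", $1, k[1], k[4], ns, name
    }
  ' "$@"
}
```

For example, parse_state_keys diff.txt | awk -F'\t' '$1 == "removed"' lists only the removed resources.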

Minimal notification prototype:

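# Post the diff to Slack unless the output reports no differences.
# grep -F matches the text literally, so the trailing "." is not a wildcard.
if ! tazuna state diff -f tazuna.yaml | tee diff.txt | grep -qF "No changes detected."; then
  curl -X POST -H 'Content-Type: application/json' "$SLACK_WEBHOOK_URL" --data "$(jq -Rs '{text: .}' < diff.txt)"
fi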

jq -Rs '{text: .}' is the standard idiom for wrapping the contents of diff.txt as a raw string into the {"text": "..."} JSON format expected by Slack’s Incoming Webhook (-R reads raw input, -s slurps all lines into a single string).

Responding to Detection

When drift appears, the response depends on whether the change was intentional.

The change was intentional: run tazuna apply to bring the cluster and State in line with tazuna.yaml (add --sync to apply only the diff).
The change was unintentional:
- modified: use git log or the cluster audit log to find out who changed it and when, then decide whether to roll back or absorb the change into tazuna.yaml.
- added: most often this is a Manifest that was added to tazuna.yaml but not yet applied. Either apply it or revert tazuna.yaml, depending on intent.
- removed: a Manifest was removed from tazuna.yaml but the resource still exists in the cluster. Clean it up with tazuna destroy narrowed by --tags, or with tazuna apply --sync --prune.
GenesisSecret’s always-sync: this is not drift, so it is fine to exclude it from notifications.
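
One way to apply these rules before notifying is sketched below: it drops always-sync rows and lists removed rows first, since those deserve the most attention. The function name notifiable_rows is hypothetical, and the sketch assumes the row layout shown under Output Format.

```shell
# notifiable_rows FILE
# Prints the resource rows of a saved `tazuna state diff` output that are
# worth a notification: removed rows first, then added and modified rows.
# always-sync rows, headers, and blank lines are dropped.
# Exit status is 1 when no row qualifies, 0 otherwise.
notifiable_rows() {
  awk '
    $1 == "removed"                   { removed[++r] = $0 }
    $1 == "added" || $1 == "modified" { other[++o] = $0 }
    END {
      for (i = 1; i <= r; i++) print removed[i]
      for (i = 1; i <= o; i++) print other[i]
      exit (r + o == 0)
    }
  ' "$1"
}
```

Because the exit status reflects whether anything remains, a call such as notifiable_rows diff.txt > notify.txt can serve directly as the condition of an if statement that decides whether to send the notification.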

Detecting live drift

Whereas tazuna state diff looks only at “Build result vs State,” tazuna state drift compares “State vs the live cluster.” Even when you have not changed tazuna.yaml, it can detect resources rewritten by hand with kubectl apply or removed with kubectl delete.

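# Example: run this from a scheduler every 30 minutes to check live drift and post to Slack
# grep -F matches the text literally, so the trailing "." is not a wildcard.
if ! tazuna state drift -f tazuna.yaml | tee drift.txt | grep -qF "No drift detected."; then
  curl -X POST -H 'Content-Type: application/json' "$SLACK_WEBHOOK_URL" --data "$(jq -Rs '{text: .}' < drift.txt)"
fi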

There are two output categories: live-drifted (hash mismatch) and live-missing (gone from the cluster).
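
A notification can lead with a one-line summary of these two categories. The sketch below counts them in a saved drift.txt. The function name summarize_live_drift is hypothetical. Because the row layout of tazuna state drift is not shown on this page, the sketch assumes only that each affected resource appears on its own line and that the line contains the category name literally.

```shell
# summarize_live_drift FILE
# Counts lines per category in a saved `tazuna state drift` output.
# Assumption: one line per affected resource, containing the category name.
summarize_live_drift() {
  # grep -c prints 0 but exits non-zero when nothing matches;
  # "|| true" keeps the function usable under `set -e`.
  drifted=$(grep -cF 'live-drifted' "$1" || true)
  missing=$(grep -cF 'live-missing' "$1" || true)
  printf 'live-drifted=%s live-missing=%s\n' "$drifted" "$missing"
}
```

The summary line can be prepended to the Slack message text so that readers see the scale of the drift before the full listing.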

Monitoring both live drift and declared drift is the recommended setup. Live drift surfaces manual changes and mistakes by cluster operators, while declared drift surfaces changes that the GitOps pipeline has not delivered to the cluster. Watching both keeps the two kinds of problem distinguishable.

Command spec: tazuna state diff / tazuna state drift / tazuna apply
Internal structure of State: Internal Structure of State
Terminology: Diff type / always-sync
