fix(cutover 0.1.20): Step-06 pushes YAML edit to local Gitea so patches survive Flux reconcile (#970) (#971)
## Root cause (live on otech116 2026-05-05 14:38)

After the #968 fix shipped (0.1.19), the cutover engine reached Step-07 (87%) successfully: Step-01..07 all completed. Then Step-08 (egress-block-test) caught that 38/38 HelmRepositories had reverted to upstream:

```
external HelmRepositories still pointing at ghcr.io/openova-io: 38
OFFENDER flux-system/bp-cilium=oci://ghcr.io/openova-io
... (37 more)
FAIL — at least one HelmRepository did not pivot
```

But Step-06's job logs say:

```
[helmrepository-patches] OK bp-cilium -> oci://harbor.otech116.omani.works/openova-io
... (37 more OK)
ok=38 skip=0 fail=0
```

So Step-06 thought it succeeded, and it had, momentarily. But then the bootstrap-kit Kustomization (which had successfully pivoted to local Gitea via Step-05) reconciled its YAML from local Gitea, where the YAML still declared `url: oci://ghcr.io/openova-io`. Within ~30s every kubectl patch was undone. The cutover engine then aborted at Step-08 verification.

## Fix

Step-06 now runs in two phases:

1. **Live K8s patches** (existing behaviour): flips spec.url on every HelmRepository immediately. Useful for the cluster between cutover and the next reconcile.
2. **NEW: Push YAML edit to local Gitea**: clones `openova/openova` from the local Gitea over basic-auth, sed-rewrites every `clusters/_template/bootstrap-kit/*.yaml` declaration of `url: oci://ghcr.io/openova-io` → `oci://harbor.<sov-fqdn>/openova-io`, commits with a clear message, and pushes back. Subsequent reconciles see local Harbor as the steady-state.

After the push, the script annotates the `flux-system/openova` GitRepository to trigger immediate reconciliation so the new YAML lands without waiting for the polling interval.

## Image change

Step-06 image bumped from `bitnami/kubectl:1.31.4` to `alpine/k8s:1.31.4` because the new phase needs both `kubectl` and `git` in one image (verified live on otech116: both binaries present).
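As a rough illustration of the Phase 2 rewrite described under Fix, the sketch below applies the same grep/sed pattern to a scratch directory. The helper name `rewrite_helmrepo_urls`, the `harbor.sov.example` host and the sample file are assumptions made for illustration only; the real script works on a clone of the local Gitea repository and then commits and pushes.

```sh
#!/bin/sh
# Sketch only: rewrite_helmrepo_urls is a hypothetical helper, not part of the chart.
# It rewrites `url: <upstream>` to `url: <local>` in every *.yaml under a
# directory and prints how many files were edited.
# Assumes a sed that accepts `-i -E` (GNU or busybox).
rewrite_helmrepo_urls() {
  dir="$1"
  upstream="$2"
  local_prefix="$3"
  edited=0
  for f in "${dir}"/*.yaml; do
    [ -f "${f}" ] || continue
    # Only touch files whose url line is exactly the upstream prefix.
    if grep -qE "^[[:space:]]*url:[[:space:]]*${upstream}[[:space:]]*$" "${f}"; then
      sed -i -E "s,^([[:space:]]*url:[[:space:]]*)${upstream}([[:space:]]*)$,\1${local_prefix}\2," "${f}"
      edited=$((edited+1))
    fi
  done
  echo "${edited}"
}

# Demo against a throwaway directory (sample content is made up).
demo="$(mktemp -d)"
printf 'spec:\n  url: oci://ghcr.io/openova-io\n' > "${demo}/bp-cilium.yaml"
rewrite_helmrepo_urls "${demo}" "oci://ghcr.io/openova-io" "oci://harbor.sov.example/openova-io"
grep 'url:' "${demo}/bp-cilium.yaml"
rm -rf "${demo}"
```

Because the pattern is anchored to the whole `url:` line, a second run finds nothing to edit, which is why a zero-edit result can reasonably be treated as "already pivoted" rather than as a failure.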
## Acceptance gate

Test case 16 added to cutover-contract.sh. It guards against future regressions that remove the `git clone`, the `git push origin main`, or the `clusters/_template/bootstrap-kit` target dir reference.

## Live verification

Will fire on otech117 (next provision). Expected:

- Step-06 logs `cloning gitea-http.gitea.../openova/openova.git` then `pushed to ...`
- Step-08 verify PASSES (38/38 HelmRepositories pivoted in K8s + Gitea)
- self-sovereign-cutover-status `cutoverComplete: "true"`
- Egress block to ghcr.io safely activates

Co-authored-by: e3mrah <ebaysal@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 9ed579d4ba
commit 608db53a25
@@ -184,11 +184,27 @@ spec:
 # one DNS miss was terminal and the cutover engine aborted all
 # 8 steps. Fix is dual: (a) catalyst-api now stamps Jobs with
 # `backoffLimit=3` so a single miss is recoverable; (b) Step-01
-# bash script gains an explicit `nslookup` readiness loop (30 ×
+# bash script gains an explicit `nslookup` readiness loop (30 x
 # 5s) at the top, before any wget call. Both layers are needed —
 # the in-script probe is fastest; the backoffLimit is the
 # safety net for any other transient pre-cluster-stable race.
-version: 0.1.19
+# 0.1.20: Step-06 helmrepository-patches reverted by Flux (#970).
+# 0.1.19 unblocked the cutover through Step-7, but Step-08
+# verify caught 38/38 HelmRepositories had reverted to
+# oci://ghcr.io/openova-io despite Step-06's job logs showing
+# `OK ${name} -> oci://harbor.<sov-fqdn>/openova-io` for each.
+# Root cause: Step-06 only `kubectl patch`ed the live K8s
+# objects; bootstrap-kit Kustomization reconciled YAML from
+# local Gitea every 1m, where the YAML still declared the
+# upstream URL, undoing each patch within ~30s. Fix: Step-06
+# now does both phases — (a) live kubectl patches as before,
+# then (b) clones local Gitea, sed-rewrites every
+# clusters/_template/bootstrap-kit/*.yaml declaration of
+# `url: oci://ghcr.io/openova-io` → local Harbor prefix,
+# commits, and pushes. Subsequent reconciles see local Harbor
+# as steady-state. Image bumped to alpine/k8s:1.31.4 (kubectl
+# + git in one image; verified live on otech116).
+version: 0.1.20
 sourceRef:
 kind: HelmRepository
 name: bp-self-sovereign-cutover

@@ -1,6 +1,6 @@
 apiVersion: v2
 name: bp-self-sovereign-cutover
-version: 0.1.19
+version: 0.1.20
 description: |
 Catalyst Self-Sovereignty Cutover Blueprint. Installs DORMANT — this
 chart ships eight step ConfigMaps (PodSpec ConfigMaps, one per step),

@@ -29,7 +29,9 @@ data:
 activeDeadlineSeconds: {{ .Values.stepTimeouts.helmRepositoryPatchesSeconds }}
 containers:
 - name: helmrepository-patches
-image: {{ include "bp-self-sovereign-cutover.image" (dict "repository" .Values.images.kubectl.repository "tag" .Values.images.kubectl.tag "cutoverPhase" "post" "Values" .Values) }}
+image: alpine/k8s:1.31.4 # ships kubectl + git so we can both
+# patch the live K8s object AND push
+# the YAML edit to local Gitea (#970).
 imagePullPolicy: IfNotPresent
 env:
 - name: HELMREPO_NAMESPACE
@@ -38,10 +40,30 @@ data:
 value: {{ .Values.helmRepositories.upstreamPrefix | quote }}
 - name: HARBOR_PUBLIC_URL
 value: {{ .Values.sovereign.harborPublicURL | quote }}
+- name: SOVEREIGN_FQDN
+value: {{ .Values.sovereign.fqdn | quote }}
+- name: GITEA_INTERNAL_URL
+value: {{ .Values.sovereign.giteaInternalURL | quote }}
+- name: GITEA_USERNAME
+valueFrom:
+secretKeyRef:
+name: {{ .Values.gitea.adminSecretRef.name }}
+key: {{ .Values.gitea.adminSecretRef.usernameKey }}
+- name: GITEA_PASSWORD
+valueFrom:
+secretKeyRef:
+name: {{ .Values.gitea.adminSecretRef.name }}
+key: {{ .Values.gitea.adminSecretRef.passwordKey }}
+- name: GITEA_ORG
+value: {{ .Values.gitea.org | quote }}
+- name: GITEA_REPO
+value: {{ .Values.gitea.repo | quote }}
 volumeMounts:
 - name: helmrepository-list
 mountPath: /work
 readOnly: true
+- name: tmp
+mountPath: /tmp
 command: ["/bin/sh", "-c"]
 args:
 - |
@@ -51,6 +73,7 @@ data:

 echo "[helmrepository-patches] upstream=${UPSTREAM_PREFIX} -> local=${local_prefix}"

+# ---- Phase 1: live K8s patches (immediate) ----
 ok=0
 skip=0
 fail=0
@@ -90,11 +113,89 @@ data:
 fi
 done < /work/helmrepository-names.txt

-echo "[helmrepository-patches] ok=${ok} skip=${skip} fail=${fail}"
-[ "${fail}" -eq 0 ]
+echo "[helmrepository-patches] live-K8s ok=${ok} skip=${skip} fail=${fail}"
+[ "${fail}" -eq 0 ] || exit 1
+
+# ---- Phase 2: push YAML edit to local Gitea (#970) ----
+#
+# The kubectl patches above flip the live HelmRepository
+# objects. But the bootstrap-kit Kustomization reconciles
+# YAML from the Sovereign's local Gitea every minute, and
+# those YAML files still declare `url: oci://ghcr.io/openova-io`.
+# Without this phase, Flux reverts every patch within ~1m,
+# and Step-08's verify catches the regression as "OFFENDER".
+#
+# Fix: clone the local Gitea, sed-rewrite every
+# clusters/_template/bootstrap-kit/*.yaml that declares the
+# upstream URL, commit, push. Subsequent reconciles pick up
+# the local Harbor URL as the steady-state.
+export HOME=/tmp
+git config --global user.name "self-sovereign-cutover"
+git config --global user.email "cutover@${SOVEREIGN_FQDN}"
+git config --global advice.detachedHead false
+
+# gitea-http is headless (#968); wait for DNS just in case.
+gitea_host="$(printf '%s' "${GITEA_INTERNAL_URL}" | sed -E 's|^https?://||' | cut -d: -f1 | cut -d/ -f1)"
+for i in $(seq 1 30); do
+  if nslookup "${gitea_host}" >/dev/null 2>&1; then break; fi
+  sleep 5
+done
+
+# URL with embedded basic auth — credential goes to git via
+# URL only, never echoed to stdout.
+push_url=$(printf '%s' "${GITEA_INTERNAL_URL}" | sed -E "s,^(https?://),\1${GITEA_USERNAME}:${GITEA_PASSWORD}@,")"/${GITEA_ORG}/${GITEA_REPO}.git"
+redacted=$(printf '%s' "${GITEA_INTERNAL_URL}/${GITEA_ORG}/${GITEA_REPO}.git")
+
+echo "[helmrepository-patches] cloning ${redacted}"
+cd /tmp
+rm -rf repo
+git clone --depth 1 --branch main "${push_url}" repo >/dev/null 2>&1
+cd repo
+
+# Find every HelmRepository YAML that declares the upstream
+# URL under clusters/_template/bootstrap-kit/. Rewrite to
+# the local Harbor prefix in-place.
+target_dir="clusters/_template/bootstrap-kit"
+if [ ! -d "${target_dir}" ]; then
+  echo "[helmrepository-patches] FATAL: ${target_dir} not present in local mirror" >&2
+  exit 1
+fi
+edited=0
+for f in $(grep -lE "^[[:space:]]*url:[[:space:]]*${UPSTREAM_PREFIX}[[:space:]]*$" "${target_dir}"/*.yaml 2>/dev/null || true); do
+  sed -i -E "s,^([[:space:]]*url:[[:space:]]*)${UPSTREAM_PREFIX}([[:space:]]*)$,\1${local_prefix}\2," "${f}"
+  edited=$((edited+1))
+  echo "[helmrepository-patches] edited ${f}"
+done
+echo "[helmrepository-patches] sed edited ${edited} files"
+
+if [ "${edited}" -eq 0 ]; then
+  echo "[helmrepository-patches] no edits — already pivoted in Gitea or upstream prefix not present"
+  # Don't fail; phase-1 already succeeded.
+  exit 0
+fi
+
+git add "${target_dir}"
+if git diff --staged --quiet; then
+  echo "[helmrepository-patches] git diff empty after sed — nothing to commit"
+  exit 0
+fi
+git commit -m "cutover: pivot ${edited} HelmRepository URLs to local Harbor" >/dev/null
+git push origin main >/dev/null 2>&1 || {
+  echo "[helmrepository-patches] FATAL: git push failed" >&2
+  exit 1
+}
+echo "[helmrepository-patches] pushed to ${redacted} (commit will reconcile via bootstrap-kit Kustomization)"
+
+# Trigger an immediate Flux reconciliation so the new YAML
+# lands without waiting for the polling interval.
+kubectl annotate --overwrite gitrepository openova \
+  -n flux-system \
+  "reconcile.fluxcd.io/requestedAt=$(date +%s)" >/dev/null || true
+
+echo "[helmrepository-patches] step complete"
 resources:
-requests: { cpu: 50m, memory: 64Mi }
-limits: { memory: 256Mi }
+requests: { cpu: 50m, memory: 128Mi }
+limits: { memory: 384Mi }
 securityContext:
 runAsNonRoot: true
 runAsUser: 1001
@@ -109,3 +210,5 @@ data:
 items:
 - key: helmrepository-names.txt
 path: helmrepository-names.txt
+- name: tmp
+emptyDir: {}

@@ -267,4 +267,28 @@ if ! grep -q 'gitea_host=' "$TMP/render.yaml"; then
 fi
 echo " PASS (Step-01 has DNS readiness probe)"

+echo "[cutover-contract] Case 16: Step-06 helmrepository-patches pushes YAML edit to local Gitea (#970)"
+# 0.1.19 Step-06 only ran kubectl patch against live HelmRepository
+# objects. bootstrap-kit Kustomization reconciled YAML from local
+# Gitea every 1m and reverted each patch within ~30s — Step-08
+# verify caught 38/38 OFFENDERS (caught live on otech116 2026-05-05).
+#
+# 0.1.20 Step-06 has a Phase-2 that clones local Gitea, sed-rewrites
+# every clusters/_template/bootstrap-kit/*.yaml that declares the
+# upstream URL, commits, and pushes. This gate guards against future
+# regressions that drop the git-push.
+if ! grep -q 'git clone' "$TMP/render.yaml"; then
+  echo "FAIL: Step-06 missing git clone (no Gitea push — patches will be reverted by Flux) (#970)" >&2
+  exit 1
+fi
+if ! grep -q 'git push origin main' "$TMP/render.yaml"; then
+  echo "FAIL: Step-06 missing git push origin main (#970)" >&2
+  exit 1
+fi
+if ! grep -q 'clusters/_template/bootstrap-kit' "$TMP/render.yaml"; then
+  echo "FAIL: Step-06 missing target_dir reference (#970)" >&2
+  exit 1
+fi
+echo " PASS (Step-06 pushes YAML edit to local Gitea)"
+
 echo "[cutover-contract] All gates green."
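
The three Case 16 checks above can also be tried in isolation against any rendered manifest. The following is a minimal sketch under stated assumptions: `check_step06_contract` is a hypothetical helper name, and the sample file stands in for the `render.yaml` produced by the contract script.

```sh
#!/bin/sh
# Sketch only: check_step06_contract is a hypothetical helper name.
# Returns 0 and prints PASS when the rendered manifest contains all three
# Phase-2 markers; otherwise reports the first missing marker and returns 1.
check_step06_contract() {
  render="$1"
  for needle in 'git clone' 'git push origin main' 'clusters/_template/bootstrap-kit'; do
    # -F matches the marker as a fixed string, not as a regex.
    if ! grep -qF -- "${needle}" "${render}"; then
      echo "FAIL: Step-06 missing '${needle}'" >&2
      return 1
    fi
  done
  echo "PASS"
}

# Demo against a stand-in file (content is made up for illustration).
sample="$(mktemp)"
printf 'git clone x repo\ngit push origin main\ntarget_dir="clusters/_template/bootstrap-kit"\n' > "${sample}"
check_step06_contract "${sample}"
rm -f "${sample}"
```

Like the gate in the diff, this only proves the strings are present in the rendered output; it does not prove the push works against a live Gitea.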