Implement Option D from issue #183: the new Sovereign's cloud-init PUTs its
rewritten kubeconfig (server URL pinned to the LB public IP, k3s
service-account token in the body) to catalyst-api over HTTPS using a
per-deployment bearer token. catalyst-api never SSHes into the Sovereign —
by design it does not hold the SSH private key (the wizard returns it once
to the browser and does not persist it on the catalyst-api side).

How the bearer flow works
-------------------------

1. CreateDeployment mints a 32-byte random bearer (crypto/rand,
   hex-encoded), computes its SHA-256, and persists ONLY the hash on
   Deployment.kubeconfigBearerHash. The plaintext is stamped onto
   provisioner.Request just long enough for writeTfvars to render it into
   the per-deployment OpenTofu workdir, then GC'd.

2. infra/hetzner/variables.tf adds three variables — deployment_id,
   kubeconfig_bearer_token (sensitive), and catalyst_api_url. main.tf
   passes them through templatefile(), with load_balancer_ipv4 read from
   hcloud_load_balancer.main.ipv4.

3. cloudinit-control-plane.tftpl, after `kubectl get --raw /healthz`
   succeeds, sed-rewrites k3s.yaml's https://127.0.0.1:6443 to the LB's
   public IPv4, writes the result to a 0600 file, and curls PUT to
   {catalyst_api_url}/api/v1/deployments/{deployment_id}/kubeconfig with
   `Authorization: Bearer {token}`. --retry 60 --retry-delay 10
   --retry-all-errors handles transient reachability gaps. The 0600 file
   is removed after the PUT.

4. PUT /api/v1/deployments/{id}/kubeconfig:
   - Reads `Authorization: Bearer <token>` (RFC 6750).
   - Computes the SHA-256 of the inbound bearer and constant-time-compares
     it to the persisted hash via subtle.ConstantTimeCompare (see the
     sketch after this list).
   - 401 on missing/malformed Authorization; 403 on bearer mismatch, no
     hash on record, or KubeconfigPath already set (single-use replay
     defence); 422 on empty/oversize body; 503 if the kubeconfigs
     directory is unwritable.
   - On success (204): writes the body to
     /var/lib/catalyst/kubeconfigs/<id>.yaml at mode 0600 (atomic
     temp+rename), sets Result.KubeconfigPath, persistDeployment, then
     `go runPhase1Watch(dep)`.

5. GET /api/v1/deployments/{id}/kubeconfig now reads the file at
   Result.KubeconfigPath. 409 with {"error":"not-implemented"} when the
   postback hasn't happened yet (preserves the wizard's existing
   StepSuccess fallback); 409 with {"error":"kubeconfig-file-missing"} on
   PVC drift.

6. internal/store: Record carries KubeconfigBearerHash. The path pointer
   round-trips via Result.KubeconfigPath; the JSON record NEVER contains
   the kubeconfig plaintext (a test greps the on-disk JSON for the
   kubeconfig sentinels and asserts zero matches).

7. restoreFromStore relaunches helmwatch on Pod restart for any rehydrated
   deployment whose Result.KubeconfigPath points at an existing file AND
   whose Phase1FinishedAt is nil AND whose original status was not
   in-flight (the existing in-flight-status-rewrite-to-failed contract is
   preserved). Channels are re-allocated for resumed deployments because
   the fromRecord-loaded ones are closed.

8. internal/handler/phase1_watch.go reads the kubeconfig YAML from the
   file at Result.KubeconfigPath (not from a string field on Result). The
   Result.Kubeconfig field is removed entirely; the on-disk JSON only
   carries kubeconfigPath.
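The bearer helpers behind steps 1 and 4 reduce to a few lines of Go. A
minimal sketch — newBearerToken and hashBearerToken are named in the test
list below, while bearerMatches and the exact signatures are assumptions,
not the shipped code:

```go
package handler

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
)

// newBearerToken mints a 32-byte random bearer, hex-encoded (64 chars).
func newBearerToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// hashBearerToken returns the hex SHA-256 of the plaintext bearer — the
// only form that is ever persisted on the Deployment record.
func hashBearerToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// bearerMatches hashes the inbound bearer and compares it to the stored
// hash in constant time, so the comparison leaks no timing signal.
func bearerMatches(inbound, storedHash string) bool {
	h := hashBearerToken(inbound)
	return subtle.ConstantTimeCompare([]byte(h), []byte(storedHash)) == 1
}
```

Comparing the two fixed-length hex digests (rather than the raw token)
keeps ConstantTimeCompare's equal-length precondition trivially satisfied.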
Tests
-----

internal/handler/kubeconfig_test.go covers every spec gate:

- PUT 401 missing/malformed Authorization
- PUT 403 bearer mismatch / no-bearer-hash / already-set
- PUT 422 empty body / oversize body
- PUT 404 deployment not found
- PUT 204 first success, file at <dir>/<id>.yaml mode 0600,
  Result.KubeconfigPath set, on-disk JSON has the kubeconfigPath pointer
  with no plaintext leak
- PUT triggers the Phase 1 helmwatch goroutine
- GET reads from the path pointer
- GET 409 path-pointer-set-but-file-missing
- newBearerToken / hashBearerToken round-trip + entropy
- subtle.ConstantTimeCompare correctness
- shouldResumePhase1 gates every branch
- restoreFromStore re-launches helmwatch on rehydrated deployments
- phase1Started guard prevents a double watch (PUT then runProvisioning)
- extractBearer RFC 6750 case-insensitive scheme (sketched below)

Chart
-----

products/catalyst/chart/templates/api-deployment.yaml mounts the existing
catalyst-api-deployments PVC at /var/lib/catalyst (one level up) so
deployments/<id>.json and kubeconfigs/<id>.yaml live on the same
single-attach volume — no second PVC. Adds the env vars
CATALYST_KUBECONFIGS_DIR=/var/lib/catalyst/kubeconfigs and
CATALYST_API_PUBLIC_URL=https://console.openova.io/sovereign.

Per docs/INVIOLABLE-PRINCIPLES.md:

- #3: OpenTofu is still the only Phase-0 IaC; cloud-init is part of the
  OpenTofu module's templated user_data, not a separate code path.
  catalyst-api never execs helm/kubectl/ssh.
- #4: catalyst_api_url is runtime-configurable (CATALYST_API_PUBLIC_URL
  env var), so air-gapped franchises can override it without code changes.
- #10: The bearer plaintext NEVER lands on disk on the catalyst-api side
  (only the SHA-256 hash). The kubeconfig plaintext NEVER lands in the
  JSON record (only the file path). The kubeconfig file is chmod 0600 and
  the directory 0700, owned by the catalyst-api UID.

Closes #183.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
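For reference, a minimal sketch of the RFC 6750 extraction the test list
exercises — only the name extractBearer comes from the tests; this body is
an illustrative assumption:

```go
package handler

import "strings"

// extractBearer pulls the token out of an Authorization header value.
// RFC 6750 inherits RFC 7235's case-insensitive auth-scheme, so
// "bearer"/"BEARER"/"Bearer" must all be accepted.
func extractBearer(authorization string) (string, bool) {
	parts := strings.SplitN(strings.TrimSpace(authorization), " ", 2)
	if len(parts) != 2 || !strings.EqualFold(parts[0], "Bearer") {
		return "", false // missing/malformed → handler answers 401
	}
	token := strings.TrimSpace(parts[1])
	return token, token != ""
}
```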
#cloud-config
# Catalyst Sovereign control-plane bootstrap.
# Sovereign: ${sovereign_fqdn}
# Provisioned by: catalyst-provisioner (https://console.openova.io/sovereign)
#
# This script:
#   1. Installs OS hardening (SSH password-auth off, fail2ban, unattended-upgrades).
#   2. Installs k3s with --flannel-backend=none (Cilium replaces it).
#   3. Installs Flux + bootstraps the GitRepository pointing at clusters/${sovereign_fqdn}/
#      in the public OpenOva monorepo. From this point Flux is the GitOps
#      reconciler and installs the 11-component bootstrap kit
#      (Cilium → cert-manager → Crossplane → ... → bp-catalyst-platform) in
#      dependency order via Kustomizations the cluster directory ships.
#   4. Touches /var/lib/catalyst/cloud-init-complete so the catalyst-api
#      provisioner can detect cloud-init has finished.

package_update: true
package_upgrade: false
packages:
  - curl
  - iptables
  - jq
  - ca-certificates
  - git
%{ if enable_fail2ban ~}
  - fail2ban
%{ endif ~}
%{ if enable_unattended_upgrades ~}
  - unattended-upgrades
  - apt-listchanges
%{ endif ~}

write_files:
  - path: /var/lib/catalyst/sovereign.json
    permissions: '0644'
    content: |
      {
        "sovereignFQDN": "${sovereign_fqdn}",
        "sovereignSubdomain": "${sovereign_subdomain}",
        "orgName": ${jsonencode(org_name)},
        "orgEmail": ${jsonencode(org_email)},
        "region": "${region}",
        "haEnabled": ${ha_enabled},
        "workerCount": ${worker_count},
        "k3sVersion": "${k3s_version}",
        "gitopsRepoUrl": "${gitops_repo_url}",
        "gitopsBranch": "${gitops_branch}"
      }

  # ── OS hardening: SSH daemon ──────────────────────────────────────────
  # Drop-in overrides /etc/ssh/sshd_config defaults. Per Catalyst's threat
  # model, the operator's only supported way in is the Hetzner-project SSH
  # key injected via cloud-init authorized_keys. Password auth,
  # KbdInteractive, and root password login are all off.
  - path: /etc/ssh/sshd_config.d/99-catalyst-hardening.conf
    permissions: '0644'
    content: |
      # Managed by Catalyst Sovereign cloud-init — do not edit by hand.
      PasswordAuthentication no
      KbdInteractiveAuthentication no
      ChallengeResponseAuthentication no
      PermitRootLogin prohibit-password
      PermitEmptyPasswords no
      UsePAM yes
      X11Forwarding no
      AllowAgentForwarding no
      AllowTcpForwarding no
      ClientAliveInterval 300
      ClientAliveCountMax 2
      MaxAuthTries 3
      LoginGraceTime 30

%{ if enable_unattended_upgrades ~}
  # ── Unattended security upgrades ──────────────────────────────────────
  # Ubuntu's stock unattended-upgrades, restricted to the security pocket.
  # Runs daily, reboots automatically at 02:30 if a kernel upgrade requires
  # it (k3s tolerates single-node restarts on a solo Sovereign within the
  # ~60s window the Hetzner LB health-check covers).
  - path: /etc/apt/apt.conf.d/20auto-upgrades
    permissions: '0644'
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
      APT::Periodic::AutocleanInterval "7";
  - path: /etc/apt/apt.conf.d/52unattended-upgrades-catalyst
    permissions: '0644'
    content: |
      Unattended-Upgrade::Allowed-Origins {
          "$${distro_id}:$${distro_codename}-security";
          "$${distro_id}ESMApps:$${distro_codename}-apps-security";
          "$${distro_id}ESM:$${distro_codename}-infra-security";
      };
      Unattended-Upgrade::Automatic-Reboot "true";
      Unattended-Upgrade::Automatic-Reboot-Time "02:30";
      Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
      Unattended-Upgrade::Remove-Unused-Dependencies "true";
%{ endif ~}

%{ if enable_fail2ban ~}
  # ── fail2ban: sshd jail ───────────────────────────────────────────────
  # Even though SSH is firewalled to ssh_allowed_cidrs (or fully closed at
  # the firewall), fail2ban remains a defence-in-depth layer for the case
  # where the firewall rule is widened by an operator post-bootstrap.
  - path: /etc/fail2ban/jail.d/catalyst-sshd.local
    permissions: '0644'
    content: |
      [sshd]
      enabled = true
      port = ssh
      filter = sshd
      maxretry = 5
      findtime = 10m
      bantime = 1h
      backend = systemd
%{ endif ~}

  # ── flux-system/ghcr-pull Secret ─────────────────────────────────────
  #
  # Every HelmRepository CR in clusters/${sovereign_fqdn}/bootstrap-kit/
  # references `secretRef: name: ghcr-pull` because the bp-* OCI artifacts
  # at `ghcr.io/openova-io/` are PRIVATE. Without this Secret, the
  # source-controller logs:
  #
  #   failed to get authentication secret 'flux-system/ghcr-pull':
  #   secrets "ghcr-pull" not found
  #
  # …and Phase 1 stalls at bp-cilium. The operator workaround (kubectl
  # apply the Secret by hand after Flux installs) is not durable across
  # re-provisioning of the same Sovereign — every fresh control plane
  # boots without the Secret.
  #
  # We write the Secret into flux-system at cloud-init time, BEFORE
  # /var/lib/catalyst/flux-bootstrap.yaml is applied, so the GitRepository +
  # Kustomization land in a cluster that already has working GHCR creds.
  # The apply step is in runcmd: below; the manifest itself lives here.
  #
  # Token rotation policy: yearly, stored in 1Password under
  # "Catalyst — GHCR pull token (catalyst-ghcr-pull-token)". See
  # docs/SECRET-ROTATION.md. The token NEVER lives in git.
  - path: /var/lib/catalyst/ghcr-pull-secret.yaml
    permissions: '0600'
    content: |
      apiVersion: v1
      kind: Secret
      metadata:
        name: ghcr-pull
        namespace: flux-system
      type: kubernetes.io/dockerconfigjson
      data:
        .dockerconfigjson: ${base64encode(jsonencode({
          auths = {
            "ghcr.io" = {
              username = ghcr_pull_username
              password = ghcr_pull_token
              auth     = ghcr_pull_auth_b64
            }
          }
        }))}

  # Flux GitRepository + Kustomization that take over after k3s is up.
  # The clusters/${sovereign_fqdn}/ directory in the public OpenOva monorepo
  # contains a Kustomization tree that installs the 11-component bootstrap
  # kit + bp-catalyst-platform umbrella in dependency order.
  - path: /var/lib/catalyst/flux-bootstrap.yaml
    permissions: '0644'
    content: |
      apiVersion: source.toolkit.fluxcd.io/v1
      kind: GitRepository
      metadata:
        name: openova
        namespace: flux-system
      spec:
        interval: 1m
        url: ${gitops_repo_url}
        ref:
          branch: ${gitops_branch}
        ignore: |
          /*
          !/clusters/${sovereign_fqdn}
          !/platform
          !/products
      ---
      # Two Flux Kustomizations with dependsOn so Crossplane CRDs land
      # before any resource that uses them is dry-run-applied.
      #
      # bootstrap-kit installs the 11 HelmReleases (Cilium, cert-manager,
      # Flux, Crossplane core, sealed-secrets, SPIRE, NATS-JetStream,
      # OpenBao, Keycloak, Gitea, bp-catalyst-platform). bp-crossplane
      # registers the Crossplane core CRDs (Provider, ProviderConfig…)
      # AND the bp-catalyst-platform umbrella reconciles the rest.
      #
      # infrastructure-config applies the cluster's Provider package +
      # ProviderConfig + Compositions. Because it dependsOn bootstrap-kit
      # AND uses wait: true, Flux waits until bootstrap-kit's HelmReleases
      # are Ready (Crossplane core + provider-hcloud installed,
      # hcloud.crossplane.io/v1beta1 CRDs registered) before dry-running
      # ProviderConfig — which is the exact ordering the prior single-
      # Kustomization model tripped over with:
      #   no matches for kind "ProviderConfig" in version
      #   "hcloud.crossplane.io/v1beta1"
      apiVersion: kustomize.toolkit.fluxcd.io/v1
      kind: Kustomization
      metadata:
        name: bootstrap-kit
        namespace: flux-system
      spec:
        interval: 5m
        path: ./clusters/${sovereign_fqdn}/bootstrap-kit
        prune: true
        sourceRef:
          kind: GitRepository
          name: openova
        wait: true
        timeout: 30m
      ---
      apiVersion: kustomize.toolkit.fluxcd.io/v1
      kind: Kustomization
      metadata:
        name: infrastructure-config
        namespace: flux-system
      spec:
        interval: 5m
        path: ./clusters/${sovereign_fqdn}/infrastructure
        prune: true
        sourceRef:
          kind: GitRepository
          name: openova
        dependsOn:
          - name: bootstrap-kit
        wait: true
        timeout: 30m

runcmd:
  - swapoff -a
  - sed -i '/swap/d' /etc/fstab
  - update-alternatives --set iptables /usr/sbin/iptables-legacy || true
  - update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy || true

  # Activate the hardened sshd config (cloud-init may already have written
  # authorized_keys from Hetzner ssh_keys[]; we never touch that file).
  - systemctl reload ssh || systemctl reload sshd || true
%{ if enable_fail2ban ~}
  - systemctl enable --now fail2ban
%{ endif ~}
%{ if enable_unattended_upgrades ~}
  - systemctl enable --now unattended-upgrades
%{ endif ~}

  # k3s control-plane. Flags per docs/SOVEREIGN-PROVISIONING.md §3 and
  # docs/PLATFORM-TECH-STACK.md §8.1:
  #   --cluster-init               Initialise embedded etcd (HA-ready).
  #   --flannel-backend=none       Cilium replaces flannel.
  #   --disable=traefik            Cilium Gateway replaces traefik.
  #   --disable=servicelb          Hetzner LB handles ingress.
  #   --disable=local-storage     Crossplane-provisioned hcloud-csi instead.
  #   --disable-network-policy     Cilium handles NetworkPolicy.
  #   --tls-san=${sovereign_fqdn}  API server cert valid for the sovereign FQDN.
  - 'curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=${k3s_version} K3S_TOKEN=${k3s_token} INSTALL_K3S_EXEC="server --cluster-init --flannel-backend=none --disable-network-policy --disable=traefik --disable=servicelb --disable=local-storage --tls-san=${sovereign_fqdn} --node-label catalyst.openova.io/role=control-plane --write-kubeconfig-mode=0644" sh -'

  # Wait for the API server to be reachable. Nodes will not go Ready until
  # Cilium is installed, so we wait on the API endpoint specifically.
  - 'until kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml get --raw /healthz; do sleep 5; done'

%{ if deployment_id != "" && kubeconfig_bearer_token != "" && catalyst_api_url != "" ~}
  # ── Cloud-init kubeconfig postback (issue #183, Option D) ───────────────
  #
  # The k3s install above wrote /etc/rancher/k3s/k3s.yaml with the API
  # server URL pinned to https://127.0.0.1:6443 — kubectl's default for a
  # local single-node install. catalyst-api lives off-cluster (Catalyst-Zero
  # franchise console on contabo-mkt) and cannot reach 127.0.0.1 on this
  # node, so we MUST rewrite that field before sending the kubeconfig
  # back. The Hetzner load balancer at $${load_balancer_ipv4} forwards
  # 6443 to the control plane's 6443 (firewall rule above), so a kubeconfig
  # pointing at the LB's public IPv4 is reachable from anywhere.
  #
  # Plaintext: we read from /etc/rancher/k3s/k3s.yaml (mode 0644, written
  # by k3s), apply the rewrite via sed, write the result to
  # /etc/rancher/k3s/k3s.yaml.public (mode 0600 explicitly), then
  # curl --data-binary the file content to catalyst-api with the bearer
  # token. The .public file is removed at the end of the runcmd block,
  # so the bearer-protected kubeconfig only lives on this node for the
  # few seconds it takes to PUT.
  #
  # --retry 60 --retry-delay 10 --retry-all-errors handles the case
  # where catalyst-api is briefly unreachable (image roll, ingress
  # reconciliation) — the cloud-init runcmd budget is bounded by the
  # systemd cloud-final timeout (~30 minutes).
  - install -m 0600 /dev/null /etc/rancher/k3s/k3s.yaml.public
  - sed 's|https://127.0.0.1:6443|https://${load_balancer_ipv4}:6443|g' /etc/rancher/k3s/k3s.yaml > /etc/rancher/k3s/k3s.yaml.public
  - chmod 0600 /etc/rancher/k3s/k3s.yaml.public
  - |
    curl -fsSL --retry 60 --retry-delay 10 --retry-all-errors \
      -X PUT \
      -H "Authorization: Bearer ${kubeconfig_bearer_token}" \
      -H "Content-Type: application/x-yaml" \
      --data-binary @/etc/rancher/k3s/k3s.yaml.public \
      ${catalyst_api_url}/api/v1/deployments/${deployment_id}/kubeconfig
  - rm -f /etc/rancher/k3s/k3s.yaml.public
%{ endif ~}

  # ── Cilium FIRST (before Flux) ───────────────────────────────────────────
  #
  # k3s started with --flannel-backend=none, so the cluster has NO CNI yet.
  # If we apply Flux install.yaml at this point, the Flux controller pods
  # stay Pending forever — kubelet rejects them with
  #   "container runtime network not ready: cni plugin not initialized"
  # Flux is then unable to reconcile bp-cilium, so Cilium is never
  # installed → the bootstrap deadlock we hit in production at
  # omantel.omani.works deployment 5cd1bceaaacb71f6 (25 min stuck Pending).
  #
  # Bootstrap chicken-and-egg: Cilium IS the install unit (bp-cilium), but
  # Flux needs a CNI to run, and Cilium IS the CNI. Resolution: install
  # Cilium ONCE here via Helm with the same chart + values bp-cilium would
  # apply later. When Flux reconciles bp-cilium, it adopts the existing
  # release (Helm release-name match), so there is no churn.
  #
  # Per INVIOLABLE-PRINCIPLES.md #3 the GitOps engine is Flux — this Helm
  # install is the one-shot bootstrap exception explicitly authorised by
  # the same principle's "everything ELSE" qualifier. The chart version
  # matches platform/cilium/blueprint.yaml's chartVersion to keep the
  # bootstrap install and the reconciled HelmRelease byte-identical.
  - 'curl -sSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash'
  - 'helm repo add cilium https://helm.cilium.io/'
  - 'helm repo update'
  - |
    KUBECONFIG=/etc/rancher/k3s/k3s.yaml helm install cilium cilium/cilium \
      --version 1.16.5 \
      --namespace kube-system \
      --set kubeProxyReplacement=true \
      --set k8sServiceHost=127.0.0.1 \
      --set k8sServicePort=6443 \
      --set ipam.mode=kubernetes \
      --set tunnelProtocol=vxlan \
      --set bpf.masquerade=true
  - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml -n kube-system rollout status ds/cilium --timeout=240s'

  # Install Flux core. Cilium is now the cluster's CNI, so Flux pods will
  # actually start. Flux then reconciles clusters/${sovereign_fqdn}/, which
  # adopts the Helm release above as bp-cilium and continues with
  # bp-cert-manager, bp-flux (host-level Flux, distinct from this Flux,
  # which is the CONTROL-PLANE Flux), bp-crossplane, etc.
  - 'curl -fsSL https://github.com/fluxcd/flux2/releases/download/v2.4.0/install.yaml | kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f -'
  - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml -n flux-system wait --for=condition=Available --timeout=300s deployment --all'

  # ── flux-system/ghcr-pull Secret (applied BEFORE GitRepository) ──────
  #
  # Apply the docker-registry pull secret rendered above. This MUST land
  # before the GitRepository + Kustomization in flux-bootstrap.yaml,
  # because the bootstrap-kit Kustomization includes HelmRepository CRs
  # that reference this Secret by name; the source-controller resolves
  # them on its first reconciliation tick, and a missing Secret propagates
  # as a Ready=False/AuthError state that has been observed to persist
  # for 5+ minutes even after the Secret is later applied.
  #
  # Idempotent: `kubectl apply` against an existing Secret is a no-op
  # when the manifest's bytes match. A reprovision (same Sovereign FQDN)
  # rewrites this with the same content; a token rotation propagates
  # through here on the next cloud-init render.
  - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f /var/lib/catalyst/ghcr-pull-secret.yaml'

  # Apply the Flux bootstrap GitRepository + Kustomization. From here, Flux
  # owns the cluster: it pulls clusters/${sovereign_fqdn}/, installs Cilium
  # via bp-cilium, cert-manager via bp-cert-manager, etc., then bp-catalyst-platform.
  - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f /var/lib/catalyst/flux-bootstrap.yaml'

  # Marker for the catalyst-api provisioner to detect cloud-init is done.
  - mkdir -p /var/lib/catalyst
  - touch /var/lib/catalyst/cloud-init-complete

final_message: "Catalyst control-plane bootstrap complete after $UPTIME seconds — Flux is now reconciling clusters/${sovereign_fqdn}/"