fix(infra): escape ${SOVEREIGN_FQDN} in cloudinit-control-plane.tftpl comments (#471)

Phase-8a-preflight bug surfaced by first live provision attempt
(deployment febeeb888debf477, 2026-05-01 16:30 UTC):

  Error: Invalid function argument
    on main.tf line 140, in locals:
    140:   control_plane_cloud_init = templatefile("${path.module}/cloudinit-control-plane.tftpl", {
  Invalid value for "vars" parameter: vars map does not contain key
  "SOVEREIGN_FQDN", referenced at ./cloudinit-control-plane.tftpl:12,37-51.

Tofu's templatefile() interprets ${...} ANYWHERE in the file (including
inside shell '#' comments), because the file is a template, not a shell
script. Five comment lines in cloudinit-control-plane.tftpl reference
${SOVEREIGN_FQDN} as part of documentation prose explaining how
Flux postBuild.substitute interpolates the value at Flux apply time.
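
A minimal sketch of that behaviour, not taken from the repository. It
relies on HCL heredoc strings using the same ${...} template syntax as
templatefile(), so the effect can be reproduced without a separate .tftpl
file. The names comment_demo_fqdn and comment_demo, and the placeholder
FQDN, are hypothetical.

```hcl
locals {
  # Hypothetical placeholder, not the FQDN from the failed deployment.
  comment_demo_fqdn = "sovereign.example.invalid"

  # Everything between the EOT markers is template text, so the "#" line
  # inside it is not a comment to Tofu and its interpolation is evaluated.
  comment_demo = <<-EOT
    # rendered for ${local.comment_demo_fqdn}
    echo "bootstrapping"
  EOT
}

output "comment_demo" {
  value = local.comment_demo
}
```

Evaluating local.comment_demo should yield the '#' line with the
placeholder FQDN filled in, which illustrates that a shell comment marker
gives no protection inside a template.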

The Tofu vars map passed by main.tf:140 uses the canonical lowercase
HCL convention (sovereign_fqdn = var.sovereign_fqdn), not the uppercase
envsubst convention SOVEREIGN_FQDN. So Tofu fails: 'vars map does not
contain key SOVEREIGN_FQDN'.

The latest reference (line 12) was added by #326 (commit 20b89607); the
four older references predate it and were never exercised because no live
provision had been attempted before this Phase-8a run.

Fix: escape with a double dollar ($$) so Tofu emits a literal ${...}
in the rendered cloudinit file. The 5 comments now read $${SOVEREIGN_FQDN}
in source and render as ${SOVEREIGN_FQDN} in the user_data output,
preserving documentation intent without breaking templatefile().
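
The escape can be tried in isolation with a sketch like the one below. It
is an illustration under assumptions, not the repository's code: a heredoc
stands in for the .tftpl file (both use the same template syntax), and
escape_demo_fqdn, escape_demo, and the placeholder FQDN are hypothetical.

```hcl
locals {
  # Hypothetical placeholder, not the FQDN from the failed deployment.
  escape_demo_fqdn = "sovereign.example.invalid"

  # "$${" is the template escape for a literal "${"; the unescaped
  # sequence on the second line is still interpolated as usual.
  escape_demo = <<-EOT
    # Flux replaces $${SOVEREIGN_FQDN} at apply time
    # with ${local.escape_demo_fqdn}
  EOT
}

output "escape_demo" {
  value = local.escape_demo
}
```

The first rendered line should keep ${SOVEREIGN_FQDN} verbatim for Flux to
substitute later, while the second carries the placeholder FQDN.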

Refs:
- Live provision: console.openova.io/sovereign/provision/febeeb888debf477
- Diagnostic: tofu plan exit 1 — vars map does not contain key SOVEREIGN_FQDN
- Out of scope: any other latent templatefile() escape issues — those
  surface as their own Phase-8a iterations

Co-authored-by: hatiyildiz <hatiyildiz@noreply.github.com>
e3mrah 2026-05-01 20:33:21 +04:00 committed by GitHub
parent 1628a1b3aa
commit 03b1469331

@@ -9,7 +9,7 @@
 # 3. Installs Flux + bootstraps the GitRepository pointing at the shared
 # clusters/_template/ tree in the public OpenOva monorepo. The
 # Sovereign's FQDN is interpolated into the template manifests via
-# Flux postBuild.substitute (${SOVEREIGN_FQDN}) at apply time, so
+# Flux postBuild.substitute ($${SOVEREIGN_FQDN}) at apply time, so
 # no per-Sovereign directory needs to be committed before
 # provisioning. From this point Flux is the GitOps reconciler and
 # installs the 11-component bootstrap kit (Cilium → cert-manager →
@@ -337,7 +337,7 @@ write_files:
 # Canonical fix: GitRepository selects the shared `_template/` tree,
 # Kustomization paths point at `clusters/_template/{bootstrap-kit,
 # infrastructure}`, and Flux's `postBuild.substitute` interpolates
-# `${SOVEREIGN_FQDN}` into the template manifests at apply time. The
+# `$${SOVEREIGN_FQDN}` into the template manifests at apply time. The
 # per-FQDN copy that prior provisioning depended on becomes a no-op:
 # one shared tree serves every Sovereign, with the Sovereign's FQDN
 # injected by Flux on the cluster instead of by sed in the repo.
@@ -380,10 +380,10 @@ write_files:
 # "hcloud.crossplane.io/v1beta1"
 #
 # postBuild.substitute (issue #218): Flux's envsubst runs over the
-# rendered manifests after kustomize build, replacing ${SOVEREIGN_FQDN}
+# rendered manifests after kustomize build, replacing $${SOVEREIGN_FQDN}
 # with the Sovereign's FQDN that this cloud-init was rendered for.
 # The template manifests in clusters/_template/bootstrap-kit/*.yaml
-# use ${SOVEREIGN_FQDN} as the substitution token.
+# use $${SOVEREIGN_FQDN} as the substitution token.
 apiVersion: kustomize.toolkit.fluxcd.io/v1
 kind: Kustomization
 metadata:
@@ -736,7 +736,7 @@ runcmd:
 - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f /var/lib/catalyst/crossplane-provider-hcloud.yaml'
 # Apply the Flux bootstrap GitRepository + Kustomization. From here, Flux
-# owns the cluster: pulls clusters/_template/ (with ${SOVEREIGN_FQDN}
+# owns the cluster: pulls clusters/_template/ (with $${SOVEREIGN_FQDN}
 # substituted to ${sovereign_fqdn} via postBuild), installs Cilium
 # via bp-cilium, cert-manager via bp-cert-manager, etc., then bp-catalyst-platform.
 - 'kubectl --kubeconfig=/etc/rancher/k3s/k3s.yaml apply -f /var/lib/catalyst/flux-bootstrap.yaml'