openova/.github/workflows/test-hetzner-e2e.yaml
hatiyildiz 7c7c46bc62 test: Hetzner Sovereign end-to-end provisioning test (#141)
Closes the Group L "end-to-end provisioning test on Hetzner test project"
ticket. Per the ticket's exact wording: scaffolding + harness + CI
workflow, gated on HETZNER_TEST_TOKEN, NEVER mocked.

Lifecycle when HETZNER_TEST_TOKEN is set:
  1. Generate unique sovereign FQDN (e2e-<run-id>.openova.io)
  2. Stage canonical infra/hetzner/ OpenTofu module into temp dir
  3. Render tofu.auto.tfvars.json with test inputs (BYO domain mode so
     Dynadot isn't touched; region runtime-configurable; SSH key minted
     by CI per-run)
  4. tofu init && tofu apply -auto-approve (30m timeout)
  5. Assert outputs: control_plane_ip + load_balancer_ip are valid IPv4
  6. Assert TCP/22 reachable on control plane (5m await)
  7. Assert TCP/443 reachable on LB after Cilium + Flux land (15m await,
     soft-failure since the Catalyst control plane install is the long
     tail and partial-bootstrap is acceptable proof of OpenTofu + Flux)
  8. tofu destroy -auto-approve (always — t.Cleanup, runs even on fail)
  9. Verify state list is empty after destroy (no leaked resources)

When HETZNER_TEST_TOKEN is absent, the test SKIPS — does not mock, does
not fall through to a stub. Per docs/INVIOLABLE-PRINCIPLES.md #2,
mocking the cloud would tell us nothing about whether the OpenTofu module,
hcloud provider, cloud-init scripts, or k3s actually work. A second test
(TestHarness_NoHetznerCredsSkips) explicitly verifies the skip semantics
so future refactors don't accidentally land mocking.
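
The skip semantics can be illustrated as follows. This is a sketch under
stated assumptions: tokenOrSkip, the skipper interface and recordingSkipper
are hypothetical names, not the contents of main_test.go. The interface is
the subset of *testing.T used here, so a real test could pass t directly.

```go
package main

import (
	"fmt"
	"os"
)

// skipper is the subset of *testing.T this sketch needs.
type skipper interface {
	Skipf(format string, args ...any)
}

// tokenOrSkip returns the Hetzner token, or skips when it is absent.
// There is deliberately no mock or stub branch.
func tokenOrSkip(t skipper, getenv func(string) string) string {
	token := getenv("HETZNER_TEST_TOKEN")
	if token == "" {
		t.Skipf("HETZNER_TEST_TOKEN not set; skipping real-cloud test")
		// *testing.T never returns from Skipf; a fake skipper does.
		return ""
	}
	return token
}

// recordingSkipper lets the skip path itself be checked, in the spirit
// of TestHarness_NoHetznerCredsSkips (assumed shape).
type recordingSkipper struct {
	skipped bool
	reason  string
}

func (r *recordingSkipper) Skipf(format string, args ...any) {
	r.skipped = true
	r.reason = fmt.Sprintf(format, args...)
}

func main() {
	rec := &recordingSkipper{}
	tokenOrSkip(rec, os.Getenv)
	fmt.Println("skipped:", rec.skipped)
}
```

Injecting getenv keeps the check deterministic regardless of what the CI
environment exports.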

CI workflow (.github/workflows/test-hetzner-e2e.yaml):
  - Triggers on workflow_dispatch (operator initiates real run) or PR
    labeled `test/hetzner-e2e`, NOT on every push (each run bills real
    Hetzner server time, roughly EUR 0.005/run).
  - Generates a per-run throwaway SSH ed25519 keypair so no long-lived
    private key can land in any logs.
  - Installs OpenTofu via opentofu/setup-opentofu@v1.
  - Reads HETZNER_TEST_TOKEN + HETZNER_TEST_PROJECT_ID from repo secrets;
    operator populates them out-of-band (per the ticket: "operator will
    populate later").
  - 60m job timeout wrapping a 55m `go test` timeout; the test itself
    uses contexts of 30m apply + 20m destroy.

Files:
  - tests/e2e/hetzner-provisioning/main_test.go (the harness)
  - tests/e2e/hetzner-provisioning/go.mod (separate module, stdlib-only)
  - .github/workflows/test-hetzner-e2e.yaml (gated CI)

Refs #141

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:00:29 +02:00

name: Test — Hetzner Sovereign E2E (real cloud)
# Closes #141 — end-to-end provisioning test on a real Hetzner test
# project. The test does NOT run on every push (each run costs real
# Hetzner CX22 minutes); it runs only on:
# 1. workflow_dispatch (manually triggered by an operator)
# 2. PRs labeled "test/hetzner-e2e"
#
# The test itself skips cleanly when HETZNER_TEST_TOKEN is missing — see
# tests/e2e/hetzner-provisioning/main_test.go. The job below is a no-op
# when the secret is absent, exiting 0 so the matrix doesn't go red.
on:
  workflow_dispatch:
    inputs:
      region:
        description: 'Hetzner region (fsn1, nbg1, hel1, ash, hil)'
        default: 'fsn1'
        required: false
  pull_request:
    types: [labeled, opened, synchronize]

jobs:
  e2e:
    if: |
      github.event_name == 'workflow_dispatch' ||
      contains(github.event.pull_request.labels.*.name, 'test/hetzner-e2e')
    runs-on: ubuntu-latest
    timeout-minutes: 60
    defaults:
      run:
        working-directory: tests/e2e/hetzner-provisioning
    env:
      # Operator populates these in repo secrets. The test SKIPS when
      # HETZNER_TEST_TOKEN is empty — never falls back to mocking.
      HETZNER_TEST_TOKEN: ${{ secrets.HETZNER_TEST_TOKEN }}
      HETZNER_TEST_PROJECT_ID: ${{ secrets.HETZNER_TEST_PROJECT_ID }}
      HETZNER_TEST_REGION: ${{ inputs.region || 'fsn1' }}
      # SSH key — generated below.
      HETZNER_TEST_SSH_PUBLIC_KEY: ''
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
          cache-dependency-path: tests/e2e/hetzner-provisioning/go.sum
      - name: Install OpenTofu
        uses: opentofu/setup-opentofu@v1
        with:
          tofu_version: 1.8.5
      - name: Generate throwaway SSH keypair
        id: sshkey
        run: |
          ssh-keygen -t ed25519 -N "" -f /tmp/e2e_id_ed25519 -C "hetzner-e2e-${{ github.run_id }}"
          {
            echo "HETZNER_TEST_SSH_PUBLIC_KEY<<EOF"
            cat /tmp/e2e_id_ed25519.pub
            echo "EOF"
          } >> "$GITHUB_ENV"
      - name: Run E2E test
        run: |
          if [ -z "$HETZNER_TEST_TOKEN" ]; then
            echo "HETZNER_TEST_TOKEN not populated — test will skip cleanly. This is expected for PRs from forks; operator triggers the real run via workflow_dispatch."
          fi
          go test -v -count=1 -timeout 55m ./...