# Temporal

Durable workflow orchestration with saga + compensation. Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.3 — Workflow & processing). Used by bp-fabric (composite Data & Integration Blueprint) for long-running, compensable workflows that span multiple Application services.

**Status:** Accepted | **Updated:** 2026-04-27
## Overview
Temporal is a durable execution platform that makes it simple to build reliable, long-running workflows and microservice orchestrations. Unlike traditional message queues or job schedulers, Temporal provides durable execution: workflow code survives process crashes, node failures, and even entire cluster restarts without losing state. Developers write workflows as ordinary code in their language of choice, and Temporal handles retries, timeouts, and state persistence transparently.
Within OpenOva, Temporal serves as the workflow orchestration engine for the Fabric data and integration product. It handles saga patterns for distributed transactions, long-running business processes, scheduled jobs, and any operation that needs reliable execution across multiple services. Temporal replaces fragile combinations of message queues, cron jobs, and custom state machines with a single, battle-tested platform.
Temporal's architecture separates the server (which manages workflow state) from workers (which execute workflow and activity code). Workers are stateless and can be scaled independently. The server persists all workflow state to a database, ensuring that workflows survive any infrastructure failure. SDKs are available for Go, Java, Python, and TypeScript, making Temporal accessible to polyglot teams.
## Architecture

```mermaid
flowchart TB
    subgraph Clients["Workflow Clients"]
        API[API Services]
        Scheduler[Scheduled Jobs]
        Events[Event Handlers]
    end
    subgraph Temporal["Temporal Server"]
        Frontend[Frontend Service]
        History[History Service]
        Matching[Matching Service]
        Worker_svc[Internal Worker]
    end
    subgraph Persistence["Persistence"]
        PG[PostgreSQL / CNPG]
        ES[Elasticsearch / OpenSearch]
    end
    subgraph Workers["Application Workers"]
        W1[Order Worker]
        W2[Payment Worker]
        W3[Notification Worker]
    end
    Clients --> Frontend
    Frontend --> History
    Frontend --> Matching
    History --> PG
    Worker_svc --> ES
    Matching --> W1
    Matching --> W2
    Matching --> W3
```
### Saga Pattern

```mermaid
sequenceDiagram
    participant C as Client
    participant T as Temporal
    participant O as Order Service
    participant P as Payment Service
    participant I as Inventory Service
    participant N as Notification Service
    C->>T: Start OrderSaga
    T->>O: CreateOrder
    O-->>T: Order Created
    T->>P: ProcessPayment
    P-->>T: Payment OK
    T->>I: ReserveInventory
    I-->>T: Inventory Reserved
    T->>N: SendConfirmation
    N-->>T: Notification Sent
    T-->>C: Saga Complete
    Note over T,I: If ReserveInventory fails:
    T->>P: CompensatePayment (refund)
    T->>O: CancelOrder
```
## Key Features
| Feature | Description |
|---|---|
| Durable Execution | Workflows survive process/node/cluster failures |
| Saga Orchestration | Coordinate distributed transactions with compensation |
| Retry Policies | Configurable retry with exponential backoff per activity |
| Timeouts | Start-to-close, schedule-to-start, and heartbeat timeouts |
| Cron Workflows | Replace cron jobs with reliable scheduled workflows |
| Versioning | Deploy new workflow logic without breaking running instances |
| Signals & Queries | Send data to and read state from running workflows |
| Child Workflows | Compose complex workflows from smaller building blocks |
| Visibility | Search and filter workflows by custom attributes |
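The retry-policy row can be made concrete. Temporal documents the wait before retry attempt *n* as the initial interval multiplied by the backoff coefficient raised to *n − 1*, capped at the maximum interval. A small sketch in plain Python (not the SDK — the helper name is illustrative) of the schedule this produces:

```python
def retry_intervals(initial: float, coefficient: float,
                    maximum: float, max_attempts: int) -> list[float]:
    """Waits (in seconds) between attempts, mirroring Temporal's
    RetryPolicy fields: InitialInterval, BackoffCoefficient,
    MaximumInterval, MaximumAttempts."""
    intervals = []
    for attempt in range(1, max_attempts):  # no wait after the final attempt
        wait = initial * coefficient ** (attempt - 1)
        intervals.append(min(wait, maximum))
    return intervals

# The policy used in the Go example later in this README
# (1s initial, 2.0 coefficient, 60s cap, 5 attempts):
# waits of 1, 2, 4, 8 seconds between the five attempts.
```

With a tighter cap the tail flattens instead of growing, e.g. a 5-second maximum yields waits of 1, 2, 4, 5 seconds.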
## Configuration

### Helm Values

```yaml
temporal:
  server:
    replicas: 3
    config:
      persistence:
        default:
          driver: sql
          sql:
            driver: postgres12
            host: temporal-postgres.databases.svc
            port: 5432
            database: temporal
            user: temporal
            password: ${PG_PASSWORD}  # From ESO
        visibility:
          driver: sql
          sql:
            driver: postgres12
            host: temporal-postgres.databases.svc
            port: 5432
            database: temporal_visibility
            user: temporal
            password: ${PG_PASSWORD}  # From ESO
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2
        memory: 4Gi
  admintools:
    enabled: true
  web:
    enabled: true
    ingress:
      enabled: true
      hosts:
        - temporal.<env>.<sovereign-domain>
  prometheus:
    enabled: true
```
### Namespace Setup

```shell
# Create a Temporal namespace for workload isolation
tctl namespace register \
  --namespace orders \
  --retention 30d \
  --description "Order processing workflows"
```
## Workflow Examples

### Go SDK - Order Saga

```go
package workflows

import (
	"time"

	"go.temporal.io/sdk/temporal"
	"go.temporal.io/sdk/workflow"
)

func OrderSagaWorkflow(ctx workflow.Context, order Order) (OrderResult, error) {
	retryPolicy := &temporal.RetryPolicy{
		InitialInterval:    time.Second,
		BackoffCoefficient: 2.0,
		MaximumInterval:    time.Minute,
		MaximumAttempts:    5,
	}
	activityOpts := workflow.ActivityOptions{
		StartToCloseTimeout: 30 * time.Second,
		RetryPolicy:         retryPolicy,
	}
	ctx = workflow.WithActivityOptions(ctx, activityOpts)

	// Step 1: Create Order
	var orderID string
	err := workflow.ExecuteActivity(ctx, CreateOrder, order).Get(ctx, &orderID)
	if err != nil {
		return OrderResult{}, err
	}

	// Step 2: Process Payment (with compensation)
	var paymentID string
	err = workflow.ExecuteActivity(ctx, ProcessPayment, orderID, order.Amount).Get(ctx, &paymentID)
	if err != nil {
		// Compensate: cancel the order
		_ = workflow.ExecuteActivity(ctx, CancelOrder, orderID).Get(ctx, nil)
		return OrderResult{}, err
	}

	// Step 3: Reserve Inventory (with compensation)
	err = workflow.ExecuteActivity(ctx, ReserveInventory, orderID, order.Items).Get(ctx, nil)
	if err != nil {
		// Compensate: refund payment, cancel order
		_ = workflow.ExecuteActivity(ctx, RefundPayment, paymentID).Get(ctx, nil)
		_ = workflow.ExecuteActivity(ctx, CancelOrder, orderID).Get(ctx, nil)
		return OrderResult{}, err
	}

	// Step 4: Send Confirmation
	_ = workflow.ExecuteActivity(ctx, SendConfirmation, orderID).Get(ctx, nil)

	return OrderResult{OrderID: orderID, PaymentID: paymentID, Status: "completed"}, nil
}
```
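The nested error handling in the saga above grows with every step added. A common refactor is to keep a LIFO stack of compensations and unwind it on failure; a minimal sketch in plain Python (not SDK code — `run_saga` and `SagaError` are illustrative names):

```python
class SagaError(Exception):
    """Raised after all completed steps have been compensated."""


def run_saga(steps):
    """Execute (action, compensation) pairs in order. If an action
    fails, run the compensations for the steps that already
    completed, in reverse (LIFO) order, then re-raise."""
    compensations = []
    results = []
    for action, compensate in steps:
        try:
            results.append(action())
        except Exception as exc:
            for undo in reversed(compensations):
                undo()
            raise SagaError(f"saga failed at {action.__name__}") from exc
        if compensate is not None:
            compensations.append(compensate)
    return results
```

For the order saga this means registering `cancel_order` after `create_order` succeeds and `refund_payment` after `process_payment` succeeds; a failure in `reserve_inventory` then triggers refund before cancel, matching the ordering in the sequence diagram.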
### Python SDK - Data Pipeline

```python
from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def extract_data(source: str) -> dict:
    # Extract data from source system
    ...


@activity.defn
async def transform_data(raw_data: dict) -> dict:
    # Apply business transformations
    ...


@activity.defn
async def load_data(transformed: dict) -> str:
    # Write to destination
    ...


@workflow.defn
class DataPipelineWorkflow:
    @workflow.run
    async def run(self, source: str) -> str:
        raw = await workflow.execute_activity(
            extract_data, source,
            start_to_close_timeout=timedelta(minutes=10),
        )
        transformed = await workflow.execute_activity(
            transform_data, raw,
            start_to_close_timeout=timedelta(minutes=30),
        )
        result = await workflow.execute_activity(
            load_data, transformed,
            start_to_close_timeout=timedelta(minutes=10),
        )
        return result
```
## Worker Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-workflow-worker
  namespace: fabric
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-workflow-worker
  template:
    metadata:
      labels:
        app: order-workflow-worker
    spec:
      containers:
        - name: worker
          image: harbor.<location-code>.<sovereign-domain>/fabric/order-worker:latest
          env:
            - name: TEMPORAL_HOST
              value: temporal-frontend.temporal.svc:7233
            - name: TEMPORAL_NAMESPACE
              value: orders
            - name: TEMPORAL_TASK_QUEUE
              value: order-processing
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: 1
              memory: 1Gi
```
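The Deployment hands the worker its connection settings through environment variables. A minimal sketch of the worker-side parsing (stdlib only; `TemporalConfig` and `load_config` are illustrative names, with local-development fallbacks assumed):

```python
import os
from dataclasses import dataclass


@dataclass
class TemporalConfig:
    host: str
    namespace: str
    task_queue: str


def load_config(env=os.environ) -> TemporalConfig:
    """Read the env vars set in the Deployment manifest, falling
    back to local-development defaults when they are absent."""
    return TemporalConfig(
        host=env.get("TEMPORAL_HOST", "localhost:7233"),
        namespace=env.get("TEMPORAL_NAMESPACE", "default"),
        task_queue=env.get("TEMPORAL_TASK_QUEUE", "order-processing"),
    )
```

Keeping the parsing separate from the SDK client call makes the worker runnable against a local dev server with no manifest at all, while the in-cluster values above take over in production.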
## Monitoring

| Metric | Description |
|---|---|
| `temporal_workflow_started_total` | Workflows started |
| `temporal_workflow_completed_total` | Workflows completed successfully |
| `temporal_workflow_failed_total` | Workflows failed |
| `temporal_workflow_task_queue_depth` | Pending tasks per queue |
| `temporal_activity_execution_latency` | Activity execution duration |
| `temporal_workflow_endtoend_latency` | Total workflow duration |
| `temporal_schedule_missed_catchup_window` | Missed cron schedule executions |
## Consequences

**Positive:**
- Durable execution eliminates custom retry/state-machine code across all services
- Saga pattern support simplifies distributed transaction management
- Multi-language SDKs (Go, Java, Python, TypeScript) suit polyglot teams
- Workflow versioning enables safe deployments without breaking running instances
- Built-in visibility and search make debugging production workflows practical
- Cron workflows replace fragile crontab/CronJob setups with reliable scheduling
**Negative:**
- Requires PostgreSQL for persistence, adding a database dependency
- Temporal server itself needs careful sizing and operational attention
- Workflow determinism constraints require developer discipline (no random, no system clock)
- Learning curve for understanding event sourcing and replay semantics
- Debugging replayed workflows requires familiarity with Temporal's execution model
- Large workflow histories can impact performance without proper archival configuration
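The determinism constraint can be illustrated without the SDK. On replay, Temporal re-executes the workflow function against the recorded event history and expects identical commands; code that draws fresh randomness can take a different branch than the original run. A sketch (plain Python, illustrative names) of why recording the value, as the SDKs' side-effect and deterministic-random helpers do, is the fix:

```python
import random


def choose_shipping(value: float) -> str:
    # Branches on the value: a replay must see the same value to
    # take the same branch the original execution took.
    return "express" if value < 0.5 else "standard"


def first_execution(rng: random.Random):
    value = rng.random()
    history = [value]  # stands in for the recorded event history
    return choose_shipping(value), history


def replay_broken(rng: random.Random) -> str:
    # Wrong: draws a fresh value on replay instead of reading history,
    # so the decision can diverge from the original run.
    return choose_shipping(rng.random())


def replay_correct(history: list) -> str:
    # Right: re-reads the recorded value, so replay is deterministic.
    return choose_shipping(history[0])
```

The same reasoning applies to reading the system clock or making direct network calls from workflow code: anything non-repeatable must go through an activity or a recorded side effect.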
---

*Part of OpenOva Fabric - Data & Integration*