Production Deployment Guide
This guide covers deploying Torvyn pipelines in production environments. It assumes familiarity with Torvyn’s concepts (contracts, components, flows, capabilities) and prior experience running pipelines locally with torvyn run.
Resource Sizing and Capacity Planning
Memory Budget
Torvyn’s memory consumption has three components:
-
Host overhead: Resource tables, stream queues, metric structures, trace buffers, and runtime bookkeeping. Typically 10–50 MiB depending on the number of active flows and components.
-
Buffer pool reservation: Pre-allocated buffer pools. With default settings, the pool reservation is approximately 53 MiB (Small: ~1 MiB, Medium: ~4 MiB, Large: ~16 MiB, Huge: ~32 MiB on-demand). Adjust pool sizes based on workload: if your pipeline processes mostly small messages, reduce Large and Huge pools; if it processes large binary payloads, increase Large pool allocation.
-
Component linear memory: Each Wasm component instance has its own linear memory, starting small and growing up to the configured limit (
max_memory_per_component, default 64 MiB). Total component memory = number of active component instances × their actual memory usage. For a pipeline with 5 components at 20 MiB each, budget 100 MiB.
Formula: total_memory ≈ host_overhead + pool_reservation + (active_components × avg_component_memory)
CPU Budget
Torvyn uses Tokio’s multi-threaded runtime. Set worker_threads to the number of CPU cores available to the process. Each flow driver consumes one Tokio task; Tokio distributes tasks across worker threads. For workloads with many concurrent flows, ensure worker_threads ≥ number of flows that must make progress simultaneously.
Wasmtime’s fuel mechanism provides CPU time limiting per component invocation. The default fuel_per_invocation of 1,000,000 fuel units is approximately 1–10 ms of CPU time depending on instruction mix. Adjust based on component complexity.
Queue Depth Sizing
The default queue depth of 64 elements per stream is suitable for most workloads. Increase it for bursty sources (to absorb bursts without triggering backpressure) or decrease it for latency-sensitive pipelines (smaller queues reduce maximum queuing delay).
Total queue memory = Σ(queue_depth × max_element_handle_size) per stream. Since queues hold handle references (not payload bytes), queue memory is small relative to buffer pool memory.
Monitoring and Alerting Setup
Prometheus Integration
Enable the Prometheus metrics endpoint:
[observability]
metrics_enabled = true
prometheus_enabled = true
The /metrics endpoint is served on the inspection API (default: Unix domain socket at $TORVYN_STATE_DIR/torvyn.sock, or localhost TCP if configured). Configure your Prometheus scrape target accordingly.
Trace Export
For production trace export, configure OTLP gRPC:
[observability.tracing]
level = "production"
tracing_exporter = "otlp-grpc"
tracing_endpoint = "http://your-otel-collector:4317"
sample_rate = 0.01
error_promote = true
latency_promote_threshold_ms = 10
Recommended Grafana Dashboards
Build dashboards around these key metrics: flow.elements.total rate (throughput), flow.latency percentiles, stream.backpressure.duration_ns (health), pool.available per tier (capacity), component.processing_time per component (bottleneck identification), and system.memory.total (resource utilization).
Security Hardening Checklist
- Set
default_capability_policy = "deny-all"in[security]. - Grant each component the minimum required capabilities. Review grants against component manifests.
- Enable audit logging:
audit_enabled = true. - Bind the inspection API to localhost only (the default). Do not expose it to the network without authentication.
- If exposing the inspection API over TCP, configure token-based authentication.
- Verify component artifacts are signed before deploying (
torvyn inspect --show capabilitieson every artifact). - Set appropriate
max_memory_per_componentlimits to prevent memory exhaustion by any single component. - Set appropriate
fuel_per_invocationlimits to prevent CPU monopolization. - Review and pin component versions in pipeline definitions. Do not use mutable tags (like
latest) for production components.
Operational Runbook
Pipeline is not processing elements
- Check flow state:
GET /flowson the inspection API. Look for flows inFailedorPausedstate. - Check for component errors: review
component.errorsmetrics. Afatalerror stops the flow. - Check for backpressure: if
stream.backpressure.duration_nsis high, the pipeline may be stalled on a slow sink. - Check resource availability: if
pool.exhaustion_eventsis increasing, buffer pools may be exhausted.
Latency is increasing
- Check per-component latency:
component.processing_timehistograms identify the slow stage. - Check queue depths:
stream.queue.current_depthshows where data is accumulating. - Check backpressure: sustained backpressure on a specific stream indicates the downstream component is the bottleneck.
- Check system resources: CPU saturation or memory pressure can increase latency across all components.
Memory usage is growing
- Check
component.memory_currentfor each component. A steadily increasing gauge suggests a memory leak in component code. - Check
pool.availableper tier. Decreasing availability without corresponding increase in throughput suggests buffers are not being returned to the pool. - Check for stale flows: flows that have stopped processing but have not been cleaned up may hold resource references.
Upgrade Procedures
- Test the upgrade locally. Run the new version with your pipeline using
torvyn runandtorvyn bench. Compare benchmark results against your baseline. - Verify contract compatibility. If the new version includes contract changes, run
torvyn linkagainst all your component artifacts. - Rolling upgrade (multi-instance). If running multiple Torvyn instances behind a load balancer, upgrade instances one at a time. Drain active flows on an instance before upgrading.
- In-place upgrade (single instance). Stop the running pipeline (
torvyn runresponds to SIGTERM with graceful shutdown — flows drain, components teardown, resources are reclaimed). Replace the binary. Restart.