On-Prem

Run high-density video encoding, decoding, and transcoding in your own data center on Quadra VPUs, without changing ingest paths, codecs, or downstream workflows.

Use this architecture when:

  • Regulatory, security, or cost constraints require local processing
  • CPU/GPU encoding limits density or power efficiency
  • Deterministic performance at sustained throughput is required


This architecture is optimized for control, predictability, and maximum encoding efficiency per server.

What changes

  • Video compute shifts from CPU/GPU to VPUs
  • Encoding density per server increases significantly
  • Power, cooling, and rack footprint per stream decrease

What doesn’t change

  • Ingest sources and output destinations
  • FFmpeg/GStreamer-based workflows and codecs
  • Network topology, storage systems, or security posture
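To illustrate how little an FFmpeg-based workflow might need to change, here is a minimal Python sketch that builds the same transcode command twice and swaps only the video encoder. The encoder name `h264_ni_quadra_enc`, the multicast addresses, and the 4M bitrate are assumptions for illustration, not values from this page; run `ffmpeg -encoders` on your Quadra-enabled build to see the names it actually exposes.

```python
from typing import List


def build_transcode_cmd(src: str, dst: str, video_encoder: str,
                        bitrate: str = "4M") -> List[str]:
    """Return an ffmpeg argument list; only video_encoder varies per backend."""
    return [
        "ffmpeg",
        "-i", src,               # ingest path stays the same
        "-c:v", video_encoder,   # the only argument that changes
        "-b:v", bitrate,
        "-c:a", "copy",          # audio is passed through untouched
        "-f", "mpegts",
        dst,                     # output destination stays the same
    ]


CPU_ENCODER = "libx264"
# ASSUMPTION: hypothetical VPU encoder name; verify with `ffmpeg -encoders`.
VPU_ENCODER = "h264_ni_quadra_enc"

# Hypothetical ingest and output endpoints.
SRC = "udp://239.0.0.1:5000"
DST = "udp://239.0.0.2:5000"

cpu_cmd = build_transcode_cmd(SRC, DST, CPU_ENCODER)
vpu_cmd = build_transcode_cmd(SRC, DST, VPU_ENCODER)

# Positions where the two commands differ.
changed = [i for i, (a, b) in enumerate(zip(cpu_cmd, vpu_cmd)) if a != b]

if __name__ == "__main__":
    print(" ".join(vpu_cmd))
    print("changed argument positions:", changed)
```

The sketch only constructs the argument lists and does not invoke FFmpeg, so it runs on any machine; pass `vpu_cmd` to `subprocess.run` once the encoder name has been confirmed.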

VPU placement

  • Quadra VPUs reside inside dedicated on-prem encoding servers
  • VPUs handle encode/decode/transcode only
  • CPU resources are freed for orchestration and control

Scaling model

  • Scale by adding VPU-enabled servers
  • Predictable throughput per server
  • Linear capacity growth without performance degradation
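If throughput per server is predictable, capacity planning reduces to a ceiling division. The sketch below assumes a measured streams-per-server figure and an optional headroom percentage; both parameters and the example numbers are hypothetical and should come from your own benchmark.

```python
def servers_needed(target_streams: int, streams_per_server: int,
                   headroom_pct: int = 20) -> int:
    """Servers required for target_streams, reserving headroom_pct per server.

    Integer arithmetic is used throughout to avoid floating-point rounding.
    """
    if target_streams < 0:
        raise ValueError("target_streams must be non-negative")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in [0, 100)")
    # Streams each server may carry after reserving headroom (rounded down).
    usable = streams_per_server * (100 - headroom_pct) // 100
    if usable <= 0:
        raise ValueError("no usable capacity per server")
    # Ceiling division without floats.
    return -(-target_streams // usable)


if __name__ == "__main__":
    # Hypothetical figure: 100 streams per server, measured on a pilot box.
    for target in (400, 800, 1600):
        print(target, "streams ->", servers_needed(target, 100), "servers")
```

Under the linear model, doubling the target stream count doubles the server count; if your measurements show otherwise, the per-server figure is not yet stable enough to plan with.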

Prerequisites

  • Quadra-enabled on-prem servers
  • Existing ingest and output endpoints
  • Orchestration layer (Bitstreams or existing scheduler)
  • Standard FFmpeg/GStreamer or compatible encoding pipelines

Validation path

  • Deploy a single Quadra-enabled server
  • Benchmark against CPU/GPU systems
  • Measure streams per rack unit, power, and cost per stream
  • Expand incrementally across facilities
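The three metrics in the validation path can be derived from a handful of measurements per system. The following is a minimal sketch; the `BenchmarkResult` type, its field names, and the example values are placeholders invented for illustration, not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    label: str
    concurrent_streams: int   # streams sustained at the target quality
    rack_units: int           # rack space occupied by the system under test
    avg_power_watts: float    # average wall power during the run
    monthly_cost: float       # amortized hardware plus operating cost

    def __post_init__(self) -> None:
        if self.concurrent_streams <= 0 or self.rack_units <= 0:
            raise ValueError("streams and rack units must be positive")

    @property
    def streams_per_rack_unit(self) -> float:
        return self.concurrent_streams / self.rack_units

    @property
    def watts_per_stream(self) -> float:
        return self.avg_power_watts / self.concurrent_streams

    @property
    def cost_per_stream(self) -> float:
        return self.monthly_cost / self.concurrent_streams


def summarize(result: BenchmarkResult) -> dict:
    """Collect the three validation metrics for one system."""
    return {
        "label": result.label,
        "streams_per_rack_unit": result.streams_per_rack_unit,
        "watts_per_stream": result.watts_per_stream,
        "cost_per_stream": result.cost_per_stream,
    }


if __name__ == "__main__":
    # PLACEHOLDER values only; replace with your own measurements.
    pilot = BenchmarkResult("pilot-server", 40, 2, 400.0, 800.0)
    print(summarize(pilot))
```

Recording the same fields for the existing CPU/GPU system and for the pilot server keeps the comparison like for like before expanding to further facilities.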

What this is not

  • Not a SaaS platform
  • Not a general-purpose accelerator
  • Not a proprietary pipeline
  • Not a rip-and-replace migration

Outcome

Higher encoding density. Lower cost per stream. Predictable performance using your existing infrastructure.

Supported by the VPU Ecosystem: partners operating this architecture in production today.