GPU infrastructure that developers love
Run GPU inference workloads with a Python decorator. Instant provisioning, real-time logs, automatic scaling. No YAML, no Docker, no SSH.
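A minimal sketch of what the decorator pattern described above might look like. The `@function` decorator and its `gpu=`/`memory_gb=` parameters are illustrative assumptions, implemented here as a local toy so the shape of the API is concrete; they are not the documented SDK.

```python
from functools import wraps

# Toy stand-in for the platform SDK: the @function decorator and its
# gpu=/memory_gb= parameters are assumptions for illustration only.
def function(gpu: str = "T4", memory_gb: int = 16):
    """Declare hardware requirements next to the code that needs them."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # In the real platform this call would be shipped to a
            # remote GPU container; in this sketch we just run locally.
            return fn(*args, **kwargs)
        # Hardware requirements travel with the function object.
        wrapper.requirements = {"gpu": gpu, "memory_gb": memory_gb}
        return wrapper
    return decorator

@function(gpu="A100", memory_gb=40)
def generate(prompt: str) -> str:
    # Placeholder for actual model inference.
    return f"completion for: {prompt}"

print(generate("hello"))
print(generate.requirements)
```

The point of the pattern: hardware requirements live in code, next to the function they apply to, instead of in a separate YAML file that can drift out of sync.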
Programmable infra
Define everything in code: no YAML or config files. Keep environment and hardware requirements in sync with the functions that use them.
Built for performance
Launch and scale containers in seconds to keep feedback loops tight and latency low.
[Chart: container launch times — VietKong (snapshots), VietKong, Provider A, Provider B, Kubernetes + EC2]
Elastic GPU scaling
Access thousands of GPUs across clouds with no quotas or reservations. Scale back to zero when not in use.
Unified observability
Detailed logs, metrics, and traces for every function call. Debug and optimize without leaving your workflow.
Products & Platform
Everything you need for GPU inference

Inference
Deploy and scale inference for LLMs, audio, image, and video generation.
Learn more →
Multi-cloud
Run on any GPU provider — Vast.ai, Lambda, RunPod, AWS, or your own hardware.
Learn more →
Python-native runtime
No containers to manage. Define your environment with Python decorators and VietKong handles the rest.
Built-in container images
Start from any Docker image, layer Python packages and system deps with a fluent builder API.
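A sketch of what a fluent builder like this might look like. The `Image` class and its `from_registry`, `pip_install`, and `apt_install` methods are assumptions for illustration, implemented as a runnable toy; the real API may differ.

```python
# Sketch of a fluent image-builder API. Class and method names
# (Image, from_registry, pip_install, apt_install) are assumptions
# for illustration, not the documented interface.
class Image:
    def __init__(self, base: str):
        self.base = base
        self.pip_packages: list[str] = []
        self.apt_packages: list[str] = []

    @classmethod
    def from_registry(cls, tag: str) -> "Image":
        # Start from any Docker image tag.
        return cls(tag)

    def pip_install(self, *packages: str) -> "Image":
        # Layer Python packages; returning self enables chaining.
        self.pip_packages.extend(packages)
        return self

    def apt_install(self, *packages: str) -> "Image":
        # Layer system dependencies the same way.
        self.apt_packages.extend(packages)
        return self

image = (
    Image.from_registry("nvidia/cuda:12.4.1-runtime-ubuntu22.04")
    .apt_install("ffmpeg")
    .pip_install("torch", "transformers")
)
print(image.base, image.apt_packages, image.pip_packages)
```

Each method returns the builder itself, so an image definition reads top-to-bottom like a Dockerfile while staying ordinary Python.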
Real-time log streaming
GPU metrics, function logs, and cost data stream back to your terminal via SSE as your code runs.
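To make the SSE mechanism concrete, here is a minimal parser for the Server-Sent Events wire format the text refers to. The event names (`log`, `gpu_metrics`) and payload fields are illustrative assumptions about what such a stream might carry, not the platform's actual schema.

```python
import json

def parse_sse(stream: str):
    """Yield (event, data) pairs from raw SSE-formatted text.

    SSE frames are "event:"/"data:" lines terminated by a blank line.
    """
    event, data_lines = "message", []
    for line in stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates the current frame.
            if data_lines:
                yield event, "\n".join(data_lines)
            event, data_lines = "message", []

# Hypothetical stream contents; event names and fields are assumptions.
raw = (
    "event: log\n"
    'data: {"level": "info", "msg": "model loaded"}\n'
    "\n"
    "event: gpu_metrics\n"
    'data: {"util_pct": 87}\n'
    "\n"
)
for event, data in parse_sse(raw):
    print(event, json.loads(data))
```

SSE is a good fit here because it is plain HTTP: logs and metrics can stream to a terminal (or any HTTP client) with no websocket upgrade or custom protocol.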
Multi-cloud capacity pool
Access GPUs across Vast.ai, Lambda, RunPod, AWS, GCP, Azure, or your own on-prem hardware.