About Tandemn
We're abstracting the hardware layer from AI software.
Massive GPU capacity sits underutilized while teams struggle with access and cost. We're building the infrastructure layer that fixes this — open-source, transparent, and production-ready.
Make GPU infrastructure invisible.
Teams shouldn't have to become infrastructure experts to run AI workloads. We're building a platform where you describe what you need — model, latency, budget — and the system figures out the rest. GPU selection, routing, scaling, failover — all handled automatically.
Our goal is simple: let engineers focus on building AI products, not managing GPU fleets.
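To make the idea concrete, here is a minimal sketch of what a declarative workload request could look like. Everything here is hypothetical: the `WorkloadSpec` fields, the `plan` function, and the GPU tiers with their latency and cost figures are illustrative placeholders, not Tandemn's actual API or pricing.

```python
from dataclasses import dataclass

@dataclass
class WorkloadSpec:
    """Hypothetical declarative spec: describe the workload, not the hardware."""
    model: str
    max_latency_ms: int
    budget_per_hour_usd: float

def plan(spec: WorkloadSpec) -> dict:
    # Toy scheduler: pick the first GPU tier meeting both constraints.
    # (Illustrative only; a real system would also weigh availability,
    # routing, autoscaling, and failover.)
    tiers = [  # (name, estimated latency in ms, cost in $/hr) -- made-up numbers
        ("A10", 18, 0.6),
        ("L40S", 9, 1.1),
        ("H100", 4, 2.5),
    ]
    for name, latency, cost in tiers:
        if latency <= spec.max_latency_ms and cost <= spec.budget_per_hour_usd:
            return {"gpu": name, "est_latency_ms": latency, "cost_per_hour": cost}
    raise ValueError("no GPU tier satisfies the latency/budget constraints")

spec = WorkloadSpec(model="llama-3-8b", max_latency_ms=10, budget_per_hour_usd=2.0)
print(plan(spec))  # the cheapest tier that fits both bounds
```

The point of the sketch is the shape of the interface: the caller states model, latency, and budget, and hardware choice stays behind the abstraction.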
The Team
Building the future of distributed inference.
What We Believe
The principles that guide how we build.
Open Source First
Tuna and Orca are fully open source. We believe infrastructure should be inspectable, forkable, and community-driven. No black boxes.
Transparent by Default
Public benchmarks, open roadmaps, audit-ready code. If you're trusting us with your inference stack, you should be able to see exactly how it works.
Built for Production
We don't ship demos. Every feature is designed for real workloads — spot preemptions, heterogeneous GPUs, traffic spikes, SLO deadlines. The messy real world.
Get in Touch
Interested in what we're building? Whether you want to contribute, partner, or just talk distributed inference — we'd love to hear from you.