Project SGLang Spins Out as RadixArk With $400M Valuation as Inference Market Explodes

RadixArk, the startup behind AI inference tool SGLang, hits $400M valuation amid surging demand for efficient model deployment.
Matilda

RadixArk Emerges From SGLang with $400M Valuation

In a move that underscores the explosive growth of AI infrastructure, RadixArk, the commercial entity behind the open-source project SGLang, has secured a $400 million valuation in a recent funding round. If you've been wondering who's building the next generation of tools to make AI models faster and cheaper to run, the answer increasingly points to startups like RadixArk. Born from research at UC Berkeley and now backed by top-tier investors including Accel and Intel CEO Lip-Bu Tan, the company is positioning itself at the heart of the AI inference boom.

Image credit: Yuichiro Chino / Getty Images

The timing couldn’t be better. As enterprises rush to deploy large language models (LLMs) in production, the cost and latency of running those models have become critical bottlenecks. Enter SGLang: a programming framework designed to optimize how AI models generate responses, dramatically cutting both time and compute expenses. Now, with its core team transitioning into a formal startup, RadixArk aims to turn academic innovation into enterprise-grade infrastructure.

From Berkeley Lab to Billion-Dollar Potential

RadixArk didn’t appear out of thin air. Its roots trace back to 2023, when researchers at UC Berkeley—led by Databricks co-founder Ion Stoica—began developing SGLang as a way to streamline the execution of complex AI workloads. The project quickly gained traction in developer communities for its ability to accelerate inference, the phase where trained models actually produce outputs like text, code, or images.

Unlike training, which is largely a one-time cost, inference occurs every time a user interacts with an AI system, making efficiency here crucial for scalability and cost control. SGLang tackled this by introducing a new programming model that lets developers write "structured generation" logic directly into their prompts, enabling smarter, more efficient decoding without sacrificing flexibility.
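
To make that concrete, here is a minimal sketch of what an SGLang-style program can look like, drawn from the project's public frontend API. The ticket-triage task, the label set, and the local endpoint are illustrative assumptions for this article, and the snippet presumes an SGLang server is already running.

    # Minimal sketch of SGLang's frontend style; the triage task, labels,
    # and endpoint below are illustrative assumptions, not from the article.
    import sglang as sgl

    # Point the frontend at a running SGLang runtime, launched separately
    # (e.g. `python -m sglang.launch_server --model-path <model>`).
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    @sgl.function
    def triage(s, ticket):
        # Prompt text and generation calls interleave as ordinary Python.
        s += "Support ticket: " + ticket + "\n"
        # Constrain decoding to a fixed label set instead of free-form text.
        s += "Category: " + sgl.gen("category", choices=["billing", "bug", "feature"])
        s += "\nOne-sentence summary: " + sgl.gen("summary", max_tokens=32)

    state = triage.run(ticket="The app charges my card twice every month.")
    print(state["category"], "|", state["summary"])

Because each generation call is named, downstream code can read individual fields from the returned state instead of parsing raw model text, which is a large part of what makes the approach feel like programming rather than prompting.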

What started as a research experiment soon caught the attention of industry heavyweights. Companies like xAI and Cursor adopted SGLang early, using it to speed up their internal AI pipelines. That real-world validation became the springboard for commercialization.

Leadership Shift Signals Serious Ambition

A key turning point came when Ying Sheng, a former engineer at Elon Musk’s xAI and a research scientist at Databricks, stepped away from her role to co-found and lead RadixArk as CEO. Her LinkedIn announcement in December 2025 confirmed the transition, signaling not just a change in title but a strategic pivot from open-source contribution to full-time startup building.

Sheng’s background is telling. Having worked at the intersection of cutting-edge AI research and real-world deployment, she brings both technical depth and product intuition to the table. Her decision to leave xAI—a company flush with resources and talent—highlights how compelling the opportunity around inference optimization has become.

Though neither Sheng nor lead investor Accel responded to requests for comment, insiders confirm that the recent funding round was oversubscribed, reflecting intense investor interest in the AI infrastructure layer. Notably, Intel CEO Lip-Bu Tan participated as an angel investor, adding strategic weight beyond just capital.

Why Inference Is the New Battleground

For years, the AI race focused almost exclusively on training bigger models. But in 2026, the narrative has shifted. Training may get the headlines, but inference is where the money is spent—and lost. Industry estimates suggest that inference can account for up to 80% of total AI operational costs over a model’s lifetime.

This reality has ignited a gold rush for tools that make inference faster, cheaper, and more energy-efficient. RadixArk’s SGLang stands out by offering a developer-friendly approach: instead of relying solely on hardware upgrades or black-box optimizations, it empowers engineers to write more intelligent generation logic that the system can execute efficiently.

For example, when generating code or structured data, SGLang allows developers to embed constraints—like JSON schemas or syntax rules—directly into the generation process. This reduces wasted computation on invalid outputs and speeds up response times, a win for both user experience and bottom-line costs.
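
As a hedged illustration of that idea, the sketch below uses SGLang's regex constraint on a generation call so the decoder can only emit strings matching a fixed JSON shape; the extraction task, the pattern, and the endpoint are assumptions made for the example.

    # Illustrative sketch of constrained decoding via SGLang's regex
    # parameter; the task, pattern, and endpoint are assumptions.
    import sglang as sgl

    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    # The only output shape the decoder is allowed to produce.
    LOCATION_JSON = r'\{"city": "[A-Za-z ]+", "country": "[A-Za-z ]+"\}'

    @sgl.function
    def extract_location(s, sentence):
        s += "Extract the location mentioned in: " + sentence + "\n"
        # Tokens that would break the pattern are masked out during decoding,
        # so no compute is wasted generating and retrying malformed JSON.
        s += sgl.gen("location", regex=LOCATION_JSON, max_tokens=64)

    state = extract_location.run(sentence="She flew from Paris, France to Tokyo.")
    print(state["location"])  # e.g. {"city": "Paris", "country": "France"}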

Open Source Roots, Commercial Future

RadixArk’s strategy mirrors a growing trend in AI infrastructure: start with a powerful open-source project, build community and credibility, then launch a company to offer managed services, enterprise support, and advanced features. This playbook has worked for companies like Hugging Face, Weaviate, and even Databricks itself.

SGLang remains open source, ensuring continued adoption and contribution from the global developer community. But RadixArk is already working on proprietary extensions—think enhanced monitoring, multi-model orchestration, and cloud-native deployment tools—that will appeal to businesses running mission-critical AI applications.

This dual-track approach balances innovation with monetization. Developers get a free, powerful tool to experiment with; enterprises get a supported, scalable solution they can trust in production.  

The $400M Question: Can RadixArk Scale?

A $400 million valuation is impressive for a company that only formally launched in August 2025. But in today’s AI market, speed often trumps size. With inference demand skyrocketing and cloud providers racing to offer optimized AI stacks, RadixArk is entering a high-stakes, high-reward arena.

Competitors range from cloud-native hardware such as AWS Inferentia and Google's TPUs to open-source inference engines like vLLM and NVIDIA's TensorRT-LLM. Yet RadixArk's differentiator lies in its programming-first philosophy: it doesn't just accelerate models; it rethinks how developers interact with them.

Still, challenges loom. Scaling a distributed systems startup requires deep engineering talent, robust customer support, and seamless integration with existing MLOps workflows. And while early adopters like xAI provide validation, broader enterprise adoption will demand security certifications, SLAs, and interoperability with major cloud platforms.

If RadixArk can deliver on those fronts, its valuation may look conservative within a year.

What This Means for Developers and Businesses

For developers, RadixArk’s rise is good news. It means more tools to reduce latency and cost without sacrificing control. SGLang’s expressive syntax lowers the barrier to writing efficient AI applications—whether you’re building a chatbot, a code assistant, or a real-time analytics engine.

For businesses, the implications are even larger. Every millisecond shaved off inference time translates to better user retention, lower cloud bills, and greener AI operations. In an era where AI ROI is under scrutiny, efficiency isn’t optional—it’s existential.

RadixArk’s emergence signals a maturing AI ecosystem: one that’s moving beyond hype and toward sustainable, scalable deployment. The infrastructure layer is no longer an afterthought—it’s the foundation.

RadixArk’s $400 million valuation isn’t just a number—it’s a vote of confidence in the future of efficient AI. By transforming an academic project into a venture-backed powerhouse, the team behind SGLang is betting that the next wave of AI innovation won’t come from bigger models, but from smarter ways to run them.

As the inference market explodes, startups like RadixArk aren’t just riding the wave—they’re building the engines that will power it. And in 2026, that might be the most valuable position of all.
