This chip startup just raised $135M on a bet that AI’s biggest bottleneck isn’t compute, it’s memory | TechCrunch


Every time you ask ChatGPT a question, your request triggers a data relay race. Data leaves memory, passes through a CPU for preprocessing, travels to a GPU for heavy computation, and then makes its way back. That entire journey repeats for every single word the AI generates.

The bottleneck is structural: every single request is routed through some of the most expensive and power-hungry chips in the industry. That inefficiency is exactly what XCENA, a startup with offices in South Korea and the U.S., is trying to solve. The four-year-old startup has designed a chip that places compute capabilities much closer to DRAM, the fast, short-term memory chips that store data a processor is actively using. That allows routine data operations to be handled near memory, without the costly round trips between CPUs, GPUs, and memory.

If it works at scale, the implications for AI infrastructure costs could be significant, which largely explains investor enthusiasm across the country. Indeed, XCENA just raised $135 million in a Series B at a valuation of $570 million, bringing its total raised to $185 million.

XCENA CEO Jin Kim co-founded the startup in 2022 alongside CTO Dohun Kim and CPO Harry Juhyun Kim, all veterans of Samsung and SK Hynix, the memory giants that supply chips powering Nvidia’s GPUs. “CPUs and GPUs have both gotten smarter over the decades. Memory never did. XCENA wants to change that,” Kim said in an interview with TechCrunch. “The recent rise in memory prices and related stocks points to a broader shift in AI infrastructure toward memory-centric architectures,” he added. (Samsung, SK Hynix, and Micron, the three companies that dominate the global memory chip market, each crossed a trillion-dollar valuation for the first time this month.)

XCENA is betting its business on the thesis that “inference isn’t just a compute problem; it’s increasingly a memory scaling problem,” said Kim.

XCENA’s chip, the MX1, connects to the CPU through CXL (Compute Express Link), essentially a dedicated express lane between the processor and memory, and processes data before it ever needs to leave the memory module. It brings compute to the data, not the other way around. The company claims that what used to require 10 servers could potentially run on just one.
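To make the "compute to the data" idea concrete, here is a minimal sketch of how much data crosses the processor-memory link when a filter runs on the host versus near memory. The record size, the record counts, and the function names are illustrative assumptions; none of them come from XCENA or describe the MX1, and the resulting ratio is unrelated to the company's server claim.

```python
# Toy model of link traffic for a simple filter operation.
# All numbers are assumptions chosen for illustration only.

RECORD_BYTES = 64  # assumed size of one record


def bytes_moved_host_side(num_records: int) -> int:
    """Host-side filtering: every record crosses the link to reach the CPU."""
    return num_records * RECORD_BYTES


def bytes_moved_near_memory(num_matches: int) -> int:
    """Near-memory filtering: only the matching records cross the link."""
    return num_matches * RECORD_BYTES


def traffic_reduction(num_records: int, num_matches: int) -> float:
    """Ratio of host-side traffic to near-memory traffic."""
    near = bytes_moved_near_memory(num_matches)
    if near == 0:
        return float("inf")  # nothing needs to leave the memory module
    return bytes_moved_host_side(num_records) / near


if __name__ == "__main__":
    # Assume 1,000,000 records scanned and 50,000 of them match the filter.
    print(bytes_moved_host_side(1_000_000))
    print(bytes_moved_near_memory(50_000))
    print(traffic_reduction(1_000_000, 50_000))
```

The saving in this model depends entirely on how selective the operation is: a filter that keeps most records moves almost as much data either way.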

“While GPUs excel at matrix multiplication, the heavy math behind AI model training, much of the surrounding data orchestration, including preprocessing, KV cache management [the system that stores prior conversation context so a model doesn’t have to reprocess it], and data caching, still runs on CPUs. Our chip handles these tasks directly inside the memory module itself,” Kim said.
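For a rough sense of why the KV cache becomes a memory scaling problem, the sketch below estimates its size with the standard formula for transformer models: two tensors (keys and values) per layer, each holding one vector per KV head per token. The model dimensions used here are assumptions typical of a 7-billion-parameter model with 16-bit values; they are not figures from XCENA or from any workload the company has described.

```python
def kv_cache_bytes(
    seq_len: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes = 16-bit values (assumed)
    batch_size: int = 1,
) -> int:
    """Estimate KV cache size: keys and values for every layer, head, and token."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    # Assumed dimensions: 32 layers, 32 KV heads, head dimension 128.
    for context in (1024, 4096, 32768):
        size = kv_cache_bytes(context, num_layers=32, num_kv_heads=32, head_dim=128)
        print(f"{context:>6} tokens: {size / 2**30:.2f} GiB per sequence")
```

Under these assumptions the cache grows linearly with context length and with the number of concurrent conversations, so memory capacity rather than arithmetic throughput can become the limit.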

Demand for memory solutions has surged since the second half of last year, and the company believes the timing is working in its favor.

Conversations with several global memory vendors are in early stages, though Kim declined to name them. The company’s ultimate customers are hyperscalers spending tens of billions a year on AI infrastructure, where even a small gain in memory efficiency can mean hundreds of millions in savings.

The MX1 is still a prototype. Mass production chips are scheduled to roll off Samsung’s foundry lines by the end of 2026, with the company expecting to generate revenue starting in 2027.

While neural processing unit (NPU) makers are competing to challenge Nvidia for training workloads, XCENA is targeting the memory-intensive layer that sits beneath it all.

XCENA’s closest rivals include Astera Labs and Marvell, both Nasdaq-listed companies working on next-generation memory connectivity. Marvell is a large, established player already operating in the same space, Kim said, adding that the differentiator comes down to intellectual property. “We have thousands of cores,” Kim said. Based on public specifications, Marvell’s approach relies on a handful of general-purpose cores by comparison.

These cores are built on RISC-V, an open-standard instruction set architecture, and optimized specifically for data processing, with each core deliberately kept small and efficient. Beyond the cores themselves, XCENA designs its own internal memory hierarchy, interconnect bus, and DRAM controller. Most chip companies, including larger rivals, typically outsource that work, which gives XCENA an unusual level of vertical integration.

Seoul-based VC firms Altinum and IMM Investment co-led the Series B round, which also included Corstone Asia and existing investors SBI Investment and Mirae Asset Capital. The company, which has more than 90 staff across offices in Pangyo, a tech hub outside Seoul, and in Sunnyvale, is also in conversations with international investors about additional funding.
