Enfabrica Unveils Industry’s First Ethernet-Based AI Memory Fabric System for Efficient Superscaling of LLM Inference

Elastic Networked-Memory Solution Delivers Multi-800GB/s Read-Write Throughput Over Ethernet and Up To 50% Lower Cost Per Token Per User in AI Inference Workloads

Enfabrica Corporation, an industry leader in high-performance networking silicon for artificial intelligence (AI) and accelerated computing, today announced the availability of its Elastic Memory Fabric System “EMFASYS”, a transformative hardware and software solution designed to dramatically improve compute efficiency in large-scale, distributed, memory-bound AI inference workloads. EMFASYS is the first commercially available system that integrates high-performance Remote Direct Memory Access (RDMA) Ethernet networking with an abundance of parallel Compute Express Link (CXL)-based DDR5 memory channels. The solution provides AI compute racks with fully elastic memory bandwidth and memory capacity in a standalone appliance, reachable by any GPU server at low, bounded latency over existing network ports.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20250729711298/en/

Inference Cluster with EMFASYS Elastic AI Memory Fabric


Generative, agentic, and reasoning-driven AI workloads are growing exponentially – in many cases requiring 10 to 100 times more compute per query than previous Large Language Model (LLM) deployments and accounting for billions of batched inference calls per day across AI clouds. The EMFASYS solution addresses the critical need for AI clouds to extract the highest possible utilization of GPU and High-Bandwidth Memory (HBM) resources in the compute rack while scaling to greater user/agent counts, accumulated context, and token volumes. It achieves this outcome by dynamically offloading data from HBM to commodity DRAM using a caching hierarchy, load-balancing token generation across AI servers, and reducing stranding of expensive GPU cores. When deployed at scale with Enfabrica’s EMFASYS remote memory software stack, the solution enables up to 50% lower cost per token per user, allowing foundational LLM providers to deliver significant savings in a price/performance-tiered model.

“AI Inference has a memory bandwidth-scaling problem and a memory margin-stacking problem,” said Rochan Sankar, CEO of Enfabrica. “As inference gets more agentic versus conversational, more retentive versus forgetful, the current ways of scaling memory access won’t hold. We built EMFASYS to create an elastic, rack-scale AI memory fabric and solve these challenges in a way that hasn’t been done before. Customers are excited to partner with us to build a far more scalable memory movement architecture for their GenAI workloads and drive even better token economics.”

EMFASYS Features and Benefits

  • Powered by Enfabrica’s 3.2 Terabits/second (Tbps) Accelerated Compute Fabric SuperNIC (ACF-S) elastically connecting up to 144 CXL memory lanes to 400/800 Gigabit Ethernet (GbE) ports
  • Offloads GPU and HBM consumption by enabling shared memory targets of up to 18 terabytes (TB) of CXL DDR5 DRAM per node, networked using 3.2 Tbps RDMA over Ethernet
  • Effectively aggregates CXL memory bandwidth for AI by enabling the application to stripe transactions across a wide number of memory channels and Ethernet ports
  • Delivers uncompromised AI workload performance with read access times in microseconds and a software-enabled caching hierarchy that hides transfer latency within inference pipelines
  • Drives down the cost of LLM inference at scale, particularly in large-context and high-turn workloads, by containing growth in GPU server compute and memory footprints
  • Outperforms flash-based inference storage solutions with 100x lower latency and unlimited write/erase transactions

Enfabrica’s EMFASYS system effectively allows AI cloud operators to deploy massively parallel, low-latency ‘Ethernet memory controllers’, fed by wide GPU networking pipes and populated with pooled, commodity DRAM they can purchase directly from DRAM suppliers. Scaling memory with EMFASYS alleviates the tax of linearly growing GPU HBM and CPU DRAM resources within the AI server purely to meet inference service scale requirements.

The announcement of the EMFASYS memory fabric system follows Enfabrica’s successful sampling earlier this year of its industry-leading 3.2 Tbps ACF-S chip – the AI networking silicon at the heart of EMFASYS. The ACF-S chip delivers multi-port 800 GbE connectivity to GPU servers and 4x the I/O bandwidth, radix, and multipath resiliency of any other GPU-attached network interface controller (NIC) product available today. By virtue of the chip’s flexibility, ACF-S supports high-throughput, zero-copy, direct data placement and steering not only across a 4- or 8-GPU server complex, but alternatively across 18+ channels of CXL-enabled DDR memory. EMFASYS leverages the ACF-S chip’s high-performance RDMA-over-Ethernet networking and on-chip memory movement engines, along with a remote memory software stack based on InfiniBand Verbs, to enable massively parallel, bandwidth-aggregated memory transfers between GPU servers and commodity DRAM over resilient bundles of 400/800G network ports.

The release of the EMFASYS AI memory fabric system builds on Enfabrica’s expanding presence in the AI infrastructure industry and its pioneering approach to optimizing and democratizing accelerated computing networks. Earlier this year, Enfabrica opened a new R&D center in India to grow its world-class engineering team and scale silicon and software product development. In April, the company began sampling its 3.2 Tbps ACF-S chip following the announcement of the solution’s general availability late last year. Enfabrica is also an active advisory member of the Ultra Ethernet Consortium (UEC) and a contributor to the Ultra Accelerator Link (UALink) Consortium.

Availability:

Both the EMFASYS AI memory fabric system and the 3.2 Tbps ACF SuperNIC chip are currently sampling and piloting with customers.

About Enfabrica:

Enfabrica is a cutting-edge silicon and software company building disruptive networking solutions for parallel, heterogeneous, and accelerated computing infrastructure. As the inventors of the Accelerated Compute Fabric SuperNIC (ACF-S), Enfabrica’s groundbreaking chips, software stack design, and partner-enabled systems give customers the freedom to stitch the fabric of our AI-enabled future and scale GPU and accelerated compute clusters like no one has. Enfabrica is elevating networking for the age of GenAI by producing the world’s most advanced, performant and efficient solutions that interconnect compute, memory and network. To learn more, follow us on LinkedIn and visit enfabrica.net.

Third-party trademarks mentioned are the property of their respective owners.

Contacts

Media Contact:

Nick Gibiser for Enfabrica

Wireside Communications

Email: ngibiser@wireside.com

Phone: +1-804-500-6660
