On May 16, Broadcom successfully hosted an AI network seminar in Beijing. Equipment manufacturers from the Internet, operator, and other industries, both domestic and international, gathered to discuss Broadcom's latest technologies and product solutions for smart computing center networks. Attendees shared their best practices in AI networking, which promoted communication and cooperation among customers, partners, and academia and supported technological innovation and development across the industry.

Ruijie Networks was invited to participate in the conference and released its AI Fabric smart computing center network solution for the next-generation AI cloud service. With high throughput, large bandwidth, and high availability, the solution can be applied to a variety of service scenarios such as big data processing, machine learning, and Artificial Intelligence Generated Content (AIGC) to help customers build network card-level smart computing center networks and contribute to the rapid development of AI businesses.

Technical Communication Between Engineers and Customers

At the exhibition site, Ruijie Networks showcased two AI Fabric smart computing center network products: the 400G Network Cloud Processor (NCP) switch RG-S6930-18QC40F1 and the 200G Network Cloud Fabric (NCF) switch RG-X56-96F1.

The RG-S6930-18QC40F1 is 2 RU high and provides 18 x 400G physical ports, 40 x 200G fabric ports, 4 fan modules, and 2 power modules.

The RG-X56-96F1 is 4 RU high and provides 96 x 200G fabric ports, 4 fan modules, and 4 power modules.

 

Left: NCF switch (RG-X56-96F1)  /  Right: NCP switch (RG-S6930-18QC40F1)
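
The port counts quoted above imply the aggregate capacity of each device, which the following short calculation works out. Only the port counts and speeds come from the text; the helper name capacity_gbps and the fabric-to-front-panel ratio are illustrative assumptions, and the source does not state what the additional fabric capacity is used for.

```python
# Aggregate one-direction port capacity implied by the quoted port counts.
# capacity_gbps and fabric_ratio are hypothetical names used for illustration.

def capacity_gbps(port_count: int, speed_gbps: int) -> int:
    """Return the aggregate capacity of a group of identical ports in Gbps."""
    return port_count * speed_gbps

ncp_front_panel = capacity_gbps(18, 400)  # RG-S6930-18QC40F1 physical ports
ncp_fabric = capacity_gbps(40, 200)       # RG-S6930-18QC40F1 fabric ports
ncf_fabric = capacity_gbps(96, 200)       # RG-X56-96F1 fabric ports

# Ratio of fabric-facing capacity to front-panel capacity on the NCP.
fabric_ratio = ncp_fabric / ncp_front_panel

print(f"NCP front panel: {ncp_front_panel / 1000:.1f} Tbps")
print(f"NCP fabric:      {ncp_fabric / 1000:.1f} Tbps")
print(f"NCF fabric:      {ncf_fabric / 1000:.1f} Tbps")
print(f"NCP fabric / front-panel ratio: {fabric_ratio:.2f}")
```

Under these figures the NCP offers 7.2 Tbps toward servers and 8.0 Tbps toward the fabric, a ratio of about 1.11, while the NCF aggregates 19.2 Tbps of fabric capacity.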

At the seminar, Liu Yang of Ruijie Networks delivered a keynote speech titled “Smart Computing Center Networks for the Next-Generation AI Cloud Services”. Driven by AIGC, major global cloud service providers have launched their own large models and corresponding AI cloud services, which accelerates the development of computing servers and speeds up the deployment of AI acceleration cards for cloud users. Improving GPU cluster efficiency is therefore crucial to staying competitive in the AI cloud service model. At the network level, improving bandwidth utilization, reducing dynamic latency, and achieving lossless transmission are the key indicators for improving service efficiency and reducing costs.

 
Speech by Liu Yang, Solution Manager from Ruijie Networks

To meet these key indicators, Ruijie Networks has launched the AI Fabric smart computing center network solution based on high-performance chip technology. The solution maximizes network bandwidth utilization by splitting data flows into equally sized cells and hashing them across all links. An end-to-end flow control mechanism based on Virtual Output Queueing (VOQ) and credits provides a service-independent, lossless, self-contained network. Additionally, the three-layer network architecture based on NCP and NCF can support a graphics processing unit (GPU) cluster of 18K to 32K nodes, facilitating the construction of smart computing center networks for next-generation AI cloud services.

AI Fabric Smart Computing Center Network Solution Architecture
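
The bandwidth benefit of splitting flows into equally sized cells can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Ruijie's or Broadcom's implementation: the 256-byte cell size, the flow names, the CRC32-based per-flow hash used as the baseline, and the round-robin cell distribution (used here in place of the hashing mentioned above, to keep the result deterministic) are all hypothetical choices.

```python
import zlib

CELL_BYTES = 256  # assumed cell size; the source does not state one


def per_flow_hash(flows: dict[str, int], num_links: int) -> list[int]:
    """Baseline: every byte of a flow lands on one link chosen by hashing the flow ID."""
    load = [0] * num_links
    for flow_id, size in flows.items():
        link = zlib.crc32(flow_id.encode()) % num_links
        load[link] += size
    return load


def cell_spray(flows: dict[str, int], num_links: int,
               cell_bytes: int = CELL_BYTES) -> list[int]:
    """Split each flow into equally sized cells and spread them over all links.

    Round-robin is an assumption standing in for the real distribution
    function; the last cell of a flow is padded to the full cell size.
    """
    load = [0] * num_links
    next_link = 0
    for size in flows.values():
        cells = -(-size // cell_bytes)  # ceiling division
        for _ in range(cells):
            load[next_link] += cell_bytes
            next_link = (next_link + 1) % num_links
    return load


def imbalance(load: list[int]) -> int:
    """Bytes separating the busiest link from the least busy one."""
    return max(load) - min(load)


# Two large flows over four links, a pattern typical of collective traffic.
flows = {"gpu0->gpu8": 1_000_000, "gpu1->gpu9": 1_000_000}
print("per-flow hash:", per_flow_hash(flows, 4))
print("cell spray:   ", cell_spray(flows, 4))
```

With fewer large flows than links, per-flow hashing necessarily leaves at least one link idle, whereas cell-level distribution keeps the difference between any two links within a single cell. A real fabric also has to reorder cells at the egress side, which this sketch does not model.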

While developing the AI Fabric smart computing center network solution, Ruijie Networks has also launched a distributed operating system aimed at simplifying deployment and improving system reliability. The conventional Distributed Disaggregated Chassis (DDC) has a centralized control plane. If the Network Cloud Controller (NCC) is disconnected, the entire network is affected, which disrupts the entire service process. Moreover, version incompatibility makes upgrades a significant O&M challenge for some devices. Ruijie Networks' AI Fabric smart computing center network solution adopts a decentralized, distributed operating system that decouples the control plane from the management plane. Even if the management platform encounters problems, the network as a whole is not affected. In addition, it resolves the compatibility problem so that devices can be upgraded independently, making O&M much easier.

Customers and Experts from Various Industries

As a builder of smart computing center networks for next-generation AI cloud services, Ruijie Networks is committed to providing high-quality, highly reliable network solutions and advanced products for smart computing data centers, aiming to meet the growing demand for smart computing centers and to help customers improve service efficiency and reduce costs.

Going forward, Ruijie Networks will continue to improve the AI Fabric smart computing center network solution, pursuing breakthroughs in reducing latency, improving in-network computing performance, and integrating endpoints with the network. These efforts will help build smart computing center networks for next-generation AI cloud services that feature high-speed interconnection, elastic scalability, and power conservation. In addition, Ruijie Networks is actively exploring and developing endpoint and network collaboration solutions based on high-performance chip networking, with intelligent NICs used to optimize end-to-end network performance. Ruijie Networks will join hands with customers to enter the era of AIGC smart computing.