ONE Summit 2024 has ended
In Person
April 29 - May 1, 2024
Tuesday, April 30 • 4:20pm - 4:50pm
Non-Minimal Path Routing for AI and Cloud Computation Networks - David Wong, Claruspon Systems, Inc.

Several factors are driving AI networks to expand from supporting 25 thousand GPUs to 1 million GPUs, and from interconnecting clusters within a single datacenter to interconnecting clusters that span tens of datacenters. These include 1) the sustainable power that can be delivered to a single datacenter and 2) the growing number of homogeneous and heterogeneous AI clusters, together with the desire to bring AI resources together for higher aggregate performance. In the second half of 2023, non-minimal path routing in Dragonfly and Dragonfly+ topologies was proposed. In this talk we analyze Clos and Mesh topologies using a model of path diversity from a network congestion point of view. We review findings from AI workload optimization (such as the work from Meta and MIT). We also review the two facets of non-minimal path routing on the (e)BGP protocol. Finally, we return to the Dragonfly and Dragonfly+ topologies with what we have learned and tentatively propose several large-scale AI networks whose characteristics are likely to meet the needs of low network latency, low congestion, zero packet loss, and large bisection bandwidth.

Speakers
David Wong

Founder & CEO, Claruspon Systems, Inc.
David I-Keong Wong is the founder of Claruspon Systems, Inc., a Bay Area company. He is the inventor and owner of several US patents in mesh networks and non-minimal path routing for data centers, and a designer of a datacenter switch product. He built a datacenter mini pod using datacenter...



Tuesday April 30, 2024 4:20pm - 4:50pm PDT
211 CD