Cisco Nexus 9000 Switches for Connecting Intel Gaudi 2 Accelerators Buying Guide

Updated: May 15, 2025


Cisco Nexus 9364D-GX2A switches are qualified to connect Intel Gaudi 2 servers in a scale-out network for running Large Language Model (LLM) training, inference, or similar Artificial Intelligence (AI), Machine Learning (ML), or generative AI workloads.

Q.  What are Intel Gaudi 2 servers?
A.  Each Intel Gaudi 2 server contains eight Gaudi 2 accelerators. Models that do not fit in eight accelerators, or do not train fast enough on them, can benefit from connecting multiple Gaudi 2 servers over a non-blocking 400 GbE Ethernet network that supports RoCEv2 transport.

Figure 1. Intel Gaudi accelerator

A Gaudi 2 server provides 24 x 100 GbE scale-out or back-end interfaces exposed via six QSFP-DD ports for an inter-Gaudi 2 network, also known as a scale-out network or a back-end network. These ports (marked as QSFP-DD0 to QSFP-DD5 in Figure 2) are dedicated to interconnecting Gaudi 2 accelerators in other servers and are not connected to external networks.

A Gaudi 2 server also provides two NICs, each with two 100 GbE ports (four ports in total), for external, storage, and management connectivity. In Figure 2, these NIC ports are below the Gaudi 2 scale-out ports.

Figure 2. Front view of an Intel Gaudi 2 server

For more information, refer to https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi2.html

Q.  What are Cisco Nexus 9364D-GX2A switches?
A.  Cisco Nexus 9364D-GX2A switches (see Figure 3) provide 64 x 400 GbE QSFP-DD ports in a 2 RU form factor. Each port can also be broken out into 4 x 100 GbE interfaces. Inter-Switch Links (ISLs) should run at 400 GbE, whereas links to the Intel Gaudi 2 servers use 4 x 100 GbE breakouts.
Q.  What cables and transceivers are qualified for connecting Cisco Nexus 9364D-GX2A switches with Intel Gaudi 2 servers?
A.  QDD-400G-DR4-S transceivers and MPO cables are qualified for connecting Nexus 9364D-GX2A switches to the Intel Gaudi 2 server scale-out/back-end ports at 4 x 100 GbE.
For 400 GbE ISL connectivity, QDD-400-AOC20M active optical cables were used in qualification, but any other qualified transceiver listed in the Cisco Optics Compatibility Matrix can be used.

Figure 3. Port-side view of a Cisco Nexus 9364D-GX2A switch
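
The port counts quoted above translate into a simple port budget, sketched below in Python. The constants come from this guide (64 ports per switch, 4 x 100 GbE per breakout, six scale-out ports per server). The function names are hypothetical, and the even split between server-facing ports and ISL ports is an assumption about how a non-blocking leaf might be laid out, not a rule stated in this guide.

```python
PORTS_PER_SWITCH = 64        # 400 GbE QSFP-DD ports on a Nexus 9364D-GX2A
BREAKOUT_FACTOR = 4          # each 400 GbE port breaks out to 4 x 100 GbE
SERVER_SCALE_OUT_PORTS = 6   # QSFP-DD scale-out ports per Gaudi 2 server


def single_switch_capacity():
    """Return (100 GbE interfaces, whole servers) for one standalone switch."""
    interfaces = PORTS_PER_SWITCH * BREAKOUT_FACTOR
    servers = PORTS_PER_SWITCH // SERVER_SCALE_OUT_PORTS
    return interfaces, servers


def leaf_port_split(uplink_fraction=0.5):
    """Split a leaf's ports between servers and ISLs.

    Assumption: a non-blocking leaf reserves as much bandwidth for
    uplinks as it offers to servers, hence the default of 0.5.
    Returns (downlink ports, uplink ports, 100 GbE server interfaces).
    """
    uplinks = int(PORTS_PER_SWITCH * uplink_fraction)
    downlinks = PORTS_PER_SWITCH - uplinks
    return downlinks, uplinks, downlinks * BREAKOUT_FACTOR


if __name__ == "__main__":
    print(single_switch_capacity())
    print(leaf_port_split())
```

Under these assumptions a standalone switch has physical room for ten six-port servers, while a leaf that also carries ISLs offers 32 server-facing ports.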

Q.  How do I buy Cisco Nexus switches for AI/ML use cases?
A.  The Cisco part number N9K-C9364D-A1 bundles 12 Nexus 9364D-GX2A switches and a 3-year DCN Advantage license. The DCN Advantage license includes Cisco Nexus Dashboard features that simplify the configuration and monitoring of networks. We recommend upgrading to the DCN Premier license for full Nexus Dashboard functionality.
You can also buy using the individual switch part number, N9K-C9364D-GX2A.
Q.  What is the typical sizing of a Gaudi 2 accelerator scale-out network?
A.  The following table shows sizing of a Gaudi 2 accelerator scale-out network using a non-blocking and non-oversubscribed design.

| # Gaudi 2 accelerators | # HLS-Gaudi2 servers(1) | # scale-out/back-end QSFP-DD ports(2) | # scale-out/back-end 100 GbE interfaces(3) | # Cisco Nexus 9364D-GX2A leaf switches(4) | # Cisco Nexus 9364D-GX2A spine switches(4) | # Total Cisco Nexus 9364D-GX2A switches | # QDD-400-AOCxM transceivers(5) | # QDD-400G-DR4-S transceivers(6) | # MPO cables(7) |
|---|---|---|---|---|---|---|---|---|---|
| 8 | 1 | 6 | 24 | 0 | 0 | 0 | 0 | 0 | 0 |
| 16 | 2 | 12 | 48 | 1 | 0 | 1 | 0 | 24 | 12 |
| 32 | 4 | 24 | 96 | 1 | 0 | 1 | 0 | 48 | 24 |
| 64 | 8 | 48 | 192 | 1 | 0 | 1 | 0 | 96 | 48 |
| 128 | 16 | 96 | 384 | 3 | 0 | 3 | 96 | 192 | 96 |
| 256 | 32 | 192 | 768 | 6 | 3 | 9 | 192 | 384 | 192 |
| 512 | 64 | 384 | 1536 | 12 | 6 | 18 | 384 | 768 | 384 |
| 1024 | 128 | 768 | 3072 | 24 | 12 | 36 | 768 | 1536 | 768 |
| 2048 | 256 | 1536 | 6144 | 48 | 24 | 72 | 1536 | 3072 | 1536 |
| 4096 | 512 | 3072 | 12288 | 96 | 48 | 144 | 3072 | 6144 | 3072 |


(1) Each Gaudi 2 server has eight Gaudi 2 accelerators.
(2) Each Gaudi 2 server has six QSFP-DD ports for connecting to an inter-Gaudi accelerator network (aka scale-out network or back-end network).
(3) Each QSFP-DD scale-out/back-end port on a Gaudi 2 server breaks out to 4 x 100 GbE interfaces.
(4) Each Nexus 9364D-GX2A switch can connect up to 64 x 400 GbE or 256 x 100 GbE interfaces.
(5) QDD-400-AOCxM has two transceivers and a cable to connect two ports at 400 GbE.
(6) Two QDD-400G-DR4-S transceivers (one in the server port and one in the switch port) are needed for each set of 4 x 100 GbE interfaces.
(7) One MPO cable is needed for each pair of QDD-400G-DR4-S transceivers.
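
The server-side columns of the sizing table follow directly from notes (1), (2), (3), (6), and (7). The Python sketch below reproduces that arithmetic. The function name `server_side_bom` and the dictionary keys are hypothetical. The switch and QDD-400-AOCxM columns depend on the fabric topology and are deliberately not modeled here. Treating a single server as needing no switch-side optics mirrors the first row of the table.

```python
ACCELERATORS_PER_SERVER = 8   # note (1)
QSFP_DD_PORTS_PER_SERVER = 6  # note (2)
INTERFACES_PER_PORT = 4       # note (3): 4 x 100 GbE per QSFP-DD port


def server_side_bom(accelerators):
    """Estimate server-side quantities for a given accelerator count."""
    if accelerators <= 0 or accelerators % ACCELERATORS_PER_SERVER:
        raise ValueError("accelerator count must be a positive multiple of 8")

    servers = accelerators // ACCELERATORS_PER_SERVER
    ports = servers * QSFP_DD_PORTS_PER_SERVER
    interfaces = ports * INTERFACES_PER_PORT

    if servers == 1:
        # A single server has no scale-out switch to connect to,
        # so the table lists no transceivers or cables for it.
        dr4_transceivers = mpo_cables = 0
    else:
        dr4_transceivers = 2 * ports  # note (6): one per link end
        mpo_cables = ports            # note (7): one per transceiver pair

    return {
        "servers": servers,
        "qsfp_dd_ports": ports,
        "interfaces_100g": interfaces,
        "dr4_transceivers": dr4_transceivers,
        "mpo_cables": mpo_cables,
    }


if __name__ == "__main__":
    for count in (8, 16, 128, 4096):
        print(count, server_side_bom(count))
```

For accelerator counts that are not in the table, the same arithmetic gives a first estimate of optics and cabling, but the switch counts should still be confirmed against a validated design.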

 

 
