Flow Steering

Under normal conditions, with flow steering disabled, the default receive buffer receives all traffic coming off the wire that is destined for the host (or is broadcast/multicast). This means that the kernel driver and user applications may see traffic that isn't relevant to them. To reduce the load on the kernel and applications, traffic can be filtered using the ExaNIC's onboard flow steering capability.

Flow steering makes use of the concepts of filters and buffers, and can be used to direct incoming traffic to a number of different buffers. Each physical port on the card has a default 2 MByte ring buffer (buffer 0) in host memory, to which all traffic is normally delivered and which is shared between the kernel and userspace applications. Each port also has 32 additional userspace buffers, numbered 1 to 32, that can be obtained by a user's application. The FPGA onboard the ExaNIC decides where to transfer each incoming frame based on user-defined rules. Use of ExaNIC flow steering does not incur any additional latency penalty on received frames.

Note that traffic that has been steered to a non-default buffer will no longer be visible in the default buffer. This can affect applications that are not aware of flow steering, including applications that use standard socket calls (such as tcpdump).

The ExaNIC can steer received traffic to a number of userspace buffers based on rules over IP or MAC address.

The ExaNIC can steer traffic based on IP headers:

  • Source IP
  • Destination IP
  • Source Port
  • Destination Port
  • Protocol (e.g. UDP, TCP)

Or L2 Ethernet headers:

  • Destination MAC
  • Ethertype
  • VLAN tag

Rules can be very specific, requiring an exact match over all fields, or very broad, with a wildcard for any of the fields.

Some examples:

  • All traffic belonging to a single TCP connection can be delivered to a buffer by specifying all of the IP fields.
  • Multicast UDP traffic sent to a specific multicast address, from any source address, can be delivered to a buffer by leaving the source fields as wildcards.
  • All UDP traffic, from and to any address and port combination, can be delivered to a buffer by specifying only the protocol.
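
As a rough illustration of how exact-match and wildcard fields combine, the following Python sketch models the three example rules above. It is a conceptual model only, not the ExaNIC API: the names Frame, IpRule, steer and RULES are hypothetical, the addresses and ports are made up, and the first-match ordering between overlapping rules is an assumption, since the priority the hardware applies is not described here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol")


@dataclass(frozen=True)
class Frame:
    """Header fields of a received frame (hypothetical model)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str  # e.g. "udp" or "tcp"


@dataclass(frozen=True)
class IpRule:
    """A steering rule; a field left as None acts as a wildcard."""
    buffer: int
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    protocol: Optional[str] = None

    def matches(self, frame: Frame) -> bool:
        # Every non-wildcard field must equal the frame's field.
        return all(
            getattr(self, name) is None or getattr(self, name) == getattr(frame, name)
            for name in FIELDS
        )


def steer(frame: Frame, rules: Sequence[IpRule]) -> int:
    """Return the buffer of the first matching rule (assumed ordering),
    or 0 (the default buffer) when no rule matches."""
    for rule in rules:
        if rule.matches(frame):
            return rule.buffer
    return 0


# The three examples from the list above, with made-up addresses.
RULES = [
    # One TCP connection: every field specified.
    IpRule(buffer=1, src_ip="10.0.0.5", dst_ip="10.0.0.9",
           src_port=40000, dst_port=443, protocol="tcp"),
    # UDP to one multicast group, from any source.
    IpRule(buffer=2, dst_ip="239.1.1.1", protocol="udp"),
    # All remaining UDP traffic.
    IpRule(buffer=3, protocol="udp"),
]
```

In this model, a frame that matches no rule falls through to buffer 0, which mirrors the default-buffer behaviour described earlier.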

ExaNIC Flow Steering

All of this functionality can be configured using our API. The process amounts to the following steps:

  • Obtain a free userspace RX buffer for your application.
  • Define rules and associate them with this buffer.
  • Monitor this buffer for any inbound frames.

Refer to this page for further details on configuring flow steering.

Load Balancing

In some cases it is desirable for the NIC to balance load across multiple CPUs on the host. The ExaNIC is capable of doing this using its flow hashing functionality. When placed in flow hashing mode, the ExaNIC calculates a hash over the IP headers of incoming frames. This hash is calculated such that packets belonging to each IP flow will always end up in the same buffer. One example of how this could be used is for monitoring the connections passing between network segments.
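
The key property of flow hashing, that every packet of a flow lands in the same buffer, can be sketched in a few lines of Python. This is an illustration under assumptions, not the hash the ExaNIC computes: the function name flow_buffer is hypothetical, CRC32 is used purely as a stand-in for the unspecified hardware hash, and the result is a zero-based index rather than an actual ExaNIC buffer number.

```python
import zlib


def flow_buffer(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                protocol: str, num_buffers: int) -> int:
    """Map a flow to one of num_buffers buffers (zero-based index).

    CRC32 is a stand-in hash (assumption); any deterministic hash over
    the flow's header fields would give the same property.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    return zlib.crc32(key) % num_buffers


# Packets of the same flow always map to the same buffer index.
first = flow_buffer("10.0.0.5", "10.0.0.9", 40000, 443, "tcp", 4)
second = flow_buffer("10.0.0.5", "10.0.0.9", 40000, 443, "tcp", 4)
print(first == second)
```

Because the hash depends only on header fields, a consumer thread pinned to each buffer sees complete flows without any coordination between CPUs. Note that this sketch hashes each direction of a connection separately; whether the hardware maps both directions to the same buffer is not stated here.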

ExaNIC Load Balancing

Refer to this page for further details on configuring flow hashing.

Port Bridging (X10, X4 Only)

Port bridging lets the ExaNIC operate as a mini switch. If you choose to enable port bridging, physical ports 0 and 1 are connected together in hardware by the FPGA. This means that any packets that arrive on port 0 which aren't destined for the host will be forwarded out of port 1 (and likewise for packets arriving on port 1). Additionally, the card supports rate matching in hardware (e.g. translating between 1GbE and 10GbE when the bridged ports have different speeds).
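
The forwarding decision described above can be modelled as a small Python function. This is only a sketch of the documented behaviour, not firmware logic: the name bridge_egress_port is hypothetical, and how the card decides that a frame is "destined for the host" (for example, how broadcast and multicast frames are treated) is not specified here, so it is passed in as a flag.

```python
from typing import Optional

BRIDGED_PORTS = (0, 1)  # the two ports joined by bridging


def bridge_egress_port(ingress_port: int, destined_for_host: bool,
                       bridging_enabled: bool = True) -> Optional[int]:
    """Return the port a frame is forwarded out of, or None if it is not forwarded."""
    if not bridging_enabled or ingress_port not in BRIDGED_PORTS:
        return None
    if destined_for_host:
        # Frames for the host are delivered to the host, not bridged.
        return None
    # Forward out of the other port of the bridged pair.
    return 1 if ingress_port == 0 else 0
```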

The original use case for this feature is to allow customers involved in high-frequency trading to put their most latency-critical server as close to an exchange as possible, with any downstream servers or switches connected behind the ExaNIC. The ability to connect an optional backup power supply to the NIC means that even if this server goes down, network connectivity to downstream devices is maintained. Additionally, bridging incurs zero latency penalty on packets to and from the host.

ExaNIC Port Bridging

To configure port bridging, you can use either ethtool or exanic-config. With ethtool, use ethtool --show-priv-flags and ethtool --set-priv-flags. For example:

$ sudo ethtool --set-priv-flags eth7 bridging on
$ sudo ethtool --show-priv-flags eth7
Private flags for eth7:
bypass_only: off
mirror_rx: off
mirror_tx: off
bridging: on

The same settings can also be configured through exanic-config:

$ sudo exanic-config exanic0 bridging on
exanic0: bridging on (ports 0 and 1)

There are also other applications for bridging. For example, for traffic monitoring, you could use the NIC to pass all network traffic traversing a particular network segment to the host. This traffic could then be timestamped, logged or monitored in real time.

A special firmware image must be applied to the X10 to support this feature. Please refer to the downloads page. Latency is increased by ~30ns when this firmware is applied.

Port Mirroring (X10, X4 Only)

Port mirroring allows you to replicate any traffic received or transmitted on any port out of the last port. This is all done in hardware, so it doesn't incur any overhead on the host processor, and mirroring of both RX and TX data on any of the ports is individually selectable. The last port (mirror port) is still available for use as a normal interface for the host. The intended use case for this feature is for logging, particularly in trading applications.

Timestamps are taken at the moment the first bit of the incoming packet arrives, for both TX and RX. For an incoming packet, this value is latched to the timestamp register for the ingress port. The same is true for a port with an outgoing packet and for ports with loopback enabled (where the timestamp will be the time the looped-back packet arrived).

ExaNIC Port Mirroring

$ sudo exanic-config exanic0:0 mirror-rx on
exanic0:0: RX mirroring on

A special firmware image must be applied to the X10 to support this feature. Please refer to the downloads page. Latency for port 1 transmit is increased by ~30ns when this firmware is applied.

Port mirroring and bridging settings are stored in non-volatile memory and persist across reboots and power-off (this currently applies to the X4 only).