Table Of Contents
Fundamentals of the Cisco High-Performance Subnet Manager
Overview
Why a High-Performance Subnet Manager?
Features
Compatibility and Requirements
High Availability Compatibility
Installing the Cisco High-Performance Subnet Manager
Installing the High-Performance Subnet Manager on Cisco InfiniBand Drivers
Installing the High-Performance Subnet Manager on OpenFabrics InfiniBand Drivers
Using the Cisco High-Performance Subnet Manager
Starting the High-Performance Subnet Manager
The ib_sm Command
Accessing the CLI
Editing the Command Line
Saving and Restoring the High-Performance Subnet Manager Configuration
Known Issues
Fundamentals of the Cisco High-Performance Subnet Manager
This chapter describes the fundamentals of the Cisco High-Performance Subnet Manager (HSM) for InfiniBand server switches. The HSM manages the InfiniBand (IB) subnet from a stand-alone, Linux-based server. The HSM software runs on the server and connects to the IB network through a Host Channel Adapter (HCA). The HSM manages fabrics composed of Cisco Server Fabric Switch (SFS) products, as well as third-party InfiniBand switches. Like the Subnet Manager (SM) embedded on SFS switches, the HSM discovers the InfiniBand fabric, establishes routing, and performs the initial bring-up of the fabric. After the initial sweep, the HSM monitors the subnet for network failures, as well as for the addition or removal of IB hardware. The HSM complies with the InfiniBand specification revision 1.2, available on the InfiniBand Trade Association (IBTA) website, and provides Subnet Administration (SA) services as described therein. Upper-level protocols on the hosts use these services to establish communications, register services, and register for notification of network events.

Note
The terms subnet, fabric, and network are interchangeable in a discussion of InfiniBand.
The terms HSM and SM are used interchangeably in this guide, except in discussion of the embedded Subnet Manager, where only SM applies.
This chapter includes the following sections:
• Overview
• Features
• Compatibility and Requirements
• Installing the Cisco High-Performance Subnet Manager
• Using the Cisco High-Performance Subnet Manager
Overview
Like its embedded counterpart, the HSM reacts to and recovers from failures in the network. It provides a centralized management location that applications and endpoints can query. The HSM manages collections of InfiniBand-compliant hardware organized into an IB subnet. IB network administrators can use the HSM to perform routing calculations, program hardware, bring links active, and perform other management tasks.
Why a High-Performance Subnet Manager?
The High-Performance Subnet Manager scales more effectively than the embedded Subnet Manager and gracefully supports stressful loads. The HSM is a superior management tool for networks of thousands of hosts. Additionally, the HSM is ideally suited to manage IB networks that include third-party switch platforms that do not support the Cisco embedded SM.
Features
The High-Performance Subnet Manager provides features that target heterogeneous fabrics and large or growing fabrics.
Table 1-1 High-Performance Subnet Manager Features
Feature | Details
Failover | Administrators can run multiple HSMs on a fabric. The HSM supports the Cisco Subnet Manager Database Synchronization protocol to synchronize with other HSMs. Database Synchronization minimizes disruption when a Subnet Manager fails and another assumes management.
Performance Monitoring | The HSM includes an integrated performance monitor (PM). The PM feature collects data and provides real-time monitoring of the IB subnet. Users can monitor all ports on the network, specific ports, connections between endpoints, or switches running port_agent software. The port_agent provides a mechanism for the PM feature to distribute the load of monitoring ports in a network. On Cisco-designed switch platforms, the port_agent provides local monitoring of ports so the HSM does not have to query ports individually. The port_agent notifies the HSM when any monitored ports exceed thresholds, so the HSM does not need to poll those ports. The port_agent reduces load, both on the CPU where the HSM runs and on the network, because polling traffic declines greatly. The PM feature gives users the ability to configure error thresholds. When error counts exceed thresholds, the HSM logs the events.
SPAN | The Switched Port Analyzer (SPAN) feature, also known as port mirroring, on the HSM enables you to mirror packets that ingress a port and route them elsewhere on the network. Mirrored packets can be analyzed to troubleshoot upper-layer protocols and applications.
Multiple SL/VL support to allow for QoS | The HSM supports the IBA-defined QoS mechanisms and provides the ability to program and administer the SL/VL-related settings of the IB fabric. This optional QoS management capability can be enabled or disabled, depending upon the specific installation requirements. The QoS management feature gives the fabric administrator a high degree of flexibility and allows the administrator of an IB fabric to control these QoS-related settings of an IB subnet: (1) the number of data VLs configured in the subnet; (2) the profiles of SL-to-VL mapping and VL arbitration tables to be enforced in the subnet.
Route-around | The HSM allows you to specify IB devices (ports, nodes, or chassis) through which the Subnet Manager should not route traffic. Without disrupting existing network traffic, this route-around feature enables you to avoid forwarding traffic through ports that are starting to accumulate errors or avoid forwarding traffic through chassis that are about to be removed from the IB subnet.
Speed | The HSM responds quickly to network events and subnet administration queries. It can discover, route, and bring up an IB subnet containing approximately 4500 compute nodes in under 90 seconds.
Command-Line Interface (CLI) | We provide a comprehensive CLI to configure and use the HSM. The CLI commands fall into the following categories: Administrative, Show, Performance Monitoring, and Configure. Administrative commands execute basic functions. Show commands display both the state of the HSM and details of the network. Performance monitoring commands manage counters. Configure commands specify HSM operating parameters and network properties.
Scalability | The HSM can support 4500 compute nodes in production. The HSM is designed to scale to larger fabrics but has not yet been tested in larger configurations.
Interoperability | The HSM interoperates with all Cisco SFS InfiniBand switches (and related devices), as well as any IB-compliant third-party switches and channel adapters.
Runs on the Cisco InfiniBand drivers and the OpenFabrics InfiniBand drivers | The HSM supports two different InfiniBand driver stacks: the Cisco commercial InfiniBand driver stack and the OpenFabrics Enterprise Distribution stack (OFED).
Compatibility and Requirements
The High-Performance Subnet Manager must run on a Linux host with an installed Cisco HCA. The host can run any Linux distribution and platform supported by the Cisco commercial InfiniBand host driver stack.
All Cisco High-Performance SM features are available regardless of which driver stack the HSM runs on. However, for the integrated PM feature to be fully functional when the HSM runs on the OpenFabrics InfiniBand driver stack, all Cisco IB switches in the fabric must run the latest available SFS OS release: SFS OS 2.8.0 update 2 or later, or SFS OS 2.9.0 update 2 or later, depending upon the platform.
If possible, use the HSM with the following hardware recommendations:
• Intel or AMD i386/i686 or x86_64
• 2 GHz CPU
• 1 GB system memory (more for networks with more than 2000 hosts)
• 5 GB available disk space
High Availability Compatibility
Like the embedded SM, the HSM supports failover and high availability. The HSM supports the Cisco SM Database Synchronization protocol to synchronize multiple HSMs on one network.
Note
The HSM will not synchronize with the embedded SM.
Installing the Cisco High-Performance Subnet Manager
The Cisco High-Performance Subnet Manager comes packaged as an RPM. The RPM package contains an SM executable, a CLI interface executable, and a script that can be used to automatically launch the SM at host startup. Two types of RPMs are available for each Linux distribution and platform:
• RPM for the Cisco InfiniBand driver stack
• RPM for the OpenFabrics InfiniBand driver stack
Access the appropriate RPM for the driver stack that you want to install:
• Installing the High-Performance Subnet Manager on Cisco InfiniBand Drivers
• Installing the High-Performance Subnet Manager on OpenFabrics InfiniBand Drivers
Note
Release 1.3 of the HSM requires OpenFabrics InfiniBand driver stack version 1.3 or later.
If the HSM release 1.3 is operated with an OpenFabrics InfiniBand driver stack version 1.2.5 or earlier, HSM will fail to initialize and the following message is logged to syslog:
Current OFED stack ABI is not supported. Please upgrade your OFED stack to version 1.3 or later.
Operating an HSM release prior to 1.3 on an OpenFabrics driver stack release 1.3 or later causes the following warning messages to be logged to syslog (the default location is /var/log/messages). However, the HSM completes initialization successfully.
user_mad: process ib_sm did not enable P_Key index support.
user_mad: Documentation/infiniband/user_mad.txt has info on the new ABI.
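The version rules in this note can be summarized as a small decision function. The following Python sketch is illustrative only: the function names and the outcome labels are hypothetical, and any pairing that the note does not describe is reported as not covered rather than guessed.

```python
def parse_version(text):
    """Convert '1.3' or '1.2.5' into a three-part tuple such as (1, 3, 0)."""
    parts = [int(piece) for piece in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad so that 1.3 and 1.3.0 compare equal
    return tuple(parts[:3])


def startup_outcome(hsm_release, ofed_version):
    """Classify an HSM/OFED pairing using only the cases stated in the note.

    The returned labels are hypothetical names chosen for this sketch.
    """
    hsm = parse_version(hsm_release)
    ofed = parse_version(ofed_version)
    hsm_is_1_3 = hsm[:2] == (1, 3)
    ofed_is_1_3_or_later = ofed >= (1, 3, 0)

    if hsm_is_1_3 and not ofed_is_1_3_or_later:
        # HSM 1.3 on an older stack: the unsupported-ABI message is logged.
        return "fails-to-initialize"
    if hsm_is_1_3 and ofed_is_1_3_or_later:
        return "ok"
    if hsm < (1, 3, 0) and ofed_is_1_3_or_later:
        # Older HSM on a newer stack: user_mad warnings, but startup succeeds.
        return "initializes-with-warnings"
    return "not-covered"  # the note says nothing about this pairing
```

For example, `startup_outcome("1.3", "1.2.5")` reports the failure case, which corresponds to the "Current OFED stack ABI is not supported" message in syslog.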
Installing the High-Performance Subnet Manager on Cisco InfiniBand Drivers
Note
The HSM RPMs have an explicit dependency on the underlying topspin-ib and topspin-ib-mod RPMs. The topspin-ib-sm RPM will not install without a 3.2.0 version of those RPMs.
To install the HSM on the Cisco InfiniBand drivers, perform the following steps:
Step 1  Enter the rpm -qa | grep topspin command to verify that the Cisco InfiniBand driver stack is installed.
Step 2  Enter the rpm -i topspin-ib-sm-rhel4-3-2-0.99.x86_64.rpm command to install the HSM.
Step 3  The SM and CLI executables install in this directory: /usr/local/topspin/sbin.
Step 4  The RPM contains a script that can be used to automatically launch the SM at host startup. The script installs in this directory: /etc/init.d.
Step 5  Enable the script by entering the chkconfig administrative command.
In the example command, "rhel4" identifies Red Hat Enterprise Linux version 4; choose the RPM that matches your own distribution and platform. The "3-2-0" identifies the host driver release (3.2.0) with which the RPM works, "99" represents the host-driver build with which the RPM works, and "x86_64" represents the platform type on which the RPM runs.
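The file name breakdown above can be expressed as a short parser. This Python sketch is an illustration, not a Cisco tool: the regular expression is inferred only from the two example file names in this chapter, the field names are hypothetical, and other releases may use a different naming scheme.

```python
import re

# Pattern inferred from the example file names in this chapter (assumption).
RPM_NAME = re.compile(
    r"^(?P<package>.+)-(?P<distro>[a-z]+\d+)-(?P<driver>\d+-\d+-\d+)"
    r"\.(?P<build>\d+)\.(?P<platform>[^.]+)\.rpm$"
)


def describe_rpm(filename):
    """Split an HSM RPM file name into the parts explained in the text."""
    match = RPM_NAME.match(filename)
    if match is None:
        raise ValueError("unrecognized HSM RPM file name: " + filename)
    fields = match.groupdict()
    # The file name spells the driver release with hyphens (3-2-0).
    fields["driver"] = fields["driver"].replace("-", ".")
    return fields


if __name__ == "__main__":
    for key, value in describe_rpm("topspin-ib-sm-rhel4-3-2-0.99.x86_64.rpm").items():
        print(key, "=", value)
```

A check like this can help confirm that the RPM you downloaded matches the distribution and platform of the host before you run rpm -i.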
Installing the High-Performance Subnet Manager on OpenFabrics InfiniBand Drivers
Note
The HSM RPMs have an explicit dependency on the underlying kernel-ib RPM. The cisco-ib-sm-ofed RPM will not install without the kernel-ib RPM.
To install the HSM on OpenFabrics InfiniBand drivers, follow these steps:
Step 1  Enter the rpm -qa | grep kernel-ib command to verify that the OpenFabrics InfiniBand driver stack is installed.
Step 2  Enter the rpm -i cisco-ib-sm-ofed-rhel4-3-2-0.99.x86_64.rpm command to install the HSM.
Step 3  The SM and CLI executables install in this directory: /usr/sbin.
Step 4  The RPM contains a script that can be used to automatically launch the SM at host startup. The script installs in this directory: /etc/init.d.
Step 5  Enable the script by entering the chkconfig administrative command.
The "rhel4" in the example command identifies the Red Hat Enterprise Linux version 4. (You should choose the appropriate distribution and platform for performing the installation.) The "x86_64" represents the platform type on which the RPM runs.
Using the Cisco High-Performance Subnet Manager
After you install the High-Performance Subnet Manager software, follow the instructions in this section to use the application. A complete list of commands appears in subsequent chapters.
Starting the High-Performance Subnet Manager
To launch the application, log in to your host and enter the following command:
Step | Command | Purpose
Step 1 | /etc/init.d/ib_sm start | Launches the application.
Note
To stop the High-Performance Subnet Manager application, enter the shutdown command at the CLI prompt or execute the /etc/init.d/ib_sm stop command on the host.
The ib_sm Command
You can add ib_sm command-line options to the /etc/init.d/ib_sm script. The options appear in the usage output that follows.
Usage: ib_sm <dev_num> [options]

Specifies the IB device number that SM should be
associated with. The device number is platform dependent
(0,1,2) for Topspin 90 /SFS 3001
(0) for Topspin 360 /SFS 3012
(0) for Topspin 120 /SFS 7000
(0,1) for Topspin 270 /SFS 7008

Specifies the IB device name that the SM should be
associated with. If this optional argument is specified
it overrides the mandatory <dev_num> parameter.

Indicates the runtime IB stack SM should expect.
SM defaults to 'cisco' stack if this optional argument
is omitted. <VAL> can either be 'cisco' or 'ofed'.

Use this option to specify the SM operating mode.
If this option is not provided at SM invocation time,
SM starts in automatic mode.
Choose one of the following mode values to change the

This specifies the port on the IB device that SM should
be bound to. If this optional argument is not provided,
SM attempts to bind to port 1 of the device identified
by the mandatory argument <dev_num>.

Specifies the 64-bit subnet prefix value.
If this optional argument is not provided, SM
uses the subnet prefix value of fe80000000000000.

Specifies the priority of the SM in the subnet.
Priority value must be in the range 0-15, with 15
being the highest priority.
If this argument is not provided SM priority is set
to 10 on embedded platforms and 0 on hosts.

This optional argument specifies the 64-bit SM_Key value.
By default this value is set to 0000000000000000.

Specifies the lmc value for the subnet.
If this optional argument is not provided, SM

This optional argument specifies the max hop limit
to be considered for calculating IB routes.
Default value is set to 64.

This argument controls the database synchronization
feature state of the SM. By default the db sync
feature is enabled for SM running on all embedded
platforms and disabled on hosts. Specify 1 for
this optional argument in order to enable database
synchronization between master and standby
SM(s) and 2 to disable it.

This controls the ability of the SM to manage SL/VL
related settings in the subnet. By default the SL/VL mgmt
feature is disabled in the SM on all platforms. Specify 1
for this optional argument in order to enable the feature

This argument specifies the number of data VLs to be
permitted in the subnet. By default, the operational VL
value of ports of a link is set to the smaller of the two
VL capabilities. Choose one of the following values to
restrict the range of configured operational VL values
in the fabric. Note that -q setting needs to be
enabled in order to specify a non-default behavior for
auto-subnet   Limit by the smallest VLCap value detected
auto-link     The operational data VLs at a link are to
              be limited only by the smaller of the
              VLCap values of its ports

This optional argument specifies the CLI command file to
run at SM startup. The command file is typically used
with the host SM, to load user configuration data
a.k.a user configuration persistency, during SM
initialization. This option is not typically used on
embedded platforms since user data persistency is
supported via another mechanism, that is transparent to
the user, during chassis bringup.
This file should contain one or more host SM CLI
commands, each command specified on a line by itself.
Use this option along with the '-o' option to debug
command file syntactic or semantic issues.
Full path name of the command file should be provided
when using this optional argument.

--cmd-file-rslt=<file-name>
This option specifies the file to which results of
the execution of the command file specified via the '-x'
option are to be logged. Full path name of the output
file should be provided when using this option.

This optional argument controls the verbosity of SM
logs directed to syslog. Choose one of the following
values to control the verbosity of SM log messages:

This option specifies the trace flow mask of SM.
By default SM tracing is set to capture informational

This option specifies the maximum number of concurrent
CLI threads (number of HSM CLI clients) which are
By default, 10 threads are available.

--multicast-profile=<profile-name>
This optional argument specifies a multicast management
profile of the SM. A profile controls the SM limits such
as the number of multicast LIDs deployed, number of
multicast groups created and the number of end ports that
are permitted to join multicast groups. A profile also
controls the capability of the SM to reuse MLIDs amongst
multicast groups in the subnet, if necessary. The default
multicast management profile in the SM, is 'large' for
host platforms and 'large-512-mlids' for embedded
platforms. Choose one of the following profiles to
override the default behavior:
(The comma separated list of elements in the description
column below are: maximum number of MLIDs,
maximum number of multicast groups, reuse of MLIDs
between multicast groups and mapping of IPv6 solicited
node discovery related groups to an mlid.)
------------      ------------
small             512, 512, false, false (HSM)
                  512, 512, false, false (ESM)
large             1024, 20K, true, true (HSM)
                  1024, 3072, true, true (ESM)
large-512-mlids   512, 20K, true, true (HSM)
                  512, 3072, true, true (ESM)

Displays this help message.
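Several of the limits and defaults in the usage output lend themselves to a pre-flight check before you edit the startup script. The Python sketch below is an assumption-laden illustration: the SmSettings class and its field names are hypothetical and are not ib_sm option names, the defaults shown are the host defaults quoted in the usage text, and the 16-hex-digit format for 64-bit values is inferred from the default values printed there.

```python
import string
from dataclasses import dataclass


@dataclass
class SmSettings:
    """Hypothetical container; field names are illustrative, not ib_sm flags."""
    stack: str = "cisco"                     # 'cisco' (default) or 'ofed'
    subnet_prefix: str = "fe80000000000000"  # default 64-bit subnet prefix
    priority: int = 0                        # hosts default to 0, embedded to 10
    sm_key: str = "0000000000000000"         # default 64-bit SM_Key
    max_hops: int = 64                       # default max hop limit
    db_sync: int = 2                         # 1 enables, 2 disables (host default)


def is_hex64(value):
    """Assumption: 64-bit values are written as exactly 16 hex digits."""
    return len(value) == 16 and all(ch in string.hexdigits for ch in value)


def validate(settings):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if settings.stack not in ("cisco", "ofed"):
        problems.append("stack must be 'cisco' or 'ofed'")
    if not 0 <= settings.priority <= 15:
        problems.append("priority must be in the range 0-15")
    if not is_hex64(settings.subnet_prefix):
        problems.append("subnet prefix must be a 64-bit value")
    if not is_hex64(settings.sm_key):
        problems.append("SM_Key must be a 64-bit value")
    if settings.db_sync not in (1, 2):
        problems.append("database synchronization must be 1 (enable) or 2 (disable)")
    return problems
```

The sketch checks only the rules that the usage text states explicitly; it does not validate the port, LMC, or max hop values because the output gives no range for them.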
Accessing the CLI
You can launch the CLI for an HSM that runs on Cisco IB drivers and for an HSM that runs on OFED drivers.
Cisco IB Drivers
To launch the CLI for the HSM running on Cisco IB drivers, log in to your host, start the HSM if you have not already done so, and perform the following steps:
Step | Command | Purpose
Step 1 | cd /usr/local/topspin/sbin | Navigates to the /usr/local/topspin/sbin directory.
Step 2 | ./ib_sm_cli | Launches the CLI.
OFED Drivers
To launch the CLI for the HSM running on OFED drivers, log in to your host, start the HSM if you have not already done so, and perform the following steps:
Step | Command | Purpose
Step 1 | cd /usr/sbin | Navigates to the /usr/sbin directory.
Step 2 | ./ib_sm_cli | Launches the CLI.
Note
To end your CLI session, enter the exit command at the CLI prompt.
Editing the Command Line
The CLI of the High-Performance Subnet Manager supports the following features:
• In-line command editing
• Command-history queueing
The in-line command editing feature gives you the opportunity to edit a command that you have typed or reviewed from the history queue. The command-history queue stores the last 100 commands that you entered so that you can quickly and easily review the past commands. For more information about the command-history queueing feature, see the "history" section on page 2-5.
Various keystrokes enable you to edit a command you have just entered and to traverse the command history. The keystrokes appear in Table 1-2:
Table 1-2 CLI Key Combinations
Key Combination | Function
Ctrl-d | Deletes the character at the cursor.
Ctrl-u | Deletes the line at the cursor.
Ctrl-k | Deletes all characters from the cursor to the end of the line.
Ctrl-a | Moves the cursor to the beginning of the line.
Ctrl-e | Moves the cursor to the end of the line.
Ctrl-p (or "up" arrow) | Retrieves the prior command in the command-history queue.
Ctrl-n (or "down" arrow) | Retrieves the next command in the command-history queue.
Ctrl-b (or "left" arrow) | Moves the cursor one character to the left.
Ctrl-f (or "right" arrow) | Moves the cursor one character to the right.
Esc-b | Moves the cursor back one word.
Esc-f | Moves the cursor forward one word.
Esc-d | Deletes the remainder of the word at the cursor.
Ctrl-w | Deletes a word up to the cursor.
Esc-backspace | Deletes the word behind the cursor.
Ctrl-t | Transposes the character before the cursor and the character after the cursor.
Ctrl-l | Refreshes the input line.
Ctrl-y | Undoes the Ctrl-k combination.
Esc-p | Searches the history matching the substring prefix backward.
Esc-n | Searches the history matching the substring prefix forward.
Alternatively, a previously entered command can be repeated by entering the !n command, where n is the nth command in the history.
Command-line editing also allows for pattern substitution in the previously entered command. The syntax is ^pattern1^pattern2^. If pattern1 exists in the previous command, the CLI substitutes pattern2 for it and executes the result. Otherwise, the CLI executes the previous command with no modification. For example, if show node -s was the previous command, entering ^node^port^ results in the execution of the show port -s command.
Note
Avoid entering the history command immediately before you use this feature. Otherwise, history itself becomes the previous command, and the substitution applies to it instead of the command you intended.
Finally, there is another mechanism for modifying previously executed commands. The syntax of this command is !n:s/pattern1/pattern2/. Issuing this command results in executing the nth command in the history after substituting pattern1 with pattern2.
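The two substitution forms described above can be modeled in a few lines. This Python sketch only illustrates the documented behavior and is not the CLI implementation: the function names are hypothetical, the choice to replace only the first occurrence of pattern1 is an assumption (the text does not say), and so is the 1-based numbering of the history.

```python
import re
from collections import deque

HISTORY_SIZE = 100  # the guide states that the queue keeps the last 100 commands


def new_history():
    """Create a bounded history; the oldest entries drop off automatically."""
    return deque(maxlen=HISTORY_SIZE)


def quick_substitute(history, expression):
    """Expand ^pattern1^pattern2^ against the most recent command."""
    parts = expression.split("^")  # "^a^b^" splits into ["", "a", "b", ""]
    if len(parts) != 4 or parts[0] or parts[3] or not parts[1]:
        raise ValueError("expected ^pattern1^pattern2^")
    previous = history[-1]
    # str.replace returns the command unchanged when pattern1 is absent,
    # which matches the documented fallback.
    return previous.replace(parts[1], parts[2], 1)


def indexed_substitute(history, expression):
    """Expand !n:s/pattern1/pattern2/ against the nth command (1-based, assumed)."""
    match = re.fullmatch(r"!(\d+):s/([^/]+)/([^/]*)/", expression)
    if match is None:
        raise ValueError("expected !n:s/pattern1/pattern2/")
    index = int(match.group(1))
    if not 1 <= index <= len(history):
        raise IndexError("no such history entry: %d" % index)
    return history[index - 1].replace(match.group(2), match.group(3), 1)
```

The sketch also shows why the note warns about the history command: if history is the most recent entry, `quick_substitute` operates on it rather than on the command you meant to edit.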
Saving and Restoring the High-Performance Subnet Manager Configuration
The Cisco High-Performance Subnet Manager does not have a mechanism for saving current configuration settings. However, there are two mechanisms available for a user to start the SM in a given configuration.
The first mechanism is with command-line options. The SM accepts the most commonly set configuration options as command-line parameters, so it is easy to tune SM behavior by editing the /etc/init.d/ib_sm script.
The second mechanism is to create an HSM CLI "batch file" that contains an arbitrary set of SM CLI commands to execute at startup. You can test the file with the run CLI command. Once the file is debugged, specify it as the command file on the ib_sm command line (the '-x' option described in the ib_sm usage output) so that the SM runs it at startup. For more information, see run, page 2-7.
Note
SM will fail to initialize if there are syntactic errors in the startup config file or it does not have appropriate file permissions associated with it.
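Because a faulty startup file prevents the SM from initializing, it can be worth screening the file before you reference it at startup. The following Python sketch is a hypothetical helper, not part of the HSM: it checks only the conditions this chapter names (a full path, a readable file, no exit or shutdown commands, no command-line editing syntax) and cannot verify that each line is a valid CLI command. Skipping blank lines is an assumption, and the reason codes are made up for this example.

```python
import os

FORBIDDEN_COMMANDS = {"exit", "shutdown"}  # these prevent the HSM from initializing
EDITING_PREFIXES = ("^", "!")              # command-line editing syntax is not permitted


def lint_commands(text):
    """Return (line_number, reason) pairs for lines this chapter warns about."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are simply skipped by this sketch
        if line.startswith(EDITING_PREFIXES):
            problems.append((number, "editing-syntax"))
        elif line.split()[0] in FORBIDDEN_COMMANDS:
            problems.append((number, "forbidden-command"))
    return problems


def check_file(path):
    """Check the path and permissions first, then lint the file contents."""
    if not os.path.isabs(path):
        return [(0, "path-not-absolute")]  # the usage text asks for a full path
    if not os.path.isfile(path):
        return [(0, "file-not-found")]
    if not os.access(path, os.R_OK):
        return [(0, "file-not-readable")]
    with open(path) as handle:
        return lint_commands(handle.read())
```

An empty result means only that none of these known pitfalls was found; the run CLI command remains the way to test the commands themselves.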
Known Issues
• The High-Performance Subnet Manager fails to initialize when there is a problem executing the startup cmd-file (such as an invalid file name, bad file permissions, or syntactic errors in the file). Users must search the syslog for error messages.
• Command-line editing syntax is not permitted in cmd-file syntax.
• The exit and shutdown commands, if specified in the startup cmd-file, will prevent HSM from initializing.
• The HSM is not supported by VFrame.
• The HSM will synchronize with other compatible instances of the HSM, but it will not synchronize with a switch-based Subnet Manager because the two products target different market segments and have different capabilities.
Note
For more information about compatibility between the HSM versions, see the Release Notes for Cisco High-Performance Subnet Manager (HSM) for the HSM software versions you are operating.