FlexPod Datacenter with Red Hat OpenShift Bare Metal Manual Configuration with Cisco UCS X-Series Direct

Updated: May 15, 2025

Bias-Free Language

The documentation set for this product strives to use bias-free language. For the purposes of this documentation set, bias-free is defined as language that does not imply discrimination based on age, disability, gender, racial identity, ethnic identity, sexual orientation, socioeconomic status, and intersectionality. Exceptions may be present in the documentation due to language that is hardcoded in the user interfaces of the product software, language used based on RFP documentation, or language that is used by a referenced third-party product. Learn more about how Cisco is using Inclusive Language.

Published: May 2025


In partnership with:



 

About the Cisco Validated Design Program

The Cisco Validated Design (CVD) program consists of systems and solutions designed, tested, and documented to facilitate faster, more reliable, and more predictable customer deployments. For more information, go to: http://www.cisco.com/go/designzone.

Executive Summary

The FlexPod Datacenter solution is a validated design for deploying Cisco and NetApp technologies and products to build shared private and public cloud infrastructure. Cisco and NetApp have partnered to deliver a series of FlexPod solutions that enable strategic data center platforms. The success of the FlexPod solution is driven through its ability to evolve and incorporate both technology and product innovations in the areas of management, compute, storage, and networking. This document explains the deployment details of Red Hat OpenShift on FlexPod Bare Metal Infrastructure. Some of the key advantages of FlexPod Datacenter with Red Hat OpenShift Bare Metal are:

    Consistent Configuration: having a standard method for deploying Red Hat OpenShift on FlexPod Bare Metal infrastructure provides a consistent platform for running containers and virtualized workloads, including CPU- and GPU-accelerated AI/ML software and models and OpenShift Virtualization, side by side on the same infrastructure.

    Simpler and programmable infrastructure: the entire underlying infrastructure can be configured using infrastructure as code delivered using Ansible.

    End-to-End 100Gbps Ethernet: utilizing the 5th Generation Cisco UCS VICs and the 5th Generation Cisco UCS S9108 Fabric Interconnects (FIs) to deliver 100Gbps Ethernet from the server through the network to the storage.

    Cisco Intersight Management: Cisco Intersight Managed Mode (IMM) is used to manage the Cisco UCS S9108 FIs and Cisco UCS X-Series Servers. Additionally, Cisco Intersight integrates with NetApp Active IQ Unified Manager and Cisco Nexus switches as described in the following sections.

    Built for investment protection: design ready for future technologies such as liquid cooling and high-wattage CPUs; CXL-ready.

In addition to the FlexPod-specific hardware and software innovations, the integration of the Cisco Intersight cloud platform with NetApp Active IQ Unified Manager and Cisco Nexus switches delivers monitoring and orchestration capabilities for different layers (storage and networking) of the FlexPod infrastructure. Implementation of this integration at this point in the deployment process would require Cisco Intersight Assist and NetApp Active IQ Unified Manager to be deployed outside of the FlexPod.

For information about the FlexPod design and deployment details, including the configuration of various elements of design and associated best practices, refer to Cisco Validated Designs for FlexPod, here: https://www.cisco.com/c/en/us/solutions/design-zone/data-center-design-guides/flexpod-design-guides.html.

Solution Overview

This chapter contains the following:

    Introduction

    Audience

    Purpose of this Document

    What’s New in this Release?

Introduction

The FlexPod Datacenter with Red Hat OpenShift on Bare Metal configuration represents a cohesive and flexible infrastructure solution that combines computing hardware, networking, and storage resources into a single, integrated architecture. Designed as a collaborative effort between Cisco and NetApp, this converged infrastructure platform is engineered to deliver high levels of efficiency, scalability, and performance, suitable for a multitude of datacenter workloads. By standardizing on a validated design, organizations can accelerate deployment, reduce operational complexities, and confidently scale their IT operations to meet evolving business demands. The FlexPod architecture leverages Cisco's Unified Computing System (UCS) servers, Cisco Nexus networking, and NetApp's innovative storage systems, providing a robust foundation for both virtualized and non-virtualized environments.

Audience

The intended audience of this document includes but is not limited to IT architects, sales engineers, field consultants, professional services, IT managers, partner engineering, and customers who want to take advantage of an infrastructure built to deliver IT efficiency and enable IT innovation.

Purpose of this Document

This document provides deployment guidance around bringing up the FlexPod Datacenter with Red Hat OpenShift on Bare Metal infrastructure. This configuration is built as a tenant on top of FlexPod Base and assumes FlexPod Base has already been configured. This document introduces various design elements and explains various considerations and best practices for a successful deployment.

What’s New in this Release?

The following design elements distinguish this version of FlexPod from previous models:

    Configuration of Red Hat OpenShift Bare Metal as a tenant on top of FlexPod Base. This document is the first example of a FlexPod tenant on top of FlexPod Base that aligns with the tenant defined in FlexPod Zero Trust Framework Design Guide.

    Configuration of a platform that will support both Containerized Applications, such as AI applications and Virtual Machines on the same platform.

Deployment Hardware and Software

This chapter contains the following:

    Design Requirements

    Physical Topology

    Software Revisions

Design Requirements

The FlexPod Datacenter with Cisco UCS and Cisco Intersight meets the following general design requirements:

    Resilient design across all layers of the infrastructure with no single point of failure

    Scalable design with the flexibility to add compute capacity, storage, or network bandwidth as needed

    Modular design that can be replicated to expand and grow as the needs of the business grow

    Flexible design that can support different models of various components with ease

    Simplified design with the ability to integrate and automate with external automation tools

    Cloud-enabled design which can be configured, managed, and orchestrated from the cloud using GUI or APIs

To deliver a solution which meets all these design requirements, various solution components are connected and configured as explained in the following sections.

Physical Topology

The FlexPod Datacenter with Red Hat OpenShift on Bare Metal infrastructure configuration is built using the following hardware components:

    Cisco UCS X9508 Chassis with six Cisco UCS X210C M7 Compute Nodes and two Cisco UCS X440p PCIe Nodes, each containing two NVIDIA L40S GPUs

    Fifth-generation Cisco UCS S9108 Fabric Interconnects to support 100GbE and 25GbE connectivity from various components

    High-speed Cisco NX-OS-based Nexus 93600CD-GX switching design to support 100GE and 400GE connectivity

    NetApp AFF C800 end-to-end NVMe storage with 25G or 100G Ethernet and (optional) 32G Fibre Channel connectivity

The software components of this solution consist of:

    Cisco Intersight to deploy, maintain, and support the Cisco UCS server components

    Cisco Intersight SaaS platform to maintain and support the FlexPod components

    Cisco Intersight Assist Virtual Appliance to help connect NetApp ONTAP and Cisco Nexus switches with Cisco Intersight

    NetApp Active IQ Unified Manager to monitor and manage the storage and for NetApp ONTAP integration with Cisco Intersight

    Red Hat OpenShift which provides a platform for both containers and VMs

FlexPod Datacenter with Red Hat OpenShift on Bare Metal Infrastructure with Cisco UCS X-Series Direct Topology

Figure 1 shows various hardware components and the network connections for this IP-based FlexPod design.

Figure 1.        FlexPod Datacenter Physical Topology for IP-based Storage Access


The reference hardware configuration includes:

    Two Cisco Nexus 93600CD-GX Switches in Cisco NX-OS mode provide the switching fabric. Other Cisco Nexus Switches are also supported.

    Two Cisco UCS S9108 Fabric Interconnects (FIs) in the chassis provide the chassis connectivity. At least two 100 Gigabit Ethernet ports from each FI, configured as a Port-Channel, are connected to each Nexus 93600CD-GX switch. 25 Gigabit Ethernet connectivity is also supported, as are other Cisco UCS FI models used with Intelligent Fabric Modules (IFMs) in the chassis.

    One Cisco UCS X9508 Chassis contains six Cisco UCS X210C M7 servers and two Cisco UCS X440p PCIe Nodes, each with two NVIDIA L40S GPUs. Other configurations of servers with and without GPUs are also supported.

    One NetApp AFF C800 HA pair connects to the Cisco Nexus 93600CD-GX Switches using two 100 GE ports from each controller configured as a Port-Channel. 25 Gigabit Ethernet connectivity is also supported, as are other NetApp AFF, ASA, and FAS storage controllers.

Red Hat OpenShift on Bare Metal Server Configuration

A simple Red Hat OpenShift cluster consists of at least five servers – three Control-Plane Nodes and two or more Worker Nodes where applications and VMs are run. In this lab validation, three Worker Nodes were utilized. Based on published OpenShift requirements, the three Control-Plane Nodes were configured with 64GB of RAM, and the three Worker Nodes were configured with 768GB of RAM to handle containerized applications and VMs.

An alternative configuration, when all servers have the same amount of memory and CPU, is to combine the control-plane and worker roles on the first three servers and then assign only the worker role to the remaining servers. This configuration requires a minimum of three servers, and notes throughout the document explain deviations in the process for this configuration.

Each node was booted from M.2; both a single M.2 module and two M.2 modules with RAID1 are supported. The servers paired with X440p PCIe Nodes were configured as Workers. From a networking perspective, both the Control-Plane Nodes and the Workers were configured with a single vNIC with UCS Fabric Failover in the Bare Metal or Management VLAN. The Workers were configured with additional vNICs to allow storage attachment. Each Worker had two additional vNICs with the iSCSI A and B VLANs configured as native, to allow iSCSI persistent storage attachment and future iSCSI boot. These same vNICs also carried the NVMe-TCP A and B VLANs as allowed (tagged) VLANs, so tagged VLAN interfaces for NVMe-TCP can be defined on the Workers, as sketched below. Finally, each Worker had one additional vNIC with the OpenShift NFS VLAN configured as native to provide NFS persistent storage.
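
For reference, the following is a minimal sketch of what one of the tagged NVMe-TCP VLAN interfaces on a Worker could look like if it were created by hand with nmcli. It is illustrative only: it assumes the iSCSI-A vNIC enumerates as eno6, uses the example VLAN ID 3032 and subnet from Table 1, and the host address 192.168.32.101 is a placeholder.

nmcli connection add type vlan con-name eno6.3032 ifname eno6.3032 dev eno6 id 3032 ipv4.method manual ipv4.addresses 192.168.32.101/24 ipv6.method disabled

nmcli connection up eno6.3032

In the deployment itself, these interfaces are typically defined through the OpenShift node network configuration rather than manually on each node; the commands above only illustrate the intended end state.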

VLAN Configuration

Table 1 lists VLANs configured for setting up the FlexPod environment along with their usage.

Table 1.    VLAN Usage

| VLAN ID | Name | Usage | IP Subnet used in this deployment |
| --- | --- | --- | --- |
| 2* | Native-VLAN | Use VLAN 2 as native VLAN instead of default VLAN (1) | |
| 1020* | OOB-MGMT-VLAN | Out-of-band management VLAN to connect management ports for various devices | 10.102.0.0/24; GW: 10.102.0.254 |
| 1022 | OCP-BareMetal-MGMT | Routable OpenShift Bare Metal VLAN used for OpenShift cluster and node management | 10.102.2.0/24; GW: 10.102.2.254 |
| 3012 | OCP-iSCSI-A | Used for OpenShift iSCSI Persistent Storage | 192.168.12.0/24 |
| 3022 | OCP-iSCSI-B | Used for OpenShift iSCSI Persistent Storage | 192.168.22.0/24 |
| 3032 | OCP-NVMe-TCP-A | Used for OpenShift NVMe-TCP Persistent Storage | 192.168.32.0/24 |
| 3042 | OCP-NVMe-TCP-B | Used for OpenShift NVMe-TCP Persistent Storage | 192.168.42.0/24 |
| 3052 | OCP-NFS | Used for OpenShift NFS RWX Persistent Storage | 192.168.52.0/24 |

Note:     *VLAN configured in FlexPod Base.

Note:     S3 object storage was also used in this environment but requires a routable subnet. In order to avoid having two default gateways on the OpenShift nodes, S3 was placed on the OCP-BareMetal-MGMT subnet and VLAN. A separate VLAN and subnet was not defined for S3.

Table 2 lists the VMs or bare metal servers necessary for deployment as outlined in this document.

Table 2.    Virtual Machines

| Virtual Machine Description | VLAN | IP Address | Comments |
| --- | --- | --- | --- |
| OCP AD1 | 1022 | 10.102.2.249 | Hosted on pre-existing management infrastructure within the FlexPod |
| OCP AD2 | 1022 | 10.102.2.250 | Hosted on pre-existing management infrastructure within the FlexPod |
| OCP Installer | 1022 | 10.102.2.10 | Hosted on pre-existing management infrastructure within the FlexPod |
| NetApp Active IQ Unified Manager | 1021 | 10.102.1.97 | Hosted on pre-existing management infrastructure within the FlexPod |
| Cisco Intersight Assist Virtual Appliance | 1021 | 10.102.1.96 | Hosted on pre-existing management infrastructure within the FlexPod |

Software Revisions

Table 3 lists the software revisions for various components of the solution.

Table 3.    Software Revisions

| Layer | Device | Image Bundle | Comments |
| --- | --- | --- | --- |
| Compute | Cisco UCS Fabric Interconnect S9108 | 4.3(5.240191) | |
| Compute | Cisco UCS X210C M7 | 5.3(5.250001) | |
| Network | Cisco Nexus 93600CD-GX NX-OS | 10.4(4)M | |
| Storage | NetApp AFF C800 | ONTAP 9.16.1 | Latest patch release |
| Software | Red Hat OpenShift | 4.17 | |
| Software | NetApp Trident | 25.02.1 | |
| Software | NetApp DataOps Toolkit | 2.5.0 | |
| Software | Cisco Intersight Assist Appliance | 1.1.1-1 | 1.1.1-0 initially installed and then automatically upgraded |
| Software | NetApp Active IQ Unified Manager | 9.16 | |
| Software | NVIDIA L40S GPU Driver | 550.144.03 | |

FlexPod Cabling

The information in this section is provided as a reference for cabling the physical equipment in a FlexPod environment. To simplify cabling requirements, a cabling diagram was used.

The cabling diagram in this section contains the details for the prescribed and supported configuration of the NetApp AFF C800 running NetApp ONTAP 9.16.1.

Note:     For any modifications of this prescribed architecture, consult the NetApp Interoperability Matrix Tool (IMT).

Note:     This document assumes that out-of-band management ports are plugged into an existing management infrastructure at the deployment site. These interfaces will be used in various configuration steps.

Note:     Be sure to use the cabling directions in this section as a guide.

The NetApp storage controller and disk shelves should be connected according to best practices for the specific storage controller and disk shelves. For disk shelf cabling, refer to NetApp Support.

Figure 2 details the cable connections used in the validation lab for the FlexPod topology based on the Cisco UCS S9108 fabric interconnect directly in the chassis. Two 100Gb links connect each Cisco UCS Fabric Interconnect to the Cisco Nexus Switches and each NetApp AFF controller to the Cisco Nexus Switches. Additional 1Gb management connections will be needed for one or more out-of-band network switches that sit apart from the FlexPod infrastructure. Each Cisco UCS fabric interconnect and Cisco Nexus switch is connected to the out-of-band network switches, and each AFF controller has a connection to the out-of-band network switches. Layer 3 network connectivity is required between the Out-of-Band (OOB) and In-Band (IB) Management Subnets.

Figure 2.        FlexPod Cabling with Cisco UCS S9108 X-Series Direct Fabric Interconnects


Network Switch Configuration

This chapter contains the following:

    Physical Connectivity

    Cisco Nexus Switch Manual Configuration

This chapter provides a detailed procedure for configuring the Cisco Nexus 93600CD-GX switches for use in a FlexPod environment.

Note:     The following procedures describe how to configure the Cisco Nexus switches for use in the OpenShift Bare Metal FlexPod environment. This procedure assumes the use of Cisco Nexus 9000 10.4(4)M.

    The following procedure includes the setup of NTP distribution on the bare metal VLAN. The interface-vlan feature and ntp commands are used to set this up.

    This procedure adds the tenant vlans to the appropriate port-channels.

Physical Connectivity

Follow the physical connectivity guidelines for FlexPod as explained in section FlexPod Cabling.

Cisco Nexus Switch Manual Configuration

Procedure 1.     Create Tenant VLANs on Cisco Nexus A and Cisco Nexus B

Step 1.    Log into both Nexus switches as admin using ssh.

Step 2.    Configure the OpenShift Bare Metal VLAN:

config t

vlan <bm-vlan-id for example, 1022>

name <tenant-name>-BareMetal-MGMT

Step 3.    Configure OpenShift iSCSI VLANs:

vlan <iscsi-a-vlan-id for example, 3012>

name <tenant-name>-iSCSI-A

vlan <iscsi-b-vlan-id for example, 3022>

name <tenant-name>-iSCSI-B

Step 4.    If configuring NVMe-TCP storage access, create the following two additional VLANs:

vlan <nvme-tcp-a-vlan-id for example, 3032>

name <tenant-name>-NVMe-TCP-A

vlan <nvme-tcp-b-vlan-id for example, 3042>

name <tenant-name>-NVMe-TCP-B
exit

Step 5.    Add OpenShift NFS VLAN:

vlan <nfs-vlan-id for example, 3052>

name <tenant-name>-NFS

Step 6.    Add VLANs to the vPC peer link in both Nexus switches:

int Po10
switchport trunk allowed vlan add <bm-vlan-id>,<iscsi-a-vlan-id>,<iscsi-b-vlan-id>,<nvme-tcp-a-vlan-id>,<nvme-tcp-b-vlan-id>,<nfs-vlan-id>

Step 7.    Add VLANs to the storage interfaces in both Nexus switches:

int Po11,Po12
switchport trunk allowed vlan add <bm-vlan-id>,<iscsi-a-vlan-id>,<iscsi-b-vlan-id>,<nvme-tcp-a-vlan-id>,<nvme-tcp-b-vlan-id>,<nfs-vlan-id>

Step 8.    Add VLANs to the UCS Fabric Interconnect Uplink interfaces in both Nexus switches:

int Po19,Po110
switchport trunk allowed vlan add <bm-vlan-id>,<iscsi-a-vlan-id>,<iscsi-b-vlan-id>,<nvme-tcp-a-vlan-id>,<nvme-tcp-b-vlan-id>,<nfs-vlan-id>

Step 9.    Add the Bare Metal VLAN to the Switch Uplink interface in both Nexus switches:

interface Po127

switchport trunk allowed vlan add <bm-vlan-id>

exit

Step 10.                       If configuring NTP Distribution in these Nexus Switches, add Tenant VRF and NTP Distribution Interface in Cisco Nexus A:

vrf context <tenant-name>
ip route 0.0.0.0/0 <bm-subnet-gateway>
exit
interface Vlan<bm-vlan-id>
no shutdown
vrf member <tenant-name>
ip address <bm-switch-a-ntp-distr-ip>/<bm-vlan-mask-length>

exit

copy run start

Step 11.                       If configuring NTP Distribution in these Nexus Switches, add Tenant VRF and NTP Distribution Interface in Cisco Nexus B:

vrf context <tenant-name>
ip route 0.0.0.0/0 <bm-subnet-gateway>
exit
interface Vlan<bm-vlan-id>
no shutdown
vrf member <tenant-name>
ip address <bm-switch-b-ntp-distr-ip>/<bm-vlan-mask-length>

exit

copy run start

Step 12.                       The following commands can be used to see the switch configuration and status.

show run

show vpc
show vlan

show port-channel summary

show ntp peer-status

show cdp neighbors

show lldp neighbors

show run int

show int

show udld neighbors

show int status

NetApp ONTAP Storage Configuration

This chapter contains the following:

      Configure NetApp ONTAP Storage

      Configure S3 access to the OpenShift Tenant

Configure NetApp ONTAP Storage

This section describes how to configure the NetApp ONTAP Storage for the OpenShift Tenant.

Procedure 1.     Log into the Cluster

Step 1.    Open an SSH connection to either the cluster IP or the host name.

Step 2.    Log in as the admin user with the password you provided earlier.

Procedure 2.     Configure NetApp ONTAP Storage for the OpenShift Tenant

Note:     By default, all network ports are included in a separate default broadcast domain. Network ports used for data services (for example, e5a, e5b, and so on) should be removed from their default broadcast domain and that broadcast domain should be deleted.

Step 1.    Delete any Default-N automatically created broadcast domains:

network port broadcast-domain delete -broadcast-domain <Default-N> -ipspace Default

network port broadcast-domain show

Note:     Delete the Default broadcast domains with Network ports (Default-1, Default-2, and so on). This does not include Cluster ports and management ports.

Step 2.    Create an IPspace for the OpenShift tenant:

network ipspace create -ipspace AA02-OCP

Step 3.    Create the OCP-MGMT, OCP-iSCSI-A, OCP-iSCSI-B, OCP-NVMe-TCP-A, OCP-NVMe-TCP-B, and OCP-NFS broadcast domains with the appropriate maximum transmission unit (MTU):

network port broadcast-domain create -broadcast-domain OCP-MGMT -mtu 1500 -ipspace AA02-OCP

network port broadcast-domain create -broadcast-domain OCP-iSCSI-A -mtu 9000 -ipspace AA02-OCP
network port broadcast-domain create -broadcast-domain OCP-iSCSI-B -mtu 9000 -ipspace AA02-OCP

network port broadcast-domain create -broadcast-domain OCP-NVMe-TCP-A -mtu 9000 -ipspace AA02-OCP

network port broadcast-domain create -broadcast-domain OCP-NVMe-TCP-B -mtu 9000 -ipspace AA02-OCP

network port broadcast-domain create -broadcast-domain OCP-NFS -mtu 9000 -ipspace AA02-OCP

Step 4.    Create the OpenShift management VLAN ports and add them to the OpenShift management broadcast domain:

network port vlan create -node AA02-C800-01 -vlan-name a0a-1022

network port vlan create -node AA02-C800-02 -vlan-name a0a-1022

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-MGMT -ports AA02-C800-01:a0a-1022,AA02-C800-02:a0a-1022

Step 5.    Create the OpenShift iSCSI VLAN ports and add them to the OpenShift iSCSI broadcast domains:

network port vlan create -node AA02-C800-01 -vlan-name a0a-3012

network port vlan create -node AA02-C800-02 -vlan-name a0a-3012

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-iSCSI-A -ports AA02-C800-01:a0a-3012,AA02-C800-02:a0a-3012

network port vlan create -node AA02-C800-01 -vlan-name a0a-3022

network port vlan create -node AA02-C800-02 -vlan-name a0a-3022

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-iSCSI-B -ports AA02-C800-01:a0a-3022,AA02-C800-02:a0a-3022

Step 6.    Create the OpenShift NVMe-TCP VLAN ports and add them to the OpenShift NVMe-TCP broadcast domains:

network port vlan create -node AA02-C800-01 -vlan-name a0a-3032

network port vlan create -node AA02-C800-02 -vlan-name a0a-3032

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-NVMe-TCP-A -ports AA02-C800-01:a0a-3032,AA02-C800-02:a0a-3032

network port vlan create -node AA02-C800-01 -vlan-name a0a-3042

network port vlan create -node AA02-C800-02 -vlan-name a0a-3042

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-NVMe-TCP-B -ports AA02-C800-01:a0a-3042,AA02-C800-02:a0a-3042

Step 7.    Create the OpenShift NFS VLAN ports and add them to the OpenShift NFS broadcast domain:

network port vlan create -node AA02-C800-01 -vlan-name a0a-3052

network port vlan create -node AA02-C800-02 -vlan-name a0a-3052

network port broadcast-domain add-ports -ipspace AA02-OCP -broadcast-domain OCP-NFS -ports AA02-C800-01:a0a-3052,AA02-C800-02:a0a-3052

Step 8.    Create the SVM (Storage Virtual Machine) in the IPspace. Run the vserver create command:

vserver create -vserver OCP-SVM -ipspace AA02-OCP

Note:     The SVM must be created in the IPspace. An SVM cannot be moved into an IPspace later.

Step 9.    Add the required data protocols to the SVM and remove the unused data protocols from the SVM:

vserver add-protocols -vserver OCP-SVM -protocols iscsi,nfs,nvme

vserver remove-protocols -vserver OCP-SVM -protocols cifs,fcp

Step 10.                       Add the two data aggregates to the OCP-SVM aggregate list and enable and run the NFS protocol in the SVM:

vserver modify -vserver OCP-SVM -aggr-list AA02_C800_01_SSD_CAP_1,AA02_C800_02_SSD_CAP_1

vserver nfs create -vserver OCP-SVM -udp disabled -v3 enabled -v4.1 enabled

Step 11.                       Create a Load-Sharing Mirror of the SVM Root Volume. Create a volume to be the load-sharing mirror of the OCP-SVM root volume only on the node that does not contain the root volume:

volume show -vserver OCP-SVM # Identify the aggregate and node where the vserver root volume is located.

volume create -vserver OCP-SVM -volume OCP_SVM_root_lsm01 -aggregate AA02_C800_0<x>_SSD_CAP_1 -size 1GB -type DP # Create the mirror volume on the other node

Step 12.                       Create the 15min interval job schedule:

job schedule interval create -name 15min -minutes 15

Step 13.                       Create the mirroring relationship:

snapmirror create -source-path OCP-SVM:OCP_Trident_SVM_root -destination-path OCP-SVM:OCP_SVM_root_lsm01 -type LS -schedule 15min

Step 14.                       Initialize and verify the mirroring relationship:

snapmirror initialize-ls-set -source-path OCP-SVM:OCP_Trident_SVM_root

 

snapmirror show -vserver OCP-SVM

                                                                       Progress

Source            Destination Mirror  Relationship   Total             Last

Path        Type  Path        State   Status         Progress  Healthy Updated

----------- ---- ------------ ------- -------------- --------- ------- --------

AA02-C800://OCP-SVM/OCP_Trident_SVM_root

            LS   AA02-C800://OCP-SVM/OCP_SVM_root_lsm01

                              Snapmirrored

                                      Idle           -         true    -

Step 15.                       Create the iSCSI and NVMe services:

vserver iscsi create -vserver OCP-SVM -status-admin up
vserver iscsi show -vserver OCP-SVM

                 Vserver: OCP-SVM

             Target Name: iqn.1992-08.com.netapp:sn.8442b0854ebb11efb1a7d039eab7b2f3:vs.5

            Target Alias: OCP-SVM

   Administrative Status: up

vserver nvme create -vserver OCP-SVM -status-admin up
vserver nvme show -vserver OCP-SVM

           Vserver Name: OCP-SVM

  Administrative Status: up

Discovery Subsystem NQN: nqn.1992-08.com.netapp:sn.8442b0854ebb11efb1a7d039eab7b2f3:discovery

Note:     Make sure licenses are installed for all storage protocols used before creating the services.

Step 16.                       To create the login banner for the SVM, run the following command:

security login banner modify -vserver OCP-SVM -message "This OCP-SVM is reserved for authorized users only!"

Step 17.                       Create a new rule for the SVM NFS subnet in the default export policy and assign the policy to the SVM’s root volume:

vserver export-policy rule create -vserver OCP-SVM -policyname default -ruleindex 1 -protocol nfs -clientmatch 192.168.52.0/24 -rorule sys -rwrule sys -superuser sys -allow-suid true


volume modify -vserver OCP-SVM -volume OCP_Trident_SVM_root -policy default

Step 18.                       Create and enable the audit log in the SVM:

volume create -vserver OCP-SVM -volume audit_log -aggregate AA02_C800_01_SSD_CAP_1 -size 50GB -state online -policy default -junction-path /audit_log -space-guarantee none -percent-snapshot-space 0

snapmirror update-ls-set -source-path OCP-SVM:OCP_Trident_SVM_root
vserver audit create -vserver OCP-SVM -destination /audit_log
vserver audit enable -vserver OCP-SVM

Step 19.                       Run the following commands to create NFS Logical Interfaces (LIFs):

network interface create -vserver OCP-SVM -lif nfs-lif-01 -service-policy default-data-files -home-node AA02-C800-01 -home-port a0a-3052 -address 192.168.52.51 -netmask 255.255.255.0 -status-admin up -failover-policy broadcast-domain-wide -auto-revert true

 

network interface create -vserver OCP-SVM -lif nfs-lif-02 -service-policy default-data-files -home-node AA02-C800-02 -home-port a0a-3052 -address 192.168.52.52 -netmask 255.255.255.0 -status-admin up -failover-policy broadcast-domain-wide -auto-revert true

Step 20.                       Run the following commands to create iSCSI LIFs:

network interface create -vserver OCP-SVM -lif iscsi-lif-01a -service-policy default-data-iscsi -home-node AA02-C800-01 -home-port a0a-3012 -address 192.168.12.51 -netmask 255.255.255.0 -status-admin up

network interface create -vserver OCP-SVM -lif iscsi-lif-01b -service-policy default-data-iscsi -home-node AA02-C800-01 -home-port a0a-3022 -address 192.168.22.51 -netmask 255.255.255.0 -status-admin up

network interface create -vserver OCP-SVM -lif iscsi-lif-02a -service-policy default-data-iscsi -home-node AA02-C800-02 -home-port a0a-3012 -address 192.168.12.52 -netmask 255.255.255.0 -status-admin up

 

network interface create -vserver OCP-SVM -lif iscsi-lif-02b -service-policy default-data-iscsi -home-node AA02-C800-02 -home-port a0a-3022 -address 192.168.22.52 -netmask 255.255.255.0 -status-admin up

Step 21.                       Run the following commands to create NVMe-TCP LIFs:

network interface create -vserver OCP-SVM -lif nvme-tcp-lif-01a -service-policy default-data-nvme-tcp -home-node AA02-C800-01 -home-port a0a-3032 -address 192.168.32.51 -netmask 255.255.255.0 -status-admin up

network interface create -vserver OCP-SVM -lif nvme-tcp-lif-01b -service-policy default-data-nvme-tcp -home-node AA02-C800-01 -home-port a0a-3042 -address 192.168.42.51 -netmask 255.255.255.0 -status-admin up

network interface create -vserver OCP-SVM -lif nvme-tcp-lif-02a -service-policy default-data-nvme-tcp -home-node AA02-C800-02 -home-port a0a-3032 -address 192.168.32.52 -netmask 255.255.255.0 -status-admin up

 

network interface create -vserver OCP-SVM -lif nvme-tcp-lif-02b -service-policy default-data-nvme-tcp -home-node AA02-C800-02 -home-port a0a-3042 -address 192.168.42.52 -netmask 255.255.255.0 -status-admin up

Step 22.                       Run the following command to create the SVM-MGMT LIF:

network interface create -vserver OCP-SVM -lif svm-mgmt -service-policy default-management -home-node AA02-C800-01 -home-port a0a-1022 -address 10.102.2.50 -netmask 255.255.255.0 -status-admin up -failover-policy broadcast-domain-wide -auto-revert true

Step 23.                       Run the following command to verify LIFs:

network interface show -vserver OCP-SVM
            Logical    Status     Network            Current       Current Is

Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home

----------- ---------- ---------- ------------------ ------------- ------- ----

OCP-SVM

            iscsi-lif-01a

                         up/up    192.168.12.51/24   AA02-C800-01  a0a-3012

                                                                           true

            iscsi-lif-01b

                         up/up    192.168.22.51/24   AA02-C800-01  a0a-3022

                                                                           true

            iscsi-lif-02a

                         up/up    192.168.12.52/24   AA02-C800-02  a0a-3012

                                                                           true

            iscsi-lif-02b

                         up/up    192.168.22.52/24   AA02-C800-02  a0a-3022

                                                                           true

            nfs-lif-01   up/up    192.168.52.51/24   AA02-C800-01  a0a-3052

                                                                           true

            nfs-lif-02   up/up    192.168.52.52/24   AA02-C800-02  a0a-3052

                                                                           true

            nvme-tcp-lif-01a

                         up/up    192.168.32.51/24   AA02-C800-01  a0a-3032

                                                                           true

            nvme-tcp-lif-01b

                         up/up    192.168.42.51/24   AA02-C800-01  a0a-3042

                                                                           true

            nvme-tcp-lif-02a

                         up/up    192.168.32.52/24   AA02-C800-02  a0a-3032

                                                                           true

            nvme-tcp-lif-02b

                         up/up    192.168.42.52/24   AA02-C800-02  a0a-3042

                                                                           true

            svm-mgmt     up/up    10.102.2.50/24     AA02-C800-01  a0a-1022

                                                                           true

11 entries were displayed.

Step 24.                       Create a default route that enables the SVM management interface to reach the outside world:

network route create -vserver OCP-SVM -destination 0.0.0.0/0 -gateway 10.102.2.254

Step 25.                       Set a password for the SVM vsadmin user and unlock the user:

security login password -username vsadmin -vserver OCP-SVM

Enter a new password:

Enter it again:

 

security login unlock -username vsadmin -vserver OCP-SVM

Step 26.                       Add the OpenShift DNS servers to the SVM:

dns create -vserver OCP-SVM -domains ocp.flexpodb4.cisco.com -name-servers 10.102.2.249,10.102.2.250
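
At this point the SVM networking is complete. As an optional check once the Worker storage interfaces exist (configured later in this deployment), basic reachability of the data LIFs can be verified from a host on the storage VLANs. This is an illustrative sketch that assumes the iscsiadm and nvme-cli utilities are available on the host and uses the example LIF addresses above:

ping -c 3 192.168.52.51

iscsiadm -m discovery -t sendtargets -p 192.168.12.51:3260

nvme discover -t tcp -a 192.168.32.51 -s 8009

The discovery commands should return the iSCSI target IQN and the NVMe discovery subsystem NQN shown earlier for OCP-SVM.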

Configure S3 access to the OpenShift Tenant

Procedure 1.     Enable S3 on the storage VM

Step 1.    In NetApp System Manager, click Storage > Storage VMs, select the storage VM (OCP-SVM), click Settings, and then click the pencil icon under S3.

Step 2.    Enter the S3 server name as a Fully Qualified Domain Name (FQDN).

Step 3.    TLS is enabled by default (port 443). You can enable HTTP if required.

Step 4.    Select the certificate type. Whether you select a system-generated certificate or an external CA-signed certificate, the certificate will be required for client access.


Step 5.    Enter the network interfaces. Note that here the S3 object storage will be placed on the OCP-BareMetal-MGMT subnet and VLAN.

Step 6.    Click Save.


The ONTAP S3 object store server is now configured as shown in the following figure. There are two users created by default:

1.     root user with UID 0 – no access key or secret key is generated for this user

2.     sm_s3_user – both access and secret keys are generated for this user


Note:     The ONTAP administrator must run the object-store-server users regenerate-keys command to set the access key and secret key for the root user. As a NetApp best practice, do not use this root user. Any client application that uses the access key or secret key of the root user has full access to all buckets and objects in the object store.

Step 7.    You can choose to utilize the default user (sm_s3_user) or create a custom ONTAP S3 user:

a.     Click Storage > Storage VMs. Select the storage VM (OCP-SVM) to which you need to add a user, select Settings and then click the pencil icon under S3.

b.     To add a user, click Users > Add.

c.     Enter a name for the user. Click Save.


d.     The user is created, and an access key and a secret key are generated for the user.

e.     Download or save the access key and secret key. These will be required for access from S3 clients.

Note:     Beginning with ONTAP 9.14.1, you can specify the retention period of the access keys that are created for the user. You can specify the retention period in days, hours, minutes, or seconds, after which the keys automatically expire. By default, the value is set to 0, which indicates that the key is valid indefinitely.
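
The S3 server and user can also be created from the ONTAP CLI instead of System Manager. The following is a hedged sketch: the server FQDN and user name are examples, and parameter names and defaults (for example, the certificate to use with HTTPS) should be confirmed against the ONTAP 9.16.1 command reference:

vserver object-store-server create -vserver OCP-SVM -object-store-server s3.ocp.flexpodb4.cisco.com -is-https-enabled true -is-http-enabled false

vserver object-store-server user create -vserver OCP-SVM -user ocp-s3-user

vserver object-store-server user show -vserver OCP-SVM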

Procedure 2.     Create ONTAP S3 user group to control access to buckets

Step 1.    Click Storage > Storage VMs. Select the storage VM (OCP-SVM) to which you need to add a group, select Settings and then click the pencil icon under S3.

Step 2.    To add a group, select Groups, then click Add.

Step 3.    Enter a group name and select from a list of users.

Step 4.    You can select an existing group policy, add one now, or add a policy later. In this configuration, we used an existing policy (FullAccess).

Step 5.    Click Save.

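
Equivalently, a group granting the FullAccess policy can be created from the ONTAP CLI. This is a hedged sketch with example group and user names; verify the parameters against the ONTAP 9.16.1 command reference:

vserver object-store-server group create -vserver OCP-SVM -name s3-group -users ocp-s3-user -policies FullAccess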

Procedure 3.     Create an ONTAP S3 bucket

Step 1.    Click Storage > Buckets, then click Add.

Step 2.    Enter a name for the bucket, select the storage VM (OCP-SVM), and enter the size.

a.     If you click Save at this point, a bucket is created with these default settings:

                i.    No users are granted access to the bucket unless any group policies are already in effect.

              ii.    A Quality of Service (performance) level that is the highest available for your system.

b.     Click Save if you want to create a bucket with these default values.


Step 3.    Click More Options to configure settings for object locking, user permissions, and performance level when you configure the bucket, or you can modify these settings later.

a.     If you intend to use the S3 object store for FabricPool tiering, consider selecting Use for tiering rather than a performance service level. In this validation, we are not using S3 for FabricPool tiering.

b.     To enable versioning for your objects for later recovery, select Enable Versioning. In this case, we have not enabled versioning.

c.     Performance service level – default value (Performance) is used in this configuration.


Step 4.    Under Permissions section, click Add to add relevant permissions for accessing the bucket. Specify the following parameters:

a.     Principal: the user or group to whom access is granted. Here, we selected “s3-group”.

b.     Effect: allows or denies access to a user or group. Allow is selected here for “s3-group”.

c.     Actions: permissible actions in the bucket for a given user or group. Select as required for validation.

d.     Resources: paths and names of objects within the bucket for which access is granted or denied. The defaults, bucketname and bucketname/*, grant access to all objects in the bucket. In this solution, we used the default values for resources (s3-bucket1, s3-bucket1/*).

e.     Conditions (optional): expressions that are evaluated when access is attempted. For example, you can specify a list of IP addresses for which access will be allowed or denied. In this case, the field value was empty as no conditions were specified.

Step 5.    Click Save.


Step 6.    Click Save to create the ONTAP S3 bucket.

Note:     In this configuration, we did not enable S3 object locking, but you can enable it if required by the validation.

Note:     You can configure protection for the bucket by enabling SnapMirror (ONTAP or cloud) if needed. In this validation, this was not required.

Step 7.    ONTAP S3 is successfully created as shown in the following figure. Navigate to Storage > Buckets, select the bucket (s3-bucket1) and click Overview tab to see detailed information about the bucket.

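
A bucket with default settings can also be created from the ONTAP CLI. This is a hedged sketch; the bucket name and size are examples, and bucket permissions would still be applied separately (for example, through System Manager as shown above):

vserver object-store-server bucket create -vserver OCP-SVM -bucket s3-bucket1 -size 100GB

vserver object-store-server bucket show -vserver OCP-SVM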

Step 8.    On S3 client applications (whether ONTAP S3 or an external third-party application), you can verify access to the newly created S3 bucket. In this solution, we used the S3 Browser application to access the bucket as shown in the following figure.


Note:     In the S3 Browser application, a new account needs to be created first by providing the S3 user access key and secret key and the REST endpoint (http://<s3-lif-ip>:80). Once the account is added successfully, the S3 buckets are fetched automatically as shown above.
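
As an alternative to S3 Browser, access can be verified with the AWS CLI from any Linux administration host. This is an illustrative sketch: the profile name, endpoint, and object name are examples, and --no-verify-ssl is only needed when the S3 server presents the system-generated (self-signed) certificate:

aws configure set aws_access_key_id <s3-user-access-key> --profile ocp-s3

aws configure set aws_secret_access_key <s3-user-secret-key> --profile ocp-s3

aws s3 cp ./test-object.txt s3://s3-bucket1/ --endpoint-url https://<s3-server-fqdn> --profile ocp-s3 --no-verify-ssl

aws s3 ls s3://s3-bucket1 --endpoint-url https://<s3-server-fqdn> --profile ocp-s3 --no-verify-ssl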

Cisco Intersight Managed Mode Configuration

This chapter contains the following:

    Set up Cisco Intersight Resource Group

    Set up Cisco Intersight Organization

    Add OpenShift VLANs to VLAN Policy

    Cisco UCS IMM Manual Configuration

    Create Control-Plane Node Server Profile Template

    Compute Configuration

    Configure BIOS Policy

    Configure Boot Order Policy for M2

    Configure Firmware Policy (optional)

    Configure Power Policy

    Configure Virtual Media Policy

    Configure Cisco IMC Access Policy

    Configure IPMI Over LAN Policy

    Configure Local User Policy

    Configure Virtual KVM Policy

    Storage Configuration (optional)

    Create Network Configuration - LAN Connectivity for Control-Plane Nodes

    Create MAC Address Pool for Fabric A and B

    Create Ethernet Network Group Policy

    Create Ethernet Network Control Policy

    Create Ethernet QoS Policy

    Create Ethernet Adapter Policy

    Add vNIC(s) to LAN Connectivity Policy

    Complete the Control-Plane Server Profile Template

    Build the OpenShift Worker LAN Connectivity Policy

    Create the OpenShift Worker Server Profile Template

    Derive Server Profiles

The Cisco Intersight platform is a management solution delivered as a service with embedded analytics for Cisco and third-party IT infrastructures. The Cisco Intersight Managed Mode (also referred to as Cisco IMM or Intersight Managed Mode) is an architecture that manages Cisco Unified Computing System (Cisco UCS) fabric interconnect–attached systems through a Redfish-based standard model. Cisco Intersight managed mode standardizes both policy and operation management for Cisco UCS C-Series M7 and Cisco UCS X210c M7 compute nodes used in this deployment guide.

Cisco UCS B-Series M6 servers, connected and managed through Cisco UCS FIs, are also supported by IMM. For a complete list of supported platforms, go to: https://www.cisco.com/c/en/us/td/docs/unified_computing/Intersight/b_Intersight_Managed_Mode_Configuration_Guide/b_intersight_managed_mode_guide_chapter_01010.html

Procedure 1.     Set up Cisco Intersight Resource Group

In this procedure, a Cisco Intersight resource group for the Red Hat OpenShift tenant is created where resources such as targets will be logically grouped. In this deployment, a single resource group is created to host all the resources, but you can choose to create multiple resource groups for granular control of the resources.

Step 1.    Log into Cisco Intersight.

Step 2.    Select System.

Step 3.    Click Resource Groups on the left.

Step 4.    Click + Create Resource Group in the top-right corner.

Step 5.    Provide a name for the Resource Group (for example, AA02-OCP-rg).

Step 6.    Under Resources, select Custom.

Step 7.    Select all resources that are connected to this Red Hat OpenShift FlexPod tenant.

Note:     If more than one FlexPod tenant is sharing the FIs, a subset of the servers can be assigned to the Resource Group.


Step 8.    Click Create.

Procedure 2.     Set Up Cisco Intersight Organization

In this procedure, an Intersight organization for the Red Hat OpenShift tenant is created where all Cisco Intersight Managed Mode configurations including policies are defined.

Step 1.    Log into the Cisco Intersight portal.

Step 2.    Select System.

Step 3.    Click Organizations on the left.

Step 4.    Click + Create Organization in the top-right corner.

Step 5.    Provide a name for the organization (for example, AA02-OCP), optionally select Share Resources with Other Organizations, and click Next.

Step 6.    Select the Resource Group created in the last step (for example, AA02-OCP-rg) and click Next.

Step 7.    Click Create.


Procedure 3.     Add OpenShift VLANs to VLAN Policy

Step 1.    Log into the Cisco Intersight portal.

Step 2.    Select Configure. On the left, select Profiles then select the UCS Domain Profiles tab.

Step 3.    To the right of the UCS Domain Profile used for the OpenShift tenant, click the ellipsis (...) and select Edit.

Step 4.    Click Next to go to UCS Domain Assignment.

Step 5.    Click Next to go to VLAN & VSAN Configuration.

Step 6.    Under VLAN & VSAN Configuration, click the pencil icon to the left of the VLAN Policy to Edit the policy.

Step 7.    Click Next to go to Policy Details.

Step 8.    To add the OCP-BareMetal VLAN, click Add VLANs.

Step 9.    For the Prefix, enter the VLAN name. For the VLAN ID, enter the VLAN ID. Leave Auto Allow on Uplinks enabled and Enable VLAN Sharing disabled.

Step 10.                       Under Multicast Policy, click Select Policy and select the already configured Multicast Policy (for example, AA02-MCAST).


Step 11.                       Click Add to add the VLAN to the policy.

Step 12.                       Repeat the above process to add all the VLANs in Table 1 to the VLAN Policy.


Step 13.                       Click Save to save the VLAN Policy.

Step 14.                       Click Next three times to get to the UCS Domain Profile Summary page.

Step 15.                       Click Deploy and then Deploy again to deploy the UCS Domain Profile.

Cisco UCS IMM Manual Configuration

Configure Server Profile Template

In the Cisco Intersight platform, a server profile enables resource management by simplifying policy alignment and server configuration. The server profiles are derived from a server profile template. A Server profile template and its associated policies can be created using the server profile template wizard. After creating the server profile template, customers can derive multiple consistent server profiles from the template.

The server profile templates captured in this deployment guide support Cisco UCS X210c M7 compute nodes with 5th Generation VICs and can be modified to support other Cisco UCS blade and rack mount servers.

vNIC Placement for Server Profile Template

In this deployment, separate server profile templates are created for OpenShift Worker and Control-Plane Nodes where Worker Nodes have storage network interfaces to support workloads, but Control-Plane Nodes do not. The vNIC layout is covered below. While most of the policies are common across various templates, the LAN connectivity policies are unique and will use the information in the tables below.

Note:     If a cluster with combined control-plane and worker nodes is utilized, the Worker Server Profile Template should be used for all nodes.

One vNIC is configured for OpenShift Control-Plane Nodes. This vNIC is manually placed as listed in Table 4.

Four vNICs are configured for OpenShift Worker Nodes. These vNICs are manually placed as listed in Table 5. NVMe-TCP VLAN Interfaces can be added as tagged VLANs to the iSCSI vNICs when NVMe-TCP is being used.

Table 4.    vNIC placement for OpenShift Control-Plane Nodes

| vNIC/vHBA Name | Switch ID | PCI Order | Fabric Failover | Native VLAN | Allowed VLANs |
| --- | --- | --- | --- | --- | --- |
| eno5 | A | 0 | Y | OCP-BareMetal-MGMT | OCP-BareMetal-MGMT |

Table 5.    vNIC placement for OpenShift Worker Nodes

| vNIC/vHBA Name | Switch ID | PCI Order | Fabric Failover | Native VLAN | Allowed VLANs |
| --- | --- | --- | --- | --- | --- |
| eno5 | A | 0 | Y | OCP-BareMetal-MGMT | OCP-BareMetal-MGMT |
| eno6 | A | 1 | N | OCP-iSCSI-A | OCP-iSCSI-A, OCP-NVMe-TCP-A |
| eno7 | B | 2 | N | OCP-iSCSI-B | OCP-iSCSI-B, OCP-NVMe-TCP-B |
| eno8 | B | 3 | Y | OCP-NFS | OCP-NFS |

Note:     OCP-NVMe-TCP-A will be added to eno6 as a VLAN interface. OCP-NVMe-TCP-B will be added to eno7 as a VLAN interface.

Procedure 4.     Create Worker Node Server Profile Template

A Server Profile Template will first be created for the OpenShift Worker Nodes. This procedure will assume an X210C M7 is being used but can be modified for other server types.

Step 1.    Log into Cisco Intersight.

Step 2.    Go to Configure > Templates in the main window and select UCS Server Profile Templates. Click Create UCS Server Profile Template.

Step 3.    Select the organization from the drop-down list (for example, AA02-OCP).

Step 4.    Provide a name for the server profile template (for example, AA02-OCP-Worker-X210C-M7)

Step 5.    Select UCS Server (FI-Attached).

Step 6.    Provide an optional description.


Step 7.    Click Next.

Compute Configuration

Procedure 5.     Compute Configuration – Configure UUID Pool

Step 1.    Click Select Pool under UUID Pool and then in the pane on the right, click Create New.

Step 2.    Verify correct organization is selected from the drop-down list (for example, AA02) and provide a name for the UUID Pool (for example, AA02-OCP-UUID-Pool).

Step 3.    Provide an optional Description and click Next.

Step 4.    Provide a unique UUID Prefix (for example, a prefix of AA020000-0000-0001 was used).

Step 5.    Add a UUID block.


Step 6.    Click Create.

Procedure 6.     Configure BIOS Policy

Step 1.    Click Select Policy next to BIOS and in the pane on the right, click Create New.

Step 2.    Verify correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-Intel-M7-Virtualization-BIOS).

Step 3.    Enter an optional Description.

Step 4.    Click Select Cisco Provided Configuration. In the Search box, type Vir. Select Virtualization-M7-Intel or the appropriate Cisco Provided Configuration for your platform.


Step 5.    Click Next.

Step 6.    On the Policy Details screen, expand Server Management. Use the pulldown to set the Consistent Device Naming BIOS token to enabled.


Note:     The BIOS Policy settings specified here are from the Performance Tuning Best Practices Guide for Cisco UCS M7 Platforms - Cisco with the Virtualization workload. For other platforms, the appropriate document is listed below:

      Performance Tuning Guide for Cisco UCS M6 Servers - Cisco

      Performance Tuning Guide for Cisco UCS M5 Servers White Paper - Cisco

      Performance Tuning for Cisco UCS C225 M6 and C245 M6 Rack Servers with 3rd Gen AMD EPYC Processors White Paper - Cisco

      Products - Performance Tuning for Cisco UCS C125 Rack Server Nodes with AMD Processors (White Paper) - Cisco

Step 7.    Click Create to create the BIOS Policy.

Procedure 7.     Configure Boot Order Policy for M2

Step 1.    Click Select Policy next to Boot Order and then, in the pane on the right, click Create New.

Step 2.    Verify correct organization is selected from the drop-down list (for example, AA02) and provide a name for the policy (for example, AA02-OCP-M2-Boot-Order).

Step 3.    Click Next.

Step 4.    For Configured Boot Mode, select Unified Extensible Firmware Interface (UEFI).

Step 5.    Do not turn on Enable Secure Boot.


Note:     It is critical to not enable UEFI Secure Boot. If Secure Boot is enabled, the NVIDIA GPU Operator GPU driver will fail to initialize.

Step 6.    Click the Add Boot Device drop-down list and select Virtual Media.

Note:     We are entering the Boot Devices in reverse order here to avoid having to move them in the list later.

Step 7.    Provide a Device Name (for example, KVM-Mapped-ISO) and then, for the subtype, select KVM MAPPED DVD.

Step 8.    Click the Add Boot Device drop-down list and select Virtual Media.

Step 9.    Provide a Device Name (for example, CIMC-Mapped-ISO) and then, for the subtype, select CIMC MAPPED DVD.

Step 10.                       Click the Add Boot Device drop-down list and select Local Disk.

Step 11.                       Provide a Device Name (for example, M2) and MSTOR-RAID for the Slot.

Step 12.                       Verify the order of the boot devices and adjust the boot order as necessary using arrows next to the Delete button.


Step 13.                       Click Create.

Procedure 8.     Configure Firmware Policy (optional)

Since Red Hat OpenShift recommends using homogeneous server types for Control-Plane Nodes (and Workers), a Firmware Policy can ensure that all servers are running the appropriate firmware when the Server Profile is deployed.

Step 1.    Click Select Policy next to Firmware and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-Firmware). Click Next.

Step 3.    Select the Server Model (for example, UCSX-210C-M7) and the firmware version listed in Table 3 (for example, 5.3(5.250001)).

Step 4.    Optionally, other server models can be added using the plus sign on the right.


Step 5.    Click Create to create the Firmware Policy.

Procedure 9.     Configure Power Policy

A Power Policy can be defined and attached to blade servers (Cisco UCS X- and B-Series).

Step 1.    Click Select Policy next to Power and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-Server-Power). Click Next.

Step 3.    Make sure UCS Server (FI-Attached) is selected and adjust any of the parameters according to your organizational policies.


Step 4.    Click Create to create the Power Policy.

Step 5.    Optionally, if you are using Cisco UCS C-Series servers, a Thermal Policy can be created and attached to the profile.

Procedure 10.  Configure Virtual Media Policy

Step 1.    Click Select Policy next to Virtual Media and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-vMedia). Click Next.

Step 3.    Ensure that Enable Virtual Media, Enable Virtual Media Encryption, and Enable Low Power USB are turned on.

Step 4.    Do not Add Virtual Media at this time, but the policy can be modified and used to map an ISO for a CIMC Mapped DVD.


Step 5.    Click Create to create the Virtual Media Policy.


Step 6.    Click Next to move to Management Configuration.

Management Configuration

Four policies will be added to the management configuration:

    IMC Access to define the pool of IP addresses for compute node KVM access

    IPMI Over LAN to allow Intersight to manage IPMI messages

    Local User to provide a local administrator account for KVM access

    Virtual KVM to allow the Tunneled KVM

Procedure 1.     Configure Cisco IMC Access Policy

The IMC Access Policy can be configured to use either the OOB-MGMT subnet/VLAN or the Baremetal-MGMT subnet/VLAN. The choice here is a design decision based on whether this FlexPod Tenant has access to the OOB-MGMT subnet/VLAN. This example procedure uses the Baremetal-MGMT subnet/VLAN but can be adjusted to use the OOB-MGMT subnet/VLAN. The IMC Access Policy should always be set up to use In-Band Management. If the OOB-MGMT subnet/VLAN is used, it was already configured on the FI switch ports and in the FIs.

Step 1.    Click Select Policy next to IMC Access and then, in the pane on the right, click Create New.

Step 2.    Verify correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-IMC-Access-Policy).

Step 3.    Click Next.

Note:     Because certain features are not yet enabled for Out-of-Band Configuration (accessed via the Fabric Interconnect mgmt0 ports), if you are using the OOB-MGMT subnet/VLAN, we are bringing in the OOB-MGMT VLAN through the Fabric Interconnect Uplinks and mapping it as the In-Band Configuration VLAN. This was done in FlexPod Base.

Step 4.    Ensure UCS Server (FI-Attached) is selected on the right.

Step 5.    Enable In-Band Configuration. Enter the OCP-BareMetal VLAN ID (for example, 1022) and select “IPv4 address configuration.”

Step 6.    Under IP Pool, click Select IP Pool and then, in the pane on the right, click Create New.

Step 7.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-BareMetal-MGMT-IP-Pool). Click Next.

Step 8.    Ensure Configure IPv4 Pool is selected and provide the information to define a unique pool for KVM IP address assignment including an IP Block (added by clicking Add IP Blocks).

Note:     You will need the IP addresses of the OpenShift DNS servers here.

A screenshot of a computerDescription automatically generated

Note:     The management IP pool subnet should be accessible from the host that is trying to open the KVM connection. In the example shown here, the hosts trying to open a KVM connection would need to be able to route to the 10.102.2.0/24 subnet.

Step 9.    Click Next.

Step 10.                       Deselect Configure IPv6 Pool.

Step 11.                       Click Create to finish configuring the IP address pool.

Step 12.                       Click Create to finish configuring the IMC access policy.

Procedure 2.     Configure IPMI Over LAN Policy

The IPMI Over LAN Policy can be used to allow both IPMI and Redfish connectivity to Cisco UCS Servers.

Step 1.    Click Select Policy next to IPMI Over LAN and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-IPMIoLAN-Policy). Click Next.

Step 3.    On the right, ensure UCS Server (FI-Attached) is selected.

Step 4.    Ensure Enable IPMI Over LAN is selected.

Step 5.    From the Privilege Level drop-down list, select admin.

Step 6.    For Encryption Key, enter 00 to disable encryption.

Related image, diagram or screenshot

Step 7.    Click Create to create the IPMI Over LAN policy.

Procedure 3.     Configure Local User Policy

Step 1.    Click Select Policy next to Local User and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-LocalUser-Policy). Click Next.

Step 3.    Verify that UCS Server (FI-Attached) is selected.

Step 4.    Verify that Enforce Strong Password is selected.

Step 5.    Enter 0 under Password History.

Step 6.    Click Add New User.

Step 7.    Provide the username (for example, flexadmin), select a role (for example, admin), and provide a password and password confirmation.

Related image, diagram or screenshot

Note:     The username and password combination defined here will be used as an alternate to log in to KVMs and can be used for IPMI.
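Note:     Once a server profile using this Local User policy and the IPMI Over LAN policy has been deployed, IPMI connectivity can optionally be verified from any Linux host that can reach the server's KVM management IP. The following is a minimal sketch using the ipmitool utility; the IP address and password are placeholders for your environment:

ipmitool -I lanplus -H <kvm-mgmt-ip> -U flexadmin -P <password> power status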

Step 8.    Click Create to finish configuring the Local User policy.

Procedure 4.     Configure Virtual KVM Policy

Step 1.    Click Select Policy next to Virtual KVM and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-Virtual-KVM). Click Next.

Step 3.    Verify that UCS Server (FI-Attached) is selected.

Step 4.    Turn on Allow Tunneled vKVM.

Related image, diagram or screenshot

Step 5.    Click Create.

Note:     To fully enable Tunneled KVM, once the Server Profile Template has been created, go to System > Settings > Security and Privacy and click Configure. Turn on “Allow Tunneled vKVM Launch” and “Allow Tunneled vKVM Configuration.” If Tunneled vKVM Launch and Tunneled vKVM Configuration are not Allowed, use the Configure button to change these settings.

Related image, diagram or screenshot

Related image, diagram or screenshot

Step 6.    Click Next to move to Storage Configuration.

Storage Configuration

Procedure 1.     Storage Configuration (optional)

If you have two M.2 drives in your servers you can create an optional policy to mirror these drives using RAID1.

Step 1.    If it is not necessary to configure a Storage Policy, click Next to continue to Network Configuration.

Step 2.    Click Select Policy next to Storage and then, in the pane on the right, click Create New.

Step 3.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-M.2-RAID1-Storage). Click Next.

Step 4.    Enable M.2 RAID Configuration and leave the default Virtual Drive Name and Slot of the M.2 RAID controller field values, or values appropriate to your environment. Click Create.

A screenshot of a computerDescription automatically generated

Step 5.    Click Next.

Network Configuration

Procedure 1.     Create Network Configuration - LAN Connectivity for Worker Nodes

The LAN connectivity policy defines the connections and network communication resources between the server and the LAN. This policy uses pools to assign MAC addresses to servers and to identify the vNICs that the servers use to communicate with the network.

For consistent vNIC placement, manual vNIC placement is utilized. This procedure also assumes that each server contains only one VIC card and uses Simple placement, which adds vNICs to the first VIC. If you have more than one VIC in a server, Advanced placement will need to be used.

The Worker hosts use 4 vNICs configured as listed in Table 6.

Table 6.    vNIC placement for OpenShift Worker Nodes

vNIC/vHBA Name   Switch ID   PCI Order   Fabric Failover   Native VLAN          Allowed VLANs                 MTU
eno5             A           0           Y                 OCP-BareMetal-MGMT   OCP-BareMetal-MGMT            1500
eno6             A           1           N                 OCP-iSCSI-A          OCP-iSCSI-A, OCP-NVMe-TCP-A   9000
eno7             B           2           N                 OCP-iSCSI-B          OCP-iSCSI-B, OCP-NVMe-TCP-B   9000
eno8             B           3           Y                 OCP-NFS              OCP-NFS                       9000

Step 1.    Click Select Policy next to LAN Connectivity and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP), provide a name for the policy (for example, AA02-OCP-Worker-M2Bt-5G-LANConn) and select UCS Server (FI-Attached) under Target Platform. Click Next.

Step 3.    Leave None selected under IQN and under vNIC Configuration, select Manual vNICs Placement.

Step 4.    Use the Add drop-down list to select vNIC from Template.

Step 5.    Enter the name for the vNIC from the table above (for example, eno5) and click Select vNIC Template.

Related image, diagram or screenshot

Step 6.    In the upper right, click Create New.

Step 7.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the vNIC Template (for example, AA02-OCP-BareMetal-MGMT-vNIC). Click Next.

Procedure 2.     Create MAC Address Pool for Fabric A and B

Note:     When creating the first vNIC, the MAC address pool has not been defined yet, therefore a new MAC address pool will need to be created. Two separate MAC address pools are configured, one for each Fabric. MAC-Pool-A will be used for all Fabric-A vNICs, and MAC-Pool-B will be used for all Fabric-B vNICs. Adjust the values in the table for your environment.

Table 7.    MAC Address Pools

Pool Name    Starting MAC Address   Size   vNICs
MAC-Pool-A   00:25:B5:A2:0A:00      64*    eno5, eno6
MAC-Pool-B   00:25:B5:A2:0B:00      64*    eno7, eno8

Note:     For Control-Plane Nodes, each server requires 1 MAC address from MAC-Pool-A, and for Workers, each server requires 2 MAC addresses from MAC-Pool-A and 2 MAC addresses from MAC-Pool-B. Adjust the size of the pool according to your requirements.

Step 1.    Click Select Pool under MAC Pool and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the pool from Table 7 with the prefix applied depending on the vNIC being created (for example, AA02-OCP-MAC-Pool-A for Fabric A).

Step 3.    Click Next.

Step 4.    Provide the starting MAC address from Table 7 (for example, 00:25:B5:A2:0A:00)

Note:     For ease of troubleshooting FlexPod, some additional information is always coded into the MAC address pool. For example, in the starting address 00:25:B5:A2:0A:00, A2 is the rack ID and 0A indicates Fabric A.

Step 5.    Provide the size of the MAC address pool from Table 7 (for example, 64).

Related image, diagram or screenshot

Step 6.    Click Create to finish creating the MAC address pool.

Step 7.    From the Create vNIC Template window, provide the Switch ID from Table 6.

Step 8.    For Consistent Device Naming (CDN), from the drop-down list, select vNIC Name.

Step 9.    For Failover, set the value from Table 6.

A screenshot of a computerDescription automatically generated

Procedure 3.     Create Ethernet Network Group Policy

Ethernet Network Group policies will be created and reused on applicable vNICs as explained below. The ethernet network group policy defines the VLANs allowed for a particular vNIC, therefore multiple network group policies will be defined for this deployment as listed in Table 8.

Table 8.    Ethernet Group Policy Values

Group Policy Name                  Native VLAN                 Apply to vNICs   Allowed VLANs
AA02-OCP-BareMetal-NetGrp          OCP-BareMetal-MGMT (1022)   eno5             OCP-BareMetal-MGMT
AA02-OCP-iSCSI-NVMe-TCP-A-NetGrp   OCP-iSCSI-A (3012)          eno6             OCP-iSCSI-A, OCP-NVMe-TCP-A*
AA02-OCP-iSCSI-NVMe-TCP-B-NetGrp   OCP-iSCSI-B (3022)          eno7             OCP-iSCSI-B, OCP-NVMe-TCP-B*
AA02-OCP-NFS-NetGrp                OCP-NFS                     eno8             OCP-NFS

Note:     *Add the NVMe-TCP VLANs when using NVMe-TCP.

Step 1.    Click Select Policy under Ethernet Network Group Policy and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy from the Table 8 (for example, AA02-OCP-BareMetal-NetGrp).

Step 3.    Click Next.

Step 4.    Using the Add VLANs pulldown, select Enter Manually.

Step 5.    Enter the Allowed VLANs from Table 8 as a comma separated list (for example, 1022). Click Enter.

Step 6.    Click the three dots to the right of the native VLAN for the policy and select Set Native VLAN.

Related image, diagram or screenshot

Step 7.    Click Create to finish configuring the Ethernet network group policy.

Note:     When ethernet group policies are shared between two vNICs, the ethernet group policy only needs to be defined for the first vNIC. For subsequent vNIC policy mapping, click Select Policy and pick the previously defined ethernet network group policy from the list.

Procedure 4.     Create Ethernet Network Control Policy

The Ethernet Network Control Policy is used to enable Cisco Discovery Protocol (CDP) and Link Layer Discovery Protocol (LLDP) for the vNICs. A single policy will be created here and reused for all the vNICs.

Step 1.    Click Select Policy under Ethernet Network Control Policy and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-Enable-CDP-LLDP).

Step 3.    Click Next.

Step 4.    Enable Cisco Discovery Protocol (CDP) and Enable Transmit and Enable Receive under LLDP.

Related image, diagram or screenshot

Step 5.    Click Create to finish creating Ethernet network control policy.

Procedure 5.     Create Ethernet QoS Policy

Note:     The Ethernet QoS policy is used to enable the appropriate maximum transmission unit (MTU) for all the vNICs. Across the vNICs, two policies will be created (one for MTU 1500 and one for MTU 9000) and reused for all the vNICs.

Table 9.    Ethernet QoS Policy association to vNICs

Policy Name                    vNICs
AA02-OCP-MTU1500-EthernetQoS   eno5
AA02-OCP-MTU9000-EthernetQoS   eno6, eno7, eno8

Step 1.    Click Select Policy under Ethernet QoS and in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-MTU1500-EthernetQoS). The name of the policy should conform to the MTU from Table 9.

Step 3.    Click Next.

Step 4.    Change the MTU, Bytes value to the MTU indicated in the policy name (1500 or 9000, as listed for the corresponding vNICs in Table 6).

Step 5.    Set the Rate Limit Mbps to 100000.

Related image, diagram or screenshot

Step 6.    Click Create to finish setting up the Ethernet QoS policy.

Procedure 6.     Create Ethernet Adapter Policy

The ethernet adapter policy is used to set the interrupts, the send and receive queues, and the queue ring sizes. The values are set according to the best-practices guidance for the operating system in use. Cisco Intersight provides a default Linux Ethernet Adapter policy for typical Linux deployments.

You can optionally configure a tweaked ethernet adapter policy for additional hardware receive queues handled by multiple CPUs in scenarios where there is a lot of traffic and multiple flows. In this deployment, a modified ethernet adapter policy, AA02-EthAdapter-16RXQs-5G, is created and attached to storage vNICs. Non-storage vNICs will use the default Linux-v2 Ethernet Adapter policy.

Table 10.  Ethernet Adapter Policy association to vNICs

Policy Name                     vNICs
AA02-OCP-EthAdapter-Linux-v2    eno5
AA02-OCP-EthAdapter-16RXQs-5G   eno6, eno7, eno8

Step 1.    Click Select Policy under Ethernet Adapter and then, in the pane on the right, click Create New.

Step 2.    Verify the correct organization is selected from the drop-down list (for example, AA02-OCP) and provide a name for the policy (for example, AA02-OCP-EthAdapter-Linux-v2).

Step 3.    Click Select Cisco Provided Configuration under Cisco Provided Ethernet Adapter Configuration.

Step 4.    From the list, select Linux-v2.

Step 5.    Click Next.

Step 6.    For the AA02-OCP-EthAdapter-Linux-v2 policy, click Create and skip the rest of the steps in this “Create Ethernet Adapter Policy” section.

Step 7.    For the AA02-OCP-EthAdapter-16RXQs-5G policy, make the following modifications to the policy:

    Increase Interrupts to 19

    Increase Receive Queue Count to 16

    Increase Receive Ring Size to 16384 (Leave at 4096 for 4G VICs)

    Increase Transmit Ring Size to 16384 (Leave at 4096 for 4G VICs)

    Increase Completion Queue Count to 17

    Ensure Receive Side Scaling is enabled

A screenshot of a computerDescription automatically generated

Step 8.    Click Create.

Procedure 7.     Add vNIC(s) to LAN Connectivity Policy

The vNIC Template has now been created and all policies attached.

Step 1.    For PCI Order, enter the number from Table 6. Verify the other values.

Step 2.    Click Add to add the vNIC to the LAN Connectivity Policy.

Step 3.    If building the Worker LAN Connectivity Policy, go back to Procedure 1 Create Network Configuration - LAN Connectivity for Worker Nodes, Step 4 and repeat the vNIC Template and vNIC creation for all four vNICs, selecting the previously created policies where they already exist instead of creating new ones.

Step 4.    Verify all vNICs were successfully created.

Related image, diagram or screenshot

Step 5.    Click Create to finish creating the LAN Connectivity policy.

Procedure 8.     Complete the Worker Server Profile Template

Step 1.    When the LAN connectivity policy is created, click Next to move to the Summary screen.

Step 2.    On the Summary screen, verify the policies are mapped to various settings. The screenshots below provide the summary view for the OpenShift Worker M.2 Boot server profile template.

Related image, diagram or screenshot

Related image, diagram or screenshot

Related image, diagram or screenshot

Related image, diagram or screenshot

Step 3.    Click Close to close the template.

Procedure 9.     Build the OpenShift Control-Plane LAN Connectivity Policy

Note:     If combined control-plane and worker nodes are being used, it is not necessary to build a Control-Plane Node LAN Connectivity Policy because only the Worker LAN Connectivity Policy will be used.

Step 1.    The OpenShift Worker LAN Connectivity Policy can be cloned, and vNICs eno6, eno7, and eno8 removed, to build the OpenShift Control-Plane Node LAN Connectivity Policy with only eno5; this policy will then be used in the OpenShift Control-Plane Node Server Profile Template. Log into Cisco Intersight and select Configure > Policies.

Step 2.    In the policy list, look for the <org-name>-Worker-M2Bt-5G-LANConn or the LAN Connectivity policy created above. Click to the right of the policy and select Clone.

Step 3.    Change the name of the cloned policy to something like AA02-OCP-Control-Plane-M2Bt-5G-LANConn and select the correct Organization (for example, AA02-OCP).

Related image, diagram or screenshot

Step 4.    Click Clone to clone the policy.

Step 5.    From the Policies window, click the refresh button to refresh the list. The newly cloned policy should now appear at the top of the list. Click to the right of the newly cloned policy and select Edit.

Step 6.    Click Next.

Step 7.    Use the checkboxes to select all vNICs except eno5. Click the trash can icon to delete all vNICs except eno5.

Step 8.    Click Save to save the policy.

Procedure 10.  Create the OpenShift Control-Plane Node Server Profile Template

The OpenShift Worker Server Profile Template can be cloned and modified to create the OpenShift Control-Plane Node Server Profile Template.

Note:     If combined control-plane and worker nodes are being used, it is not necessary to build a Control-Plane Node Server Profile Template because only the Worker Server Profile Template will be used.

Step 1.    Log into Cisco Intersight and select Configure > Templates > UCS Server Profile Templates.

Step 2.    To the right of the OCP-Worker-X210C-M7 template, click the ellipses and select Clone.

Step 3.    Ensure that the correct Destination Organization is selected (for example, AA02-OCP) and click Next.

Step 4.    Adjust the Clone Name (for example, AA02-Control-Plane-X210C-M7) and Description as needed and click Next.

Step 5.    From the Templates window, click the ellipses to the right of the newly created clone and click Edit.

Step 6.    Click Next until you get to Storage Configuration. If the Storage Policy needs to be added or deleted, make that adjustment here.

Step 7.    Click Next to get to Network Configuration. Click the page icon to the right of the LAN Connectivity Policy and select the Control-Plane Node LAN Connectivity Policy. Click Select.

Step 8.    Click Next and Close to save this template.

Complete the Cisco UCS IMM Setup

Procedure 1.     Derive Server Profiles

Step 1.    From the Configure > Templates page, to the right of the OCP-Control-Plane template, click the ellipses and select Derive Profiles.

Note:     If using combined control-plane and worker nodes, use the OCP-Worker template for all nodes.

Step 2.    Under the Server Assignment, select Assign Now and select the 3 Cisco UCS X210c M7 servers that will be used as OpenShift Control-Plane Nodes.

Related image, diagram or screenshot

Step 3.    Click Next.

Step 4.    For the Profile Name Prefix, put in the first part of the OpenShift Control-Plane Node hostnames (for example, control). Set Start Index for Suffix to 0 (zero). The 3 server names should now correspond to the OpenShift Control-Plane Node hostnames.

Related image, diagram or screenshot

Step 5.    Click Next.

Step 6.    Click Derive to derive the OpenShift Control-Plane Node Server Profiles.

Step 7.    Select Profiles on the left and then select the UCS Server Profiles tab.

Step 8.    Select the 3 OpenShift Control-Plane Node profiles and then click the ellipses at the top or bottom of the list and select Deploy.

Step 9.    Select Reboot Immediately to Activate and click Deploy.

Step 10.                       Repeat this process to create 3 OpenShift Worker Node Server Profiles using the OCP-Worker-Template.

OpenShift Installation and Configuration

This chapter contains the following:

      OpenShift – Installation Requirements

      Prerequisites

      Network Requirements

      Deploy NetApp Trident

      NetApp DataOps Toolkit

      Add an Additional Administrative User to the OpenShift Cluster

      Back up Cluster etcd

      Add a Worker Node to an OpenShift Cluster

      Deploy a Sample Containerized Application

OpenShift 4.17 is deployed on the Cisco UCS infrastructure as M.2 booted bare metal servers. The Cisco UCS X210C M7 servers need to be equipped with an M.2 controller (SATA or NVMe) card and either 1 or 2 identical M.2 drives. Three control-plane nodes and three worker nodes are deployed in the validation environment and additional worker nodes can easily be added to increase the scalability of the solution. This document will guide you through the process of using the Assisted Installer to deploy OpenShift 4.17.

OpenShift – Installation Requirements

The Red Hat OpenShift Assisted Installer provides support for installing OpenShift on bare metal nodes. This guide provides a methodology to achieving a successful installation using the Assisted Installer.

Prerequisites

The FlexPod for OpenShift solution utilizes the Assisted Installer for OpenShift installation. Therefore, when provisioning and managing the FlexPod infrastructure, you must provide all the supporting cluster infrastructure and resources, including an installer VM or host, networking, storage, and the individual cluster machines.

The following supporting cluster resources are required for the Assisted Installer installation:

      The control plane and compute machines that make up the cluster

      Cluster networking

      Storage for the cluster infrastructure and applications

      The Installer VM or Host

Network Requirements

The following infrastructure services need to be deployed to support the OpenShift cluster. During the validation of this solution, these services were provided by VMs running on a hypervisor of choice; you can also use existing DNS and DHCP services available in the data center.

There are various infrastructure service prerequisites for deploying OpenShift 4.17. These prerequisites are as follows:

      DNS and DHCP services – these services were configured on Microsoft Windows Server VMs in this validation

      NTP Distribution was done with the Cisco Nexus switches

      Specific DNS entries for deploying OpenShift – added to the DNS server

      A Linux VM for initial automated installation and cluster management – a Rocky Linux 9 / RHEL 9 VM with appropriate packages

NTP

Each OpenShift node in the cluster must have access to at least two NTP servers.

NICs

vNICs configured on the Cisco UCS servers based on the design previously discussed.

DNS

Clients access the OpenShift cluster nodes over the bare metal network. Configure a subdomain or subzone where the canonical name extension is the cluster name.

The following domain and OpenShift cluster names are used in this deployment guide:

      Base Domain: flexpodb4.cisco.com

      OpenShift Cluster Name: ocp

The DNS domain name for the OpenShift cluster should be the cluster name followed by the base domain, for example, ocp.flexpodb4.cisco.com.

Table 11 lists the fully qualified domain names used during validation. The API and wildcard Ingress (apps) names include the cluster name and base domain as canonical name extensions. The hostnames of the control plane and worker nodes are examples, so you can use any host naming convention you prefer.

Table 11.  DNS FQDN Names Used

Usage               Hostname                           IP Address
API                 api.ocp.flexpodb4.cisco.com        10.102.2.228
Ingress LB (apps)   *.apps.ocp.flexpodb4.cisco.com     10.102.2.229
control0            control0.ocp.flexpodb4.cisco.com   10.102.2.211
control1            control1.ocp.flexpodb4.cisco.com   10.102.2.212
control2            control2.ocp.flexpodb4.cisco.com   10.102.2.213
worker0             worker0.ocp.flexpodb4.cisco.com    10.102.2.214
worker1             worker1.ocp.flexpodb4.cisco.com    10.102.2.215
worker2             worker2.ocp.flexpodb4.cisco.com    10.102.2.216
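As a reference, the entries in Table 11 could be expressed in a BIND-style zone file similar to the sketch below. This is only an illustration of the required A records; adapt it to the DNS server used in your environment (the Windows Server DNS service used in this validation is configured through its management console):

; excerpt of the flexpodb4.cisco.com zone for the ocp cluster
api.ocp         IN A 10.102.2.228
*.apps.ocp      IN A 10.102.2.229
control0.ocp    IN A 10.102.2.211
control1.ocp    IN A 10.102.2.212
control2.ocp    IN A 10.102.2.213
worker0.ocp     IN A 10.102.2.214
worker1.ocp     IN A 10.102.2.215
worker2.ocp     IN A 10.102.2.216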

DHCP

For the bare metal network, a network administrator must reserve several IP addresses, including:

      One IP address for the API endpoint

      One IP address for the wildcard Ingress endpoint

      One IP address for each control-plane node (DHCP server assigns to the node)

      One IP address for each worker node (DHCP server assigns to the node)

Note:     Obtain the MAC addresses of the bare metal Interfaces from the UCS Server Profile for each node to be used in the DHCP configuration to assign reserved IP addresses (reservations) to the nodes. The KVM IP address also needs to be gathered for the control-plane and worker nodes from the server profiles.
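For reference, with an ISC dhcpd-based DHCP server, the bare metal network configuration and reservations could look similar to the following sketch. The gateway, DNS server, and dynamic range values are placeholders; the MAC addresses and fixed addresses come from the server profiles (listed later in Table 12). The Windows Server DHCP service used in this validation provides the same functionality through scopes and reservations in its management console:

subnet 10.102.2.0 netmask 255.255.255.0 {
  option routers <gateway-ip>;
  option domain-name-servers <dns-server-ip>;
  range 10.102.2.100 10.102.2.200;        # placeholder dynamic range
}
host control0 { hardware ethernet 00:25:B5:A2:0A:80; fixed-address 10.102.2.211; }
host worker0  { hardware ethernet 00:25:B5:A2:0A:83; fixed-address 10.102.2.214; }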

Procedure 1.     Gather MAC Addresses of Node Bare Metal Interfaces

Step 1.    Log into Cisco Intersight.

Step 2.    Select Configure > Profiles > Server Profile (for example, ocp-worker2).

Step 3.    In the center pane, select Inventory > Network Adapters > Network Adapter (for example, UCSX-ML-V5D200G).

Step 4.    In the center pane, select Interfaces.

Step 5.    Record the MAC address for NIC Interface eno5.

Step 6.    Select the General tab and select Identifiers in the center pane.

Step 7.    Record the Management IP assigned out of the OCP-BareMetal-IP-Pool.

Table 12 lists the IP addresses used for the OpenShift cluster including bare metal network IPs and UCS KVM Management IPs for IPMI or Redfish access.

Table 12.  Host BMC Information

Hostname                           IP Address     UCS KVM Mgmt. IP Address   BareMetal MAC Address (eno5)
control0.ocp.flexpodb4.cisco.com   10.102.2.211   10.102.2.241               00:25:B5:A2:0A:80
control1.ocp.flexpodb4.cisco.com   10.102.2.212   10.102.2.240               00:25:B5:A2:0A:81
control2.ocp.flexpodb4.cisco.com   10.102.2.213   10.102.2.239               00:25:B5:A2:0A:82
worker0.ocp.flexpodb4.cisco.com    10.102.2.214   10.102.2.243               00:25:B5:A2:0A:83
worker1.ocp.flexpodb4.cisco.com    10.102.2.215   10.102.2.244               00:25:B5:A2:0A:85
worker2.ocp.flexpodb4.cisco.com    10.102.2.216   10.102.2.242               00:25:B5:A2:0A:87

Step 8.    From Table 12, enter the hostnames, IP addresses, and MAC addresses as reservations in your DHCP and DNS server(s) or configure the DHCP server to dynamically update DNS.

Step 9.    You will also need to trunk up to all six storage VLANs to your DHCP server(s) and assign the DHCP server an IP address in each storage subnet on those VLAN interfaces. Then create a DHCP scope for each storage VLAN and subnet, making sure the IPs assigned by the scope do not overlap with storage LIF IPs. Either enter the nodes in the DNS server or configure the DHCP server to forward entries to the DNS server. For the cluster nodes, create reservations to map the hostnames to the desired IP addresses.

Step 10.                       Set up either a VM or a spare server as an OCP-Installer machine with its network interface connected to the Bare Metal VLAN. Install either Red Hat Enterprise Linux (RHEL) 9.5 or Rocky Linux 9.5 “Server with GUI” and create an administrator user. Once the VM or host is up and running, update it, then install and configure XRDP. Also, install Google Chrome on this machine. Connect to this host with a Windows Remote Desktop client as the admin user.

Procedure 2.     Install Red Hat OpenShift using the Assisted Installer

Use the following steps to install OpenShift from the OCP-Installer VM.

Step 1.    From the Installer desktop, open a terminal session and create an SSH key pair to use to communicate with the OpenShift hosts:

ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519

Step 2.    Copy the public SSH key to the user directory:

cp ~/.ssh/id_ed25519.pub ~/

Step 3.    Add the private key to the ssh-agent:

eval "$(ssh-agent)"                                

ssh-add ~/.ssh/id_ed25519

Step 4.    Launch Chrome and connect to https://console.redhat.com/openshift/cluster-list. Log into your Red Hat account.

Step 5.    Click Create cluster to create an OpenShift cluster.

Step 6.    Select Datacenter and then select Bare Metal (x86_64).

Step 7.    Select Interactive to launch the Assisted Installer.

Step 8.    Provide the cluster name and base domain. Select the latest OpenShift 4.17 version. Scroll down and click Next.

A screenshot of a computerAI-generated content may be incorrect.

A screenshot of a computerAI-generated content may be incorrect.

 

Step 9.    It is not necessary to install any Operators at this time; they can be added later. Click Next.

Step 10.                       Click Add hosts.

Step 11.                       Under Provisioning type, from the drop-down list select the Minimal image file. Under SSH public key, click Browse and browse to, select, and open the id_ed25519.pub file. The contents of the public key should now appear in the box. Click Generate Discovery ISO.

A screenshot of a computerAI-generated content may be incorrect.

Step 12.                       If your Cisco UCS Servers have the Intersight Advantage license installed, click Add hosts from Cisco Intersight. If you do not have the Advantage license or you do not wish to use the Cisco Intersight Integration, skip to Step 15.

A screenshot of a computerAI-generated content may be incorrect.

Step 13.                       A Cisco Intersight tab will appear in Chrome. Log into Intersight and select the appropriate account. Select the appropriate Organization (AA02-OCP). Click the pencil icon to select the servers for the OpenShift installation. In the list on the right, select the servers to install OpenShift onto and click Save. In the lower right-hand corner, click Execute. The Workflow will mount the Discovery ISO from the Red Hat Cloud and reboot the servers into the Discovery ISO.

A screenshot of a computerAI-generated content may be incorrect.

Step 14.                       Back in the Red Hat Hybrid Cloud Console, click Close to close the Add hosts popup. Skip to Step 22 below.

Step 15.                       Click Download Discovery ISO to download the Discovery ISO into the Downloads directory. Click Close when the download is done.

Step 16.                       Copy the Discovery ISO to an http server. Use a web browser to get a copy of the URL for the Discovery ISO.
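If a web server is not already available, one simple option is to serve the ISO with Python's built-in HTTP server from the directory containing the download on any reachable Linux host (a sketch; the port is arbitrary and may need to be allowed through the firewall):

cd ~/Downloads
python3 -m http.server 8080
# The Discovery ISO URL is then http://<http-server-ip>:8080/<discovery-iso-filename>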

Step 17.                       Use Chrome to connect to Cisco Intersight and log into the Intersight account previously set up.

Step 18.                       Go to Configure > Policies and edit the Virtual Media policy attached to your OpenShift server profiles. Once on the Policy Details page, click Add Virtual Media.

Step 19.                       In the Add Virtual Media dialogue, leave CDD selected and select HTTP/HTTPS. Provide a name for the mount and add the URL for File Location.

Related image, diagram or screenshot

Step 20.                       Click Add. Click Save & Deploy then click Save & Proceed. It is not necessary to reboot the hosts to add the vMedia mount. Click Deploy. Wait for each of the six servers to complete deploying the profile.

Step 21.                       Go to Configure > Profiles > UCS Server Profiles. Once all six server profiles have a status of OK, click the ellipses to the right of each profile and select Server Actions > Power > Power Cycle, then Power Cycle, to reboot each of the six servers. If the M.2 drives or virtual drives are blank, the servers should boot from the Discovery ISO. This can be monitored with the vKVM if desired.

Step 22.                       Once all six servers have booted “RHEL CoreOS (Live)” from the Discovery ISO, they will appear in the Assisted Installer under Host discovery. Use the drop-down lists under Role to assign the appropriate server roles. Scroll down and click Next.

Note:     If using combined control-plane and worker nodes, enable Run workloads on control plane nodes. When the “Control plane node” role is selected, it will also include the “Worker” role.

A screenshot of a computerDescription automatically generated 

Step 23.                       Expand each node and verify that CoreOS and OpenShift are being installed to sda (the M.2 device). Click Next.

Step 24.                       Under Network Management, make sure Cluster-Managed Networking is selected. Under Machine network, from the drop-down list select the subnet for the BareMetal VLAN. Enter the API IP for the api.cluster.basedomain entry in the DNS servers. For the Ingress IP, enter the IP for the *.apps.cluster.basedomain entry in the DNS servers.

A screenshot of a computerAI-generated content may be incorrect.

Step 25.                       Scroll down. All nodes should have a status of Ready. Click Next.

A screenshot of a computerAI-generated content may be incorrect.

Step 26.                       Review the information and click Install cluster to begin the cluster installation.

A screenshot of a computerAI-generated content may be incorrect.

Step 27.                       On the Installation progress page, expand the Host inventory. The installation will take 30-45 minutes. When installation is complete, all nodes will show a Status of Installed.

A screenshot of a computerAI-generated content may be incorrect.

Step 28.                       Select Download kubeconfig to download the kubeconfig file. In a terminal window, set up a cluster directory and save the credentials:

cd
mkdir <clustername> # for example, ocp
cd <clustername>
mkdir auth
cd auth
mv ~/Downloads/kubeconfig ./
mkdir ~/.kube
cp kubeconfig ~/.kube/config

Step 29.                       In the Assisted Installer, click the icon to copy the kubeadmin password:

echo <paste password> > ./kubeadmin-password

Step 30.                       In a new tab in Chrome, connect to https://access.redhat.com/downloads/content/290. Download the OpenShift Linux Client for the version of OpenShift that you installed:

cd ..
mkdir client
cd client
ls ~/Downloads
mv ~/Downloads/oc-x.xx.x-linux.tar.gz ./
tar xvf oc-x.xx.x-linux.tar.gz
ls
sudo mv oc /usr/local/bin/
sudo mv kubectl /usr/local/bin/
oc get nodes

Step 31.                       To enable oc tab completion for bash, run the following:

oc completion bash > oc_bash_completion
sudo mv oc_bash_completion /etc/bash_completion.d/

Step 32.                       If you used the Cisco UCS Integration in the OpenShift installation process, connect to Cisco Intersight and, from Configure > Profiles > UCS Server Profiles, select all OpenShift Server Profiles. Click the ellipses at either the top or bottom of the column and select Deploy. It is not necessary to reboot the servers; select only the second (lower) check box and click Deploy.

Step 33.                       If you did not use the Cisco UCS Integration in the OpenShift installation process, in Cisco Intersight, edit the Virtual Media policy and remove the link to the Discovery ISO. Click Save & Deploy and then click Save & Proceed. Do not select “Reboot Immediately to Activate.” Click Deploy. The virtual media mount will be removed from the servers without rebooting them.

Step 34.                       In Chrome or Firefox, in the Assisted Installer page, click Launch OpenShift Console to launch the OpenShift Console. Use kubeadmin and the kubeadmin password to log in. On the left, go to Compute > Nodes to see the status of the OpenShift nodes.

A screenshot of a computerAI-generated content may be incorrect.

Step 35.                       In the Red Hat OpenShift console, go to Compute > Bare Metal Hosts. For each Bare Metal Host, click the ellipses to the right of the host and select Edit Bare Metal Host. Select Enable power management. Using Table 12, fill in the BMC Address. For an IPMI connection to the server, use the BMC IP Address. For a redfish connection to the server, use redfish://<BMC IP>/redfish/v1/Systems/<server Serial Number> and make sure to check Disable Certificate Verification. Also, make sure the Boot MAC Address matches the MAC address in Table 12. For the BMC Username and BMC Password, use what was entered into the Cisco Intersight Local User policy. Click Save to save the changes. Repeat this step for all Bare Metal Hosts.
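Note:     Behind the console form, these settings are stored in the metal3 BareMetalHost resource for each node (typically in the openshift-machine-api namespace). As a reference, the relevant fields for a Redfish connection look similar to the following excerpt for worker0; the serial number and Secret name here are illustrative, and the console manages the BMC credentials Secret for you:

oc get bmh worker0 -n openshift-machine-api -o yaml    # excerpt
spec:
  bmc:
    address: redfish://10.102.2.243/redfish/v1/Systems/<server-serial-number>
    credentialsName: worker0-bmc-secret
    disableCertificateVerification: true
  bootMACAddress: 00:25:B5:A2:0A:83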

If using redfish to connect to the server, it is critical to check the box Disable Certificate Verification.

A screenshot of a computerAI-generated content may be incorrect.

Step 36.                       Go to Compute > Bare Metal Hosts. Once all hosts have been configured, the Status should show “Externally provisioned,” and the Management Address should be populated. You can now manage power on the OpenShift hosts from the OpenShift console.

A screenshot of a computerAI-generated content may be incorrect.

Step 37.                       Enable dynamic resource allocation for kubelet. On your installer VM, create a directory for resource allocation, place the following YAML files in it, and run the commands to create the configuration:

cat worker-kubeletconfig.yaml

apiVersion: machineconfiguration.openshift.io/v1

kind: KubeletConfig

metadata:

  name: dynamic-node

spec:

  autoSizingReserved: true

  machineConfigPoolSelector:

    matchLabels:

      pools.operator.machineconfiguration.openshift.io/worker: ""

cat control-plane-kubeletconfig.yaml

apiVersion: machineconfiguration.openshift.io/v1

kind: KubeletConfig

metadata:

  name: dynamic-node-control-plane

spec:

  autoSizingReserved: true

  machineConfigPoolSelector:

    matchLabels:

      pools.operator.machineconfiguration.openshift.io/master: ""

oc create -f worker-kubeletconfig.yaml
oc create -f control-plane-kubeletconfig.yaml

Step 38.                       To setup NTP on the worker and control-plane nodes, and NVMe-TCP on the worker nodes, run the following:

cd

cd <cluster-name> # For example, ocp

mkdir machine-configs

cd machine-configs

curl https://mirror.openshift.com/pub/openshift-v4/clients/butane/latest/butane --output butane

chmod +x butane

Step 39.                       Build the following files in the machine-configs directory with variations for your network:

cat 99-control-plane-chrony-conf-override.bu

variant: openshift

version: 4.17.0

metadata:

  name: 99-control-plane-chrony-conf-override

  labels:

    machineconfiguration.openshift.io/role: master

storage:

  files:

    - path: /etc/chrony.conf

      mode: 0644

      overwrite: true

      contents:

        inline: |

          driftfile /var/lib/chrony/drift

          makestep 1.0 3

          rtcsync

          logdir /var/log/chrony

          server 10.102.2.3 iburst

          server 10.102.2.4 iburst

 

cat 99-worker-chrony-conf-override.bu

variant: openshift

version: 4.17.0

metadata:

  name: 99-worker-chrony-conf-override

  labels:

    machineconfiguration.openshift.io/role: worker

storage:

  files:

    - path: /etc/chrony.conf

      mode: 0644

      overwrite: true

      contents:

        inline: |

          driftfile /var/lib/chrony/drift

          makestep 1.0 3

          rtcsync

          logdir /var/log/chrony

          server 10.102.2.3 iburst

          server 10.102.2.4 iburst

cat 99-worker-nvme-discovery.bu

variant: openshift

version: 4.17.0

metadata:

  name: 99-worker-nvme-discovery

  labels:

    machineconfiguration.openshift.io/role: worker

openshift:

  kernel_arguments:

    - loglevel=7

storage:

  files:

    - path: /etc/nvme/discovery.conf

      mode: 0644

      overwrite: true

      contents:

        inline: |

          --transport=tcp --traddr=192.168.32.51 --trsvcid=8009
          --transport=tcp --traddr=192.168.32.52 --trsvcid=8009
          --transport=tcp --traddr=192.168.42.51 --trsvcid=8009

          --transport=tcp --traddr=192.168.42.52 --trsvcid=8009

Step 40.                       Create .yaml files from the butane files with butane, then load the configurations into OpenShift:

./butane 99-control-plane-chrony-conf-override.bu -o ./99-control-plane-chrony-conf-override.yaml
./butane 99-worker-chrony-conf-override.bu -o ./99-worker-chrony-conf-override.yaml
./butane 99-worker-nvme-discovery.bu -o ./99-worker-nvme-discovery.yaml

oc create -f 99-control-plane-chrony-conf-override.yaml

oc create -f 99-worker-chrony-conf-override.yaml

oc create -f 99-worker-nvme-discovery.yaml

Note:     If using combined control-plane and worker nodes, 99-control-plane-nvme-discovery.bu and 99-control-plane-nmve-discovery.yaml files will need to be created and loaded into OpenShift.
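Note:     After the machine configs are applied and the NVMe-TCP VLAN interfaces have been configured on the workers (done later with NMState), NVMe discovery can optionally be verified manually from a worker node. This is a sketch using the nvme-cli tooling included in RHCOS and one of the discovery addresses above:

ssh core@<worker-node-ip>
sudo nvme discover --transport=tcp --traddr=192.168.32.51 --trsvcid=8009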

Step 41.                       To enable iSCSI and multipathing on the workers, create the 99-worker-ontap-iscsi.yaml and upload as a machine config:

cat 99-worker-ontap-iscsi.yaml

apiVersion: machineconfiguration.openshift.io/v1

kind: MachineConfig

metadata:

  name: 99-worker-ontap-iscsi

  labels:

    machineconfiguration.openshift.io/role: worker

spec:

  config:

    ignition:

      version: 3.2.0

    storage:

      files:

      - contents:

          source: data:text/plain;charset=utf-8;base64,IyBkZXZpY2UtbWFwcGVyLW11bHRpcGF0aCBjb25maWd1cmF0aW9uIGZpbGUKCiMgRm9yIGEgY29tcGxldGUgbGlzdCBvZiB0aGUgZGVmYXVsdCBjb25maWd1cmF0aW9uIHZhbHVlcywgcnVuIGVpdGhlcjoKIyAjIG11bHRpcGF0aCAtdAojIG9yCiMgIyBtdWx0aXBhdGhkIHNob3cgY29uZmlnCgojIEZvciBhIGxpc3Qgb2YgY29uZmlndXJhdGlvbiBvcHRpb25zIHdpdGggZGVzY3JpcHRpb25zLCBzZWUgdGhlCiMgbXVsdGlwYXRoLmNvbmYgbWFuIHBhZ2UuCgpkZWZhdWx0cyB7Cgl1c2VyX2ZyaWVuZGx5X25hbWVzIHllcwoJZmluZF9tdWx0aXBhdGhzIG5vCn0KCmJsYWNrbGlzdCB7Cn0K

          verification: {}

        filesystem: root

        mode: 600

        overwrite: true

        path: /etc/multipath.conf

    systemd:

      units:

        - name: iscsid.service

          enabled: true

          state: started

        - name: multipathd.service

          enabled: true

          state: started

  osImageURL: ""

oc create -f 99-worker-ontap-iscsi.yaml

Note:     If using combined control-plane and worker nodes, the 99-control-plane-ontap-iscsi.yaml file will need to be created and loaded into OpenShift.

Note:     The Base 64 encoded source above is the following file (/etc/multipath.conf) encoded. It is necessary to set “find_multipaths” to no.

cat multipath.conf

# device-mapper-multipath configuration file

 

# For a complete list of the default configuration values, run either:

# # multipath -t

# or

# # multipathd show config

 

# For a list of configuration options with descriptions, see the

# multipath.conf man page.

 

defaults {

        user_friendly_names yes

        find_multipaths no

}

 

blacklist {

}
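Note:     If you customize multipath.conf, regenerate the Base 64 string used in the machine config source with the following command (assuming the file above is saved as multipath.conf) and replace the existing string:

base64 -w 0 multipath.conf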

Step 42.                       Over the next 20-30 minutes each of the nodes will go through the “Not Ready” state and reboot. You can monitor this by going to Compute > MachineConfigPools in the OpenShift Console. Wait until both pools have an Update status of “Up to date.”

A screenshot of a computerDescription automatically generated
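The rollout can also be monitored from the CLI on the installer machine; both machine config pools should eventually report UPDATED as True:

oc get machineconfigpools
oc get nodes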

Step 43.                       The Kubernetes NMState Operator will be used to configure the storage networking interfaces on the workers (and also virtual machine connected interfaces if OpenShift Virtualization is installed). In the OpenShift Console, go to Operators > OperatorHub. In the search box, enter NMState and Kubernetes NMState Operator should appear. Click Kubernetes NMState Operator.

A screenshot of a computerAI-generated content may be incorrect.

Step 44.                       Click Install. Leave all the defaults in place and click Install again. The operator will take a few minutes to install.

Step 45.                       Once the operator is installed, click View Operator.

Step 46.                       Select the NMState tab. On the right, click Create NMState. Leave all defaults in place and click Create. The nmstate will be created. You will also need to refresh the console because additional items will be added under Networking.

A screenshot of a computerAI-generated content may be incorrect.

Step 47.                       In an NMState directory on the ocp-installer machine, create the following YAML files:

cat eno6.yaml

apiVersion: nmstate.io/v1

kind: NodeNetworkConfigurationPolicy

metadata:

  name: ocp-iscsi-a-policy

spec:

  nodeSelector:

    node-role.kubernetes.io/worker: ''

  desiredState:

    interfaces:

    - name: eno6

      description: Configuring eno6 on workers

      type: ethernet

      state: up

      ipv4:

        dhcp: true

        enabled: true

      ipv6:

        enabled: false

cat eno7.yaml

apiVersion: nmstate.io/v1

kind: NodeNetworkConfigurationPolicy

metadata:

  name: ocp-iscsi-b-policy

spec:

  nodeSelector:

    node-role.kubernetes.io/worker: ''

  desiredState:

    interfaces:

    - name: eno7

      description: Configuring eno7 on workers

      type: ethernet

      state: up

      ipv4:

        dhcp: true

        enabled: true

      ipv6:

        enabled: false

cat eno8.yaml   # If configuring NFS

apiVersion: nmstate.io/v1

kind: NodeNetworkConfigurationPolicy

metadata:

  name: ocp-nfs-policy

spec:

  nodeSelector:

    node-role.kubernetes.io/worker: ''

  desiredState:

    interfaces:

    - name: eno8

      description: Configuring eno8 on workers

      type: ethernet

      state: up

      ipv4:

        dhcp: true

        enabled: true

      ipv6:

        enabled: false

cat eno6.3032.yaml   # If configuring NVMe-TCP

apiVersion: nmstate.io/v1

kind: NodeNetworkConfigurationPolicy

metadata:

  name: ocp-nvme-tcp-a-policy

spec:

  nodeSelector:

     node-role.kubernetes.io/worker: ''

  desiredState:

    interfaces:

    - name: eno6.3032

      description: VLAN 3032 using eno6

      type: vlan

      state: up
      ipv4:

        dhcp: true

        enabled: true

      ipv6:

        enabled: false

      vlan:

        base-iface: eno6

        id: 3032

cat eno7.3042.yaml  # If configuring NVMe-TCP

apiVersion: nmstate.io/v1

kind: NodeNetworkConfigurationPolicy

metadata:

  name: ocp-nvme-tcp-b-policy

spec:

  nodeSelector:

     node-role.kubernetes.io/worker: ''

  desiredState:

    interfaces:

    - name: eno7.3042

      description: VLAN 3042 using eno7

      type: vlan

      state: up
      ipv4:

        dhcp: true

        enabled: true

      ipv6:

        enabled: false

      vlan:

        base-iface: eno7

        id: 3042

Step 48.                       Add the Node Network Configuration Policies to the OpenShift cluster:

oc create -f eno6.yaml
oc create -f eno7.yaml
oc create -f eno8.yaml # If configuring NFS
oc create -f eno6.3032.yaml # If configuring NVMe-TCP
oc create -f eno7.3042.yaml # If configuring NVMe-TCP

Step 49.                       The policies should appear under Networking > NodeNetworkConfigurationPolicy.

A screenshot of a computerAI-generated content may be incorrect.
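The same information is available from the CLI; each policy should become Available and the per-node enactments should report as successfully configured:

oc get nncp
oc get nnce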

Note:     If using combined control-plane and worker nodes, since all nodes have the worker role, the node selector will apply these policies to all nodes.

Step 50.                       Using ssh core@<node IP>, connect to each of the worker nodes and use the ifconfig -a and chronyc sources commands to verify the correct network and NTP setup of the servers.
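For example, for one of the worker nodes from Table 12:

ssh core@10.102.2.214
ifconfig -a
chronyc sources
exit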

Procedure 3.     Install the NVIDIA GPU Operator (optional)

If you have GPUs installed in your Cisco UCS servers, you need to install the Node Feature Discovery (NFD) Operator to detect NVIDIA GPUs and the NVIDIA GPU Operator to make these GPUs available to containers and virtual machines.

Step 1.    In the OpenShift web console, click Operators > OperatorHub.

Step 2.    Type Node Feature in the Filter box and then click the Node Feature Discovery Operator with Red Hat in the upper right corner. Click Install.

Step 3.    Do not change any settings and click Install.

Step 4.    When the Install operator is ready for use, click View Operator.

Step 5.    In the bar to the right of Details, click NodeFeatureDiscovery.

Step 6.    Click Create NodeFeatureDiscovery.

Step 7.    Click Create.

Step 8.    When the nfd-instance has a status of Available, Upgradeable, select Compute > Nodes.

Step 9.    Select a node that has one or more GPUs and then select Details.

Step 10.                       The following label should be present on the host:

Related image, diagram or screenshot

Note:     This label should appear on all nodes with GPUs.
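The GPU-equipped nodes can also be listed from the CLI using the NFD-applied PCI vendor label for NVIDIA (vendor ID 10de):

oc get nodes -l feature.node.kubernetes.io/pci-10de.present=true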

Step 11.                       Return to Operators > OperatorHub.

Step 12.                       Type NVIDIA in the Filter box and then click on the NVIDIA GPU Operator. Click Install.

Step 13.                       Do not change any settings and click Install.

Step 14.                       When the Install operator is ready for use, click View Operator.

Step 15.                       In the bar to the right of Details, click ClusterPolicy.

Step 16.                       Click Create ClusterPolicy.

Step 17.                       Do not change any settings and scroll down and click Create. This will install the latest GPU driver.

Step 18.                       Wait for the gpu-cluster-policy Status to become Ready.

Step 19.                       Connect to a terminal window on the OCP-Installer machine. Type the following commands. The output shown is for two servers that are equipped with GPUs:

oc project nvidia-gpu-operator

Now using project "nvidia-gpu-operator" on server "https://api.ocp.flexpodb4.cisco.com:6443".

 

oc get pods

NAME                                                  READY   STATUS      RESTARTS        AGE

gpu-feature-discovery-jmlbr                           1/1     Running     0               6m45s

gpu-feature-discovery-l2l6n                           1/1     Running     0               6m41s

gpu-operator-6656d9fbf-wkkfm                          1/1     Running     0               11m

nvidia-container-toolkit-daemonset-gb8d9              1/1     Running     0               6m45s

nvidia-container-toolkit-daemonset-t4xdf              1/1     Running     0               6m41s

nvidia-cuda-validator-lc8zr                           0/1     Completed   0               4m33s

nvidia-cuda-validator-zxvnx                           0/1     Completed   0               4m39s

nvidia-dcgm-exporter-k6tnp                            1/1     Running     2 (4m7s ago)    6m41s

nvidia-dcgm-exporter-vb66w                            1/1     Running     2 (4m20s ago)   6m45s

nvidia-dcgm-hfgz2                                     1/1     Running     0               6m45s

nvidia-dcgm-qwm46                                     1/1     Running     0               6m41s

nvidia-device-plugin-daemonset-nr6m7                  1/1     Running     0               6m41s

nvidia-device-plugin-daemonset-rpvwr                  1/1     Running     0               6m45s

nvidia-driver-daemonset-416.94.202407231922-0-88zcr   2/2     Running     0               7m42s

nvidia-driver-daemonset-416.94.202407231922-0-bvph6   2/2     Running     0               7m42s

nvidia-node-status-exporter-bz79d                     1/1     Running     0               7m41s

nvidia-node-status-exporter-jgjbd                     1/1     Running     0               7m41s

nvidia-operator-validator-8fxqr                       1/1     Running     0               6m41s

nvidia-operator-validator-tbqtc                       1/1     Running     0               6m45s

Step 20.                       Connect to one of the nvidia-driver-daemonset containers and view the GPU status:

oc exec -it nvidia-driver-daemonset-416.94.202407231922-0-88zcr -- bash

[root@nvidia-driver-daemonset-417 drivers]# nvidia-smi

Thu Mar  6 13:13:33 2025      

+-----------------------------------------------------------------------------------------+

| NVIDIA-SMI 550.144.03             Driver Version: 550.144.03     CUDA Version: 12.4     |

|-----------------------------------------+------------------------+----------------------+

| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |

| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |

|                                         |                        |               MIG M. |

|=========================================+========================+======================|

|   0  NVIDIA L40S                    On  |   00000000:38:00.0 Off |                    0 |

| N/A   28C    P8             34W /  350W |       1MiB /  46068MiB |      0%      Default |

|                                         |                        |                  N/A |

+-----------------------------------------+------------------------+----------------------+

|   1  NVIDIA L40S                    On  |   00000000:D8:00.0 Off |                    0 |

| N/A   27C    P8             35W /  350W |       1MiB /  46068MiB |      0%      Default |

|                                         |                        |                  N/A |

+-----------------------------------------+------------------------+----------------------+

                                                                                         

+-----------------------------------------------------------------------------------------+

| Processes:                                                                              |

|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |

|        ID   ID                                                               Usage      |

|=========================================================================================|

|  No running processes found                                                             |

+-----------------------------------------------------------------------------------------+

Procedure 4.     Enable the GPU Monitoring Dashboard (Optional)

Step 1.    Using https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/enable-gpu-monitoring-dashboard.html, enable the GPU Monitoring Dashboard to monitor GPUs in the OpenShift Web-Console.

Deploy NetApp Trident

NetApp Trident is an open-source, fully supported storage orchestrator for containers and Kubernetes distributions. It was designed to help meet the containerized applications’ persistence demands using industry-standard interfaces, such as the Container Storage Interface (CSI). With Trident, microservices and containerized applications can take advantage of enterprise-class storage services provided by the NetApp portfolio of storage systems. More information about Trident can be found here: NetApp Trident Documentation. NetApp Trident can be installed via different methods; in this solution, NetApp Trident version 25.2.1 is installed using the Trident Operator (installed from OperatorHub).

Trident Operator is a component used to manage the lifecycle of Trident. The operator simplifies the deployment, configuration, and management of Trident. The Trident operator is supported with OpenShift version 4.10 and above.

Note:     In this solution, we validated NetApp Trident with the ontap-nas driver and ontap-nas-flexgroup driver using the NFS protocol. We also validated the ontap-san driver for iSCSI and NVMe-TCP. Make sure to install only the backends and storage classes for the storage protocols you are using.

Procedure 1.     Install the NetApp Trident Operator

In this implementation, NetApp Trident Operator version 25.2.1 or later is installed.

Step 1.    In the OpenShift web console, click Operators > OperatorHub.

Step 2.    Type Trident in the Filter box and then click the NetApp Trident operator. Click Continue to accept the warning about Community Operators. Click Install.

Step 3.    Verify that at least Version 25.2.1 is selected. Click Install.

Step 4.    Once the operator is installed and ready for use, click View Operator.

Step 5.    In the bar to the right of Details, click Trident Orchestrator.

Step 6.    Click Create TridentOrchestrator. Click Create. Wait for the Status to become Installed.

Related image, diagram or screenshot

Step 7.    On the installer VM, check the Trident OpenShift pods after installation:

oc get pods -n trident

NAME                                  READY   STATUS    RESTARTS   AGE

trident-controller-5df9c4b4b5-sdlft   6/6     Running   0          7m57s

trident-node-linux-7pjfj              2/2     Running   0          7m57s

trident-node-linux-j4k92              2/2     Running   0          7m57s

trident-node-linux-kzb6n              2/2     Running   0          7m57s

trident-node-linux-q7ndq              2/2     Running   0          7m57s

trident-node-linux-tl2z8              2/2     Running   0          7m57s

trident-node-linux-vtfr6              2/2     Running   0          7m57s

Procedure 2.     Obtain tridentctl

Step 1.    From the OpenShift directory, download Trident software from GitHub and untar the .gz file to obtain the trident-installer folder:

mkdir trident
cd trident

wget https://github.com/NetApp/trident/releases/download/v25.02.1/trident-installer-25.02.1.tar.gz
tar -xvf trident-installer-25.02.1.tar.gz

Step 2.    Copy tridentctl to /usr/local/bin:

sudo cp trident-installer/tridentctl /usr/local/bin/

Note:     If the NetApp Trident deployment fails and does not bring up the pods to Running state, use the tridentctl logs -l all -n trident command for debugging.

Note:     Before configuring the backends that Trident needs to use for user apps, go to: https://docs.netapp.com/us-en/trident/trident-reference/objects.html#kubernetes-customresourcedefinition-objects to understand the storage environment parameters and its usage in Trident.

Procedure 3.     Configure the Storage Backends in Trident

Step 1.    Configure the connections to the SVM on the NetApp storage array created for the OpenShift installation. For more options regarding storage backend configuration, go to https://docs.netapp.com/us-en/trident/trident-use/backends.html.

Step 2.    Create a backends directory and create the following backend definition files in that directory. Each backend definition includes a name template parameter so that the volume provisioned on storage for a persistent volume receives a name that combines the storage prefix, the backend name, the namespace, and the persistent volume claim (PVC) name (RequestName). For example, with the default prefix, a PVC named test-pvc in the default namespace provisioned through ocp-nfs-backend receives an ONTAP volume name similar to trident_ocp_nfs_backend_default_test_pvc.

Note:     Customizable volume names are compatible with ONTAP on-premises drivers only. Also, these volume names do not apply to existing volumes.

Note:     The following backend definition files use the “StoragePrefix” attribute in the name template. The default value for StoragePrefix is “trident.”

cat backend_NFS.yaml

---

version: 1

storageDriverName: ontap-nas

backendName: ocp-nfs-backend

managementLIF: 10.102.2.50

dataLIF: 192.168.52.51

svm: OCP-SVM

username: vsadmin

password: <password>

useREST: true

defaults:

  spaceReserve: none

  exportPolicy: default

  snapshotPolicy: default

  snapshotReserve: '5'
  nameTemplate: "{{.config.StoragePrefix}}_{{.config.BackendName}}_{{.volume.Namespace}}_{{.volume.RequestName}}"

cat backend_NFS_flexgroup.yaml

---

version: 1

storageDriverName: ontap-nas-flexgroup

backendName: ocp-nfs-flexgroup

managementLIF: 10.102.2.50

dataLIF: 192.168.52.51

svm: OCP-SVM

username: vsadmin

password: <password>

useREST: true

defaults:

  spaceReserve: none

  exportPolicy: default

  snapshotPolicy: default

  snapshotReserve: '5'
  nameTemplate: "{{.config.StoragePrefix}}_{{.config.BackendName}}_{{.volume.Namespace}}_{{.volume.RequestName}}"

cat backend_iSCSI.yaml

---

version: 1

storageDriverName: ontap-san

backendName: ocp-iscsi-backend

managementLIF: 10.102.2.50

svm: OCP-SVM

sanType: iscsi

useREST: true

username: vsadmin

password: <password>

defaults:

  spaceReserve: none

  spaceAllocation: 'false'

  snapshotPolicy: default

  snapshotReserve: '5'
  nameTemplate: "{{.config.StoragePrefix}}_{{.config.BackendName}}_{{.volume.Namespace}}_{{.volume.RequestName}}"

cat backend_NVMe.yaml

---

version: 1

backendName: ocp-nvme-backend

storageDriverName: ontap-san

managementLIF: 10.102.2.50

svm: OCP-SVM

username: vsadmin

password: <password>

sanType: nvme

useREST: true

defaults:

  spaceReserve: none

  snapshotPolicy: default

  snapshotReserve: '5'
  nameTemplate: "{{.config.StoragePrefix}}_{{.config.BackendName}}_{{.volume.Namespace}}_{{.volume.RequestName}}"
 

Step 3.    Activate the storage backends for all storage protocols in your FlexPod:

tridentctl -n trident create backend -f backend_NFS.yaml
tridentctl -n trident create backend -f backend_NFS_flexgroup.yaml
tridentctl -n trident create backend -f backend_iSCSI.yaml
tridentctl -n trident create backend -f backend_NVMe.yaml
tridentctl -n trident get backend

+-------------------+---------------------+--------------------------------------+--------+------------+-----

|       NAME        |   STORAGE DRIVER    |                 UUID                 | STATE  | USER-STATE | VOLU

+-------------------+---------------------+--------------------------------------+--------+------------+-----

| ocp-nfs-backend   | ontap-nas           | 6bcb2421-a148-40bb-b7a4-9231e58efc2a | online | normal     |     

| ocp-nfs-flexgroup | ontap-nas-flexgroup | 68428a01-c5e6-4676-8cb5-e5521fc04bc7 | online | normal     |      

| ocp-iscsi-backend | ontap-san           | bbf1664d-1615-42d3-a5ed-1b8aed995a42 | online | normal     |      

| ocp-nvme-backend  | ontap-san           | 2b6861a2-6980-449a-b718-97002079e7f3 | online | normal     |      

+-------------------+---------------------+--------------------------------------+--------+------------+-----

Step 4.    Create the following Storage Class files:

cat storage-class-ontap-nfs.yaml

---

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

  name: ontap-nfs

  annotations:

    storageclass.kubernetes.io/is-default-class: "true"

provisioner: csi.trident.netapp.io

parameters:

  backendType: "ontap-nas"

  provisioningType: "thin"

  snapshots: "true"

allowVolumeExpansion: true

cat storage-class-ontap-nfs-flexgroup.yaml

---

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

  name: ontap-nfs-flexgroup

  annotations:

    storageclass.kubernetes.io/is-default-class: "false"

provisioner: csi.trident.netapp.io

parameters:

  backendType: "ontap-nas-flexgroup"

  provisioningType: "thin"

  snapshots: "true"

allowVolumeExpansion: true

cat storage-class-ontap-iscsi.yaml

---

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

  name: ontap-iscsi

parameters:

  backendType: "ontap-san"

  sanType: "iscsi"

  provisioningType: "thin"

  snapshots: "true"

allowVolumeExpansion: true

provisioner: csi.trident.netapp.io

cat storage-class-ontap-nvme.yaml

---

apiVersion: storage.k8s.io/v1

kind: StorageClass

metadata:

  name: ontap-nvme-tcp

parameters:

  backendType: "ontap-san"

  sanType: "nvme"

  provisioningType: "thin"

  snapshots: "true"

allowVolumeExpansion: true

provisioner: csi.trident.netapp.io

Step 5.    Create the storage classes:

oc create -f storage-class-ontap-nfs.yaml
oc create -f storage-class-ontap-nfs-flexgroup.yaml
oc create -f storage-class-ontap-iscsi.yaml
oc create -f storage-class-ontap-nvme.yaml

A screenshot of a computerDescription automatically generated

Step 6.    Create a VolumeSnapshotClass file:

cat ontap-volumesnapshot-class.yaml

---

apiVersion: snapshot.storage.k8s.io/v1

kind: VolumeSnapshotClass

metadata:

  name: ontap-snapclass

driver: csi.trident.netapp.io

deletionPolicy: Delete

Step 7.    Create the VolumeSnapshotClass using the above file.

oc create -f ontap-volumesnapshot-class.yaml

Step 8.    Create a test PersistentVolumeClaim (PVC). In the OpenShift console, click Storage > PersistentVolumeClaims. Select an appropriate project (for example, default) or create a new project and select it. On the right, click Create PersistentVolumeClaim.

Step 9.    Select a StorageClass and give the PVC a name. Select an Access mode (RWO or RWX for NFS classes, and RWO for iSCSI or NVMe-TCP classes). Set a size and select a Volume mode (normally Filesystem). Click Create to create the PVC. For illustration, we created a test PVC using the “ontap-nvme-tcp” storage class.
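Alternatively, the same test PVC can be created from the OCP-Installer VM with oc. The following is a minimal sketch; the PVC name and size are illustrative:

cat test-pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: ontap-nvme-tcp

oc create -f test-pvc.yaml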

A screenshot of a computerAI-generated content may be incorrect.

Step 10.                       Wait for the PVC to have a status of Bound. The PVC can now be attached to a container.

A screenshot of a computerAI-generated content may be incorrect.

Step 11.                       Create a NetApp volume snapshot of the PVC by clicking the ellipsis (three dots) to the right of the PVC and selecting Create snapshot. Adjust the snapshot name and click Create. The snapshot will appear under VolumeSnapshots and can also be seen in NetApp ONTAP System Manager under the corresponding PV with a modified name.

A screenshot of a computerAI-generated content may be incorrect.

A screenshot of a computerAI-generated content may be incorrect.

Note:     Make sure the volume name for the PV matches the volume name mapping from the backend configuration in the above screenshot.
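The snapshot can also be created from the command line. The following is a minimal sketch that references the VolumeSnapshotClass created earlier; the snapshot and PVC names are illustrative:

cat test-pvc-snapshot.yaml
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-pvc-snapshot
  namespace: default
spec:
  volumeSnapshotClassName: ontap-snapclass
  source:
    persistentVolumeClaimName: test-pvc

oc create -f test-pvc-snapshot.yaml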

Step 12.                       Delete the test PVC and snapshot. First, select the snapshot under Storage > VolumeSnapshots, click the ellipsis (three dots) to the right of the snapshot, select Delete VolumeSnapshot, and click Delete. Then select the PVC under Storage > PersistentVolumeClaims, click the ellipsis to the right of the PVC, select Delete PersistentVolumeClaim, and click Delete.

NetApp DataOps Toolkit

The NetApp DataOps Toolkit for Kubernetes is a Python library and command-line utility that simplifies the management of development workspaces and data volumes backed by NetApp storage. The version 2.5.0 toolkit is currently compatible with Kubernetes versions 1.20 and above, and OpenShift versions 4.7 and above.

The toolkit is currently compatible with Trident versions 20.07 and above. Additionally, the toolkit is compatible with the following Trident backend types used in this validation:

    ontap-nas

    ontap-nas-flexgroup

More operations and capabilities about NetApp DataOps Toolkit are available and documented here: https://github.com/NetApp/netapp-dataops-toolkit

Prerequisites

The NetApp DataOps Toolkit for Kubernetes requires that Python 3.8 or above be installed on the local host. Additionally, the toolkit requires that pip for Python3 be installed on the local host. For more details regarding pip, including installation instructions, refer to the pip documentation.

Procedure 1.     NetApp DataOps Toolkit Installation

Step 1.    To install the NetApp DataOps Toolkit for Kubernetes on the OCP-Installer VM, run the following commands:

sudo dnf install python3.11
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.11 get-pip.py
rm get-pip.py
python3.11 -m pip install netapp-dataops-k8s

The NetApp DataOps Toolkit is used to create JupyterLab workspaces, clone workspaces, create snapshots of JupyterLab workspaces, and so on, as illustrated below.
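As an illustration, the following commands sketch how a workspace snapshot and clone might be created with the toolkit. The subcommand and flag names shown are assumptions based on the toolkit documentation linked above and should be verified there before use:

# Assumed syntax; verify against the NetApp DataOps Toolkit documentation
# Create a Trident snapshot of an existing JupyterLab workspace
netapp_dataops_k8s_cli.py create jupyterlab-snapshot --workspace-name=<workspace-name>

# Clone an existing JupyterLab workspace into a new workspace
netapp_dataops_k8s_cli.py clone jupyterlab --source-workspace-name=<workspace-name> --new-workspace-name=<new-workspace-name>

# List the JupyterLab workspaces in the current namespace
netapp_dataops_k8s_cli.py list jupyterlabs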

Note:     You can use NetApp DataOps Toolkit to create Jupyter notebooks in this solution. For more information, go to: Create a new JupyterLab workspace.

Add an Additional Administrative User to the OpenShift Cluster

It is recommended to add a permanent administrative user to the OpenShift cluster as an alternative to logging in with the “temporary” kubeadmin user. This section shows how to build and add an HTPasswd user. Other identity providers are also available.

Procedure 1.     Add the admin User

Step 1.    On the OCP-Installer VM in the auth directory where the kubeadmin-password and kubeconfig files are stored, create an admin.htpasswd file by typing:

htpasswd -c -B -b ./admin.htpasswd admin <password>

Adding password for user admin

Step 2.    Using Chrome or Firefox on the OCP-Installer VM, connect to the OpenShift console with the kubeadmin user. In the blue banner near the top of the page, click cluster OAuth configuration.

Step 3.    Use the Add pulldown under Identity providers to select HTPasswd. Click Browse and browse to the admin.htpasswd file created above. Highlight the file and click Select. Click Add. The htpasswd should now show up as an Identity provider.

A screenshot of a computerAI-generated content may be incorrect.

Step 4.    Click View authentication conditions for reconfiguration status and wait for the status to become Available.

Step 5.    Log out of the cluster and log back in with htpasswd and the admin user. Click Skip tour and log out of the cluster.

Step 6.    Log back into the cluster with kube:admin and the kubeadmin user. Select User Management > Users, then select the admin user. Select the RoleBindings tab and click Create binding.

Step 7.    Select Cluster-wide role binding and name the RoleBinding admin-cluster-admin. From the drop-down list under Role name, select the cluster-admin role. Click Create.

A screenshot of a computerAI-generated content may be incorrect.

Step 8.    Select User Management > Users, then select the admin user. Select the RoleBindings tab. Click the ellipses to the right of the user-settings RoleBinding to delete that RoleBinding, leaving only the cluster-admin RoleBinding.

Step 9.    You can now log out of the cluster and log back in with htpasswd and the admin user. On the top left, select the Administrator role. You now have full cluster-admin access to the cluster.
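The same identity provider and role binding can alternatively be configured from the OCP-Installer VM with oc. The following is a minimal sketch based on the standard OpenShift HTPasswd identity provider workflow; the secret and provider names are illustrative:

# Store the htpasswd file as a secret in the openshift-config namespace
oc create secret generic htpasswd-secret --from-file=htpasswd=./admin.htpasswd -n openshift-config

# Reference the secret from the cluster OAuth configuration
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpasswd-secret
EOF

# Grant the admin user full cluster-admin access
oc adm policy add-cluster-role-to-user cluster-admin admin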

Back up Cluster etcd

etcd is the key-value store for OpenShift, which persists the state of all resource objects.

For more information, see: https://docs.openshift.com/container-platform/4.16/backup_and_restore/control_plane_backup_and_restore/backing-up-etcd.html.

Procedure 1.     Back up etcd using a script

Assuming that the OCP-Installer VM is backed up regularly, regular OpenShift etcd backups can be taken and stored on the OCP-Installer VM.

Step 1.    On the OCP-Installer VM, create an etcd-backup directory and, inside it, an etcd-backups directory to store the backups:

cd
cd ocp
mkdir etcd-backup
cd etcd-backup
mkdir etcd-backups

Note:     For more robust storage of etcd backups, an NFS volume can be created on the NetApp storage and mounted as etcd-backups in the example above.
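For example, assuming an NFS volume with junction path /etcd_backups has been created on OCP-SVM, it could be mounted over the backup directory as follows (the NFS LIF address and junction path are assumptions for illustration):

sudo mount -t nfs 192.168.52.51:/etcd_backups /home/admin/ocp/etcd-backup/etcd-backups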

Step 2.    The following script can be created and made executable to create and save the etcd backup:

cat etcd-backup-script


#!/usr/bin/bash
# Run the OpenShift cluster backup script on the first control-plane node
ssh core@<control0 ip> sudo /usr/local/bin/cluster-backup.sh /home/core/assets/backup
# Make the backup files readable so they can be copied off the node
ssh core@<control0 ip> sudo chmod 644 /home/core/assets/backup/*
# Copy the backup files to the OCP-Installer VM and remove them from the node
scp core@<control0 ip>:/home/core/assets/backup/* /home/admin/ocp/etcd-backup/etcd-backups/
ssh core@<control0 ip> sudo rm /home/core/assets/backup/*
# Restrict permissions on the local copies and delete backups older than 30 days
chmod 600 /home/admin/ocp/etcd-backup/etcd-backups/*
find /home/admin/ocp/etcd-backup/etcd-backups -type f -mtime +30 -delete

Note:     This script deletes backups over 30 days old.
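The script can be made executable and run once manually to confirm that a backup lands in the etcd-backups directory before it is scheduled:

chmod +x /home/admin/ocp/etcd-backup/etcd-backup-script
/home/admin/ocp/etcd-backup/etcd-backup-script
ls -l /home/admin/ocp/etcd-backup/etcd-backups/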

Step 3.    Using sudo, add execution of this script to /etc/crontab:

cat /etc/crontab

SHELL=/bin/bash

PATH=/sbin:/bin:/usr/sbin:/usr/bin

MAILTO=root

 

# For details see man 4 crontabs

 

# Example of job definition:

# .---------------- minute (0 - 59)

# |  .------------- hour (0 - 23)

# |  |  .---------- day of month (1 - 31)

# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...

# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat

# |  |  |  |  |

# *  *  *  *  * user-name  command to be executed

  0  2  *  *  * admin      /home/admin/ocp/etcd-backup/etcd-backup-script

Note:     This example backs up etcd data daily at 2:00 am.

Step 4.     In the event that an etcd restore is needed, the appropriate backup files would need to be copied from the OCP-Installer VM back to a working control-plane node:

ssh core@<control0 ip> sudo scp admin@<ocp installer vm IP>:/home/admin/ocp/etcd-backup/etcd-backups/snapshot_2024-11-12_165737.db /home/core/assets/backup/

ssh core@<control0 ip> sudo scp admin@<ocp installer vm IP>:/home/admin/ocp/etcd-backup/etcd-backups/static_kuberesources_2024-11-12_170543.tar.gz /home/core/assets/backup/

Step 5.    To recover the cluster, see https://docs.openshift.com/container-platform/4.17/hosted_control_planes/hcp_high_availability/hcp-recovering-etcd-cluster.html#hcp-recovering-etcd-cluster.

Add a Worker Node to an OpenShift Cluster

It is often necessary to scale up an OpenShift cluster by adding worker nodes to the cluster. This set of procedures describes the steps to add a node to the cluster. These procedures require a Cisco UCS Server connected to a set of Fabric Interconnects with all VLANs in the Server Profile configured.

Procedure 1.     Deploy a Cisco UCS Server Profile

Deploy a Cisco UCS Server Profile in Cisco Intersight.

Step 1.    Depending on the type of server being added (Cisco UCS X-Series or Cisco UCS C-Series), clone the existing OCP-Worker template and adjust the new template for that server type.

Step 2.    From the Configure > Templates page, click the ellipsis (three dots) to the right of the OCP-Worker template set up above and select Derive Profiles.

Step 3.    Under the Server Assignment, select Assign Now and select the Cisco UCS server that will be added to the cluster as a Worker Node. Click Next.

Step 4.    Assign the Server Profile an appropriate Name (for example, ocp-worker3) and select the appropriate Organization. Click Next.

Step 5.    Click Derive.

Step 6.    From the Infrastructure Service > Profiles page, click the ellipsis (three dots) to the right of the just-created profile and select Deploy. Select Reboot Immediately to Activate and click Deploy.

Step 7.    Wait until the profile deploys and activates.

Step 8.    Click the server profile and use the Configuration > Identifiers and Inventory tabs to note the server’s management IP address, serial number, and the MAC address of network interface eno5.

Procedure 2.     Create the Bare Metal Host (BMH)

Step 1.    On the OCP-Installer VM, create the following yaml file (the example shown is for worker node worker3.<domain-name>.<base-domain>):

cat bmh.yaml

---

apiVersion: v1

kind: Secret

metadata:

  name: worker3-bmc-secret

  namespace: openshift-machine-api

type: Opaque

data:

  username: ZmxleGFkbWlu

  password: SDFnaJQwbJQ=

---

apiVersion: metal3.io/v1alpha1

kind: BareMetalHost

metadata:

  name: worker3.ocp.flexpodb4.cisco.com

  namespace: openshift-machine-api

spec:

  online: True

  bootMACAddress: 00:25:B5:A2:0A:1B

  bmc:

    address: redfish://10.102.2.238/redfish/v1/Systems/WZP27020EG1

    credentialsName: worker3-bmc-secret

    disableCertificateVerification: True

  customDeploy:

    method: install_coreos

  externallyProvisioned: true

Note:     The username and password shown in this file are base64 encoded and can be obtained by typing “echo -ne <username> | base64”. In this case typing “echo -ne flexadmin | base64” yielded ZmxleGFkbWlu.
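For example, both values for the Secret can be generated on the OCP-Installer VM:

echo -ne flexadmin | base64        # username -> ZmxleGFkbWlu
echo -ne '<password>' | base64     # password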

Note:     Also note the bmc address. In this case, Redfish is used to connect to the server, and the URL ends with the server serial number. To use IPMI over LAN instead of Redfish, specify the server’s management IP for the bmc address.

Step 2.    Create the Bare Metal Host by typing the following:

oc project openshift-machine-api
oc create -f bmh.yaml

Step 3.    Verify that the BMH is created by selecting Compute > Bare Metal Hosts in the OpenShift Console.

Related image, diagram or screenshot

Note:     With this method of creating the BMH, the server is not inspected and some details such as Serial Number, Network Interfaces, and Disks are not retrieved from the server, but the Power Management functions do work.

Step 4.    In the OpenShift Console, select Compute > MachineSets. Click the ellipsis (three dots) to the right of the worker MachineSet and choose Edit Machine count. Use the plus sign to increase the count by one. Click Save.

Step 5.    Click Compute > Machines. A new machine in the Provisioning phase should now appear in the list.

Related image, diagram or screenshot

Procedure 3.     Install Red Hat CoreOS on the New Worker

Step 1.    Connect to the Red Hat Hybrid Cloud Console here: https://console.redhat.com/openshift/overview and log in with your Red Hat credentials. On the left, select Cluster List. Under Cluster List, click your cluster to open it.

Step 2.    Select the Add Hosts tab. Click Add hosts.

Step 3.    Do not change the field settings and click Next.

Step 4.    For Provisioning type, select Full image file. Browse to and select the SSH public key file used in the original cluster installation. Click Generate Discovery ISO.

Step 5.    If your Cisco UCS Servers have the Intersight Advantage license installed, follow the procedure from Step 12 to use the Cisco Intersight workflow to boot the server with the Discovery ISO. Then skip to Step 15.

Step 6.    Click Download Discovery ISO. The file will download to your machine. Click Close.

Note:     This is a slightly different ISO than the one used to install the cluster and must be downloaded to successfully add a node.

Step 7.    Place the downloaded Discovery ISO on your HTTP or HTTPS web server and use a web browser to obtain the URL of the ISO.

Step 8.     In Cisco Intersight, edit the Virtual Media Policy that is part of the server profile. On the Policy Details page, select Add Virtual Media.

Step 9.    In the Add Virtual Media dialogue, leave CDD selected and select HTTP/HTTPS. Provide a name for the mount and add the URL for File Location.

A screenshot of a computerAI-generated content may be incorrect.

Step 10.                       Click Add.

Step 11.                       Click Save.

Step 12.                       Under Infrastructure Service > Profiles, click the three dots to the right of the newly added worker server profile and select Deploy. Select only the bottom checkbox and select Deploy.

Note:     It is not necessary to redeploy the remaining server profiles. The Inconsistent status will be resolved after CoreOS is installed on the newly added worker.

Step 13.                       Click the ellipsis (three dots) to the right of the newly added worker profile and select Server Actions > Power > Power Cycle. In the popup, click Power Cycle. The reboot from the Discovery ISO can be monitored with a vKVM Console (Server Actions > Launch vKVM).

Step 14.                       Once the server has booted from the Discovery ISO, return to the Red Hat Hybrid Cloud Console. The newly added worker should appear in a few minutes. Wait for the Status to become Ready.

A screenshot of a computerAI-generated content may be incorrect.

Step 15.                       Click Install ready hosts. The installation of CoreOS will take several minutes.

Note:     Once the CoreOS installation completes (Status of Installed), the server will reboot, boot CoreOS, and reboot a second time.

Step 16.                       In Cisco Intersight, edit the vMedia policy and remove the virtual media mount. Go to the Infrastructure Service > Profiles page and deploy the newly added worker profile without rebooting the host. The Inconsistent state on the remaining profiles should clear.

Step 17.                       In the OpenShift Console, select Compute > Nodes. Once the server reboots have completed, the newly added worker will appear in the list as Discovered. Click Discovered and then select Approve. Click Not Ready and select Approve.

Step 18.                       To link the Bare Metal Host to the Machine, select Compute > Machines. For the newly-added machine in the Provisioning Phase, note the last five characters in the machine name (for example, bqz2k).

Related image, diagram or screenshot

Step 19.                       Select Compute > Bare Metal Hosts. Select the BMH above the newly added BMH (for example, worker2). Select the YAML tab. Select and copy the entire consumerRef field right underneath the externallyProvisioned field.

A screenshot of a computer programAI-generated content may be incorrect.

Step 20.                       Select Compute > Bare Metal Hosts. Select the newly added BMH (for example, worker3). Select the YAML tab. Place the cursor at the end of the externallyProvisioned: true line and press Enter to insert a new line. Backspace to the beginning of the line and then paste in the consumerRef field from the previous step. Replace the last five characters in the name field with the five characters noted above (for example, bqz2k).

A screen shot of a computerDescription automatically generated
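If preferred, the same consumerRef can be set with oc patch instead of the console YAML editor. The following is a hedged sketch; the Machine name is a placeholder for the value noted in Step 18, and the apiVersion and kind values should match the consumerRef copied from the existing worker BMH:

oc -n openshift-machine-api patch baremetalhost worker3.ocp.flexpodb4.cisco.com --type merge -p \
'{"spec":{"consumerRef":{"apiVersion":"machine.openshift.io/v1beta1","kind":"Machine","name":"<machine-name-ending-in-bqz2k>","namespace":"openshift-machine-api"}}}'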

Step 21.                       Click Save. Click Compute > Machines. The newly added machine should now be in the Provisioned Phase.

Related image, diagram or screenshot

Step 22.                       To link this machine to the node, click this newly added machine and select the YAML tab. Under spec, select and copy the entire providerID line.

A screenshot of a computerAI-generated content may be incorrect.

Step 23.                       Select Compute > Nodes. Select the newly added node and select the YAML tab. Scroll down to find the spec field. Select and delete the {} to the right of spec: and press Enter to add a line. Paste in the providerID line with a two-space indentation and click Save.

Note:     The OpenShift node objects update frequently. If an update occurs while you are editing, it will be necessary to reload the YAML tab; after reloading, you may need to make the changes again.

Related image, diagram or screenshot

Step 24.                       Select Compute > Bare Metal Hosts. The newly-added BMH should now be linked to a node.

Related image, diagram or screenshot

Step 25.                       Select Compute > Machines. The newly-added machine should now be in the Provisioned as node Phase and should be linked to the node.

Related image, diagram or screenshot

Deploy a Sample Containerized Application

To demonstrate the installation of Red Hat OpenShift on Bare Metal on FlexPod Datacenter, a sample containerized application can be installed and run. In this case, Stable Diffusion XL 1.0 will be run utilizing the Intel CPUs. If you have NVIDIA GPUs installed, refer to FlexPod Datacenter with Generative AI Inferencing - Cisco for details on deploying Stable Diffusion utilizing an NVIDIA GPU. This installation uses the NetApp DataOps Toolkit, installed above, to deploy a JupyterLab workspace and then Intel OpenVINO to run Stable Diffusion XL.

Procedure 1.     Deploy Jupyter Notebook

Step 1.    From the OCP-Installer VM, run the following command to deploy a JupyterLab workspace with a 90Gi NFS persistent volume, no GPUs, and the latest PyTorch container available at the time of this validation:

netapp_dataops_k8s_cli.py create jupyterlab --workspace-name=sd-xl -c ontap-nfs --size=90Gi --nvidia-gpu=0 -i nvcr.io/nvidia/pytorch:25.03-py3

Step 2.    Enter and verify a password for the notebook. The notebook is created in the ‘default’ namespace. The deployment will take a few minutes to reach Ready state:

Setting workspace password (this password will be required in order to access the workspace)...

Enter password:

Verify password:

 

Creating persistent volume for workspace...

Creating PersistentVolumeClaim (PVC) 'ntap-dsutil-jupyterlab-sd-xl' in namespace 'default'.

PersistentVolumeClaim (PVC) 'ntap-dsutil-jupyterlab-sd-xl' created. Waiting for Kubernetes to bind volume to PVC.

Volume successfully created and bound to PersistentVolumeClaim (PVC) 'ntap-dsutil-jupyterlab-sd-xl' in namespace 'default'.

 

Creating Service 'ntap-dsutil-jupyterlab-sd-xl' in namespace 'default'.

Service successfully created.

 

Creating Deployment 'ntap-dsutil-jupyterlab-sd-xl' in namespace 'default'.

Deployment 'ntap-dsutil-jupyterlab-sd-xl' created.

Waiting for Deployment 'ntap-dsutil-jupyterlab-sd-xl' to reach Ready state.

Deployment successfully created.

 

Workspace successfully created.

To access workspace, navigate to http://10.102.2.211:31809

Step 3.    Once the Workspace is successfully created, use a Web browser on a machine with access to the Baremetal subnet to connect to the provided URL. Log in with the password provided.

A screenshot of a computerDescription automatically generated

Step 4.    Click the Terminal icon to launch a terminal in the PyTorch container. The Stable Diffusion XL 1.0 model by default is stored in /root/.cache. To redirect this to the persistent storage (mounted on /workspace), run the following commands:

mkdir /workspace/.cache
cp -R /root/.cache/* /workspace/.cache/
rm -rf /root/.cache
ln -s /workspace/.cache /root/.cache

Step 5.    Install Diffusers and OpenVINO:

pip install --upgrade diffusers transformers scipy accelerate

pip install optimum[openvino]
pip install openvino==2024.6.0

Step 6.    Click the + icon to add a window and select Python File. Add the following:

from optimum.intel import OVStableDiffusionXLPipeline

# Load the Stable Diffusion XL base model as an OpenVINO pipeline for CPU inference.
# On the first run, the model is downloaded and cached (redirected to /workspace/.cache above).
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id)

# Generate an image from the text prompt and save it to the workspace
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k resolution"
image = pipeline(prompt).images[0]
image.save("astronaut_intel.png")

Step 7.    Right-click untitled.py and select Rename Python File. Name the file Run-SDXL.py and choose Rename. Click the x to the right of Run-SDXL.py to close the file and click Save.

Step 8.    In the Terminal window, run Stable Diffusion XL by typing python Run-SDXL.py. On the first run, the Stable Diffusion XL model will be downloaded to persistent storage. Subsequent runs will take less time.

Step 9.    Once the run is complete, double-click the astronaut_intel.png file from the list on the left.

A screenshot of a computer screenDescription automatically generated

Step 10.                       From the OpenShift console, on the left click Workloads > Pods. In the center pane, from the drop-down list select the default Project.

A screenshot of a computerAI-generated content may be incorrect.

Step 11.                       On the left, select Deployments. In the center pane, select the Jupyterlab Deployment and then select the YAML tab. This information can be used as a guide for creating a YAML file to deploy the pod from the command line with “oc,” as sketched below. The YAML can also be modified to customize the deployment. If you edit the Deployment, you will need to delete the corresponding pod so that a new container is spun up, and you will then need to re-create the symbolic link and reinstall the Python libraries with pip.
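As a hedged illustration only (the names and settings shown are assumptions, not the exact YAML generated by the DataOps Toolkit), a command-line Deployment of this kind of workspace pod would resemble the following. The toolkit-generated Deployment additionally includes the JupyterLab startup command, ports, and resource settings:

cat sd-xl-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sd-xl-workspace
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sd-xl-workspace
  template:
    metadata:
      labels:
        app: sd-xl-workspace
    spec:
      containers:
      - name: pytorch
        image: nvcr.io/nvidia/pytorch:25.03-py3
        command: ["sleep", "infinity"]   # placeholder; the toolkit-generated Deployment starts JupyterLab here
        volumeMounts:
        - name: workspace
          mountPath: /workspace
      volumes:
      - name: workspace
        persistentVolumeClaim:
          claimName: ntap-dsutil-jupyterlab-sd-xl

oc apply -f sd-xl-deployment.yaml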

About the Authors

John George, Technical Marketing Engineer, Cisco Systems, Inc.

John has been involved in designing, developing, validating, and supporting the FlexPod Converged Infrastructure since it was developed more than 13 years ago. Before his role with FlexPod, he supported and administered a large worldwide training network and VPN infrastructure. John holds a master’s degree in Computer Engineering from Clemson University.

Kamini Singh, Technical Marketing Engineer, Hybrid Cloud Infra & OEM Solutions, NetApp

Kamini Singh is a Technical Marketing engineer at NetApp. She has more than five years of experience in data center infrastructure solutions. Kamini focuses on FlexPod hybrid cloud infrastructure solution design, implementation, validation, automation, and sales enablement. Kamini holds a bachelor’s degree in Electronics and Communication and a master’s degree in Communication Systems.

Acknowledgements

For their support and contribution to the design, validation, and creation of this Cisco Validated Design, the authors would like to thank:

    Archana Sharma, Technical Marketing Engineer, Cisco Systems, Inc.

    Paniraja Koppa, Technical Marketing Engineer, Cisco Systems, Inc.

Feedback

For comments and suggestions about this guide and related guides, join the discussion on Cisco Community at https://cs.co/en-cvds.

CVD Program

ALL DESIGNS, SPECIFICATIONS, STATEMENTS, INFORMATION, AND RECOMMENDATIONS (COLLECTIVELY, "DESIGNS") IN THIS MANUAL ARE PRESENTED "AS IS," WITH ALL FAULTS. CISCO AND ITS SUPPLIERS DISCLAIM ALL WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THE DESIGNS, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

THE DESIGNS ARE SUBJECT TO CHANGE WITHOUT NOTICE. USERS ARE SOLELY RESPONSIBLE FOR THEIR APPLICATION OF THE DESIGNS. THE DESIGNS DO NOT CONSTITUTE THE TECHNICAL OR OTHER PROFESSIONAL ADVICE OF CISCO, ITS SUPPLIERS OR PARTNERS. USERS SHOULD CONSULT THEIR OWN TECHNICAL ADVISORS BEFORE IMPLEMENTING THE DESIGNS. RESULTS MAY VARY DEPENDING ON FACTORS NOT TESTED BY CISCO.

CCDE, CCENT, Cisco Eos, Cisco Lumin, Cisco Nexus, Cisco StadiumVision, Cisco TelePresence, Cisco WebEx, the Cisco logo, DCE, and Welcome to the Human Network are trademarks; Changing the Way We Work, Live, Play, and Learn and Cisco Store are service marks; and Access Registrar, Aironet, AsyncOS, Bringing the Meeting To You, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, CCSP, CCVP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Cisco Unified Computing System (Cisco UCS), Cisco UCS B-Series Blade Servers, Cisco UCS C-Series Rack Servers, Cisco UCS S-Series Storage Servers, Cisco UCS X-Series, Cisco UCS Manager, Cisco UCS Management Software, Cisco Unified Fabric, Cisco Application Centric Infrastructure, Cisco Nexus 9000 Series, Cisco Nexus 7000 Series. Cisco Prime Data Center Network Manager, Cisco NX-OS Software, Cisco MDS Series, Cisco Unity, Collaboration Without Limitation, EtherFast, EtherSwitch, Event Center, Fast Step, Follow Me Browsing, FormShare, GigaDrive, HomeLink, Internet Quotient, IOS, iPhone, iQuick Study, LightStream, Linksys, MediaTone, MeetingPlace, MeetingPlace Chime Sound, MGX, Networkers, Networking Academy, Network Registrar, PCNow, PIX, PowerPanels, ProConnect, ScriptShare, SenderBase, SMARTnet, Spectrum Expert, StackWise, The Fastest Way to Increase Your Internet Quotient, TransPath, WebEx, and the WebEx logo are registered trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries. (LDW_P5)

All other trademarks mentioned in this document or website are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (0809R)

Learn more