

Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data with Pivotal HD and HAWQ

Table Of Contents

About the Authors

Acknowledgment

About Cisco Validated Design (CVD) Program

Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data with Pivotal HD and HAWQ

Audience

Introduction

Cisco UCS Common Platform Architecture for Big Data

Pivotal HD and HAWQ

Key Features and Benefits

Solution Overview

Rack and PDU Configuration

Server Configuration and Cabling

Software Distributions and Versions

Pivotal HD

HAWQ

RHEL

Software Versions

Fabric Configuration

Performing Initial Setup of Cisco UCS 6296 Fabric Interconnects

Configure Fabric Interconnect A

Configure Fabric Interconnect B

Logging Into Cisco UCS Manager

Upgrading UCSM Software to Version 2.2(1b)

Adding Block of IP Addresses for KVM Access

Editing the Chassis/FEX Discovery Policy

Enabling the Server Ports and Uplink Ports

Creating Pools for Service Profile Templates

Creating an Organization

Creating MAC Address Pools

Configuring VLANs

Creating Server Pool

Creating Policies for Service Profile Templates

Creating a Host Firmware Package Policy

Creating QoS Policies

Creating the Best Effort Policy

Creating a Platinum Policy

Setting Jumbo Frames

Creating a Local Disk Configuration Policy

Creating a Server BIOS Policy

Creating a Boot Policy

Creating a Service Profile Template

Configuring Network Settings for the Template

Configuring a Storage Policy for the Template

Configuring a vNIC/vHBA Placement for the Template

Configuring a Server Boot Order for the Template

Configuring Server Assignment for the Template

Configuring Operational Policies for the Template

Configuring Disk Drives for Operating System on NameNodes

Configuring Disk Drives for Operating System on DataNodes

Installing Red Hat Linux 6.4 with KVM

Post OS Install Configuration

Setting Up Password-less Login

Installing and Configuring Parallel SSH

Installing Parallel-SSH

Installing Cluster Shell

Configuring /etc/hosts and DNS

Creating RedHat Local Repository

Upgrading LSI driver

Installing httpd

Enabling Syslog

Setting Ulimit

Disabling SELinux

JDK Installation

Download Java SE 7 Development Kit (JDK)

Install JDK7 on All Nodes

Setting TCP Retries

Disabling the Linux Firewall

Configuring Data Drives on Data Nodes

Configuring the Filesystem for DataNodes

Installing Pivotal HD Using Pivotal Command Center

Role Assignment

Installing Command Center

Repo to Install PHD Services

Enable the PHD Services

Launching Pivotal Command Center

Configuring and Deploying a Cluster

Configuration Directives for PHD Services

Cluster Services - Global Configuration Variables (Cluster Config.xml)

HDFS

YARN

Post Installation for HAWQ

Starting the Cluster

Initializing HAWQ

Pivotal Command Center Dashboard

Conclusion

Bill of Materials


Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data with Pivotal HD and HAWQ
Building a 64-Node Hadoop Cluster with Pivotal HD for Apache Hadoop with YARN and HAWQ
Last Updated: February 5, 2014

Building Architectures to Solve Business Problems

About the Authors

Raghunath Nambiar, Cisco Systems

Raghunath Nambiar is a Distinguished Engineer at Cisco's Data Center Business Group. His current responsibilities include emerging technologies and big data strategy.

Suhas Gogate, Pivotal

Suhas Gogate is a Lead Architect in the Hadoop Engineering group, focused on the overall design and architecture of the Pivotal Hadoop (PHD) distribution and its integration with the Cloud Foundry platform. He is also a founder and PMC member of the Apache Ambari project and a lead contributor to Hadoop Vaidya, the Hadoop performance advisor project under Apache Hadoop.

Karthik Kulkarni, Cisco Systems

Karthik Kulkarni is a Technical Marketing Engineer in the Data Center Solutions Group at Cisco Systems. He is part of the solution engineering team focusing on big data infrastructure and performance.

Acknowledgment

The authors acknowledge the contributions of Manan Trivedi, Ashwin Manjunatha, Don Turnbull, Rui Xiao, Joel Dodd, and Sindhu Sudhir in developing the Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data with Pivotal HD and HAWQ.

About Cisco Validated Design (CVD) Program


The CVD program consists of systems and solutions designed, tested, and documented to facilitate faster, more reliable, and more predictable customer deployments. For more information, visit:

http://www.cisco.com/go/designzone

ALL DESIGNS, SPECIFICATIONS, STATEMENTS, INFORMATION, AND RECOMMENDATIONS (COLLECTIVELY, "DESIGNS") IN THIS MANUAL ARE PRESENTED "AS IS," WITH ALL FAULTS. CISCO AND ITS SUPPLIERS DISCLAIM ALL WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THE DESIGNS, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

THE DESIGNS ARE SUBJECT TO CHANGE WITHOUT NOTICE. USERS ARE SOLELY RESPONSIBLE FOR THEIR APPLICATION OF THE DESIGNS. THE DESIGNS DO NOT CONSTITUTE THE TECHNICAL OR OTHER PROFESSIONAL ADVICE OF CISCO, ITS SUPPLIERS OR PARTNERS. USERS SHOULD CONSULT THEIR OWN TECHNICAL ADVISORS BEFORE IMPLEMENTING THE DESIGNS. RESULTS MAY VARY DEPENDING ON FACTORS NOT TESTED BY CISCO.

CCDE, CCENT, Cisco Eos, Cisco Lumin, Cisco Nexus, Cisco StadiumVision, Cisco TelePresence, Cisco WebEx, the Cisco logo, DCE, and Welcome to the Human Network are trademarks; Changing the Way We Work, Live, Play, and Learn and Cisco Store are service marks; and Access Registrar, Aironet, AsyncOS, Bringing the Meeting To You, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, CCSP, CCVP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Cisco Unity, Collaboration Without Limitation, EtherFast, EtherSwitch, Event Center, Fast Step, Follow Me Browsing, FormShare, GigaDrive, HomeLink, Internet Quotient, IOS, iPhone, iQuick Study, IronPort, the IronPort logo, LightStream, Linksys, MediaTone, MeetingPlace, MeetingPlace Chime Sound, MGX, Networkers, Networking Academy, Network Registrar, PCNow, PIX, PowerPanels, ProConnect, ScriptShare, SenderBase, SMARTnet, Spectrum Expert, StackWise, The Fastest Way to Increase Your Internet Quotient, TransPath, WebEx, and the WebEx logo are registered trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.

All other trademarks mentioned in this document or website are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (0809R)

© 2014 Cisco Systems, Inc. All rights reserved.

Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data with Pivotal HD and HAWQ


Audience

This document describes the architecture and deployment procedures of the Pivotal HD distribution for Apache Hadoop with YARN and HAWQ on a 64-node cluster based on the Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data. The intended audience of this document includes, but is not limited to, sales engineers, field consultants, professional services, IT managers, partner engineering, and customers who want to deploy Pivotal HD and HAWQ on the Cisco UCS CPA v2 for Big Data.

Introduction

Hadoop has become a strategic data platform embraced by mainstream enterprises because it offers the fastest path for businesses to unlock value in big data while maximizing existing investments. The Pivotal HD distribution for Apache Hadoop is based on the second generation of MapReduce with YARN and is truly enterprise grade, having been built, tested, and hardened with enterprise rigor. The combination of Pivotal HD and Cisco UCS provides an industry-leading platform for Hadoop-based applications.

Cisco UCS Common Platform Architecture for Big Data

Cisco UCS Common Platform Architecture (CPA) is a popular big data solution that has been widely adopted in finance, healthcare, service provider, entertainment, insurance, and public-sector environments. The new Cisco UCS CPA Version 2 (v2) for Big Data improves both performance and capacity. With complete, easy-to-order packages that include computing, storage, connectivity, and unified management features, Cisco UCS CPA v2 for Big Data helps enable rapid deployment, delivers predictable performance, and reduces total cost of ownership (TCO). Cisco UCS CPA v2 for Big Data offers:

Cisco UCS servers with the versatile Intel® Xeon® E5-2600 v2 product family

Transparent cache acceleration option with Cisco UCS Nytro MegaRAID technology

Unified management and unified fabric across enterprise applications.

The Cisco UCS solution for Pivotal HD and HAWQ is based on the Cisco UCS Common Platform Architecture Version 2 (CPA v2) for Big Data, a highly scalable architecture designed to meet a variety of scale-out application demands, with seamless data integration and management integration capabilities. It is built using the following components:

Cisco UCS 6200 Series Fabric Interconnects—provide high-bandwidth, low-latency connectivity for servers, with integrated, unified management provided for all connected devices by Cisco UCS Manager. Deployed in redundant pairs, Cisco fabric interconnects offer the full active-active redundancy, performance, and exceptional scalability needed to support the large number of nodes that are typical in clusters serving Big Data applications. Cisco UCS Manager enables rapid and consistent server configuration using service profiles and automation of the ongoing system maintenance activities such as firmware updates across the entire cluster as a single operation. Cisco UCS Manager also offers advanced monitoring with options to raise alarms and send notifications about the health of the entire cluster.

Cisco UCS 2200 Series Fabric Extenders—extend the network into each rack, acting as remote line cards for fabric interconnects and providing highly scalable and extremely cost-effective connectivity for a large number of nodes.

Cisco UCS C-Series Rack-Mount Servers—Cisco UCS C240M3 Rack-Mount Servers are 2-socket servers based on Intel Xeon E5-2600 v2 series processors, supporting up to 768 GB of main memory. Twenty-four Small Form Factor (SFF) disk drives are supported in the performance-optimized option, and 12 Large Form Factor (LFF) disk drives are supported in the capacity-optimized option, along with 4 Gigabit Ethernet LAN-on-motherboard (LOM) ports.

Cisco UCS Virtual Interface Cards (VICs)—the unique Cisco UCS Virtual Interface Cards incorporate next-generation converged network adapter (CNA) technology from Cisco, and offer dual 10Gbps ports designed for use with Cisco UCS C-Series Rack-Mount Servers. Optimized for virtualized networking, these cards deliver high performance and bandwidth utilization and support up to 256 virtual devices.

Cisco UCS Manager—resides within the Cisco UCS 6200 Series Fabric Interconnects. It makes the system self-aware and self-integrating, managing the system components as a single logical entity. Cisco UCS Manager can be accessed through an intuitive graphical user interface (GUI), a command-line interface (CLI), or an XML application-programming interface (API). Cisco UCS Manager uses service profiles to define the personality, configuration, and connectivity of all resources within Cisco UCS, radically simplifying provisioning of resources so that the process takes minutes instead of days. This simplification allows IT departments to shift their focus from constant maintenance to strategic business initiatives.

Pivotal HD and HAWQ

Pivotal offers an enterprise-ready, fully supported Hadoop distribution that allows enterprises to accelerate their Hadoop investment. The Pivotal Hadoop distribution, Pivotal HD, enables enterprises to harness, and quickly gain insight from, the massive amounts of data being generated by new apps, systems, machines, and the torrent of customer sources.

Pivotal HAWQ adds SQL's expressive power to Hadoop to accelerate data analytics projects, simplify development while increasing productivity, expand Hadoop's capabilities, and cut costs. By adding rich, proven, parallel SQL processing facilities, HAWQ can help your organization render Hadoop queries faster than any other Hadoop-based query interface on the market. HAWQ leverages your existing business intelligence and analytics products and your workforce's SQL skills to bring more than 100X performance improvement to a wide range of query types and workloads. A fast SQL query engine on Hadoop, HAWQ is 100 percent SQL compliant.

Key Features and Benefits

Pivotal HD is a commercially supported, enterprise-capable distribution of the Apache Hadoop stack. It includes the Hadoop Distributed File System (HDFS), MapReduce, Hive, Pig, HBase, ZooKeeper, YARN, and Mahout. Pivotal also includes a series of value-added services that help enterprises manage and operate an enterprise-class Hadoop distribution.

Simple and Complete Cluster Management: Command Center: Command Center is a robust cluster management tool that allows your users to install, configure, monitor and manage Hadoop components and services through a Web graphical interface. It simplifies Hadoop cluster installation, upgrading and expansion using a comprehensive dashboard with instant views of the health of the cluster and key performance metrics. Users can view live and historical information about the host, application and job-level metrics across the entire Pivotal HD cluster. Command Center also provides Command-Line Interface and Web Services APIs for integration into enterprise monitoring services.

Ingest Management for Hadoop Clusters: Data Loader: Pivotal HD includes a Data Loader that accelerates data ingest, with parallelized HDFS data loading that is faster and easier to use than native methods of ingesting data into the HDFS file system. Data Loader allows users to load large numbers of data feeds in real time, with linear scalability support. An advanced big data ingesting tool, Data Loader can be used to load petabytes of data into the Pivotal HD platform. It utilizes the MapReduce distributed processing paradigm to load data at wire speed. Data Loader also provides a pipeline for moving big data in bulk or as streams in parallel, and it supports bulk/batch loading with high throughput for big data and streaming with low latency for fast data.

Easily accessed through a highly interactive graphical web interface, Data Loader lets you deploy code, partition data into chunks, split jobs into multiple tasks and schedule the tasks, while taking into account data locality and network topology. It also handles job failures. Data Loader allows you to easily migrate data between large data cluster deployments. Users can also stage and batch data loading for offline data analytics, as well as real-time data streaming for online incremental data analytics.

Storage Layer Abstraction: Unified Storage Service: In a large enterprise, it is not uncommon to have big data in different formats and sizes, stored across different file systems. Moreover, enterprises typically have a multitude of storage systems with gold mines of information that can be put to use for strategic insights. Yet moving this data to a central "data lake" environment would be time consuming and costly. While the Apache Hadoop distribution provides a variety of file systems that you can use to read data, the parallel processing paradigm works best when the data is already in HDFS.

Unified Storage Service (USS) is a service on Pivotal HD that provides a unified namespace view of data across multiple file storage systems (e.g., other HDFS, NFS shares, FTP Site and Isilon). USS enables your users to access data across multiple file systems, without copying the data "to and from" HDFS. USS is implemented as a pseudo Hadoop file system (HDFS) that delegates file system operations directed at it to other file systems in an HDFS-like way. It mounts multiple file systems and maintains a centralized view of the mount points, which are accessible through the URI scheme.

Distributed Processing Solutions with Apache Hadoop: Spring Data makes it easier for your organization to build Spring-powered applications that use new data-access technologies such as non-relational databases, map-reduce frameworks, and cloud-based data services. Spring for Apache Hadoop simplifies developing big-data applications by providing a unified configuration model and easy-to-use APIs for using HDFS, MapReduce, Pig and Hive. It also provides integration with other Spring ecosystem projects such as Spring Integration and Spring Batch, enabling users to develop solutions for big data ingest/export and Hadoop workflow orchestration.

Solution Overview

The current version of the Cisco UCS CPA v2 for Big Data offers the following configurations, depending on compute and storage requirements:

Table 1 Cisco UCS CPA v2 Configuration Details

Performance and Capacity Balanced

16 Cisco UCS C240 M3 Rack Servers, each with:

2 Intel Xeon E5-2660 v2 processors

256 GB of memory

LSI MegaRAID 9271CV-8i card

24 1TB 7.2K RPM SFF SAS drives (384 TB total)

Capacity Optimized for Pivotal HD

16 Cisco UCS C240 M3 Rack Servers, each with:

2 Intel Xeon E5-2640 v2 processors

128 GB of memory

LSI MegaRAID 9271CV-8i card

12 4TB 7.2K RPM LFF SAS drives (768 TB total)

Capacity Optimized for Pivotal HD with HAWQ

16 Cisco UCS C240 M3 Rack Servers, each with:

2 Intel Xeon E5-2670 v2 processors

128 GB of memory

LSI MegaRAID 9271CV-8i card

12 4TB 7.2K RPM LFF SAS drives (768 TB total)



Note For running only Pivotal HD without HAWQ, the Capacity Optimized configuration is recommended.


This CVD describes the installation process for a 64-node Capacity Optimized for Pivotal HD with HAWQ configuration.

The Capacity Optimized for Pivotal HD with HAWQ cluster configuration consists of the following:

Two Cisco UCS 6296UP Fabric Interconnects

Eight Cisco Nexus 2232PP Fabric Extenders (two per rack)

64 Cisco UCS C240M3 Rack-Mount Servers (16 per rack)

Four Cisco R42610 standard racks

Eight vertical power distribution units (PDU) (country specific)

Rack and PDU Configuration

Each rack consists of two vertical PDUs. The master rack consists of two Cisco UCS 6296UP Fabric Interconnects, two Cisco Nexus 2232PP Fabric Extenders, and sixteen Cisco UCS C240M3 Servers, each connected to both vertical PDUs for redundancy, thereby ensuring availability in the event of a power source failure. Similarly, each expansion rack consists of two Cisco Nexus 2232PP Fabric Extenders and sixteen Cisco UCS C240M3 Servers, each connected to both vertical PDUs for redundancy.


Note Contact your Cisco representative for country specific information.


Table 2 and Table 3 describe the rack configurations of rack 1 (master rack) and racks 2-4 (expansion racks).

Table 2 Rack Configuration for the Master Rack (Rack 1)

Cisco 42U Rack
Master Rack

RU 42-41: Cisco UCS FI 6296UP
RU 40-39: Cisco UCS FI 6296UP
RU 38: Cisco Nexus FEX 2232PP
RU 37: Cisco Nexus FEX 2232PP
RU 36: Unused
RU 35: Unused
RU 34: Unused
RU 33: Unused
RU 32-31: Cisco UCS C240M3
RU 30-29: Cisco UCS C240M3
RU 28-27: Cisco UCS C240M3
RU 26-25: Cisco UCS C240M3
RU 24-23: Cisco UCS C240M3
RU 22-21: Cisco UCS C240M3
RU 20-19: Cisco UCS C240M3
RU 18-17: Cisco UCS C240M3
RU 16-15: Cisco UCS C240M3
RU 14-13: Cisco UCS C240M3
RU 12-11: Cisco UCS C240M3
RU 10-9: Cisco UCS C240M3
RU 8-7: Cisco UCS C240M3
RU 6-5: Cisco UCS C240M3
RU 4-3: Cisco UCS C240M3
RU 2-1: Cisco UCS C240M3

Table 3 Rack Configuration for the Expansion Racks (Racks 2 - 4)

Cisco 42U Rack
Expansion Rack

RU 42: Unused
RU 41: Unused
RU 40: Unused
RU 39: Unused
RU 38: Cisco Nexus FEX 2232PP
RU 37: Cisco Nexus FEX 2232PP
RU 36: Unused
RU 35: Unused
RU 34: Unused
RU 33: Unused
RU 32-31: Cisco UCS C240M3
RU 30-29: Cisco UCS C240M3
RU 28-27: Cisco UCS C240M3
RU 26-25: Cisco UCS C240M3
RU 24-23: Cisco UCS C240M3
RU 22-21: Cisco UCS C240M3
RU 20-19: Cisco UCS C240M3
RU 18-17: Cisco UCS C240M3
RU 16-15: Cisco UCS C240M3
RU 14-13: Cisco UCS C240M3
RU 12-11: Cisco UCS C240M3
RU 10-9: Cisco UCS C240M3
RU 8-7: Cisco UCS C240M3
RU 6-5: Cisco UCS C240M3
RU 4-3: Cisco UCS C240M3
RU 2-1: Cisco UCS C240M3


Server Configuration and Cabling

The Cisco UCS C240M3 Rack Server (Capacity Optimized for Pivotal HD with HAWQ configuration) is equipped with two Intel Xeon E5-2670 v2 processors, 128 GB of memory, a Cisco UCS Virtual Interface Card (VIC) 1225, an LSI MegaRAID SAS 9271CV-8i storage controller, and 12 x 4TB 7.2K RPM Serial Attached SCSI (SAS) disk drives.

Figure 1 illustrates the physical connectivity of Cisco UCS C240M3 Servers to Cisco Nexus 2232PP Fabric Extenders and Cisco UCS 6296UP Fabric Interconnects.

Figure 1 Fabric Topology

Figure 2 illustrates the ports of the Cisco Nexus 2232PP Fabric Extender connecting to the Cisco UCS C240M3 Servers. Sixteen Cisco UCS C240M3 Servers are used in the master rack configuration offered by Cisco.

Figure 2 Connectivity Diagram of Cisco Nexus 2232PP FEX and Cisco UCS C240M3 Servers

For more information on physical connectivity and single-wire management, see:

http://www.cisco.com/en/US/docs/unified_computing/ucs/c-series_integration/ucsm2.1/b_UCSM2-1_C-Integration_chapter_010.html

For more information on physical connectivity illustrations and cluster setup, see:

http://www.cisco.com/en/US/docs/unified_computing/ucs/c-series_integration/ucsm2.1/b_UCSM2-1_C-Integration_chapter_010.html#reference_FE5B914256CB4C47B30287D2F9CE3597

Figure 3 depicts a 64-node cluster; each link represents eight 10 Gigabit Ethernet links.

Figure 3 64-Node Cluster Configuration

Software Distributions and Versions

Pivotal HD

The supported Pivotal HD GA version is PHD 1.1.1.0 and above. For more information, see:

http://gopivotal.com/

HAWQ

HAWQ, also known as Pivotal Advanced Database Services (PADS), is an optional service that is part of the Pivotal HD distribution. The supported GA version is PADS 1.1.3.0 and above.

RHEL

The supported operating system is Red Hat Enterprise Linux Server 6.4. For more information on Linux support, see:

www.redhat.com.

Software Versions

Table 4 describes the software versions tested and validated in this document.

Table 4 Software Versions Summary

Layer: Compute
Cisco UCS C240M3: 1.5.4f

Layer: Network
Cisco UCS 6296UP: UCS 2.2(1b)A
Cisco UCS VIC1225 Firmware: 2.2(1b)
Cisco UCS VIC1225 Driver: 2.1.1.41
Cisco Nexus 2232PP: 5.2(3)N2(2.21b)

Layer: Storage
LSI 9271CV-8i Firmware: 23.12.0-0021
LSI 9271CV-8i Driver: 06.601.06.00

Layer: Software
Red Hat Enterprise Linux Server: 6.4 (x86_64)
Cisco UCS Manager: 2.2(1b)
Pivotal HD: 1.1.1.0
Pivotal HAWQ: 1.1.3.0



Note To download the latest drivers, see: http://software.cisco.com/download/release.html?mdfid=284296254&flowid=31743&softwareid=283853158&release=1.5.1&relind=AVAILABLE&rellifecycle=&reltype=latest


Fabric Configuration

This section provides details for configuring fully redundant, highly available Cisco UCS 6296 Fabric Interconnects.

1. Perform the initial setup of Fabric Interconnects A and B.

2. Connect to the IP address of Fabric Interconnect A using a web browser.

3. Launch Cisco UCS Manager.

4. Edit the chassis discovery policy.

5. Enable the server and uplink ports.

6. Create pools and policies for the service profile template.

7. Create the service profile template and 64 service profiles.

8. Start the discovery process.

9. Associate the service profiles to the servers.

Performing Initial Setup of Cisco UCS 6296 Fabric Interconnects

This section describes the steps to perform the initial setup of the Cisco UCS 6296 Fabric Interconnects A and B.

Configure Fabric Interconnect A

Follow these steps to configure the Fabric Interconnect A:

1. Connect to the console port on the first Cisco UCS 6296 Fabric Interconnect.

2. At the prompt to enter the configuration method, enter console to continue.

3. If asked to either perform a new setup or restore from backup, enter setup to continue.

4. Enter y to continue to set up a new Fabric Interconnect.

5. Enter y to enforce strong passwords.

6. Enter the password for the admin user.

7. Enter the same password again to confirm the password for the admin user.

8. When asked if this fabric interconnect is part of a cluster, enter y to continue.

9. Enter A for the switch fabric.

10. Enter the cluster name for the system name.

11. Enter the Mgmt0 IPv4 address.

12. Enter the Mgmt0 IPv4 netmask.

13. Enter the IPv4 address of the default gateway.

14. Enter the cluster IPv4 address.

15. To configure DNS, enter y.

16. Enter the DNS IPv4 address.

17. Enter y to set up the default domain name.

18. Enter the default domain name.

19. Review the settings that were printed to the console, and enter yes to save the configuration.

20. Wait for the login prompt to make sure the configuration has been saved.
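The setup dialog walks through these prompts interactively on the console. A representative transcript is sketched below; the system name, IP addresses, netmask, gateway, DNS server, and domain name are placeholder values for illustration and must be replaced with site-specific settings, and prompt wording can vary slightly between UCSM releases.

```
Enter the configuration method. (console/gui) ? console
Enter the setup mode; setup newly or restore from backup. (setup/restore) ? setup
You have chosen to setup a new Fabric interconnect. Continue? (y/n): y
Enforce strong password? (y/n) [y]: y
Enter the password for "admin":
Confirm the password for "admin":
Is this Fabric interconnect part of a cluster(select 'no' for standalone)? (yes/no) [n]: yes
Enter the switch fabric (A/B) []: A
Enter the system name: ucs-pivotal
Physical Switch Mgmt0 IPv4 address : 10.29.160.5
Physical Switch Mgmt0 IPv4 netmask : 255.255.255.0
IPv4 address of the default gateway : 10.29.160.1
Cluster IPv4 address : 10.29.160.7
Configure the DNS Server IPv4 address? (yes/no) [n]: yes
  DNS IPv4 address : 10.29.160.2
Configure the default domain name? (yes/no) [n]: yes
  Default domain name : example.com
Apply and save the configuration (select 'no' if you want to re-enter)? (yes/no): yes
```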

Configure Fabric Interconnect B

Follow these steps to configure the Fabric Interconnect B:

1. Connect to the console port on the second Cisco UCS 6296 Fabric Interconnect.

2. When prompted to enter the configuration method, enter console to continue.

3. The installer detects the presence of the partner fabric interconnect and adds this fabric interconnect to the cluster. Enter y to continue the installation.

4. Enter the admin password that was configured for the first Fabric Interconnect.

5. Enter the Mgmt0 IPv4 address.

6. Enter yes to save the configuration.

7. Wait for the login prompt to confirm that the configuration has been saved.


Note For more information on configuring Cisco UCS 6200 Series Fabric Interconnect, see: http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/2.0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_0100.html


Logging Into Cisco UCS Manager

Follow these steps to log in to Cisco UCS Manager:

1. Open a Web browser and navigate to the Cisco UCS 6296 Fabric Interconnect cluster address.

2. Click the Launch link to download the Cisco UCS Manager software.

3. If prompted, accept the security certificates.

4. When prompted, enter the username as admin and the administrative password.

5. Click Login.

Upgrading UCSM Software to Version 2.2(1b)

This document assumes the use of UCS 2.2(1b). Refer to Upgrading between Cisco UCS 2.0 Releases to upgrade the Cisco UCS Manager software and the UCS 6296 Fabric Interconnect software to version 2.2(1b). Also, make sure the UCS C-Series version 2.2(1b) software bundle is installed on the Fabric Interconnects.

Adding Block of IP Addresses for KVM Access

Follow these steps to create a block of KVM IP addresses for server access in the Cisco UCS environment.

1. Click the LAN tab.

2. Select Pools > IP Pools > IP Pool ext-mgmt.

3. Right-click on IP Pool ext-mgmt.

4. Select Create Block of IPv4 Addresses as shown in Figure 4.

Figure 4 Adding Block of IPv4 Addresses for KVM Access Part 1

5. Enter the starting IP address of the block, the number of IP addresses needed, and the subnet and gateway information as shown in Figure 5.

Figure 5 Adding Block of IPv4 Addresses for KVM Access Part 2

6. Click OK to create the IPv4 Address block as shown in Figure 6.

7. Click OK.

Figure 6 Adding Block of IPv4 Addresses for KVM Access Part 3

Editing the Chassis/FEX Discovery Policy

This section provides details for modifying the chassis discovery policy. Setting the discovery policy ensures easy addition of Cisco UCS B-Series chassis or fabric extenders for Cisco UCS C-Series servers in the future.

1. Click the Equipment tab.

2. In the right pane, click the Policies tab.

3. Click the Global Policies tab in the right pane of the window.

4. In the Chassis/FEX Discovery Policy area, select 8-link from the drop-down list for the Action field as shown in Figure 7.

Figure 7 Editing the Chassis/FEX Discovery Policy

5. Click Save Changes.

6. Click OK.

Enabling the Server Ports and Uplink Ports

Follow these steps to enable the server ports and configure the uplink ports:

1. Click the Equipment tab.

2. Select Equipment > Fabric Interconnects > Fabric Interconnect A (primary) > Fixed Module.

3. Expand the Unconfigured Ethernet Ports.

4. Select all the ports that are connected to the Cisco 2232PP FEX (eight per FEX), right-click and select Reconfigure > Configure as a Server Port.

5. Select port 1 that is connected to the uplink switch, right-click, then select Reconfigure > Configure as Uplink Port.

6. Select Show Interface and select 10GB for Uplink Connection.

7. Click Yes in the confirmation pop-up window and then OK to continue.

8. Select Equipment > Fabric Interconnects > Fabric Interconnect B (subordinate) > Fixed Module.

9. Expand the Unconfigured Ethernet Ports section.

10. Select all the ports that are connected to the Cisco 2232PP Fabric Extenders (eight per FEX), right-click, and select Reconfigure > Configure as a Server Port.

11. Click Yes and then OK to continue.

12. Select port number 1, which is connected to the uplink switch, right-click and select Reconfigure > Configure as Uplink Port.

Figure 8 Enabling Server Ports

13. Select Show Interface and select 10GB for Uplink Connection.

14. Click Yes in the confirmation pop-up window and then OK to continue.

Figure 9 shows all the configured uplink and server ports.

Figure 9 Server and Uplink Ports Summary
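For reference, the same port roles can also be set from the UCS Manager CLI instead of the GUI. The sketch below is illustrative only: the slot/port numbers (one server port and the uplink port 1 on fabric A) are example values, and the exact command scopes may differ between UCSM releases.

```
UCS-A# scope eth-server
UCS-A /eth-server # scope fabric a
UCS-A /eth-server/fabric # create interface 1 17
UCS-A /eth-server/fabric* # commit-buffer
UCS-A /eth-server/fabric # exit
UCS-A /eth-server # exit
UCS-A# scope eth-uplink
UCS-A /eth-uplink # scope fabric a
UCS-A /eth-uplink/fabric # create interface 1 1
UCS-A /eth-uplink/fabric* # commit-buffer
```

Repeat the server-port commands for each of the eight ports per FEX, and the equivalent commands under fabric b for the subordinate Fabric Interconnect.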

Creating Pools for Service Profile Templates

Creating an Organization

Organizations are used as a means to arrange and restrict access to various groups within the IT organization, and enable multi-tenancy of the compute resources. This document does not use organizations; however, the steps to create an organization are given for future reference.

Follow these steps to configure an organization in the Cisco UCS Manager:

1. Click New in the left corner of the UCS Manager GUI.

2. Select Create Organization from the options.

3. Enter a name for the organization.

4. (Optional) Enter a description for the organization.

5. Click OK.

Creating MAC Address Pools

Follow these steps to create MAC address pools:

1. Click the LAN tab.

2. Select Pools > root.

3. Right-click MAC Pools under the root organization.

4. Select Create MAC Pool to create the MAC address pool. Enter ucs as the name of the MAC pool.

5. (Optional) Enter a description of the MAC pool.

6. Click Next.

7. Click Add.

8. Specify a starting MAC address.

9. Specify a size for the MAC address pool sufficient to support the available server resources, as shown in Figure 10.

10. Click OK.

Figure 10 Specifying the First MAC Address and Size

11. Click Finish as shown in Figure 11.

Figure 11 Adding MAC Addresses

12. Click OK to confirm the addition of the MAC addresses.
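The same pool can be created from the UCS Manager CLI for scripted deployments. The sketch below is illustrative: the pool name (ucs) matches the steps above, while the starting address and block size are example values. A block of 200 addresses (00 through C7) would cover 64 servers with three vNICs each (192 MAC addresses) with some headroom.

```
UCS-A# scope org /
UCS-A /org # create mac-pool ucs
UCS-A /org/mac-pool* # create block 00:25:B5:00:00:00 00:25:B5:00:00:C7
UCS-A /org/mac-pool/block* # commit-buffer
```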

Configuring VLANs

Table 5 describes the VLANs that are configured in this design solution.

Table 5 VLAN Configurations

VLAN
Fabric
NIC Port
Function
Failover

vlan160_mgmt

A

eth0

Management, user connectivity

Fabric Failover B

vlan12_HDFS

B

eth1

Hadoop

Fabric Failover A

vlan11_DATA

A

eth2

Hadoop and/or SAN/NAS access, ETL

Fabric Failover B


All of the VLANs created should be trunked to the upstream distribution switch connecting the fabric interconnects. In this deployment, vlan160_mgmt is configured for management access and user connectivity, vlan12_HDFS is configured for Hadoop interconnect traffic, and vlan11_DATA is configured for optional secondary interconnect and/or SAN/NAS access, heavy ETL, and so on.

Follow these steps to configure VLANs in Cisco UCS Manager:

1. Click the LAN tab.

2. Select LAN > VLANs.

3. Right-click the VLANs under the root organization.

4. Select Create VLANs to create the VLAN as shown in Figure 12.

Figure 12 Creating VLAN

5. Enter vlan160_mgmt in the VLAN Name/Prefix text box as shown in Figure 13.

6. Click the Common/Global radio button.

7. Enter 160 in the VLAN IDs text box.

8. Click OK and then click Finish.

9. Click OK.

Figure 13 Creating Management VLAN

10. Click the LAN tab.

11. Select LAN > VLANs.

12. Right-click the VLANs under the root organization.

13. Select Create VLANs to create the VLAN as shown in Figure 14.

14. Enter vlan11_DATA in the VLAN Name/Prefix text box.

15. Click the Common/Global radio button.

16. Enter 11 in the VLAN IDs text box.

17. Click OK and then click Finish.

18. Click OK.

Figure 14 Creating VLAN for Data

19. Click the LAN tab.

20. Select LAN > VLANs.

21. Right-click the VLANs under the root organization.

22. Select Create VLANs to create the VLAN.

23. Enter vlan12_HDFS in the VLAN Name/Prefix text box as shown in Figure 15.

24. Click the Common/Global radio button.

25. Enter 12 in the VLAN IDs text box.

26. Click OK and then click Finish.

Figure 15 Creating VLAN for Hadoop Data

Creating Server Pool

A server pool contains a set of servers. These servers typically share the same characteristics such as their location in the chassis, server type, amount of memory, local storage, type of CPU, or local drive configuration. You can manually assign a server to a server pool, or use the server pool policies and server pool policy qualifications to automate the server assignment.

Follow these steps to configure the server pool within the Cisco UCS Manager:

1. Click the Servers tab.

2. Select Pools > root.

3. Right-click the Server Pools.

4. Select Create Server Pool.

5. Enter the required name (ucs) for the server pool in the name text box as shown in Figure 16.

6. (Optional) Enter a description for the organization.

7. Click Next to add the servers.

Figure 16 Setting Name and Description of the Server Pool

8. Select all the Cisco UCS C240M3L servers to be added to the server pool (ucs) created earlier, then click >> to add them to the pool as shown in Figure 17.

9. Click Finish.

10. Click OK, and then click Finish.

Figure 17 Adding Servers to the Server Pool

Creating Policies for Service Profile Templates

This section provides the procedures to create the following policies for the service profile template:

Creating a Host Firmware Package Policy

Creating QoS Policies

Creating a Local Disk Configuration Policy

Creating a Server BIOS Policy

Creating a Boot Policy

Creating a Host Firmware Package Policy

Firmware management policies allow the administrator to select the corresponding firmware packages for a given server configuration. The components that can be configured include adapters, BIOS, board controllers, FC adapters, HBA options, ROM and storage controller.

Follow these steps to create a host firmware management policy for a given server configuration using the Cisco UCS Manager:

1. Click the Servers tab in the UCS Manager.

2. Select Policies > root.

3. Right-click Host Firmware Packages.

4. Select Create Host Firmware Package.

5. Enter the required host firmware package name (ucs) as shown in Figure 18.

6. Click the Simple radio button to configure the host firmware package.

7. Select the appropriate Rack Package value.

8. Click OK to complete creating the management firmware package.

9. Click OK.

Figure 18 Creating Host Firmware Package

Creating QoS Policies

This section describes the procedure to create the Best Effort QoS Policy and Platinum QoS policy.

Creating the Best Effort Policy

Follow these steps to create the Best Effort Policy:

1. Click the LAN tab.

2. Select Policies > root.

3. Right-click QoS Policies.

4. Select Create QoS Policy as shown in Figure 19.

Figure 19 Creating QoS Policy

5. Enter BestEffort as the name of the policy as shown in Figure 20.

6. Select BestEffort from the drop down menu.

7. Keep the Burst (Bytes) field as default (10240).

8. Keep the Rate (Kbps) field as default (line-rate).

9. Keep the Host Control radio button as default (none).

10. Click OK to complete creating the Policy.

11. Click OK.

Figure 20 Creating BestEffort QoS Policy

Creating a Platinum Policy

Follow these steps to create the Platinum QoS policy:

1. Click the LAN tab.

2. Select Policies > root.

3. Right-click QoS Policies.

4. Select Create QoS Policy.

5. Enter Platinum as the name of the policy as shown in Figure 21.

6. Select Platinum from the drop down menu.

7. Keep the Burst (Bytes) field as default (10240).

8. Keep the Rate (Kbps) field as default (line-rate).

9. Keep the Host Control radio button as default (none).

10. Click OK to complete creating the Policy.

11. Click OK.

Figure 21 Creating Platinum QoS Policy

Setting Jumbo Frames

Follow these steps to set up Jumbo frames and enable the QoS:

1. Click the LAN tab in the Cisco UCS Manager.

2. Select LAN Cloud > QoS System Class.

3. In the right pane, click the General tab.

4. For Platinum, enter 9000 for MTU as shown in Figure 22.

5. Check the Enabled check box.

6. For Fibre Channel, select None from the Weight drop down list

7. Click Save Changes.

8. Click OK.

Figure 22 Setting Jumbo Frames

Creating a Local Disk Configuration Policy

Follow these steps to create a local disk configuration in the Cisco UCS Manager:

1. Click the Servers tab.

2. Select Policies > root.

3. Right-click Local Disk Config Policies.

4. Select Create Local Disk Configuration Policy.

5. Enter ucs as the local disk configuration policy name as shown in Figure 23.

6. Select Any Configuration from the drop down list to set the Mode.

7. Uncheck the Protect Configuration check box.

8. Make sure the Disable radio button is selected for FlexFlash State.

9. Make sure the Disable radio button is selected for FlexFlash RAID Reporting State.

10. Click OK to complete creating the Local Disk Configuration Policy.

11. Click OK.

Figure 23 Configuring Local Disk Policy

Creating a Server BIOS Policy

The BIOS policy feature in Cisco UCS automates the BIOS configuration process. The traditional mode of setting the BIOS is manual and is often error-prone. By creating a BIOS policy and assigning the policy to a server or group of servers, you can enable transparency within the BIOS settings configuration.


Note BIOS settings can have a significant performance impact, depending on the workload and the applications. The BIOS settings listed in this section are optimized for best performance and can be adjusted based on the application, performance, and energy efficiency requirements.


Follow these steps to create a server BIOS policy using the Cisco UCS Manager:

1. Select the Servers tab.

2. Select Policies > root.

3. Right-click BIOS Policies.

4. Select Create BIOS Policy.

5. Enter the preferred BIOS policy name.

6. Change the BIOS settings as shown in Figure 24.

Figure 24 Creating Server BIOS Policy

7. Review the Processor and Intel Directed IO properties settings in the BIOS Policy, as shown in Figure 25 and Figure 26.

Figure 25 Creating Server BIOS Policy for Processor

Figure 26 Creating Server BIOS Policy for Intel Directed IO

8. Set the RAS Memory settings and click Next as shown in Figure 27.

9. Click Finish to complete creating the BIOS Policy.

Figure 27 Creating Server BIOS Policy for Memory

10. Click OK.

Creating a Boot Policy

Follow these steps to create a boot policy within Cisco UCS Manager:

1. Select the Servers tab.

2. Select Policies > root.

3. Right-click the Boot Policies.

4. Select Create Boot Policy as shown in Figure 28.

Figure 28 Creating Boot Policy Part 1

5. Enter ucs as the boot policy name as shown in Figure 29.

6. (Optional) Enter a description for the boot policy.

7. Keep the Reboot on Boot Order Change check box unchecked.

8. Expand Local Devices > Add CD/DVD and select Add Local CD/DVD.

9. Expand Local Devices > Add Local Disk and select Add Local Disk.

10. Expand vNICs and select Add LAN Boot and enter eth0.

11. Click OK to add the Boot Policy.

12. Click OK.

Figure 29 Creating Boot Policy Part 2

Creating a Service Profile Template

Follow these steps to create a service profile template in Cisco UCS Manager:

1. Click the Servers tab.

2. Right-click Service Profile Templates.

3. Select Create Service Profile Template as shown in Figure 30.

Figure 30 Creating Service Profile Template

4. The Create Service Profile Template window appears. Do the following (see Figure 31):

a. In the Identify Service Profile Template window, enter the name of the service profile template as ucs.

b. Click the Updating Template radio button.

c. In the UUID section, select Hardware Default as the UUID pool.

5. Click Next to continue.

Figure 31 Identify Service Profile Template

Configuring Network Settings for the Template

In the Networking window, follow these steps to configure the network settings in the Cisco UCS Manager:

1. Keep the Dynamic vNIC Connection Policy field at the default as shown in Figure 32.

2. Click the Expert radio button to define How would you like to configure LAN connectivity?

3. Click Add to add a vNIC to the template. The Modify vNIC window appears.

Figure 32 Configuring Network Settings for the Template

4. In the Modify vNIC window, enter the name of the vNIC as eth0 as shown in Figure 33.

5. Select ucs in the MAC Address Assignment pool.

6. Click the Fabric A radio button and check the Enable failover check box for the Fabric ID.

7. Check the vlan160_mgmt check box for VLANs.

8. Click the Native VLAN radio button.

9. Select MTU size as 1500.

10. Select adapter policy as Linux.

11. Select QoS Policy as BestEffort.

12. Keep the Network Control Policy as Default.

13. For Connection Policies, make sure the Dynamic vNIC radio button is selected.

14. Keep the Dynamic vNIC Connection Policy as <not set>.

15. Click OK.

Figure 33 Configuring vNIC eth0

16. The Modify vNIC window appears. Enter the name of the vNIC as eth1 as shown in Figure 34.

17. For MAC Address Assignment pool, select ucs.

18. Click the Fabric B radio button and check the Enable failover check box for the Fabric ID.

19. Check the vlan12_HDFS check box for VLANs and click the Native VLAN radio button.

20. Select MTU size as 9000.

21. Select adapter policy as Linux.

22. Select QoS Policy as Platinum.

23. Keep the Network Control Policy as Default.

24. For Connection Policies, make sure the Dynamic vNIC radio button is selected.

25. Keep the Dynamic vNIC Connection Policy as <not set>.

26. Click OK.

Figure 34 Configuring vNIC eth1

27. The Create vNIC window appears. Enter the name of the vNIC as eth2 as shown in Figure 35.

28. Select ucs in the MAC Address Assignment pool.

29. Click the Fabric A radio button and check the Enable failover check box for the Fabric ID.

30. Check the vlan11_DATA check box for VLANs and select the Native VLAN radio button.

31. Select MTU size as 9000.

32. Select adapter policy as Linux.

33. Select QoS Policy as Platinum.

34. Keep the Network Control Policy as Default.

35. For Connection Policies, make sure the Dynamic vNIC radio button is selected.

36. Keep the Dynamic vNIC Connection Policy as <not set>.

37. Click OK.

38. Click Next in the Networking window to continue.

Figure 35 Configuring vNIC eth2

Configuring a Storage Policy for the Template

In the Storage window, follow these steps to configure a storage policy in Cisco UCS Manager:

1. Select ucs for the local disk configuration policy as shown in Figure 36.

2. Click the No vHBAs radio button to define How would you like to configure SAN connectivity?

3. Click Next to continue.

Figure 36 Configuring Storage settings

4. Click Next in the Zoning window to continue.

Configuring a vNIC/vHBA Placement for the Template

In the vNIC/vHBA window, follow these steps to configure a vNIC/vHBA placement policy in Cisco UCS Manager:

1. Select the Default Placement Policy option for the Select Placement field as shown in Figure 37.

2. Select eth0, eth1, and eth2, and assign the vNICs in the following order:

a. eth0

b. eth1

c. eth2

Review to make sure that all vNICs are assigned in the appropriate order.

3. Click Next to continue.

Figure 37 vNIC/vHBA Placement

Configuring a Server Boot Order for the Template

In the Server Boot Order window, follow these steps to set the boot order for servers in Cisco UCS Manager:

1. Select ucs in the Boot Policy name field as shown in Figure 38.

2. Check the Enforce vNIC/vHBA/iSCSI Name check box.

Review to make sure that all the boot devices are created and identified.

3. Verify that the boot devices are in the correct boot sequence.

4. Click OK.

Figure 38 Creating Boot Policy

5. Click Next to continue.

In the Maintenance Policy window, keep the default (no policy), as no maintenance policy has been created. Click Next to continue to the next window.

Configuring Server Assignment for the Template

In the Server Assignment window, follow these steps to assign the servers to the pool in Cisco UCS Manager:

1. Select ucs for the Pool Assignment field as shown in Figure 39.

2. Keep the Server Pool Qualification field as default.

3. Select ucs in Host Firmware Package.

Figure 39 Server Assignment

Configuring Operational Policies for the Template

In the Operational Policies window, follow these steps:

1. Select ucs in the BIOS Policy field as shown in Figure 40.

2. Click Finish to create the Service Profile template.

3. Click OK.

Figure 40 Selecting BIOS Policy

4. Click the Servers tab.

a. Select Service Profile Templates > root.

b. Right-click root and select Create Service Profile Template as shown in Figure 41.

Figure 41 Creating Service Profiles from Template

c. The Create Service Profile from Template window appears. Enter the name and number of nodes in the Name and Number fields as shown in Figure 42.

Figure 42 Selecting Name and Total Number of Service Profiles

Cisco UCS Manager discovers the servers and automatically associates them with the service profiles. Figure 43 shows the service profiles associated with all 64 nodes.

Figure 43 Cisco UCS Manager showing 64 Nodes

Configuring Disk Drives for Operating System on NameNodes

The Admin Node, HAWQ Node, NameNode, and Secondary NameNode have a different RAID configuration from the DataNodes. This section details the configuration of disk drives for the operating system on these nodes. Nodes 1 through 3 run the admin and master services; nodes 4 through 64 provide data and compute.

For these nodes, use RAID5 with a strip size of 256KB. For this configuration, the Read Policy is No Read Ahead and the Write Policy is Write Back with BBU.

There are several ways to configure RAID such as:

Using the LSI WebBIOS Configuration Utility embedded in the MegaRAID BIOS

Booting DOS and running MegaCLI commands

Using Linux-based MegaCLI commands

Using third party tools that have MegaCLI integrated

For this deployment, the drives are configured using the LSI WebBIOS Configuration Utility.

Follow these steps to create RAID5 on all the eight disk drives:

1. Boot the server and press Ctrl+H immediately during POST to launch the WebBIOS. The Adapter Selection window appears.

2. Click Start to continue as shown in Figure 44.

3. Click Configuration Wizard.

Figure 44 Adapter Selection for RAID Configuration

4. In the Configuration Wizard window, click the Clear Configuration radio button as shown in Figure 45.

5. Click Next to clear the existing configuration.

Figure 45 Clearing Current Configuration on the Controller

6. Click Yes.

7. In the Physical View, ensure that all the drives are Unconfigured Good.

8. In the Configuration Wizard window, click the New Configuration radio button as shown in Figure 46.

9. Click Next.

Figure 46 Choosing to Create a New Configuration

10. Click the Manual Configuration radio button. This gives complete control over all attributes of the new storage configuration, such as the drive groups, the virtual drives, and their parameters, as shown in Figure 47.

Figure 47 Choosing Manual Configuration Method

11. Click Next. The Drive Group Definition window appears.

12. In the Drive Group Definition window, choose all the drives to create drive groups as shown in Figure 48.

13. Click Add to Array to move the drives to a proposed drive group configuration in the Drive Groups pane.

Figure 48 Selecting All the Drives and Adding to Drive Group

14. Click Accept DG and click Next.

Figure 49 Span Definition Window

15. In the Span Definitions window, click Add to SPAN and click Next as shown in Figure 49.

Figure 50 Adding Array Hole to Span

16. In the Virtual Drive definitions window, do the following (see Figure 51):

a. Click on Update Size.

b. Change Strip Size to 256KB. A larger strip size ensures higher read performance.

c. From the Read Policy drop down list, choose No Read Ahead.

d. From the Write Policy drop down list, choose Write Back with BBU.

e. Make sure RAID Level is set to RAID5.

f. Click Accept to accept the changes to the virtual drive definitions.

g. Click Next.


Note Clicking on Update Size can change some of the settings in the window. Make sure all settings are correct before submitting the changes.


Figure 51 Defining Virtual Drive

17. After you finish the virtual drive definitions, click Next. The Configuration Preview window appears showing VD0.

18. Check the virtual drive configuration in the Configuration Preview window and click Accept to save the configuration.

19. Click Yes to save the configuration.

20. In the Managing SSD Caching window, click Cancel as shown in Figure 52.

Figure 52 SSD Caching Window

21. Click Yes in the confirmation page.

22. Set VD0 as the Boot Drive and click Go as shown in Figure 53.

23. Click Home.

24. Review the configuration and click Exit.

Figure 53 Setting Virtual Drive as Boot Drive

Disks 3 to 24 are configured using Linux-based MegaCLI commands, as described in the "Configuring Data Drives on Data Nodes" section.

Configuring Disk Drives for Operating System on DataNodes

Nodes 4 through 64 are configured as DataNodes. This section details the configuration of disk drives for the operating system on the DataNodes. As stated above, the focus of this CVD is the Capacity Optimized configuration featuring twelve 4TB LFF disk drives. The first two disk drives are configured as a RAID1 volume with a 64KB strip size, and the rest as individual RAID0 volumes with a 1MB strip size. For this configuration, the Read Policy is No Read Ahead and the Write Policy is Always Write Back.


Note In this CVD, we recommend two RAID1-protected disks for the operating system. Depending on specific application requirements, the operating system can instead be hosted on a 1TB partition of a single drive.


There are several ways to configure RAID such as:

Using LSI WebBIOS Configuration Utility embedded in the MegaRAID BIOS

Booting DOS and running MegaCLI commands

Using Linux based MegaCLI commands

Using third party tools that have MegaCLI integrated

For this deployment, the first two disk drives are configured using the LSI WebBIOS Configuration Utility, and the rest are configured using Linux-based MegaCLI commands after the OS is installed.

Follow these steps to create RAID1 on the first two disk drives to install the Operating System:

1. Boot the server and press Ctrl+H immediately during POST to launch the WebBIOS. The Adapter Selection window appears.

2. Click Start to continue as shown in Figure 54.

3. Click Configuration Wizard.

Figure 54 Adapter Selection for RAID Configuration

4. In the Configuration Wizard window, click the Clear Configuration radio button as shown in Figure 55 to clear the existing configuration.

5. Click Next to clear the existing configuration.

Figure 55 Clearing Current Configuration on the Controller

6. Click Yes.

7. In the Physical View, ensure that all the drives are Unconfigured Good.

8. In the Configuration Wizard window, click the New Configuration radio button as shown in Figure 56.

9. Click Next.

Figure 56 Choosing to Create a New Configuration

10. Click the Manual Configuration radio button. This gives complete control over all attributes of the new storage configuration, such as the drive groups, the virtual drives, and their parameters, as shown in Figure 57.

Figure 57 Choosing Manual Configuration Method

11. Click Next. The Drive Group Definition window appears.

12. In the Drive Group Definition window, choose the first two drives to create drive groups as shown in Figure 58.

13. Click Add to Array to move the drives to a proposed drive group configuration in the Drive Groups pane.

14. Click Accept DG and click Next.

Figure 58 Selecting First Two Drives and Adding to Drive Group

15. In the Span Definitions window, click Add to SPAN and click Next as shown in Figure 59.

Figure 59 Span Definition Window

16. In the Virtual Drive definitions window, do the following (see Figure 60):

a. Click on Update Size.

b. Change Strip Size to 64KB.


Note A larger strip size produces higher read performance.


c. From the Read Policy drop down list, choose No Read Ahead.

d. From the Write Policy drop down list, choose Always Write Back.

e. Make sure RAID Level is set to RAID1.

f. Click Accept to accept the changes to the virtual drive definitions.

g. Click Next.


Note Clicking on Update Size can change some of the settings in the window. Make sure all settings are correct before submitting the changes.


Figure 60 Defining Virtual Drive

17. After you finish the virtual drive definitions, click Next. The Configuration Preview window appears showing VD0.

18. Check the virtual drive configuration in the Configuration Preview window and click Accept to save the configuration.

19. Click Yes to save the configuration.

20. In the Managing SSD Caching window, click Cancel as shown in Figure 61.

Figure 61 SSD Caching Window

21. Click Yes in the confirmation page.

22. Set VD0 as the Boot Drive and click Go as shown in Figure 62.

23. Click Home.

24. Review the configuration and click Exit.

Figure 62 Setting Virtual Drive as Boot Drive

The steps above can be repeated to configure disks 3 through 12, or the disks can be configured using Linux-based MegaCLI commands as described in the "Configuring Data Drives on Data Nodes" section.
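As a sketch of the MegaCLI alternative (the full procedure belongs to the "Configuring Data Drives on Data Nodes" section; the MegaCli64 path shown is the usual install location), each remaining drive can be turned into its own RAID0 volume matching the policies above (1MB strip, Always Write Back, No Read Ahead):

```
/opt/MegaRAID/MegaCli/MegaCli64 -CfgEachDskRaid0 WB NORA direct NoCachedBadBBU strpsz1024 -a0
```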

Installing Red Hat Linux 6.4 with KVM

The following section provides detailed procedures for installing Red Hat Linux 6.4.

There are multiple methods to install Red Hat Linux Operating System. The installation procedure described in this design guide uses KVM console and virtual media from Cisco UCS Manager.

1. Log in to the Cisco UCS 6296 Fabric Interconnect and launch the Cisco UCS Manager application.

2. Click the Equipment tab.

3. In the navigation pane expand Rack-Mounts and Servers.

4. Right-click on the Server and select KVM Console as shown in Figure 63.

Figure 63 Selecting KVM Console Option

5. In the KVM window, select the Virtual Media tab as shown in Figure 64.

6. Click Add Image button in the Client View selection window.

7. Browse to the Red Hat Enterprise Linux Server 6.4 installer ISO image file.


Note The Red Hat Enterprise Linux 6.4 DVD is assumed to be available on the client machine.


Figure 64 Adding an ISO Image

8. Click Open to add the image to the list of virtual media.

9. Check the Mapped check box for the image you just added as shown in Figure 65.

Figure 65 Mapping ISO Image

10. In the KVM console, select the KVM tab to monitor the bootup.

11. In the KVM console, click Boot Server.

12. Click OK.

13. Click OK to reboot the system.

On reboot, the server detects the presence of the Red Hat Enterprise Linux Server 6.4 install media.

14. Select Install or Upgrade an Existing System option as shown in Figure 66.

Figure 66 Select Install Option

15. Click Skip to skip the media test, since the ISO image is used for the OS installation.

16. Click Next. The Red Hat Linux Welcome Screen appears.

17. Select the Language for the installation.

18. Click the Basic Storage Devices radio button.

19. Click the Fresh Installation radio button.

20. Enter the host name of the server and click Next.

21. Click Configure Network. The Network Connections window appears.

22. In the Network Connections window, select the Wired tab.

23. Select the interface System eth0 and click Edit.

24. Editing System eth0 appears as shown in Figure 67.

25. Check the Connect automatically check box.

26. Select Manual in the Method drop down list.

27. Click Add and enter IP Address, the netmask and the gateway.

For this demonstration, the following values have been used:

IP Address: 10.29.160.53

Netmask: 255.255.255.0

Gateway: 10.29.160.1

28. Add DNS servers (optional).

29. Click Apply.

Figure 67 Configuring Network for eth0

30. Repeat steps 22 through 29 to configure the network for System eth1. The following values have been used (see Figure 68):

IP Address: 192.168.12.11

Netmask: 255.255.255.0

Figure 68 Configuring Network for eth1

31. Repeat steps 22 through 29 to configure the network for System eth2. The following values have been used:

IP Address: 192.168.11.11

Netmask: 255.255.255.0


Note Table 6 lists the IP addresses of the nodes in the cluster.


32. Select the appropriate time zone.

33. Enter the root password and click Next.

34. Select Use All Space and click Next as shown in Figure 69.

35. Choose an appropriate boot drive.

Figure 69 Selecting Install Option

36. Click Write changes to the disk and click Next.

37. Select Basic Server and click Next as shown in Figure 70.

Figure 70 Selecting Type of Installation

38. After the installer has finished loading, it will continue with the installation.

39. Reboot the system after the installation is complete.

Repeat the above steps (1 to 39) to install the Red Hat Linux on servers 2 through 64.


Note You can automate the OS installation and configuration of the nodes through the Preboot Execution Environment (PXE) boot or through third party tools.


Table 6 describes the hostnames and their corresponding IP addresses.

Table 6 Hostnames and IP Addresses

Hostname   eth0            eth1            eth2
rhel1      10.29.160.53    192.168.12.11   192.168.11.11
rhel2      10.29.160.54    192.168.12.12   192.168.11.12
rhel3      10.29.160.55    192.168.12.13   192.168.11.13
rhel4      10.29.160.56    192.168.12.14   192.168.11.14
rhel5      10.29.160.57    192.168.12.15   192.168.11.15
rhel6      10.29.160.58    192.168.12.16   192.168.11.16
rhel7      10.29.160.59    192.168.12.17   192.168.11.17
rhel8      10.29.160.60    192.168.12.18   192.168.11.18
rhel9      10.29.160.61    192.168.12.19   192.168.11.19
rhel10     10.29.160.62    192.168.12.20   192.168.11.20
rhel11     10.29.160.63    192.168.12.21   192.168.11.21
rhel12     10.29.160.64    192.168.12.22   192.168.11.22
rhel13     10.29.160.65    192.168.12.23   192.168.11.23
rhel14     10.29.160.66    192.168.12.24   192.168.11.24
rhel15     10.29.160.67    192.168.12.25   192.168.11.25
rhel16     10.29.160.68    192.168.12.26   192.168.11.26
...        ...             ...             ...
rhel64     10.29.160.116   192.168.12.74   192.168.11.74


Post OS Install Configuration

Use one of the three master nodes to install Pivotal Command Center (this node is referred to as the Admin Node). Pivotal Command Center facilitates the installation, management, and monitoring of Pivotal HD. In this document, rhel1 is used for this purpose.

Setting Up Password-less Login

To manage all of the cluster nodes from the admin node, set up password-less login. It allows common tasks to be automated with Parallel SSH (pssh) and shell scripts without prompting for passwords.

Once Red Hat Linux is installed across all the nodes in the cluster, follow these steps to enable password-less login across all the nodes.

1. Login to the admin node (rhel1).

ssh 10.29.160.53

2. Run the ssh-keygen command to create both public and private keys on the admin node.
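For example, the key pair can be generated non-interactively; a minimal sketch (empty passphrase, default RSA key path):

```shell
# Generate an RSA key pair with an empty passphrase (-N "") at the default path.
mkdir -p ~/.ssh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa -q
```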

3. Run the following commands from the admin node to copy the public key id_rsa.pub to all the nodes of the cluster. The ssh-copy-id command appends the keys to the remote host's .ssh/authorized_keys file.

for IP in {53..116}; do echo -n "$IP -> "; ssh-copy-id -i ~/.ssh/id_rsa.pub 
10.29.160.$IP; done

4. Enter yes at the command prompt to continue connecting.

5. Enter the password of the remote host to login.

Installing and Configuring Parallel SSH

Installing Parallel-SSH

Parallel-ssh is used to run commands on several hosts at the same time. It takes a file of hostnames and a few common ssh parameters as arguments, and executes the given command in parallel on the specified nodes.

1. Download pssh and copy the archive to the admin node:

wget https://parallel-ssh.googlecode.com/files/pssh-2.3.1.tar.gz

scp pssh-2.3.1.tar.gz rhel1:/root

2. Extract and install pssh on the admin node:

ssh rhel1
tar xzf pssh-2.3.1.tar.gz
cd pssh-2.3.1
python setup.py install

3. Create one host file containing the IP addresses of all the nodes and a second file containing only the DataNodes in the cluster. These files are passed as parameters to pssh to identify the nodes and run the commands on them.

vi /root/allnodes 
# This file contains ip address of all nodes of the cluster 
#used by parallel-shell (pssh). For Details man pssh
10.29.160.53
10.29.160.54
10.29.160.55
10.29.160.56
10.29.160.57
10.29.160.58
10.29.160.59
10.29.160.60
10.29.160.61
10.29.160.62
10.29.160.63
10.29.160.64
10.29.160.65
10.29.160.66
10.29.160.67
10.29.160.68
...
10.29.160.116
vi /root/datanodes 
10.29.160.55
10.29.160.56
10.29.160.57
10.29.160.58
10.29.160.59
10.29.160.60
10.29.160.61
10.29.160.62
10.29.160.63
10.29.160.64
10.29.160.65
10.29.160.66
10.29.160.67
10.29.160.68
...
10.29.160.116
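Rather than typing 64 addresses by hand, the two host files above can be generated, since the management IPs run contiguously from 10.29.160.53 (rhel1) to 10.29.160.116 (rhel64); a sketch:

```shell
# Generate /root/allnodes (all 64 nodes, 10.29.160.53-10.29.160.116) and
# /root/datanodes (skipping the first two nodes, as in the listing above).
seq 53 116 | sed 's/^/10.29.160./' > /root/allnodes
seq 55 116 | sed 's/^/10.29.160./' > /root/datanodes
```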

Installing Cluster Shell

1. Download cluster shell (clush) and install it on rhel1.

Cluster shell is available from the Extra Packages for Enterprise Linux (EPEL) repository.

wget 
http://dl.fedoraproject.org/pub/epel//6/x86_64/clustershell-1.6-1.el6.noarch.rpm
scp clustershell-1.6-1.el6.noarch.rpm rhel1:/root/

2. Login to rhel1 and install cluster shell.

yum install clustershell-1.6-1.el6.noarch.rpm

3. Edit /etc/clustershell/groups file to include hostnames for all the nodes of the cluster.

For the 64-node cluster, add: all: rhel[1-64]
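With the groups file populated, clush can fan a command out to the whole cluster; a brief sketch using the group defined above:

```
clush -a uptime              # run on every node in the 'all' group
clush -w rhel[3-64] date     # or target an explicit node range
```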

Note Configuring EPEL repository is discussed in detail in another section.


Configuring /etc/hosts and DNS

Follow these steps to create the host file across all the nodes in the cluster:

1. Run the following command to populate the host file with IP addresses and corresponding hostnames on the admin node (rhel1):

On Admin Node (rhel1)

vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.29.160.53    rhel1.mgmt
10.29.160.54    rhel2.mgmt
10.29.160.55    rhel3.mgmt
10.29.160.56    rhel4.mgmt
10.29.160.57    rhel5.mgmt
10.29.160.58    rhel6.mgmt
10.29.160.59    rhel7.mgmt
10.29.160.60    rhel8.mgmt
10.29.160.61    rhel9.mgmt
10.29.160.62    rhel10.mgmt
10.29.160.63    rhel11.mgmt
10.29.160.64    rhel12.mgmt
10.29.160.65    rhel13.mgmt
10.29.160.66    rhel14.mgmt
10.29.160.67    rhel15.mgmt
10.29.160.68    rhel16.mgmt
... 
192.168.12.11 rhel1
192.168.12.12 rhel2
192.168.12.13 rhel3
192.168.12.14 rhel4
192.168.12.15 rhel5
192.168.12.16 rhel6
192.168.12.17 rhel7
192.168.12.18 rhel8
192.168.12.19 rhel9
192.168.12.20 rhel10
192.168.12.21 rhel11
192.168.12.22 rhel12
192.168.12.23 rhel13
192.168.12.24 rhel14
192.168.12.25 rhel15
192.168.12.26 rhel16
...

On Other nodes (rhel2-rhel64)

vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
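The admin-node host entries above follow a fixed pattern (rhelN maps to 10.29.160.(52+N) on eth0 and 192.168.12.(10+N) on eth1), so they can be generated instead of typed; a sketch:

```shell
# Emit the eth0 (.mgmt) and eth1 host entries for rhel1-rhel64;
# append the output to /etc/hosts on the admin node.
for i in $(seq 1 64); do
  printf '10.29.160.%d\trhel%d.mgmt\n' $((52 + i)) "$i"
done
for i in $(seq 1 64); do
  printf '192.168.12.%d\trhel%d\n' $((10 + i)) "$i"
done
```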

2. Update /etc/resolv.conf file to point to Admin Node

vi /etc/resolv.conf
nameserver 10.29.160.53

Note This step is required if you are setting up dnsmasq on the Admin node; otherwise, this file should be updated with the correct nameserver.


3. Deploy /etc/resolv.conf from the admin node (rhel1) to all the nodes via the following pscp command:

pscp -h /root/allnodes /etc/resolv.conf /etc/resolv.conf

4. Start dnsmasq on Admin node.

service dnsmasq start

5. Verify that DNS is working by running the following commands on the admin node and on any DataNode.

[root@rhel2 ~]# nslookup rhel1.mgmt
Server: 10.29.160.53
Address: 10.29.160.53#53
Name: rhel1.mgmt
Address: 10.29.160.53
[root@rhel2 ~]# nslookup rhel1
Server: 10.29.160.53
Address: 10.29.160.53#53
Name: rhel1
Address: 192.168.12.11

Creating RedHat Local Repository

To create a repository from the RHEL DVD or ISO on the admin node (rhel1 in this deployment), create a directory containing all the required RPMs, run the createrepo command, and then publish the resulting repository over HTTP.

1. Log in to the rhel1 node and run the following command to create a directory to hold the repository:

mkdir -p /var/www/html/rhelrepo64

2. Copy the contents of the Red Hat DVD to /var/www/html/rhelrepo64.

3. Alternatively, if you have access to a Red Hat ISO Image, copy the ISO file to rhel1.

scp rhel-server-6.4-x86_64-dvd.iso rhel1:/root

This assumes the Red Hat ISO file is located in your working directory. Then mount the ISO on rhel1:

mkdir -p /mnt/rheliso
mount -t iso9660  -o loop /root/rhel-server-6.4-x86_64-dvd.iso /mnt/rheliso/

4. Copy the contents of the ISO to the /var/www/html/rhelrepo64 directory.

cp -r /mnt/rheliso/* /var/www/html/rhelrepo64

5. Run the following command on the rhel1 to create a .repo file that enables the use of the yum command:

vi /var/www/html/rhelrepo64/rheliso.repo
[rhel6.4]
name=Red Hat Enterprise Linux 6.4
baseurl=http://10.29.160.53/rhelrepo64
gpgcheck=0
enabled=1

Note The yum command based on the repo file requires httpd to be running on rhel1 so that the other nodes can access the repository. Steps to install and configure httpd are given in the "Installing httpd" section.
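The .repo file from step 5 can also be written non-interactively, which is convenient when scripting the setup (a sketch using this guide's admin-node IP; the file is written to the working directory here, so copy it into place afterwards):

```shell
# Write the yum repo definition without an editor.
cat > rheliso.repo <<'EOF'
[rhel6.4]
name=Red Hat Enterprise Linux 6.4
baseurl=http://10.29.160.53/rhelrepo64
gpgcheck=0
enabled=1
EOF
grep '^baseurl' rheliso.repo
```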


6. Copy the rheliso.repo to all the nodes of the cluster.

pscp -h /root/allnodes /var/www/html/rhelrepo64/rheliso.repo /etc/yum.repos.d/

7. To use the repository on rhel1 itself without going through httpd, edit the baseurl in /etc/yum.repos.d/rheliso.repo on rhel1 to point to the repository location in the file system.

vi /etc/yum.repos.d/rheliso.repo
[rhel6.4]
name=Red Hat Enterprise Linux 6.4
baseurl=file:///var/www/html/rhelrepo64
gpgcheck=0
enabled=1

8. Run the following command to clean the yum caches on all nodes:

pssh -h /root/allnodes "yum clean all"

Creating the Red Hat Repository Database

1. Install the createrepo package.

2. Use the createrepo package to regenerate the repository database(s) for the local copy of the RHEL DVD contents.

3. Purge the yum caches:

yum -y install createrepo
cd /var/www/html/rhelrepo64
createrepo .
yum clean all

Upgrading LSI driver

The latest LSI driver is essential for performance and bug fixes.

To download the latest LSI drivers, see:

http://software.cisco.com/download/release.html?mdfid=284296254&flowid=31743&softwareid=283853158&release=1.5.1&relind=AVAILABLE&rellifecycle=&reltype=latest

1. In the ISO image, the required driver kmod-megaraid_sas-v06.601.06.00_rhel6.4-2.x86_64.rpm is located at: ucs-cxxx-drivers.1.5.1\Linux\Storage\LSI\92xx\RHEL\RHEL6.4

2. Download and transfer the kmod-megaraid_sas-v06.601.06.00_rhel6.4-2.x86_64.rpm driver to the admin node (rhel1).

3. Run the following commands to install the rpm on all nodes of the cluster:

pscp -h /root/allnodes kmod-megaraid_sas-v06.601.06.00_rhel6.4-2.x86_64.rpm /root/
pssh -h /root/allnodes "rpm -ivh kmod-megaraid_sas-v06.601.06.00_rhel6.4-2.x86_64.rpm"

4. Run the following command to verify the version of the megaraid_sas driver in use on each node (confirm all versions are the same):

pssh -h /root/allnodes  "modinfo megaraid_sas | head -5"
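A quick way to confirm that every node reports the same driver version is to reduce the collected output to unique lines; nodes that agree collapse to a single line. The pattern is shown on canned input here (the version strings are illustrative, not from a real cluster):

```shell
# Three nodes' "version:" lines; sort -u shows how many distinct versions exist.
# One unique line means all nodes agree; here two versions remain, so one node differs.
printf 'version: 06.601.06.00\nversion: 06.601.06.00\nversion: 06.504.01.00\n' | sort -u | wc -l
```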

Installing httpd

1. Install httpd on the admin node to host repositories.

The Red Hat repository is hosted using http on the admin node, and this machine is accessible by all the hosts in the cluster.

yum -y install httpd

2. Add ServerName, and make the necessary changes to the server configuration file.

/etc/httpd/conf/httpd.conf
ServerName 10.29.160.53:80

3. Run the following command to set the SELinux context so that httpd can read the repository files:

chcon -R -t httpd_sys_content_t /var/www/html/rhelrepo64

4. Run the following command to start httpd:

service httpd start
chkconfig httpd on

Enabling Syslog

Syslog must be enabled on each node to preserve logs regarding killed processes or failed jobs. Because modern alternatives such as syslog-ng and rsyslog may be used in place of the classic syslogd, confirm which syslog daemon is present and running.

Run any of the commands to confirm if the service is properly configured:

clush -B -a rsyslogd -v
clush -B -a service rsyslog status

Setting Ulimit

On each node, ulimit -n specifies the maximum number of file descriptors a process may have open simultaneously. With the default value of 1024, heavily loaded Hadoop processes can exhaust the limit, producing errors that resemble running out of disk space or inodes. This value should be set to 64000 on every node.

Higher values are unlikely to result in an appreciable performance gain.

1. To set the ulimit on Red Hat, edit /etc/security/limits.conf and add the following lines:

root soft nofile 64000
root hard nofile 64000

Note The ulimit values are applied only to new shells; an existing shell continues to show the old values.


2. To verify the ulimit setting, run the following command:

clush -B -a ulimit -n

The command should report 64000 as the ulimit.

Disabling SELinux

SELinux must be disabled during the Pivotal HD installation procedure and cluster setup. SELinux can be re-enabled after installation, while the cluster is running.

SELinux can be disabled by editing /etc/selinux/config and changing the SELINUX line to SELINUX=disabled.

1. Run the following command to disable SELINUX on all nodes:

pssh -h /root/allnodes "sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config"

pssh -h /root/allnodes "setenforce 0"

Note This command fails if SELinux is already disabled; the failure is harmless.


JDK Installation

Download Java SE 7 Development Kit (JDK)

1. Using a web browser, click on the following link:

http://www.oracle.com/technetwork/java/index.html

2. Download the latest Java™ SE 7 Development Kit (JDK™7). Pivotal HD requires Java 1.7 and above.

3. Once the JDK7 package has been downloaded, place it in the /var/www/html/JDK/ directory.

Install JDK7 on All Nodes

Create the following script install_jdk.sh to install JDK:

Script install_jdk.sh

# Copy and install JDK
cd /tmp/
curl http://10.29.160.53/JDK/jdk-7u45-linux-x64.bin -O -L
sh ./jdk-7u45-linux-x64.bin -noregister

Copy the script install_jdk.sh to all nodes and run it on all nodes:

pscp -h /root/pssh.hosts /root/install_jdk.sh /root/ 
pssh -h /root/pssh.hosts "/root/install_jdk.sh"

Setting TCP Retries

Adjusting the tcp_retries parameter for the system network enables faster detection of failed nodes. Given the advanced networking features of UCS, this is a safe and recommended change (failures observed at the Operating System layer are mostly serious rather than transitory). Setting the number of TCP retries to 5 on each node helps detect unreachable nodes with lower latency.

1. Edit the file /etc/sysctl.conf and add the following line:

net.ipv4.tcp_retries2=5

2. Save the file and run the following command.

clush -B -a sysctl -p

Disabling the Linux Firewall

The default Linux firewall settings are far too restrictive for any Hadoop deployment. Since the Cisco UCS Big Data deployment is performed in the isolated network, there is no need to leave the iptables service running.

1. Run the following command to stop the iptables service on all nodes:

pssh -h /root/allnodes  "service iptables stop"

2. Run the following command to prevent iptables from starting at boot:

pssh -h /root/allnodes "chkconfig iptables off"

Configuring Data Drives on Data Nodes

The first two disk drives are configured for the Operating System on the nodes, rhel1 and rhel2, as shown in "Configuring Disk Drives for Operating System on NameNodes" section. The remaining disk drives can be configured similarly or by using MegaCli.

1. From the LSI website http://www.lsi.com/support/Pages/Download-Results.aspx?keyword=9271-8i, download MegaCli and its dependencies and transfer them to the admin node.

scp  /root/MegaCli64  rhel1:/root/
scp  /root/Lib_Utils-1.00-08.noarch.rpm rhel1:/root/
scp  /root/Lib_Utils2-1.00-01.noarch.rpm rhel1:/root/

2. To copy all three files to all the nodes, run the following commands:

pscp -h  /root/allnodes  /root/MegaCli64  /root/

pscp -h  /root/allnodes  /root/Lib_Utils*  /root/

3. Run the following command to install the rpms on all the nodes:

pssh -h /root/allnodes "rpm -ivh Lib_Utils*"

The first two disk drives are configured for the Operating System on the nodes, rhel4 to rhel64 as shown in "Configuring Disk Drives for Operating System on DataNodes" section. The remaining disk drives can be configured similarly or by using MegaCli.

Run the following command from the admin node to create the virtual drives with RAID 0 configurations on all the DataNodes.

pssh -h /root/datanodes "./MegaCli64 -cfgeachdskraid0 WB RA direct NoCachedBadBBU strpsz1024 -a0"

WB: Write Back

RA: Read Ahead

NoCachedBadBBU: Do not use write cache when the BBU is bad

strpsz1024: Stripe size of 1024 KB


Note The above command will not override existing configurations. To clear and reconfigure the existing configurations, see Embedded MegaRAID Software Users Guide available at: www.lsi.com.


Configuring the Filesystem for DataNodes

This section describes the procedure to configure the filesystem for DataNodes.

1. On the admin node, create a file containing the following script.

To create partition tables and file systems on the local disks of each node, run the following script as the root user on all the nodes.

vi /root/driveconf.sh
#!/bin/bash
disks_count=`lsblk -id | grep sd  | wc -l`
if [ $disks_count -eq 12 ];  then
    echo "Found 12 disks"
else
    echo "Found $disks_count disks. Expecting 12. Exiting.."
    exit 1
fi
[[ "-x" == "${1}" ]] && set -x && set -v && shift 1
for X in /sys/class/scsi_host/host?/scan
do
echo '- - -' > ${X}
done
count=1
for X in /dev/sd?
do
echo $X
if [[ -b ${X} && `/sbin/parted -s ${X} print quit|/bin/grep -c boot` -ne 0 ]]
then
echo "$X bootable - skipping."
continue
else
Y=${X##*/}1
/sbin/parted  -s  ${X} mklabel gpt quit
/sbin/parted  -s  ${X} mkpart 1 6144s 100% quit
/sbin/mkfs.xfs -f -q -l size=65536b,lazy-count=1,su=256k -d sunit=1024,swidth=6144 -r extsize=256k -L ${Y} ${X}1
(( $? )) && continue
/bin/mkdir  -p  /mnt/disk$count
(( $? )) && continue
/bin/mount -t xfs -o allocsize=128m,noatime,nobarrier,nodiratime ${X}1 /mnt/disk$count
(( $? )) && continue
echo "LABEL=${Y} /mnt/disk$count xfs allocsize=128m,noatime,nobarrier,nodiratime 0 0" >> /etc/fstab
((count++))
fi
done

Note This script mounts the 10 non-bootable drives, that is, /dev/sd<b-x>, on /mnt/disk<1-10>. Pivotal HD uses /mnt/disk<1-n> as the default location for formatting and storing HDFS data on a DataNode. This can be overridden during installation.
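The shell parameter expansion the script relies on can be checked in isolation; for each device it derives the partition name (and filesystem label) by stripping the path and appending the partition number. A standalone sketch of that step, using a sample device name:

```shell
# Reproduce the script's label/fstab derivation for one sample device.
X=/dev/sdc              # a data disk as seen by the loop
Y=${X##*/}1             # "##*/" strips the leading path -> sdc, then "1" is appended -> sdc1
count=1
echo "LABEL=${Y} /mnt/disk$count xfs allocsize=128m,noatime,nobarrier,nodiratime 0 0"
```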


2. Run the following command to copy driveconf.sh to all the DataNodes.

pscp -h /root/datanodes /root/driveconf.sh /root/

3. Run the following command from the admin node to run the script across all DataNodes.

pssh -h /root/datanodes "chmod 755 /root/driveconf.sh; /root/driveconf.sh"

Installing Pivotal HD Using Pivotal Command Center

Pivotal Command Center (PCC) provides both a UI and a CLI for installing the Pivotal HD and HAWQ services on the cluster.

This section provides an overview of the installation steps using the GUI approach. Each step is covered in more detail in the documentation links referenced below. Installing Pivotal HD requires installing Pivotal Command Center (PCC) first; Pivotal HD is then configured, deployed, and monitored through the Command Center.

To use CLI interface for installing the Pivotal HD and HAWQ services, refer to "Pivotal HD Enterprise 1.1 Installation and Administrator Guide" on Pivotal at: http://bitcast-a.v1.o1.sjc1.bitgravity.com/greenplum/pivotal-docs/PHD_11_Install_Admin.pdf

To use UI interface for installing the Pivotal HD and HAWQ services, refer to "Pivotal Command Center 2.1 Installation and User Guide" on Pivotal at:

http://bitcast-a.v1.o1.sjc1.bitgravity.com/greenplum/pivotal-docs/PCC_21_User.pdf

The instructions in this section give an overview of the installation steps. For more detailed information on each step, go to the Pivotal links mentioned above.

Role Assignment

The install wizard attempts to assign the master nodes for various services that have been selected to appropriate hosts in the cluster. Reconfigure the service assignment to match Table 7.

Table 7 Role Assignments of Pivotal HD Distribution on CPA v2

Service Name
Host

Pivotal Command Center Admin Server

rhel1

NameNode

rhel1

HA standby Or Secondary NameNode

rhel2

Resource Manager

rhel2

DataNodes

rhel[4-64]

Node Manager Nodes

rhel[4-64]

Zoo Keeper

rhel2, rhel3, rhel4

Hive Server

rhel3

Hive Metastore

rhel3

Hbase Master

rhel3

Region Server

rhel[4-64]

HAWQ Master

rhel3

HAWQ Standby Master

rhel2

HAWQ Segments

rhel[4-64]



Note On a small cluster (<16 nodes), consolidate all master services to run on two nodes.

One or more nodes in the cluster, or a server outside the cluster, can be configured as a Hadoop client.

If the applications running on the Hadoop client node are not resource intensive, they do not require a dedicated node. This role can be collocated with one of the DataNodes or with a standby master.

Installing Command Center

1. Copy the Command Center tar file to your host. For example:

# scp ./PCC-2.1.x.version.build.os.x86_64.tar.gz host:/root/phd/

2. Log into the Command Center admin host as root user. cd to the directory where the Command Center tar files are located and untar. For example:

# cd /root/phd

# tar --no-same-owner -zxvf PCC-2.1.x.version.build.os.x86_64.tar.gz

3. Still as the root user, run the installation script. This installs the required packages, configures Pivotal Command Center, and starts its services.


Note The installation script must be run from the directory into which the tarball was extracted, for example: PCC-2.1.x.version.


Example:

# ls
PCC-2.1.x.version
PCC-2.1.x.version.build.os.x86_64.tar.gz
# cd PCC-2.1.x.version
# ./install

The installation progress is displayed on the screen. Once the installation completes successfully, a success message is displayed.

Once the cluster is configured and deployed, the cluster status can be viewed at the following URL: https://<CommandCenterHost>:5443/status

4. Enable Secure Connections:

Pivotal Command Center uses HTTPS to secure data transmission between the client browser and the server. By default, the installation script generates a self-signed certificate. Alternatively, you can provide your own certificate and key by following these steps:

Set the ownership of the certificate file and key file to gpadmin.

Change the permission to owner read-only (mode 400)

Edit the PCC configuration file /usr/local/greenplum-cc/config/commander as follows:

Change the path referenced in the variable PCC_SSL_KEY_FILE to point to user key file.

Change the path referenced in the variable PCC_SSL_CERT_FILE to point to user certificate file.

Restart PCC with the following command:

service commander restart

5. Verify that your PCC instance is running by executing the following command:

$ service commander status

6. From now on you can switch to the gpadmin user. You should no longer need to be root for anything else.

su - gpadmin

Repo to Install PHD Services

Once you have Pivotal Command Center installed, you need to import and enable the PHD services (PHD, PHDTools, and HAWQ). You can use the import utility to sync the RPMs from the specified source location into the Pivotal Command Center (PCC) local yum repository of the Admin Node. This allows the cluster nodes to access the RPMs.

1. Copy the Pivotal HD, ADS, and PHDTools tarballs from the initial download location to the gpadmin home directory.

2. Change the owner of the packages to gpadmin and untar the tarballs. For example:

For PHD, if the file is a tar.gz or tgz, use:

tar zxf PHD-1.1.x-x.tgz

If the file is a tar, use:

tar xf PHD-1.1.x-x.tar

For Pivotal ADS, if the file is a tar.gz or tgz, use

tar zxf PADS-1.1.x-x.tgz

If the file is a tar, use:

tar xf PADS-1.1.x-x.tar

For PHDTools, if the file is a tar.gz or tgz, use

tar zxf PHDTools-1.1.x-x.tgz

If the file is a tar, use:

tar xf PHDTools-1.1.x-x.tar

Enable the PHD Services

1. As gpadmin, import the extracted Pivotal HD tarball:

# icm_client import -s <PATH TO EXTRACTED PHD TAR BALL>

For example:

# icm_client import -s PHD-1.1.x-x/

2. Optional for HAWQ: As gpadmin, import the extracted HAWQ and PXF (PADS) tarball:

# icm_client import -s <PATH TO EXTRACTED PADS TAR BALL>

For example:

# icm_client import -s PADS-1.1.x-x/

For more information, see the log file located at:

/var/log/gphd/gphdmgr/gphdmgr-import.log

3. Optional for USS: As gpadmin, import the extracted PHDTools tarball:

# icm_client import -s <PATH TO EXTRACTED PHDTools TAR BALL>

For example:

# icm_client import -s PHDTools-1.1.x-x/

For more information, see the log file located at:

/var/log/gphd/gphdmgr/gphdmgr-import.log

Launching Pivotal Command Center

Launch a browser and navigate to the host on which Command Center was installed. For example:

https://rhel1:5443

The Command Center login page is launched in the browser. The default username/password is gpadmin/Gpadmin1.

Configuring and Deploying a Cluster

After logging into Pivotal Command Center, the Cluster Status page appears. From here, you can launch the Add Cluster Wizard to configure and deploy a Pivotal HD cluster, as follows:

1. Click Add Cluster. The Add Cluster Wizard opens as shown in Figure 71.

Figure 71 Adding Cluster: Creating Cluster Definition Window

The Wizard allows you to create a new configuration from scratch, or to upload and edit an existing configuration. The Summary panel on the right shows the progress of cluster configuration and deployment.

2. Create Cluster Definition:

If you are configuring a new cluster, select Create a new Cluster Definition then click Next.

If you are editing an existing cluster, select Upload Cluster Configuration, click Upload, then navigate to the clusterConfig.xml file that needs to be edited; then click Next. In this case, the following fields in the Wizard will be populated with the cluster definition properties of that clusterConfig.xml file that you have just uploaded. Follow these instructions to edit the values.

3. Specify the Versions, Services and Hosts, as shown in Figure 72.

Figure 72 Adding Cluster: Versions, Services and Hosts Window (part 1)

4. If you are editing an existing configuration, some if not all these fields would have been populated. Edit wherever appropriate.


Note You need to scroll down to view all the fields on this screen. The Next button will not be active until you have entered all the required fields.


a. In Name field (required), enter a desired name for the cluster.


Note Special characters are not supported.


b. In the Hosts field (required), enter a newline-separated list of FQDN host names. You can also click Upload to use a text file containing a newline-separated list of host names.

c. In the Root Password field (required), enter the root password.

d. In the GP Admin Password field (required), enter the gpadmin user password. Command Center creates this user on all nodes.

e. In the JDK Path field, enter the JDK filename (not the absolute path). For example: jdk-7u45-linux-x64-rpm.bin.


Note JDK 1.7 is a prerequisite. If not already installed, you can install using the command icm_client import -f.


f. Check the Setup NTP checkbox if you want to set up NTP (Network Time Protocol).

g. Check the Disable SELinux checkbox if you want to disable SELinux (recommended).

h. Check the Disable IPTables checkbox if you want to disable IPTables (recommended).

i. Check the Run ScanHosts checkbox if you want to run scanhosts. The scanhosts command verifies the prerequisites on the cluster nodes and provides a detailed report of any that are missing. Running this command helps ensure that the cluster deploys smoothly.

Click Next.

Figure 73 Adding Cluster: Versions, Services and Hosts Window (part 2)

5. Host Verification

The Host Verification page opens. This step, which may take a few minutes, verifies connections to the hosts that were just specified. Once the Eligibility field changes from Pending to Eligible for all hosts, you can click Next. Errors and other related information are displayed in the Comments field.

Click Next.

Figure 74 Adding Cluster: Host Verification Window

6. Topology

This is the section where you specify the roles to be installed on the hosts. Follow suggestions in Table 7 to assign roles to the Cluster.

Figure 75 Adding Cluster: Topology Window


Note All mandatory roles should have at least one host allocated.


Each service has its own section on this page; you can use the top menu options as shortcuts to those sections on the page, or simply scroll down to each section.

Type the text in the appropriate text boxes and press Enter or Tab; the text changes appearance and appears enclosed in a box, which means that the entry has been accepted.

At any point during this stage you can click Save Configuration at the top right of the page. This saves the configuration file and downloads it. Once saved, a link to the configuration file appears at the bottom of the page. Click on the link to open and view the clusterConfig.xml file.


Note You cannot edit this xml file directly.


These are the roles that need to have installation nodes defined:

a. CLIENT: ICM installs Pig, Hive, HBase, and Mahout libraries on this host.

b. HDFS: Name Node, Secondary Name Node, Data Nodes

c. YARN: Resource Manager, History Server, Node Managers

d. Zookeeper: Zookeeper Server

e. HBase: Hbase Master, HBase Region Servers.

f. Hive: Hive Master, Hive Metastore

g. HAWQ: Primary Node, Secondary Node, HAWQ Segment Nodes

h. USS: Name Node and Catalog

i. PXF: No hosts to configure. Installed on the client host.

j. Mahout: No hosts to configure. Installed on the client host.

k. Pig: No hosts to configure. Installed on the client host.

Click Next once you have finished role-mapping.

7. Cluster Configuration

This page displays a list of all configuration files that define this cluster; the clusterConfig.xml (to edit service configuration global values) as well as the service specific configuration files.

All these configuration files are populated with the values that were already entered or with the default values.

a. Click on any file name to open that configuration file in an editor to enter or edit values.

b. If you make any changes, click Save to return to the Cluster Configuration page.

Figure 76 Adding Cluster: Cluster Configuration Window

Configuration Directives for PHD Services

This sub-section suggests the specific configuration settings to be used when deploying the following PHD services using Pivotal Command Center; these settings are primarily based on the CPA v2 cluster hardware configuration. Pivotal Command Center provides optimal values for the rest of the configuration, which you can also review and change as necessary.

The following PHD services are considered for configuration tuning:

HDFS

Mapreduce & Yarn

HAWQ

Use the options listed in Table 8 when deploying a new cluster, or stop all services before reconfiguring an existing cluster. Pivotal Command Center uses optimal configuration settings for the parameters not specified in Table 8, but you can always review and change them per your needs.

Cluster Services - Global Configuration Variables (clusterConfig.xml)

Table 8 provides the list of global configuration variables of the cluster services.

Table 8 Global Configuration Variables

Property
Value

datanode.disk.mount.points

/data/sdc1,/data/sdd1,/data/sde1,/data/sdf1,/data/sdg1,/data/sdh1,/data/sdi1,/data/sdj1,/data/sdk1,/data/sdl1

namenode.disk.mount.points

/data/sdf1,/data/sdg1,/data/sdm1

secondary.namenode.disk.mount.points

/data/sdc1,/data/sdd1,/data/sde1

yarn.nodemanager.resource.memory-mb

65536

yarn.scheduler.minimum-allocation-mb

1024

hawq.segment.directory

/data/sdc1/primary /data/sdd1/primary /data/sde1/primary /data/sdf1/primary /data/sdg1/primary /data/sdh1/primary /data/sdi1/primary /data/sdj1/primary

hawq.master.directory

/data1/master
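The two YARN memory values in Table 8 determine how many containers each DataNode can run: 65536 MB of NodeManager memory divided by the 1024 MB minimum allocation allows at most 64 minimum-size containers per node. A one-line check of the arithmetic:

```shell
# Minimum-size containers per NodeManager implied by the Table 8 settings.
echo $(( 65536 / 1024 ))
```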


Figure 77 Adding Cluster: Cluster Configuration Window (part 2)

HDFS

hadoop-env.sh

Set the following variables in hadoop-env.sh with values as specified:

Figure 78 Adding Cluster: Cluster Configuration Window (part 3)


Note HADOOP_LOG_DIR will be automatically substituted by Pivotal Command Center to the default services log path.


export HADOOP_HEAPSIZE=1024
export HADOOP_NAMENODE_HEAPSIZE=49152
export HADOOP_DATANODE_HEAPSIZE=6144
# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true ${HADOOP_OPTS}"
# Extra ssh options. Empty by default.
export HADOOP_SSH_OPTS="-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR"
# Set Hadoop-specific environment variables here.
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote 
-Xms${HADOOP_NAMENODE_HEAPSIZE}m -Xmx${HADOOP_NAMENODE_HEAPSIZE}m 
-Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT 
-XX:ParallelGCThreads=8 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -verbose:gc 
-XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCTimeStamps 
-XX:+PrintGCDateStamps -Xloggc:${HADOOP_LOG_DIR}/hadoop-hdfs-namenode-`date 
+'%Y%m%d%H%M'`.gclog -XX:ErrorFile=${HADOOP_LOG_DIR}/hs_err_pid%p.log 
$HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote 
-Xms${HADOOP_NAMENODE_HEAPSIZE}m -Xmx${HADOOP_NAMENODE_HEAPSIZE}m 
-Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT 
-XX:ParallelGCThreads=8 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -verbose:gc 
-XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCTimeStamps 
-XX:+PrintGCDateStamps -Xloggc:${HADOOP_LOG_DIR}/hadoop-hdfs-secondary-namenode-`date 
+'%Y%m%d%H%M'`.gclog -XX:ErrorFile=${HADOOP_LOG_DIR}/hs_err_pid%p.log 
$HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote 
-Xms${HADOOP_DATANODE_HEAPSIZE}m -Xmx${HADOOP_DATANODE_HEAPSIZE}m 
-Dhadoop.security.logger=ERROR,DRFAS $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote -server 
-Xmx${HADOOP_HEAPSIZE}m $HADOOP_BALANCER_OPTS"
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx${HADOOP_HEAPSIZE}m $HADOOP_CLIENT_OPTS"

hdfs-site.xml

Add or reset the values for the following properties in the hdfs-site.xml.

Table 9 Adding/ Resetting Values in hdfs-site.xml

Property
Value

dfs.stream-buffer-size

131072

dfs.datanode.failed.volumes.tolerated

5


Figure 79 Adding Cluster: Cluster Configuration Window (part 4)

YARN

yarn-env.sh

Set the following values for Yarn Resource Manager and Node Manager in the yarn-env.sh file.

Figure 80 Adding Cluster: Cluster Configuration Window (part 5)


Note The YARN_LOG_DIR will be automatically substituted by Pivotal Command Center to the default yarn daemon log path.


export YARN_RESOURCEMANAGER_HEAPSIZE=4096
export YARN_NODEMANAGER_HEAPSIZE=2048
# Common JVM settings for resource manager and node managers
export YARN_OPTS="$YARN_OPTS -server -Djava.net.preferIPv4Stack=true 
-XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps 
-XX:ErrorFile=${YARN_LOG_DIR}/hs_err_pid%p.log"
# Yarn resource manager related jvm settings
export YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS  
-Xloggc:${YARN_LOG_DIR}/hadoop-yarn-resourcemanager-`date +'%Y%m%d%H%M'`.gclog"
# Yarn node manager related jvm settings
export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS 
-Xloggc:${YARN_LOG_DIR}/hadoop-yarn-nodemanager-`date +'%Y%m%d%H%M'`.gclog"

Once you have completed all your edits, click Deploy.

8. Deployment

a. After you click Deploy, this window shows the progress of the deployment; the information displayed includes hostname, status, role, and other messages.

b. Once the deployment is complete, click Next.

9. Summary and Topology

Once your cluster has successfully deployed, you can view a summary of the cluster and its topology by clicking Status and choosing Topology; the view includes the various roles selected, as shown in Figure 81.

Figure 81 Cluster Summary and Topology

Post Installation for HAWQ

You need to exchange SSH keys between HAWQ Master and Segment Nodes to complete HAWQ installation.

1. Create a hostfile (HAWQ_Segment_Hosts.txt) that contains the hostnames of all your HAWQ segments.

2. As gpadmin, execute the following commands from the HAWQ Master.

# ssh <HAWQ_MASTER>
# source /usr/local/hawq/greenplum_path.sh
# /usr/local/hawq/bin/gpssh-exkeys -f ./HAWQ_Segment_Hosts.txt
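Per the role assignment in Table 7, the HAWQ segments run on rhel4 through rhel64, so the hostfile in step 1 can be generated rather than typed (a sketch; the file is written to the working directory):

```shell
# Generate the HAWQ segment host list (rhel4..rhel64 = 61 hosts).
for i in $(seq 4 64); do echo "rhel$i"; done > HAWQ_Segment_Hosts.txt
wc -l < HAWQ_Segment_Hosts.txt
```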

Starting the Cluster

To start the cluster, click Actions > Start on the Cluster Status page. The cluster can also be started from the PCC admin server (rhel1 in this case) using the command:

# icm_client start -l <cluster-name>

To list the cluster details, run the following from the PCC admin server:

# icm_client list

Initializing HAWQ

As gpadmin, ssh to the HAWQ master, then run the following:

# source /usr/local/hawq/greenplum_path.sh
# /etc/init.d/hawq init

You have now completed your cluster configuration and deployment.

Pivotal Command Center Dashboard

The Pivotal Command Center UI can now be used to review the status of the cluster. The dashboard gives a high-level view of the cluster at a glance: you can view the status of the most important cluster services, such as HDFS and YARN, and start and stop each service individually. It also shows visually how the most important cluster metrics are trending.

The graphs provide a unified view of the state of your system. They are also useful in detecting outliers and pinpointing specific problems that may be present in the cluster.

Figure 82 Pivotal Command Center: Dashboard

Conclusion

Hadoop has become a popular data management platform across all verticals. The Cisco CPA v2 for Big Data with Pivotal HD and HAWQ offers true SQL processing on enterprise-grade Hadoop with YARN. Further, it offers a dependable deployment model that provides a fast and predictable path for businesses to unlock value in big data.

The configuration detailed in this document can be extended to clusters of various sizes depending on application demands. Up to 160 servers (10 racks) can be supported with no additional switching in a single UCS domain. Each additional rack requires two Cisco Nexus 2232PP 10GigE Fabric Extenders and 16 Cisco UCS C240 M3 Rack-Mount Servers. Scaling beyond 10 racks (160 servers) can be implemented by interconnecting multiple UCS domains using Nexus 6000/7000 Series switches, scalable to thousands of servers and hundreds of petabytes of storage, and managed from a single pane using Cisco UCS Central.

Bill of Material

This section provides the hardware and software components used in this design for deploying the 64-node Capacity Optimized Cluster for Pivotal HD and HAWQ.

Table 10 provides the BOM for the base rack of the Capacity Optimized Cluster. Table 11 provides the BOM for the expansion racks (racks 2 to 4) of the Capacity Optimized Cluster.

Table 10 Bill of Material for Base Rack for Capacity Optimized Cluster

Part Number | Description | Quantity
UCS-SL-CPA2-C | Big Data Capacity Optimized Cluster | 1
UCSC-C240-M3S | UCS C240 M3 SFF w/o CPU mem HD PCIe w/ rail kit expdr | 16
UCS-RAID9271CV-8I | MegaRAID 9271CV with 8 internal SAS/SATA ports with Supercap | 16
UCSC-PCIE-CSC-02 | Cisco VIC 1225 Dual Port 10Gb SFP+ CNA | 16
CAB-N5K6A-NA | Power Cord 200/240V 6A North America | 32
UCSC-PSU2-1200 | 1200W 2U Power Supply For UCS | 32
UCSC-RAIL-2U | 2U Rail Kit for UCS C-Series servers | 16
UCSC-HS-C240M3 | Heat Sink for UCS C240 M3 Rack Server | 32
UCSC-PCIF-01F | Full height PCIe filler for C-Series | 48
UCS-CPU-E52640B | 2.00 GHz E5-2640 v2/95W 8C/20MB Cache/DDR3 1600MHz | 32
UCS-MR-1X082RZ-A | 8GB DDR3-1866-MHz RDIMM/PC3-14900/dual rank/x4/1.5v | 256
UCS-HD4T7KS3-E | 4TB SAS 7.2K RPM 3.5 inch HDD/hot plug/drive sled mounted | 192
UCS-SL-BD-FI96 | Cisco UCS 6296 FI w/ 18p LIC, Cables Bundle | 2
N2K-UCS2232PF | Cisco Nexus 2232PP with 16 FET (2 AC PS, 1 FAN (Std Airflow)) | 2
SFP-H10GB-CU3M= | 10GBASE-CU SFP+ Cable 3 Meter | 28
RACK-UCS2 | Cisco R42610 standard rack w/side panels | 1
RP208-30-1P-U-2= | Cisco RP208-30-U-2 Single Phase PDU 20x C13 4x C19 (Country Specific) | 2
CON-UCW3-RPDUX | UC PLUS 24X7X4 Cisco RP208-30-U-X Single Phase PDU 2x (Country Specific) | 6


Table 11 Bill of Material for Expansion Racks for Capacity Optimized Cluster

Part Number | Description | Quantity
UCSC-C240-M3L | UCS C240 M3 SFF w/o CPU mem HD PCIe PSU w/ rail kit expdr | 48
UCS-RAID9271CV-8I | MegaRAID 9271CV with 8 internal SAS/SATA ports with Supercap | 48
UCSC-PCIE-CSC-02 | Cisco VIC 1225 Dual Port 10Gb SFP+ CNA | 48
CAB-N5K6A-NA | Power Cord 200/240V 6A North America | 96
UCSC-PSU2-1200 | 1200W 2U Power Supply For UCS | 96
UCSC-RAIL-2U | 2U Rail Kit for UCS C-Series servers | 48
UCSC-HS-C240M3 | Heat Sink for UCS C240 M3 Rack Server | 96
UCSC-PCIF-01F | Full height PCIe filler for C-Series | 144
UCS-CPU-E52640B | 2.00 GHz E5-2640 v2/95W 8C/20MB Cache/DDR3 1600MHz | 96
UCS-MR-1X082RZ-A | 8GB DDR3-1866-MHz RDIMM/PC3-14900/dual rank/x4/1.5v | 768
UCS-HD4T7KS3-E | 4TB SAS 7.2K RPM 3.5 inch HDD/hot plug/drive sled mounted | 576
N2K-UCS2232PF | Cisco Nexus 2232PP with 16 FET (2 AC PS, 1 FAN (Std Airflow)) | 6
CON-SNTP-UCS2232 | SMARTNET 24X7X4 Cisco Nexus 2232PP | 6
SFP-H10GB-CU3M= | 10GBASE-CU SFP+ Cable 3 Meter | 84
RACK-UCS2 | Cisco R42610 standard rack w/side panels | 3
RP208-30-1P-U-2= | Cisco RP208-30-U-2 Single Phase PDU 20x C13 4x C19 (Country Specific) | 6
CON-UCW3-RPDUX | UC PLUS 24X7X4 Cisco RP208-30-U-X Single Phase PDU 2x (Country Specific) | 18


Table 12 provides the BOM for the master rack of the Capacity Optimized for Pivotal HD and HAWQ Cluster. Table 13 provides the BOM for its expansion racks (racks 2 to 4). Table 14 and Table 15 describe the BOM for the software components.

Table 12 Bill of Material for Base Rack for Capacity Optimized for Pivotal HD and HAWQ Cluster

Part Number | Description | Quantity
UCS-SL-CPA2-C | Big Data Capacity Optimized Cluster | 1
UCSC-C240-M3S | UCS C240 M3 SFF w/o CPU mem HD PCIe w/ rail kit expdr | 16
UCS-RAID9271CV-8I | MegaRAID 9271CV with 8 internal SAS/SATA ports with Supercap | 16
UCSC-PCIE-CSC-02 | Cisco VIC 1225 Dual Port 10Gb SFP+ CNA | 16
CAB-N5K6A-NA | Power Cord 200/240V 6A North America | 32
UCSC-PSU2-1200 | 1200W 2U Power Supply For UCS | 32
UCSC-RAIL-2U | 2U Rail Kit for UCS C-Series servers | 16
UCSC-HS-C240M3 | Heat Sink for UCS C240 M3 Rack Server | 32
UCSC-PCIF-01F | Full height PCIe filler for C-Series | 48
UCS-CPU-E52670B | 2.50 GHz E5-2670 v2/115W 10C/25MB Cache/DDR3 1866MHz (Spare) | 32
UCS-MR-1X082RZ-A | 8GB DDR3-1866-MHz RDIMM/PC3-14900/dual rank/x4/1.5v | 256
UCS-HD4T7KS3-E | 4TB SAS 7.2K RPM 3.5 inch HDD/hot plug/drive sled mounted | 192
UCS-SL-BD-FI96 | Cisco UCS 6296 FI w/ 18p LIC, Cables Bundle | 2
N2K-UCS2232PF | Cisco Nexus 2232PP with 16 FET (2 AC PS, 1 FAN (Std Airflow)) | 2
SFP-H10GB-CU3M= | 10GBASE-CU SFP+ Cable 3 Meter | 28
RACK-UCS2 | Cisco R42610 standard rack w/side panels | 1
RP208-30-1P-U-2= | Cisco RP208-30-U-2 Single Phase PDU 20x C13 4x C19 (Country Specific) | 2
CON-UCW3-RPDUX | UC PLUS 24X7X4 Cisco RP208-30-U-X Single Phase PDU 2x (Country Specific) | 6


Table 13 Bill of Material for Expansion Racks for Capacity Optimized for Pivotal HD and HAWQ Cluster

Part Number | Description | Quantity
UCSC-C240-M3L | UCS C240 M3 SFF w/o CPU mem HD PCIe PSU w/ rail kit expdr | 48
UCS-RAID9271CV-8I | MegaRAID 9271CV with 8 internal SAS/SATA ports with Supercap | 48
UCSC-PCIE-CSC-02 | Cisco VIC 1225 Dual Port 10Gb SFP+ CNA | 48
CAB-N5K6A-NA | Power Cord 200/240V 6A North America | 96
UCSC-PSU2-1200 | 1200W 2U Power Supply For UCS | 96
UCSC-RAIL-2U | 2U Rail Kit for UCS C-Series servers | 48
UCSC-HS-C240M3 | Heat Sink for UCS C240 M3 Rack Server | 96
UCSC-PCIF-01F | Full height PCIe filler for C-Series | 144
UCS-CPU-E52670B | 2.50 GHz E5-2670 v2/115W 10C/25MB Cache/DDR3 1866MHz (Spare) | 96
UCS-MR-1X082RZ-A | 8GB DDR3-1866-MHz RDIMM/PC3-14900/dual rank/x4/1.5v | 768
UCS-HD4T7KS3-E | 4TB SAS 7.2K RPM 3.5 inch HDD/hot plug/drive sled mounted | 576
N2K-UCS2232PF | Cisco Nexus 2232PP with 16 FET (2 AC PS, 1 FAN (Std Airflow)) | 6
CON-SNTP-UCS2232 | SMARTNET 24X7X4 Cisco Nexus 2232PP | 6
SFP-H10GB-CU3M= | 10GBASE-CU SFP+ Cable 3 Meter | 84
RACK-UCS2 | Cisco R42610 standard rack w/side panels | 3
RP208-30-1P-U-2= | Cisco RP208-30-U-2 Single Phase PDU 20x C13 4x C19 (Country Specific) | 6
CON-UCW3-RPDUX | UC PLUS 24X7X4 Cisco RP208-30-U-X Single Phase PDU 2x (Country Specific) | 18


Table 14 Red Hat Enterprise Linux License

Part Number | Description | Quantity
RHEL-2S-1G-3A | Red Hat Enterprise Linux | 64
CON-ISV1-RH2S1G3A | 3-year Support for Red Hat Enterprise Linux | 64


Table 15 Pivotal HD and HAWQ

Product | Description | Quantity
Pivotal - HD | Pivotal HD | 64
HAWQ | HAWQ | 64