Cisco Data Intelligence Platform powered by AMD and Cloudera Data Platform Solution Overview

Updated: July 13, 2020

Solution highlights

Proven Cisco Unified Computing System (Cisco UCS®) architecture foundation

Cisco UCS Integrated Infrastructure for Big Data and Analytics is a proven platform for enterprise analytics applications. The Cisco UCS platform offers complete integration of computing, networking, and storage resources with unified management, providing easy, linear scalability of the architecture.

Linear scalability: from small to very large

Using the Cisco Application Centric Infrastructure (Cisco ACI®) platform, a cluster can be scaled to thousands of nodes. The Cisco ACI platform implements an application-aware, policy-based approach that treats the network as a single entity rather than a collection of switches.

Ease of deployment with UCS Manager and Cisco Intersight

Cisco UCS Manager simplifies infrastructure deployment with an automated, policy-based mechanism that helps reduce configuration errors and system downtime. It offers proven, high-performance, linear scalability across single- and multiple-rack deployments. In addition to Cisco UCS Manager, the Cisco Intersight platform provides intelligent cloud-powered infrastructure management for the Cisco UCS and Cisco HyperFlex platforms.

Modular design

The Cisco UCS C4200 Series Rack Server Chassis with Cisco UCS C125 M5 Rack Server nodes powered by AMD EPYC processors is optimized for use in environments requiring dense computing form factors and high core densities, such as scale-out, computing-intensive, general service provider, and bare-metal applications.

Platform for enterprise data lake with Cloudera Data Platform (CDP)

The CDP product suite addresses the complete needs of data in motion and data at rest. CDP offers a scalable, fault-tolerant, and cost-efficient platform.

It provides consistent data management, governance, and security capabilities while delivering robust analytics from real-time customer applications that accelerate decision making and innovation.

Cisco and CDP Solution for Big Data Analytics

Data is an organization’s future and its most valuable asset. The accumulation of data from multiple sources, such as sensors, Internet of Things (IoT) devices, social networking, and online transactions, is growing at an unprecedented rate. These sources generate data that must be captured, monitored, and rapidly processed, regardless of origin, so that organizations can make informed, timely decisions. A major challenge for enterprises is managing all this data as its volume, variety, and growth rate reach levels rarely seen in the past.

Big data and analytics architectures are making this data increasingly available for processing. With both data in motion and data at rest now readily available, customers are discovering that they can extract real business value from insights gained through data analytics and data science.

Data lakes enhance and amplify existing IT investments and provide new ways to create business value. They also provide a cost-effective and technologically feasible way to meet today’s big data challenges. They enable the storage and analysis of large volumes and wide varieties of data for both real-time and batch processing. Building a next-generation data lake architecture requires simplified and centralized management, high performance, and a linearly scaling infrastructure and software platform.

Cisco UCS and CDP together create a solution that helps enterprises transform their businesses by unlocking the full potential of big data. Cisco UCS Integrated Infrastructure for Big Data and Analytics with Cloudera Data Platform (CDP) powers the next-generation architecture for big data systems, encompassing a myriad of use cases, including IoT, fraud analytics, and precision medicine through genome sequencing (Figure 1).

Figure 1. Cloudera Data Platform

Cisco Data Intelligence Platform

The Cisco Data Intelligence Platform (CDIP) supports today’s evolving architecture. It brings together a fully scalable infrastructure with centralized management and a fully supported software stack (in partnership with industry leaders in the relevant areas) for each of the three independently scalable components of the architecture: the data lake, AI/ML technologies, and object stores.

Hadoop 3.0 introduced Docker support along with GPU isolation and scheduling. This opened up many opportunities for modern applications, such as microservices and distributed applications running on thousands of containers, to execute AI/ML algorithms on petabytes of data quickly and with ease. CDIP is fully capable of addressing the needs of these applications, whether they are managed by YARN or Kubernetes.
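As a rough illustration of what enabling these Hadoop 3 features involves on the YARN side, the sketch below renders a handful of properties into the Hadoop XML configuration format. The property names are taken from the Apache Hadoop 3 YARN documentation as best understood here and should be checked against the Hadoop or CDP release in use. The helper `to_hadoop_xml` and the two dictionaries are hypothetical names introduced only for this example, and the result is not a complete or validated configuration (a working setup also needs container-executor settings, and in a CDP cluster such settings are normally managed through Cloudera Manager rather than edited by hand).

```python
import xml.etree.ElementTree as ET

# Assumed yarn-site.xml properties for Docker containers and GPU scheduling.
# Names follow the Apache Hadoop 3 documentation; verify for your release.
YARN_SITE = {
    "yarn.nodemanager.container-executor.class":
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor",
    "yarn.nodemanager.runtime.linux.allowed-runtimes": "default,docker",
    "yarn.nodemanager.resource-plugins": "yarn.io/gpu",
}

# Assumed resource-types.xml property that declares GPUs as a resource type.
RESOURCE_TYPES = {
    "yarn.resource-types": "yarn.io/gpu",
}


def to_hadoop_xml(props):
    """Render a name -> value mapping as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hadoop_xml(YARN_SITE))
    print(to_hadoop_xml(RESOURCE_TYPES))
```

Keeping the properties in plain dictionaries makes it easy to review or diff the intended settings before they are applied through whichever management tool the cluster uses.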

As the Hadoop ecosystem continues to evolve, newer frameworks and technologies such as Apache Submarine and Spark 3.0 further complement it. CDIP offers an extremely adaptable architecture that evolves as the underlying technologies, platforms, and frameworks change, resulting in investment protection (Figure 2).

Figure 2. Cisco Data Intelligence Platform

Figure 3. Cisco UCS reference architecture

Reference architecture

The reference architecture for the Cisco UCS C4200 Series Rack Server Chassis with Cisco UCS C125 M5 Rack Server nodes (Figure 3), powered by AMD EPYC processors and the CDP big data distribution, is designed and tested to help ensure a balance between performance and capacity. It can scale out to meet big data and analytics requirements. It can expand to thousands of servers with Cisco Nexus® 9000 Series Switches using the Cisco Application Policy Infrastructure Controller (APIC) with a leaf-and-spine design using the Cisco ACI platform. This next-generation infrastructure can be deployed to meet a wide variety of computing, storage, and connectivity requirements.

Cisco UCS 6454 Fabric Interconnects

The Cisco UCS 6454 Fabric Interconnect is a core part of Cisco UCS, providing both network connectivity and management capabilities for the system. The Cisco UCS 6454 offers line-rate, low-latency, lossless 10, 25, 40, and 100 Gigabit Ethernet, Fibre Channel over Ethernet (FCoE), and Fibre Channel functions. The Cisco UCS 6454 provides the management and communication backbone for the Cisco UCS B-Series Blade Servers, Cisco UCS 5108 B-Series Server Chassis, Cisco UCS Managed C-Series Rack Servers, and Cisco UCS S-Series Storage Servers. All servers attached to the Cisco UCS 6454 Fabric Interconnect become part of a single, highly available management domain. In addition, by supporting a unified fabric, the Cisco UCS 6454 provides both the LAN and SAN connectivity for all servers within its domain.

Cisco UCS C4200 Rack Server Chassis

The Cisco UCS C4200 Series Rack Server Chassis is a modular, density-optimized rack-server chassis that supports:

  • Up to four UCS C125 M5 Rack Server Nodes and up to 256 cores per chassis with AMD EPYC processors for environments requiring dense compute form factors and high core densities, such as scale-out, compute-intensive, general service provider, and bare-metal applications.

  • 24 small-form-factor (SFF) drives. The drive bays are allocated so that each rack server node has access to six SAS or SATA SSDs, or up to four SSDs and two NVMe drives.

Cisco UCS C125 M5 Rack Server

The Cisco UCS C125 M5 Rack Server Node has the highest number of cores commercially available in a multi-node system. Each node has two sockets with from 8 to 32 cores per processor, supporting AMD EPYC 7000 series processors. It provides 16 DIMM slots for 2666 MHz DDR4 DIMMs, with capacity points of up to 128 GB per slot for a total of 2 TB per socket, up to two half-height/half-length PCI Express (PCIe) 3.0 slots, and an optional M.2 SSD module. The C125 supports either SAS RAID through a PCIe 12-G SAS storage controller card or SATA directly from the AMD EPYC processor. The node also includes a dedicated internal LAN mezzanine slot based on the OCP 2.0 standard, supporting networking speeds up to 100 Gbps. Additionally, a fourth-generation Cisco PCIe virtual interface card (VIC) can be installed in the x16 PCIe 3.0 slot.
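The chassis and node figures quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the class `ChassisSpec` and the function `chassis_needed` are hypothetical names, the default values are the maximums stated in this document, and the sizing helper ignores everything except core count (no allowance for memory, storage, or headroom).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChassisSpec:
    """Maximum figures for one C4200 chassis, as quoted in this document."""
    nodes_per_chassis: int = 4
    sockets_per_node: int = 2
    max_cores_per_processor: int = 32
    drive_bays: int = 24
    dimm_slots: int = 16
    max_gb_per_dimm: int = 128

    def max_cores(self) -> int:
        # 4 nodes x 2 sockets x 32 cores
        return (self.nodes_per_chassis
                * self.sockets_per_node
                * self.max_cores_per_processor)

    def drive_bays_per_node(self) -> int:
        # The bays are split evenly across the nodes.
        return self.drive_bays // self.nodes_per_chassis

    def max_memory_gb(self) -> int:
        # Capacity of the 16 DIMM slots described, fully populated.
        return self.dimm_slots * self.max_gb_per_dimm


def chassis_needed(target_cores: int, spec: ChassisSpec = ChassisSpec()) -> int:
    """Smallest number of fully populated chassis reaching target_cores."""
    per_chassis = spec.max_cores()
    return -(-target_cores // per_chassis)  # ceiling division


if __name__ == "__main__":
    spec = ChassisSpec()
    print(spec.max_cores())            # 256, matching the per-chassis figure
    print(spec.drive_bays_per_node())  # 6, matching six drives per node
    print(spec.max_memory_gb())        # 2048 GB, that is, 2 TB
    print(chassis_needed(1000))        # 4
```

A core-only estimate like this is a starting point for rack planning; real sizing would also weigh memory, drive capacity, and network uplinks.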

AMD EPYC 7000 Series processor

Designed from the ground up for a new generation of solutions, AMD EPYC server processors implement a philosophy of choice without restriction. Choose the number of cores that meets your needs without sacrificing key features such as memory and I/O. Each EPYC processor can have from 8 to 64 cores with access to an exceptional amount of I/O and memory, regardless of the number of cores in use. The processors include 128 PCIe Generation 3 lanes and support for up to 2 TB of high-speed memory per socket. The innovative AMD EPYC architecture provides outstanding performance. I/O-intensive workloads can use the plentiful I/O bandwidth with the right number of cores, helping organizations avoid overpaying for unnecessary power. Computing-intensive workloads can make use of fully loaded core counts, dual sockets, and plenty of memory.

Cloudera Data Platform

CDP is an integrated data platform that is easy to deploy, manage, and use. By simplifying operations, CDP reduces the time to onboard new use cases across the organization. It uses machine learning to intelligently auto-scale workloads up and down for more cost-effective use of cloud infrastructure.

The Cloudera Data Platform (CDP) Data Center is the on-premises version of Cloudera Data Platform. This new product combines the best of both worlds—Cloudera Enterprise Data Hub and Hortonworks Data Platform Enterprise—along with new features and enhancements across the stack. This unified distribution is a scalable and customizable platform where you can securely run many types of workloads.

Migrating to Cloudera Data Platform from CDH and HDP provides the feature sets outlined in Table 1.

Table 1. Cloudera Data Platform features

Conclusion

Cisco UCS Integrated Infrastructure for Big Data and Analytics, with the C4200 Series Rack Server Chassis and C125 M5 Rack Server nodes powered by EPYC processors based on the AMD Zen architecture, combines with the CDP big data distribution to deliver a highly compute-intensive and scalable solution for a diverse set of real-world business problems. The combined solution supports workloads ranging from batch to real-time interactive processing, providing an enterprise-ready platform for modern data applications.

For more information

  • To find out more about Cisco UCS big data solutions, visit: https://www.cisco.com/go/bigdata

  • To find out more about AMD EPYC processors, visit: https://www.amd.com/en/products/epyc-server

  • To find out more about Cloudera Data Platform, visit: https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/UCS_CVDs/cisco_ucs_cdip_cloudera.html
