Accelerate Data Processing and ETL with Cisco UCS and FASTDATA.io

Updated: April 28, 2020

The Cisco Unified Computing System (Cisco UCS®) C480 ML M5 platform, together with FASTDATA.io’s PlasmaENGINE®, enables end users to significantly accelerate their existing data processing and ETL pipelines on Apache Spark. PlasmaENGINE harnesses the power of GPUs to process data far faster than CPU-based systems running Apache Spark.

Overview

During the past few years, there has been a rapid increase in data available to businesses. However, businesses can only get insights from a fraction of the data as they are not able to process this data fast enough due to lack of massively parallel processing solutions.

Data processing challenges

Data has become one of the most valuable commodities for modern businesses. Businesses see data coming in from many sources and are thinking about how they can extract more intelligence and make better decisions that augment human capabilities. According to one estimate, 1.7 MB of data will be created every second for every person on earth in 2020.

Data processing has historically been done using batch processing software on CPU-based infrastructure. Batch processing–based solutions don’t scale at an affordable rate. Businesses must therefore choose between spending enormous amounts of money on infrastructure and processing data at a very slow pace. Either way, they gain insights from only a fraction of their data.

Benefits

      PlasmaENGINE harnesses the power of GPUs to process data three times faster than CPU-based systems on Apache Spark.

      PlasmaENGINE can be deployed in minutes with a seamless integration process.

      PlasmaENGINE is fully compatible with programs written for Apache Spark. No changes to Spark code are required to run on PlasmaENGINE. PlasmaENGINE integrates seamlessly to accelerate ETL pipelines and add real-time processing to them.

      PlasmaENGINE enables businesses to drive additional value out of their data by overcoming data-processing limitations.

      The Cisco UCS C480 ML M5 Rack Server has a total memory capacity of 7.5 terabytes (TB). High memory capacity helps avoid swapping data to disk and allows higher throughput that can fully utilize a GPU’s bandwidth. The C480 ML M5 also supports up to six NVMe disk drives, which reduce the penalty in cases where data has to be written to disk. Together, high memory capacity and fast NVMe storage offer substantial benefits to complex pipeline processing within PlasmaENGINE.

      Cisco UCS C480 ML M5 can be managed with Cisco Intersight. Cisco Intersight is a new cloud-based management platform that uses analytics to deliver proactive automation and support. By combining intelligence with automated actions, you can reduce costs dramatically and resolve issues more quickly.

      The UCS C480 ML M5 servers can be deployed as standalone servers or in a Cisco UCS–managed environment with Cisco UCS Manager (UCSM).

PlasmaENGINE

Modern CPUs run into performance constraints because further performance gains depend on greater parallelism, for which CPU architectures are not well suited. FASTDATA.io overcomes this by taking advantage of modern GPU architectures, which are designed specifically for massively parallel processing. FASTDATA.io runs all current SQL-based operations at native GPU speeds by compiling them into native GPU code using JIT (just-in-time) compilation, LLVM (low-level virtual machine), and sophisticated query caching technologies. Moreover, all PlasmaENGINE-compiled code is highly vectorized and optimized for massively parallel processing (MPP) operations on the GPU.

PlasmaENGINE converts data into columnar format and operates on that data in parallel with thousands of threads provided by GPU-enabled processing.
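To make the columnar idea concrete, the following is a minimal CPU-side sketch in Python of converting row-oriented records into columns. It is an illustration only, not PlasmaENGINE code: the field names (`vessel_id`, `lat`, `lon`) and the helper `to_columns` are hypothetical, and the sketch assumes every row has the same fields.

```python
def to_columns(rows):
    """Convert a list of row dicts into a dict of column lists.

    Assumes every row carries the same keys (hypothetical schema).
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical sample records in row-oriented form.
rows = [
    {"vessel_id": 1, "lat": 47.6, "lon": -122.3},
    {"vessel_id": 2, "lat": 37.8, "lon": -122.4},
    {"vessel_id": 3, "lat": 25.8, "lon": -80.2},
]

columns = to_columns(rows)

# A column-wise operation touches one contiguous list instead of every record.
northernmost = max(columns["lat"])
```

Once the values of a field sit next to each other, the same operation can be applied to each element independently, which is the access pattern that lends itself to many parallel threads.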

Figure 1. PlasmaENGINE on Cisco UCS C480 ML M5 Rack Server

Because many companies rely on Apache Spark for data processing, PlasmaENGINE was built to be compatible with almost all existing algorithms and programs written for Apache Spark. PlasmaENGINE is also the first GPU-based streaming data engine that can run either standalone or as part of an Apache Spark infrastructure. Existing and newly written Apache Spark programs can perform up to three times faster than on the most optimized version of a CPU-based engine.

The higher efficiency of GPU-centric data processing lowers infrastructure costs by as much as 75 percent. Just as Apache Spark was a huge leap forward from MapReduce in 2013, PlasmaENGINE is the next generation of data processing software, built for the needs of our ever-growing, data-dependent world.

Cisco UCS C480 ML M5 Rack Server

The Cisco UCS C480 ML M5 Rack Server is a purpose-built server for deep learning and is storage- and I/O-optimized to deliver industry-leading performance for various training models. The Cisco UCS C480 ML M5 delivers outstanding levels of storage expandability and performance options in standalone or Cisco UCS–managed environments using a 4RU form factor (Figure 2). Because of its modular design, the platform offers the following capabilities:

      8 NVIDIA SXM2 V100 32GB modules with NVLink Interconnect

      Latest Intel® Xeon® Scalable processors with up to 28 cores per socket and support for two-processor configurations

      24 DIMM slots for up to 7.5 Terabytes (TB) of total memory

      Support for the Intel Optane DC persistent memory (128G, 256G, 512G)

      4 PCI Express (PCIe) 3.0 slots for multiple 10/25G, 40G, or 100G NICs

      Flexible storage options with support for up to 24 Small-Form-Factor (SFF) 2.5-inch, SAS/SATA Solid-State Disks (SSDs) and Hard-Disk Drives (HDDs)

      Up to 6 PCIe NVMe disk drives

      Cisco 12-Gbps SAS Modular RAID Controller in a dedicated slot

      Dual embedded 10 Gigabit Ethernet LAN-on-motherboard (LOM) ports

For more information about the Cisco UCS C480 ML M5 Rack Server, go to: https://www.cisco.com/c/en/us/products/collateral/servers-unified-computing/ucs-c-series-rack-servers/datasheet-c78-741211.html.

Figure 2. Cisco UCS C480 ML M5 Rack Server

System management

Cisco UCS Manager (UCSM) provides unified, integrated management for all software and hardware components in Cisco UCS. Cisco UCSM manages, controls, and administers multiple blades and chassis, enabling administrators to manage the entire Cisco Unified Computing System as a single logical entity through an intuitive GUI, a CLI, or a robust API. Cisco UCS Manager is embedded in Cisco UCS Fabric Interconnects and offers a comprehensive XML API for third-party application integration. Cisco UCSM exposes thousands of integration points to facilitate custom development for automation and orchestration and to achieve new levels of system visibility and control.
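As a rough illustration of what integrating with the XML API can look like, the sketch below builds a login request and extracts the session cookie from a reply, using only the Python standard library. It assumes the `aaaLogin` method with `inName` and `inPassword` attributes and an `outCookie` attribute in the response, and that requests are POSTed over HTTPS to the `/nuova` path on the UCS Manager host; verify these details against the Cisco UCS Manager XML API programmer's guide. The helper names, the credentials, and the sample response string are hypothetical, and no network call is made.

```python
import xml.etree.ElementTree as ET


def build_login_request(username, password):
    """Return the XML body for a login request (assumed aaaLogin method)."""
    elem = ET.Element("aaaLogin", {"inName": username, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def parse_login_cookie(response_xml):
    """Extract the session cookie (assumed outCookie attribute) from a reply."""
    root = ET.fromstring(response_xml)
    return root.get("outCookie")


# Hypothetical credentials and a hand-written sample reply, for illustration only.
request_body = build_login_request("admin", "example-password")
sample_response = '<aaaLogin response="yes" outCookie="example-cookie" />'
cookie = parse_login_cookie(sample_response)
```

In a real integration, the request body would be sent with an HTTPS client and the returned cookie passed along with subsequent query and configuration requests.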

The Cisco UCS C480 ML M5 is supported by the full suite of Cisco Unified Computing System (Cisco UCS) management tools and is engineered for Cisco Intersight. Cisco Intersight is a Software-as-a-Service (SaaS) management platform that uses analytics to deliver proactive automation and support. By combining intelligence with automated actions, you can reduce costs dramatically and accelerate time to resolution.

Performance benchmark and results

PlasmaENGINE can be installed in minutes via Docker, AWS AMI, GCE Image, or Azure VHD. Pricing depends on the amount of data processed and the number of GPU cores utilized.

To showcase the accelerated performance of PlasmaENGINE on Cisco UCS Integrated Infrastructure for Big Data and Analytics, several tests were run comparing PlasmaENGINE to Apache Spark on Cisco UCS C480 ML M5 Rack Servers. The tests focused on extracting the highest processing bandwidth from Cisco’s scale-up architecture. The server configuration consisted of two 3 TB NVMe SSDs, dual Intel Xeon Processor Scalable family 6140 CPUs (2 x 18 cores at 2.3 GHz), 394 GB of DRAM, and eight NVIDIA V100 GPUs with 32 GB of memory each.

The first test was TPC-DS, a standard set of queries designed to demonstrate support for a wide range of big-data-related workloads. All queries passed and ran faster on PlasmaENGINE than on Apache Spark, without any tuning.

The next test was a real-world use case called Picopath, which extracts a small segment of clean data (Picodata) from the overlap of a massive segment of data (the NOAA AIS vessel log in this case) to produce a clear visualization of a large geospatial dataset.

The last test was a synthetic benchmark showing the processing bandwidth potential of PlasmaENGINE on the Cisco UCS C480 ML M5 Rack Server. It runs the Haversine formula (an approximation of the distance between two GPS points) on pairs of geospatial locations to measure how many operations per second can be performed.
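For readers unfamiliar with the Haversine formula, here is a minimal CPU-side sketch in Python. It is not the benchmark code: the function names and the mean Earth radius constant are assumptions chosen for the example, and coordinates are taken to be in decimal degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius in kilometres


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding pushing sqrt(a) past asin's domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def haversine_batch_km(lats1, lons1, lats2, lons2):
    """Apply the formula to columns of coordinate pairs, one pair per index."""
    return [
        haversine_km(a, b, c, d)
        for a, b, c, d in zip(lats1, lons1, lats2, lons2)
    ]
```

Each pair is computed independently of every other pair, so the work divides cleanly across many threads, which is why this kind of calculation is a natural fit for a throughput benchmark.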

Figure 3 compares the total execution time of the TPC-DS, Picopath, and Haversine Benchmarks, as well as the throughput of the Haversine Streaming benchmark on Apache Spark and PlasmaENGINE. Tests were run once on CSV input and then again on Apache Arrow input where applicable. Apache Arrow provides even better results as it is a columnar format designed for massively parallelized processing. Note that only the Haversine benchmark was performed using the Apache Arrow input.

It is important to note that the UCS C480 ML M5 comes with 8 NVIDIA SXM2 V100 GPUs and 2 CPUs. Therefore, Figure 3 compares the execution time on 2 CPUs running Apache Spark with the execution time on 8 GPUs running PlasmaENGINE.

Figure 3. Execution time: results for Picopath and Haversine Benchmarks

Next, let’s look specifically at the throughput of the Haversine Benchmark, consistently the most computationally intensive of the benchmarks. Figure 4 gives a sense of the throughput PlasmaENGINE can produce.

Figure 4. Throughput for Haversine Benchmark

Advantages of Cisco UCS Integrated Infrastructure for big data and analytics

Cisco UCS Integrated Infrastructure for Big Data and Analytics is a proven platform for enterprise analytics applications, including analytics platforms accelerated by GPUs. It offers a simplified, intelligent infrastructure with the high performance and scalability needed to meet growing business demands.

Big data systems have reached a level of maturity where they are in wide use across every industry. With this success have come additional challenges in processing vast amounts of data quickly enough to be useful. The distributed nature of big data processing makes Cisco UCS Integrated Infrastructure for Big Data and Analytics a strong fit for modern data processing.

Learn more

For more information about FASTDATA.io PlasmaENGINE, visit: https://fastdata.io/plasma-engine/

For more information about Cisco® solutions for AI/ML workloads, visit: https://www.cisco.com/go/ai-compute
