What You Will Learn
This document discusses the following topics:
● How the evolution of data center technology has changed the approach to application dependency mapping
● Why gaining insight into applications and their dependencies is cumbersome in a dynamic data center environment
● Various approaches that customers have tried to gain application insight
● How the Cisco Tetration Analytics™ platform can help solve the application insight problem
● How application insight works in Cisco Tetration Analytics
● What application insight enables you to do
Current Data Center Environment
The modern data center has evolved in a brief period of time into the complex environments seen today, with extremely fast, high-density switching pushing large volumes of traffic, and multiple layers of virtualization and overlays. The result is a highly abstract network that can be difficult to secure, monitor, and troubleshoot.
Data centers typically handle millions of flows a day, and are fast approaching billions of flows. On average, there are 10,000 active flows per rack at any given second (Benson, Akella, & Maltz, 2010). At the same time, networking teams are being asked to increase their operational efficiency and secure the ever-expanding attack surface of their networks.
Although this accelerated rate of change is helping increase the scale and complexity of the applications and solutions that organizations can deliver, it is also putting additional strain on networking teams to respond to these changes.
The type of workload deployed is changing in nature as well. The popularization of microservices has caused development of containerized applications that exhibit entirely different behavior on the network than traditional services. Their lifecycles often can be measured in milliseconds, making their operations difficult to capture for analysis. The same effect can be seen with hybrid cloud and virtualized workloads, which move between hypervisors as needed. Furthermore, highly scaled data centers are challenging TCP to meet their high-performance needs, including the need to handle multiple unique conversations within the same long-lived TCP session (Qiu, Zhang, & Keshav, 2001).
To be able to efficiently and confidently respond to changes, you need a complete understanding of the applications that depend on your network (Xu, Zhang, Mao, & Bahl, 2008). In data centers, it can be almost impossible to manually catalog and inventory the myriad applications that are running at any one time, and doing so usually requires extensive and often challenging manual collaboration across a number of separate groups.
And even if an organization performs manual application dependency mapping, it is not a one-time operation. Mappings must be maintained as living documents. Otherwise, the source of information will increasingly diverge from the source of truth (the actual application traffic). This state can create new challenges for the data center team as the team attempts to manage changes to the network based on misinformation.
Difficulties of Application Dependency Mapping
Mapping applications has traditionally been a difficult task, from both business and technical perspectives.
Mapping anything in this world is generally a problematic and laborious process. For example, the ocean occupies roughly 70 percent of the earth’s surface, yet we have mapped only about 5 percent of it in detail (Schmidt Ocean Institute, 2013).
The application landscape of a data center can be conceptualized as an ocean. You can understand from the surface level what appears to be running, but below the surface is an entire world of interrelated flows that connect all the surface components together. Truly grasping these deep flows is what makes application dependency mapping difficult because the flows are not easy to discover, analyze, and map.
Currently, three approaches are usually used:
● Manual approach: In this approach, the network team attempts to manually collect application information across groups that have deployed applications on top of the network infrastructure.
● Simple automated approach: In this approach, simple automated tools are run in an attempt to collect application dependency mapping information.
● Outsourced approach: An external agency combines the manual and native approaches and then evaluates the collected information and compiles it into a report.
Most organizations find the manual approach labor and cost intensive, and from a technical perspective it is too rigid. Manual application dependency mapping is feasible only at a very small scale. The network team also should have a strong relationship with the application teams, who must gather the required data. This process is prone to human error and mistakes, and the probability of accurately mapping the applications in the data center is low. The collected data is usually compiled into a static report that diverges from reality as soon as it is published.
Businesses may be tempted by the simple automated option. It appears to incur only the low capital costs needed to purchase software that claims to be able to map applications, and to require only a small operations team to run the application dependency mapping tool. However, current simple software implementations of application dependency mapping are notoriously ineffective at truly mapping the application landscape.
Often, the application dependency mapping software is not to blame. The problem instead is the source data they ingest. With poor or incomplete input data, the application dependency mapping tool cannot make accurate observations, however revolutionary the mapping algorithm may be. Therefore, to complete the mapping process, the low-quality automated recommendations must be supplemented by human input, increasing the long-term costs of using this approach. Traditional application dependency mapping software usually does not learn from this human supervision, causing staff to have to repetitively help the tool.
Traditionally, capturing enough high-quality source input data at data centers at scale has been a task that only few organizations with huge resource pools and equipment have been able to achieve.
The outsourced approach has the problems of both approaches, and it adds a layer of cost and complexity. It also does not usually provide any more actionable or insightful mapping than the other two techniques.
Cisco Tetration Analytics Platform Uses Application Insight to Map Dependencies Accurately
Application dependency mapping is difficult to achieve, it is not feasible to perform manually at scale, and automated tools today provide incomplete results due to low-quality input data and poor classification (grouping) algorithms.
Cisco Tetration Analytics application insight introduces unsupervised machine learning and behavior analysis that aims to solve all these problems.
To perform accurate application dependency mapping, a huge breadth of high-quality input data is critical to be able to make relevant observations. To achieve this, the Cisco Tetration Analytics architecture was designed from the foundation to capture high-quality data. The Cisco Tetration Analytics platform consists of next-generation Cisco® switches that have flow capturing capabilities built in to the application-specific integrated circuit (ASIC), small-footprint OS-level agents, and a next-generation analytics cluster that incorporates big data technology that can scale to billions of recorded flows (Figure 1).
Figure 1. Cisco Tetration Analytics Architecture
To continue the ocean analogy, with this Cisco Tetration Analytics architecture, you can “analyze the ocean.”
Because data is exported to the Cisco Tetration Analytics cluster every 100 ms, the detail of the data is much finer. Cisco Tetration Analytics is much less likely to miss so called mouse flows, which often are missed by NetFlow collectors, with an export rate set to 60 seconds. Because the flow table is flushed so often, Cisco Tetration Analytics sensors are unlikely to be overwhelmed and therefore unlikely to need to drop flow records because of lack of available memory. This is a significant change in the architectural model: the flow-state tracking responsibility is transferred from the sensor to the collector, allowing hardware engineers to design smaller, more efficient, and more cost-effective ASICs. (See the “For More Information” section of this document for more information about the Cisco Tetration Analytics sensor architecture.)
Not only does Cisco Tetration Analytics allow you to analyze a huge amount of data at one time; it keeps a detailed historical record without archiving or losing any resolution in the data. This look-back feature allows the system to understand more than a single snapshot in time, and to show trends for an application. For example, many businesses have applications that have a seasonality about them, with additional capacity or dependencies pulled in dynamically at moments of extreme load or at the end of a fiscal quarter. Understanding this behavior is important because the application may have hidden dependencies that are not visible during quiet periods. Without this knowledge, traditional application dependency mapping tools may recommend policies that do not truly map an application’s full requirements across its entire lifecycle.
After this high-quality and comprehensive input data has been ingested by the analytics cluster, it is available for analysis by application insight algorithms within the Cisco Tetration Analytics platform. Whereas traditional application dependency mapping tools use a top-down approach because they do not have access to such data, Cisco Tetration Analytics is designed with a bottom-up approach in which recommendations are made from actual application traffic. Because the Cisco Tetration Analytics architecture delivers such high-quality data, new, the platform can make use of unique mapping algorithms that take advantage of this capability, generating far more accurate and complete application dependency maps and recommendations.
The mapping algorithm computes a unique profile about every endpoint to be mapped. The profile is based on a large number of data points, including detailed network flow information mapped to the originating process within the endpoint operating system. This information is then combined with information from external data sources such as domain name, load-balancer configuration, and geolocation information.
The machine learning algorithm then iteratively compares each profile to the other profiles until a best fit is achieved.
Any recommendations that the application dependency mapping tool makes are accompanied by a confidence score. Confidence scores indicate how sure the algorithm is that the grouping is correct. This score allows human operators to prioritize their time when providing any additional input: they can start with the recommendations about which the tool is least confident. As the system learns, the confidence score is likely to go up for a given grouping as the tool starts to understand the patterns that group particular endpoints together in your data center.
The recommendations that the application insight feature makes can then be confirmed by a human supervisor. However, unlike traditional application dependency mapping tools, Cisco Tetration Analytics will understand the instructions from the operator and learn from the experience. Next time the application dependency mapping tool is run, it will factor in the human guidance, saving valuable time for the operator.
Benefits of Application Insight
Cisco Tetration Analytics application insight has the intelligence to truly map your data center’s application landscape, helping you address a wide range of problems in the data center.
During the mapping process, the platform captures and tracks the consumers and providers of services for an application. Correctly identifying the provider of a service is achieved using degree analysis: tracking the number of inbound connections and, if possible, who initiated the TCP session. With this information, you can apply whitelist security policies across the data center. Application whitelisting is considered by some, such as the Australian government information security department, to be the number-one mitigation strategy to protect against targeted cyber intrusions (Australian Goverment, 2014); however, whitelists are expensive to implement and difficult to maintain. With Cisco Tetration Analytics application dependency mapping, this task is substantially easier.
Application whitelisting is a core concept of the next-generation Cisco Application Centric Infrastructure (Cisco ACI™) data center networking solution (see “For More Information” at the end of this document). To be able to deploy a data center network that implements application whitelisting, application dependency must be fully mapped before migration; otherwise, you risk cutting off legitimate communications to a given workload. Cisco Tetration Analytics mapping results can be used to ease the migration path. In fact, the platform has a policy compliance tool that uses the results of application dependency mapping to simulate what-if whitelist scenarios on real application traffic, displaying detailed information about what traffic would have been permitted, mistakenly dropped, escaped, or rejected (Figure 2).
Figure 2. Policy Compliance What-If Analysis Based on Application Dependency Mapping Results
Data centers are living entities that change constantly. Often applications need to update, change configuration, or scale. If you have application dependency mapping data for reference before and after an application change, you can verify the success of an update, check the results of a change, or monitor the stability of an application at the network level.
For example, an update to an application might cause it to incorrectly launch with development settings that point at a test database server, causing a critical application to fail in minutes. application dependency mapping would be able to detect that the application is now incorrectly connecting to a test database server and alert staff.
Disaster recovery is a minimum requirement for most production data centers. In fact, nine out of ten IT leaders have a disaster-recovery plan in place (Apicella, 2003). But how can you validate your disaster-recovery plan at the network level, especially when application whitelisting is applied and may affect the effectiveness of your disaster recovery plan? Because of the long look-back period, Cisco Tetration Analytics can help make sure that any policy it recommends is aware of disaster-recovery dependencies that it has detected from any tests.
Some other benefits of an accurate application dependency map are listed here:
● Facilitate mergers and acquisitions (M&A): Get a clear understanding of which applications need to be migrated as part of the M&A process, and which application components and services are redundant and can be phased out. This information is critical and can accelerate the integration of the two entities.
● Migrate applications to other infrastructure or the cloud: Now you can migrate applications to a new infrastructure or a public cloud with confidence, knowing exactly which components are required and what network services this application is dependent on.
Figure 3 shows the traffic patterns discovered between neighboring endpoint clusters. The thickness of a chord indicates that amount of ports that are exposed between neighboring clusters.
Figure 3. Application Traffic Dependency Diagram
Cisco Tetration Analytics Application Dependency Mapping in Action
To understand how Cisco Tetration Analytics application dependency mapping works, step through the application dependency mapping process:
1. Cisco Tetration Analytics collects high-quality flow data from your data center.
2. The user selects the sensor data and time range to be analyzed by the mapping algorithm.
3. The user uploads any supporting information that the application dependency mapping algorithms can benefit from in a simple standardized format (load-balancer configurations, Domain Name System [DNS] records, known endpoint groups, and known route tags).
4. The mapping algorithm processes the flow data and anything previously learned to generate endpoint groups (clusters).
5. The user provides any additional hints or groupings required to achieved the desired mapping.
6. One or more applications are defined using the clustering results. A confidence score is presented that indicates how sure the mapping algorithm is of the clustering.
8. (Optional) The user can run the application dependency mapping tool again to identify any additional or new hosts.
Application dependency mapping is difficult to achieve. To be effective, application dependency mapping requires a tremendous amount of high-quality source data and intelligent algorithms specifically designed to quickly understand and process large data sets.
The Cisco Tetration Analytics platform solves these problems by using a next-generation architecture that can ingest, store, and analyze huge high-quality data sets with specifically tailored algorithms. This new approach unlocks a new set of truly actionable insights about your data center and the applications that reside within it, bringing better application visibility, monitoring, troubleshooting, and policy compliance capabilities.
For More Information
Cisco Tetration Analytics http://www.cisco.com/c/en/us/products/data-center-analytics/tetration-analytics/index.html
Cisco Tetration Analytics Policy Compliance: Find the Needle by Removing the Haystack Advantage Series White Papers – Flow Telemetry in Cisco ASICs
Apicella, M. (2003, 02 25). Disaster recovery taken to heart. Retrieved 03 24, 2016, from Computer World: http://www.computerworld.com/article/2581411/disaster-recovery/disaster-recovery-taken-to-heart.html
Australian Government. (2014, 02). Strategies to Mitigate Targeted Cyber Intrusions. Retrieved 03 23, 2016, from Australian Government - Department of Defence: http://www.asd.gov.au/infosec/top-mitigations/mitigations-2014-table.htm
Benson, T., Akella, A., & Maltz, D. (2010). Network traffic characteristics of data centers in the wild. SIGCOMM conference on Internet measurement (pp. 267-280). ACM.
Powers, A. (2011, 08 08). Measuring NetFlow Scalability: How Much is a Lot of Flows? Retrieved 03 25, 2016, from Lancope: https://www.lancope.com/blog/measuring-netflow-scalability-how-much-is-a-lot-of-flows
Qiu, L., Zhang, Y., & Keshav, S. (2001). Understanding the performance of many TCP flows. Computer Networks, 37(3), 277-306.
Schmidt Ocean Institute. (2013, 03 12). The Ocean: Haven't we already mapped it? Retrieved 03 22, 2016, from schmidtocean.org: http://schmidtocean.org/cruise-log-post/the-ocean-havent-we-already-mapped-it/
Xu, C., Zhang, M., Mao, Z., & Bahl, P. (2008, 12). Automating Network Application Dependency Discovery: Experiences, Limitations, and New Solutions. OSDI, 8, pp. 117-130.