Chalk Talk September

Workload Mobility across Data Centers

Server virtualization adoption is growing rapidly in today’s data centers. Server virtualization enables a single physical server to host multiple virtual machines, reducing costs through fewer servers purchased as well as lower power and cooling expenses. Server virtualization is no longer limited to lab environments or certain specific applications; the benefits are clearly established. With increasing server CPU and memory capacity, such as that of the Cisco Unified Computing System, one can easily fit hundreds of virtual machines on a single server blade. Server virtualization is widely adopted by enterprises for many different types of applications, such as desktop virtualization, Microsoft Exchange, and SQL Server.

One of the main advantages of server virtualization is the flexibility to move a virtual machine (workload) from one physical server to another without impacting the workload, through technologies such as VMware ESX vMotion or Citrix XenServer XenMotion.

Today’s workload mobility is typically restricted to within the same data center, from one POD or rack to another. However, requirements are changing: workloads increasingly need to move beyond a single data center to achieve true mobility and realize the power of server virtualization without boundaries.

Here are some use cases for long distance workload mobility:

  • Disaster avoidance: Data centers in the path of natural calamities (such as hurricanes) need to proactively migrate mission-critical application environments to another data center.
  • Data center migration or consolidation: Applications need to be migrated from one data center to another without business downtime as part of a data center migration or consolidation effort.
  • Data center expansion: Virtual machines need to be migrated to a secondary data center as part of data center expansion to address power, cooling, and space constraints in the primary data center.
  • Workload balancing across multiple sites: Virtual machines need to be migrated between data centers to provide compute power from data centers closer to the clients (follow the sun) or to load-balance across multiple sites.

Figure 1

In order to achieve virtual machine mobility across data centers, the following infrastructure needs to be considered:

Layer 2 Connectivity between the physical servers across Data Centers

The IP address of the virtual machine needs to remain the same as the virtual machine moves from one server to another across data centers. LAN extension is therefore a key requirement.

Solution:
Cisco Overlay Transport Virtualization (OTV) on the Cisco Nexus 7000. OTV is an industry-first solution that significantly simplifies extending Layer 2 connectivity across distributed data centers. You can deploy Data Center Interconnect (DCI) between sites without changing or reconfiguring your existing network design. With OTV you can deploy virtual computing resources and clusters across geographically distributed data centers, delivering transparent workload mobility, business resiliency, and superior computing resource efficiencies. OTV does not depend on the type of WAN you may have, such as dedicated fiber, low-speed or high-speed WAN links, or an MPLS service, because it is purely based on IP connectivity between the sites.

For more information about OTV, please refer to the following URL: http://www.cisco.com/en/US/prod/switches/ps9441/nexus7000_promo.html
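
As an illustration, a minimal OTV configuration sketch for a Nexus 7000 OTV edge device is shown below. The interface names, VLAN ranges, site identifier, and multicast groups are examples only, and exact commands can vary by NX-OS release:

    feature otv
    otv site-vlan 99
    otv site-identifier 0x1

    interface Overlay1
      otv join-interface Ethernet1/1
      otv control-group 239.1.1.1
      otv data-group 232.1.1.0/28
      otv extend-vlan 100-110
      no shutdown

The join interface provides IP reachability to the remote sites, and the extended VLAN range determines which Layer 2 segments are carried across the overlay.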

Optimized Outbound/Inbound Traffic to the Workload

Optimized Outbound Traffic Routing
A primary requirement for application mobility is that the migrated virtual machine maintain all of its existing network connections after it has been moved to the secondary data center. Traffic to and from the virtual machine therefore needs to follow an optimal path to the virtual machine’s new location.

After the virtual machine has moved to the new data center, it needs to be able to reach its default gateway for outbound communication. The gateway router is still in the old data center, so the traffic gets switched over the data center interconnect back to the old data center. Not really optimal.

Solution:
Cisco OTV provides first-hop router localization options. With this feature, the first-hop router is local to the site where the virtual machine resides. For example, the HSRP virtual IP address is shared between both sites, and there is a local active HSRP router with the SAME IP address in each data center. When the virtual machine moves to the new data center, it continues to talk to the same default gateway IP address, but it is served by the local router. Because both sites share the same extended VLAN, HSRP communication between the sites needs to be contained, and OTV allows you to do that. This avoids traffic being routed all the way back to the old data center.
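
One commonly documented way to contain HSRP is to drop HSRP hellos at the OTV edge devices so that they never cross the overlay. The sketch below is illustrative only; it assumes HSRP version 1 and 2 (UDP port 1985) and extended VLANs 100-110, and the exact syntax varies by NX-OS release:

    ip access-list ALL_IPs
      10 permit ip any any
    ip access-list HSRP_IP
      10 permit udp any 224.0.0.2/32 eq 1985
      20 permit udp any 224.0.0.102/32 eq 1985

    vlan access-map HSRP_Localization 10
      match ip address HSRP_IP
      action drop
    vlan access-map HSRP_Localization 20
      match ip address ALL_IPs
      action forward

    vlan filter HSRP_Localization vlan-list 100-110

In addition, the HSRP virtual MAC address is typically filtered from the OTV MAC advertisements so that each site resolves its default gateway locally.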

Optimized Inbound Traffic Routing
Traffic bound for the virtual machine from the external world needs to be routed to the virtual machine’s new location. If the traffic originates in the same Layer 2 domain, the Layer 2 extension between the data centers suffices. However, if the traffic traverses a Layer 3 network or the Internet, it will still be routed via the old data center, because the virtual machine’s subnet continues to be advertised from there.

Solution:
To avoid this triangulation of traffic, advertise more granular (more specific) routes for migrated virtual machines from the secondary data center. If these changes are not provisioned, suboptimal routing may introduce additional delay, which may or may not be acceptable depending on the specific virtual machine or application. In the future, technologies like the Locator/ID Separation Protocol (LISP) will provide IP address portability between sites so that traffic can be routed to the virtual machine without extensive routing changes.
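
As a purely illustrative sketch, assume the migrated virtual machine has the address 10.1.1.50 on VLAN 100 and the secondary data center advertises routes to the WAN through BGP AS 65002 (all addresses and AS numbers here are hypothetical). A more specific host route could be injected and advertised from the secondary site along these lines:

    ip route 10.1.1.50/32 Vlan100 10.1.1.50

    router bgp 65002
      address-family ipv4 unicast
        network 10.1.1.50/32

Because the /32 host route is more specific than the subnet route still advertised from the primary data center, external traffic follows it directly to the secondary site. This approach requires the upstream network to accept host routes, which is one reason LISP-based approaches are attractive going forward.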

Storage Extension between the Data Centers

In order for the mobility to happen live, the source and destination ESX servers need to have access to shared storage.

Solution:
There are three options for making shared storage available to both ESX servers:

  1. Shared storage:
    The storage remains in its original location while the compute moves to the secondary data center. In order for the destination ESX server to be able to reach the storage, use either IP storage such as NAS or a storage extension option such as Cisco MDS FC or FCIP extension (a sample FCIP sketch follows this list). This method may have IO performance implications depending on the distance between the data centers, because the compute’s IO operations to the storage are performed over the wide-area link.
  2. Active-Passive storage:
    Another approach is to move the storage to the same location as the compute to alleviate IO performance concerns. The storage is active in only one data center at a time. As the compute is moved from the primary to the secondary data center, the storage can be moved using techniques like Storage vMotion, which offers non-disruptive migration of storage while the workload is active. Depending on the size of the storage, it may take a long time to move it to the second data center.
  3. Active-Active storage:
    In this method, the storage is actively maintained in both data centers and is synchronized live between the two. As the compute is moved to the secondary data center, the ESX server accesses the local active storage and hence sees no IO performance impact. Active-active storage solutions are provided by storage array vendors through products such as EMC VPLEX and NetApp MetroCluster. These active-active storage solutions still have distance limitations of a few hundred miles between data centers.
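
For the shared storage option, a minimal sketch of one side of an FCIP tunnel between two Cisco MDS switches is shown below. The IP addresses and interface numbers are examples only, and FCIP requires the appropriate MDS hardware and license:

    feature fcip

    interface GigabitEthernet1/1
      ip address 192.168.10.1 255.255.255.0
      no shutdown

    fcip profile 1
      ip address 192.168.10.1

    interface fcip1
      use-profile 1
      peer-info ipaddr 192.168.20.1
      no shutdown

A matching configuration on the remote MDS completes the tunnel, after which the FCIP interface joins the fabric like any other inter-switch link.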

Summary

Today, technologies exist to seamlessly provide mobility between data centers, and many deployments have already happened across the globe. Live mobility involving importing and exporting workloads to and from the cloud is still not a reality today, but with advancements in many of these underlying infrastructure technologies, it is beginning to take shape.

About the Author:


Balaji Sivasubramanian is a product line manager in the Data Center Switching business unit at Cisco, focusing on marketing the Nexus 7000 product family of data center switches. Before this role, Balaji was a senior product manager for the Catalyst 6500 switch product line, where he successfully launched the Virtual Switching System (VSS) technology worldwide. He started his Cisco career in the Cisco Technical Assistance Center, working on LAN switching products and technologies. Balaji has been a speaker at various industry events such as Cisco Live and VMworld. Balaji has a Master of Science degree in computer engineering from the University of Arizona and a Bachelor of Engineering degree in electrical and electronics from the College of Engineering, Guindy, Anna University (India).

Balaji Sivasubramanian

CCNP Routing and Switching Foundation Learning Library: Foundation Learning for CCNP ROUTE, SWITCH, and TSHOOT (642-902, 642-813, 642-832)


By Diane Teare, Richard Froom, Balaji Sivasubramanian, Erum Frahim, Amir Ranjbar
ISBN-10: 1-58705-885-5
ISBN-13: 978-1-58705-885-1
Published August 20, 2010
US SRP $175.00
Published by Cisco Press.