Network Operational Simplification: Reducing Time to Deployment
Amit Dutta, Product Manager, Cisco
Operational simplification is a big buzzword in the industry, with the total aggregate market valued at just under a billion dollars. But what exactly is operational simplification?
Today, deploying a given service set on network devices requires an in-depth understanding not only of the network design but also of the feature set required to make that design come to life. "Simplification", by definition, makes something easier to do or easier to understand, so a reduction in complexity, cost and time falls quite easily under this definition.
Let's define each of the pieces that drive the need for simplification:
Complexity: What is so complex about networks today? Well, everything. Network planning, design, implementation and maintenance are manual, expensive and rigorous activities. Everything from boot-up to end-host provisioning is currently a very tedious process. Multiple teams are involved, each expert in one part of the planning and design and each with very different requirements, which in turn makes a deployment very complex. And it is not just deployment: to accommodate all of these considerations, the operating system on the device has to maintain feature sets which are either archaic or corner-case knobs for individual customers.
Time: It takes an incredible amount of time to deploy a network device, longer to deploy a service and even longer to deploy a feature or solution. That adds to the complexity and forces multiple iterations, leading to high cost and maintenance. For critical services where time is of the essence, network or device downtime is a strict no-no.
Cost: Cost is again a very generic term. Cost can mean monetary value, resources, overheads and downtime. Each one of them can add to the opex of an operator, which in turn leads to more time and more complexity.
Let's stop here and have a reality check. Simplification is not automation, and not every function of a network needs to be simplified; some should be optimized instead. This might sound like a contradiction, but think of a network device as a car chassis and the routing for that network as a custom engine: you would still need specialists to make sure the engine is installed properly. So what are we really looking to simplify?
The problem space: Let's put this in perspective. Let us assume a large Indian oil company wants to empower all of its petrol pumps across the country by putting a network device in 10,000 locations. The traditional way of achieving this is:
- Pre-provision all 10,000 devices in a rolling manner, so each device has some configuration to reach a server or start forwarding traffic.
- Ship all devices (with no tracking, or only manual tracking, of which location each device goes to).
- Handle misconfigurations, which generate truck rolls after installation and additional cost.
- Accept more downtime at remote sites because of the time taken to reach them.
This is called a Day 0 problem.
After installation, if there are any misconfigurations, however few, the same cycle as above applies: send someone out, fix, and then validate. Also, if the company has just made a policy change that requires one configuration update on all devices, it would need to add every device to an NMS, meaning each device has to be configured with NMS parameters, and then maybe it can push the configuration out to all 10,000 devices in one go. But it may also be that these devices are not centrally managed, meaning technicians have to log into each and every device, put the configuration in, validate and sign off. This is a Day N problem.
The premise is: can we do something better to take this burden off the customers? From the above example you can easily appreciate the amount of time it would take to even deploy that number of devices, the number of people involved in making this network work and the ongoing issues that may arise.
Even in India, with its wealth of engineers, the first step is so complex that companies outside IT/ITES are not willing to deploy infrastructure which scales across the country. So we have complexity, time and cost issues that need to be alleviated. Customers should not perceive deploying network devices at large scale as a hindrance, but as an enablement or empowerment.
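To get a feel for how time and cost pile up in the traditional approach, here is a back-of-the-envelope sketch in Python. Only the 10,000-device count comes from the example above; every other figure (minutes per device, number of technicians, misconfiguration rate, cost per truck roll) is a made-up placeholder chosen purely for illustration, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RolloutAssumptions:
    """Inputs for the estimate. Apart from `devices`, all values are
    illustrative placeholders, not measurements."""
    devices: int = 10_000
    preprovision_minutes_per_device: float = 30.0  # assumed
    technicians: int = 20                          # assumed
    working_minutes_per_day: float = 8 * 60        # assumed 8-hour day
    misconfiguration_rate: float = 0.02            # assumed fraction needing a visit
    cost_per_truck_roll: float = 150.0             # assumed, arbitrary currency units


def preprovision_days(a: RolloutAssumptions) -> float:
    """Working days needed to pre-provision every device by hand."""
    total_minutes = a.devices * a.preprovision_minutes_per_device
    return total_minutes / (a.technicians * a.working_minutes_per_day)


def truck_rolls(a: RolloutAssumptions) -> int:
    """Site visits triggered by misconfigured devices."""
    return round(a.devices * a.misconfiguration_rate)


def truck_roll_cost(a: RolloutAssumptions) -> float:
    """Total cost of those site visits."""
    return truck_rolls(a) * a.cost_per_truck_roll


if __name__ == "__main__":
    a = RolloutAssumptions()
    print("pre-provisioning days:", preprovision_days(a))
    print("truck rolls:", truck_rolls(a))
    print("truck roll cost:", truck_roll_cost(a))
```

With these placeholder inputs the estimate comes to 31.25 working days of pre-provisioning and 200 truck rolls. The specific numbers mean nothing; the point is that each term scales linearly with the device count, so every manual step is multiplied by 10,000.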
Requirements for a solution: So what do we need to make devices easier to deploy? Surely current methodologies don't work, meaning a new architecture is required.
- A central management station should be able to discover devices in the field without any configuration. In the above example: ship all 10,000 devices to their locations, power them up, and they should be visible to a central NMS or a central network entity which can orchestrate the next steps.
- Why not let the device which was sent to the field discover its provisioning server without any configuration? The NMS can double up and provide this functionality, or it can be integrated into another entity. In the above example, this eliminates the need to pre-provision the 10,000 devices.
- Create an underlay management plane which can be used to push configurations or connect securely to devices. This plane should be separate from traffic forwarding and should again be built with no configuration; hence "underlay", since it comes before routing. In the above example, the first configurations as well as any incremental updates can then easily be sent to all devices or to a subset of devices as required.
- Consistent reachability in spite of any misconfiguration or error that may occur while provisioning the device is key, and can be achieved by the separate management plane. This means the NMS or the central systems can still reach the device if the configuration pushed to it had flaws or failed to apply. In the above example this eliminates sending personnel. It also reduces downtime, as the device does not need to be taken offline or reloaded just because something went wrong.
- A lot of these devices would be in locations which are not secure. Threats such as replacing a device with a rogue device or stealing passwords need to be dealt with by some innovative security solution, again without configuration or user intervention, i.e. secure from boot-up of the device and into operation.
- Last but not least, a feedback mechanism is required to make sure the configurations sent to the device have been applied correctly.
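The requirements above can be tied together in a toy simulation. The Python sketch below is not a real NMS or device API: `Device`, `Controller`, the serial-number inventory and the digest-based check are all hypothetical names and simplifications introduced for illustration. It shows a device announcing itself at power-up, the controller rejecting serials it never shipped (a crude stand-in for the security requirement, not real security), a configuration push over the management plane, and a feedback check that compares a SHA-256 digest of the running configuration with what was sent.

```python
import hashlib
from dataclasses import dataclass, field


def digest(config: str) -> str:
    """Fingerprint of a configuration, used for the feedback check."""
    return hashlib.sha256(config.encode("utf-8")).hexdigest()


@dataclass
class Device:
    """Hypothetical field device; a real one would sit behind a network API."""
    serial: str
    running_config: str = ""
    drop_pushes: bool = False  # simulates a push that fails to apply

    def apply(self, config: str) -> None:
        if not self.drop_pushes:
            self.running_config = config

    def report_digest(self) -> str:
        return digest(self.running_config)


@dataclass
class Controller:
    """Hypothetical central NMS / provisioning server."""
    inventory: set[str]  # serial numbers the operator actually shipped
    discovered: dict[str, Device] = field(default_factory=dict)

    def on_announce(self, device: Device) -> bool:
        """Called when a device powers up; unknown serials are rejected."""
        if device.serial not in self.inventory:
            return False
        self.discovered[device.serial] = device
        return True

    def push(self, config: str, retries: int = 1) -> dict[str, bool]:
        """Push a config to every discovered device and verify it landed."""
        expected = digest(config)
        results: dict[str, bool] = {}
        for serial, device in self.discovered.items():
            ok = False
            for _ in range(retries + 1):
                device.apply(config)
                if device.report_digest() == expected:  # feedback mechanism
                    ok = True
                    break
            results[serial] = ok
        return results


if __name__ == "__main__":
    nms = Controller(inventory={"SN-0001", "SN-0002"})
    for dev in (Device("SN-0001"), Device("SN-0002", drop_pushes=True), Device("SN-9999")):
        print(dev.serial, "accepted" if nms.on_announce(dev) else "rejected")
    print(nms.push("hostname pump-site\n"))
```

A device whose push fails stays in `discovered`, so the controller can simply try again later instead of dispatching a technician, which is the behaviour the separate management plane is meant to guarantee.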
India requires simplified and automated delivery of devices primarily because of its geographic diversity and sheer size. A solution which allows a very quick and secure way of deploying network infrastructure is much needed for a country of this size. Even if the oil company employs a managed service provider to help deploy these devices, the problems plaguing the initial deployment will only shift; unfortunately they will not go away, and they would still add to the opex burden.