Wednesday, February 8, 2012

Nicira: fear them not

So Nicira is finally out of stealth mode. This is good news. Much of what we now see on their website confirms the rumors and expectations. In the press, however, there continues to be a bit too much hype in my opinion. Talk about the next VMware is a bit out of place I think, if for no other reason than that the server and networking industries are very different (in dollar value to begin with ...).

The general assumption is that networks are very static and difficult to manage and adapt to business needs. Michael Bushong from Juniper writes that they "are far too big and complicated to run by hand and are therefore operated by a maze of management, provisioning and OSS/BSS systems".

I guess when you come from a small installed base as a networking vendor in this space, you want to exaggerate the issues faced by DC and enterprise networks today. It is true that networks are not managed by hand and are managed by OSS/BSS systems, but isn't that a good thing anyway? And more importantly, isn't this true for servers, server images, and storage as well? I wouldn't say managing hundreds or thousands of VMs with different images, patching levels, etc. is a simple task that anybody wants to run by hand.

But it is true that networks are static and that automating network configuration isn't an easy task. Adding ports to VLANs can be automated fairly easily. But things like stretching VLANs or moving entire subnets around are more difficult tasks. Now, a network engineer would claim that the problem isn't the network itself, but the way applications are built.

After all, if you build a network based on L3 with proper subnet planning, you will never have an issue allocating network resources for any VM you provision on the network, and all a VM needs for communication is an IP address. But the issue is that applications aren't built to run in "just any subnet". For one, they need to communicate within the subnet with other components of the application for many tasks. And then there's policy and security, which, if tied to the IP address, become a nightmare to manage and enforce. And decoupling the policy and security rules from the IP address isn't easy to do today either.
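To make the policy problem in the paragraph above concrete, here is a minimal sketch in Python. Everything in it (the `Endpoint` type, the rule tables, the addresses and group names) is hypothetical and invented for illustration; it is not how any particular product models policy. It only contrasts a rule keyed on subnets with a rule keyed on a logical group label.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Endpoint:
    """A VM as seen by the policy layer (hypothetical model)."""
    name: str
    ip: str
    group: str  # logical label such as "web" or "db"; an assumption of this sketch


# Policy tied to IP addressing: allowed (source subnet, destination subnet) pairs.
IP_RULES = [("10.1.1.0/24", "10.1.2.0/24")]

# Policy tied to logical groups: allowed (source group, destination group) pairs.
GROUP_RULES = {("web", "db")}


def allowed_by_ip(src: Endpoint, dst: Endpoint) -> bool:
    """True if some subnet-based rule covers both endpoints' addresses."""
    return any(
        ip_address(src.ip) in ip_network(s) and ip_address(dst.ip) in ip_network(d)
        for s, d in IP_RULES
    )


def allowed_by_group(src: Endpoint, dst: Endpoint) -> bool:
    """True if the endpoints' group labels are allowed to talk."""
    return (src.group, dst.group) in GROUP_RULES


web = Endpoint("web-1", "10.1.1.10", "web")
db = Endpoint("db-1", "10.1.2.20", "db")
# The same database VM after being moved to a different subnet.
db_moved = Endpoint("db-1", "10.9.9.20", "db")
```

With these definitions, `allowed_by_ip(web, db)` holds but `allowed_by_ip(web, db_moved)` does not, so the subnet rule has to be rewritten every time the VM moves. `allowed_by_group(web, db_moved)` still holds. The hard part in a real network is of course enforcing the group rule in the data path, which this sketch does not attempt.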

There are many things I agree with Mr. Bushong on though, and one is that "programmability is about adding value to the network control, rather than a threat of commoditization".

Several years ago I read a paper on Microsoft's VL2 proposal. It is very similar to what Nicira is doing in concept: at the server network stack you build a tunneling mechanism that facilitates endpoint communications. At that moment I thought such an approach wouldn't be feasible because it requires changing the server TCP/IP stack, a daunting task. But virtualization has changed that, because now we CAN change the stack at the vSwitch level, while the server OS, close to the application, remains unchanged. Nicira has also added one more thing: a northbound API to provisioning systems that can harmonize the network connectivity for endpoints with other resources (server, storage, etc.).

In itself, Nicira's solution isn't providing anything new: it builds overlays to facilitate endpoint communication. This can be done with VXLAN as well. What Nicira is providing is, supposedly, a control plane capable of creating and managing those overlays in an automated and scalable way. The latter point is to be confirmed of course.
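For readers who want to see what "building an overlay" means at the packet level, here is a small Python sketch of VXLAN-style encapsulation. It follows the 8-byte header layout from the VXLAN specification (an 8-bit flags field with the VNI-valid bit 0x08, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are my own, the outer Ethernet/IP/UDP headers a real vSwitch would add are left out, and this says nothing about how Nicira's own tunnels are implemented.

```python
import struct
from typing import Tuple

# "I" flag from the VXLAN header: set when the VNI field is valid.
VXLAN_FLAG_VNI_VALID = 0x08


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend an 8-byte VXLAN header to an inner Ethernet frame.

    Outer Ethernet/IP/UDP headers are deliberately omitted in this sketch.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, 24 reserved bits.
    # Word 1: VNI in the top 24 bits, 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet: bytes) -> Tuple[int, bytes]:
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < 8:
        raise ValueError("packet shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", packet[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8, packet[8:]
```

The encapsulation itself is trivial, which is the point: the value is in the control plane that decides which VNI an endpoint belongs to and which tunnel endpoint a frame should be sent to.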

Many seem to think that this will be the end of networking as we know it, and that the physical network becomes a commodity. I think this isn't true. First, because building a large, scalable, fast and well-performing L3 network isn't rocket science, but it isn't something many have succeeded at either. There is a reason why Internet operators rely on just two companies for that: Cisco and Juniper.

Second, because as you try to use your physical topology more efficiently, and to provide differentiation to applications that require it, your PHYSICAL network must have a way to view and interact with your overlays.

And there is more. Once you have built an architecture that enables you to create such overlays to allow endpoint connectivity, what happens when connectivity needs to be done with elements outside of your overlay? You need a gateway out, and a gateway in. I can see ways in which, by leveraging OF and a controller, you can scale the gateway out from the vSwitch itself, but scaling the gateway in is more difficult, and chances are it will be done via appliances of some sort, which then need to be redundant, etc.

So I think the more Niciras we have the better. The more development we see that facilitates moving towards cloud architectures, the more demand there will be for high-performing and intelligent networks. I do not see commoditization happening in the network, for the same reason hypervisors haven't commoditized CPUs. Intel and AMD are now building CPUs that offer more services to the hypervisor layer; in the same way, we will see networks that offer better services to the overlay networks.

Net net, I am one who thinks that much of the complexity in networking isn't created because of wrongdoings in the past, or just legacy technologies. It is because dealing with a network isn't like dealing with endpoints. It is a complex and evolving challenge in itself.

Bright times ahead for the industry …
