Wednesday, September 7, 2011

SDN won't solve network problems if you just try to dismiss them

I am very excited overall about SDN, and recently about VXLAN, a protocol that is well suited for delivering L2 broadcast domains to support IaaS and sort of follows the SDN paradigm (a software controller instantiating the L2 domains as needed).

But I still have the perception that certain vendors, VMware in particular, lack interest in, and knowledge of, anything related to networking. The blog post on VXLAN by Allwyn Sequeira has some comments (and some minor mistakes) which feed my perception.

To begin with, there is the almost constant mantra that networks need to become "fast, fat and flat". For years networks have been faster than most applications could leverage (we have had 10GE for years, but no server was capable of using that capacity until a couple of years ago, and even today many servers being deployed do not have such capacity). The other part of the mantra (fat and flat) just shows ignorance, and I am sorry I can't be polite about it. Even more so when put in the context of VXLAN.

Engineering a network to perform, scale and provide fast convergence isn't an easy task. Period. VXLAN is cool, yes, but it looks like Allwyn, and many others too, forget the minor detail that for it to work, it requires a (very well) performing L3 multicast network. Of course this comes as no surprise from someone who writes "[...] tenant broadcasts are converted to IP multicasts (Protocol Independent Multicast – PIM).". IP multicast and PIM are two different (but of course tightly related) things. It is funny that people see "24-bit ID, so I can run millions of VNs" ... so cool ... anybody thought of running with millions of (S,G) entries in the network? ...

Of course I imagine you can (and will) map several VNIs to a single (S,G) but still ... Anybody with networking experience knows that running a multicast network with thousands of entries isn't a simple task, even less so if you want to achieve sub-second convergence on any network failure (hint: there is no fault tolerance built into VXLAN, but this is a minor detail for the VMware folk ...).

Anybody thought that current ToRs from merchant silicon vendors can't run more than 1K-2K mroutes in hardware? Anybody thought that they don't support BiDir PIM? ...
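
To put rough numbers on that gap, here is a back-of-the-envelope sketch in Python. It is only an illustration under my own assumptions: the 2000-entry budget is the upper end of the 1K-2K figure above, the 239.1.0.0 base address and the modulo mapping are hypothetical choices, and nothing here is how VXLAN itself says VNIs should be mapped to groups. It also counts groups only; with real (S,G) state every source adds more entries.

```python
import ipaddress

VNI_BITS = 24                 # width of the VXLAN Network Identifier
TOTAL_VNIS = 2 ** VNI_BITS    # 16,777,216 possible segments
MROUTE_BUDGET = 2000          # assumed hardware limit (upper end of 1K-2K)
# Hypothetical base address in the administratively scoped range.
GROUP_BASE = ipaddress.IPv4Address("239.1.0.0")


def group_for_vni(vni: int, budget: int = MROUTE_BUDGET) -> ipaddress.IPv4Address:
    """Fold a VNI onto one of `budget` groups (plain modulo, an assumption)."""
    if not 0 <= vni < TOTAL_VNIS:
        raise ValueError(f"VNI {vni} does not fit in {VNI_BITS} bits")
    return GROUP_BASE + (vni % budget)


def vnis_per_group(budget: int = MROUTE_BUDGET) -> int:
    """Worst-case number of VNIs sharing one group if every VNI were in use."""
    return -(-TOTAL_VNIS // budget)  # ceiling division


if __name__ == "__main__":
    print(f"possible VNIs:       {TOTAL_VNIS}")
    print(f"VNIs per group:      {vnis_per_group()}")
    print(f"VNI 5000 maps to:    {group_for_vni(5000)}")
    print(f"VNI 7000 maps to:    {group_for_vni(7000)}")
```

With these assumed numbers, VNI 5000 and VNI 7000 land on the same group, so every host joined for one of them also receives the flooded traffic of the other. Sharing groups keeps the mroute table within budget, but it trades that for unwanted traffic at the edge, which is exactly the kind of engineering decision that does not go away by calling the network "flat".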

Bottom line, running networks isn't complicated because network-heads are evil and want to ruin the happiness of application writers. There's more to it than evilness ...

I write this with the utmost respect for Allwyn and VMware in general. I just wish that one day I'll see bright people from the application world be open and willing to work with the stuff they don't know about or don't understand, as opposed to simply dismissing it and expecting it to be fast, fat and flat (... and dumb, they'd gladly add, for sure).

VXLAN is a cool protocol that does not solve any network complexity problems, but provides a great way to abstract L2 edge domains in virtual environments.