Saturday, October 15, 2011

On SDN, and OpenFlow - Are we solving any real problem?

I must admit that I started looking at SDN and OpenFlow with a lot of skepticism. It is not the first time I have faced a networking technology that proposes a centralized intelligence to set up the network paths for traffic to go through. Such thinking always reminds me of old circuit-based networks and the NewBridge 46020, which brought TDM-style path setup to FR and ATM and, when making inroads into ADSL, was offered as the solution to all evils. ATM LANE was, in a different way, another "similar" attempt.

Being a long-time CCIE makes me an IP-head, no doubt about it, and a controller-based network is something I really need to open my mind to in order to even consider it. I am trying though ...

The way I look at it, the question about SDN and OpenFlow alike is: what problem are we solving?

So far, I see three potential problems we are trying to solve:

-  scale: help build more scalable networks while spending less money
-  complexity: simplify running a complex network
-  agility: simplify or speed up implementing network services & network virtualization

In this blog post I try to look at SDN/OF from the angle of those three potential problems. I make little to no distinction between SDN and OF, and that is wrong because they aren't quite the same thing. Things like 802.1x, or even LISP to some extent, could arguably be considered Software Defined Networking, in that the forwarding and/or the policy is defined in a "central" software engine or database lookup.

But for the purpose of this post, I really look at OF as THE way for implementing SDNs. But before I begin ...

A Common Comparison I consider flawed ...

Many people point to existing Wireless LANs as proof that the controller-based approach works. Sure, look at wireless networks today: almost all use a controller-based approach ... well ... yes, but no. The biggest difference from OF, from a networking perspective, is that the WLAN controller FORWARDs all the traffic, which is tunneled in an overlay from each of the access points. So the controller IS part of the datapath, and an intrinsic part of it. In fact, it can be the bottleneck.

Moreover, the capillarity of a WLAN network is orders of magnitude lower than that of a datacenter fabric, so any analogy is, IMHO, flawed.


From a scalability perspective, my first take on SDN and OpenFlow focused on two points which I saw as big limitations:

1. a totally decoupled control plane (perhaps centralized - albeit distributed in a cluster) requires an out of band management network which could limit scale and reliability (plus add to the cost)
2. programming "flows" on device TCAMs using OF will not scale, or at least will not provide any savings in the CAPEX driven by the networking hardware itself

I see point one above less as a limitation now, so long as we really succeed at achieving a large simplification of the network in all other areas (beyond management), thanks to the SDN approach.
It isn't unusual to have an OOB management network in tier-1 infrastructures anyway. In the SDN/OF approach, however, the OOB network is really a critical asset (even more than critical ...). It must have redundancy with fast convergence built into it and be built to scale as well. We also need to factor in the cost of running and operating this network. And as the policy and/or the number of flows grows, the cost of the controller cluster itself may be non-negligible.

Point two from above is still one where I need to better understand how things will be done. At first glance, I thought this was going to be a big limitation because I thought each network forwarding element would be managed like a remote linecard, programmed perhaps using OpenFlow. In such a case, the hardware table sizes of each network element would limit the entire infrastructure, because for L3 forwarding you want to keep consistent FIBs, and for L2 forwarding even more so, to minimize flooding. Hence, if your forwarding elements are limited to, say, 16K IPv4 routes, that's the size of your OpenFlow network ... there are ways to optimize that by programming only part of the FIB, which is possible as the controller knows the topology. But then if there are flow exceptions ...

But then of course, things change if you consider that ... why would you need to do "normal" L2 or L3 lookups for switching packets? You can forget about L2 and L3 altogether (potentially).  And then, I assume the "controller" could keep track of the capabilities of each node including table sizes, and program state only as needed and where needed. This adds complexity to the controller, but should help scaling.

But can this scale if hardware is programmed per flow?

I still don't really see this happening. There are two issues: scaling the flow setup (a software process), and scaling the hardware flow tables. I understand the flow setup is not necessarily THE problem, but still, let's review it. Let's say the first packet is sent to the controller for the flow to be set up, all of this over the OOB network. This will add delay to the initial communication and put load on the controller, but there is no reason why this can't scale out with multiple controller servers in a cluster, or by splitting the forwarding elements between different instances of the controller. All of this adds to the cost of the solution though (and adds management complexity too).
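To make the reactive setup concrete, here is a minimal Python sketch of a forwarding element with a bounded flow table that punts every table miss to a controller. Everything in it (ToySwitch, the controller callback, the capacity of 2) is a hypothetical illustration, not the OpenFlow API or any real controller. It only shows where the per-flow cost lands.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# One possible flow definition (an assumption): source and destination MAC.
FlowKey = Tuple[str, str]


@dataclass
class ToySwitch:
    """Hypothetical forwarding element with a fixed-size flow table."""
    capacity: int                          # hardware flow table size
    controller: Callable[[FlowKey], int]   # returns the output port for a flow
    table: Dict[FlowKey, int] = field(default_factory=dict)
    punts: int = 0                         # packets sent to the controller

    def forward(self, key: FlowKey) -> int:
        port = self.table.get(key)
        if port is not None:
            return port                    # fast path: hit in the flow table
        # Table miss: the packet takes the slow path via the controller.
        self.punts += 1
        port = self.controller(key)
        # Install the flow only if there is room. Once the table is full,
        # every packet of an uninstalled flow keeps hitting the controller.
        if len(self.table) < self.capacity:
            self.table[key] = port
        return port


# Toy controller (assumption): sends everything out of port 1.
sw = ToySwitch(capacity=2, controller=lambda key: 1)
for key in [("a", "b"), ("a", "b"), ("b", "a"), ("c", "d"), ("c", "d")]:
    sw.forward(key)
print(f"controller punts: {sw.punts}, installed flows: {len(sw.table)}")
```

The first packet of each flow pays the controller round trip, which is the software scaling problem. The full table is the hardware one: the third flow never gets installed, so all of its packets stay on the slow path.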

But what I don't see is this scaling at the forwarding chip level. I wonder how to program hardware with the SDN approach. The OpenFlow way seems to be to do it on a per flow basis, leveraging table pipelining.

Of course it all depends on what we call a flow. If we take source/destination IP addresses plus TCP/UDP ports, any aggregation switch will easily see hundreds of thousands of flows at any given time. Even at the ToR level the number of flows will ramp up rapidly. This will easily overwhelm the best silicon available from vendors such as Broadcom, Fulcrum or Marvell. We can indeed limit the flow definition to source/destination MAC addresses, or IP for that matter, but then that limits a lot what you CAN do with the packet flows.

So if host A wants to communicate with host B, that is two flows, a->b and b->a. That is, if you define a flow by source/destination MAC address. In this case, let's assume you have 48 servers connected to a ToR switch. Let's say there are 10 VMs per server (40 is very common in today's Enterprises by the way). Let's say each VM needs to talk to two load balancers and/or default gateway plus 10 other VMs. This means each VM would generate 12-14 flows. So each ToR switch would see 48 x 10 x 12 = 5,760 flows (times two, because they have to be bi-directional).

Now that isn't too much: chips like Trident can fit 128K L2 entries, which in this case would mean flows if we define them per MAC address. But think of the aggregation point which has 100+ ToR switches connected. Now those switches need to handle 576,000 flows (times two). Way more if you assume more than one vNIC per VM.
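The arithmetic above is easy to replay with other numbers. The short Python sketch below uses the figures from this post; the function names are mine, "128K" is taken to mean 128 x 1024 entries, and it assumes (as the text does) that every ToR flow also crosses the aggregation switch.

```python
# Back-of-the-envelope flow counts for MAC-pair flows, per the figures above.
SERVERS_PER_TOR = 48
VMS_PER_SERVER = 10
PEERS_PER_VM = 12                # 2 load balancers/gateways + 10 other VMs
TORS_PER_AGG = 100
L2_TABLE_ENTRIES = 128 * 1024    # assumption: "128K" means 131,072 entries


def flows_per_tor(servers=SERVERS_PER_TOR, vms=VMS_PER_SERVER,
                  peers=PEERS_PER_VM):
    """Unidirectional flows seen by one ToR switch."""
    return servers * vms * peers


def flows_per_agg(tors=TORS_PER_AGG, **kwargs):
    """Unidirectional flows at an aggregation switch, assuming every
    ToR flow also traverses it (the worst case used in the text)."""
    return tors * flows_per_tor(**kwargs)


for name, flows in (("ToR", flows_per_tor()), ("aggregation", flows_per_agg())):
    entries = 2 * flows          # a->b and b->a are separate entries
    verdict = "fits" if entries <= L2_TABLE_ENTRIES else "does NOT fit"
    print(f"{name}: {flows:,} flows, {entries:,} entries, "
          f"{verdict} in {L2_TABLE_ENTRIES:,}")
```

Calling flows_per_agg(vms=40) replays the same estimate for the 40-VMs-per-server case mentioned above.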

At any rate, if you want to handle overlapping addresses, you also need to add some other field to the flow mask ... So I still don't see how this can scale at all, certainly not using "cheap" hardware.

In the end, if you want to run fast, you'll need to pay for a Ferrari, whether you drive it yourself or have a machine do it for you.

But I see a benefit if we can run the network forwarding elements in a totally different way than what we do today. The options for virtualizing the network can be much richer, this is true, but they would come at a cost (discussed below). I also see a point in scaling the network beyond what current network protocols allow, which can be interesting. I certainly understand the interest from companies running very large datacenters, which tend to have very standardized network topologies that can benefit a lot from the SDN approach. But at the same time, I do not see why standard routing protocols can't be made to scale to larger networks too ...

I doubt OpenFlow will be the right approach in the enterprise for quite a while, because in that world you rarely build 100% from scratch. There is always a need to accommodate legacy, and this will mandate "traditional" networking for quite a while, no doubt (if SDN ever really takes off, that is). Sure, I know companies like Big Switch are looking at ways to use OF as an overlay on existing networks. We will have to wait and see how this works ...

Simplify Running a Complex Network

This point is a tough one. What is simple to some is complex to others. Someone with a networking background will not consider running an ISIS network which implements PIM-SM difficult, while someone with a software development background will find it very complicated. Likewise, the same software developer may think that running a cluster of servers which control other "servers" (that each do packet forwarding) is very simple.

SDN looks, on PowerPoint, very promising for network management simplification. But when you begin to dig into the details of how to do troubleshooting, how to look deep into the hardware forwarding tables, how to ensure path continuity or simply test it, etc., you begin to see that what was simple in concept becomes more complex.

Simplify Network Services

A lot of the writing I have seen around this focuses on the idea that once we have a "standard" way of programming the forwarding hardware (OpenFlow, that is), then all forwarding actions become sort of like instruction sets on a general purpose CPU. Hence, all network problems become solvable by writing programs that operate on the network using such instructions.

I have typically seen two examples given as quick wins of this approach: load balancing and network virtualization. Both are hot topics in any data center. Others point to fancier ones, like unequal load balancing, load-based routing, or even shutting down unused nodes. The latter speaks for itself and is foolish thinking ...

All the others CAN be done with traditional networking technologies, and if they are not implemented, it is typically for very good reasons.

On the point of load balancing and network virtualization, what I have seen so far are discussions at a very high level, which show how this can indeed be done. OpenFlow-heads praise how this is going to be not only simple, but even free! ...

Implement a load balancer, nothing simpler and cheaper. The fantastic OF Controller will simply load balance flows depending on the src address, for instance. Done. Zero dollars to it. Of course, this ignores the point of hardware (and software) flow table scalability already mentioned above. Of course, it also ignores the fact that a load balancer does A LOT MORE than push packets out of various ports depending on source address ... it keeps track of real IP addresses, polls the application to measure load, offloads server networking tasks, etc. There's a reason why people invest in appliances from Cisco or F5 to do load balancing. Switches (from multiple vendors) have been able to do server load balancing for a long time, but what you can do there just isn't enough for most applications. OF changes nothing there.
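For reference, the "free" load balancer described in those high-level discussions amounts to something like the Python sketch below: hash the source address, pick a server. The function name, the choice of CRC32 as the hash and the addresses are my own assumptions for illustration; no real controller or switch API is involved.

```python
import ipaddress
import zlib
from typing import Sequence


def pick_backend(src_ip: str, backends: Sequence[str]) -> str:
    """Map a source address to one backend with a stable hash.

    This is ALL the naive approach does. There is no health check, no
    load measurement and no session awareness: a dead or overloaded
    backend keeps receiving its share of the sources.
    """
    key = ipaddress.ip_address(src_ip).packed   # raw address bytes
    return backends[zlib.crc32(key) % len(backends)]


# Hypothetical real-server addresses.
backends = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
for src in ("192.0.2.10", "192.0.2.11", "198.51.100.7"):
    print(src, "->", pick_backend(src, backends))
```

The same source always lands on the same server, which is about the only property you get. Everything else in the list above (tracking real servers, measuring application load, offloading) still has to be built somewhere.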

Network Virtualization is another one where the complex becomes so simple thanks to SDN and OF. I admit I write here from ignorance of the actual work of companies like Big Switch or Nicira, of course. But most of what I read boils down to implementing an overlay network with a control plane running in software. That is no different from many other approaches today, or from what would be done using VXLAN or NVGRE. At any rate, I would argue LISP is a better choice, but alas, it does not solve the L2 adjacencies which are required for clustering and other reasons (which OF doesn't solve either).

I have seen many others propose that, thanks to OF, one can easily program the controller to use fields like MPLS tags, or VLAN tags, etc. to do segmentation à la carte. This is true, and fine. But I wonder, how is this good?!

And how is it different from doing traditional networking? Sure, Cisco, Brocade, Juniper, F10 and others could have decided to change the semantics of existing network fields to implement segmentation and many other features. And sometimes we have seen this done. But in so doing, they become proprietary. They don't interoperate.

IF a controller software vendor X provides a virtualization solution that works that way (by redefining the semantics of existing fields), it offers a solution that locks the customer in to that software vendor. It is a solution for which a tailor-made gateway will be required to connect to a standards-based network, or to a network from any other company.

Imagine company Z who runs a DC with a controller from software vendor X. Imagine company Z merges with company Y, who runs with a controller from a different vendor, ... imagine the trouble. Today, company Z running Juniper merges with company Y running Cisco and they connect their networks using OSPF or BGP, or just plain STP if they are not very skilled ...

I am sure I am missing the obvious when so many bright people praise OpenFlow. I just can't see how it solves problems that we can't solve today, or how it does it better. Or how it will really make for a better industry. Many would like to see a world where networking won't be dominated by two vendors. OF, at best, could change who those two vendors are, nothing more.

Conclusion ... (for now)

I think OpenFlow is a very interesting technology, and the SDN paradigm one that can contribute many good things to the industry. But up until today, almost everything that I have read about both topics is very high level, and idealistic to the point of being naive. In my opinion, a lot of the assumptions made about the problems that OpenFlow will solve are made without knowledge of other existing solutions. I have yet to see an expert in IP and MPLS praise OpenFlow for solving problems that we couldn't solve with those two, or for solving them in a clearly advantageous way.

So far to me, from my ignorance of things I admit, OpenFlow looks like just a different way to do the same thing. And for that ... what for?
