Tuesday, March 21, 2017


Ivan, I believe you are misinformed. I am not talking specifically about the vSphere API that AVS uses, and the constant rumours about it, but more about ACI in general. 

It is true that VMware is becoming a company whose technology is increasingly vertically integrated. Eventually they want you, if you use their hypervisor, to also use their SDN solution and their orchestration system and their VDI and ... I think the market will steer things otherwise … but we shall see! 😃

I don’t know how much hands-on time you have had with ACI recently. I’d love to spend time showing you how some of the things we do with ACI work. Meanwhile, here is my respectful feedback.

I definitely disagree with some comments that you made. Some are in fact open for debate, for instance:

“You can’t do all that on ToR switches, and need control of the virtual switch.”

I would say you can do a lot of what you need to do for server networking on a modern ToR. And yet you are right: you do need a virtual switch. That is clear. How much functionality you put on one vs. the other is a subject for debate, with pros and cons.

But in an SDN world with programmable APIs, it does not mean you need YOUR virtual switch in order to control it. You just need one virtual switch that you can program. That is all.

There’s a lot that we can do on AVS that we can do on OVS too. And we do it on OVS too. There’s a lot we do on AVS that we can’t do with VDS. But there’s enough that we can do in VDS so that, when combining it with what we do on the ToR, we deliver clever things (read more below).

The comment below, on the other hand, is imho misinformed:

“That [running without AVS] would degrade Cisco ACI used in vSphere environments into a smarter L2+L3 data center fabric. Is that worth the additional complexity you get with ACI?”

First, it is wrong to assume that in a vSphere environment ACI is no more than a smarter L2+L3 data center fabric. But even if it were only that … it is a WAY smarter L2+L3 fabric.

And that leads me to your second sentence. What is up with the “additional complexity you get with ACI”?

This bugs me greatly. ACI has a learning curve, no doubt. But we need to understand that the line between “complex” and “different” is crossed by eliminating ignorance. 

Anyone who has upgraded more than a handful of switches from any vendor, and then runs a network upgrade (or downgrade) on dozens and dozens of switches under APIC control, will see how much simpler it becomes.

Configuring a new network service involving L2 and L3 across dozens and dozens of switches in multiple data centers is incredibly simple. Presenting those new networks to multiple vCenters? Piece of cake. Finding a VM on the fabric by querying for the VM name? … done. Reverting the creation of the previously created network service is incredibly simple too. I mean … compared to traditional networking … let me highlight: INCREDIBLY simple.

Changing your routing policies to announce or not announce specific subnets from your fabric, or updating storm control policies on hundreds or thousands of ports, … or - again - reverting any of those changes, becomes really simple. Particularly when you think about how you were doing it on NX-OS or on any other vendor’s box-by-box configuration system.
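To make the "find a VM by name" point a bit more concrete, here is a minimal sketch of how such a lookup could be expressed as a single class-level query against the APIC REST API. Treat the details as assumptions rather than a reference: the host name is made up, the helper vm_lookup_url is hypothetical, and I am assuming that VMs learned through a VMM domain appear as objects of class compVm with a name attribute. Authentication and actually sending the request are left out.

```python
from urllib.parse import quote


def vm_lookup_url(apic_host: str, vm_name: str) -> str:
    """Build a class-level query URL that filters VM objects by name.

    Assumption: VMs known through a VMM domain are exposed as class
    'compVm' with a 'name' attribute. Check the object model of your
    APIC release before relying on this.
    """
    # Filter expression of the form eq(<class>.<attribute>,"<value>").
    flt = f'eq(compVm.name,"{vm_name}")'
    # Percent-encode the quotes, keep the filter punctuation readable.
    encoded = quote(flt, safe="(),.")
    return (
        f"https://{apic_host}/api/class/compVm.json"
        f"?query-target-filter={encoded}"
    )


if __name__ == "__main__":
    # Hypothetical controller and VM names, for illustration only.
    print(vm_lookup_url("apic.example.com", "web-01"))
```

The point is less the exact URL than the shape of the operation: one query to the controller, instead of logging in to switch after switch to chase a MAC address.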

And the truth is that APIC accomplishes all of that, and more, with a very elegant architecture based on distributed intelligence in the fabric combined with a centralised policy and management plane on a scale-out controller cluster. 

Other vendors require combining six different VMs performing three different functions just to achieve a distributed default gateway that is only available to workloads running on a single-vendor hypervisor. Now that’s complex, regardless of how well you know the solution.

Finally, when it comes to using VDS and ACI, we can have multiple EPGs mapped to a single vSphere PortGroup and associate the VM to the right EPG (and correspondingly the right policy) dynamically using vCenter Attributes. To do that we leverage PVLAN on the VDS, but just at the base PortGroup level, not across the infrastructure. In a way, we use an isolated PVLAN to make the VDS perform as a FEX, so that we can ensure that all VM traffic can be seen, and classified, on the ToR. And that works very well. And we are not the only ones doing that … btw. 
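The classification idea above can be sketched as a toy model. This is not how APIC implements it; it only illustrates the logic of "one base PortGroup, many EPGs, chosen by VM attribute". All names here (VM, AttributeRule, classify, the EPG and PortGroup names) are hypothetical, and I am assuming first-match-wins rule ordering for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VM:
    name: str
    port_group: str
    # vCenter-style attributes, e.g. {"tier": "web"}
    attributes: dict = field(default_factory=dict)


@dataclass
class AttributeRule:
    epg: str    # EPG to place the VM in when the rule matches
    key: str    # attribute name to inspect
    value: str  # attribute value that must match exactly


def classify(vm: VM, base_port_group: str, base_epg: str,
             rules: list) -> Optional[str]:
    """Return the EPG for a VM attached to the shared base PortGroup.

    VMs on other PortGroups are out of scope (None). VMs that match
    no rule stay in the base EPG. First matching rule wins (assumed).
    """
    if vm.port_group != base_port_group:
        return None
    for rule in rules:
        if vm.attributes.get(rule.key) == rule.value:
            return rule.epg
    return base_epg


if __name__ == "__main__":
    rules = [AttributeRule("web-epg", "tier", "web"),
             AttributeRule("db-epg", "tier", "db")]
    for vm in (VM("web-01", "base-pg", {"tier": "web"}),
               VM("misc-01", "base-pg")):
        print(vm.name, classify(vm, "base-pg", "base-epg", rules))
```

Because the isolated PVLAN sends every frame up to the ToR, the lookup in this sketch corresponds to a decision made on the leaf switch, not on the hypervisor.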

The clear disadvantage of that approach is that you lose local switching on the hypervisor. This may be a show stopper in some cases. However, for organisations running DRS clusters it is rarely an issue from a practical standpoint: it is difficult to predict how much traffic is switched locally anyway, and switching through the ToR often has only a marginal impact on application performance.

On the other hand, the advantage of this approach is that you keep your hypervisor clean of any networking bits and with the simplest configuration, thus making operations simpler because network maintenance does not need to upset the DRS cluster, and vice versa.

The other advantage is that you get programmable network and security at the VM level without having to pay per-socket licenses and service fees. And that you can extend all of that to other environments running Hyper-V, KVM, etc.
