Taking a look at VMtracer with Arista

I’ve had the privilege of spending some time with the Arista product this week. For those of you who don’t follow HPE obsessively, some backstory – in the past, HPE led with the Comware product line in the datacenter, which came out of their 3Com acquisition. Comware was a solid product and many of our clients had a good experience with the platform… even though the CLI was… “unique” compared to some. Really, it just took some getting used to.

However, last year HPE announced that they were opening a shared partner program of sorts with the Arista product. Arista had been making a lot of positive waves in the datacenter networking environment, so I was happy to add another tool to my arsenal. But then several months later things became more serious as HPE announced that all new datacenter opportunities should be designed using the Arista lineup… and that the Comware gear wasn’t going to be their main focus moving forward.

Well, guess I better learn Arista then!

Arista uses Fedora Linux at the core, which allows for a lot of cool tricks. Tools like grep, cat, zcat, tcpdump, awk and more are ready to go from boot. Administrators can choose to work in the bash shell or in the Arista CLI, which will be very familiar to Cisco shops. Because of the Linux architecture, common networking protocols like STP, OSPF, and BGP all run as individual processes and can be restarted in isolation if problems pop up. VXLAN, MLAG and other datacenter goodies are supported as well.
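Here’s a rough sketch of what dropping into bash can look like; the prompts, interface names, and log file paths are approximations and may differ by EOS release:

    switch# bash
    [admin@switch ~]$ tcpdump -i ma1 -c 10                      # grab ten packets off the management interface
    [admin@switch ~]$ zcat /var/log/messages.*.gz | grep -i vmtracer   # dig through rotated logs
    [admin@switch ~]$ exit
    switch#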

One tool that stood out to me during the boot camp was the VMtracer. VMtracer allows the Arista switch to keep an eye on the virtual environment by collaborating with the vCenter server. A few weeks ago I published a post around NSX and how it marries the physical network infrastructure with the virtual, solving a lot of problems for the large datacenter and the inherent scaling issues present there. Arista fully supports the VXLAN tunnel overlay I described (in fact, Arista was one of the co-authors of the VXLAN standard), but with VMtracer the Arista gear also offers a convenient quality-of-life enhancement for smaller datacenters that still utilize an L2 topology.

It’s always better to show than tell, so here’s a quick walkthrough:

So first, within the CLI we configure VMtracer to communicate with vCenter – see below.

1 - Configure and Verify VM Tracer
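For reference, the session configuration is only a handful of lines. Here’s a rough sketch of what mine looked like; the session name, vCenter URL, and credentials are placeholders, so substitute your own:

    vmtracer session Lab
       url https://vcenter.example.local/sdk
       username svc_vmtracer
       password 0 NotMyRealPassword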

Once this is up and running the Arista switch and the VMware environment can exchange info. Running show vmtracer sess brings up the session and gives confirmation that we’re all set.

2 - sho vmtracer sess

After a few minutes the VM database will populate. You can check this with show vmtracer vm, which lists out the VMs, their host, and what interface on the Arista switch they are connected to. In this case everything is connecting via the trunk port Ethernet1.

3 - sho vmtracer vm

We can pull even more detail with the “detail” flag, which expands the output for each VM.

4 - sho vmtracer vm det

Someone who is good with Linux will be able to use pipes and grep-esque commands to quickly filter through the info, but frankly I am very rusty with Linux. Keep in mind that Arista also supports XMPP for multi-switch CLI, so you can pull info from multiple switches simultaneously through pre-defined XMPP groups. Here’s my attempt to pull up info on a specific VM – note how quickly you can determine where a VM is connected using this tool.

5 - sho vmtracer vm name
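In case the screenshot is hard to read, the two forms I leaned on were roughly these (the VM name is just a placeholder from my lab):

    switch# show vmtracer vm name Web-VM01
    switch# show vmtracer vm | grep Web-VM01

As far as I can tell, the pipe hands you a real grep, so the regular expressions you already know from bash work here too.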

Now, VMtracer includes a very cool function called “autovlan” that is enabled by default. You can verify its status by pulling up the VMtracer session as shown below. If vMotion moves a VM from one host to another, the physical network infrastructure tied to the new host must support the same VLANs. This can lead to administrators enabling every VLAN for every host just to avoid orphaning an unlucky VM that vMotioned to the wrong host. By using VMtracer, the Arista switch can instead add VLANs dynamically to support VMs as they move around the datacenter.

6 - sho vmtracer sess autovlan
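As an aside, if you would rather keep full manual control of your VLANs, my understanding is that autovlan can be switched off per session with something along these lines:

    vmtracer session Lab
       autovlan disable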

Here I run a show vlan command to see what is active on my switch. Notice that the VLANs tagged with an asterisk have been added to the switch dynamically by autovlan.

7 - sho vlan dynamic

Cool, huh?

Never fear, you can rein this in as needed. Let’s say I want to ensure that VLAN 1001 does not automatically populate on my switch for some security reason. I can go back into the Lab VMtracer session and remove 1001 from the allowed-vlan pool:

8 - trimming VLANs
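In configuration terms the trim looks roughly like this, again using my Lab session; double-check the exact keywords against the context-sensitive help on your EOS version:

    vmtracer session Lab
       allowed-vlan remove 1001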

To confirm that the change took effect, run show vmtracer sess.

9 - sho vmtracer sess with trim

Now I can pull up the list of VLANs again and confirm that VLAN 1001 is no more.

10 - sho vlan with trim

Pretty cool, right?

There’s a bunch of stuff like this that you can do with the Arista lineup. Time to brush up on Linux!


VXLAN – One Tunnel to Rule Them All

I often feel like half my job is just keeping up with the new advancements and offerings that constantly churn through my inbox. I think the blogosphere likes to call it… “Disruption.”  Some trends don’t make it much further than the marketing department’s brainstorming session, but some actually stick and begin to be widely accepted as practice.

From my perspective, the trends that don’t persist are ones that don’t solve a “real” problem – sure, there might be marginal benefits to the technology, but not enough to change buying behaviors. The trends that stick are the ones that actually solve a big underlying issue, opening up greater efficiency and productivity. Server virtualization is a great example of a trend that “stuck.” When virtualization was first getting started, there were a lot of naysayers and many that didn’t like how it was changing the datacenter, but it made a very compelling argument for itself with the resiliency, power savings, and hardware savings that it offered. As a result, server virtualization is more or less standard these days, with software-defined storage following it closely.

There is one element of IT that is heavily lagging when it comes to modernization, though – networking. If you look at some of the underlying principles behind network engineering, it makes sense. Networking at its core is a distributed communications system that uses very established protocols. No one entity “owns” the web, nor are updates rolled out across the world simultaneously. Tinkering with a protocol to “improve things” is more likely to cut you off from the rest of the world than to help modernize your infrastructure. And in the case of misconfiguration, a network outage is HIGHLY visible – if the network is down, so is everything else!

So, for those reasons, the network industry has been historically resistant to change. Configurations are still often done via the CLI. There is a steep tribal knowledge barrier to entry, and because networks can be taken down by forgetting a single keyword, any changes made to the network are tightly regulated. Networking follows the “if it ain’t broke, don’t fix it” mantra and that mantra has served it pretty well… for the most part.

I know I’m venturing into generalization, but I think networks are happiest when they are architected, configured… and then left alone. Nobody wants to be constantly logging in to the datacenter network infrastructure and making changes, given the massive blast radius when human error inevitably strikes. Human error is one of the leading causes of network outages, after all.

So, this mindset causes several issues when working with large “webscale” datacenters. Some of the key advantages that a virtualized datacenter brings are resilience, efficiency and optimization. If VMs start to run into problems at the host level, a well-designed VMware environment will work around them seamlessly – via vMotion, HA, DRS, and other key technologies – and the users will never realize what’s happening behind the scenes. They just know that everything works and will continue to work. The VMware datacenter can be constantly refreshing, updating, optimizing, shuffling, and so on – all in the name of user experience and efficiency!

The problem is – how does this interact with a network that is designed to be static? Often the network is statically configured in a ‘set-it-and-forget-it’ style. VMware doesn’t natively have a great way to tell OSPF that a VM in subnet A has moved to the other side of the datacenter because it will run better over there. So, instead the virtualization engineers asked the network engineers to extend L2 everywhere so communication wouldn’t be broken… and the network engineers wailed and gnashed their teeth, but it was all for naught as the project had already been approved and the network engineers hadn’t been invited to the initial planning meetings. L2 extension does allow VMware to be as dynamic as it pleases, because RARP allows VMs to move around on the network and they don’t have to be re-IPed each time they move (which would cause all kinds of complexity with user access as well)… but it comes at a cost.

Those of you familiar with networking no doubt recognize some of the pain that is inherent with L2 communications. It’s fast, sure – but in its raw state, volatile. An L2 segment can often rightly be considered a failure domain. Broadcast storms cause headaches, MAC tables have hardware limitations, some flavor of STP is a necessity, and so on. L2 is only meant to have a single active path for data, which throws conventional redundancy options out the window. L3, by comparison, is stable and resilient. During my slow, steady march to the dark side of datacenter networking I had to learn all the complex ways that companies were trying to solve the L2 datacenter problem in network hardware – SPB, TRILL, Fabricpath, VPLS – the hits kept on coming. And frankly, while I know that these are running happily in many production environments, they all seemed very complex to me and they required some expensive hardware and licenses just to function. (And to make matters worse, the one that made the most sense to me, TRILL, developed by Radia Perlman, who created the original Spanning Tree Protocol, was recently declared dead by some rather prominent individuals in the networking space due to vendor ASIC limitations. That’s a bummer.)

I know that following my thought pattern is a bit of a rabbit trail here, but recall what I mentioned at the start of this blog – in order for a technological trend to catch on, it has to solve a real problem. And hopefully I’ve illustrated that there is a real problem when pairing a dynamic environment with an environment that resists change.

Enter VXLAN, a newer protocol that is gaining a lot of steam. At its core, VXLAN is another tunneling protocol, one that encapsulates L2 frames inside of UDP packets. It orchestrates this through VXLAN Tunnel End Points (VTEPs), which act as entry points into the datacenter fabric and construct and tear down tunnels between each other to shuttle traffic around. The concept of tunneling isn’t new, and VXLAN is not vendor proprietary; it was originally created in collaboration by Arista, Cisco and VMware and it is being utilized by many vendors today. It allows the creation of an “Overlay” network that can be defined in software and runs over the “Underlay” network that is created via physical hardware.

The “Underlay” network can be constructed with a L3 design in a rock solid, resilient configuration. Network engineers only have to ensure that there is basic IP connectivity across the datacenter, and that the MTU is large enough to accommodate the VXLAN header that is added to the frame. Once the Underlay is in place, it will not often have to change.
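The MTU math is straightforward: VXLAN adds roughly 50 bytes of outer Ethernet, IP, UDP, and VXLAN headers to each frame, so a 1500-byte payload needs at least about 1550 bytes on the wire, which is why you will usually see 1600 (or simply full jumbo frames) recommended. On an Arista underlay link that might look something like this sketch, with placeholder addresses:

    interface Ethernet49/1
       description uplink to spine1
       no switchport
       mtu 9214
       ip address 10.1.1.1/31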

The “Overlay” network is where the real magic happens. Using VXLAN, software can rapidly construct tunnels that ride on top of the “Underlay” network, shaping the network as required by applications. If a VM needs to think that it has L2 adjacency to a VM on the other side of the datacenter, the VTEPs can construct a L2 tunnel between the two VMs. No hardware changes are required and again, the physical network hardware only needs to give basic IP connectivity between the two hosts. Even better, because the VTEPs in software keep track of MAC address tables and L2 connectivity, your hardware only needs to remember where the VTEPs themselves are and how to move traffic between the VTEPs. Everything else is out of sight and out of mind of your physical infrastructure. The real weight of the MAC address table is now in software, rather than inflating the cost of our hardware.
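To make that concrete, here is a rough sketch of a hardware VTEP on an Arista switch, mapping a local VLAN to a VXLAN Network Identifier (VNI) and flooding BUM traffic to a couple of remote VTEPs. All of the addresses and IDs are made up for illustration:

    interface Loopback0
       ip address 10.0.0.1/32
    !
    interface Vxlan1
       vxlan source-interface Loopback0
       vxlan udp-port 4789
       vxlan vlan 10 vni 10010
       vxlan flood vtep 10.0.0.2 10.0.0.3

In an NSX deployment the VTEPs live in the hypervisors instead of the switches, but the principle is the same: the physical network only ever needs to know how to reach the loopbacks.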

So, as I mentioned, there are several solutions out there that utilize VXLAN – Arista, Cisco’s ACI, VMware NSX, and others. We at Edge are particularly excited to be working with the VMware NSX platform (I believe almost our entire engineering and technical team holds the VCP-NV badge of honor, and yours truly is aggressively pursuing it). VMware has been virtualizing servers and abstracting storage for years now; it only makes sense that they would also tackle the last piece of the trifecta – the network!

VMware networking in its raw state has some limitations. First, the standard vSS/vDS is not able to route traffic – it can only switch traffic. This results in some less than ideal traffic patterns, as traffic is hairpinned on the physical network to move between VMs in a segmented multi-tier application. In addition, the physical network has to be configured to support changes in the virtualized network environment… and if network change control is in place (as it should be), this can take a long time!

VMware NSX solves these issues by bringing routing, security, edge services, load balancing, NAT and more down into the virtualized environment. Virtualization administrators can now spin up all the network services that they need within the virtual environment. By bringing L3 intelligence all the way down into the host, the hairpinning problem is solved. By enforcing network traffic policies at the vNIC level, even L2 East/West traffic is now secured.

As a result, the physical network is more or less abstracted with VMware NSX. If VMs need to be on the same subnet to operate but they get moved to separate hosts, VTEPs on each host construct a VXLAN tunnel between the two hosts to allow the traffic to pass through as if they were still L2 adjacent. These tunnels are architected by the VMware NSX platform in software, allowing them to fulfill the promise of a Software Defined Data Center.

VMware NSX consists of several key components:

  • NSX Manager
  • NSX Controllers
  • NSX vSwitch
  • NSX Edge Services Gateway

The NSX Manager integrates with vCenter Server to coordinate management across the VMware environment. It’s important to note that if the NSX Manager goes down or otherwise becomes unavailable, the data plane keeps forwarding traffic; you lose the ability to make configuration changes until it is restored, but NSX itself will continue to function.

The NSX Controllers share the burden of the control plane and coordinate network functions across the vSphere environment, ensuring that changes are kept in sync.

The NSX vSwitch is where L2, L3 and firewall decisions are made – within the host! This piece of NSX is responsible for a great deal of what gives the platform its “teeth,” so to speak.

Finally, the NSX Edge Services Gateway provides additional services that are not included in the vSwitch – services like IPsec VPN, NAT, load balancing, DHCP, DNS relay, and more. Edge Services Gateways can also be deployed as Perimeter Edges, which bridge the gap between the physical network and your virtual environment. Using dynamic routing protocols like BGP, OSPF or IS-IS they can coordinate changes between the two environments and advertise any changes you make in the NSX environment.

All these components work together to create a software defined data center. Finally, the last piece of the puzzle, networking, has been virtualized and can collaborate properly with the VMware environment. It’s starting to take off, too – if you watched VMware’s earnings report a few weeks ago, you’ll know that NSX is on track to be a billion-dollar product line by the end of 2017. That’s probably artificially boosted by the steep price tag that NSX carries, but hey, still impressive.

There are many other benefits that NSX offers the datacenter that I haven’t covered in this blog post – service insertion for traffic filtering, microsegmentation, and so on – and there are many different ways that it can be implemented. Really, I’ve only scratched the surface. But I’ve rambled on more than long enough at this point. If you’d like to learn more about VMware NSX and start going into specifics about how it will interact with your environment, drop me a line and I’d be happy to compare notes. I have a bunch of big books filled with all the specifics, fine print, addenda and nerd knobs.

Software Defined Stuff


So, this is going to be a bit of an informal blog post, but I’m in the middle of a weeklong boot camp for VMware NSX and I wanted to share a few things I’ve learned with the internet.

First, this is one of the first times that I’ve had a chance to get down-and-dirty with VMware. Shameful, I know, but I’ve really been more historically focused on layers 1 – 4. So I appreciate the chance to see things from the other side and broaden my horizons.

Second, and this was the topic of much debate in the class, I think that NSX is going to require that the virtualization team learn networking. One of the big bottlenecks in IT without NSX in place is waiting on the network team to make changes to the physical infrastructure to accommodate the changes put in place at the virtual level. NSX now allows the virtualization team to deploy virtual switches, routers, and firewalls (making L3 routing between hosts in separate subnets possible in a strictly virtualized environment, without having to hairpin on the physical network) AND it abstracts out the hardware layer when traffic needs to go between hosts. Frankly, the networking team is not going to be the ones logging in to VMware and creating new virtual switch port groups every time a change needs to be made… that’s going to be the virtualization team.

As a side note, NSX’s virtual environment still plays by the same rules as physical networks, so the networking knowledge that you’ve obtained will not go to complete waste in this “bold new future.” OSPF still has all the same timers and requirements, you still need to redistribute routes, subnetting didn’t go away, and good old BGP is still trundling along. And yes, the VMware virtual router can interact with physical routers, creating adjacencies and sharing routes and all that good stuff.

Third, in a completely virtualized environment, this really simplifies the maintenance and design of the physical network. NSX uses VXLAN tunnels to achieve L2 adjacency between VMs. These tunnels are automagically constructed in software, and can be brought up and torn down regardless of where the VM winds up in the datacenter, as long as there is basic connectivity between hosts. When I was working through the MASE exam earlier this year, I felt like HPE was shouting to the heavens “Look at us! We can do all this finicky L2 extension to satisfy that blasted vMotion requirement, same as everyone else! We have TRILL! We have SPB! We have VPLS! We have EVI!” Well, now it seems to me that all of that doesn’t matter as much anymore, because all NSX requires to simulate L2 adjacency across the datacenter is underlying IP connectivity and an MTU of 1600 or more. You can picture the physical network as a rock solid underlay that shouldn’t require tweaking after the initial setup and NSX as the overlay that constructs tunnels over top of it. The networking team of yesteryear can create one big BGP datacenter and let it run… and NSX will do what it needs to do without any outside intervention required.
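That “one big BGP datacenter” underlay really can be that boring: each switch just advertises its loopback (the future VTEP address) toward its uplinks and then gets left alone. A rough sketch for one hypothetical leaf, written in Arista-style syntax, though any box that speaks BGP will do:

    interface Loopback0
       ip address 10.0.0.11/32
    !
    router bgp 65011
       router-id 10.0.0.11
       neighbor 10.1.1.0 remote-as 65000
       neighbor 10.1.1.2 remote-as 65000
       network 10.0.0.11/32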

Fourth, I would highly recommend taking a look at this technology or a similar tech if you’re a network engineer. I know, I know, it’s marketing heavy and virtualization of the network isn’t happening as quickly as any of the highly paid experts predicted, but it is happening. Personally, I would rather be the one having fun designing the network layout in VMware rather than be the one ensuring that the physical underlay is still in place.

Early morning slap-dash blog post with my two cents, make of it what you will.

OpenFlow

Software Defined Networking has been bouncing around the marchitecture blogosphere for a few years now (and apparently onto my own blog), but what is it exactly? SDN-enabled switches and routers promise to actually interact with the applications that are cruising along on top of them, rather than blindly shuttling packets from point A to point B. This kind of capability opens up a lot of new options for your network. It could allow you to hand-craft routes for particular classes of traffic without having to bring up the command line for dozens of routers. It could allow you to automatically prioritize and optimize video sessions as they are created, rather than having to rely on a blanket of “catch-all” rules. You could even create specific rules for individual users, tunneling all of their traffic (even intra-network traffic) through an NGFW until their hosts are deemed healthy again.

So with all of this promise, why are we still hugging tight to our trusty CLI?

In my opinion, while it is an exciting technology, the term “software defined networking” has become a bit of a catch-all as more and more companies introduce their own version of SDN… and as a result, there can be some confusion around how it actually works underneath the hood. And to be frank, networking is a distributed system… so it is critically important that each piece and part 1) be reliable and 2) play well with others. Despite its quirks, we at least know how STP is going to behave, we can work around those quirks, and we know that it’s generally going to be present in most environments. Completely reinventing tried and true protocols in favor of being at the bleeding edge of technology seems unnecessarily painful.

However, earlier this year I took a course around OpenFlow® and it opened my eyes a little bit to how SDN can complement the network, rather than replace it.

First, what is OpenFlow®? OpenFlow® is defined by the Open Networking Foundation (https://www.opennetworking.org/) as the first standard communications interface defined between the control and forwarding layers of an SDN architecture. In short, OpenFlow® can modify the forwarding plane on devices such as switches and routers remotely, creating a centralized and programmable control plane.

There are two common approaches to designing an OpenFlow®-based software defined network – a pure OpenFlow® architecture and the hybrid model. In a pure OpenFlow® system, conventional routing logic and MAC address tables are replaced by “flow tables” that are built entirely through OpenFlow®. This gives the network administrator complete control over the network, but it requires a lot of experience with protocol behaviors and keen attention to detail – because spanning tree won’t be there to help out if a switching loop is introduced! This style of design doesn’t hold much appeal for me, for the reasons detailed above. The other option is the more commonly used hybrid model, which overlays OpenFlow® on top of a conventional network. In the hybrid model, relevant network traffic is compared against the OpenFlow® flow tables first, but if there isn’t a specific rule defined for the flow it is passed down to the conventional network intelligence and handled as usual. The hybrid model gives flexibility while still utilizing tried and true protocols.

HPE Aruba’s lineup of campus switching is a good example of the hybrid OpenFlow® architecture in action. Most of HPE Aruba’s campus switches are OpenFlow® ready out of the box. They have solid conventional network protocol support at the core that we all know and love/hate, but you have the option to introduce OpenFlow® enhancements with no additional feature licenses required. These switches have custom ASICs that were designed to provide full featured OpenFlow® support (take a look at the TCAM specs) and as a result they can perform flow table operations in hardware at nearly line rate. This makes the network “programmable” – for example, you can specify that a specific traffic type from a specific host signature be mirrored to another location for analysis, modified as needed, or even completely rerouted… while everyone else still moves along as usual.

So what pieces and parts make up the HPE Aruba SDN system? At the core is the SDN controller, which is installed on a VM in your environment. This controller establishes OpenFlow® channels to each managed switch so that it can orchestrate a centralized control plane. These channels also allow the controller to insert or retrieve packets for analysis. Once the SDN controller is in place, networking apps can be loaded onto the controller to add additional features. Think of the HPE SDN controller like your smartphone – if you want added features on your phone (friend finding, picture filters, etc.), you go to the app store, download an app, and you’re all set. HPE Aruba’s SDN system is similar. If you want to add features to your network you can go to the HPE SDN App Store, select the functionality you want, install it on your controller, and you are up and running.

HPE’s SDN App Store has 32 applications currently available for download, offering network enhancements ranging from active honeypot technology to software defined load balancing. Three premium applications are currently available from HPE, offering services like packet analysis from anywhere on the network, URL filtering through DNS interception at the access layer, and voice and video call prioritization for Skype for Business. In addition, the SDN controller can integrate with other software platforms thanks to its northbound RESTful APIs. Aruba’s ClearPass software can pass authenticated user information to the SDN controller via these APIs, which can then find and mirror network traffic specific to a named employee, regardless of where they’ve wandered on the network – making life much easier.

While SDN is still waiting for its “big break,” I think it promises to introduce significant changes to the networking landscape… eventually. What are your thoughts?

To learn more about the Open Networking Foundation: https://www.opennetworking.org/index.php

For a case study around South Washington County Schools that implemented HPE SDN: https://www.youtube.com/watch?v=jtlb6i4UcwM

For a case study around BAMA Company that implemented HPE SDN:
https://www.youtube.com/watch?v=nftU3oi75QU

 

SD-WAN Solutions

Tin Can String

Below is a rough “marketing blog” that I wrote for Edge. Forgive some of the marketing speak, but I wanted to share it here since SD-WAN is a hot topic with a lot of exciting potential. If you want to swap ideas and compare notes… hit me up!

BEGIN TRANSMISSION—–

Is your wide area network a source of stress? Office boundaries are blurring as campuses evolve and expand, more and more services are being stored in a centralized location, and just about everything is moving to the network, making wide area network connectivity a proud member of the “mission critical” realm. Without a well-designed WAN you are sure to have some headaches!

There are two predominant WAN solutions out there today – MPLS packaged with related services like VPLS, and a network of point-to-point VPN tunnels over the internet. Each has its own advantages and drawbacks. For example, MPLS can offer a higher level of guaranteed service and can host a variety of services like Metro Ethernet, but it can be expensive. VPN tunnels over the internet are more affordable, but they can be more susceptible to jitter and disruption, and managing a full mesh of VPN tunnels can be a hassle without the right hardware and an ADVPN-based or similar solution. The standard “Hybrid WAN” approach balances these options by bundling the two services together and routing traffic across the links via policies, often using MPLS for critical traffic and a larger broadband link for less sensitive applications.

This standard static routing definition does leave something to be desired. First, you are still paying a substantial amount for the high performance circuit. On average, a private WAN subscription will cost 10x more per Mbps than a broadband link. Second, you are often paying for bandwidth that you aren’t using … even though a broadband pipe can be sized up to multigig speeds, it is often underutilized because it has a bad reputation for being less stable. However, broadband internet is improving and in some cases is able to offer latency that would be acceptable for mission critical applications. What if you could closely monitor the performance of that broadband link and intelligently use the excess bandwidth for your prioritized traffic, but only when the health of the pipe meets your application’s criteria? A software defined WAN could give you that capability.

The world of SD-WAN has emerged to solve classic WAN problems like the ones listed above. Manufacturers have different definitions for SD-WAN technology, but the underlying intent is the same – to simplify your wide area network administration and utilize your bandwidth more intelligently. This can be done through a variety of methods. Some manufacturers provide hardware that coordinate tunnel creation through a management platform and gather real time link health information so traffic is sent down the best path. Some companies do this via a software platform that can be installed in a public cloud like AWS, allowing for easy migration to a hybrid cloud model. Some remove all responsibility from your premises and provide peering locations on the web that take over the WAN for you and get the traffic where it needs to be. Several manufacturers are capable of monitoring each individual packet as it traverses the WAN and collecting real time telemetry, allowing for packet by packet forwarding decisions, millisecond failover between links, and session continuity even in link failure. Even better, with the level of insight and QoS that the telemetry can provide you can start moving away from expensive MPLS subscriptions entirely and instead start bundling multiple broadband links together to provide a resilient WAN. The desired result is a seamless and high performance WAN that can support the new generation of collaboration tools.

There is no “one size fits all” SD-WAN solution, as every network has its own unique set of requirements and challenges and every manufacturer has their own take on this new technology. But that’s why Edge is here to help. If you are interested in learning more about how you can use SD-WAN to lower your bandwidth bills and enhance your user experience give us a call!