
For most of its history, BGP — the Border Gateway Protocol — was the routing protocol that enterprise engineers heard about but didn’t need to understand deeply. It lived at the edge of the internet, managed by service providers and large carriers, governing how traffic flowed between autonomous systems across the global routing table. If you were an enterprise network engineer, your job ended at the WAN handoff. BGP was someone else’s problem.

That’s no longer true. BGP has moved decisively into the enterprise, and the gap between where the protocol is being deployed and where most enterprise teams’ BGP knowledge actually sits is widening. This article is about that gap — why it exists, why it matters, and what to do about it.

Where BGP Is Now

The scope of BGP’s expansion into enterprise environments is broader than most IT leaders appreciate:

Data center fabrics. As leaf-spine architecture became the dominant data center topology, BGP followed it in. The now-standard approach runs eBGP as the underlay routing protocol: leaf switches peer with spine switches, commonly with a distinct autonomous system number for each leaf and a shared number for the spine tier. Organizations running modern data center fabrics are therefore operating BGP at scale, whether they fully recognize it or not. Add EVPN on top as the overlay control plane, and BGP is now responsible for distributing MAC/IP reachability information across the entire fabric.
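As a rough illustration of what "an AS number per leaf" means in practice, here is a minimal Python sketch of an ASN plan for a small fabric. It assumes one common convention (a single shared ASN for the spine tier, a unique ASN per leaf) and draws from the 2-byte private ASN range 64512 to 65534. The function name and the device names are hypothetical.

```python
# 2-byte private-use ASN range.
PRIVATE_ASN_FIRST = 64512
PRIVATE_ASN_LAST = 65534


def plan_asns(leaves, spine_asn=PRIVATE_ASN_FIRST):
    """Assign one shared ASN to the spine tier and a unique ASN to each leaf.

    Assumption: spines share an ASN and leaves are numbered sequentially
    after it. Real designs vary (per-spine ASNs, 4-byte ASNs, ASN reuse).
    """
    if not PRIVATE_ASN_FIRST <= spine_asn <= PRIVATE_ASN_LAST:
        raise ValueError("spine ASN is outside the 2-byte private range")
    first_leaf_asn = spine_asn + 1
    if first_leaf_asn + len(leaves) - 1 > PRIVATE_ASN_LAST:
        raise ValueError("not enough 2-byte private ASNs for this many leaves")
    plan = {"spines": spine_asn}
    for offset, leaf in enumerate(leaves):
        plan[leaf] = first_leaf_asn + offset
    return plan


plan = plan_asns(["leaf1", "leaf2", "leaf3"])
for device, asn in plan.items():
    print(device, asn)
```

Because every leaf has a different ASN from the spines, every leaf-to-spine session is eBGP, which is what lets ordinary AS path loop prevention work inside the fabric.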

Campus and enterprise WAN. RFC 7938, “Use of BGP for Routing in Large-Scale Data Centers,” documented a BGP-only design for large data center networks. It was influential, and the pattern has spread beyond the data center. Some enterprise organizations running large, complex campus environments are adopting BGP to replace OSPF or EIGRP, drawn by BGP’s policy flexibility and its alignment with the data center protocols their teams are already running.

SD-WAN underlay and overlay. Most SD-WAN platforms use BGP to exchange routing information between branch sites and data centers, and between the SD-WAN overlay and the underlying transport. Even organizations that think of their WAN as “managed by the SD-WAN box” are often running BGP that needs to be understood when troubleshooting path selection issues.

Cloud connectivity. AWS Direct Connect, Azure ExpressRoute, and Google Cloud Interconnect all use BGP to exchange routing information between your on-premises environment and the cloud provider’s network. If your organization has private cloud connectivity, not just internet-based VPN, you’re running BGP at the cloud edge. The AS path, MED, and BGP community attributes that control how traffic flows between your data center and your cloud presence require BGP knowledge to manage correctly.

Multi-cloud and hybrid environments. As organizations run workloads across multiple cloud providers and maintain on-premises footprints, the routing fabric that connects them becomes more complex. BGP is the glue that holds it together — and understanding it is essential for diagnosing connectivity problems, optimizing traffic paths, and designing reliable hybrid architectures.

Why This Creates a Skills Gap

BGP is a powerful protocol, and its power comes from its complexity. Unlike link-state protocols like OSPF and IS-IS, which have relatively straightforward convergence behavior, BGP is a path-vector protocol with a rich set of attributes that govern path selection. Communities, local preference, MED, AS path prepending, route maps, prefix lists, route reflectors: each of these has behavior that can be subtle and, when misconfigured, consequential.
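To make the interaction of these attributes concrete, here is a minimal Python sketch of the first few steps of BGP best-path selection. It is a simplification: real implementations add more tie-breakers (eBGP over iBGP, IGP metric to the next hop, router ID) and vendor-specific steps such as weight. The `Route` class and `best_path` function are illustrative names, not any vendor's API.

```python
from dataclasses import dataclass

# Origin codes in BGP's order of preference: IGP, then EGP, then incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    """Hypothetical, simplified view of one BGP path to a prefix."""
    next_hop: str
    local_pref: int = 100   # higher wins
    as_path: tuple = ()     # shorter wins
    origin: str = "igp"     # lower ORIGIN_RANK wins
    med: int = 0            # lower wins


def best_path(routes):
    """Return the preferred route using a simplified decision order.

    Assumption: MED is compared across all candidates. By default, routers
    compare MED only between routes learned from the same neighboring AS.
    """
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), ORIGIN_RANK[r.origin], r.med),
    )


candidates = [
    Route("10.0.0.1", local_pref=100, as_path=(65001,)),
    # Prepending lengthens the AS path, making this route less attractive...
    Route("10.0.0.2", local_pref=100, as_path=(65001, 65001, 65001)),
    # ...but local preference is evaluated first, so this longer path wins.
    Route("10.0.0.3", local_pref=200, as_path=(65002, 65003, 65004, 65005)),
]

winner = best_path(candidates)
print(winner.next_hop)
```

The point of the example is the ordering: a single local-preference change overrides any amount of AS path prepending, which is exactly the kind of interaction that surprises engineers who know the attributes individually but not the decision order.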

The engineers who built their skills during the OSPF and EIGRP era, which describes a significant portion of the enterprise networking workforce, have limited exposure to BGP beyond basic peering configuration at the WAN edge. They know that BGP exists. They may have configured a few BGP sessions at an MPLS handoff. But the kind of deep operational fluency needed to manage a BGP-based data center fabric, troubleshoot path selection issues in a multi-cloud environment, or design a BGP routing policy for an enterprise WAN is genuinely uncommon outside of service provider environments.

Meanwhile, the enterprise has quietly become a BGP environment. The protocol is running in places it never ran before, often configured by specialists during the initial deployment and then handed off to teams who were not involved in the design. When something breaks — and eventually something always breaks — the engineers on call may not have the background to diagnose it quickly.

What IT Leaders Need to Understand

You don’t need to become a BGP expert to manage an organization that runs BGP. But there are several things IT leaders need to understand and account for:

BGP failures can be silent and subtle. Unlike a link failure, which is immediately visible, a BGP misconfiguration can cause traffic to take suboptimal paths, certain prefixes to be unreachable from specific locations, or workloads to experience intermittent connectivity issues that are difficult to correlate with the underlying cause. These problems can persist for extended periods without triggering obvious alerts.

Route leaks and misconfigurations have significant blast radius. A BGP misconfiguration that causes incorrect routes to be advertised — whether into a data center fabric, a cloud connection, or an SD-WAN overlay — can affect large portions of your environment simultaneously. Proper prefix filtering, route policies, and maximum-prefix limits are essential safeguards that are often underimplemented in enterprise BGP deployments.
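The logic behind those safeguards is simple enough to sketch. The following Python example models an inbound prefix filter combined with a maximum-prefix check. It is an illustration of the idea, not router configuration: the allowed block, the limit, and the function name are all hypothetical, and this sketch counts prefixes after filtering, whereas real platforms differ in when the limit is applied.

```python
import ipaddress

# Hypothetical policy: the only address space this peer may advertise to us.
ALLOWED = [ipaddress.ip_network("10.20.0.0/16")]
MAX_PREFIXES = 3  # deliberately tiny for the example


def filter_advertisements(prefixes, allowed=ALLOWED, max_prefixes=MAX_PREFIXES):
    """Return (accepted, rejected, within_limit) for a list of prefix strings.

    accepted: prefixes that fall inside an allowed block
    rejected: everything else (a potential route leak)
    within_limit: False if the accepted count exceeds the maximum-prefix limit
    """
    accepted, rejected = [], []
    for text in prefixes:
        net = ipaddress.ip_network(text)
        # subnet_of() requires both networks to be the same IP version.
        if any(net.version == a.version and net.subnet_of(a) for a in allowed):
            accepted.append(text)
        else:
            rejected.append(text)
    return accepted, rejected, len(accepted) <= max_prefixes


accepted, rejected, within_limit = filter_advertisements(
    ["10.20.1.0/24", "10.20.2.0/24", "0.0.0.0/0", "192.168.0.0/16"]
)
print("accepted:", accepted)
print("rejected:", rejected)
print("within limit:", within_limit)
```

Here the default route and the unrelated /16 are dropped instead of propagating. Without the filter, a peer that accidentally advertised a default route could attract traffic from everything downstream.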

BGP operational knowledge needs to be explicit, not assumed. If your data center was designed by a VAR or vendor professional services team, there’s a reasonable chance that the BGP configuration is functional but not fully understood by your internal team. This is a risk. Document the design, understand the routing policies, and ensure that at least two people on your team can answer the question: “What happens to traffic between these two subnets if this spine switch fails?”
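One way to make that question answerable is to write down the fabric's connectivity and test it. The Python sketch below models a two-tier leaf-spine fabric as a set of links and lists which spines can carry traffic between two leaves, with and without a failure. It is a topology-level model under stated assumptions: leaves never act as transit for other leaves, and routing policy and convergence timing are ignored. The device names and the single-homed `leaf3` are hypothetical.

```python
def paths_between(leaf_a, leaf_b, links, failed=frozenset()):
    """Return the sorted spines usable as transit between two leaves.

    links is a set of (leaf, spine) pairs. A spine is usable only if both
    leaves connect to it and it is not in the failed set.
    """
    spines_a = {spine for (leaf, spine) in links if leaf == leaf_a}
    spines_b = {spine for (leaf, spine) in links if leaf == leaf_b}
    return sorted((spines_a & spines_b) - set(failed))


# Full mesh of three leaves to two spines...
LINKS = {
    (leaf, spine)
    for leaf in ("leaf1", "leaf2", "leaf3")
    for spine in ("spine1", "spine2")
}
# ...except leaf3, which is hypothetically missing its uplink to spine2.
LINKS.discard(("leaf3", "spine2"))

print(paths_between("leaf1", "leaf2", LINKS))
print(paths_between("leaf1", "leaf2", LINKS, failed={"spine1"}))
print(paths_between("leaf1", "leaf3", LINKS, failed={"spine1"}))
```

With `spine1` failed, `leaf1` and `leaf2` still have a path through `spine2`, but `leaf3` is isolated. A documented design should let your team reach that conclusion before the failure, not during it.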

Tooling matters. BGP troubleshooting with a CLI and a notepad is feasible, but slow and error-prone at scale. Modern network observability tools — anything with streaming telemetry and BGP state visibility — dramatically improve the ability to diagnose BGP issues quickly. If you’re running BGP at scale and don’t have purpose-built observability, you’re operating without adequate instruments.

What to Do About It

The practical steps are straightforward, even if executing on them takes time:

Audit your BGP footprint. Know where BGP is running in your environment — data center, WAN, cloud connections, campus. Understand what protocols and configurations are in use. This is often revelatory for organizations that have grown through acquisitions or organic complexity.
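A first pass at that audit can be scripted against saved device configurations. The Python sketch below assumes configs in the common industry-standard CLI style (`router bgp <asn>` followed by indented `neighbor <peer> remote-as <asn>` lines) and extracts the local AS and each peering. The sample config and function name are hypothetical, and the parser deliberately ignores VRFs, address families, and non-numeric `remote-as` values.

```python
import re

ROUTER_BGP = re.compile(r"^router bgp (\d+)", re.MULTILINE)
NEIGHBOR = re.compile(r"^[ \t]+neighbor (\S+) remote-as (\d+)[ \t]*$", re.MULTILINE)


def summarize_bgp(config_text):
    """Return the local AS and peer list found in one device config.

    Returns None when the config contains no 'router bgp' stanza.
    """
    match = ROUTER_BGP.search(config_text)
    if match is None:
        return None
    local_as = int(match.group(1))
    peers = []
    for peer, asn in NEIGHBOR.findall(config_text):
        remote_as = int(asn)
        peers.append({
            "peer": peer,
            "remote_as": remote_as,
            # Same AS on both ends means iBGP; otherwise eBGP.
            "type": "iBGP" if remote_as == local_as else "eBGP",
        })
    return {"local_as": local_as, "peers": peers}


# Hypothetical config fragment for illustration only.
SAMPLE = """\
hostname leaf1
router bgp 65101
   neighbor 10.0.1.0 remote-as 65001
   neighbor 10.0.2.0 remote-as 65001
   neighbor 192.168.255.2 remote-as 65101
"""

summary = summarize_bgp(SAMPLE)
print("local AS:", summary["local_as"])
for peer in summary["peers"]:
    print(peer["peer"], peer["remote_as"], peer["type"])
```

Run across a directory of config backups, even a crude summary like this tends to surface peerings nobody on the current team knew existed.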

Assess your team’s actual BGP knowledge. Not “have they configured BGP” but “do they understand BGP path selection, can they read a BGP table, do they know how to write a route map and predict its effect.” There’s often a meaningful gap between what people put on their resumes and what they can actually do under pressure.

Invest in training. BGP training resources have improved considerably. Vendor-specific training, CCIE-track material, and hands-on lab environments (including cloud-based options) all provide pathways to building genuine fluency. This is an investment worth making, not a checkbox.

Build and maintain runbooks. For each BGP peering relationship in your environment (data center spine-leaf, cloud provider connections, WAN edge), document the design intent, the routing policies, and the expected behavior under failure conditions. Review and update these regularly.

Consider external expertise for complex design decisions. BGP policy design for multi-cloud environments, route reflector design for large fabrics, and troubleshooting of persistent BGP path selection anomalies are all areas where engaging someone with deep operational experience can be faster and less risky than learning through trial and error in production.

BGP is not going away. If anything, its role in enterprise environments will continue to expand as networks become more distributed, more connected to cloud, and more reliant on automated overlay fabrics. The organizations that build genuine BGP operational capability now will be better positioned to manage the environments they’ll be running in five years.


Alan Sukiennik is the founder of Acton Pacific Strategies, a Las Vegas-based independent infrastructure advisory firm. He has 30 years of experience in enterprise and service provider networking, including senior engineering roles at Arista Networks, F5 Networks, BlueCat Networks, and Nokia. Reach him at alan@actonpacific.com or schedule a consultation.
