There’s a pattern I’ve seen repeat itself across enterprise environments of every size and sector: DNS gets attention right after something breaks, and almost never before.

The server team owns the authoritative zones, loosely. The network team manages the resolvers, sort of. DHCP was set up during the last campus refresh and nobody’s touched the configuration since. IPAM is either a spreadsheet or an enterprise platform that’s three versions out of date and partially populated. And the overall DNS architecture — the thing that every application, every user, and every device in the environment depends on for basic connectivity — was designed during a time when the environment looked nothing like it does today.

This isn’t a criticism. It’s a description of how DNS architecture evolves in organizations that aren’t paying deliberate attention to it. And most organizations aren’t, because DNS almost always works well enough that it doesn’t generate complaints — right up until it doesn’t.

Why DNS Matters More Than Most Organizations Realize

DNS is the first dependency in virtually every network transaction. Before a client can connect to an application, it needs to resolve the application’s name to an IP address. Before a security tool can correlate an event to a hostname, it needs DNS data. Before a device can authenticate, it typically makes DNS queries. Before your monitoring system can reach the endpoint it’s monitoring, it resolves a name.

When DNS is slow, everything is slow. When DNS is unavailable, everything stops — in ways that often confuse end users and helpdesk teams who aren’t thinking about DNS as the failure point. The symptoms look like application outages, connectivity failures, or authentication problems, and it can take significant time to trace them back to the actual root cause.
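That dependency chain can be seen in miniature with a few lines of Python. This is a minimal sketch using only the standard library; it instruments the one step that every connection pays for first:

```python
import socket
import time

def resolve_with_timing(hostname: str):
    """Resolve a hostname and measure how long resolution takes.

    Every connection attempt pays this cost before a single packet
    reaches the application, so a slow resolver makes everything
    downstream look slow.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None)
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = []  # resolution failed: the "application outage" begins here
    return addresses, time.monotonic() - start
```

Adding this kind of measurement to application health checks is a cheap way to surface resolver problems before they arrive as application tickets.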

When DNS is insecure, attackers exploit it. DNS is one of the most abused protocols in the security landscape, for a simple reason: most organizations allow DNS traffic through their perimeters with minimal inspection. DNS tunneling — using DNS query and response traffic as a covert channel to exfiltrate data or maintain command-and-control communication — is well-documented and actively exploited. DNS cache poisoning, while harder to execute against modern resolvers that implement source port randomization and DNSSEC, remains a risk in environments running older infrastructure.
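One common heuristic for spotting tunneling is to examine the leftmost label of each query: encoded payloads tend to be long and high-entropy compared to human-chosen names. A rough sketch of that check follows; the thresholds are illustrative starting points, not tuned values:

```python
import math
from collections import Counter

def label_entropy(qname: str) -> float:
    """Shannon entropy (bits per character) of the leftmost DNS label."""
    label = qname.split(".")[0].lower()
    n = len(label)
    if n == 0:
        return 0.0
    counts = Counter(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_tunnel(qname: str,
                      entropy_threshold: float = 3.5,
                      length_threshold: int = 40) -> bool:
    """Flag query names whose first label is suspiciously long or random."""
    label = qname.split(".")[0]
    return len(label) >= length_threshold or label_entropy(qname) >= entropy_threshold
```

A check like this produces false positives (CDN and telemetry names can look random too), so it belongs in an alerting pipeline with allowlists, not an inline blocker.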

And yet, DNS rarely gets a line item in the security budget, and almost never gets strategic attention from IT leadership.

The Typical State of Enterprise DNS

Having worked with enterprise DNS environments at BlueCat Networks — and having had to untangle DNS issues in complex environments before and after those engagements — I’ll describe what’s common:

Resolver sprawl. Organizations accumulate DNS resolvers the way they accumulate servers: through organic growth, acquisitions, and one-off deployments. A typical mid-sized enterprise might have DNS resolvers running as Windows DNS roles on domain controllers, Linux BIND instances deployed for specific applications, a commercial DDI platform that was deployed for some portions of the environment but never fully rolled out, and cloud-native resolvers in AWS or Azure that weren’t integrated with on-premises DNS at all. The result is inconsistent behavior, no single source of truth, and troubleshooting that requires checking multiple systems.

Flat, fragile architecture. Many enterprise DNS architectures have single points of failure that aren’t immediately obvious. A single authoritative server for an internal zone. A resolver cluster that isn’t properly load-balanced. A forwarding configuration that routes all external queries through a single upstream provider with no failover. These configurations work fine until the one server or path that everything depends on has a problem.

No DNSSEC. DNSSEC (DNS Security Extensions) signs DNS responses cryptographically, allowing resolvers to verify that the responses they receive are authentic and haven’t been tampered with. Despite being a mature standard, DNSSEC deployment in enterprise environments remains low. The operational complexity of managing signing keys and the historical performance overhead — both largely mitigated by modern tooling — have made organizations reluctant to deploy it. The result is continued exposure to response manipulation attacks.

IPAM as an afterthought. IP Address Management — tracking which addresses are assigned to which devices, managing DHCP scopes, and keeping DNS records synchronized — is genuinely difficult to do well at enterprise scale. Spreadsheet-based IPAM is common but fragile. Enterprise IPAM platforms (BlueCat, Infoblox, Men&Mice) exist and are capable, but they require ongoing maintenance, data hygiene, and organizational commitment to be useful. Many organizations have deployed IPAM tools that are partially populated, partly integrated with DHCP and DNS, and not trusted enough by operations teams to be the authoritative source of truth.

No meaningful monitoring. DNS query volume, resolution latency, NXDOMAIN rates, and unusual query patterns are all indicators of both operational health and security anomalies. Most enterprises have minimal DNS-specific monitoring. DNS failures often manifest first as application tickets rather than infrastructure alerts.

What Strategic DNS Architecture Looks Like

Treating DNS as strategic infrastructure — rather than background plumbing — means making deliberate decisions across several dimensions:

Resilience by design. Every internal zone should have at least two authoritative sources, physically separated, with automatic failover. Resolver clusters should be properly load-balanced with health checking. External resolution should use redundant upstream providers so that queries fail over automatically when one becomes unavailable. These requirements aren’t exotic — they’re the minimum for a protocol that the entire environment depends on.
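The failover logic itself doesn’t need to be complicated; what matters is that it exists and is enforced. Here’s a sketch of health-checked resolver selection, with the probe injected so it can be a TCP check, a test query, or anything else. The function and names are illustrative, not any product’s API:

```python
from typing import Callable, Sequence

def select_resolvers(candidates: Sequence[str],
                     probe: Callable[[str], bool],
                     minimum: int = 2) -> list[str]:
    """Return healthy resolvers in preference order.

    Refuses to run below the redundancy floor: dropping to one
    healthy resolver should page someone, not silently degrade.
    """
    healthy = [addr for addr in candidates if probe(addr)]
    if len(healthy) < minimum:
        raise RuntimeError(
            f"only {len(healthy)} healthy resolvers, need at least {minimum}")
    return healthy
```

The design point is the explicit minimum: redundancy you don’t alert on erodes silently until the day you discover you had none.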

Centralized visibility. DNS logs are security gold. Query logs show you what devices are trying to reach, when, and how often. NXDOMAIN spikes often indicate malware C2 communication attempting to reach domains that have been taken down. Unusually large query responses can indicate DNS amplification abuse. This data is only useful if it’s being collected, correlated, and monitored — which requires centralized logging and ideally integration with your SIEM or security analytics platform.
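As a sketch of what “collected and correlated” can mean in practice, here is a minimal per-client NXDOMAIN-rate check over query logs. The line format (“client qname rcode”) and the threshold are simplifying assumptions for illustration, not any logging product’s actual format:

```python
from collections import Counter

def nxdomain_rates(log_lines):
    """Per-client NXDOMAIN fraction from 'client qname rcode' lines."""
    totals, nx = Counter(), Counter()
    for line in log_lines:
        client, _qname, rcode = line.split()
        totals[client] += 1
        if rcode == "NXDOMAIN":
            nx[client] += 1
    return {client: nx[client] / totals[client] for client in totals}

def flag_clients(log_lines, threshold=0.5):
    """Clients whose NXDOMAIN rate suggests DGA-style lookups or misconfiguration."""
    return {c for c, rate in nxdomain_rates(log_lines).items() if rate >= threshold}
```

A real deployment would compute this over sliding windows in a SIEM, but even a daily batch job over resolver logs catches behavior that otherwise goes unnoticed.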

Integrated DDI. When DNS, DHCP, and IPAM are managed as integrated systems rather than separately, the operational benefits are significant: automatic DNS record creation when DHCP leases are issued, accurate IP inventory without manual reconciliation, and a single interface for changes that currently require touching multiple systems. The investment in integrated DDI is real, but so is the operational debt of managing them independently at scale.
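The “automatic DNS record creation” piece is conceptually simple, which is part of why its absence is so costly. A toy sketch of a DHCP lease-commit hook keeping forward and reverse records in sync — the hook name and in-memory store are hypothetical; real DDI platforms do this via dynamic DNS updates or their own APIs:

```python
def on_lease_commit(hostname: str, ip: str, forward_zone: str,
                    dns_store: dict) -> tuple[str, str]:
    """Create matching A and PTR records when a DHCP lease is issued.

    `dns_store` stands in for a real dynamic-update target. Writing
    both directions in one step is what keeps the forward and
    reverse zones from drifting apart over time.
    """
    fqdn = f"{hostname}.{forward_zone}"
    ptr_name = ".".join(reversed(ip.split("."))) + ".in-addr.arpa"
    dns_store[fqdn] = ("A", ip)
    dns_store[ptr_name] = ("PTR", fqdn)
    return fqdn, ptr_name
```

The hard part in production isn’t this logic — it’s lease expiry cleanup, conflict handling, and trusting the data enough to let it be authoritative.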

Split-horizon DNS, properly implemented. Most enterprises need different DNS responses for internal and external clients — internal clients should resolve an application’s internal address; external clients should get the public IP. Split-horizon DNS implements this, but it introduces complexity: the internal and external zones need to be maintained separately, and misconfigurations can result in internal resources being unreachable from within the network or internal addresses being exposed externally. If you’re running split-horizon, ensure it’s documented, tested, and regularly audited.
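The view-selection logic at the heart of split-horizon is small; the complexity warned about above lives in keeping the two answer sets maintained and audited. A sketch with illustrative networks and records (in a real deployment this data comes from zone files or a DDI platform):

```python
import ipaddress

# Illustrative data only.
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8"),
                 ipaddress.ip_network("192.168.0.0/16")]
RECORDS = {
    "app.example.com": {"internal": "10.20.30.40", "external": "203.0.113.10"},
}

def answer(qname: str, client_ip: str) -> str:
    """Return the internal or external address depending on who is asking."""
    client = ipaddress.ip_address(client_ip)
    view = "internal" if any(client in net for net in INTERNAL_NETS) else "external"
    return RECORDS[qname][view]
```

The misconfiguration modes the section describes map directly onto this sketch: a record present in one view’s dictionary but missing or stale in the other, or a network accidentally left out of the internal list.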

DNS security controls. Response Policy Zones (RPZ) — also called DNS firewall — allow you to block or redirect resolution of known-malicious domains at the DNS layer, before a client makes a connection. This is an effective and often underutilized security control. Combined with DNSSEC validation for external lookups and encrypted DNS (DoT or DoH) for client-to-resolver traffic, these measures substantially raise the bar for DNS-based attacks.
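The core of RPZ matching — checking the query name and each of its parent domains against a policy list — can be sketched as follows. The blocklist contents and action names are illustrative; real RPZ policy is expressed as a DNS zone that the resolver loads:

```python
BLOCKED = {"evil.example", "c2.badguys.net"}  # illustrative policy data

def rpz_action(qname: str) -> str:
    """Return the policy action for a query name.

    Walks from the full name up through each parent domain, so
    'x.y.evil.example' is caught by the 'evil.example' entry,
    mirroring how RPZ wildcard entries behave.
    """
    labels = qname.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in BLOCKED:
            return "NXDOMAIN"  # a common policy: pretend the name doesn't exist
    return "PASS"
```

Because the block happens at resolution time, the client never learns the malicious address at all — which is exactly what makes the DNS layer such an efficient enforcement point.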

The Organizational Challenge

The hardest part of improving enterprise DNS isn’t technical. It’s organizational. DNS sits at the intersection of networking, server infrastructure, security, and application teams — and nobody feels full ownership of the problem. Projects that require coordination across multiple teams tend to stall, and DNS architecture projects are among the most prone to stalling because DNS works well enough most of the time to avoid escalation.

The way through this is to make the business case explicitly: what is the cost of a DNS outage in your environment? How many applications stop working? How many users are affected? How quickly can you detect and respond to a DNS-based security incident today? Framed this way, DNS architecture becomes a risk management conversation rather than a technical refresh request, and it tends to get more traction.

DNS deserves a named owner, a written architecture document, a monitoring dashboard, and a seat at the table when infrastructure decisions are made. In most organizations, it has none of these things. That gap is the problem — and it’s entirely solvable.


Alan Sukiennik is the founder of Acton Pacific Strategies, a Las Vegas-based independent infrastructure advisory firm. He has 30 years of experience in enterprise and service provider networking, including senior engineering roles at Arista Networks, F5 Networks, BlueCat Networks, and Nokia. Reach him at alan@actonpacific.com or schedule a consultation.
