By Philip Griffiths and Kenneth Bingham

The problem of powerless superclouds

With the proper separation of concerns, a complex system like a supercloud doesn’t mean a complex network. If you’ve ever managed a mess of VPNs with Bi-NAT and split-horizon DNS in play, then you know what pain is. Now, I know you’re thinking, “There’s no way out of this mess!” and “You haven’t seen my network.” Let’s set the stage for a surprisingly simple survival plan by first understanding how we got here.

In the prior installment in our trilogy, ‘The Rise of the Supercloud’, we explored how the concerns of the network are inseparable from the applications they deliver, i.e. “the network is the computer”. We illuminated a fundamental problem that emerges from that tight coupling: a shared responsibility model that overcomplicates service infrastructure and ultimately, unfortunately, overburdens the customer. Consequently, this impedes realizing the total value of the supercloud for customers and shareholders. We concluded with the recognition that supercloud innovators will eventually build secure and programmable connectivity into their offerings and thereby rebalance the operational burden.

Logical Separation > Inspectors and Gatekeepers

Network configurations are complex because they are specially configured to meet the needs of the applications they deliver. This app-to-network dependency requires the person with application know-how to communicate the app’s needs precisely to the person with the networking know-how. You’re probably thinking, “Doesn’t as-code automation solve this problem?” Sure, that’d be superior to hand-crafting specialized network configurations for apps, but what if we could prevent the problem from within the app instead of solving it with more app-external tooling, processes and skillsets?

As Dave Vellante’s ‘The Rise of the Supercloud’ pointed out, superclouds “can span multiple clouds — and on-premises workloads — and hide the underlying complexity of the infrastructure supporting this work”. The far-flung nodes in any distributed system must communicate over a network. This distributed system depends too much on the perfect alignment of the network configuration. Superclouds that require unique network configuration and have a presence in the network fabrics managed by the consumer of the supercloud have an issue of co-management. This forces the service provider to separate concerns in the wrong place, thereby burdening you, their customer, with shared responsibility for complex network configurations. This responsibility includes firewall exceptions, access control lists, proxies, virtual private networks, reverse tunnels, web application firewalls, and other sources of cranial agony.

Many vendors bring a network-first approach to building a ‘secure, multi-cloud network’. These certainly include vendors like Aviatrix and F5. Following first principles, a network’s purpose is to convey data packets from a sender address to a recipient address. You cannot secure a network, only isolate it. ‘Secure network’ is an oxymoron, and so we can only say ‘secure, multi-cloud network’ with a hint of sarcasm. It is short-sighted to suggest that the application and its data are secured because the network is secure. Distributed networks using the public internet expose the application server to network-borne attacks – e.g., leveraging a known vulnerability to intrude, denial of service by abusing the login mechanism, and the inevitable future zero-day exploits. This is part of the reason why cyber-crime is a trillion-dollar drag on the global economy, and surveillance techniques known as scan-and-exploit have become the No. 1 attack vector for cybercriminals. It’s quite simply the easiest way to gain an intrusive foothold.

Industry analysts have articulated this problem, but the discussed solutions do not go far enough to decouple the application from the network configuration. In 2021, Gartner® released its report ‘Innovation Insight for Comprehensive Secure Connectivity for Composite Applications’ (CASCE), “describing the convergence of application and network security to build a comprehensive policy configuration and enforcement model.” Core principles of CASCE include a perimeter that is logical rather than physical, identity-based interaction with the application to access services, and separation of policy definition from policy enforcement. They name ten vendors, including F5, HashiCorp, Kong, NetFoundry, and Palo Alto Networks. The report describes a real problem and points at the incremental improvements these vendors can offer with IP inspector and IP gatekeeper tactics. There are several broad problems with these approaches:

  • Almost all of the mentioned technologies for CASCE or multi-cloud are bolt-on solutions that operate at the cloud network level. They cannot be embedded into the application and are non-transparent to the user and customer.
  • Most of these technologies depend on public IPs at source and destination, which means they can be subject to external network-level attacks from malicious actors. Companies therefore try to isolate source and destination with proxies, firewalls, and similar devices, while still depending on open inbound ports that reintroduce the same vulnerability to network-level attacks.
  • Many of the technologies are closed source, preventing the consumer from controlling the software to audit, build, and innovate. Instead, the consumer is locked in. Superclouds do not want to depend on any single cloud or service (e.g., AWS PrivateLink is tied to AWS alone).
  • Bolt-on gatekeepers do not resolve the inherent tension between business velocity and security. They are not built for automation using Infrastructure-as-Code, APIs, GitOps, and DevOps tools and methodology.
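
To make the contrast with IP gatekeeping concrete, here is a minimal sketch of two of the CASCE principles named above: identity-based access to named services, and policy definition kept separate from policy enforcement. Every name in it (Policy, PolicyStore, enforce, and the example identities and services) is hypothetical and invented for illustration; it is not the API of any vendor mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Policy definition: which identity may reach which named service.

    Note what is absent: no IP address, no port, no subnet.
    """
    identity: str
    service: str


class PolicyStore:
    """Holds the defined policies; knows nothing about how they are enforced."""

    def __init__(self, policies):
        self._allowed = {(p.identity, p.service) for p in policies}

    def permits(self, identity: str, service: str) -> bool:
        return (identity, service) in self._allowed


def enforce(store: PolicyStore, identity: str, service: str) -> str:
    """Enforcement point: default deny unless a policy explicitly allows it."""
    return "allow" if store.permits(identity, service) else "deny"


# Hypothetical identities and services, for illustration only.
store = PolicyStore([
    Policy("billing-app", "orders-db"),
    Policy("analyst-laptop", "reports-api"),
])

print(enforce(store, "billing-app", "orders-db"))    # defined policy
print(enforce(store, "billing-app", "reports-api"))  # no policy, so denied
```

Because the decision keys on who is asking and what they are asking for, the same policy would hold no matter which cloud, subnet, or address either side happens to occupy. That is the sense in which the perimeter becomes logical rather than physical.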

“It’s not our fault; we don’t control the network”.

For this reason, many superclouds and applications state, ‘we don’t control secure networking – that’s our customer’s job’. Snowflake, for example, gives this responsibility to its customers. The illustration below shows the many layers of infrastructure that must align to deliver the Snowflake supercloud to the user. This is a picture of fragility, rigidity, and complexity in the name of security, but with many hidden costs, not the least of which is business velocity. Bolted-on solutions limit business velocity because they introduce handoffs, interfaces, and third parties. This is why the current shared-responsibility cybersecurity model doesn’t work, asymmetrically favors the cyber attacker, and ultimately offloads responsibility for securing the supercloud to the end-user.

With respect to John Gage, the current operating model for most superclouds means that the network is no longer the computer in quite the same way. That operating model can be dramatically improved by building secure networking into the application source code. This ensures continual, perfect alignment of the application’s network configuration without burdening the customer with the shared responsibility model. Superclouds can get superpowers! We will take a closer look at how this works in the concluding installment of this series.
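
As a rough picture of what it can mean for connectivity to live in the application rather than in firewall rules, the sketch below has the application serve requests over a connection that it dials outbound, so it never opens a listening port of its own. This is an illustration of connection direction only, under loud assumptions: the "rendezvous" is a hypothetical stand-in running on localhost, the function names are invented, and there is no identity, authentication, or encryption here, all of which a real implementation would need.

```python
import socket
import threading
from typing import Callable, List, Tuple


def run_rendezvous(listener: socket.socket, request: bytes, result: List[bytes]) -> None:
    """Hypothetical stand-in for an intermediary: the only party that listens.

    It accepts the application's outbound connection, forwards one request
    over it, and records the application's reply.
    """
    conn, _ = listener.accept()
    with conn:
        conn.sendall(request)
        result.append(conn.recv(1024))


def serve_by_dialing_out(rendezvous_addr: Tuple[str, int],
                         handler: Callable[[bytes], bytes]) -> None:
    """The application side: connect OUT, then serve over that connection.

    The application never calls bind() or listen(), so there is no inbound
    port on the application host for a scanner to find.
    """
    with socket.create_connection(rendezvous_addr, timeout=5) as conn:
        request = conn.recv(1024)
        conn.sendall(handler(request))


def demo() -> bytes:
    # The rendezvous listens on an ephemeral localhost port (assumption: loopback is available).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    listener.settimeout(5)

    result: List[bytes] = []
    worker = threading.Thread(target=run_rendezvous, args=(listener, b"ping", result))
    worker.start()

    # The "application" answers a request without ever accepting a connection.
    serve_by_dialing_out(listener.getsockname(), lambda req: b"echo:" + req)

    worker.join()
    listener.close()
    return result[0]


if __name__ == "__main__":
    print(demo())
```

The design point is the inversion: the application reaches out to a place it already trusts instead of waiting for the world to reach in, which removes the open inbound port that the gatekeeper approaches described earlier still depend on.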

VP - Head of Global Business Development and Alliances at NetFoundry
Developer Advocate at NetFoundry

Ken is crafting developer experiences with the NetFoundry API and OpenZiti. He is enthusiastic about Linux, security, and building things with free and open source software and hardware. You can find him in his native habitat talking and clowning around at tiny tech events all over.
