Why have we suffered a recent rash of damaging DoS, DDoS and ransomware-type cyberattacks against VoIP systems, and how can we mitigate this threat?  Let’s start with the three main factors which are combining to drive the recent rise in breaches of VoIP, WebRTC and similar systems:

Why attack VoIP?  Factor #1: VoIP is vulnerable and now more prevalent

We will dive deeper into the specific VoIP security vulnerabilities later in this post, but they affect VoIP in traditional use cases such as voice calls, conferencing, contact centers and BPOs.  They also affect emerging use cases such as video calls, gaming, IoT, ML/AI systems and app embedded communication.  These emerging use cases often use WebRTC for their real-time communication capabilities (voice, video, data), and WebRTC has the same basic vulnerabilities as VoIP.

Why attack VoIP?  Factor #2: VoIP, WebRTC and similar solutions are now more valuable

Simply put, cyberattackers now have a better business case to attack VoIP, WebRTC and similar solutions.  Similar to how the proverbial bank robber targets the bank because that is where the money is, the modern day cyberattacker targets the use cases which are valuable.   Due to the innovation of app developers to embed voice, video and chat into apps, often aided by the magic of WebRTC, our businesses and supply chains increasingly depend on VoIP.  This also means there is more value for cyberattackers to target.

Why attack VoIP?  Factor #3: VoIP and WebRTC attacks can be monetized

Vulnerable and valuable isn’t enough to cause a rash of cyberattacks.  Attackers need to be able to monetize the attack.  Cyberattackers can now more effectively monetize attacks on VoIP and WebRTC, often using new extortion techniques, a developing ecosystem of third-party organizations and improved attack methods, such as those used in the recent ransomware attacks.

So, in summary, many VoIP and WebRTC use cases are vulnerable, valuable targets which can be monetized if successfully hit.  Yeah, that adds up to a target on our back.  But where specifically is the target, and can we mitigate the threat?  Can we at least make it a worse business case for the cyberattackers?  Let’s dive in.

VoIP, WebRTC, voice, video, chat…why are they vulnerable targets?

Networks are the world’s most critical CVE.  Even though they are not CVEs.  Huh?  Well, the job of a network is to move bits from point A to point B.  To make things reachable and connect things.  Unfortunately those bits can be emails, ransomware, or funny cat videos…the network doesn’t know the difference, and doesn’t care.  So, while a network is simply doing the job it is hired to do, it is also a massive vulnerability.

Well, what does that have to do with VoIP, WebRTC and friends?  These architectures have critical components which need to be open to networks in order to operate.  It looks something like this:

[Figure: VoIP vulnerability architecture]

These components which are open to the networks vary a bit across architectures, but include SIP servers, SBCs, WebRTC signaling servers, softswitches, IP-PBX, proxies and TURN servers.  They are circled in red because they are targets – open to the networks.

Notice how the attacker and the communications app both are identified as 192.168.100.1 to the firewall, SBC and signaling server.  Yeah, that’s a problem (it is not much better when we use NAT or ICE/STUN/TURN to give them public addresses…they are still IP addresses which don’t differentiate funny cat videos from malware).
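To make the addressing problem concrete, here is a minimal sketch of a source-IP allow list, written with Python's standard library. The `Packet` type, the allowed range and the payloads are made up for illustration and are not taken from any real firewall or SBC. The point is only that a rule keyed on the IP address gives the same answer for the legitimate app and the attacker when both present as 192.168.100.1.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for what an IP-based ACL can see."""
    source_ip: str
    payload: bytes


# Assumed allow list: the range the communications app is expected to use.
ALLOWED_SOURCES = [ipaddress.ip_network("192.168.100.0/24")]


def acl_permits(packet: Packet) -> bool:
    """Return True if the packet's source address falls in an allowed range.

    Note that the payload is never consulted: the decision is made on the
    address alone, which is all a plain ACL has to work with.
    """
    source = ipaddress.ip_address(packet.source_ip)
    return any(source in network for network in ALLOWED_SOURCES)


legit_call = Packet("192.168.100.1", b"INVITE sip:alice@example.com SIP/2.0")
attack = Packet("192.168.100.1", b"INVITE sip:crafted-exploit SIP/2.0")

if __name__ == "__main__":
    print(acl_permits(legit_call), acl_permits(attack))
```

Both packets are permitted, because nothing in the address distinguishes who is behind it.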

These SIP servers and similarly exposed components enable your video app, softphone, video game, or AI/ML software (e.g. using WebRTC to send voice to the cloud to feed learning algorithms) to ultimately get connected.  In other words, if these vulnerable components are compromised by any of the billions of nodes which can access them, then your service, business or customers are likely down.

VoIP, WebRTC, etc. vulnerabilities – how have we traditionally defended against these attacks?

Let’s quickly review the historical mitigation techniques (spoiler alert: there is now a better way, but it helps to know where we came from).  These techniques can help, but they are inherently limited because they attempt to bolt-on security to an architecture which is inherently insecure. Let’s take a look:

  • VPNs add a security layer, but have been avoided whenever possible because VPNs often make your app seem like it is performing poorly (the VPNs are often the real culprits), and because of the amount of time administrators and support staff spend dealing with them.  Meanwhile, in modern app topologies, we are seeing that VPNs are not adding the security that they once did.  In fact, they are often what is attacked.
  • Firewall and Session Border Controller (SBC) ACLs are often not sufficient by themselves, and the source IPs are often not all known or static, which can make ACLs infeasible.
  • SBC rate limiting may help with some DoS style attacks, but many attacks are app-based rather than volume-based.  An app-based attack can use carefully crafted SIP INVITEs (designed to take advantage of vulnerabilities in the SIP server – similar to a SQL injection attack) which will not be mitigated by rate limiting.
  • Using deep inspection to distinguish between funny cat videos, legit voice sessions and malware is often infeasible (it adds too much latency or jitter), prohibitively expensive and/or ineffective at identifying previously unidentified attack vectors.  It can also be too late – critical info can be gleaned by just getting to the inspection systems.
  • Hosting the SIP servers (WebRTC signaling servers, SBCs, softswitches, IP PBX etc) in a third party security cloud is often the most effective legacy technique against attacks like DoS and DDoS.  Like most solutions, it can struggle to defend against previously unidentified attack vectors, such as the app-based attacks described above.  The added latency of backhauling in and out of the third party cloud can also be problematic, and the solution adds cost and complexity because you are essentially ‘hiding’ vulnerabilities from attackers, and trying to identify and thwart the attacks or probes, which equates to a massive whack-a-mole type game (which the attacker only needs to win once).
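The rate limiting point in the list above can be sketched with a simple token bucket. This is a generic illustration in Python, not the algorithm of any particular SBC, and the capacity and refill numbers are arbitrary assumptions. It shows why a limiter blunts a flood but lets a single crafted request through: the limiter counts requests and never looks at what is inside them.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Generic token-bucket limiter (illustrative, not a real SBC's logic)."""
    capacity: float            # maximum burst size, in requests
    refill_per_second: float   # sustained request rate allowed
    tokens: float = 0.0
    last_seen: float = 0.0

    def __post_init__(self) -> None:
        # Start with a full bucket so normal bursts are accepted.
        self.tokens = self.capacity

    def allow(self, now: float) -> bool:
        """Decide on volume alone; request content is never inspected."""
        elapsed = now - self.last_seen
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.last_seen = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_allowed(bucket: TokenBucket, timestamps: list[float]) -> int:
    """Count how many requests arriving at the given times get through."""
    return sum(bucket.allow(t) for t in timestamps)


# Volume-based attack: 100 requests in the same instant, mostly dropped.
flood_allowed = count_allowed(TokenBucket(5, 1.0), [0.0] * 100)

# App-based attack: one crafted INVITE, well under any rate limit.
crafted_allowed = TokenBucket(5, 1.0).allow(0.0)

if __name__ == "__main__":
    print(flood_allowed, crafted_allowed)
```

With these assumed settings only the initial burst of the flood is accepted, while the single crafted request passes untouched.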

Note: while there is now a better way, it doesn’t mean that these bolted-on techniques shouldn’t be used as additional layers.  These techniques are better than nothing, and they continually evolve to get better.  In fact, partially because VoIP related attacks couldn’t be monetized by cyberattackers the way they can be today, these defenses worked fairly well, even if they were expensive or complex.  Now however, as seen by the recent rash of expensive breaches, we need a better mousetrap.  Actually, we don’t need a better mousetrap – we need an entirely new approach.

VoIP, WebRTC, etc. vulnerabilities – tell us about the new approach that you keep teasing!

Fine, enough background and history.  The new approach we need is to replace bolted-on security with built-in security.

In today’s hyperconnected world, and even more so in tomorrow’s AI-accelerated world, bolted-on is simply too vulnerable, expensive, slow, reactive, difficult to automate and tough to operate.  That’s because bolted-on is not code, and only code can scale with code.

Security as code is a new approach in which we proactively design security into the application development and delivery lifecycle.  This applies for any app, but here is how security as code looks for VoIP, video or WebRTC apps:

[Figure: secure VoIP architecture]

Your SIP server is no longer exposed to the networks!  The most important vulnerability on the planet – the open network – is effectively taken out of the game.  How?

As you see above, the VoIP, WebRTC, or AI/ML app has a built-in ‘library’ of “Ziti software” code which enables the VoIP infrastructure (SIP signaling servers, SBCs, TURN servers, etc.) to be taken off the network.  This code is our Ziti SDKs, part of OpenZiti, the open source Zero Trust software which NetFoundry open sourced and maintains.  It is built into the app so that there is no separate agent to deploy (although it can run as a separate agent if necessary).

These apps (VoIP, WebRTC, etc.) can then spawn app-specific, programmable Zero Trust overlays (including the Ziti Router and Ziti Session Controllers) to communicate.  This enables you to close your inbound firewall ports and take your SIP servers (SBCs, softswitches, WebRTC servers, etc.) off the network!
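The outbound-only idea can be illustrated with a toy relay built from plain TCP sockets. To be clear, this is not the OpenZiti SDK, a Ziti Router or their protocol; every function name here is hypothetical, and the sketch leaves out identity, encryption and authorization entirely. It only shows the connection pattern: the stand-in SIP server never binds or listens on a port, it dials out to a router, and the client does the same.

```python
import socket
import threading


def run_router(listener: socket.socket, service_ready: threading.Event) -> None:
    """Toy overlay router: the only component that accepts connections."""
    service_side, _ = listener.accept()   # the hidden service dials out to us
    service_ready.set()
    client_side, _ = listener.accept()    # the client app dials out to us
    with service_side, client_side:
        service_side.sendall(client_side.recv(1024))   # client -> service
        client_side.sendall(service_side.recv(1024))   # service -> client


def run_hidden_service(router_addr: tuple) -> None:
    """Stand-in for a SIP server: no bind(), no listen(), outbound only."""
    with socket.create_connection(router_addr, timeout=5) as conn:
        request = conn.recv(1024)
        conn.sendall(b"handled: " + request)


def demo() -> bytes:
    """Bridge one request from a client to the hidden service and back."""
    service_ready = threading.Event()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # any free loopback port
        listener.listen(2)
        router_addr = listener.getsockname()
        workers = [
            threading.Thread(target=run_router, args=(listener, service_ready)),
            threading.Thread(target=run_hidden_service, args=(router_addr,)),
        ]
        for worker in workers:
            worker.start()
        # Make sure the service has connected before the client dials in.
        service_ready.wait(timeout=5)
        with socket.create_connection(router_addr, timeout=5) as client:
            client.sendall(b"INVITE")
            reply = client.recv(1024)
        for worker in workers:
            worker.join(timeout=5)
    return reply


if __name__ == "__main__":
    print(demo())
```

Because the stand-in service has no listening socket, there is no inbound port on it to scan or flood; the only reachable endpoint in this toy is the router.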

Why is this Zero Trust?  Because both sides of the app, e.g. your Zoom client and a Zoom bridge, open up strongly identified (X.509 certificate; bi-directional), authenticated, authorized, session specific, outbound-only overlays, which are bridged by Zero Trust overlay routers.  Notice in the diagram above that the attacker can no longer reach your SIP server – or even see it.  It is very difficult for that attacker to successfully identify, authenticate and authorize on the app-specific overlay (the attacker will need to get that X.509 certificate, for starters…so the same type of cryptography which secures your banking app is now securing your ephemeral, app-specific network connection).  Of course, anything is possible so the native app level microsegmentation of this approach is very important – each session is a microsegment.  This means that if one session is hacked then that session can’t be easily used to attack laterally, for example to attack your other SIP servers.
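As a rough analogy for the bi-directional X.509 identification described above, here is how mutual certificate verification is configured with Python's standard `ssl` module. This is ordinary mutual TLS, offered as an assumption-laden sketch; it is not how the Ziti SDKs are called, and the certificate, key and CA file paths are hypothetical parameters that you would supply yourself.

```python
import ssl
from typing import Optional


def build_server_context(cert_file: Optional[str] = None,
                         key_file: Optional[str] = None,
                         ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server side: refuse any peer that cannot present a trusted certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.verify_mode = ssl.CERT_REQUIRED      # client certificate is mandatory
    if cert_file and key_file:
        context.load_cert_chain(cert_file, key_file)   # hypothetical paths
    if ca_file:
        context.load_verify_locations(cafile=ca_file)  # CA that issued client certs
    return context


def build_client_context(cert_file: Optional[str] = None,
                         key_file: Optional[str] = None,
                         ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Client side: verify the server and present our own certificate."""
    # PROTOCOL_TLS_CLIENT enables certificate and hostname checks by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file and key_file:
        context.load_cert_chain(cert_file, key_file)   # our identity
    if ca_file:
        context.load_verify_locations(cafile=ca_file)  # CA that issued server cert
    return context


if __name__ == "__main__":
    server = build_server_context()
    client = build_client_context()
    print(server.verify_mode, client.verify_mode, client.check_hostname)
```

With both contexts set to require certificates, a peer that lacks a certificate from the trusted CA fails the handshake before any application data is exchanged.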

That’s great but who wants to deploy a Zero Trust overlay?  Sounds difficult.  Nope, the Zero Trust overlay is cloud orchestrated software – you can spin up a global, app specific, Zero Trust overlay in minutes (don’t take my word for it – try it for free, here).  And you have flexibility – you can do it all yourself with the OpenZiti open source, or you can leverage the NetFoundry SaaS services, including the hosted NetFoundry Fabric and bootstrapped PKI type processes with hosted CA (or you can use your CA).  The NetFoundry Fabric is the world’s largest hosted, dynamic Zero Trust SDN (spans hundreds of data centers with routers spun up and down on demand).  Now, I am not claiming you snap your fingers, spin around twice, and are done, but the move from bolted-on infrastructure to built-in code is like insulating your house and installing windows when you build it, rather than throwing blankets on the roof each time it gets cold, or punching holes in the walls when it gets too hot.

It is closing time

We are super excited to offer this new art of the possible – extending our app embedded security to the VoIP and WebRTC worlds.  The security advantages, innovation potential (no more bolted-on infrastructure to impede innovation) and automation aspects (DevOps and NetOps can extend their Infrastructure as Code (IaC) paradigm to Security as Code (SaC)) are borderline ridiculous.  Ok, they are ridiculous.

However, I will tap on the brakes: our extension into VoIP and WebRTC is new.  If you are the type of innovator or early adopter that is willing to take a few bruises while building a new art of the possible, then you can mail support@netfoundry.io to apply to join the pilot program.  The pilot program is free and includes the help of our developers, who will help you embed Zero Trust into your app.  In fact, we’d love to understand the intricacies of your app, and help you build a new art of the possible – and one that is much more secure.

If you would rather wait until others work out the rough spots, then you can subscribe to this blog and we will let you know when Zero Trust VoIP and Zero Trust WebRTC are Generally Available (GA).  Or, if you have other apps that need security as code, or things (databases, app servers, IoT endpoints, etc.) that you want to make unreachable from the networks, then you can get started with the open source or the SaaS.

No matter what your next step is, we hope you consider new approaches (from us or from others) to transition from bolted-on security to built-in security, and make the business case to attack you absolutely miserable for cyberattackers.
