Converged networks that carry data, voice and video in an integrated manner are now widespread. This is not just because people consider convergence fashionable but, rather, because organizations have realized how Collaboration can contribute to business efficiency and productivity. Of course, the promise of convergence sounds appealing, but along with business relevance comes the responsibility of providing adequate security…
Network Security principles are well mapped and you might be asking what is so special about integrating voice and video.
I will start by saying that the new service possibilities associated with Collaboration are materialized by a set of application protocols that have specific characteristics and networking requirements and, as such, bring new challenges. As usual, from the perspective of network security professionals, the first step is precisely to understand the protocols involved.
And, please, never forget that there is no magic black box approach… Although firewalls constitute one indispensable protection element of a UC Security solution, there are many other resources that need to be taken into account for any practical implementation, such as the security features of LAN switches, routers and IP Phones.
Quick Review of Telephony Terminology
SIP, MGCP, H.323, H.225, RTP, RTCP… it really looks like alphabet soup. A key step in determining what is needed for your IPT project is to categorize these elements, noticing that some of them are designed to accomplish similar tasks. For example, whereas SIP, MGCP, SCCP and H.225 are in charge of call signaling, RTP and RTCP deal with the actual media sessions (typically the intended product of the signaling process).
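A quick way to keep this categorization in mind is to write it down as a lookup table. The sketch below simply restates the grouping from the paragraph above; the names Role, PROTOCOL_ROLES and protocols_for are illustrative and not part of any product.

```python
from enum import Enum


class Role(Enum):
    SIGNALING = "signaling"
    MEDIA = "media"


# Grouping taken from the text: signaling protocols set up the call,
# RTP/RTCP deal with the resulting media session.
PROTOCOL_ROLES = {
    "SIP": Role.SIGNALING,
    "MGCP": Role.SIGNALING,
    "SCCP": Role.SIGNALING,
    "H.225": Role.SIGNALING,
    "RTP": Role.MEDIA,
    "RTCP": Role.MEDIA,
}


def protocols_for(role):
    """Return the protocol names that play the given role, sorted by name."""
    return sorted(name for name, r in PROTOCOL_ROLES.items() if r is role)
```

Asking for protocols_for(Role.SIGNALING) gives the list of protocols whose ports you would consider exposing on the Call Agent side, while the media group is what the firewall should open dynamically.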
H.323 is a suite of protocols (the H.323 framework), which relies, for instance, on H.225 for call setup and H.245 for call control. H.323 offers both a direct call signaling model and a centralized one (the latter employing an element known as a Gatekeeper).
The Skinny Client Control Protocol (SCCP or simply Skinny) is used by Call Agents (such as the Cisco Unified Communications Manager - CUCM) to control endpoints like IP Phones and audio conferencing stations. H.323 protocols are commonly used by the CUCM when Skinny stations need to communicate with terminals outside of its original cluster. This integration would happen by means of H.323 Trunks.
SIP is both a line-side and a trunk-side protocol. A Call Agent might control IP terminals using SIP and also integrate with elements external to its cluster by means of SIP. Although most of the time SIP uses UDP/5060 for signaling, TCP/5060 is also reserved for this purpose.
MGCP is focused on controlling Media Gateways (elements that translate between media types, like IP to TDM) from a central point (Call Agent or Media Gateway Controller).
Another important element within the context of telephony deployments is the voice gateway, which is in charge of establishing the interface between the VoIP network and non-IP systems such as the PSTN (Public Switched Telephone Network), fax equipment or an analog phone. Many Cisco IOS Routers support integrated voice gateway functionality.
The Real-Time Transport Protocol (RTP) was designed to allow receivers to compensate for jitter (delay variation) and out-of-order packets that may be introduced by IP Networks. RTCP (RTP Control Protocol) is frequently used in conjunction with RTP and is concerned with transmitting control packets to the participants of a given RTP session.
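To make the jitter and reordering ideas a bit more concrete, here is a minimal sketch of what a receiver can do with the sequence number and timestamp carried in each RTP header. The jitter estimate follows the running formula from RFC 3550, J = J + (|D| - J)/16. The packet layout (a dict with seq, ts and arrival keys) and the function names are assumptions made for illustration only.

```python
def reorder(packets):
    """Sort packets by RTP sequence number.

    Sketch only: the 16-bit sequence number wraparound is ignored.
    """
    return sorted(packets, key=lambda p: p["seq"])


def interarrival_jitter(packets):
    """Running interarrival jitter estimate in the style of RFC 3550.

    Each packet is a dict with "ts" (RTP timestamp) and "arrival"
    (arrival time), both expressed in the same clock units, and the
    list is given in arrival order.
    """
    jitter = 0.0
    prev = None
    for pkt in packets:
        if prev is not None:
            # D compares the spacing at the receiver with the spacing
            # at the sender for two consecutive packets.
            d = (pkt["arrival"] - prev["arrival"]) - (pkt["ts"] - prev["ts"])
            jitter += (abs(d) - jitter) / 16.0
        prev = pkt
    return jitter
```

A perfectly paced stream yields a jitter of zero, and every packet that arrives earlier or later than its timestamp spacing suggests nudges the estimate upward.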
Cisco Unified Communications Manager (CUCM) supports all of the signaling protocols mentioned and, therefore, may be used as a kind of interworking element for different classes of IP-based voice networks. When communicating with legacy systems, CUCM would still rely on the functionality provided by Gateways. A Call Agent like CUCM would sometimes be called an IP-PBX.
How a Stateful Firewall can contribute to VoIP Security
We will now try to briefly explain what is meant by Advanced Inspection of Telephony Protocols in most firewall data sheets.
The main requirements for a stateful firewall intended to provide telephony security, irrespective of the selected signaling method, are described below:
Limit the protocols accessible on the Call Agent side (CUCM in Cisco’s architecture). Classic stateful ACL definitions specifying source/destination IP addresses and ports are used to achieve this goal. Notice that if the firewall understands the protocols, you will only need to open the signaling ports used in your specific environment. This is critical for limiting the exposure of your CUCM.
Obtain information about the dynamically negotiated RTP/RTCP channels, from the inspection of signaling protocols, in order to open only the appropriate ports between the communicating parties. This somewhat resembles the way FTP is handled (the firewall watches the control channel to detect any task that requires a data channel to be created). Voice protocols are a bit more complicated than FTP, though, because they not only open at least two (media) ports but may also use TCP transport for signaling while carrying the real-time media over UDP.
Allow differentiation between the timeouts for signaling and media connections. By doing so, you avoid the need to frequently renew the control connections while guaranteeing that resources associated with media sessions are freed quickly.
Take care of address translation not only at the network layer but also at the application level. This L7 awareness in the firewall’s support for NAT is relevant because some signaling protocols carry IP addresses in the application payload.
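The second and fourth requirements can be illustrated with a small sketch. It assumes SIP signaling whose body is an SDP media description (SDP is not discussed in this article; it is used here only as a familiar example of a payload that carries IP addresses and media ports). The function names are hypothetical, the RTCP port is assumed to be the RTP port plus one, and only a session-level c= line is considered. A real inspection engine does far more than this.

```python
def parse_media_pinholes(sdp):
    """Derive the UDP pinholes implied by an SDP body.

    Assumptions: a single session-level "c=IN IP4 <addr>" line, and
    RTCP on the port right after RTP.
    """
    lines = [ln.strip() for ln in sdp.splitlines() if ln.strip()]
    conn_ip = next(ln.split()[2] for ln in lines if ln.startswith("c="))
    pinholes = []
    for ln in lines:
        if ln.startswith("m="):
            # m=<media> <port> <transport> <formats...>
            media, port = ln[2:].split()[:2]
            rtp_port = int(port)
            pinholes.append((conn_ip, rtp_port, media + " RTP"))
            pinholes.append((conn_ip, rtp_port + 1, media + " RTCP"))
    return pinholes


def translate_sdp(sdp, inside_ip, mapped_ip):
    """Rewrite the address embedded in the o= and c= lines (application-level NAT)."""
    out = []
    for ln in sdp.splitlines():
        if ln.startswith(("c=", "o=")):
            # Replace whole tokens only, so 10.1.1.2 never matches 10.1.1.20.
            ln = " ".join(mapped_ip if tok == inside_ip else tok
                          for tok in ln.split(" "))
        out.append(ln)
    return "\n".join(out)


# Illustrative offer; addresses and ports are made up.
SDP_OFFER = "\n".join([
    "v=0",
    "o=phone1 2890844526 2890844526 IN IP4 10.1.1.20",
    "s=-",
    "c=IN IP4 10.1.1.20",
    "t=0 0",
    "m=audio 24580 RTP/AVP 0",
])
```

Running parse_media_pinholes on the offer yields one RTP and one RTCP entry for the advertised address, which is exactly the pair of ports the firewall needs to open for that call. Passing the offer through translate_sdp first shows why NAT has to reach into the payload: otherwise the far end would try to send media to an address that is not routable from its side.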
Figure 2 depicts a simple scenario in which two Skinny IP Phones that reside on different subnets need to communicate through an ASA-controlled network. Besides the permission for phone registration and signaling requests (TCP/2000 in the phone to CUCM direction) you will need to allow some auxiliary traffic:
TFTP (from the phones to CUCM): in this case the main responsibilities of TFTP are the delivery of firmware and configuration files to IP Phones.
DNS for resolution of the CUCM name.
DHCP is the classic option for IP phone addressing. DHCP Option 150 informs the phones about the IP Address of the TFTP Server. This DHCP server functionality could be enabled on a per-interface basis on your ASA.
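The permissions described above can be summarized as a short rule table. The sketch below is not ASA syntax; it is just a way to reason about what the phone-facing interface must allow. TCP/2000 comes from the text, while UDP/69 for TFTP and UDP/53 for DNS are the standard well-known ports, and the names Rule, PHONE_SIDE_RULES and permitted are illustrative.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    proto: str
    dst_port: int
    purpose: str


# Traffic initiated by the Skinny phones.
PHONE_SIDE_RULES = [
    Rule("tcp", 2000, "SCCP registration and signaling to CUCM"),
    Rule("udp", 69, "TFTP requests for firmware and configuration files"),
    Rule("udp", 53, "DNS resolution of the CUCM name"),
]
# DHCP is left out on purpose: in the scenario described, the ASA
# itself can act as the DHCP server on the phone-facing interface.


def permitted(proto, dst_port, rules=PHONE_SIDE_RULES):
    """Return True if a phone-initiated flow matches one of the rules."""
    return any(r.proto == proto.lower() and r.dst_port == dst_port
               for r in rules)
```

Notice that no RTP or RTCP ports appear in the table. Those are meant to be opened dynamically by the Skinny inspection. Keep in mind as well that a TFTP transfer continues on ports other than UDP/69 after the initial request, so it relies on protocol awareness in the firewall too.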
But there are environments that require even more coordination capabilities from your firewalls. Even after migrating all your phones to IPT, you will need to integrate with legacy voice networks, which brings voice gateways into the picture. For example, when using the H.323 framework, the first additional concern is whether to use direct call signaling between CUCM and voice gateways or Gatekeeper-controlled calls.
Among the protocols defined under the H.323 umbrella, some are of particular importance for VoIP/IPT Networks:
H.225: deals with call setup and termination in H.323 environments. Inspection of H.225 is enabled by default in the ASA global policy over TCP port 1720 (inspect h323 h225).
H.245: controls traffic flow, capabilities negotiation and allocation of media channels, among other tasks. The TCP port used for H.245 during a call is obtained from the initial H.225 inspection (and, good news, your ASA understands H.225…).
H.225 RAS: Registration, Admission and Status (RAS) messages are used in H.323 Gatekeeper-controlled scenarios. RAS inspection is turned on by default in the ASA global policy for UDP port 1719 (inspect h323 ras). The RAS inspection also takes care of Gatekeeper Discovery messages sent over UDP port 1718 (if you are using statically defined gatekeepers, you just need UDP/1719 for RAS).
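The difference between the fixed signaling ports and the dynamically learned H.245 port can be sketched as follows. The port numbers are the ones listed above; the H323Inspector class and its methods are hypothetical and only mimic, in a very simplified way, the bookkeeping an inspection engine has to perform.

```python
# Well-known H.323 signaling channels mentioned in the text.
H323_STATIC_CHANNELS = {
    ("udp", 1718): "Gatekeeper discovery",
    ("udp", 1719): "H.225 RAS",
    ("tcp", 1720): "H.225 call signaling",
}


class H323Inspector:
    """Toy model: static ports are known in advance, H.245 is learned."""

    def __init__(self):
        self.h245_ports = set()

    def learn_h245_port(self, port):
        # In a real firewall this value is extracted from the
        # inspected H.225 messages of the call being set up.
        self.h245_ports.add(port)

    def classify(self, proto, port):
        proto = proto.lower()
        if (proto, port) in H323_STATIC_CHANNELS:
            return H323_STATIC_CHANNELS[(proto, port)]
        if proto == "tcp" and port in self.h245_ports:
            return "H.245 call control (learned from H.225)"
        return "not H.323 signaling"
```

Before the H.225 exchange is seen, a TCP connection to an arbitrary high port is just unknown traffic. After learn_h245_port records the negotiated value, the same connection is recognized as part of the call, which is why the firewall needs to understand H.225 in the first place.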
These two H.323 call signaling models are portrayed in Figure 3. The main reason for including this here is to emphasize that the firewall must be smart enough to protect the various phases of the call setup process and keep everything as transparent as possible to the end users.
It is important to observe that there are more signaling entities now (CUCM, gateways, gatekeeper), typically residing on different interfaces. Moreover, the final goal (after inspecting all the control messages) is to establish a call between IP phones (and, again, potentially located on distinct interfaces).
Is it possible to integrate Voice Confidentiality with Stateful Firewalling?
The discussion so far examined the process of opening only the necessary media channels after inspecting the exchange of signaling messages among call control elements. This is supported for Skinny, H.323, SIP and MGCP.
If encrypted signaling is employed, the protocol messages become meaningless to firewall inspection engines, because they will only see a static port reserved for signaling (such as TCP/2443, the SCCP over TLS port) and treat it as a generic, single-channel, TCP-based protocol. As a consequence, all the RTP and RTCP media ports would need to be opened in advance. But keeping a large range of UDP ports open would virtually render the firewall useless. (Wouldn't it?)
But if, from the end user's perspective, the most relevant feature is voice confidentiality, a simple solution would be to use unencrypted signaling, so that the connections involving Secure RTP (SRTP) ports could be dynamically created across firewall interfaces. (Well, not that simple, actually…) You should never lose sight of the fact that the keys used for media encryption are generated during the signaling process and, as a result, in the cases where signaling is not encrypted, these session keys would be exposed…
After this quick reflection on the impact that encrypting voice traffic has on the work of firewalls, a basic question naturally arises: is it feasible at all to enable Stateful Inspection and encryption simultaneously in IP Telephony environments? (Yes, indeed…) This is made possible by advanced mechanisms such as the ASA TLS-Proxy feature, which is quickly summarized in Figure 4. In this solution, after inserting itself in the signaling path between phones and CUCM, the ASA regains visibility into the operations of the supported protocols (Skinny and SIP), thus allowing normal filtering policies to be deployed. (Please notice that a detailed description of TLS-Proxy would deserve a dedicated article and is out of the scope of this text.)
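A bit of arithmetic helps to show the size of the problem. The sketch below compares a statically opened media range with the ports that dynamic inspection would open. Both inputs are assumptions: the range UDP 16384 to 32767 is used only as an illustrative media port range, and each call is counted as four ports (RTP plus RTCP for each of the two phones).

```python
# Assumed media port range, for illustration only.
RTP_RANGE = range(16384, 32768)


def ports_open_static(port_range=RTP_RANGE):
    """Ports permanently open when the firewall cannot read the signaling."""
    return len(port_range)


def ports_open_dynamic(active_calls, endpoints_per_call=2,
                       channels_per_endpoint=2):
    """Ports open only for the duration of the calls in progress."""
    return active_calls * endpoints_per_call * channels_per_endpoint
```

Under these assumptions, ten simultaneous calls need 40 dynamically opened ports, whereas the static approach leaves 16384 ports reachable at all times, whether or not anybody is on the phone.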
This article summarized key functionality that must be available on firewalls that are meant to provide protection for IP Telephony environments. Deep knowledge of the signaling protocol messages and the ability to allow coexistence of stateful inspection with voice confidentiality are among the basic requirements. Recalling the basic principle of building layers of security, it should be stated clearly that, although firewalls play a critical role in telephony security, other protection measures such as those provided by LAN switches and routers should not be neglected.
For more about this and other firewall-related topics, please check the Cisco Press title: