Tekelec.com


SIP Sessions


MSRP Target Path

My last post described how MSRP endpoints use SDP to set up sessions. Today, we'll discuss how the MSRP protocol uses the results of the SDP offer/answer exchange.

Each endpoint builds a "target path" that it will use for all MSRP communication with its peer. If an endpoint does not use a relay, then the target path is exactly the same as the SDP path attribute value it received from its peer. On the other hand, if the endpoint does use a relay, then it forms the target path by prepending the URI that it got from each relay to the peer's SDP path attribute value. In both cases, the target path forms a roadmap of how to get an MSRP request to the peer, i.e., a list of MSRP URIs showing each hop to visit on the way, ending with the URI of the destination device.

If you recall from last time, Alice sent Bob an SDP offer containing the following:

a=path:msrps://alice.example.com:7654/asfd34;tcp

Bob responded with an SDP answer containing this:

a=path:msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp

Since Alice didn't introduce a relay (Bob's relay doesn't count here), she's got life easy. Her target path is exactly what Bob sent her:

msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp
 
Okay, I lied a little bit about Bob's relay not counting. The relay clearly exists in the path, because Bob put it there. But since Alice did not introduce a relay on her own behalf, she uses Bob's path value as-is, without worrying about relays in general.
 
On the other hand, Bob did introduce a relay. So to get his target path, he takes the path Alice sent him, and prepends his own relay, and gets this:
 
msrps://relay.example.net:8211/asfioef;tcp msrps://alice.example.com:7654/asfd34;tcp
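
The rule above is small enough to sketch in a few lines of Python. This is only an illustration of the prepend-your-own-relays idea, not code from any MSRP stack; the function name `build_target_path` and its parameters are made up for this example.

```python
def build_target_path(peer_path, own_relay_uris=()):
    """Return the MSRP URIs to visit, in order, to reach the peer.

    peer_path: value of the a=path attribute received from the peer.
    own_relay_uris: URIs this endpoint got from its own relays
        (empty when the endpoint does not use a relay).
    """
    peer_hops = peer_path.split()
    # Our own relays come first; the peer's path follows unchanged.
    return list(own_relay_uris) + peer_hops


ALICE_PATH = "msrps://alice.example.com:7654/asfd34;tcp"
BOB_PATH = ("msrps://relay.example.net:8211/asfioef;tcp "
            "msrps://bob.example.net:6581/asfd34;tcp")
BOB_RELAY = "msrps://relay.example.net:8211/asfioef;tcp"

# Alice has no relay of her own, so she uses Bob's path as-is.
alice_target = build_target_path(BOB_PATH)
# Bob prepends the URI he got from his relay to Alice's path.
bob_target = build_target_path(ALICE_PATH, [BOB_RELAY])
```

Note that the function never inspects the peer's path for relays, which mirrors the point that Alice does not need to care whether Bob used one.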
 

Alice and Bob are now almost ready to exchange MSRP messages. But there's one more step that must happen first: Alice must open a TCP connection towards Bob. Note that RFC 4975 says the offerer always opens the connection towards the answerer. There's work afoot to allow MSRP endpoints to negotiate the connection direction using COMEDIA, but for now let's assume Alice and Bob are using RFC 4975 as-is.

Alice connects to the first device in her target path. In this case, that's Bob's relay. She uses the DNS to get an IP address for "relay.example.net" and opens a TCP connection to port 8211, and starts sending messages. That's assuming she doesn't already have such a connection--for example, she might already have an MSRP session in progress with someone else that uses the same relay as Bob. In that case, she just uses the connection she already has.

Now, let's pretend for just a moment that things were reversed, and Bob had sent the original offer. The first device in his target path is also "relay.example.net". But since he already set up a connection to that relay when he authenticated with it, he doesn't need to set up a new one. He just reuses the one he already has. The relay would establish a connection to the next hop (Alice, in this case) on demand.
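
Here is a rough Python sketch of the "connect to the first hop, but reuse a connection if you already have one" behavior. Everything in it is an assumption made for illustration: the regular expression only handles URIs shaped like the ones in this post (not the full MSRP URI grammar), `ConnectionPool` is a made-up name, and the second session (with "carol") is hypothetical. The connection function is passed in so the example runs without touching the network.

```python
import re

# Simplified pattern: scheme://host:port/session-id;transport
MSRP_URI = re.compile(r"^(msrps?)://([^:/;]+):(\d+)/([^;]+);(\w+)$")


def next_hop_address(uri):
    """Extract (host, port) from an MSRP URI like those shown above."""
    match = MSRP_URI.match(uri)
    if match is None:
        raise ValueError("not an MSRP URI: " + uri)
    return match.group(2), int(match.group(3))


class ConnectionPool:
    """Keeps one connection per (host, port) and reuses it across sessions."""

    def __init__(self, connect):
        self._connect = connect      # callable(host, port) -> connection
        self._connections = {}

    def get(self, target_path):
        # Only the first hop of the target path decides where we connect.
        address = next_hop_address(target_path[0])
        if address not in self._connections:
            self._connections[address] = self._connect(*address)
        return self._connections[address]


opened = []


def fake_connect(host, port):
    """Stand-in for a real TCP/TLS connect; just records the call."""
    opened.append((host, port))
    return object()


pool = ConnectionPool(fake_connect)
first = pool.get(["msrps://relay.example.net:8211/asfioef;tcp",
                  "msrps://bob.example.net:6581/asfd34;tcp"])
# A second, hypothetical session through the same relay reuses the connection.
second = pool.get(["msrps://relay.example.net:8211/zzz999;tcp",
                   "msrps://carol.example.net:7000/qwerty;tcp"])
```

In a real client the `connect` callable might wrap `socket.create_connection((host, port))` and add TLS for msrps URIs; that part is left out here on purpose.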

When either endpoint wants to send a message to the other, it constructs a SEND request with the message content in its payload. The endpoint puts its target path in a To-Path header field, and its own URI in the From-Path header field. It then sends the request to the first device in the To-Path. If that device is the peer, then that's pretty much all there is to it. If the first device is a relay, the relay removes its own URI from the To-Path and prepends it to the From-Path, then relays the request downstream.
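
The header shuffling is easier to follow in code. This is a minimal sketch that models only the two path header fields as Python lists; the function names are invented for the example, and the symbols "Alice", "Bob", and "Relay" stand in for the real URIs.

```python
def new_send(target_path, own_uri):
    """Path header values an endpoint puts on a fresh SEND request."""
    return {"To-Path": list(target_path), "From-Path": [own_uri]}


def relay_forward(request, relay_uri):
    """Path header values after the relay at relay_uri forwards the request."""
    to_path, from_path = request["To-Path"], request["From-Path"]
    if not to_path or to_path[0] != relay_uri:
        raise ValueError("request is not addressed to this relay")
    # Drop our URI from the front of To-Path and put it on the front of From-Path.
    return {"To-Path": to_path[1:], "From-Path": [relay_uri] + from_path}


# Alice sends to Bob through Bob's relay.
request = new_send(["Relay", "Bob"], "Alice")
forwarded = relay_forward(request, "Relay")
# forwarded["To-Path"] is now ["Bob"], forwarded["From-Path"] is ["Relay", "Alice"]
```

After the relay has done its work, the From-Path read left to right is exactly the route back to the sender.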

Here's a picture for Alice and Bob. I've replaced the actual URIs with the symbols "Alice", "Bob", and "Relay" to try to keep it readable.

To-Path and From-Path 

At this point, you are probably (and rightly) wondering what the point is of a relay moving its URI to the From-Path header field when it relays a request. This allows a downstream device to send a response to a SEND request, in the form of a REPORT request. Don't confuse this with a message from the human Bob in response to a message from the human Alice. That would simply be another SEND request in the opposite direction. Instead, REPORT requests carry delivery information about the original request.

The peer device, and any relays in between, can originate a REPORT request back to the endpoint that sent a SEND request. They do this by inserting the From-Path that they observed in the SEND request into the To-Path of the REPORT request. Here's a picture showing a REPORT request sent by Bob, and another by Bob's relay.

REPORT Paths
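
As a small sketch of the same idea in Python: the REPORT's To-Path is simply the From-Path the device saw on the SEND. The function name is made up, and the sketch assumes the reporting device puts its own URI in the From-Path, the same way an endpoint does for a SEND.

```python
def report_paths(observed_from_path, own_uri):
    """Path header values for a REPORT about a SEND this device received."""
    return {"To-Path": list(observed_from_path), "From-Path": [own_uri]}


# Alice's SEND as the relay receives it has From-Path: Alice.
relay_report = report_paths(["Alice"], "Relay")
# By the time the SEND reaches Bob, the relay has prepended itself to From-Path.
bob_report = report_paths(["Relay", "Alice"], "Bob")
```

Bob's REPORT travels back through the relay, while the relay's own REPORT goes straight to Alice.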

That's enough for now. Next time, I'll talk about how MSRP messages can be broken into "chunks" in order to multiplex multiple sessions across the same connection. 

 

 

 

SIP-I and SIP-T Challenge: Feature Interworking

This is the final post in the series on SIP-I and SIP-T deployment challenges. You may wish to read the Introduction to SIP-I and SIP-T post for some general background on these two protocols before continuing.

The difficulties in interworking features between the PSTN and SIP networks stem predominantly from two areas. The first is the different models for where services are implemented -- the PSTN expects them to be performed by the network, while SIP is designed for them to be implemented in the end points. The second is the fact that SIP defines tools instead of named services, while the PSTN relies on a discrete and highly constrained set of standardized services.

To be clear, most basic services work just fine in a mixed-protocol network. Call waiting, call forwarding, calling party identification -- in fact, almost all of the most popular PSTN CLASS services work just fine.

A good demonstration of how these two issues can cause problems is the class of services commonly known as "Call Completion." (This service goes by a large number of other names, such as "Auto Callback" and "Camp On Extension.") At a high level, the call completion service works as follows: a calling party attempts to contact a called party; however, the attempt is unsuccessful (either because the called party is busy or because they do not answer). The calling party then activates the Call Completion service. When the called party becomes available (is no longer busy, or demonstrates availability by changing hookstate), the calling party is alerted. The calling party then picks up the phone, and the call proceeds as normal.

To examine why this is not trivial to interwork, we need to understand how ISUP implements this service and how SIP implements it.

The ISUP version of this service is defined in the ITU-T documents Q.733.3 and Q.733.5. At a high level: during a call setup attempt, the ISUP Address Complete (ACM) message includes an indication that a called party's end office supports the Call Completion service. If the calling party wants to activate the Call Completion service, then TCAP is used to convey an activation request. When the called party is available, TCAP is again used to indicate that the remote user is free. The calling party's end office alerts the calling party and waits for their phone to go offhook. The end office then attempts a new call to the called party. The ISUP Initial Address Message (IAM) indicates that this call attempt is a result of the Call Completion service, to allow proper handling at the called party end office. From this point, the call proceeds pretty much as normal.

The overall call flow looks something like this:


By contrast, the tool SIP uses to perform this service is the dialog event package, defined in RFC 4235. The dialog event package allows a user to subscribe to the state of sessions at another user's device or devices. In other words, by subscribing to the dialog event status of another user, I can tell whether that other user is busy in one or more calls. I can also tell when their terminal goes on-hook or off-hook (when those concepts apply to the terminal they are using).

Of course, this tool enables a lot more than the Call Completion service -- but it can be used to implement a Call Completion service in the SIP network. At a high level, the call flow actually looks pretty similar to the SS7 version: the calling party sends an INVITE to the called party, which indicates support for the dialog event package in its provisional responses (e.g., 180 Ringing) and final response (e.g., 486 Busy). The calling party can then subscribe to the dialog event package, and learn when the called party is available again. (This can be greatly enhanced using SIP presence information about the called party, but we'll stay focused on replicating the PSTN functionality in this example.) The calling party's terminal then tracks the state of the called party's calls and terminals. When it determines that the called party is available for a call, it alerts its local user, and sends a new INVITE to the called party.
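
To make the terminal-side logic concrete, here is a heavily simplified Python sketch. It assumes the terminal has already reduced each dialog event NOTIFY to a list of dialog state names (the lowercase state names used here are my reading of RFC 4235 and should be checked against it), it only handles the busy case, and the class and callback names are invented for the example. No SIP messages are actually built or sent.

```python
# Dialog states that mean the watched user is still tied up in a call.
ACTIVE_STATES = {"trying", "proceeding", "early", "confirmed"}


def is_busy(dialog_states):
    """True if any reported dialog is still in progress."""
    return any(state in ACTIVE_STATES for state in dialog_states)


class CallCompletion:
    """Caller-side call completion driven by dialog event notifications."""

    def __init__(self, alert_user, send_invite):
        self._alert_user = alert_user      # rings the local user
        self._send_invite = send_invite    # starts the new call attempt
        self.watching = False

    def on_call_failed(self, status_code):
        # 486 Busy Here: this is where a real terminal would SUBSCRIBE
        # to the called party's dialog events.
        if status_code == 486:
            self.watching = True

    def on_dialog_notify(self, dialog_states):
        if self.watching and not is_busy(dialog_states):
            self.watching = False          # fire only once per activation
            self._alert_user()
            self._send_invite()


log = []
service = CallCompletion(lambda: log.append("alert"),
                         lambda: log.append("INVITE"))
service.on_call_failed(486)
service.on_dialog_notify(["confirmed"])    # still on a call: nothing happens
service.on_dialog_notify(["terminated"])   # call ended: alert, then INVITE
service.on_dialog_notify([])               # already handled: ignored
```

All of the service logic lives in the calling terminal, which is exactly the endpoint-centric model described above.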

The call flow looks something like this; note that no network servers are shown in this call flow because they do not participate in the service:


The issue that arises is due to the difference between ISUP's call completion service and SIP's dialog event package tool. ISUP's service is very narrowly focused on making this one specific use case work. None of its procedures or messages can be re-used to implement new services. By contrast, the SIP tool can be used for myriad services, such as enhancing multiparty conferences, enabling certain types of advanced third-party call control, and implementing shared-line behavior on multiple devices. And, of course, it can be deployed in new and clever ways to create services that we haven't even thought of yet.

The problem is that the PSTN gateway can tell that the called party supports the call completion service, but can't actually get more general information about the calls that the called party is involved in. And there is no way for the gateway to tell the calling party "I support the call completion service, but don't have enough information to actually do the dialog event package" -- because, in SIP, we use tools, not services.

Similar problems arise with distinctive alerting, call parking, line sharing, and several other advanced services. Luckily, the IETF has formed a working group, BLISS, to tackle these issues. BLISS has been working in concert with TISPAN and other standards groups to ensure that the solutions work well with existing solutions in the PSTN. So, unlike the other deployment challenges we've gone over, this one is likely to get better as time goes on.

 


Can SIP be a successful protocol?

Some time ago my colleague Jiri Kuthan recommended that I read RFC 5218. In it the authors discuss what makes protocols succeed or fail. A successful protocol is defined as one that meets its design goals and is widely deployed. The authors present some factors which they believe to be crucial for the success of a protocol and present some use cases in which they apply these factors to some successful and failed protocols. Among these factors the authors list the design, extensibility and openness of networking protocols.

While reading the RFC I started thinking, what would be the result of applying these factors on SIP:

Initial success factors: These are the factors that help a protocol become successful in the initial phase of its deployment:

  • Positive net value: SIP obviously solves a problem; namely that of establishing a session in IP networks. While SIP bears the promise of enabling all kinds of sessions it is mostly used for establishing voice calls. In this context it does not offer more functionality than traditional SS7 signaling, H.323 or Skype. The real positive net value of SIP is hence demonstrated when operators start deploying more SIP-based services such as presence and application servers that offer more flexible and intelligent communication services than we have today.
  • Incremental deployment: SIP can be deployed without having to update the network routers. However, unlike the arguably most successful Internet protocol, HTTP, it is not sufficient to provide a server and a client. For a communication service to be of use there must be a lot of clients and users available. While there are already different providers offering VoIP services using SIP with millions of users, these providers act as islands that are connected over the PSTN. Hence, in order for SIP to excel on this point, more SIP-based peering between providers is needed.
  • Open code availability: There are already different open source components needed for a SIP service. The SIP Express Router is an excellent and widely used SIP proxy. Asterisk and SEMS offer flexible and easy to use media services such as IVR or conferencing. On the user agent side, there are also different implementations of different quality.
  • Restriction free: SIP is provided as a patent-free technology for all.
  • Open specifications: The SIP specifications are provided by IETF and are open.
  • Open maintenance: SIP is maintained by the IETF and is extended and fixed continuously. While this is surely a good thing, this has also led to a load of specifications that some might claim are too much.
  • Good technical design: While SIP was hailed at the beginning as the simpler alternative to H.323, it has gained a lot of weight over the years. Taking the same comparison factors used in RFC 5218 - namely security and congestion control - SIP does not seem so perfect, as congestion control is not considered and it does not have a powerful concept for identity management. Also, deployment issues such as NAT traversal were only addressed at later stages.

Wild success factors: These are the factors that contribute to success and wide deployment:

  • Extensible: While designed in the early stage for simple calls, SIP is now used for multi-party calls, presence and trunking scenarios. Also, the integration of new applications and services should be rather straightforward as SIP is not restricted to a certain usage scenario.
  • Scalability: While we still do not have any experience regarding the cost and complexity of building a SIP infrastructure for hundreds of millions of users, I do not see a real reason why this could not be done.
  • Security: SIP has different mechanisms for authenticating users and protecting the signaling traffic. However, it does not have explicit mechanisms for protection against DoS attacks or fraud.

Discussion

Looking at the points above it looks like SIP has more or less a positive result on the discussed factors. However, getting positive marks on the evaluation factors does not mean that a protocol will be a success. If we evaluate Skype based on these parameters then we should conclude that Skype should fail. There is no open source code or open specifications and the net value is not much higher than PSTN or SIP. However, the number of users of Skype is higher than that of SIP.

So does this mean that SIP will become a wild success? Well, I guess the answer is a very definite maybe! The success or failure of a protocol can only be judged 5 to 10 years after finishing the standardization - so we still have a few years in front of us. But it has the needed success factors, and with more applications, peering relations and clearer business models, the chances that SIP will be wildly successful are pretty good.

Mobile Infrastructure Trends for 2010

Mobile data traffic is growing beyond our imagination. After the introduction of the latest iPhone in June, Google stated that the mobile upload of video increased by 400% over the previous day. The introduction of Android-powered phones, including the recently launched Nexus One, will only increase this rate. Over the next four years, it is projected that global mobile traffic will exceed 50,000 Terabytes per day! I believe that this humongous data avalanche that is taking over the mobile world will remain the single most important factor that will influence mobile infrastructure trends in 2010 and beyond.

So if I have to pick the top three mobile infrastructure trends for 2010 what will they be?

  • 3G/4G Network expansion will continue through this year at an accelerated pace. We will see more and more deployments of HSPA, HSPA+ and LTE.
  • Convergence of Wireless LAN/WAN - Traffic growth may be huge, but operators are yet to find that magic formula to monetize all this traffic growth. That means cost optimization of network expansion is critical. Wi-Fi and Femtocell micro-sites will complement 3G/4G networks by offloading much of the data traffic.
  • Just throwing money into mobile bandwidth infrastructure will not by itself address the problem of exploding mobile data traffic. Networks will get smarter in handling the traffic. Better traffic management and differentiated treatment of different traffic types are the essential short-term solution. In the long run, the industry also needs innovative pricing that monetizes different types of traffic that can fund infrastructure growth. Net Neutrality driven regulation may be a wild card that will influence this in certain regions, but I still think this will be the case globally.

What would be your top 3 trends?

FAQ: What is WINFO?

As described in an earlier FAQ, SIP Events uses a notion of a "package" to determine what kind of information is being asked for, what kind of change will cause a notification to be sent, and what the available options are for encoding the information in a NOTIFY request.

The current set of standardized SIP Event packages is maintained at the sip-events namespace registry at IANA. At the beginning of 2010, there are thirteen registered packages, and one special thing called a "template-package": winfo.

Subscribing to this template-package will give you "Watcher INFOrmation": details of each subscription to a particular event. For instance, I could subscribe to "presence.winfo" for sip:RjS@tekelec.com to see who is watching my presence.

Template packages are never used directly - they must be applied to regular packages. In other words, it isn't possible to subscribe to "winfo", only to events like "presence.winfo" or "message-summary.winfo".
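
A tiny Python sketch of that naming rule, purely for illustration: the function name and the set of known template packages are assumptions of this example, not part of any SIP library.

```python
TEMPLATE_PACKAGES = {"winfo"}


def event_name(package, template=None):
    """Build the Event header value for a package, optionally with a template."""
    if package in TEMPLATE_PACKAGES:
        # A template cannot stand alone; it must be applied to a real package.
        raise ValueError(package + " is a template-package, not a package")
    if template is None:
        return package
    if template not in TEMPLATE_PACKAGES:
        raise ValueError(template + " is not a known template-package")
    return package + "." + template


presence_watchers = event_name("presence", "winfo")
voicemail_watchers = event_name("message-summary", "winfo")
```

Calling `event_name("winfo")` raises a `ValueError`, matching the rule that the template is never subscribed to on its own.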

The template package concept was introduced to make it easier to build packages that would extend every other existing package the same way. It would have been possible to build the same system without template packages by creating separate "presence-winfo", "message-summary-winfo", etc. packages, but each of those would have to respecify the common behavior. Having this meta-package tool avoids that extra specification work (and makes it less likely that watcher information for package "foo" and for package "bar" would behave in subtly different ways).

The concept has been with us for nearly a decade, and the only event-template package we've found a need for is winfo. It may be the only one that ever exists.

Like many aspects of SIP Events, winfo was driven by presence. When a new person tries to add me to their list, I need a way to find out so that I can give the service permission to hand my presence to that new person. My client needs a nudge so it can ask me whether I would like to allow or deny the subscription. Early attempts at a solution involved having the server send me a QAUTH request before answering the SUBSCRIBE from this new person. That turned out to be a dead-end for two reasons. First, like all SIP non-INVITE requests, QAUTH had to get an answer within 32 seconds (64*T1, where T1's default value is 500ms). If I didn't happen to be sitting in front of my computer, notice the dialog, and answer within that time, the wrong thing happened. Second, if this new person and I never happened to be online at the same time, authorization would never complete.

To solve these problems, we reused SIP Events itself - using the winfo template-event package to subscribe to changes in the set of watchers for any other package, like presence. The initial NOTIFY for a winfo subscription will describe each of the existing subscriptions detailing who the subscriber is, how long the subscription has been in place, when it will expire (if it isn't refreshed), and what the current authorization state for the subscription is. (Remember that subscriptions can enter a "pending" state if a server doesn't have authorization when the SUBSCRIBE arrives). To solve the never-online-at-the-same-time problem, winfo carries one more state, named "waiting", for subscriptions which were attempted recently but for which authorization was not available.

Here's a short example of winfo in action. Assume at the beginning of this flow that I've authorized Ben and Adam to see my presence, but not Theo. Note that in this flow, Theo and I are never online at the same time.

 

The winfo NOTIFY body format is XML. The initial NOTIFY for a winfo subscription will have a complete list of current watchers. Subsequent NOTIFYs will only contain information for those watchers whose subscription has changed state.
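
Here is a rough Python sketch of how a client might keep its watcher list up to date from full and partial NOTIFY bodies. The element names, attribute names, and namespace are written from my recollection of RFC 3858 and should be checked against it before relying on them; the watcher addresses and ids are hypothetical, and the function name is made up.

```python
import xml.etree.ElementTree as ET

NS = {"wi": "urn:ietf:params:xml:ns:watcherinfo"}


def apply_winfo(watchers, xml_text):
    """Update the {watcher id: (uri, status)} table from one NOTIFY body."""
    root = ET.fromstring(xml_text)
    if root.get("state") == "full":
        watchers.clear()   # a full document replaces everything we knew
    for watcher in root.iterfind("wi:watcher-list/wi:watcher", NS):
        watchers[watcher.get("id")] = (watcher.text.strip(),
                                       watcher.get("status"))
    return watchers


FULL = """
<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo" version="0" state="full">
  <watcher-list resource="sip:RjS@tekelec.com" package="presence">
    <watcher id="w1" status="active" event="subscribe">sip:ben@example.com</watcher>
    <watcher id="w2" status="active" event="subscribe">sip:adam@example.com</watcher>
    <watcher id="w3" status="waiting" event="timeout">sip:theo@example.com</watcher>
  </watcher-list>
</watcherinfo>
"""

PARTIAL = """
<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo" version="1" state="partial">
  <watcher-list resource="sip:RjS@tekelec.com" package="presence">
    <watcher id="w3" status="active" event="approved">sip:theo@example.com</watcher>
  </watcher-list>
</watcherinfo>
"""

watchers = {}
apply_winfo(watchers, FULL)      # initial NOTIFY: complete list of watchers
apply_winfo(watchers, PARTIAL)   # later NOTIFY: only the watcher that changed
```

The sketch keeps terminated watchers in the table with their last status; a real client would probably drop them.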

For more details on the format of the winfo NOTIFY bodies, see RFC 3858. The winfo template-package itself is in RFC 3857.

 

BICC: A "Temporary" Solution to a Real Problem?

Recently BICC became the protocol of choice for several wireless operators deploying VoIP trunking backbones. For many it is perceived as a temporary workaround solution.

What I'm asking myself is whether such a temporary solution might end up staying for much longer.

In the IP world we have quite a few examples of "temporary solutions" that have featured momentary advantage, real or perceived, and which have stayed with us. I consider NATs (Network Address Translators) the most important example. NATs translate IP addresses in packets from/to the public Internet to a private address range. This allows for the sharing of a scarce public IP address among multiple devices (real problem) and hiding these devices better. The latter is considered a perceived security feature as the same functionality can be implemented in firewalls without the side effect of changing IP packets. The changes to IP packets can cause dysfunctional applications, errors in security protocols and limitations to redundancy schemes. (See RFC3027 for more).

In the IETF standardization body NATs have therefore been considered architecturally unsound, or even evil architecture. It should be added that guarding an "architectural spirit" has allowed Internet technology to develop, stay manageable and prevail. In this spirit the "right answer" to the problem of scarce IP addresses has been proposed: IPv6. NATs have been labeled as a "temporary solution". However, many years later it is as hard to find a user without NATs as it is a user with IPv6. NATs are here to stay and the fate of IPv6 is anything but clear.

I think a similar scenario is occurring with Session Border Controllers (SBCs). They solve real problems such as NAT difficulties that have bubbled up to the application layer. They solve temporary problems such as interoperability of immature implementations. They also solve perceived problems -- we still know too little about what security problems are out there, but SBCs offer an answer already. However, the coincidence of solving real problems and claims they are of a temporary nature seems to me remarkably similar to NATs. I think this may lead to a similar ending, with SBCs staying and solving the problems that are compelling and permanent. Certainly, re-connecting disjoint IP addressing spaces would be one of those.

What is the next controversial architecture to stay? NATs are definitely staying, SBCs keep "temporarily" reconnecting disjoint address spaces. The capability to solve compelling problems has prevailed over desires for a consistent and presumably easier-to-maintain architecture. Recently, several BICC deployments have emerged -- is BICC going to be the next protocol that is "architecturally unsound", yet here to stay?

The BICC architecture is very special-purpose and therefore of limited applicability. Basically, it inherits encoding from ISUP, and transports it over IP or even ATM. Development beyond the purpose of trunking began stalling at "CS3" (Capability Set 3). By any measure it is a real hybrid vehicle.

Still, as unappealing as it sounds, deployments are reported and continue to solve the MSC trunking scenario. We can soon witness again an architecture solving a real problem and prevailing over "grand architectures" in specific use cases.

It appears that the notion of hybrid vehicles is not exclusively owned by the automotive industry.

Happy New Year!


MSRP SDP Extensions with Relays

My last SIP Sessions post discussed the SDP offer/answer extensions used by MSRP in the peer-to-peer scenario. Today, we will look at how this changes when you introduce MSRP Relays into the mix.

RFC 4976 defines the MSRP relay extension. There's quite a bit to talk about with MSRP Relays. Today we're going to focus just on the parts that impact the offer/answer process. I'll cover more about relays in a future post.

An MSRP client that needs to use an MSRP relay must first authenticate to the relay and request an MSRP URI that represents the session at that relay. It does this using an MSRP extension method called "AUTH". We will dive into the gory details of AUTH after we discuss the general MSRP transaction model--also in future posts (Are you starting to see the pattern here?). But we need to understand it conceptually in order to explore how relays affect the offer/answer model.

The client sends the AUTH request to the relay over a TLS connection. The relay authenticates the client using a form of digest authentication much like that from HTTP and SIP. The client uses the TLS association to authenticate the relay.

Once the authentication completes the relay generates an MSRP URI that resolves to the relay itself. The relay puts this URI in a "Use-Path" header field in the 200 OK response that it sends back to the client in response to the AUTH request. The client then uses the relay's URI in the session negotiation.

This is conceptually similar to how some other relay-based NAT traversal mechanisms work. For example, SOCKS and TURN each allow a client to request that a relay device allocate a port on its behalf.

Once the client gets a "Use-Path" header value from the relay, it can then build the SDP path attribute by appending its local URI to the relay URI. For example, assume the client's URI is "msrps://client.example.com:2855/asfd34;tcp" and the relay returned a Use-Path value of "msrps://relay.example.com:7212/d3asdf43;tcp". The path attribute would now look like the following:

a=path:msrps://relay.example.com:7212/d3asdf43;tcp msrps://client.example.com:2855/asfd34;tcp

You're probably wondering why "Use-Path" is not called "Use-URI". The reason is that, just like the SDP path attribute, "Use-Path" can contain more than one URI. There are a few cases where a relay might need to return more than one URI. (You guessed it--we'll talk about those in a future post.) But regardless of why it might happen, the relay-using client would build the SDP path attribute by taking the entire contents of "Use-Path", reversing it, then adding its own URI to the end. Thus, the SDP path attribute becomes an assertion: "To get to me, follow this path from left to right. My local URI is on the end."
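
As a quick sketch of that procedure in Python (the function name is invented, and the reverse-then-append rule is taken straight from the description above rather than re-derived from RFC 4976):

```python
def sdp_path_attribute(use_path, local_uri):
    """Build the a=path line from a Use-Path header value and our own URI."""
    relay_uris = use_path.split()
    relay_uris.reverse()             # take the Use-Path contents in reverse order
    return "a=path:" + " ".join(relay_uris + [local_uri])


line = sdp_path_attribute("msrps://relay.example.com:7212/d3asdf43;tcp",
                          "msrps://client.example.com:2855/asfd34;tcp")
```

With a single relay URI the reversal changes nothing, so `line` is the same path attribute shown in the earlier example; the reversal only matters when Use-Path carries several URIs.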

Let's look at a more complete example from RFC 4976. Alice invites Bob to an MSRP session. Alice does not use a relay, but Bob does. Alice's offer looks something like the following:

v=0
o=alice 2890844526 2890844526 IN IP4 alice.example.com
s= 
c=IN IP4 alice.example.com
t=0 0
m=message 7654 TLS/TCP/MSRP *
a=accept-types:text/plain
a=path:msrps://alice.example.com:7654/asfd34;tcp

When Bob sees the offer, he connects to his relay (relay.example.net), and performs an AUTH transaction. He gets back a 200 OK response containing, among other things, the following:

 Use-Path: msrps://relay.example.net:8211/asfioef;tcp

Bob's SDP answer then looks something like this:

v=0
o=bob 2890844542 2890844542 IN IP4 bob.example.net
s= 
c=IN IP4 bob.example.net
t=0 0
m=message 6581 TLS/TCP/MSRP *
a=accept-types:text/plain
a=path:msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp

Note that in this case, Alice's client does not have to implement RFC 4976 at all. It won't know how to use the AUTH method, but even basic RFC 4975 clients can still talk to relay-using peers.

That's enough for now. Next time, we will talk about how these SDP path attributes get used inside MSRP proper.


SIP-I and SIP-T Challenge: SIP Forking

This post continues the series on SIP-I and SIP-T deployment challenges. You may wish to read the Introduction to SIP-I and SIP-T post for some general background on these two protocols before continuing.

One of the most powerful features built into the core of the SIP protocol is called “forking.” Forking allows any SIP proxy to send an inbound request – such as an INVITE request – to more than one destination. It can send these multiple requests either all at once, sequentially, in groups, or use any arbitrary combination of those options.

This feature allows the implementation of services such as “find me, follow me,” parallel ringing, delivery of instant messages to multiple devices, and several other interesting capabilities.

When SIP forking occurs during session establishment, the INVITE messages involved in setting up the call actually travel all the way to the called party’s devices, and establish a protocol relationship directly between the calling device and the called devices.

The reason this was built into the core of the SIP protocol is that, unlike many other technologies used for real-time communication, SIP inherently supports the concept of having a single user potentially available via several devices simultaneously. Callers are generally interested in contacting a user, not a device – so, to support mapping from one user to several devices, we decided to inherently provide functionality for contacting several devices.

While it is immensely useful, SIP forking has proven to be one of the most difficult challenges we face when developing SIP protocol extensions in general. SIP-I and SIP-T are no exception: forking causes problems for both signaling and for audio.

The signaling problem arises from the fact that ISUP and BICC have no inherent protocol behavior that is analogous to SIP forking. Implementation of parallel ringing services in an ISUP network requires termination of the call at an application server, which re-initiates the call towards the various target devices. So, for example, if a parallel ringing call alerts three devices, there are four ISUP calls involved: one from the caller to the application server, and one from the application server to each of the three devices. There is no direct relationship, from an ISUP perspective, between the caller and the devices.

Consider the case in which a SIP-I or SIP-T call arrives at an ingress gateway, and is forked by a SIP proxy to two different egress gateways. The messaging looks something like this (I’ve omitted PRACK transactions for the sake of clarity):


The INVITE messages sent to the two egress gateways will contain the IAM message that started the call, which is sent to both of the called end offices. (Note that the egress gateways will adjust the called party number in the IAM according to the SIP URI in the INVITE, so it will end up indicating the two different devices the call is being sent to).

Assuming that both of the called devices are available, both egress gateways will receive ACM messages from the called end offices, which get mapped into SIP “180 Ringing” messages. Both of these messages arrive at the ingress gateway. The first one that arrives – message 5 in the above diagram – will have its ACM extracted by the ingress gateway, and sent back towards the calling party. However, the gateway must be careful not to send the second ACM (from message 7) into the ISUP network: doing so would be a protocol error, which would cause the calling end office to tear down the call.

Depending on how much ISUP signaling occurs prior to the called party answering, there may be several tunneled ISUP messages that arrive at the ingress gateway while the SIP forking is still active. The ingress gateway is responsible for taking the two different streams of ISUP messages and converting them into a coherent set of messages for the calling party’s end office. This can be tricky to get right, and any errors will cause the call to fail.
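To make the ACM rule concrete, here is a minimal sketch in Python of the bookkeeping an ingress gateway might do for one forked call. It covers only the case described above (pass the first ACM, suppress later ones) and not the general problem of merging two ISUP streams. The class name, the method name, and the use of a string branch identifier are assumptions made for illustration; they do not come from SIP-I, SIP-T, or any real gateway.

```python
class AcmGate:
    """Hypothetical per-call state: lets only the first tunneled ACM through."""

    def __init__(self):
        # Identifier of the fork branch whose ACM was forwarded, if any.
        self.forwarded_from = None

    def accept(self, branch_id):
        """Return True if the ACM from this branch may be sent to the ISUP side."""
        if self.forwarded_from is None:
            self.forwarded_from = branch_id
            return True
        # A second ACM for the same call would be an ISUP protocol error,
        # so the gateway drops it instead of forwarding it.
        return False


gate = AcmGate()
first = gate.accept("egress-gw-1")   # ACM carried in the first 180 Ringing
second = gate.accept("egress-gw-2")  # ACM carried in the later 180 Ringing
```

Here `first` is True and `second` is False. A real gateway would need similar (and more involved) rules for every other ISUP message that can arrive while the fork is active.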

The media-related issue with forking arises from the difference between when SIP expects media to start flowing and when ISUP expects media to start flowing (see my earlier entry about early media for a summary of the general issue). Forking makes this problem much more difficult, since there can be more than one media stream present. If both media streams are simply ringback, it doesn’t typically make much difference which one the ingress gateway plays. But there’s no way for the ingress gateway to know ahead of time what might arrive in the media – it could contain ringback, an announcement, or even playout of an IVR menu.

To further complicate matters: if the gateway elects to play the media stream from one gateway, but the call is answered by another gateway, the called party’s media won’t be played out immediately. This will clip off the beginning of whatever the called party says upon answering the phone. Even worse, it isn’t always possible to tell which media stream belongs to which call, which means the gateway might have to wait for the “incorrect” media stream to completely stop before it can switch over to the proper stream. Since the only way to detect the end of an RTP stream is via timeout, it may be a full second or longer between the called party answering and the media being established.
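The timeout-based detection mentioned above can be sketched as follows. This is an illustration under stated assumptions, not how any particular gateway works: streams are keyed by RTP SSRC, the caller supplies timestamps in seconds, and the one-second threshold is simply a value chosen to match the delay described in the text.

```python
class StreamActivity:
    """Hypothetical tracker that decides an RTP stream has stopped by timeout."""

    def __init__(self, timeout_s=1.0):
        self.timeout_s = timeout_s  # assumed threshold, in seconds
        self.last_seen = {}         # SSRC -> arrival time of the latest packet

    def packet_arrived(self, ssrc, now):
        """Record that a packet for this stream arrived at time `now`."""
        self.last_seen[ssrc] = now

    def is_silent(self, ssrc, now):
        """True if no packet for this stream arrived within the timeout."""
        last = self.last_seen.get(ssrc)
        if last is None:
            return True  # never seen, so there is nothing to wait for
        return (now - last) >= self.timeout_s


tracker = StreamActivity()
tracker.packet_arrived(0x1111, now=10.0)
still_playing = not tracker.is_silent(0x1111, now=10.5)
stopped = tracker.is_silent(0x1111, now=11.2)
```

The gateway cannot conclude that the stream has ended until the full timeout has elapsed, which is exactly where the clipping delay comes from.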

Unfortunately, neither SIP-I nor SIP-T provides guidance for handling the issues that arise from forking. Implementations are left to handle the problems as they see fit. And, in many cases, there aren’t any good answers.


SIP Security: Theft of SIP Services

Stealing the identity of another user allows the attacker to use a service while the costs are charged to someone else. However, the attacker is limited to the privileges of the stolen identity, and every call the attacker makes carries that user's identity as the originator or recipient. Fraudsters generally want to operate on a larger scale, e.g., by reselling stolen services to other people, so that they gain money from the fraud and not just free calls. This can be achieved by gaining access to the infrastructure components of the SIP service, e.g., SIP proxies, databases or gateways to the PSTN. With such access, the attacker can manipulate the authentication process so that his calls are not authenticated or are treated as legitimate, or can simply ensure that no billing records are generated for his calls.

Recently, there have been two patterns for conducting this kind of fraud, namely password guessing and credential emulation.

Password Guessing

SIP components usually have an administration interface that allows the administrator to configure the system, control the privileges of different users and actions, and set the logging and billing criteria. This interface is usually protected by a password. Often, all devices manufactured by the same company ship with the same default password, and administrators often forget to change it during installation at the provider's premises. By knowing this default password, an attacker can assume the identity of the administrator and so obtain the privileges needed to misuse the service. Such fraud can be prevented by changing the default password of the SIP components and protecting the administration interface so that it is only accessible over a trusted network link.

Credential Emulation

A popular setup for VoIP services is presented in the figure below. In this setup the proxy is responsible for authenticating the incoming requests and forwarding legitimate requests to the PSTN gateway. To indicate to the gateway that a request is legitimate, the proxy adds special information to the forwarded requests. The gateway treats this information as an indication of the legitimacy of the request and, hence, only initiates calls to the PSTN if a request includes it. A fraudster can discover this information either by guessing or by brute force. By including it in his own requests, a fraudster can fool a gateway into believing that his requests are legitimate. By running his own proxy server and adding this information to the requests of his customers, the fraudster gains access to the PSTN without having to pay for it.

In general, this kind of attack is more complex. The fraudster needs to find gateways that accept SIP signaling requests directly from the Internet and use this kind of authentication approach. Further, to cover their tracks, fraudsters need to first gain access to a VoIP server of an enterprise or university with high-bandwidth Internet access and then route the calls through these servers.

To protect against such fraud, the communication between the proxy and the gateway must be secured. This can be achieved by having the gateway reject all SIP requests arriving from any IP address other than those of a set of trusted proxies. This could, however, be circumvented by a fraudster who spoofs the source IP address of his requests. Higher security can be achieved by establishing a secure tunnel, e.g., using IPsec or TLS, between the proxy and gateway and rejecting all SIP traffic not arriving over this secured link.
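As an illustration of the first measure (accepting requests only from trusted proxies), here is a minimal sketch using Python's standard ipaddress module. The function name and the addresses are made up for the example; the addresses come from the ranges reserved for documentation. As noted above, a source-address check on its own can be defeated by spoofing, so this is only the weaker of the two protections.

```python
import ipaddress

# Hypothetical trusted proxies (documentation address ranges, not real hosts).
TRUSTED_PROXIES = [
    ipaddress.ip_network("192.0.2.10/32"),
    ipaddress.ip_network("198.51.100.0/28"),
]


def accept_sip_request(source_ip):
    """Return True only if the request came from a trusted proxy address."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        return False  # not a valid IP address, so reject it
    return any(addr in net for net in TRUSTED_PROXIES)


from_proxy = accept_sip_request("192.0.2.10")
from_internet = accept_sip_request("203.0.113.5")
```

With this list, `from_proxy` is True and `from_internet` is False. In a deployment that uses a TLS or IPsec tunnel, the equivalent check would be made on the authenticated peer identity instead of the source address.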

Why there isn't a successful SIP certification program

Over the last several years, I've had many conversations about building a certification program for SIP, including trying to define a few. All of those conversations have ended either in frustration or the conclusion that such a certification program is not the right thing to build.

The proponents of such a program came in each time with a lot of energy and excitement. The conversations got tough when we looked closely at what the program would actually certify. What do you test? What do you require a passing system to do? Each time, it turned out that the proponents really had a single, focused use of SIP in mind (usually simple telephony replacement). The motivation statements tended to look like "I want to buy a phone from and have it work with my service". The participants quickly became mired in arguments about what the tests needed to cover to ensure that outcome. They discovered that they really wanted to test for a lot of end-user visible behavior that the SIP specification itself leaves undefined.

As we tried to work further through the details, we'd frequently see arguments to profile the protocol. In very early conversations, there was pressure to not require (or even to penalize) the implementation of SIP over TCP or the use of the 100rel/PRACK extension, usually driven by a "nobody really does that" argument. It's worth noting that both of those are required in many deployments today. The folks focusing on simple telephony didn't want to be burdened with testing the parts of the protocol needed for presence, and vice versa. The business-telephony-oriented people had an entirely different idea of what a program should look like than the single-line-replacement-oriented folks.

In short, what people really wanted was a certification program for their particular application, not for the protocol itself. Unfortunately, at the time, I don't think anyone involved realized that was the root of why such programs weren't coming together.

With that realization, I've become even more convinced that a generic SIP certification program isn't feasible - it wouldn't produce a useful tool for making our ecosystem(s) better. The energy would be better focused on how the protocol is used rather than trying to certify implementation of the protocol itself.

There are a few new programs under discussion now, in the SIP Forum and other organizations, which are trying the approach of defining certification programs for an application. Those conversations seem to be going further than the earlier attempts, and I think some of them have a chance of succeeding.
