Posted by Adam Roach on Tue, Mar 09, 2010 @ 05:29 PM
One of the abilities that SIP has had since its very earliest days is the ability to use X.509 certificates to sign and encrypt messages being sent across the network. If you haven’t heard of X.509 before, don’t worry – you’re hardly alone.
X.509 itself is based on a really elegant class of algorithms called “Public Key Cryptography.” At a high level, here’s how public key cryptography works: there are mathematical functions you can apply to a large random number to generate two linked cryptographic keys. These keys have the very interesting property that something that has been encrypted by one key can be decrypted by the other, and vice versa. In Public Key Encryption, you generally designate one of these keys “public,” and make it available to anyone who wants it. You designate the other key “private,” and keep it secret.
Once you’ve done this, there are a couple of very interesting things you can do with these keys: first, other people can use your public key to encrypt messages they want to send to you. But the only way to decrypt them is using your private key, which guarantees that no one else can read the messages (not even the person who encrypted them!). This is shown in Figure 1.
Figure 1: Public Key Encryption and Decryption
The other thing you can do is sign messages with your private key (proving that you generated the message), and anyone with your public key can verify that the message hasn’t changed since you created it. This is shown in Figure 2.
Figure 2: Public Key Signing and Signature Verification
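To make the two operations concrete, here is a minimal sketch using textbook RSA with deliberately tiny numbers. The primes, the exponent and the `apply_key` helper are all illustrative assumptions; real systems use vetted libraries, padding and far larger keys, and they sign a hash of the message rather than the message itself.

```python
# Toy RSA with tiny primes: illustrative only, never use numbers this small.
p, q = 61, 53                  # two (far too small) secret primes
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def apply_key(value, key):
    """Apply one half of the key pair to an integer smaller than n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65

# Encryption: anyone encrypts with the public key; only the private key decrypts.
ciphertext = apply_key(message, public_key)
decrypted = apply_key(ciphertext, private_key)

# Signing: the owner applies the private key; anyone verifies with the public key.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key)
```

Both round trips return the original number, which is the "encrypted by one key, decrypted by the other" property described above.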
So, that’s all very nice from a security perspective – so why haven’t we seen more of this? Well, it turns out that the hard part of this isn’t the cryptography (in fact, every modern email client has the ability to do this); the hard part is getting your public keys to everyone. There have been a number of designs for public key infrastructures (PKIs) that are supposed to address this problem, but they haven’t been deployed for a number of reasons.
There’s one notable exception, though: web sites. Because e-commerce depends so heavily on the ability for customers to verify that a web site is who it claims to be (through the signing operation), and to send information like credit card numbers in an encrypted form (using the encryption operation), web sites pretty much have this all figured out.
That gives us just enough of a PKI to pull SIP up by the web’s bootstraps, and that’s exactly what RFC 4474 does. At a high level here’s how that works. Let’s say Alice wants to call Bob, and Bob wants to make sure it’s actually Alice calling before he answers the phone. So, Alice sends a SIP INVITE message to her proxy. The proxy makes sure that the caller is actually Alice, usually by asking Alice’s SIP device to prove it has a password of some kind. Alice’s proxy then adds a signature to the SIP INVITE, which says that the proxy has verified that the caller is, in fact, Alice. It also includes the address of its web server in this INVITE message.
So, when Bob gets the INVITE, he can use this web server address to ask Alice’s Proxy’s web server for the public key for Alice’s proxy. He knows he can trust the web server, because we already have a working PKI for web servers. So, Bob can then use the public key he got back from the trusted web server to verify that the proxy is who they claim to be, and that they’re in a position to vouch for Alice’s identity. The overall information flow for this set of operations is demonstrated by Figure 3.
Figure 3: RFC 4474 Identity Verification
And that, by itself, gets us a lot of the way to where we need to be. Bob now has cryptographic proof that the person calling him is, in fact, Alice.
But this still doesn’t get us all the way to a fully-functioning X.509 service. For example, it doesn’t let Alice sign messages herself, and it doesn’t let her encrypt the actual SIP messages she sends to Bob. Sure, she can send them over TLS, but that only makes things encrypted between proxies – each proxy is still decrypting and re-encrypting the message at every hop. And those proxies can read any part of the message they want.
We’re just finishing up work in the IETF that’s about to change that. The document is called “Certificate Management Service for the Session Initiation Protocol” (or “SIP Certs” for short), and it uses RFC 4474 to do something even more clever.
What’s really neat about RFC 4474 is that it can be used for any kind of SIP request, not just INVITE requests. The SIP Certs work leverages this even further by counting on the ability of RFC 4474 to certify the identity of the entity sending NOTIFY requests. Here’s how we take advantage of that.
Let’s imagine that Alice wants to encrypt something to Bob using his own X.509 certificate. With the SIP Certs framework, Bob will use the SIP PUBLISH method to send his public key to what is called a Certificate Server. The certificate server makes sure the person sending the certificate is actually Bob, and then stores the public key so that anyone who comes along asking for it can get a copy. Later on, when Alice wants to encrypt something that only Bob can read, she asks Bob’s certificate server for Bob’s public key, and uses this to encrypt her message. This flow is shown in Figure 4.
Figure 4: SIP Certificate-based Encryption and Decryption
Now, the step of asking Bob’s certificate server for Bob’s public key is actually a bit tricky, because Alice wants to make sure that Bob’s certificate server is actually Bob’s certificate server. And that’s where RFC 4474 comes back into the picture: just like the INVITE in the example above, the NOTIFY that contains Bob’s public key contains a signature from Bob’s certificate server, and a pointer to a web site that Alice can go to get a certificate to verify that signature. Which is all a very clever way to get back to leveraging the web Public Key Infrastructure to get a usable public key all the way out to Alice in a way that she knows she can trust that the key is actually Bob’s public key.
And once Alice knows that she has Bob’s actual public key, she can encrypt a message for Bob and send it to him.
What’s really cool about this mechanism is that it can be used to sign things as well, as shown by the information flow in Figure 5. Alice simply publishes her own public key to her Certificate server, and then uses her private key to sign a message. Bob can verify Alice's signature by grabbing her public key from her Certificate server, and using it to validate the message he received.
Figure 5: SIP Certificate-Based Signing and Signature Verification
Posted by Cristian Constantin on Tue, Mar 02, 2010 @ 04:35 PM
Session Initiation Protocol messages can be transported over several different protocols: UDP, TCP, SCTP; each of which has advantages and disadvantages. What follows is an overview of them.
SIP over User Datagram Protocol (UDP)
UDP is the simplest way of transmitting chunks of data from one host to another in an IP network. Provided that the amount of data to be sent at once is not too big, UDP will do its best to accomplish the task. Pretty fast too. If the programmers did their job correctly, multi-process or multi-threaded applications do not require extra synchronization delay since the read/write operations are atomic when it comes to UDP sockets.
What you get is maximum throughput. However, this comes at a cost - it may trigger congestion. Congestion means basically that the infrastructure cannot support the amount of traffic that is sent/received through it. Congestion can show up in different parts and layers of the infrastructure:
- it can happen on the way to the remote host - network congestion; in this case the network cannot route and transport at the expected rate.
- it can happen on the remote host itself - application congestion; the end host (a sip proxy for example) cannot process the packets as fast as they are received.
Since UDP and SIP do not provide any explicit mechanism for overcoming congestion, it has to be taken care of in the signaling application.
Here is an example of application congestion that can take place during a failover in an active/standby configuration of sip proxies. The active proxy fails and the standby one takes over after several seconds; due to the SIP retransmissions the newly active proxy will experience traffic spikes which persist for some seconds after the service functions again. In the case where the traffic rate both before and after the failover happened is close to the engineered calls-per-second (limit supported by the proxy), these spikes may lead to application congestion on the proxy which has just become active.
Another drawback of UDP is that it does not provide either acknowledgment of received datagrams or retransmission mechanisms; SIP takes care of this at application level by using a simple retransmission algorithm.
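As a rough illustration of that retransmission algorithm, the sketch below computes when an unanswered INVITE would be resent over UDP. It assumes the RFC 3261 defaults (T1 of 500 ms, a retransmit interval that doubles each time, and a transaction timeout at 64*T1); the function name is made up for this example.

```python
T1 = 0.5  # seconds; RFC 3261 default round-trip estimate (assumed here)

def invite_retransmit_times(t1=T1):
    """Times (seconds after the first send) at which an unanswered INVITE is resent."""
    give_up = 64 * t1          # the client transaction times out at this point
    times = []
    interval = t1              # the retransmit timer starts at T1...
    elapsed = interval
    while elapsed < give_up:
        times.append(elapsed)
        interval *= 2          # ...and doubles after every retransmission
        elapsed += interval
    return times

schedule = invite_retransmit_times()
```

Every client with a pending request follows a schedule like this independently, which is why a proxy that comes back after a few seconds of outage is greeted by a burst of retransmissions on top of the normal load.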
SIP over Transmission Control Protocol (TCP)
TCP offers a lot more than UDP when it comes to congestion, retransmission and error control. However, TCP is a stream oriented protocol which was designed for transferring reliable chunks of data from host A to host B. Signaling with real-time constraints was not one of the design requirements for TCP.
Conceptually, a TCP based application sees received or sent data as a continuous flow, and this is correct for applications that copy files from remote hosts. For protocols like SIP, which send delimited messages over the same TCP connection, things get more complicated. Reads and writes on the TCP socket have to be serialized, and reading a SIP message from the stream is more complicated than with UDP, since the message may arrive in several TCP segments whose payloads are not delivered to the user space socket all at once.
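The framing problem can be sketched as follows: bytes are accumulated in a buffer, and a message is only handed to the parser once its header block and the body announced by Content-Length have fully arrived. This is a simplified sketch, not a complete SIP parser; the function name is hypothetical, header line folding is ignored, and a missing Content-Length is treated as an empty body.

```python
def extract_sip_messages(buffer: bytes):
    """Split a receive buffer into complete SIP messages plus leftover bytes."""
    messages = []
    while True:
        header_end = buffer.find(b"\r\n\r\n")
        if header_end == -1:
            break                                  # header block not complete yet
        body_start = header_end + 4
        body_length = 0
        # Skip the start line, then look for Content-Length (compact form: "l").
        for line in buffer[:header_end].split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() in (b"content-length", b"l"):
                body_length = int(value.strip())
        message_end = body_start + body_length
        if len(buffer) < message_end:
            break                                  # body not fully received yet
        messages.append(buffer[:message_end])
        buffer = buffer[message_end:]              # keep the rest for the next read
    return messages, buffer
```

The caller appends whatever each socket read returns to the leftover bytes and calls the function again, so a message split across several TCP segments is reassembled before it is parsed.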
Standard TCP implementations do not allow configuration of internal timers. Timing is important for SIP based applications though. For example, in the telecom world you need to be able to tell pretty fast whether your peer is still there or not.
TCP subsystems on modern operating systems offer some support for that: keep-alives. These are basically empty TCP segments having the ACK flag set, which are sent periodically in case of idle intervals to monitored peers. If the peer is still running it answers back an ACK, otherwise there is no answer and the local application knows that either the remote peer has crashed or there are problems at lower layers. Again, modern operating systems like Linux offer the possibility to configure the timeouts for the keep-alive mechanism:
- for how long a connection should be idle before sending keep-alives; one drawback here is that the minimum value for this parameter is 1 second
- how often the keep-alives should be sent; again the minimum value is 1 second
- how many keep-alives are sent before the peer is declared dead
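On Linux these three settings map onto per-socket options. The sketch below shows one way to set them from Python; the helper name and the default values are illustrative assumptions, and the three tuning options are only applied where the platform exposes them.

```python
import socket

def enable_keepalive(sock, idle_s=1, interval_s=1, probes=3):
    """Turn on TCP keep-alives; the three tuning options are platform-specific."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):            # present on Linux
        # Idle time before the first probe, in seconds.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        # Interval between probes, in seconds.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        # Number of unanswered probes before the peer is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    enable_keepalive(sock)
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
```

With the values above, a dead peer is noticed after roughly four seconds of silence at best, which shows why the 1 second floor matters for telecom-style failure detection.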
SIP over Stream Control Transmission Protocol (SCTP)
SCTP can be considered the Swiss army knife of transport protocols. It basically offers combined features of both UDP and TCP. UDP-like features are: message boundary preservation, unordered message delivery, one-to-many sockets at the application level. Among TCP-like features: positive (selective) acknowledgment, retransmission of lost data, windowed flow control, congestion control, one-to-one sockets at the application level. What makes SCTP unique are some features which do not appear in other transport protocols:
- multihoming
- multiple streams per connection
- built-in heartbeats
- much more flexible when it comes to configuring certain parameters - especially for controlling timing
- exposes asynchronously its internal states to the application level through the use of notifications
How much of this is useful for SIP? Message boundary preservation as in case of UDP will make reading/parsing of the SIP message easier; whereas unordered message delivery can help in case of head of line blocking.
What is really helpful for real time oriented applications is that SCTP sockets offer fine tuning of timer values and more details about what has happened on a certain association. The parameters that control the timers for association setup, retransmissions and heartbeats, are configurable per system and per socket. Transport layer failure detection can work fast when appropriate values for SCTP heartbeats are used.
Things get even simpler from the application layer programming perspective. The SCTP notifications provide the means of monitoring what happens with a certain SCTP socket and are standardized by the SCTP socket API. A broad range of asynchronous notifications are sent on the SCTP socket: association start-up, association setup attempt failure, transport-level events, remote operational errors, undeliverable messages.
There are of course pitfalls - SCTP is a relative newcomer in the transport protocols ecosystem. The SCTP socket API is a moving target still under development. Due to novelty, the level of complexity of some of the SCTP stack implementations is inversely proportional to the time spent on testing them; sometimes their performance in terms of throughput is not on a par with that offered by TCP.
Posted by Robert Sparks on Tue, Feb 23, 2010 @ 10:50 AM
One of the first articles posted to SIP Sessions discussed Postel's Maxim, also known as the Robustness Principle:
Be conservative in what you do, be liberal in what you accept from others.
That article gave examples of how robustness declines when individual implementations don't follow that principle. Today, I'd like to explore how the principle helps large systems and architectural design.
Imagine a system where all of the elements present are following this principle. Then introduce a single new element that doesn't.
- An element not following the first part of the principle will not be careful with what it emits. It might emit things that force edge conditions to occur, or even outright violate the protocols the system is using. The system as a whole will continue to behave robustly. Being liberal in what they receive, the remaining elements will forgive as much protocol violation as they can without emitting non-conformant messages themselves. They will have anticipated gracefully handling edge conditions. The system as a whole will not be significantly perturbed by the new element, and for the most part, the new element is likely to get whatever services it wanted.
- An element not following the second part of the principle will be over-sensitive to variations in the messages from others. In the perfect system discussed so far, all the other elements are being careful to send conforming messages, so the new element's brittleness won't be exposed. The system continues to work robustly, at least until the rest of the system starts to evolve.
Now take the same system and introduce a very large number (enough to be a majority) of new elements that don't follow this principle.
- If the new elements are not conservative in what they emit, the entire system is placed under strain. The previous elements are now executing exception code as part of normal operation. Edge cases become the norm for them. The system no longer meets the expectations used to make optimization decisions in the original elements. In many cases, however, the system still won't fail. But the robustness has been removed - used up. The next wave of non-conformance has a much higher chance of bringing the system down.
- If the new elements are overly sensitive to what they receive, the strain is not (initially) so great. Because the original elements are conservative, this sensitivity isn't exercised and the system will seem unperturbed. But then someone will have a bright idea and introduce a new, standard behavior into the system. These new elements, and remember they are the majority, will not accept this new behavior. A forklift upgrade of all of them will be required before a single new element trying to exercise this new behavior can be expected to work.
In short, a system that takes advantage of the robustness principle can survive even a large number of ill-behaved participants, but the cost is becoming brittle.
It's easy to overlook this resulting brittleness when trying to solve any particular problem in isolation. It might be tempting to say - "Well, it's ok if we violate this part of the specification, because this other part of the specification says conformant elements must not break if we behave this way." This is a trap. It's like saying "I can speed through every intersection, even if the lights are red, because the law says other drivers are required to make sure the intersection is safe to enter before doing so." It's even worse for a specification or an operational policy to encourage violating half of the robustness principle this way. That's like having a driving school telling its students to run through all the red lights.
Hey, it would work if the rest of the system were perfect in executing the "make sure the intersection is safe" rule.
That's where reality tears the notion down. The system I had you imagine at the beginning of the two exercises above cannot exist in the real world. Real systems will have imperfections. Systems that use the robustness principle make it less likely that those imperfections will result in failure. Introducing elements that don't follow the principle negates that robustness, and in real systems, will cause those systems to fail.
Posted by Vince Lesch on Tue, Feb 16, 2010 @ 05:26 PM
This is my first in what will become regular contributions to the Tekelec SIP Blog. In addition to covering SIP from a purely technical perspective, I will look at some of the traction and activities in the market. This installment focuses on the adoption of IMS, and hence SIP, by mobile operators as the foundation for supporting voice and SMS in LTE environments.
IMS, like many new technologies, has gone through a tremendous "hype cycle". There has been the typical name bashing - with claims such as IMS means "I Must Sell" new technologies and the ups and downs of "everyone is moving to IMS" and then "no one is moving to IMS". While initially developed by the mobile industry in 3GPP, it was widely adopted as a standard by the wireline and cable communities as well. At this point in time the number of commercial deployments of IMS has been modest by any measure and my guess is that there are probably more commercial wireline deployments of IMS than there are commercial wireless deployments. However, over the next few years that may all change.
On November 4th 2009, a relatively simple press release was made by a collection of mobile operators on the work that they had collaborated on to create an IMS profile that they named "One Voice". Simply put, this profile was created to help the industry secure a common standardized IMS voice (and SMS) solution. To do this, they defined a "profile" or specification that defines a common, recommended feature set from the 3GPP IMS specifications when multiple options exist for a single functionality. The goal of One Voice was in essence to have a common IMS profile to support Voice and SMS in an LTE environment.
Since then there has been considerable collaboration between the GSMA and participating One Voice companies to enable the profile work to be handed over to the GSMA to take the lead moving forwards. On February 15, 2010, GSMA announced that it had chosen to adopt the One Voice initiative and has given it its full backing. Having adopted the One Voice profile, the GSMA has opened the work to its entire membership (820+ operators and 220+ vendors) and will work with all interested companies to define protocols needed for LTE voice connectivity. It will also work to define the interfaces and functional architecture required to enable international roaming and establish interconnection policies between mobile operators, using the One Voice profile as the basis of that work. This will all result in the definition of end-to-end service principles for Voice over LTE. The sum total of work will be completed under the name of Voice over LTE, or VoLTE.
This work will hopefully benefit the entire operator and vendor community and move towards broadening the deployment of SIP and IMS-based technologies. The worldwide penetration of mobile phones greatly exceeds that of wireline phones so the adoption of IMS in wireless networks will greatly expand the use of SIP. Of course, the work will take time and LTE will not be deployed overnight, although there appears to be growing momentum for LTE in large part driven by the huge uptake in mobile data.
In the meantime, what do I expect to see? Well, many operators today have deployed VoIP in their core network as part of their R4 soft switch deployments. However, when the 3GPP R4 specifications were adopted they selected BICC - or Bearer Independent Call Control - as the signaling protocol because of the relative immaturity of SIP at the time. Over the last 12-18 months we have begun to see some of the R4 soft switches (or MSS) in mobile networks evolve to support SIP in addition to BICC and of course SIP has become the "winner" for next generation signaling. Recently we have also seen interest in BICC-to-SIP functionality to help mobile operators cost effectively transition to SIP and interwork with other SIP based networks (e.g., international transit networks) from their BICC-based mobile core.
So will IMS happen in mobile networks as part of the deployment of LTE? More than likely the answer is yes based on the current direction being pushed by the mobile operators behind the VoLTE initiative. Of course many other questions remain, including: over what time frame will this happen, what is the evolution path, and how long will hybrid 3G-4G networks exist and what challenges will these hybrid networks create for operators? I will likely address these and other topics in future posts.
Vince Lesch
Tekelec CTO
Posted by Jiri Kuthan on Tue, Feb 09, 2010 @ 03:52 PM
It may be of interest to readers of this blog that recently a thought-provoking Internet Draft was published by Henry Sinnreich et al: SIP APIs for Communications on the Web - draft-sinnreich-sip-web-apis-00. In a nutshell, the Internet Draft offers a critique of SIP and suggests consistently using HTTP technology for VoIP, alongside other Internet applications.
The question is whether the critique is justified, and whether the shortcomings it mentions cause enough pain to make abandoning SIP worth the effort. I think the critique is largely fair; I'm less sure that there is a time window in which the level of pain can cause such a change.
The critique cites several reasons: standards' complexity, insufficient consideration of NAT traversal and difficulties to link to web apps - to name the most significant ones. The complexity statement can be easily confirmed by looking at the VoIP RFC Watch site maintained by Nils Ohlmeier or by checking the SIP WG RFC dependency graph. And, NAT traversal has proven itself to be a large problem resulting in the formation of a new network element type - the session border controller (SBC). And non-voice apps are indeed still keeping us waiting.
The real question to me is whether these difficulties will create enough motivation to change from SIP to Web technology. The window of opportunity seems closed: complexity can be somewhat lowered; like it or not, we have SBCs for NAT traversal; and web-apps can be built with or without SIP.
Still, isn't the number of these workarounds a good enough reason to give the HTTP legacy a try? I have recently found out that remote sharing of "smart boards" (i.e. whiteboards with direct output to PCs) works over web port 80 via a third party to facilitate NAT and firewall traversal. Most video apps are using HTTP (even though typically one-way), and the number of emails exchanged via HTTP certainly creates a fair traffic share. To me it looks like we're already giving HTTP a try...
Posted by Ben Campbell on Wed, Feb 03, 2010 @ 09:36 AM
My last post described how MSRP endpoints use SDP to set up sessions. Today, we'll discuss how the MSRP protocol uses the results of the SDP offer/answer exchange.
Each endpoint builds a "target path" that it will use for all MSRP communication with its peer. If an endpoint does not use a relay, then the target path is exactly the same as the SDP path attribute value it received from its peer. On the other hand, if the endpoint does use a relay, then it forms the target path by prepending the URI that it got from each relay to the peer's SDP path attribute value. In both cases, the target path forms a roadmap of how to get an MSRP request to the peer, i.e. a list of MSRP URIs showing each hop to visit on the way, ending with the URI of the destination device.
If you recall from last time, Alice sent Bob an SDP offer containing the following:
a=path:msrps://alice.example.com:7654/asfd34;tcp
Bob responded with an SDP answer containing this:
a=path:msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp
Since Alice didn't introduce a relay (Bob's relay doesn't count here), she's got life easy. Her target path is exactly what Bob sent her:
msrps://relay.example.net:8211/asfioef;tcp msrps://bob.example.net:6581/asfd34;tcp
Okay, I lied a little bit about Bob's relay not counting. The relay clearly exists in the path, because Bob put it there. But since Alice did not introduce a relay on her own behalf, she uses Bob's path value as-is, without worrying about relays in general.
On the other hand, Bob did introduce a relay. So to get his target path, he takes the path Alice sent him, and prepends his own relay, and gets this:
msrps://relay.example.net:8211/asfioef;tcp msrps://alice.example.com:7654/asfd34;tcp
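The rule for building a target path is small enough to sketch in a few lines. The function name is hypothetical, and the sketch assumes an endpoint lists its own relays nearest-first; the URIs are the ones from the example above.

```python
def build_target_path(peer_path, own_relays=()):
    """Prepend this endpoint's own relay URIs (if any) to the path sent by the peer."""
    return list(own_relays) + list(peer_path)

# Path attribute values from the SDP offer (Alice) and answer (Bob).
alice_path = ["msrps://alice.example.com:7654/asfd34;tcp"]
bob_path = [
    "msrps://relay.example.net:8211/asfioef;tcp",
    "msrps://bob.example.net:6581/asfd34;tcp",
]
bob_relays = ["msrps://relay.example.net:8211/asfioef;tcp"]

alice_target = build_target_path(bob_path)              # Alice has no relay of her own
bob_target = build_target_path(alice_path, bob_relays)  # Bob prepends his relay
```

Note that Bob's relay shows up in both results: in Alice's because Bob advertised it, and in Bob's because he added it himself.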
Alice and Bob are now almost ready to exchange MSRP messages. But there's one more step that must happen first. Alice must open a TCP connection towards Bob. Note that RFC 4975 says the offerer always opens the connection towards the answerer. There's work afoot to allow MSRP endpoints to negotiate the connection direction using COMEDIA, but for now let's assume Alice and Bob are using RFC 4975 as-is.
Alice connects to the first device in her target path. In this case, that's Bob's relay. She uses the DNS to get an IP address for "relay.example.net" and opens a TCP connection to port 8211, and starts sending messages. That's assuming she doesn't already have such a connection--for example, she might already have an MSRP session in progress with someone else that uses the same relay as Bob. In that case, she just uses the connection she already has.
Now, let's pretend for just a moment that things were reversed, and Bob had sent the original offer. The first device in his target path is also "relay.example.net". But since he already set up a connection to that relay when he authenticated with it, he doesn't need to set up a new one. He just reuses the one he already has. The relay would establish a connection to the next hop (Alice, in this case) on demand.
When either endpoint wants to send a message to the other, it constructs a SEND request with the message content in its payload. The endpoint puts its target path in a To-Path header field, and its own URI in the From-Path header field. It then sends the request to the first device in the To-Path. If that device is the peer, then that's pretty much all there is to it. If the first device is a relay, the relay removes its own URI from the To-Path and prepends it to the From-Path, then relays the request downstream.
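Here is a minimal sketch of that header shuffling, using plain dictionaries rather than real MSRP framing. The function names and dictionary keys are made up for illustration, and the symbols "Alice", "Bob" and "Relay" stand in for the actual URIs.

```python
def make_send(target_path, own_uri, content):
    """Build a SEND request: To-Path is the target path, From-Path is our own URI."""
    return {
        "method": "SEND",
        "to_path": list(target_path),
        "from_path": [own_uri],
        "body": content,
    }

def relay_forward(request, relay_uri):
    """Move the relay's URI from the front of To-Path to the front of From-Path."""
    if not request["to_path"] or request["to_path"][0] != relay_uri:
        raise ValueError("request is not addressed to this relay")
    forwarded = dict(request)
    forwarded["to_path"] = request["to_path"][1:]
    forwarded["from_path"] = [relay_uri] + request["from_path"]
    return forwarded

sent_by_alice = make_send(["Relay", "Bob"], "Alice", "Hi Bob!")
seen_by_bob = relay_forward(sent_by_alice, "Relay")
```

By the time the request reaches Bob, its To-Path holds only Bob and its From-Path lists the relay followed by Alice, which is the route back to the sender.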
Here's a picture for Alice and Bob. I've replaced the actual URIs with the symbols "Alice", "Bob", and "Relay" to try to keep it readable.
At this point, you are probably (and rightly) wondering what the point is of a relay moving its URI to the From-Path header field when it relays a request. This allows a downstream device to send a response to a SEND request, in the form of a REPORT request. Don't confuse this with a message from the human Bob in response to a message from the human Alice. That would simply be another SEND request in the opposite direction. Instead REPORT requests carry delivery information about the original request.
The peer device, and any relays in between can originate a REPORT request back to the endpoint that sent a SEND request. They do this by inserting the From-Path that they observed in the SEND request into the To-Path of the REPORT request. Here's a picture showing a REPORT request sent by Bob, and another by Bob's relay.
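The same idea in sketch form, again with dictionaries and the symbols "Alice", "Bob" and "Relay" instead of real URIs. The function name, the dictionary keys and the "delivered" status string are placeholders for illustration, and putting the reporter's own URI in the From-Path is an assumption of this sketch.

```python
def make_report(observed_send, own_uri, status):
    """Build a REPORT whose To-Path is the From-Path observed on the SEND request."""
    return {
        "method": "REPORT",
        "to_path": list(observed_send["from_path"]),
        "from_path": [own_uri],        # assumption: the reporter identifies itself here
        "status": status,              # placeholder, not real MSRP status syntax
    }

# The SEND request as the relay sees it, and as Bob sees it after relaying.
send_at_relay = {"method": "SEND", "to_path": ["Relay", "Bob"], "from_path": ["Alice"]}
send_at_bob = {"method": "SEND", "to_path": ["Bob"], "from_path": ["Relay", "Alice"]}

report_from_relay = make_report(send_at_relay, "Relay", "delivered")
report_from_bob = make_report(send_at_bob, "Bob", "delivered")
```

Bob's REPORT retraces the path through the relay, while the relay's own REPORT goes straight to Alice.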

That's enough for now. Next time, I'll talk about how MSRP messages can be broken into "chunks" in order to multiplex multiple sessions across the same connection.
Posted by Adam Roach on Tue, Jan 26, 2010 @ 02:30 PM
This is the final post in the series on SIP-I and SIP-T deployment challenges. You may wish to read the Introduction to SIP-I and SIP-T post for some general background on these two protocols before continuing.
The difficulties in interworking features between the PSTN and SIP networks stem predominantly from two areas. The first is the different models for where services are implemented -- the PSTN expects them to be performed by the network, while SIP is designed for them to be implemented in the end points. The second is the fact that SIP defines tools instead of named services, while the PSTN relies on a discrete and highly constrained set of standardized services.
To be clear, most basic services work just fine in a mixed-protocol network. Call waiting, call forwarding, calling party identification -- in fact, almost all of the most popular PSTN CLASS services work just fine.
A good demonstration of how these two issues can cause problems is the class of services commonly known as "Call Completion." (This service goes by a large number of other names, such as "Auto Callback" and "Camp On Extension.") At a high level, the call completion service works as follows: a calling party attempts to contact a called party; however, the attempt is unsuccessful (either because the called party is busy or because they do not answer). The calling party then activates the Call Completion service. When the called party becomes available (is no longer busy, or demonstrates availability by changing hookstate), the calling party is alerted. The calling party then picks up the phone, and the call proceeds as normal.
To examine why this is not trivial to interwork, we need to understand how ISUP implements this service and how SIP implements it.
The ISUP version of this service is defined in the ITU-T documents Q.733.3 and Q.733.5. At a high level: during a call setup attempt, the ISUP Address Complete (ACM) message includes an indication that a called party's end office supports the Call Completion service. If the calling party wants to activate the Call Completion service, then TCAP is used to convey an activation request. When the called party is available, TCAP is again used to indicate that the remote user is free. The calling party's end office alerts the calling party and waits for their phone to go offhook. The end office then attempts a new call to the called party. The ISUP Initial Address Message (IAM) indicates that this call attempt is a result of the Call Completion service, to allow proper handling at the called party end office. From this point, the call proceeds pretty much as normal.
The overall call flow looks something like this:
By contrast, the tool SIP uses to perform this service is the dialog event package, defined in RFC 4235. The dialog event package allows a user to subscribe to the state of sessions at another user's device or devices. In other words, by subscribing to the dialog event status of another user, I can tell whether that other user is busy in one or more calls. I can also tell when their terminal goes on-hook or off-hook (when those concepts apply to the terminal they are using).
Of course, this tool enables a lot more than the Call Completion service -- but it can be used to implement a Call Completion service in the SIP network. At a high level, the call flow actually looks pretty similar to the SS7 version: the calling party sends an INVITE to the called party, which indicates support for the dialog event package in its provisional responses (e.g., 180 Ringing) and final response (e.g., 486 Busy). The calling party can then subscribe to the dialog event package, and learn when the called party is available again. (This can be greatly enhanced using SIP presence information about the called party, but we'll stay focused on replicating the PSTN functionality in this example.) The calling party's terminal then tracks the state of the called party's calls and terminals. When it determines that the called party is available for a call, it alerts its local user, and sends a new INVITE to the called party.
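To make the tracking step a little more concrete, here is a minimal Python sketch of the bookkeeping the calling party's terminal might do. It assumes the dialog event NOTIFY bodies have already been parsed into (dialog id, state) pairs. The class name `CallCompletionWatcher` and its method are made up for illustration; only the state names come from the dialog event package.

```python
from dataclasses import dataclass, field

# Dialog states defined by the dialog event package (RFC 4235).
# Any dialog in one of these states is treated here as "in progress".
BUSY_STATES = {"trying", "proceeding", "early", "confirmed"}


@dataclass
class CallCompletionWatcher:
    """Hypothetical helper that tracks the called party's dialogs."""

    active_dialogs: set = field(default_factory=set)
    seen_busy: bool = False

    def on_dialog_state(self, dialog_id: str, state: str) -> bool:
        """Record one state change; return True when the local user should be alerted."""
        if state in BUSY_STATES:
            self.active_dialogs.add(dialog_id)
            self.seen_busy = True
        elif state == "terminated":
            self.active_dialogs.discard(dialog_id)
        else:
            raise ValueError(f"unknown dialog state: {state}")
        # Alert only on the transition from "busy" to "no dialogs left".
        if self.seen_busy and not self.active_dialogs:
            self.seen_busy = False
            return True
        return False
```

A real terminal would also have to handle subscription expiry and decide what "available" means for a called party with several devices; this sketch simply waits until every dialog it has heard about is terminated.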
The call flow looks something like this; note that no network servers are shown in this call flow because they do not participate in the service:
The issue that arises is due to the difference between ISUP's call completion service and SIP's dialog event package tool. ISUP's service is very narrowly focused on making this one specific use case work. None of its procedures or messages can be re-used to implement new services. By contrast, the SIP tool can be used for myriad services, such as enhancing multiparty conferences, enabling certain types of advanced third-party call control, and implementing shared-line behavior on multiple devices. And, of course, it can be deployed in new and clever ways to create services that we haven't even thought of yet.
The problem is that the PSTN gateway can tell that the called party supports the call completion service, but can't actually get more general information about the calls that the called party is involved in. And there is no way for the gateway to tell the calling party "I support the call completion service, but don't have enough information to actually do the dialog event package" -- because, in SIP, we use tools, not services.
Similar problems arise with distinctive alerting, call parking, line sharing, and several other advanced services. Luckily, the IETF has formed a working group, BLISS, to tackle these issues. BLISS has been working in concert with TISPAN and other standards groups to ensure that the solutions work well with existing solutions in the PSTN. So, unlike the other deployment challenges we've gone over, this one is likely to get better as time goes on.
Posted by Dorgham Sisalem on Tue, Jan 19, 2010 @ 05:30 PM
Some time ago my colleague Jiri Kuthan recommended that I read RFC5218. In it the authors discuss what makes protocols succeed or fail. A successful protocol is defined as one that meets its design goals and is widely deployed. The authors present some factors which they believe to be crucial for the success of a protocol and present some use cases in which they apply these factors to some successful and failed protocols. Among these factors the authors list the design, extensibility and openness of networking protocols.
While reading the RFC I started thinking, what would be the result of applying these factors on SIP:
Initial success factors: These are the factors that help a protocol become successful in the initial phase of its deployment:
- Positive net value: SIP obviously solves a problem; namely that of establishing a session in IP networks. While SIP bears the promise of enabling all kinds of sessions it is mostly used for establishing voice calls. In this context it does not offer more functionality than traditional SS7 signaling, H.323 or Skype. The real positive net value of SIP is hence demonstrated when operators start deploying more SIP-based services such as presence and application servers that offer more flexible and intelligent communication services than we have today.
- Incremental deployment: SIP can be deployed without having to update the network routers. However, unlike the arguably most successful Internet protocol, HTTP, it is not sufficient to provide a server and a client. For a communication service to be of use there must be a lot of clients and users available. While there are already different providers offering VoIP services using SIP with millions of users, these providers act as islands that are connected over the PSTN. Hence, in order for SIP to excel on this point, more SIP-based peering between providers is needed.
- Open code availability: There are already different open source components needed for a SIP service. The SIP Express Router is an excellent and widely used SIP proxy. Asterisk and SEMS offer flexible and easy to use media services such as IVR or conferencing. On the user agent side, there are also different implementations of different quality.
- Restriction free: SIP is provided as a patent-free technology for all.
- Open specifications: The SIP specifications are provided by IETF and are open.
- Open maintenance: SIP is maintained by the IETF and is extended and fixed continuously. While this is surely a good thing, this has also led to a load of specifications that some might claim are too much.
- Good technical design: While SIP was being hailed at the beginning as the simpler alternative to H.323, it has gained a lot of weight over the years. Taking the same comparison factors used in RFC5218 - namely security and congestion control - then SIP does not seem so perfect as congestion control is not considered and it does not have a powerful concept for identity management. Also, deployment issues such as NAT traversal were only added at later stages.
Wild success factors: These are the factors that contribute to success and wide deployment:
- Extensible: While designed in the early stage for simple calls, SIP is now used for multi-party calls, presence and trunking scenarios. Also, the integration of new applications and services should be rather straightforward as SIP is not restricted to a certain usage scenario.
- Scalability: While we still do not have any experience regarding the cost and complexity of building a SIP infrastructure for hundreds of millions of users, I do not see a real reason why this could not be done.
- Security: SIP has different mechanisms for authenticating users and protecting the signaling traffic. However, it does not have explicit mechanisms for protection against DoS attacks or fraud.
Discussion
Looking at the points above it looks like SIP has more or less a positive result on the discussed factors. However, getting positive marks on the evaluation factors does not mean that a protocol will be a success. If we evaluate Skype based on these parameters then we should conclude that Skype should fail. There is no open source code or open specifications and the net value is not much higher than PSTN or SIP. However, the number of users of Skype is higher than that of SIP.
So does this mean that SIP will become a wild success? Well, I guess the answer is a very definite maybe! The success or failure of a protocol can only be judged 5 to 10 years after finishing the standardization - so we still have a few years in front of us. But it has the needed success factors, and with more applications, peering relations and clearer business models, the chances that SIP will be wildly successful are pretty good.
Posted by Ravi Ravishankar on Thu, Jan 14, 2010 @ 09:47 PM
Mobile data traffic is growing beyond our imagination. After the introduction of the latest iPhone in June, Google stated that the mobile upload of video increased by 400% over the previous day. The introduction of Android powered phones including the recently launched Nexus One will only increase this rate. Over the next four years, it is projected that global mobile traffic will exceed 50,000 Terabytes per day! I believe that this humongous data avalanche that is taking over the mobile world will remain the single most important factor that will influence mobile infrastructure trends in 2010 and beyond.
So if I have to pick the top three mobile infrastructure trends for 2010 what will they be?
- 3G/4G Network expansion will continue through this year at an accelerated pace. We will see more and more deployments of HSPA, HSPA+ and LTE.
- Convergence of Wireless LAN/WAN - Traffic growth may be huge, but operators are yet to find that magic formula to monetize all this traffic growth. That means cost optimization of network expansion is critical. Wi-Fi and Femtocell micro-sites will complement 3G/4G networks by offloading much of the data traffic.
- Just throwing money into mobile bandwidth infrastructure will not by itself address the problem of exploding mobile data traffic. Networks will get smarter in handling the traffic. Better traffic management and differentiated treatment of different traffic types are the essential short-term solution. In the long run, the industry also needs innovative pricing that monetizes different types of traffic that can fund infrastructure growth. Net Neutrality driven regulation may be a wild card that will influence this in certain regions, but I still think this will be the case globally.
What would be your top 3 trends?
Posted by Robert Sparks on Wed, Jan 06, 2010 @ 05:02 PM
As described in an earlier FAQ, SIP Events uses a notion of a "package" to determine what kind of information is being asked for, what kind of change will cause a notification to be sent, and what the available options are for encoding the information in a NOTIFY request.
The current set of standardized SIP Event packages is maintained at the sip-events namespace registry at IANA. At the beginning of 2010, there are thirteen registered packages, and one special thing called a "template-package": winfo.
Subscribing to this template-package will give you "Watcher INFOrmation": details of each subscription to a particular event. For instance, I could subscribe to "presence.winfo" for sip:RjS@tekelec.com to see who is watching my presence.
Template packages are never used directly - they must be applied to regular packages. In other words, it isn't possible to subscribe to "winfo", only to events like "presence.winfo" or "message-summary.winfo".
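As a rough illustration of that rule, here is a small Python sketch that splits an Event header value into its base package and any templates applied to it, and rejects a bare template. The function names and the `KNOWN_TEMPLATES` set are assumptions made for this example, not part of any SIP library.

```python
# The only template-package registered so far, per the discussion above.
KNOWN_TEMPLATES = {"winfo"}


def split_event_package(event: str) -> tuple[str, list[str]]:
    """Split a value such as 'presence.winfo' into ('presence', ['winfo'])."""
    # Drop any header parameters (e.g. ';id=7') before splitting on dots.
    name = event.split(";", 1)[0].strip()
    package, *templates = name.split(".")
    return package, templates


def is_subscribable(event: str) -> bool:
    """Return True if the event names a regular package, optionally with templates."""
    package, templates = split_event_package(event)
    # A template on its own (plain 'winfo') is not a valid event.
    if not package or package in KNOWN_TEMPLATES:
        return False
    return all(t in KNOWN_TEMPLATES for t in templates)
```

With these definitions, `is_subscribable("presence.winfo")` is true and `is_subscribable("winfo")` is false. The sketch does not check that the base package itself is registered.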
The template package concept was introduced to make it easier to build packages that would extend every other existing package the same way. It would have been possible to build the same system without template packages by creating separate "presence-winfo", "message-summary-winfo", etc. packages, but each of those would have to respecify the common behavior. Having this meta-package tool avoids that extra specification work (and makes it less likely that watcher information for package "foo" and for package "bar" would behave in subtly different ways).
The concept has been with us for nearly a decade, and the only event-template package we've found a need for is winfo. It may be the only one that ever exists.
Like many aspects of SIP Events, winfo was driven by presence. When a new person tries to add me to their list, I need a way to find out so that I can give the service permission to hand my presence to that new person. My client needs a nudge so it can ask me whether I would like to allow or deny the subscription. Early attempts at a solution involved having the server send me a QAUTH request before answering the SUBSCRIBE from this new person. That turned out to be a dead-end for two reasons. First, like all SIP non-INVITE requests, QAUTH had to get an answer within 32 seconds (64*T1, where T1's default value is 500ms). If I didn't happen to be sitting in front of my computer, notice the dialog, and answer within that time, the wrong thing happened. Second, if this new person and I never happened to be online at the same time, authorization would never complete.
To solve these problems, we reused SIP Events itself - using the winfo template-event package to subscribe to changes in the set of watchers for any other package, like presence. The initial NOTIFY for a winfo subscription will describe each of the existing subscriptions detailing who the subscriber is, how long the subscription has been in place, when it will expire (if it isn't refreshed), and what the current authorization state for the subscription is. (Remember that subscriptions can enter a "pending" state if a server doesn't have authorization when the SUBSCRIBE arrives). To solve the never-online-at-the-same-time problem, winfo carries one more state, named "waiting", for subscriptions which were attempted recently but for which authorization was not available.
Here's a short example of winfo in action. Assume at the beginning of this flow that I've authorized Ben and Adam to see my presence, but not Theo. Note that in this flow, Theo and I are never online at the same time.

The winfo NOTIFY body format is XML. The initial NOTIFY for a winfo subscription will have a complete list of current watchers. Subsequent NOTIFYs will only contain information for those watchers whose subscription has changed state.
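Here is a minimal Python sketch of how a client might keep its watcher list up to date from these bodies. The element and attribute names follow my reading of RFC 3858, but treat the sample documents as illustrative: the watcher ids, the example.com addresses, and the `event` values are made up, and the function name `apply_winfo` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by watcherinfo documents (RFC 3858).
NS = {"wi": "urn:ietf:params:xml:ns:watcherinfo"}

# Illustrative initial NOTIFY body: the complete list of current watchers.
SAMPLE_FULL = """<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo" version="0" state="full">
  <watcher-list resource="sip:RjS@tekelec.com" package="presence">
    <watcher id="w1" status="active" event="approved">sip:ben@example.com</watcher>
    <watcher id="w2" status="active" event="approved">sip:adam@example.com</watcher>
  </watcher-list>
</watcherinfo>"""

# Illustrative later NOTIFY body: only the watcher whose state changed.
SAMPLE_PARTIAL = """<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo" version="1" state="partial">
  <watcher-list resource="sip:RjS@tekelec.com" package="presence">
    <watcher id="w3" status="waiting" event="timeout">sip:theo@example.com</watcher>
  </watcher-list>
</watcherinfo>"""


def apply_winfo(watchers: dict, xml_text: str) -> dict:
    """Return the watcher table after applying one winfo NOTIFY body."""
    root = ET.fromstring(xml_text)
    # A "full" document replaces what we knew; a "partial" one only updates it.
    updated = {} if root.get("state") == "full" else dict(watchers)
    for w in root.iterfind("wi:watcher-list/wi:watcher", NS):
        updated[w.get("id")] = {
            "uri": (w.text or "").strip(),
            "status": w.get("status"),
            "event": w.get("event"),
        }
    return updated
```

Calling `apply_winfo({}, SAMPLE_FULL)` and then passing the result along with `SAMPLE_PARTIAL` leaves three watchers in the table, with Theo in the "waiting" state. The sketch ignores the `version` attribute and does not separate multiple watcher lists, both of which a real client would need to handle.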
For more details on the format of the winfo NOTIFY bodies, see RFC 3858. The winfo template-package itself is in RFC 3857.