
An introduction to SIP and SIP functions

The Session Initiation Protocol (SIP) is the primary protocol that's used by most VoIP and unified communications (UC) products. This is part 1 of a three-part series that will explain how SIP works.

Because SIP underpins most VoIP and UC products, I wanted to take the opportunity to introduce you to the protocol. In this series of three articles, I will explain how SIP works.

Before I begin

One of my goals in writing this series is to make the information that I am presenting both practical and easy to understand. That being the case, I am going to avoid taking a bit-level approach to describing the protocol and focus instead on its functions. I'm doing this because bit-level descriptions are typically useful to only a small group of people. If you need more detailed information than I am providing here, you can consult the SIP specification, RFC 3261.

SIP's five main functions

SIP's primary job is to control user sessions. As such, SIP contains five primary functions that allow it to perform various session-related tasks.

The first of these SIP functions is the user location function. As I'm sure you know, UC deployments often involve multiple networks, each containing multiple types of devices. SIP therefore has to be able to determine where on the network the end user can currently be reached and which end systems will be used for the session.
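To make the idea of user location more concrete, here is a minimal Python sketch of an in-memory lookup table that maps a user's public SIP address to the devices where that user can currently be reached. The class name `LocationService`, its methods, and the example addresses are all hypothetical and chosen only for illustration; a real deployment would also deal with expiry times, authentication and persistent storage.

```python
from collections import defaultdict


class LocationService:
    """Toy in-memory location service (hypothetical, for illustration only).

    Maps a user's public SIP address to the contact URIs of the devices
    where that user can currently be reached.
    """

    def __init__(self):
        self._bindings = defaultdict(list)

    def register(self, address, contact_uri):
        """Record that the user can be reached at contact_uri."""
        if contact_uri not in self._bindings[address]:
            self._bindings[address].append(contact_uri)

    def unregister(self, address, contact_uri):
        """Remove a device, for example when a softphone shuts down."""
        if contact_uri in self._bindings[address]:
            self._bindings[address].remove(contact_uri)

    def lookup(self, address):
        """Return every device currently bound to the user (possibly none)."""
        return list(self._bindings.get(address, []))


# Example: one user reachable on two devices (documentation-range addresses).
service = LocationService()
service.register("sip:alice@example.com", "sip:alice@192.0.2.10:5060")
service.register("sip:alice@example.com", "sip:alice@198.51.100.7:5060")
contacts = service.lookup("sip:alice@example.com")
print(contacts)
```

The point of the sketch is only that one user address can resolve to several end systems, which is why the location step has to come before a session can be set up.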

The second SIP function is user availability. This function is best known for the way that it is used in providing presence information. End users can tell the system that they are available to talk or that they are busy and do not wish to be disturbed.

The third SIP function is the user capabilities function. The basic idea behind this function is that different devices have different capabilities. For example, there are many things that a computer is capable of doing that a phone is not. The user capabilities function allows SIP to determine the media to be used and the parameters associated with that media type. For example, will the user be communicating using voice, video or something else?

The fourth SIP function is the session setup. This is the function that is responsible for connecting a call. It establishes session parameters for both the caller and the recipient of the call.
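As a rough illustration of what session setup looks like on the wire, the following Python sketch assembles a minimal SIP INVITE request as plain text, with a session description as its body. The function name `build_invite`, the addresses and the session description values are assumptions made for this example. A real user agent would also handle responses, retransmission, routing and authentication, none of which is shown here.

```python
import uuid


def build_invite(caller, callee, caller_host, sdp_body):
    """Return a minimal SIP INVITE request as a string (illustrative only)."""
    # Branch parameters in SIP start with the fixed prefix "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex + "@" + caller_host
    # Content-Length counts the bytes of the message body.
    body_length = len(sdp_body.encode("utf-8"))
    lines = [
        f"INVITE {callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {caller_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{callee}>",
        f"From: <{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller_host}>",
        "Content-Type: application/sdp",
        f"Content-Length: {body_length}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + sdp_body


# Hypothetical session description offering a single audio stream.
sdp = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
]) + "\r\n"

request = build_invite(
    "sip:alice@example.com", "sip:bob@example.com", "192.0.2.10", sdp
)
print(request)
```

The headers identify the caller, the recipient and the call, while the body carries the session parameters being proposed.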

The fifth of the primary SIP functions is the session management function. This is the function that allows users to end a call, transfer a call to someone else, or make modifications to the session parameters.

The protocol stack

Now that I have shown you SIP's five basic functions, I want to take a step back and talk about how SIP fits in with the rest of the IP-based communications that are taking place on a network.

The most important thing you need to know about SIP is that it is designed only to create, manage and terminate sessions. SIP does not provide any services by itself; rather, it lays the groundwork for services to be provided by other protocols.

In Figure A, you can see that SIP is an application-layer protocol and is designed to work in parallel with other multimedia protocols, such as the Real Time Streaming Protocol (RTSP) and the Real-time Transport Protocol (RTP). Of course, this diagram is just an example. There are numerous other protocols that can be used to create a full-blown multimedia experience. For example, the Media Gateway Control Protocol (MEGACO) is also commonly used to control gateways to the public switched telephone network (PSTN).

Figure A

The SIP protocol resides at the application layer of the OSI model.

In this diagram, the Session Description Protocol (SDP) provides SIP with a session description. This allows SIP to be aware of the type of session that needs to be established. SIP is then able to communicate through the IP network and use the session description to establish a session with the requested host.
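A session description of this kind is usually a short block of text in which each line starting with `m=` announces one media stream. The Python sketch below pulls the media type, port, transport and format list out of those lines. The function name `media_from_sdp` and the sample description are hypothetical, and the parser deliberately ignores every other line type, so treat it as an illustration rather than a complete parser.

```python
def media_from_sdp(sdp_text):
    """Return one dict per m= line found in a session description."""
    media = []
    for line in sdp_text.splitlines():
        if not line.startswith("m="):
            continue
        parts = line[2:].split()
        if len(parts) < 4:
            continue  # malformed media line; skip it in this sketch
        kind, port, transport = parts[0], parts[1], parts[2]
        media.append({
            "type": kind,
            # The port field may carry a "/count" suffix; keep the base port.
            "port": int(port.split("/")[0]),
            "transport": transport,
            "formats": parts[3:],
        })
    return media


# Hypothetical description offering one audio and one video stream.
sample = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "m=video 51372 RTP/AVP 31",
    "a=rtpmap:31 H261/90000",
]) + "\r\n"

for stream in media_from_sdp(sample):
    print(stream["type"], stream["port"], stream["transport"], stream["formats"])
```

Reading the media lines this way answers the question of whether the session will carry voice, video or something else.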

I have also shown in this diagram how RTSP and RTP can be used alongside SIP. Neither of these protocols is an absolute requirement; they are just in the diagram for demonstration purposes. RTSP controls the delivery of streaming audio or video. RTP transports the real-time audio or video itself and, together with its companion control protocol RTCP, provides QoS feedback about the session.


In this article, I have talked about SIP's primary functions. In the next article in this series, I will discuss the various verbs associated with SIP and what they are used for.

Read part 2 of this SIP introduction series, an introduction to SIP verbs.
Read part 3 of this SIP introduction series, an introduction to SIP packet routing.

About the author:
Brien M. Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server and IIS. Brien has served as CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. As a freelance technical writer, he has written for Microsoft, CNET, ZDNet, TechTarget, MSD2D, Relevant Technologies and other technology companies. You can visit Brien's personal website at
