Guide to Cisco Systems' VoIP Infrastructure Solution for SIP
Chap 1: Overview of the Session Initiation Protocol
This chapter provides an overview of SIP. It includes the following sections:
Introduction to SIP
How SIP Works
SIP Versus H.323
Session Initiation Protocol (SIP) is the Internet Engineering Task Force's (IETF's) standard for multimedia conferencing over IP. SIP is an ASCII-based, application-layer control protocol (defined in RFC 2543) that can be used to establish, maintain, and terminate calls between two or more end points.
Like other VoIP protocols, SIP is designed to address the functions of signaling and session management within a packet telephony network. Signaling allows call information to be carried across network boundaries. Session management provides the ability to control the attributes of an end-to-end call.
SIP provides the capabilities to:
Conferences can consist of two or more users and can be established using multicast or multiple unicast sessions.
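SIP is a peer-to-peer protocol. The peers in a session are called User Agents (UAs). A user agent can function in one of the following roles: as a user agent client (UAC), which initiates a SIP request, or as a user agent server (UAS), which receives a SIP request and returns a response on behalf of the user.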
Typically, a SIP end point is capable of functioning as both a UAC and a UAS, but it functions as only one or the other in any given transaction. Which role the end point takes depends on which UA initiated the request: the UA that sends the request acts as the UAC, and the UA that receives it acts as the UAS.
From an architecture standpoint, the physical components of a SIP network can be grouped into two categories: clients and servers. Figure 1-1 illustrates the architecture of a SIP network.
Figure 1-1: SIP Architecture
SIP clients include:
SIP servers include:
SIP is a simple, ASCII-based protocol that uses requests and responses to establish communication among the various components in the network and to ultimately establish a conference between two or more end points.
Users in a SIP network are identified by unique SIP addresses. A SIP address is similar to an e-mail address and is in the format of sip:userID@gateway.com. The user ID can be either a user name or an E.164 address.
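The address format described above can be illustrated with a minimal Python sketch. It assumes only the simple sip:userID@host form shown in this section; the names SipAddress and parse_sip_address are hypothetical, and the E.164 check (an optional "+" followed by up to 15 digits) is a simplifying assumption rather than a full validation.

```python
import re
from typing import NamedTuple


class SipAddress(NamedTuple):
    user_id: str   # user name or E.164 number
    host: str      # domain or gateway name
    is_e164: bool  # True when the user ID looks like an E.164 number


# Simplified pattern: "sip:" + user ID + "@" + host. Real SIP URLs also allow
# ports, passwords, and parameters, which this sketch deliberately ignores.
_SIP_RE = re.compile(r"^sip:([^@\s]+)@([^@\s;:]+)$")

# Assumed E.164 check: optional "+" followed by 1 to 15 digits.
_E164_RE = re.compile(r"^\+?[0-9]{1,15}$")


def parse_sip_address(address: str) -> SipAddress:
    """Split a simple SIP address into its user ID and host parts."""
    match = _SIP_RE.match(address)
    if match is None:
        raise ValueError(f"not a simple SIP address: {address!r}")
    user_id, host = match.groups()
    return SipAddress(user_id, host, bool(_E164_RE.match(user_id)))
```

For example, parse_sip_address("sip:alice@gateway.com") yields a user ID of "alice" and a host of "gateway.com", with is_e164 set to False.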
Users register with a registrar server using their assigned SIP addresses. The registrar server provides this information to the location server upon request.
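As a rough illustration of this relationship, the following Python sketch models a registrar that records where each SIP address can currently be reached and a location server that asks the registrar for that information on request. The class and method names are hypothetical, and the in-memory dictionary stands in for whatever storage and query protocol a real deployment would use.

```python
class Registrar:
    """Records the contact addresses that users register (in-memory sketch)."""

    def __init__(self):
        self._bindings = {}  # SIP address -> list of contact addresses

    def register(self, sip_address, contact):
        contacts = self._bindings.setdefault(sip_address, [])
        if contact not in contacts:  # ignore duplicate registrations
            contacts.append(contact)

    def bindings_for(self, sip_address):
        # Return a copy so callers cannot modify the registrar's state.
        return list(self._bindings.get(sip_address, []))


class LocationServer:
    """Answers location queries by asking the registrar on request."""

    def __init__(self, registrar):
        self._registrar = registrar

    def locate(self, sip_address):
        return self._registrar.bindings_for(sip_address)


registrar = Registrar()
registrar.register("sip:alice@gateway.com", "sip:alice@station1.gateway.com")
registrar.register("sip:alice@gateway.com", "sip:alice@station2.gateway.com")
location_server = LocationServer(registrar)
alice_contacts = location_server.locate("sip:alice@gateway.com")
```

Because a user can register from more than one station, a lookup returns a list of contacts rather than a single address.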
When a user initiates a call, a SIP request is sent to a SIP server (either a proxy or a redirect server). The request includes the address of the caller (in the From header field) and the address of the intended callee (in the To header field). The following sections provide simple examples of successful, point-to-point calls established using a proxy and a redirect server.
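To make the request format concrete, the following Python sketch builds a simplified INVITE request as ASCII text and reads the header fields back out. It is not a complete SIP message: header fields such as Via and any session description body are omitted for brevity, and the function names, the Call-ID value, and the example addresses are assumptions made for illustration.

```python
def build_invite(caller, callee, call_id):
    """Build a simplified INVITE with the caller in From and the callee in To."""
    lines = [
        f"INVITE {callee} SIP/2.0",  # request line: method, target, version
        f"From: <{caller}>",         # address of the caller
        f"To: <{callee}>",           # address of the intended callee
        f"Call-ID: {call_id}",       # identifies this call
        "CSeq: 1 INVITE",            # sequence number and method
        "Content-Length: 0",         # no message body in this sketch
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def header_fields(message):
    """Return the header fields of a SIP message as a dictionary."""
    head = message.split("\r\n\r\n", 1)[0]
    fields = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


invite = build_invite("sip:alice@gateway.com", "sip:bob@gateway.com", "12345@gateway.com")
fields = header_fields(invite)
```

Here fields["From"] holds the caller's address and fields["To"] holds the callee's address, matching the description above.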
Over time, a SIP end user might move between end systems. The location of the end user can be dynamically registered with the SIP server. The location server can use one or more protocols (including finger, rwhois, and LDAP) to locate the end user. Because the end user can be logged in at more than one station and because the location server can sometimes have inaccurate information, it might return more than one address for the end user. If the request is coming through a SIP proxy server, the proxy server will try each of the returned addresses until it locates the end user. If the request is coming through a SIP redirect server, the redirect server forwards all the addresses to the caller in the Contact header field of the invitation response.
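The difference between the two server types when several addresses are returned can be sketched in Python as follows. The function names are hypothetical, the is_reachable callback stands in for actually forwarding the request to an address, and the 302 Moved Temporarily status line is one plausible redirection response chosen for the example.

```python
def proxy_locate(candidates, is_reachable):
    """Proxy behavior: try each returned address until the user is found."""
    for address in candidates:
        if is_reachable(address):
            return address
    return None  # none of the returned addresses located the user


def redirect_response(candidates):
    """Redirect behavior: hand every address back in the Contact header field."""
    contact = ", ".join(f"<{address}>" for address in candidates)
    return f"SIP/2.0 302 Moved Temporarily\r\nContact: {contact}\r\n\r\n"


candidates = [
    "sip:alice@station1.gateway.com",  # stale entry in this example
    "sip:alice@station2.gateway.com",  # where the user actually is
]
found = proxy_locate(candidates, lambda a: a == "sip:alice@station2.gateway.com")
response = redirect_response(candidates)
```

With a proxy server, the search happens on the caller's behalf; with a redirect server, the caller receives the full list and must try the addresses itself.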
For more information, see RFC 2543, SIP: Session Initiation Protocol, which can be found at http://www.faqs.org/rfcs/.
If a proxy server is used, the caller UA sends an INVITE request to the proxy server. The proxy server determines the path and then forwards the request to the callee (as shown in Figure 1-2).
Figure 1-2: SIP Request Through a Proxy Server
The callee responds to the proxy server, which, in turn, forwards the response to the caller (as shown in Figure 1-3).
Figure 1-3: SIP Response Through a Proxy Server
The proxy server forwards the acknowledgments of both parties. A session is then established between the caller and callee. Real-time Transport Protocol (RTP) is used for the communication between the caller and the callee (as shown in Figure 1-4).
Figure 1-4: SIP Session Through a Proxy Server
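The proxy call flow described in Figures 1-2 through 1-4 can be summarized as an ordered list of messages. The Python sketch below is a simplified model, not a protocol implementation: the function names are hypothetical, a 200 OK is assumed as the callee's successful response, and provisional responses are left out.

```python
def proxy_call_flow(caller, proxy, callee):
    """Return (sender, receiver, message) steps for a call set up via a proxy."""
    return [
        (caller, proxy, "INVITE"),     # caller sends the request to the proxy
        (proxy, callee, "INVITE"),     # proxy forwards it to the callee
        (callee, proxy, "200 OK"),     # callee responds to the proxy
        (proxy, caller, "200 OK"),     # proxy forwards the response
        (caller, proxy, "ACK"),        # caller acknowledges via the proxy
        (proxy, callee, "ACK"),        # proxy forwards the acknowledgment
        (caller, callee, "RTP"),       # media flows directly between end points
        (callee, caller, "RTP"),
    ]


def messages_handled_by(node, flow):
    """Count the steps in which a node sends or receives a message."""
    return sum(1 for sender, receiver, _ in flow if node in (sender, receiver))


flow = proxy_call_flow("caller", "proxy", "callee")
proxy_load = messages_handled_by("proxy", flow)
```

In this model the proxy takes part in every signaling step but in none of the RTP steps, which reflects the point that media travels directly between the caller and the callee.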
If a redirect server is used, the caller UA sends an INVITE request to the redirect server. The redirect server contacts the location server to determine the path to the callee and then sends that information back to the caller. The caller then acknowledges receipt of the information (as shown in Figure 1-5).
Figure 1-5: SIP Request Through a Redirect Server
The caller then sends a request to the device indicated in the redirection information (which could be the callee or another server that will forward the request). Once the request reaches the callee, the callee sends back a response and the caller acknowledges the response. RTP is used for the communication between the caller and the callee (as shown in Figure 1-6).
Figure 1-6: SIP Session Through a Redirect Server
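The redirect call flow described in Figures 1-5 and 1-6 can be summarized in the same way, as an ordered list of messages. This Python sketch is a simplified model under stated assumptions: the function name is hypothetical, a 302 Moved Temporarily response is assumed for the redirection, a 200 OK is assumed as the callee's response, and the redirected request is assumed to reach the callee directly rather than through another server.

```python
def redirect_call_flow(caller, redirect, callee):
    """Return (sender, receiver, message) steps for a call set up via redirect."""
    return [
        (caller, redirect, "INVITE"),                 # initial request
        (redirect, caller, "302 Moved Temporarily"),  # redirection information
        (caller, redirect, "ACK"),                    # caller acknowledges receipt
        (caller, callee, "INVITE"),                   # new request to indicated device
        (callee, caller, "200 OK"),                   # callee responds
        (caller, callee, "ACK"),                      # caller acknowledges response
        (caller, callee, "RTP"),                      # media flows between end points
        (callee, caller, "RTP"),
    ]


flow = redirect_call_flow("caller", "redirect", "callee")

# Steps in which the redirect server takes part: only the first exchange.
redirect_steps = [m for s, r, m in flow if "redirect" in (s, r)]
```

Unlike a proxy server, the redirect server in this model drops out after the first exchange, and the caller carries out the rest of the call setup itself.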
In addition to SIP, there are other protocols that facilitate voice transmission over IP. One such protocol is H.323. H.323 originated as an International Telecommunication Union (ITU) multimedia standard and is used for both packet telephony and video streaming. The H.323 standard incorporates multiple protocols, including Q.931 for signaling, H.245 for negotiation, and Registration, Admission, and Status (RAS) for session control. H.323 was the first standard for call control for VoIP and is supported on all Cisco Systems' voice gateways.
SIP and H.323 were designed to address session control and signaling functions in a distributed call control architecture. Although SIP and H.323 can also be used to communicate with limited-intelligence end points, they are especially well suited for communication with intelligent end points.
Table 1-1 provides a brief comparison of SIP and H.323.
Table 1-1: SIP versus H.323
Although SIP messages are not directly compatible with H.323, both protocols can coexist in the same packet telephony network if a device that supports interoperability between them is available.
For example, a call agent could use H.323 to communicate with gateways and use SIP for inter-call agent signaling. Then, after the bearer connection is set up, the bearer information flows between the different gateways as an RTP stream.