The Real-time Transport Protocol (RTP)

In RTP, data transport is augmented by a control protocol (RTCP) that allows data delivery to be monitored in a manner scalable to large multicast networks and provides minimal control and identification functionality. In short, the Real-time Transport Protocol provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video, or simulation data, over multicast or unicast network services. RTP and RTCP are designed to be independent of the underlying transport and network layers. RTP does not address resource reservation and does not guarantee quality of service for real-time services. The protocol supports the use of RTP-level translators and mixers.

The Real-time Transport Protocol (RTP) defines a standardized packet format for delivering audio and video over the Internet. It was developed by the Audio-Video Transport Working Group of the IETF, first published in 1996 as RFC 1889, and superseded by RFC 3550 in 2003.

RTP is used extensively in communication and entertainment systems that involve streaming media, such as telephony, video teleconferencing applications, and web-based push-to-talk features. For these, it carries media streams controlled by the H.323, MGCP, Megaco, SCCP, or Session Initiation Protocol (SIP) signaling protocols, making it one of the technical foundations of the Voice over IP industry.

RTP is usually used in conjunction with the RTP Control Protocol (RTCP). While RTP carries the media streams (e.g., audio and video) or out-of-band signaling (DTMF), RTCP is used to monitor transmission statistics and quality of service (QoS) information. When the two protocols are used together, RTP is usually originated and received on an even port number, whereas RTCP uses the next higher odd port number.


RTP was developed by the Audio/Video Transport working group of the IETF standards organization, and it has since been adopted by several other standards organizations, including the ITU as part of its H.323 standard.[1] The RTP standard defines a pair of protocols, RTP and the Real-time Transport Control Protocol (RTCP). The former is used for the exchange of multimedia data, while the latter is used to periodically send control information and quality of service parameters.

RTP is designed for end-to-end, real-time transport of audio or video data flows. It allows the recipient to compensate for the jitter and breaks in sequence that may occur during transfer over an IP network. RTP supports data transfer to multiple destinations by using multicast. RTP provides no guarantee of delivery, but the sequencing of the data makes it possible to detect missing packets. RTP is regarded as the primary standard for audio/video transport in IP networks and is used with an associated profile and payload format.
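
As an illustration of how sequencing exposes missing packets, the following is a minimal sketch rather than a reference implementation. It assumes the 16-bit sequence number field defined in RFC 3550, which wraps from 65535 back to 0. The function name count_missing and the half-range threshold used to recognize late packets are choices made for this example only.

```python
def count_missing(seq_numbers):
    """Count gaps in a list of RTP sequence numbers, in arrival order.

    Simplifications made for this sketch: duplicates and late (reordered)
    packets are ignored, so a packet that arrives late stays counted as
    missing.
    """
    if not seq_numbers:
        return 0
    missing = 0
    highest = seq_numbers[0]
    for cur in seq_numbers[1:]:
        # Sequence numbers are 16 bits wide, so differences are taken
        # modulo 65536 to survive the wrap from 65535 to 0.
        gap = (cur - highest) % 65536
        if gap == 0 or gap >= 32768:
            continue  # duplicate, or older than the highest seen so far
        missing += gap - 1
        highest = cur
    return missing


if __name__ == "__main__":
    # Packets 1 and 2 never arrive after the counter wraps around.
    print(count_missing([65534, 65535, 0, 3]))
```

A receiver that needs accurate loss statistics would also have to account for reordering and for restarts of the sender, which this sketch deliberately leaves out.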

Multimedia applications need timely delivery and can tolerate some packet loss. For example, the loss of a packet in an audio application may result in the loss of a fraction of a second of audio data, which, with suitable error concealment, can be made unnoticeable. Multimedia applications favor timeliness over reliability. The Transmission Control Protocol (TCP), although standardized for RTP use (RFC 4571), is not often used by RTP because of the inherent latency introduced by connection establishment and error correction. Instead, the majority of RTP implementations are based on the User Datagram Protocol (UDP).[4] Other transport protocols specifically designed for multimedia sessions are SCTP and DCCP, although they are not in widespread use yet.

The design of RTP was based on an architectural principle known as Application Level Framing (ALF). The ALF principle is seen as a way to design protocols for emerging multimedia applications. ALF is based on the belief that applications understand their own needs best, so the intelligence should be placed in the applications and the network layer should be kept simple. RTP profiles and payload formats are used to describe application-specific details (explained below).

Protocol components

There are two parts to RTP: a data transfer protocol and an associated control protocol. The RTP data transfer protocol manages the delivery of real-time data (audio and video) between end systems. It defines the media payload, incorporating sequence numbers for loss detection, timestamps to enable timing recovery, payload type and source identifiers, and a marker for significant events. The rules for timestamp and sequence number usage depend on the profile and payload format in use.
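
To make the fields listed above concrete, here is a hedged sketch of decoding the 12-byte fixed RTP header. The bit layout follows RFC 3550; the names RtpHeader and parse_rtp_header are hypothetical and chosen for this example. CSRC entries, header extensions, and the payload itself are not decoded.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed 12-byte RTP header (layout per RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two single bytes, a 16-bit sequence number,
    # a 32-bit timestamp and a 32-bit synchronization source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,             # top 2 bits
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,        # low 4 bits
        marker=bool(b1 & 0x80),      # top bit of the second byte
        payload_type=b1 & 0x7F,      # low 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


if __name__ == "__main__":
    # Hand-built example packet: version 2, marker set, payload type 0.
    sample = (bytes([0x80, 0x80, 0x00, 0x01])
              + (160).to_bytes(4, "big")
              + (0x12345678).to_bytes(4, "big")
              + b"payload")
    print(parse_rtp_header(sample))
```

How the timestamp and marker bit should be interpreted is left to the profile and payload format, so this sketch only exposes the raw values.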

The RTP Control Protocol (RTCP) provides reception quality feedback, participant identification, and synchronization between media streams. RTCP runs alongside RTP, providing periodic reporting of this information.[7] While RTP data packets are sent every few milliseconds, the control protocol operates on the scale of seconds. The information in RTCP may be used for synchronization (e.g., lip sync).[7] RTCP traffic is small compared to the RTP traffic, typically around 5%.
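
The relationship between the roughly 5% share and the seconds-scale reporting can be sketched with some simple arithmetic. This is a simplified illustration loosely modeled on the idea in RFC 3550 and not the full algorithm from that document: it ignores the split between senders and receivers and the randomization of the interval. The function names, the average report size, and the 5-second floor used as a default are assumptions made for this example.

```python
def rtcp_bandwidth_bps(session_bps: float, fraction: float = 0.05) -> float:
    """Bandwidth set aside for RTCP, as a fraction of the session bandwidth."""
    return session_bps * fraction


def report_interval_s(session_bps: float,
                      participants: int,
                      avg_rtcp_bytes: float,
                      min_interval_s: float = 5.0) -> float:
    """Rough time between reports from one participant.

    All participants share the RTCP budget, so the interval grows with
    the number of participants. min_interval_s is an assumed floor.
    """
    rtcp_bytes_per_s = rtcp_bandwidth_bps(session_bps) / 8
    interval = participants * avg_rtcp_bytes / rtcp_bytes_per_s
    return max(min_interval_s, interval)


if __name__ == "__main__":
    # A 64 kbit/s audio session with 100 participants and an assumed
    # average RTCP packet size of 100 bytes.
    print(rtcp_bandwidth_bps(64_000))
    print(report_interval_s(64_000, participants=100, avg_rtcp_bytes=100))
```

Because the budget is fixed, adding participants stretches the interval between reports instead of increasing the total control traffic, which is what keeps RTCP usable in large multicast sessions.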

Sessions

To set up an RTP session, an application defines a pair of destination transport addresses (an IP address plus a pair of ports for RTP and RTCP). In a multimedia session, each media stream is carried in a separate RTP session, with its own RTCP packets reporting the reception quality for that session. For example, audio and video would travel in separate RTP sessions, enabling a receiver to select whether or not to receive a particular stream. The RTP port should be even, and the RTCP port should be the next higher port number if possible. Deviations from this rule can be signaled in session descriptions carried by other protocols, such as SDP. RTP and RTCP typically use unprivileged UDP ports (1024 to 65535), but may use other transport protocols (most notably, SCTP and DCCP) as well, as the protocol design is transport independent.
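
The port convention described above can be expressed as a small helper. This is a sketch under the assumption that the default even/odd pairing is in use and that ports come from the unprivileged range; the function name rtcp_port_for is hypothetical. A session that signals a different RTCP port would not use this rule.

```python
def rtcp_port_for(rtp_port: int) -> int:
    """Return the conventional RTCP port for a given RTP port.

    Assumes the default pairing: an even RTP port and the next higher
    (odd) port for RTCP, both in the unprivileged range.
    """
    if not 1024 <= rtp_port <= 65534:
        raise ValueError("RTP port must be in the range 1024 to 65534")
    if rtp_port % 2 != 0:
        raise ValueError("RTP port should be even under the default convention")
    return rtp_port + 1


if __name__ == "__main__":
    audio_rtp = 49170
    print(audio_rtp, rtcp_port_for(audio_rtp))
```

Rejecting odd ports outright is a design choice for this example; the convention itself is only a recommendation, so a real application might adjust the port or fall back to an explicitly signaled RTCP port.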

Voice over Internet Protocol (VoIP) systems most often use the Session Description Protocol (SDP) to define RTP sessions and negotiate the parameters involved with other peers. The Real Time Streaming Protocol (RTSP) may also be used to set up and control media sessions on remote media servers.
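
As a hedged illustration of how a session description identifies RTP sessions, the sketch below extracts the media lines from an SDP body. It assumes the "m=<media> <port> <proto> <fmt> ..." syntax of the SDP specification; the function name parse_media_lines, the dictionary keys, and the sample description (which uses a documentation IP address) are made up for this example. Attributes such as codec mappings are not interpreted.

```python
def parse_media_lines(sdp: str):
    """Return one entry per SDP media line ("m=...")."""
    streams = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("m="):
            continue
        media, port, proto, *formats = line[2:].split()
        streams.append({
            "media": media,
            # A port may be written as "<port>/<count>"; keep the base port.
            "rtp_port": int(port.split("/")[0]),
            "proto": proto,
            "payload_types": formats,
        })
    return streams


SAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31
"""

if __name__ == "__main__":
    for stream in parse_media_lines(SAMPLE_SDP):
        print(stream)
```

Each media line corresponds to a separate RTP session, so the audio and video streams in the sample would be received on different port pairs.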