Until people learn how to speak digitally, the analog signal of the human voice has to be converted into a digital signal that computer equipment can understand and transmit. The translation of analog to digital (and back to analog so the user on the other end can understand it) is done using a codec. "Codec" is short for compression/decompression, or coder/decoder.
Various codecs are available for transmitting voice data over an IP network. They have different sampling rates and packet sizes, and each has its pros and cons. Higher quality usually costs more processing power or bandwidth, while smaller, more heavily compressed packets can reduce voice quality. Microsoft developed a proprietary adaptive codec, RTC, to deliver VoIP that is optimized for the bandwidth available at the time. First, we'll take a quick look at how the analog sound from the party who is speaking gets translated to RTC.
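To put rough numbers on the trade-off between packet size and bandwidth overhead, here is a minimal Python sketch. It is an illustration, not anything taken from Microsoft's implementation. It assumes each voice packet carries a 40-byte IPv4/UDP/RTP header and ignores link-layer framing; the function name `ip_bandwidth_bps` and the sample bit rates (64 kbit/s, the rate of G.711, and 8 kbit/s, the rate of G.729) are chosen purely as examples.

```python
# Assumption: IPv4 (20 bytes) + UDP (8 bytes) + RTP (12 bytes) headers
# are added to every voice packet; link-layer overhead is ignored.
HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_bps(codec_bitrate_bps: int, packet_ms: int) -> float:
    """Return the one-way IP-layer bandwidth of a single voice stream."""
    # Bytes of encoded audio carried in each packet.
    payload_bytes = codec_bitrate_bps / 8 * packet_ms / 1000
    # Number of packets sent per second at this packet duration.
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second


if __name__ == "__main__":
    # (codec bit rate in bit/s, audio per packet in ms): example values only.
    for bitrate, ms in [(64_000, 20), (64_000, 40), (8_000, 20)]:
        total = ip_bandwidth_bps(bitrate, ms)
        print(f"{bitrate} bit/s codec, {ms} ms packets: {total:.0f} bit/s on the wire")
```

Under these assumptions, doubling the packet duration halves the number of headers sent per second, and a low-bit-rate codec spends most of its bandwidth on headers rather than on voice.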
Mediating between protocols
At the heart of all of this translating back and forth with Microsoft Unified Communications is the Mediation Server.
Mediation Server provides a single interface based on the industry-standard Session Initiation Protocol (SIP) to enable communication interoperability. The server takes calls from IP-PBXs, SIP gateways and public switched telephone network (PSTN) gateways and converts them using Microsoft's proprietary adaptive codec, RTC. It then forwards the data to Office Communications Server for internal communications. In the other direction, Mediation Server takes internal RTC communications and converts them back to SIP, PSTN or other protocols in order to deliver them to their destinations.
Depending on where you look, RTC can be "Real-Time Communications," "Real-Time Collaboration" or "Real-Time Codec." Within the various definitions of RTC, there are both audio and video codecs used by Microsoft Unified Communications to enable audio and video data to be transmitted across the network.
When it was developing Office Communications Server 2007 and deciding how to handle the translation of analog to digital and back again, Microsoft could have opted for an existing codec. Codecs from Global IP Sound (GIPS) were already in wide use by popular VoIP applications such as Skype and Google Talk. Instead, Microsoft chose to develop its own proprietary codec in-house. The result is the Real-Time Codec, which provides the backbone of Microsoft's voice capabilities.
On the audio side, Microsoft's solution relies on a variety of existing, standard codecs. Microsoft's adaptive codec negotiates the best possible connection given the conditions at the time and selects the most suitable codec accordingly.
Microsoft's solution analyzes a variety of factors: the available bandwidth, the maximum possible bandwidth, the predefined minimum bandwidth for a given audio codec, the preferred codec based on Session Description Protocol (SDP) negotiation, and whether an outgoing video data stream is also present.
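The factors listed above can be sketched as a simple selection routine. This is only an illustration of the general idea, not Microsoft's algorithm: the `Codec` class, the `select_codec` function, the codec names and every bandwidth figure are hypothetical, and the sketch assumes that an outgoing video stream simply reserves part of the bandwidth budget.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Codec:
    name: str                # hypothetical codec label
    min_bandwidth_kbps: int  # predefined minimum bandwidth (placeholder value)


def select_codec(
    preferred: Sequence[Codec],
    available_kbps: int,
    max_kbps: int,
    video_kbps: int = 0,
) -> Optional[Codec]:
    """Return the first codec, in preference order, that fits the budget.

    `preferred` stands in for the order agreed during SDP negotiation.
    Assumption: outgoing video reserves `video_kbps` of the budget.
    """
    budget = min(available_kbps, max_kbps) - video_kbps
    for codec in preferred:
        if codec.min_bandwidth_kbps <= budget:
            return codec
    return None  # no codec fits the current conditions


# Placeholder preference list; the names and numbers are invented.
CODECS = [
    Codec("wideband-high", 60),
    Codec("wideband-low", 30),
    Codec("narrowband", 15),
]

if __name__ == "__main__":
    print(select_codec(CODECS, available_kbps=100, max_kbps=80))
    print(select_codec(CODECS, available_kbps=100, max_kbps=80, video_kbps=40))
```

Rerunning such a routine whenever the measured bandwidth changes is one simple way to picture a codec choice that shifts during a call.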
In the end, Microsoft's solution selects the best codec for the conditions at the time. The codec in use can also change dynamically during a call, so that the quality of the voice transmission stays optimized for the available bandwidth. This on-the-fly negotiation also means that third-party products must support dynamic payload types and changes in packet frame size in order to work with Microsoft UC implementations.
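To show what supporting dynamic payload types involves on the receiving side, the sketch below reads the `a=rtpmap` attributes from an SDP body and picks out the entries in the dynamic range (96 to 127 in the standard RTP audio/video profile). The SDP text is a made-up sample, the encoding name `x-example-wideband` is invented for illustration, and the helper names are hypothetical; none of it reflects an actual Office Communications Server offer.

```python
from typing import Dict, Tuple

# Dynamic RTP payload types occupy 96-127 in the RTP/AVP profile.
DYNAMIC_RANGE = range(96, 128)

# Made-up SDP fragment; "x-example-wideband" is not a real codec name.
SAMPLE_SDP = """v=0
m=audio 49170 RTP/AVP 0 96
a=rtpmap:0 PCMU/8000
a=rtpmap:96 x-example-wideband/16000
"""


def parse_rtpmap(sdp: str) -> Dict[int, Tuple[str, int]]:
    """Map each payload type to (encoding name, clock rate in Hz)."""
    mapping: Dict[int, Tuple[str, int]] = {}
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith("a=rtpmap:"):
            continue
        # Attribute form: a=rtpmap:<payload type> <name>/<clock rate>[/<params>]
        pt_text, encoding = line[len("a=rtpmap:"):].split(" ", 1)
        name, clock = encoding.split("/")[:2]
        mapping[int(pt_text)] = (name, int(clock))
    return mapping


def dynamic_payload_types(
    mapping: Dict[int, Tuple[str, int]]
) -> Dict[int, Tuple[str, int]]:
    """Keep only the entries whose payload type is in the dynamic range."""
    return {pt: enc for pt, enc in mapping.items() if pt in DYNAMIC_RANGE}


if __name__ == "__main__":
    rtpmap = parse_rtpmap(SAMPLE_SDP)
    print(dynamic_payload_types(rtpmap))
```

A receiver that hard-codes the meaning of payload type numbers, instead of looking them up per session like this, cannot follow a codec that is assigned dynamically.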
Tony is a CISSP (Certified Information Systems Security Professional) and ISSAP (Information Systems Security Architecture Professional). He is Microsoft Certified as an MCSE (Microsoft Certified Systems Engineer) and MCSA (Microsoft Certified Systems Administrator) in Windows 2000 and an MCP (Microsoft Certified Professional) in Windows NT. Tony has been recognized by Microsoft as an MVP (Most Valuable Professional) in Windows security since 2006.
This was first published in June 2008