Voice is an analog signal, whereas the data network is digital. The conversion of analog waveforms into digital information is performed by an encoder-decoder (CODEC). There are many standards for sampling an analog voice signal into a digital one, and the process is often quite complex. Most conversions use pulse code modulation (PCM) or a variation of it.
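As a rough illustration of PCM, the sketch below samples a test tone 8000 times per second and maps each sample to an 8-bit code. The tone, the function names and the uniform quantizer are assumptions made for the example; real telephony codecs such as G.711 quantize logarithmically rather than uniformly.

```python
import math

SAMPLE_RATE_HZ = 8000  # telephony sampling rate (8 kHz)
BITS_PER_SAMPLE = 8    # one byte per sample


def sample_tone(freq_hz, duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Sample a sine wave (a stand-in for the analog voice signal) in [-1.0, 1.0]."""
    count = round(duration_s * sample_rate_hz)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


def quantize_uniform(samples, bits=BITS_PER_SAMPLE):
    """Map each sample in [-1.0, 1.0] to a signed integer code of the given width."""
    max_code = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return [round(s * max_code) for s in samples]


def pcm_bit_rate_kbps(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate of the raw PCM stream in kb/s: samples per second times bits per sample."""
    return sample_rate_hz * bits / 1000


analog = sample_tone(freq_hz=1000, duration_s=0.02)  # 20 ms of a 1 kHz tone
codes = quantize_uniform(analog)
print(len(codes), "samples,", pcm_bit_rate_kbps(), "kb/s")
```

With 8000 samples per second and 8 bits per sample, the stream runs at 64 kb/s, which is the bit rate of G.711.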
In addition, the CODEC compresses the data stream and sometimes provides echo cancellation. Compressing the waveform saves bandwidth, which is especially useful on low-speed connections because it allows more simultaneous VoIP calls. Another way to save bandwidth is silence suppression: the goal is not to send packets when nobody is speaking.
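Silence suppression can be pictured with a very simple energy detector. This is only a sketch under stated assumptions: the threshold value, the function names and the synthetic frames are hypothetical, and real codecs use far more robust voice activity detection.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame)


def suppress_silence(frames, threshold=100.0):
    """Keep only (index, frame) pairs whose energy exceeds the threshold."""
    return [(i, f) for i, f in enumerate(frames) if frame_energy(f) > threshold]


# Synthetic 160-sample frames (20 ms at 8 kHz): loud "speech" and near-silent noise.
speech = [1000, -1200, 900, -800] * 40
silence = [2, -1, 0, 1] * 40
frames = [speech, silence, silence, speech, silence]

sent = suppress_silence(frames)
saved = 1 - len(sent) / len(frames)  # fraction of packets not sent
print([i for i, _ in sent], saved)
```

Only the frames classified as speech are transmitted, so the bandwidth saved depends on how much of the conversation is silence.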
The table below lists the best-known codecs in use. Its columns are defined as follows:
- Bit rate - the rate at which bits are transmitted over a communication path, normally expressed in kilobits per second (kb/s).
- Sampling rate - the number of samples taken per second when digitizing sound. The quality of the digital reproduction improves as the number of samples taken per second increases.
- Frame size - the time between packets sent, that is, the amount of audio each frame covers, in milliseconds (ms).
- MOS - Mean Opinion Score, a subjective measure of sound quality on a scale from 1 (worst) to 5 (best).
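The bit rate and frame size defined above determine how many bytes of audio each packet carries, and therefore how much bandwidth a call needs once packet headers are added. The sketch below assumes RTP over UDP over IPv4 (40 bytes of headers per packet), ignores link-layer overhead, counts one direction only, and assumes 20 ms of audio per packet; the function names and the 512 kb/s link are hypothetical.

```python
import math

IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers


def payload_bytes(bit_rate_kbps, packet_ms):
    """Codec payload in one packet: kb/s times ms gives bits, divided by 8 for bytes."""
    return bit_rate_kbps * packet_ms / 8


def ip_bandwidth_kbps(bit_rate_kbps, packet_ms):
    """One-way bandwidth of a call at the IP layer, headers included."""
    packets_per_second = 1000 / packet_ms
    packet_bits = (payload_bytes(bit_rate_kbps, packet_ms) + IP_UDP_RTP_HEADER_BYTES) * 8
    return packet_bits * packets_per_second / 1000


def calls_on_link(link_kbps, bit_rate_kbps, packet_ms):
    """Whole number of simultaneous one-way streams that fit on the link."""
    return math.floor(link_kbps / ip_bandwidth_kbps(bit_rate_kbps, packet_ms))


for name, rate in [("G.711", 64), ("G.729", 8)]:
    print(name, payload_bytes(rate, 20), ip_bandwidth_kbps(rate, 20),
          calls_on_link(512, rate, 20))
```

Under these assumptions a G.711 call needs 80 kb/s and a G.729 call 24 kb/s, so the lower-rate codec fits several times as many calls on the same link, even though the fixed header overhead becomes a larger share of each packet.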
To better understand the codec process and the parameters shown in the table, we recommend reading the section on the G.711 codec process, which explains how the G.711 codec works.
| Codec | Standardized by | Description | Bit rate (kb/s) | Sampling rate (kHz) | Frame size (ms) | Remarks | MOS (Mean Opinion Score) |
|---|---|---|---|---|---|---|---|
| G.711 * | ITU-T | Pulse code modulation (PCM) | 64 | 8 | Sampling | U-law (US, Japan) and A-law (Europe) companding | 4.1 |
| G.721 | ITU-T | Adaptive differential pulse code modulation (ADPCM) | 32 | 8 | Sampling | Now described in G.726; obsolete. | |
| G.722 | ITU-T | 7 kHz audio-coding within 64 kbit/s | 64 | 16 | Sampling | Subband codec that divides the 16 kHz band into two subbands, each coded using ADPCM | |
| G.722.1 | ITU-T | Coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss | 24/32 | 16 | 20 | | |
| G.723 | ITU-T | Extensions of Recommendation G.721 adaptive differential pulse code modulation to 24 and 40 kbit/s for digital circuit multiplication equipment application | 24/40 | 8 | Sampling | Superseded by G.726; obsolete. This is a completely different codec from G.723.1 | |
| G.723.1 | ITU-T | Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s | 5.3/6.3 | 8 | 30 | Part of H.324 video conferencing. It encodes speech or other audio signals in frames using linear predictive analysis-by-synthesis coding. The excitation signal for the high rate coder is Multipulse Maximum Likelihood Quantization (MP-MLQ) and for the low rate coder is Algebraic-Code-Excited Linear-Prediction (ACELP). | 3.8-3.9 |
| G.726 | ITU-T | 40, 32, 24, 16 kbit/s adaptive differential pulse code modulation (ADPCM) | 16/24/32/40 | 8 | Sampling | ADPCM; replaces G.721 and G.723. | 3.85 |
| G.727 | ITU-T | 5-, 4-, 3- and 2-bit/sample embedded adaptive differential pulse code modulation (ADPCM) | var. | | Sampling | ADPCM. Related to G.726 | |
| G.728 | ITU-T | Coding of speech at 16 kbit/s using low-delay code excited linear prediction | 16 | 8 | 2.5 | CELP. | 3.61 |
| G.729 ** | ITU-T | Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP) | 8 | 8 | 10 | Low delay (15 ms) | 3.92 |
| GSM 06.10 | ETSI | Regular-Pulse Excitation Long-Term Predictor (RPE-LTP) | 13 | 8 | 20 | Used for GSM cellular telephony. | |
| LPC10 | US Government | Linear-predictive codec | 2.4 | 8 | 22.5 | 10 coefficients. | |
| Speex | | | 2.15-24.6 (NB), 4-44.2 (WB) | 8, 16, 32 | 30 (NB), 34 (WB) | | |
| iLBC | | | 13.3 | 8 | 30 | | |
| DoD CELP | US Department of Defense (DoD) | | 4.8 | | 30 | | |
| EVRC | 3GPP2 | Enhanced Variable Rate CODEC | 9.6/4.8/1.2 | 8 | 20 | Used in CDMA networks | |
| DVI | Interactive Multimedia Association (IMA) | DVI4 uses an adaptive delta pulse code modulation (ADPCM) | 32 | Variable | Sampling | | |
| L16 | | Uncompressed audio data samples | 128 | Variable | Sampling | | |
* G.711 has two versions, U-law and A-law. U-law is associated with the T1 standard used in North America and Japan; A-law is associated with the E1 standard used in the rest of the world. The two differ in the companding curve applied when quantizing the analog signal: in both schemes the signal is quantized logarithmically rather than linearly. For more information about the differences, see G.711 A Law versus u Law.
** There are several versions of the G.729 codec, which are worth describing because this codec is widely used today:
- G.729: the original codec.
- G.729A (Annex A): a simplified version of G.729 that remains compatible with it. It is less complex but offers lower quality.
- G.729B (Annex B): G.729 with silence suppression; not compatible with the previous versions.
- G.729AB: G.729A with silence suppression; compatible only with G.729B.
All of these versions of G.729 run at a bit rate of 8 kb/s, but there are also versions at 6.4 kb/s (Annex D) and 11.8 kb/s (Annex E).
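Returning to the G.711 note above, the logarithmic behaviour of U-law and A-law can be sketched with their continuous companding curves, using the usual parameters (mu = 255, A = 87.6). This is an illustration only: the function names are hypothetical, and the actual G.711 encoders use segmented approximations of these curves to produce 8-bit codes.

```python
import math

MU = 255.0  # U-law parameter
A = 87.6    # A-law parameter


def mu_law_compress(x):
    """U-law curve for a sample x in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def a_law_compress(x):
    """A-law curve for x in [-1.0, 1.0]: linear near zero, logarithmic above 1/A."""
    ax = abs(x)
    if ax < 1 / A:
        y = A * ax / (1 + math.log(A))
    else:
        y = (1 + math.log(A * ax)) / (1 + math.log(A))
    return math.copysign(y, x)


for x in (0.01, 0.1, 0.5, 1.0):
    print(x, round(mu_law_compress(x), 3), round(a_law_compress(x), 3))
```

Both curves give quiet samples a much larger share of the output range than a linear mapping would, which is why 8 bits per sample are enough for telephone-quality speech.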