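PCM file: the raw binary sequence produced directly by analog-to-digital (A/D) conversion of an analog audio signal. The file has no header and no end-of-file marker. The Windows Convert tool can convert a PCM audio file into a Microsoft WAV file.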
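Because a raw PCM file carries no header, a converter has to be told the sampling parameters and then prepend them. The following is a minimal sketch of that idea, assuming the common 44-byte header of an uncompressed PCM WAV file; the function name build_wav_header and its parameters are hypothetical and are not taken from the Convert tool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
}

/*
 * Hypothetical helper: fill the 44-byte header of an uncompressed PCM
 * WAV file. pcm_bytes is the size of the raw PCM data that follows.
 */
void build_wav_header(uint8_t hdr[44], uint32_t sample_rate,
                      uint16_t channels, uint16_t bits_per_sample,
                      uint32_t pcm_bytes)
{
    uint16_t block_align;

    assert(channels > 0 && bits_per_sample % 8 == 0);
    block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 36 + pcm_bytes);              /* file size minus 8 */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);                         /* size of the fmt chunk */
    put_le16(hdr + 20, 1);                          /* format tag 1 = PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align);  /* bytes per second */
    put_le16(hdr + 32, block_align);                /* bytes per frame */
    put_le16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, pcm_bytes);
}
```

Writing these 44 bytes followed by the unchanged PCM data should yield a file that WAV players can open, since the payload itself is not altered.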
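Digitizing audio simply means digitizing sound. The most common way to do this is pulse-code modulation, PCM (Pulse Code Modulation).
It works as follows. First, consider sound passing through a microphone and being converted into a signal of continuously varying voltage, as shown in the figure below. The horizontal axis of the figure is time in seconds and the vertical axis is voltage. To convert such a signal to PCM, the sound is described by three parameters: the number of channels, the sample bit depth, and the sampling rate.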
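Sampling rate: the number of sound samples taken per second. The higher the sampling rate, the better the sound quality and the more faithful the reproduction, but the more resources it consumes. Because the resolution of the human ear is limited, frequencies that are too high cannot be distinguished anyway.
16-bit sound cards offer several levels such as 22 kHz and 44 kHz. 22 kHz is roughly equivalent to ordinary FM broadcast quality and 44 kHz is already equivalent to CD quality; the sampling rates in common use today do not exceed 48 kHz.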
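Sample bit depth: also called the sample size; it is the quantization of each sample's amplitude. It measures how finely variations in the sound wave are recorded, and can be thought of as the resolution of the sound card.
The larger the value, the higher the resolution and the more accurately the sound can be reproduced.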
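Number of channels: this is easy to understand. There is mono and there is stereo. Mono sound can drive only one speaker (some systems output the same channel through two speakers). Stereo PCM can drive both speakers (generally with distinct left and right channels) and gives a better sense of space.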
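The case where a mono signal is sent to two speakers can be sketched in a few lines. This is only an illustration: it assumes 16-bit samples and an interleaved stereo layout (left, right, left, right, ...), which the text above does not specify, and the helper name mono_to_stereo is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy each mono sample into both the left and the
 * right slot of an interleaved stereo buffer (L0 R0 L1 R1 ...).
 * stereo must have room for 2 * n samples.
 */
void mono_to_stereo(const int16_t *mono, size_t n, int16_t *stereo)
{
    size_t i;

    assert(mono != NULL && stereo != NULL);
    for (i = 0; i < n; i++) {
        stereo[2 * i]     = mono[i]; /* left channel  */
        stereo[2 * i + 1] = mono[i]; /* right channel */
    }
}
```

Both speakers then play identical data, so the result doubles the storage without adding any spatial information.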
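Next, let us use figures to look at the concepts of bit depth and sampling rate. In the figures, the black curve is the natural sound wave being recorded into the PCM file and the red curve is the sound wave output from the PCM file. The horizontal axis corresponds to the sampling rate and the vertical axis to the bit depth.
The grid in these figures becomes denser from left to right: first the horizontal density increases, then the vertical density. Clearly, the smaller the horizontal unit, that is, the shorter the interval between two sampling instants, the better the original sound is preserved; in other words, the higher the sampling rate, the better the quality. Likewise, the smaller the vertical unit, the better the quality, so the more bits per sample the better.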
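On computers the bit depth is usually 8 or 16 bits. Note that 8 bits does not mean dividing the vertical axis into 8 parts but into 2^8 = 256 parts; likewise, 16 bits divides it into 2^16 = 65536 parts. The sampling rate is usually one of 11025 Hz (11 kHz), 22050 Hz (22 kHz), or 44100 Hz (44 kHz).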
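To make the "2 to the power of the bit depth" point concrete, here is a small sketch that counts the levels and maps an amplitude onto one of them. It assumes uniform quantization of a normalized amplitude in the range -1.0 to 1.0; the names quant_levels and quantize are hypothetical.

```c
#include <assert.h>

/* Number of quantization levels for a given bit depth: 2^bits. */
unsigned long quant_levels(unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    return 1UL << bits;
}

/*
 * Map a normalized amplitude x in [-1.0, 1.0] to a level index in
 * [0, 2^bits - 1] by splitting the vertical axis into equal steps.
 */
unsigned long quantize(double x, unsigned bits)
{
    unsigned long levels = quant_levels(bits);
    unsigned long idx;

    if (x < -1.0) x = -1.0;     /* clip out-of-range input */
    if (x > 1.0)  x = 1.0;
    idx = (unsigned long)((x + 1.0) / 2.0 * (double)levels);
    if (idx >= levels)          /* x == 1.0 lands on the top level */
        idx = levels - 1;
    return idx;
}
```

With 8 bits the mid-scale amplitude 0.0 falls on level 128 of 256; with 16 bits it falls on level 32768 of 65536, so the same waveform is described with steps that are 256 times finer.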
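We can now write down the formula for the space a PCM file occupies: storage = (sampling rate * bit depth * channels) * time / 8 (in bytes), where time is in seconds.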
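For example, the compact disc (CD-DA, Red Book standard) uses a standard sampling rate of 44.1 kHz, a bit depth of 16 bits, and stereo (2 channels). It can reproduce sound up to about 22 kHz with almost no distortion, which is slightly above the roughly 20 kHz usually quoted as the upper limit of human hearing.
One minute of CD music requires:
(44.1*1000*16*2)*60/8 = 10,584,000 bytes = 10.584 MBytes
This is the amount of disk space the PCM audio file occupies on the hard disk.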
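The storage formula translates directly into code. The sketch below is only an illustration of the arithmetic; the function name pcm_storage_bytes is hypothetical, and 64-bit arithmetic is used so that long recordings do not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* bytes = sampling rate * bit depth * channels * seconds / 8 */
uint64_t pcm_storage_bytes(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                           uint32_t channels, uint32_t seconds)
{
    assert(bits_per_sample > 0 && channels > 0);
    return (uint64_t)sample_rate_hz * bits_per_sample * channels * seconds / 8;
}
```

Calling pcm_storage_bytes(44100, 16, 2, 60) reproduces the 10,584,000 bytes computed above for one minute of CD audio.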
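The format of a computer audio file determines its sound quality. Everyday telephones, radios, and the like carry analog audio signals, for which the concepts of sampling rate and bit depth do not exist. Still, we can make the following comparison:
- 44 kHz, 16-bit sound is called CD quality;
- 22 kHz, 16-bit sound is close to FM stereo broadcast and is called broadcast quality;
- 11 kHz, 8-bit sound is called telephone quality.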
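G711 format
G711-encoded audio has good clarity and natural-sounding speech, but its compression efficiency is low and its data rate is high, usually above 32 kbps. It is commonly used for telephone speech (64 kbps is recommended). The sampling rate is 8 kHz and the compression ratio is 2: each S16 (signed 16-bit) sample is compressed to 8 bits, so 8000 samples per second * 8 bits = 64 kbps. There are two variants, a-law and u-law.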
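a-law, also called g711a, takes 13-bit input (in practice the high 13 bits of an S16 sample) and is used in Europe and other regions. The format was specially designed so that digital equipment can process it quickly.
The encoding proceeds as follows:
(1) Take the sign bit and invert it to get s,
(2) derive the strength (segment) bits eee, as shown in the figure,
(3) take the high-order sample bits wxyz,
(4) combine them into seeewxyz and invert the even-numbered bits of seeewxyz (an XOR with 0x55). Encoding is complete.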
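Example:
The input PCM value is 3210, which in binary is (0000 1100 1000 1010).
Regrouped for encoding: (0 0001 1001 0001010)
(1) The sign bit (the most significant bit) is 0; inverted, s=1
(2) The strength bits are 0001; from the lookup table, eee=100
(3) The high-order sample bits are wxyz=1001
(4) Combined, this gives 11001001; inverting the even-numbered bits gives 10011100 (0x9C)
Encoding is complete.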
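The four steps above can be sketched as a small stand-alone function, together with the inverse mapping to show that the encoding is lossy. This is a sketch under stated assumptions, not the reference algorithm: it works on 16-bit samples, the names alaw_encode_steps and alaw_decode_steps are hypothetical, and negative inputs use the magnitude -pcm - 1, which may differ by one quantization step from other implementations near segment boundaries.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one 16-bit PCM sample to a-law, following steps (1) to (4). */
uint8_t alaw_encode_steps(int16_t pcm)
{
    /* (1) inverted sign bit: 1 for non-negative input, 0 for negative */
    int s = (pcm >= 0) ? 1 : 0;
    /* magnitude in 0..32767 (assumption: -pcm - 1 for negative input) */
    int mag = (pcm >= 0) ? pcm : -(int)pcm - 1;
    int seg = 0;
    int limit;
    int wxyz;

    /* (2) segment eee: 0 for mag <= 0xFF, then +1 per doubling, up to 7 */
    for (limit = 0xFF; mag > limit; limit = (limit << 1) | 1)
        seg++;

    /* (3) the four sample bits kept for this segment */
    wxyz = (mag >> (seg < 2 ? 4 : seg + 3)) & 0x0F;

    /* (4) pack as seeewxyz, then invert the even-numbered bits */
    return (uint8_t)(((s << 7) | (seg << 4) | wxyz) ^ 0x55);
}

/* Inverse mapping: reconstruct the midpoint of the quantization step. */
int alaw_decode_steps(uint8_t code)
{
    int seg, t;

    code ^= 0x55;                   /* undo the even-bit inversion */
    seg = (code >> 4) & 0x07;
    t = (code & 0x0F) << 4;
    if (seg == 0)
        t += 8;
    else
        t = (t + 0x108) << (seg - 1);
    return (code & 0x80) ? t : -t;  /* sign bit set means non-negative */
}
```

For the worked example, alaw_encode_steps(3210) returns 0x9C, and decoding 0x9C gives 3264 rather than 3210: in this segment the quantization step is 128, so the decoder can only return the midpoint of the step that contained the original sample.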
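u-law, also called g711u, is used in North America and Japan and takes 14-bit input. The encoding is just a table lookup with no complicated algorithm: the output is a base value plus an interval offset. The values in the example are on the 14-bit scale; the C code later in this article takes 16-bit samples, so 2345 here corresponds to the 16-bit sample 9380 (2345 << 2). A concrete example:
pcm=2345
(1) Find the range that contains the value:
+4062 to +2015 in 16 intervals of 128
(2) the base value for this range is 0x90,
(3) the interval size is 128,
(4) the top of the range is 4062,
(5) the difference between the top of the range and the current value 2345 is 4062-2345=1717,
(6) the offset is 1717/128, truncated to an integer, which gives 13,
(7) the output is 0x90+13=0x9D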
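The lookup can be sketched as a table of ranges. Only the +4062 to +2015 row is quoted in the example above; the other rows in this sketch are filled in from the usual 14-bit u-law table as an assumption and should be checked against the G.711 specification before being relied on. The function name ulaw_encode14 is hypothetical, and only non-negative input is handled.

```c
#include <assert.h>
#include <stddef.h>

struct ulaw_range {
    int top;            /* largest 14-bit input value in the range */
    int step;           /* width of each of the 16 intervals */
    unsigned char base; /* code of the interval at the top of the range */
};

/* Rows other than {4062, 128, 0x90} are an assumption, see the lead-in. */
static const struct ulaw_range ULAW_RANGES[] = {
    {8158, 256, 0x80}, {4062, 128, 0x90}, {2014, 64, 0xA0}, {990, 32, 0xB0},
    {478,  16,  0xC0}, {222,  8,   0xD0}, {94,   4,  0xE0}, {30,  2,  0xF0}
};

/* Encode a non-negative 14-bit linear value as base value + interval offset. */
unsigned char ulaw_encode14(int v)
{
    const size_t n = sizeof(ULAW_RANGES) / sizeof(ULAW_RANGES[0]);
    size_t i;

    assert(v >= 0);
    if (v > ULAW_RANGES[0].top)     /* clip values above the largest range */
        v = ULAW_RANGES[0].top;

    for (i = 0; i < n; i++) {
        const struct ulaw_range *r = &ULAW_RANGES[i];
        int bottom = r->top - 16 * r->step + 1;

        if (v >= bottom)            /* v lies in this range */
            return (unsigned char)(r->base + (r->top - v) / r->step);
    }
    return 0xFF;                    /* not reached for v >= 0 */
}
```

For the example value, ulaw_encode14(2345) selects the second row and returns 0x90 + (4062 - 2345) / 128 = 0x9D, matching step (7).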
The code is as follows:
g711codec.h
/*
* G711 encode decode HEADER.
*/
#ifndef __G711CODEC_H__
#define __G711CODEC_H__
/*
* u-law, A-law and linear PCM conversions.
*/
#define SIGN_BIT (0x80) /* Sign bit for an A-law byte. */
#define QUANT_MASK (0xf) /* Quantization field mask. */
#define NSEGS (8) /* Number of A-law segments. */
#define SEG_SHIFT (4) /* Left shift for segment number. */
#define SEG_MASK (0x70) /* Segment field mask. */
#define BIAS (0x84) /* Bias for linear code. */
int PCM2G711a( char *InAudioData, char *OutAudioData, int DataLen, int reserve );
int PCM2G711u( char *InAudioData, char *OutAudioData, int DataLen, int reserve );
int G711a2PCM( char *InAudioData, char *OutAudioData, int DataLen, int reserve );
int G711u2PCM( char *InAudioData, char *OutAudioData, int DataLen, int reserve );
int g711a_decode(short amp[], const unsigned char g711a_data[], int g711a_bytes);
int g711u_decode(short amp[], const unsigned char g711u_data[], int g711u_bytes);
int g711a_encode(unsigned char g711_data[], const short amp[], int len);
int g711u_encode(unsigned char g711_data[], const short amp[], int len);
#endif /* g711codec.h */
g711codec.c
#include "g711codec.h"
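/* Upper bound of each of the 8 segments, on the 16-bit scale. */
static short seg_end[8] = {0xFF, 0x1FF, 0x3FF, 0x7FF,
                           0xFFF, 0x1FFF, 0x3FFF, 0x7FFF};

/* Return the index of the first table entry >= val, or size if there is none. */
static int search(int val, short *table, int size)
{
    int i;

    for (i = 0; i < size; i++) {
        if (val <= *table++)
            return (i);
    }
    return (size);
}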
/*
* alaw2linear() - Convert an A-law value to 16-bit linear PCM
*
*/
static int alaw2linear( unsigned char a_val )
{
    int t;
    int seg;

    a_val ^= 0x55; /* undo the even-bit inversion */
    t = (a_val & QUANT_MASK) << 4;
    seg = ( (unsigned)a_val & SEG_MASK ) >> SEG_SHIFT;
    switch (seg)
    {
    case 0:
        t += 8;
        break;
    case 1:
        t += 0x108;
        break;
    default:
        t += 0x108;
        t <<= seg - 1;
    }
    return ((a_val & SIGN_BIT) ? t : -t);
}
/*
* ulaw2linear() - Convert a u-law value to 16-bit linear PCM
*
* First, a biased linear code is derived from the code word. An unbiased
* output can then be obtained by subtracting 33 from the biased code.
*
* Note that this function expects to be passed the complement of the
* original code word. This is in keeping with ISDN conventions.
*/
static int ulaw2linear(unsigned char u_val)
{
    int t;

    /* Complement to obtain normal u-law value. */
    u_val = ~u_val;
    /*
     * Extract and bias the quantization bits. Then
     * shift up by the segment number and subtract out the bias.
     */
    t = ((u_val & QUANT_MASK) << 3) + BIAS;
    t <<= ((unsigned)u_val & SEG_MASK) >> SEG_SHIFT;
    return ((u_val & SIGN_BIT) ? (BIAS - t) : (t - BIAS));
}
/*
* linear2alaw() - Convert a 16-bit linear PCM value to 8-bit A-law
*
*/
unsigned char linear2alaw(int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char aval;

    if (pcm_val >= 0) {
        mask = 0xD5; /* sign (7th) bit = 1 */
    } else {
        mask = 0x55; /* sign bit = 0 */
        pcm_val = -pcm_val - 8;
        /* Inputs -1..-7 would go negative here and corrupt the quantization bits. */
        if (pcm_val < 0)
            pcm_val = 0;
    }
    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);
    /* Combine the sign, segment, and quantization bits. */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        aval = seg << SEG_SHIFT;
        if (seg < 2)
            aval |= (pcm_val >> 4) & QUANT_MASK;
        else
            aval |= (pcm_val >> (seg + 3)) & QUANT_MASK;
        return (aval ^ mask);
    }
}
/*
* linear2ulaw() - Convert a linear PCM value to u-law
*
*/
unsigned char linear2ulaw(int pcm_val) /* 2's complement (16-bit range) */
{
    int mask;
    int seg;
    unsigned char uval;

    /* Get the sign and the magnitude of the value. */
    if (pcm_val < 0) {
        pcm_val = BIAS - pcm_val;
        mask = 0x7F;
    } else {
        pcm_val += BIAS;
        mask = 0xFF;
    }
    /* Convert the scaled magnitude to segment number. */
    seg = search(pcm_val, seg_end, 8);
    /*
     * Combine the sign, segment, quantization bits;
     * and complement the code word.
     */
    if (seg >= 8) /* out of range, return maximum value. */
        return (0x7F ^ mask);
    else {
        uval = (seg << 4) | ((pcm_val >> (seg + 3)) & 0xF);
        return (uval ^ mask);
    }
}
/* Decode g711a_bytes A-law bytes into amp[]; returns the number of PCM bytes written. */
int g711a_decode( short amp[], const unsigned char g711a_data[], int g711a_bytes )
{
    int i;

    /* Each A-law byte expands to one 16-bit PCM sample. */
    for (i = 0; i < g711a_bytes; i++)
        amp[i] = (short) alaw2linear( g711a_data[i] );
    return i * 2;
}

/* Decode g711u_bytes u-law bytes into amp[]; returns the number of PCM bytes written. */
int g711u_decode(short amp[], const unsigned char g711u_data[], int g711u_bytes)
{
    int i;

    /* Each u-law byte expands to one 16-bit PCM sample. */
    for (i = 0; i < g711u_bytes; i++)
        amp[i] = (short) ulaw2linear( g711u_data[i] );
    return i * 2;
}

/* Encode len PCM samples to A-law; returns the number of bytes written. */
int g711a_encode(unsigned char g711_data[], const short amp[], int len)
{
    int i;

    for (i = 0; i < len; i++)
    {
        g711_data[i] = linear2alaw(amp[i]);
    }
    return len;
}

/* Encode len PCM samples to u-law; returns the number of bytes written. */
int g711u_encode(unsigned char g711_data[], const short amp[], int len)
{
    int i;

    for (i = 0; i < len; i++)
    {
        g711_data[i] = linear2ulaw(amp[i]);
    }
    return len;
}
decode.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include "g711codec.h"

int main( int argc, char *argv[] )
{
    if (argc < 3)
    {
        printf("==> Usage:\n\tdecode [src.g711a] [dest.pcm]\n");
        //printf("==> Usage:\n\tdecode [src.g711u] [dest.pcm]\n");
        return 1;
    }

    /* Query the input size first so the buffers can be sized from it. */
    struct stat s_buf;
    if (stat( argv[1], &s_buf ) != 0)
    {
        perror("stat");
        return 1;
    }
    /* st_size is an off_t, so cast it before printing. */
    printf("file_size = %lld\n", (long long)s_buf.st_size);

    FILE *pInFile = fopen(argv[1], "rb");
    FILE *pOutFile = fopen(argv[2], "wb");
    if (NULL == pInFile || NULL == pOutFile)
    {
        printf("open file failed\n");
        if (pInFile) fclose(pInFile);
        if (pOutFile) fclose(pOutFile);
        return 1;
    }

    int rc = 0;
    int Ret = 0;
    int Read = 0;
    int DataLen = (int)s_buf.st_size;
    printf("datalen = %d, %s, %d\n", DataLen, __func__, __LINE__);

    /* Heap buffers: a file-sized array on the stack can overflow it.
     * Each G.711 byte decodes to one 16-bit sample, so the output is twice as large. */
    unsigned char *ucInBuff = calloc( (size_t)DataLen + 1, 1 );
    unsigned char *ucOutBuff = calloc( 2 * (size_t)DataLen + 1, 1 );
    if (NULL == ucInBuff || NULL == ucOutBuff)
    {
        printf("out of memory\n");
        rc = 1;
    }
    else if ((Read = (int)fread( ucInBuff, 1, (size_t)DataLen, pInFile )) > 0)
    {
        Ret = G711a2PCM( (char *)ucInBuff, (char *)ucOutBuff, Read, 0 );
        //Ret = G711u2PCM( (char *)ucInBuff, (char *)ucOutBuff, Read, 0 );
        printf("Read = %d, Ret = %d, %s, %d\n", Read, Ret, __func__, __LINE__);
        if (Ret > 0)
            fwrite( ucOutBuff, 1, (size_t)Ret, pOutFile );
    }
    else
    {
        printf("fread error !\n");
        rc = 1;
    }

    free(ucInBuff);
    free(ucOutBuff);
    fclose(pInFile);
    fclose(pOutFile);
    return rc;
}
encode.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include "g711codec.h"

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        printf("==> Usage:\n\tencode [src.pcm] [dest.g711a]\n");
        //printf("==> Usage:\n\tencode [src.pcm] [dest.g711u]\n");
        return 1;
    }

    /* Query the input size first so the buffers can be sized from it. */
    struct stat s_buf;
    if (stat( argv[1], &s_buf ) != 0)
    {
        perror("stat");
        return 1;
    }
    /* st_size is an off_t, so cast it before printing. */
    printf("file_size = %lld\n", (long long)s_buf.st_size);

    FILE *pInFile = fopen(argv[1], "rb");
    FILE *pOutFile = fopen(argv[2], "wb");
    if (NULL == pInFile || NULL == pOutFile)
    {
        printf("open file failed\n");
        if (pInFile) fclose(pInFile);
        if (pOutFile) fclose(pOutFile);
        return 1;
    }

    int rc = 0;
    int Ret = 0;
    int Read = 0;
    int Len = (int)s_buf.st_size;
    printf("datalen = %d\n", Len);

    /* Heap buffers: a file-sized array on the stack can overflow it.
     * The encoded output needs only half the input size; Len + 1 is kept for simplicity. */
    unsigned char *ucInBuff = calloc( (size_t)Len + 1, 1 );
    unsigned char *ucOutBuff = calloc( (size_t)Len + 1, 1 );
    if (NULL == ucInBuff || NULL == ucOutBuff)
    {
        printf("out of memory\n");
        rc = 1;
    }
    else if ((Read = (int)fread( ucInBuff, 1, (size_t)Len, pInFile )) > 0)
    {
        Ret = PCM2G711a( (char *)ucInBuff, (char *)ucOutBuff, Read, 0 );
        //Ret = PCM2G711u( (char *)ucInBuff, (char *)ucOutBuff, Read, 0 );
        printf("Read = %d, Ret = %d, %s, %d\n", Read, Ret, __func__, __LINE__);
        if (Ret > 0)
            fwrite( ucOutBuff, 1, (size_t)Ret, pOutFile );
    }
    else
    {
        printf("fread error !\n");
        rc = 1;
    }

    free(ucInBuff);
    free(ucOutBuff);
    fclose(pInFile);
    fclose(pOutFile);
    return rc;
}
g711.c
#include <stdio.h>
#include "g711codec.h"

/*
 * function: convert PCM audio format to g711 alaw/ulaw.(zqj)
 * InAudioData: PCM data prepared for encoding to g711 alaw/ulaw.
 * OutAudioData: encoded g711 alaw/ulaw.
 * DataLen: PCM data size in bytes.
 * reserve: reserved param, no use.
 * return: number of encoded bytes, or -1 on invalid parameters.
 */
/*alaw*/
int PCM2G711a( char *InAudioData, char *OutAudioData, int DataLen, int reserve )
{
    (void)reserve;
    //check params: any single invalid argument is an error.
    if( (NULL == InAudioData) || (NULL == OutAudioData) || (DataLen <= 0) )
    {
        printf("Error, empty data or transmit failed, exit !\n");
        return -1;
    }
    printf("DataLen = %d, %s, %d\n", DataLen, __func__, __LINE__);

    int Retaen = 0;
    printf("G711a encode start......\n");
    //DataLen is in bytes; each PCM sample is 2 bytes.
    Retaen = g711a_encode( (unsigned char *)OutAudioData, (short*)InAudioData, DataLen/2 );
    printf("Retaen = %d, %s, %d\n", Retaen, __func__, __LINE__);
    return Retaen; //index successfully encoded data len.
}

/*ulaw*/
int PCM2G711u( char *InAudioData, char *OutAudioData, int DataLen, int reserve )
{
    (void)reserve;
    //check params: any single invalid argument is an error.
    if( (NULL == InAudioData) || (NULL == OutAudioData) || (DataLen <= 0) )
    {
        printf("Error, empty data or transmit failed, exit !\n");
        return -1;
    }
    printf("DataLen = %d, %s, %d\n", DataLen, __func__, __LINE__);

    int Retuen = 0;
    printf("G711u encode start......\n");
    //DataLen is in bytes; each PCM sample is 2 bytes.
    Retuen = g711u_encode( (unsigned char *)OutAudioData, (short*)InAudioData, DataLen/2 );
    printf("Retuen = %d, %s, %d\n", Retuen, __func__, __LINE__);
    return Retuen;
}

/*
 * function: convert g711 alaw/ulaw audio format to PCM.(zqj)
 * InAudioData: g711 alaw/ulaw data prepared for decoding to PCM.
 * OutAudioData: decoded PCM audio data.
 * DataLen: g711 data size in bytes.
 * reserve: reserved param, no use.
 * return: number of decoded PCM bytes, or -1 on invalid parameters.
 */
/*alaw*/
int G711a2PCM( char *InAudioData, char *OutAudioData, int DataLen, int reserve )
{
    (void)reserve;
    //check params: any single invalid argument is an error.
    if( (NULL == InAudioData) || (NULL == OutAudioData) || (DataLen <= 0) )
    {
        printf("Error, empty data or transmit failed, exit !\n");
        return -1;
    }
    printf("DataLen = %d, %s, %d\n", DataLen, __func__, __LINE__);

    int Retade = 0;
    printf("G711a decode start......\n");
    Retade = g711a_decode( (short*)OutAudioData, (unsigned char *)InAudioData, DataLen );
    printf("Retade = %d, %s, %d\n", Retade, __func__, __LINE__);
    return Retade; //index successfully decoded data len.
}

/*ulaw*/
int G711u2PCM( char *InAudioData, char *OutAudioData, int DataLen, int reserve )
{
    (void)reserve;
    //check params: any single invalid argument is an error.
    if( (NULL == InAudioData) || (NULL == OutAudioData) || (DataLen <= 0) )
    {
        printf("Error, empty data or transmit failed, exit !\n");
        return -1;
    }
    printf("DataLen = %d, %s, %d\n", DataLen, __func__, __LINE__);

    int Retude = 0;
    printf("G711u decode start......\n");
    Retude = g711u_decode( (short*)OutAudioData, (unsigned char *)InAudioData, DataLen );
    printf("Retude = %d, %s, %d\n", Retude, __func__, __LINE__);
    return Retude;
}