Feature Parameter Extraction

1. Mathematical Model of the Speech Signal

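Based on the mechanism of speech production, the speech production system can be divided into three subsystems. The part below the glottis (the vocal cords) is the "glottal subsystem"; it produces the excitation vibration and therefore acts as the "excitation subsystem". The airway from the glottis to the lips is the vocal tract, which forms the "vocal tract system". Speech is radiated outward from the lips, so the region beyond the lips is the "radiation system". Human speech production is a complex process.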

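The complete mathematical model of the speech production system can be written as H(z)=G(z)V(z)R(z), where G(z), V(z) and R(z) are the transfer functions of the excitation, vocal tract and radiation subsystems respectively.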
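To make the cascade model concrete, here is a small Python sketch. The three sections below are hypothetical: the coefficient values, the 500 Hz resonance and the 8 kHz sampling rate are assumptions chosen only for illustration, not values from this article. It checks that passing an impulse through the three filters in series gives the same frequency response as multiplying the three transfer functions.

```python
import cmath
import math

def iir_filter(b, a, x):
    """Apply a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y

def freq_response(b, a, omega):
    """Evaluate B(z)/A(z) on the unit circle, z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    num = sum(bk * z_inv ** k for k, bk in enumerate(b))
    den = sum(ak * z_inv ** k for k, ak in enumerate(a))
    return num / den

FS = 8000.0                                  # assumed sampling rate in Hz
theta = 2 * math.pi * 500.0 / FS             # assumed resonance at 500 Hz

# Hypothetical (numerator, denominator) pairs in powers of z^-1.
G = ([1.0], [1.0, -1.6, 0.64])               # excitation: double real pole at 0.8
V = ([1.0], [1.0, -2 * 0.95 * math.cos(theta), 0.95 ** 2])  # one vocal tract resonance
R = ([1.0, -0.97], [1.0])                    # radiation: first-order high-pass

def cascade_response_product(omega):
    """H = G * V * R evaluated at one frequency."""
    h = 1.0
    for b, a in (G, V, R):
        h *= freq_response(b, a, omega)
    return h

def cascade_response_time_domain(omega, n_samples=1024):
    """Filter an impulse through G, V, R in series, then take one DTFT sample."""
    signal = [1.0] + [0.0] * (n_samples - 1)
    for b, a in (G, V, R):
        signal = iir_filter(b, a, signal)
    return sum(s * cmath.exp(-1j * omega * n) for n, s in enumerate(signal))
```

Both functions return a complex value for a given omega (radians per sample), and the two agree up to rounding because all poles of the assumed sections lie well inside the unit circle.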

 

2. Feature Parameter Extraction

2.1 Linear Prediction Coefficients, LPC (Linear Predictor Coefficient)

The routine for extracting LPC features is as follows:

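function b = addwin(f)
% Window each frame and extract its LPC cepstrum parameters.
% f is the matrix of frames, one frame per row: f = enframe(py, win, inc)
% Frames are assumed to be 256 samples long, matching hamming(256).
b = [];
for i = 1 : size(f, 1)
    y = f(i, :);              % all samples of the i-th frame (row vector)
    p = y .* hamming(256)';   % apply the window; transpose so both are row vectors
    c = lpc(p, 12);           % LPC coefficients; order 12 is an assumed value, the original call gave none
    d = cceps(c);             % LPC cepstrum
    b = [b, d(2 : 13)];       % keep elements 2 to 13, 12 values per frame
end
return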
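For readers without the MATLAB toolbox, the following Python sketch shows one common way to obtain LPC coefficients (the autocorrelation method with the Levinson-Durbin recursion) and to convert them to cepstral coefficients with the standard recursion for the all-pole model 1/A(z). This is not necessarily what lpc and cceps do internally; the function names, the model order and the synthetic test frame are assumptions made for the example.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum_i x[i] * x[i + lag] for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err) where a[0] == 1 and A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                       # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                   # updated prediction error
    return a, err

def lpc_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        a_n = a[n] if n <= p else 0.0
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -a_n - acc / n
    return c[1:]

def all_pole_impulse_response(a, n_samples):
    """Synthetic test frame: impulse response of 1/A(z)."""
    h = []
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0
        acc -= sum(a[k] * h[n - k] for k in range(1, len(a)) if n - k >= 0)
        h.append(acc)
    return h

# Assumed second-order model used only to exercise the functions.
true_a = [1.0, -1.2, 0.72]
frame = all_pole_impulse_response(true_a, 400)
est_a, err = levinson_durbin(autocorrelation(frame, 2), 2)
ceps = lpc_cepstrum(est_a, 12)               # 12 cepstral values, as in the listing
```

On this synthetic frame the estimated coefficients match the model that generated it, which is a quick sanity check before trying real speech frames.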

 

2.2 Mel-Frequency Cepstral Coefficients, MFCC (Mel Frequency Cepstrum Coefficients)

Code for extracting MFCC features:

function ccc = mfcc(x)
% Normalized mel filter bank coefficients
bank = melbankm(24, 256, 8000, 0, 0.5, 'm');
bank = full(bank);
bank = bank / max(bank(:));
% DCT coefficients, 12 * 24
for k = 1 : 12
    n = 0 : 23;
    dctcoef(k, :) = cos((2 * n + 1) * k * pi / (2 * 24));
end
% Normalized cepstral liftering window
w = 1 + 6 * sin(pi * [1 : 12] ./ 12);
w = w / max(w);
% Pre-emphasis filter: y(n) = x(n) - 0.9375 * x(n - 1)
xx = double(x);
xx = filter([1 -0.9375], 1, xx);
% Split the speech signal into frames
xx = enframe(xx, 256, 80);
% Compute the MFCC parameters of each frame
for i = 1 : size(xx, 1)
    y = xx(i, :);
    s = y' .* hamming(256);
    t = abs(fft(s));
    t = t .^ 2;
    c1 = dctcoef * log(bank * t(1 : 129));
    c2 = c1 .* w';
    m(i, :) = c2';
end
% Delta (first-order difference) coefficients
dtm = zeros(size(m));
for i = 3 : size(m, 1) - 2
    dtm(i, :) = -2 * m(i - 2, :) - m(i - 1, :) + m(i + 1, :) + 2 * m(i + 2, :);
end
dtm = dtm / 3;
% Concatenate the MFCC parameters and their first-order deltas
ccc = [m dtm];
% Drop the first two and last two frames, whose first-order deltas are 0
ccc = ccc(3 : size(m, 1) - 2, :);
return
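The steps that do not depend on the mel filter bank (pre-emphasis, the DCT matrix, the liftering window and the delta computation) can be sketched in plain Python as below. The constants are the ones used in the listing; the function names are made up for this example, and the filter bank itself is left out because it comes from an external toolbox.

```python
import math

def pre_emphasis(x, alpha=0.9375):
    """y[n] = x[n] - alpha * x[n-1], i.e. the FIR filter [1, -alpha]."""
    return [x[n] - (alpha * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def dct_matrix(n_ceps=12, n_filters=24):
    """Rows k = 1..n_ceps of cos((2n + 1) * k * pi / (2 * n_filters))."""
    return [[math.cos((2 * n + 1) * k * math.pi / (2 * n_filters))
             for n in range(n_filters)]
            for k in range(1, n_ceps + 1)]

def lifter_window(n_ceps=12):
    """1 + 6*sin(pi*k/n_ceps) for k = 1..n_ceps, scaled so the peak is 1."""
    w = [1 + 6 * math.sin(math.pi * k / n_ceps) for k in range(1, n_ceps + 1)]
    peak = max(w)
    return [v / peak for v in w]

def delta(m):
    """First-order difference over +/-2 frames, scaled by 1/3 as in the listing.

    The first two and last two frames are left at zero.
    """
    n_frames, n_ceps = len(m), len(m[0])
    d = [[0.0] * n_ceps for _ in range(n_frames)]
    for i in range(2, n_frames - 2):
        for j in range(n_ceps):
            d[i][j] = (-2 * m[i - 2][j] - m[i - 1][j]
                       + m[i + 1][j] + 2 * m[i + 2][j]) / 3
    return d
```

Two things are worth noticing. Because the k = 0 row is omitted, every row of the DCT matrix sums to zero, so a constant offset in the log filter bank energies does not change the 12 coefficients. And with the 1/3 scaling, a feature that rises by 1 per frame gets a delta of 10/3; a regression-slope normalization would divide by 10 instead.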

 

The code for enframe [2] is as follows:

 

function f=enframe(x,win,inc)
%ENFRAME split signal up into (overlapping) frames: one per row. F=(X,WIN,INC)
%   F = ENFRAME(X,LEN) splits the vector X up into
%   frames. Each frame is of length LEN and occupies
%   one row of the output matrix. The last few frames of X
%   will be ignored if its length is not divisible by LEN.
%   It is an error if X is shorter than LEN.
%   F = ENFRAME(X,LEN,INC) has frames beginning at increments of INC
%   The centre of frame I is X((I-1)*INC+(LEN+1)/2) for I=1,2,...
%   The number of frames is fix((length(X)-LEN+INC)/INC)
%   F = ENFRAME(X,WINDOW) or ENFRAME(X,WINDOW,INC) multiplies
%   each frame by WINDOW(:)
nx=length(x);
nwin=length(win);
if (nwin == 1)
   len = win;
else
   len = nwin;
end
if (nargin < 3)
   inc = len;
end
nf = fix((nx-len+inc)/inc);
f=zeros(nf,len);
indf= inc*(0:(nf-1)).';
inds = (1:len);
f(:) = x(indf(:,ones(1,len))+inds(ones(nf,1),:));
if (nwin > 1)
    w = win(:)';
    f = f .* w(ones(nf,1),:);
end
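The indexing trick in the function above is compact but hard to read. A rough Python equivalent of the framing step alone is sketched here; it only covers the case where the second argument is a scalar frame length, and it leaves out the window multiplication.

```python
def enframe(x, frame_len, inc=None):
    """Split x into frames of frame_len samples, one frame per list entry.

    Frames start every inc samples (inc defaults to frame_len, no overlap).
    Trailing samples that do not fill a whole frame are dropped.
    """
    if inc is None:
        inc = frame_len
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = (len(x) - frame_len + inc) // inc
    return [list(x[i * inc:i * inc + frame_len]) for i in range(n_frames)]

# Small example: 10 samples, frames of 4 with a hop of 2.
frames = enframe(list(range(10)), 4, 2)
# frames == [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

With the settings used in the MFCC code (frame length 256, increment 80), one second of speech at 8 kHz, i.e. 8000 samples, gives (8000 - 256 + 80) // 80 = 97 frames.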

 

Short-time energy function with a rectangular window:
a=wavread('F:/WO.wav'); % on newer MATLAB releases use audioread instead of wavread
subplot(6,1,1),plot(a);
N=32;
for i=2:6
L=N*2^(i-2); % window length: 32, 64, 128, 256, 512
h=ones(1,L); % rectangular window of length L
En=conv(h,a.*a); % convolve with the squared signal to get the short-time energy function En
subplot(6,1,i),plot(En);
legend(sprintf('N=%d',L));
end
Short-time energy function with a Hamming window:
replace h=ones(1,L);
with h=hamming(L);
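The same computation can be written out in a few lines of Python, which makes it clear that the short-time energy is just the squared signal convolved with the window. The helper names are made up for this sketch, and the Hamming formula used is the usual symmetric one, 0.54 - 0.46*cos(2*pi*k/(n-1)).

```python
import math

def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def convolve(h, x):
    """Full linear convolution, output length len(h) + len(x) - 1."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            y[i + j] += hv * xv
    return y

def short_time_energy(x, window):
    """Convolve the window with the squared signal."""
    return convolve(window, [v * v for v in x])

# Rectangular window of length 2 on a tiny signal.
energy_rect = short_time_energy([1.0, 2.0, 3.0], [1.0, 1.0])
# energy_rect == [1.0, 5.0, 13.0, 9.0]

# Same signal with a Hamming window of length 3.
energy_hamm = short_time_energy([1.0, 2.0, 3.0], hamming(3))
```

A longer window sums more samples, so the energy curve gets smoother and loses short-term detail; that is the effect the five subplots are meant to show.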

Short-time average magnitude with a rectangular window:
a=wavread('F:/WO.wav');
subplot(6,1,1),plot(a);
N=32;
for i=2:6
L=N*2^(i-2); % window length: 32, 64, 128, 256, 512
h=ones(1,L); % rectangular window of length L
Mn=conv(h,abs(a)); % convolve with the absolute value to get the short-time average magnitude Mn
subplot(6,1,i),plot(Mn);
legend(sprintf('N=%d',L));
end

Short-time zero-crossing rate:
a=wavread('F:/WO.wav');
n=length(a);
N=320;
subplot(3,1,1),plot(a);
h=linspace(1,1,N); % rectangular window of length N
En=conv(h,a.*a); % convolve to get the short-time energy function En
subplot(3,1,2),plot(En);

b=ones(size(a));
b(a<0)=-1; % sign of each sample: 1 for a>=0, -1 otherwise
w=abs(diff(b)); % absolute sign difference of adjacent samples: 2 at a crossing, 0 elsewhere
k=1;
j=0;

while (k+N-1)<n
Zm(k)=0;
for i=0:N-1
Zm(k)=Zm(k)+w(k+i);
end
j=j+1;
k=k+160; % advance by half a window each time
end
for m=1:j
Q(m)=Zm(160*(m-1)+1)/640; % short-time average zero-crossing rate: divide by 2*N
end
subplot(3,1,3),plot(Q);
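Stripped of the plotting, the zero-crossing computation comes down to the short Python sketch below. It follows the same conventions as the script (a sample counts as positive when it is >= 0, and each frame's count is divided by the frame length), but the function name and the tiny test signals are assumptions made for the example.

```python
def zero_crossing_rate(x, frame_len, hop):
    """Fraction of adjacent-sample sign changes in each frame.

    x         : list of samples
    frame_len : number of adjacent-sample pairs per frame
    hop       : shift between frame starts (frame_len // 2 for half overlap)
    """
    signs = [1 if v >= 0 else -1 for v in x]
    # |s[i+1] - s[i]| is 2 at a crossing and 0 otherwise, so halve it to count.
    crossings = [abs(signs[i + 1] - signs[i]) // 2 for i in range(len(signs) - 1)]
    rates = []
    start = 0
    while start + frame_len <= len(crossings):
        rates.append(sum(crossings[start:start + frame_len]) / frame_len)
        start += hop
    return rates

# A signal that flips sign at every sample crosses zero at every step.
alternating = zero_crossing_rate([1, -1, 1, -1, 1, -1, 1, -1, 1], 4, 2)
# alternating == [1.0, 1.0, 1.0]

# A signal that flips sign every two samples crosses half as often.
slower = zero_crossing_rate([1, 1, -1, -1, 1, 1, -1, -1, 1], 4, 2)
# slower == [0.5, 0.5, 0.5]
```

Calling zero_crossing_rate(samples, 320, 160) corresponds to the window length and half-window shift used in the script.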
