Principles of Speech Coding Based on Linear Prediction

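{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Early audio systems were built on analog signals, so recording, editing, and playback could each introduce noise and distort the signal. As information technology advanced, digital signal processing spread to more and more fields. Digital signals are easy to store and to transmit over long distances, do not accumulate distortion, and resist interference well, so both signals and signal processing have moved toward digital. For digital audio to be compressed and stored efficiently and then restored at high quality, audio coding becomes a critical component. This article explains the principles behind linear-prediction-based speech codecs, one of the two main families of today's audio codecs (traditional algorithms, not deep learning).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#01 Audio codec categories and overview","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Popular audio codecs based on traditional algorithms fall into two broad categories:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Audio Codec: aac, mp3, ogg, celt (inside of opus) ...","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Speech Codec: ilbc, isac, silk (inside of opus) ...","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The two types rest on entirely different coding principles. An Audio Codec exploits properties of the human auditory perception system and can encode complex signals with many sound sources at high quality. A Speech Codec is based on a speech production model and can encode a single sound source (a person, or the sound-producing unit of some instruments) efficiently at much lower bitrates. If Audio Codecs can already encode complex, multi-source signals at high quality, why study and develop Speech Codecs at all? Because the application requirements are completely different. Audio Codecs are mostly used for music, where the goal is to compress sound while keeping perceptual distortion as small as possible. Early mp3 needed a fairly high bitrate to reach high quality, so its compression ratio was modest. Early digital telecom systems, in contrast, had limited bandwidth, and reproducing intelligible speech with as little bandwidth as possible became the job of the Speech 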
Codec, which mostly targets low-bitrate coding at 8 kHz and 16 kHz sampling rates.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#02 The speech production model and its properties","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To design a codec specifically for speech, we first need to study some properties of speech.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"The human speech production model","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c6/c6e6b7f571f9940ad72e4acde81e60c3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Broadly, the human speech production model has three parts:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The lungs and trachea, which produce the airflow source","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The larynx and vocal cords, which form the glottis","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The pharynx, oral cavity, nasal cavity, and so on, which form the vocal tract","attrs":{}}]}]}],"at
trs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Speech production can be described as follows: the lungs compress air into a high-pressure flow that passes through the trachea and into the larynx. The larynx drives its cartilage and muscle tissue (most importantly the vocal cords) through complex movements, so that the vocal cords close or separate under control and produce the excitation of the sound. The excitation then resonates in the pharynx, oral cavity, and nasal cavity to form the final sound.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"2. General classification of speech signals","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a person produces different sounds, the excitation and the state of the vocal tract differ completely. Sounds fall into two basic types. Voiced sounds: as the airflow passes the vocal cords, the cords are tense and undergo relaxation oscillation, that is, they open and close periodically. The airflow leaves the vocal cords as a train of pulses, which the resonances of the vocal tract then shape into a voiced sound. A typical voiced waveform is shown below:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/00/0061aa5560e26da7bbc329afecdac073.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Strongly periodic","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unvoiced sounds: as the airflow passes the vocal cords, the cords are relaxed. If the vocal tract then narrows, the air is forced through at high speed and produces a fricative or unvoiced sound. If the vocal tract is completely closed at some point, the airflow is blocked there, pressure builds up, and the sudden release of the closure produces a plosive. A typical unvoiced waveform is shown below:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/53/53790addce85158b183d3ce2bd77adbc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origi
n":null},"content":[{"type":"text","text":"No obvious periodicity","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":3,"normalizeStart":3},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Properties and model of speech signals","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Video coding exploits temporal and spatial redundancy. Does speech contain similarly obvious redundancy? Voiced sounds, at least, are strongly periodic, so there should be clear redundancy along the time axis. How can it be compressed, and do unvoiced sounds have similar redundancy? To answer this we go back to basics and model the human speech production system. Note that voiced and unvoiced sounds come from entirely different excitations, each then shaped by the resonance of the vocal tract.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Excitation for voiced sounds is approximately: a pulse train","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Excitation for unvoiced sounds is approximately: white noise","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The whole speech production process can therefore be described as one system, called the Speech Source-Filter 
Model, shown in the figure below:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/59/599ec0b337e3661009de54a7347922d8.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The model shows that the excitation is either a pulse train or white noise. Both have a fairly flat spectrum and carry almost no information themselves. The complex information in speech comes mainly from the time-varying shaping by the vocal tract, and the speech signal is short-time stationary. This suggests a simple idea for a speech codec:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Analyze each short segment of speech. Because the complexity of speech comes mostly from the vocal tract, model the vocal tract for that segment, then encode the simple excitation signal together with the vocal tract model. The decoder passes the excitation through the vocal tract model again and so makes the speech “speak” once more.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Basic research on human speech physiology found that the transfer function of the vocal tract follows an autoregressive moving average (ARMA) model. An ARMA model implies intrinsic correlation, meaning the future can be predicted from the past, and this provides the theoretical basis for coding the temporal redundancy of speech discussed above. Overall, speech is produced by cascading an excitation model G(z), a radiation model R(z), and a vocal tract model V(z), which together also fit an ARMA model. To simplify computation, the model is approximated by an all-pole AR(p) process:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/21/2176c64c9d51321ff666e70a8d1de659.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main advantage of such a simple model is that the gain G and the filter coefficients { ai } can be computed directly and efficiently by linear prediction analysis.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#03 
LPC: linear prediction","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Linear predictive coding (LPC) is a tool used mainly in audio and speech signal processing to represent the spectral envelope of a digital speech signal in compressed form, using the information of a linear prediction model. It is one of the most effective speech analysis techniques and one of the most useful methods for coding good-quality speech at low bitrates, and it gives accurate estimates of speech parameters. The basic idea of linear prediction is that the current value of a speech sample can be approximated by a linearly weighted combination of several past samples. The relationship between the speech samples s(n) and the excitation u(n) can be written as the following simple difference equation:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2e/2e1760b7a2558ee3dae7445fb4c47362.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Order-p linear prediction predicts the current sample s(n) from a weighted sum of the past p samples:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e3/e390f8ab2268ac35c6e217d593774bb7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The prediction error is defined as:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6a/6a332b3acaf0aed9fe199595e8bcbe19.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Its system function is:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/45/4537e81300a3ddea5082be8576a8f266.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,
"align":null,"origin":null},"content":[{"type":"text","text":"A(z) and H(z) are then related as follows:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e2/e2ca027f334bee32e904ba668370de70.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Linear prediction now reduces to finding a set of prediction coefficients { ai }. In the expression for e(n), however, only the sequence s(n) is known, while { ai } and e(n) are both unknown. The equation is underdetermined, so its solution is not unique, and we only need to find a “good” one. Since we want the prediction error to be as small as possible, minimizing the mean squared prediction error is a natural criterion:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9f/9f15d50e9b81cbadc425830e18a858cc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Taking the partial derivative of this error with respect to each ai and working through the algebra gives the well-known Yule-Walker equations:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/21/215f7203e3d5b6a6d28a0a01f867778b.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here R is the autocorrelation function, R(j-i)=E[s(n-i)s(n-j)] (i, j = 1, 2, 3, ..., p). The matrix above is not only symmetric; the elements along every diagonal parallel to the main diagonal are also equal. Such a matrix is called a Toeplitz matrix. The system has p+1 unknowns (the p prediction coefficients a1...ap and the minimum mean squared error Ep), while R0 through Rp 
are all known, so it can be solved. The gain can likewise be obtained by using the fact that the excitation is white noise with unit variance and zero autocorrelation at all nonzero lags:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d0/d0780ffcb485a44c8aab612126ad70ea.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"PS: Even for voiced sounds, a pulse train is very small most of the time, so approximating u(n) with the minimized prediction error e(n) does not conflict with u(n) having low energy.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All parameters of the all-pole model in equation (1) have now been solved.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#04 The Levinson-Durbin algorithm and lattice filters","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Even though the all-pole model parameters can be computed, practical use, especially for coding and transmission, still has two major pain points:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solving the equations requires a matrix inversion, which is computationally expensive;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The AR model is built around the FIR prediction filter, and a direct-form realization, especially at high order, is extremely sensitive to quantization error in the filter coefficients.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How can these be addressed? First, let us look at how to solve for the prediction parameters quickly.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"
text","text":"The Levinson-Durbin algorithm","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Toeplitz structure of the matrix in the Yule-Walker equations allows an efficient recursive algorithm. The derivation is omitted here and is widely available in the literature. The algorithm works recursively upward from the lowest order, so it yields not only the order-p prediction coefficients but also the coefficients of every lower order. The procedure is:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9c/9c5f7a46912d08c338c8ad3f46845996.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here ki is called the reflection coefficient. The system H(z) is guaranteed to be stable only if every |ki| < 1, and the sets { ki } and { ai } are in one-to-one correspondence.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Lattice filters","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With ki as an intermediate variable, the lower-order and higher-order coefficients of the prediction filter are linked:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/43/432b0f19b735c6fbb2515547ae493bb6.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So, since the reflection coefficients 
can stand in for the prediction coefficients, what does the system look like when it is expressed entirely in reflection coefficients?","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4e/4e829c60b2599dc2e2ac142fc4a20247.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The time-domain difference equation of the original FIR filter is:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here the order-i error is the error made when predicting s[n] from the previous i samples, so it is called the forward prediction error.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/85/852bb31b828a2a05a955c58fa6333dc2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Substituting equation (10) and working through the derivation gives the following time-domain interpretation:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here the error is instead obtained by predicting sample s[n-i] from the i samples that follow it, so it is called the backward prediction error.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Rewriting the forward and backward expressions with the prediction coefficients { ai } replaced by the reflection coefficients ki 
gives:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/53/531f2df548f7f317c703e3c2f9b5ae9f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Using the recursions (13) and (14), and given s[n], the order-p prediction error filter can be expressed in terms of ki as shown below, together with the forward and backward errors of each order:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/22/228019773a37da3187900ad5539d67b9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The all-pole model of speech production is shown below:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/39/390046a64a87d5abea362bd21125aa0e.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Drawback: each output sample costs twice the computation of an ordinary direct-form filter.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Advantage: as long as the reflection coefficients satisfy |ki| < 1, the system cannot become unstable.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"High-order direct-form filters in particular are very sensitive to coefficient quantization, and a slightly large quantization error can make the system unstable. When the coefficients must be coarsely quantized, the lattice filter therefore remains the best implementation choice.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#05 
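Effect of the all-pole model order in LPC","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All-pole modeling in linear predictive coding offers a way to obtain a high-resolution spectrum from truncated or windowed data. This rests on one premise: if the signal data and the model fit each other, the model parameters, and hence the spectrum, can be determined from a finite-length segment of the data.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Does a higher order always make the LPC all-pole model more accurate?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If s[n] was itself generated by an all-pole system, then, provided the estimation method is chosen well, using an order higher than that of the generating system brings little benefit. Even when s[n] cannot be modeled exactly by an all-pole system, there is usually a value of p above which further increases have little or no effect on the prediction error. This threshold indicates an effective choice of order for the all-pole model.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If p is lowered below that effective order, how much error does the model incur, and where does the error show up? The figure below compares the spectral envelope reconstructed by all-pole linear prediction at different values of p with the spectrum of the original signal:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/20/201cc4539b61dc11f034f30591174c2f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure shows, the lower the order, the smoother the spectral envelope of the reconstructed signal (see p=12 in the figure), while a higher order (p=40) captures more detail. The bitrate of an LPC-based codec necessarily constrains the model order and the quantization precision, which is why sound reconstructed at low bitrates cannot reproduce fine detail.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"#06 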
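Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article covered the main working principles of linear-prediction-based codecs and explained the design rationale, characteristics, and drawbacks of each part, so that readers can form a basic picture of how the core of the codec is divided up and where each piece comes from. Many derivations were omitted and can be found in standard books and references. Note also that this is only the core principle of an LPC-based codec. A practical codec may include many other modules, such as entropy coding and modules that counter quantization noise.","attrs":{}}]}]}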
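
The Levinson-Durbin recursion described in section #04 can be sketched in a few lines of Python. This is a minimal illustration rather than the article's own implementation: the function name `levinson_durbin`, the sign convention (the prediction is a positive weighted sum of past samples), and the sample autocorrelation values are all assumptions made for the example. It assumes r[0] > 0 and a valid autocorrelation sequence.

```python
def levinson_durbin(r, p):
    """Solve the order-p Yule-Walker equations from autocorrelations r[0..p].

    Assumed convention (hypothetical, for illustration):
        s_hat(n) = sum_{i=1..p} a_i * s(n - i)
    Returns (a, k, err): prediction coefficients a_1..a_p, reflection
    coefficients k_1..k_p, and the final prediction error energy.
    """
    if len(r) < p + 1:
        raise ValueError("need autocorrelation lags 0..p")
    a = []        # coefficients of the current order, a[j] holds a_{j+1}
    k = []        # reflection coefficients found so far
    err = r[0]    # order-0 prediction error energy
    for i in range(1, p + 1):
        # Part of r[i] that the order-(i-1) predictor does not explain.
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        ki = acc / err
        # Order update: a_j(i) = a_j(i-1) - k_i * a_{i-j}(i-1), and a_i(i) = k_i.
        a = [a[j] - ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
        k.append(ki)
        # The error energy shrinks by (1 - k_i^2) at every order.
        err *= 1.0 - ki * ki
    return a, k, err


if __name__ == "__main__":
    # Made-up autocorrelation lags, chosen only to exercise the recursion.
    coeffs, refl, energy = levinson_durbin([1.0, 0.8, 0.5], 2)
    print("a =", coeffs)
    print("k =", refl)
    print("E =", energy)
```

Each order costs work proportional to its index, so the full solve is O(p^2) instead of the O(p^3) of a general matrix inversion, and the reflection coefficients fall out as a by-product, ready for use in a lattice structure.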