Video Coding: Selected Definitions

Definitions:

GOP (Group of Pictures)

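The GOP strategy affects coding quality. A GOP is a group of pictures, that is, a set of consecutive pictures. MPEG coding divides pictures (frames) into three types: I, P, and B. An I-frame is intra coded, a P-frame is forward predicted, and a B-frame is bidirectionally interpolated. Put simply, an I-frame is a complete picture, while P-frames and B-frames record only the changes relative to their reference frames, which ultimately trace back to an I-frame. Without the I-frame, the P-frames and B-frames cannot be decoded. This is why MPEG formats are hard to cut precisely, and why the head and tail of a clip need fine adjustment when editing. A longer GOP has fewer I-frames and a larger share of inter-coded (P and B) frames, which generally improves rate-distortion performance.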

In video coding, a group of pictures specifies the order in which intra- and inter-frames are arranged.

The GOP is a group of successive pictures within a coded video stream. Each coded video stream consists of successive GOPs. From the pictures contained in it, the visible frames are generated.

A GOP can contain the following picture types:

- I-picture or I-frame (intra coded picture): a reference picture that is coded as a complete still image, independently of other pictures. Each GOP begins with this type of picture.

- P-picture or P-frame (predictive coded picture): contains motion-compensated difference information from the preceding I- or P-frame.

- B-picture or B-frame (bidirectionally predictive coded picture): contains difference information from the preceding and following I- or P-frame within a GOP.

- D-picture or D-frame (DC direct coded picture): used for fast-forward preview.

A GOP always begins with an I-frame. P-frames then follow, spaced a few frames apart, and B-frames fill the gaps between them. A few video codecs allow more than one I-frame in a GOP.

I-frames contain the full image and need no additional information to reconstruct it, so errors that accumulate within a GOP stop propagating at the next I-frame. B-frames within a GOP propagate errors only in H.264, where B-frames can be referenced by other pictures in order to increase compression efficiency.

The more I-frames the video stream has, the more editable it is. However, having more I-frames increases the stream size. In order to save bandwidth and disk space, videos prepared for internet broadcast often have only one I-frame per GOP.

The GOP structure is often described by two numbers, for example M=3, N=12. The first gives the distance between two anchor frames (I or P). The second gives the distance between two full images (I-frames): this is the GOP length. For the example M=3, N=12, the GOP structure is IBBPBBPBBPBBI, where the final I is the first frame of the next GOP.
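The M/N notation can be sketched in a few lines of Python. This is only an illustration of the regular pattern described above, in display order; the function name gop_pattern is my own, and real encoders may place frames adaptively.

```python
def gop_pattern(m: int, n: int) -> str:
    """Return the frame types of one GOP in display order.

    m: distance between anchor frames (I or P)
    n: GOP length, i.e. distance between two consecutive I-frames
    """
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    frames = []
    for i in range(n):
        if i == 0:
            frames.append("I")   # every GOP starts with an I-frame
        elif i % m == 0:
            frames.append("P")   # anchor frame every m positions
        else:
            frames.append("B")   # B-frames fill the gaps
    return "".join(frames)


pattern = gop_pattern(3, 12)
print(pattern)        # IBBPBBPBBPBB
print(pattern + "I")  # IBBPBBPBBPBBI (with the I-frame that opens the next GOP)
```

With M=1 the same function yields an I-frame followed only by P-frames, which is the usual low-delay arrangement.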

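QP (quantization parameter)

Surprisingly, Wikipedia has no explanation of this term, at least not yet, so I looked it up elsewhere. The explanation is as follows.

In the H.264 codec, the quantization parameter QP and the quantization step size Qstep are related as follows:

Qstep takes 52 values in total (for luma coding).

QP is the index of Qstep and ranges from 0 to 51.

QP = 0, the minimum, gives the finest quantization; QP = 51, the maximum, gives the coarsest.

Qstep grows with QP: every increase of 6 in QP doubles Qstep.

For chroma coding, the maximum QP is 39.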
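The doubling rule for luma can be written out as a small lookup. The six base values below (for QP 0 to 5) are the ones commonly tabulated for H.264 and are not taken from the text above, so treat them as an assumption; the helper name qstep is also my own.

```python
# Qstep for QP 0..5 as commonly tabulated for H.264 luma (assumed values).
BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp: int) -> float:
    """Map a luma QP (0..51) to its quantization step size."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # Every 6 QP steps double Qstep, so only the first 6 values are stored.
    return BASE_QSTEP[qp % 6] * 2 ** (qp // 6)


for qp in (22, 27, 32, 37):
    print(qp, qstep(qp))
```

Under these base values, the step size grows from 0.625 at QP 0 to 224 at QP 51, which is why higher QP settings give visibly coarser results.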

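In my depth-video experiments I used QP values of 22, 27, 32, and 37. As expected, the result at QP 22 was the sharpest and the one at QP 37 the blurriest.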

Bit Rate

In telecommunications and computing, bitrate (sometimes written bit rate, data rate or as a variable R[1]) is the number of bits that are conveyed or processed per unit of time.

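Bit rate is the number of data bits transmitted per unit of time; the unit we usually use is kbps, kilobits per second. A loose way to think of it is as a sampling rate: the more bits spent per unit of time, the higher the precision and the closer the processed file is to the original. In other words, the picture keeps more detail, but the compression ratio is lower.

Bit rate x time = total size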

Multimedia encoding

In digital multimedia, bit rate often refers to the number of bits used per unit of playback time to represent a continuous medium such as audio or video after source coding (data compression). The encoding bit rate of a multimedia file is the size of the file in bytes divided by the playback time of the recording (in seconds), multiplied by eight.
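As a quick worked example of the two relations above (size = bit rate x time, and bit rate = bytes x 8 / seconds), here is a minimal sketch. The function names and the 45 MB, 60 second clip are made up for illustration.

```python
def encoding_bitrate_bps(size_bytes: int, duration_s: float) -> float:
    """Average encoding bit rate in bits per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_s


def total_size_bytes(bitrate_bps: float, duration_s: float) -> float:
    """Total size in bytes of a stream at a given average bit rate."""
    return bitrate_bps * duration_s / 8


# Hypothetical clip: 45,000,000 bytes lasting 60 seconds.
rate = encoding_bitrate_bps(45_000_000, 60)
print(rate / 1000, "kbps")         # 6000.0 kbps
print(total_size_bytes(rate, 60))  # 45000000.0
```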

For real-time streaming multimedia, the encoding bit rate is the goodput required to avoid interruption:

Encoding bit rate = Required goodput

The term average bitrate is used in case of variable bitrate multimedia source coding schemes. In this context, the peak bit rate is the maximum number of bits required for any short-term block of compressed data.[12]

A theoretical lower bound for the encoding bit rate for lossless data compression is the source information rate, also known as the entropy rate.

Entropy rate ≤ Multimedia bit rate
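To get a rough feel for this bound, one can estimate the entropy of a byte stream from its symbol frequencies. The sketch below is only a zeroth-order estimate: it assumes the bytes are independent and identically distributed, which real audio and video are not, so it is not the true entropy rate of the source. The function name is my own.

```python
import math
from collections import Counter


def empirical_entropy_bits(data: bytes) -> float:
    """Zeroth-order entropy estimate in bits per byte (assumes i.i.d. bytes)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


print(empirical_entropy_bits(b"abab"))            # two equally likely symbols
print(empirical_entropy_bits(bytes(range(256))))  # all 256 byte values, once each
```

Multiplying the bits-per-symbol figure by the number of symbols per second gives a bit rate that, under the same independence assumption, a lossless coder cannot beat on average.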

PSNR (peak signal-to-noise ratio)

The phrase peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, PSNR is usually expressed in terms of the logarithmic decibel scale.

PSNR is most commonly used to measure the reconstruction quality of lossy compression codecs (e.g., for image compression). The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR serves as an approximation to human perception of reconstruction quality. A higher PSNR normally indicates a higher-quality reconstruction, but because it is only an approximation, in some cases one reconstruction may appear closer to the original than another even though it has a lower PSNR. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when it is used to compare results from the same codec (or codec type) and the same content. With that caveat, the higher the PSNR, the less distortion.
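For reference, PSNR is usually computed as 10 * log10(MAX^2 / MSE), where MAX is the largest possible sample value (255 for 8-bit data) and MSE is the mean squared error between the original and the reconstruction. A minimal sketch in plain Python follows; the function name and the sample values are made up for illustration.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the reconstruction.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit samples before and after lossy coding.
orig = [52, 55, 61, 66]
recon = [50, 56, 60, 68]
print(round(psnr(orig, recon), 2), "dB")
```

For whole frames the same formula applies with the pixels flattened into one sequence, typically computed on the luma plane.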
