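NB_vol_1's post covers the deblocking filter and SAO before the bitstream. I plan to leave those for later and look at the bitstream first. This post is adapted from http://blog.csdn.net/nb_vol_1/article/details/55057213
Before discussing the bitstream, let's first look at the VCL and the NAL. HEVC coding is split into two layers: the layer that handles the actual coding details is called the VCL (Video Coding Layer), and the layer that handles the bitstream is called the NAL (Network Abstraction Layer). Predictive coding, transform, quantization, in-loop filtering and entropy coding all belong to the VCL, while the details of bitstream encapsulation belong to the NAL.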
VCLU
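Before coded data can be sent over a network, it has to be encapsulated into packets of a defined format. Such a packet is called a NAL unit, or NALU for short. A NALU carries either a parameter set or the data of one SS (slice segment). A NALU that carries a parameter set or other information is called a non-VCLU (non-VCL NAL unit); a NALU that carries the compressed data of one SS is called a VCLU (VCL NAL unit).
HEVC requires that all VCLUs of one picture have the same temporal importance and the same temporal dependencies on other pictures.
A NALU can carry the compressed data of one SS, a VPS, an SPS, a PPS, or supplemental enhancement information (SEI). It can also be an access unit delimiter, end of sequence, end of bitstream, filler data, and so on.
The structure of a NALU is as follows: a NALU header (2 bytes) followed by the NALU payload (called the RBSP, a whole number of bytes).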
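Note that the RBSP is not the same as the raw compressed data (SODB). The SODB is turned into the NALU payload as follows:
1. A stop bit equal to 1 is appended to the SODB, followed by 0 bits until the data occupies a whole number of bytes (rbsp_trailing_bits). The result is the RBSP.
2. 16-bit words of zeros (cabac_zero_words, 0x0000) may then be appended as padding.
3. When the RBSP is written into the NALU, the following byte sequences are replaced:
0x000000 -> 0x00000300
0x000001 -> 0x00000301
0x000002 -> 0x00000302
0x000003 -> 0x00000303
The reason is that 0x000001 is the NALU start code, 0x000000 signals the end of a NALU, and 0x000002 is reserved. 0x000003 is replaced as well so that it cannot be confused with the escaped sequences 0x0000030X (0x00000300, 0x00000301, 0x00000302, 0x00000303).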
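To make the byte substitution described above concrete, here is a minimal sketch of the escaping step and its inverse (what a decoder does before parsing). The function names addEmulationPrevention and removeEmulationPrevention are my own, not HM's, and the sketch leaves out the special case of a payload that ends in 0x00.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Insert emulation_prevention_three_byte (0x03) after every pair of zero
// bytes that is followed by a byte <= 0x03, so that 0x000000, 0x000001,
// 0x000002 and a bare 0x000003 never appear in the output.
std::vector<uint8_t> addEmulationPrevention(const std::vector<uint8_t>& rbsp)
{
  std::vector<uint8_t> out;
  out.reserve(rbsp.size() + rbsp.size() / 2 + 1);
  int zeroCount = 0;
  for (uint8_t v : rbsp)
  {
    if (zeroCount == 2 && v <= 3)
    {
      out.push_back(0x03);  // escape byte
      zeroCount = 0;
    }
    zeroCount = (v == 0) ? zeroCount + 1 : 0;
    out.push_back(v);
  }
  return out;
}

// Inverse step: drop every 0x03 that directly follows two zero bytes.
std::vector<uint8_t> removeEmulationPrevention(const std::vector<uint8_t>& ebsp)
{
  std::vector<uint8_t> out;
  out.reserve(ebsp.size());
  int zeroCount = 0;
  for (uint8_t v : ebsp)
  {
    if (zeroCount == 2 && v == 0x03)
    {
      zeroCount = 0;  // discard the escape byte
      continue;
    }
    zeroCount = (v == 0) ? zeroCount + 1 : 0;
    out.push_back(v);
  }
  return out;
}
```

For example, the bytes 00 00 01 25 become 00 00 03 01 25 after escaping, and removing the escape bytes gives back the original four bytes.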
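NALU header:
1-bit forbidden_zero_bit (always 0)
6-bit nal_unit_type (the NALU type)
6-bit nuh_layer_id (should currently be 0; non-zero values are used for 3D video and similar extensions)
3-bit nuh_temporal_id_plus1. This value minus 1 is temporalId, the identifier of the temporal layer the NALU belongs to, so the field must not be 0. temporalId gives the temporal level of the NALU and therefore the importance of the picture; combined with nal_unit_type it makes temporal scalability of the video stream possible.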
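The four fields add up to 16 bits, which is where the 2-byte header comes from. Below is a small sketch of how the fields could be packed into and parsed out of those two bytes. NalHeaderFields, packNalHeader and parseNalHeader are names I made up for illustration; they are not HM types or functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain struct holding the four header fields.
struct NalHeaderFields
{
  uint8_t forbiddenZeroBit;  // 1 bit, always 0
  uint8_t nalUnitType;       // 6 bits
  uint8_t nuhLayerId;        // 6 bits
  uint8_t temporalIdPlus1;   // 3 bits, must not be 0
};

// Pack the fields MSB first: F(1) | type(6) | layer id(6) | temporal id + 1 (3).
inline uint16_t packNalHeader(const NalHeaderFields& h)
{
  assert(h.forbiddenZeroBit == 0);
  assert(h.nalUnitType < 64 && h.nuhLayerId < 64);
  assert(h.temporalIdPlus1 >= 1 && h.temporalIdPlus1 <= 7);
  return static_cast<uint16_t>((h.forbiddenZeroBit << 15) |
                               (h.nalUnitType << 9) |
                               (h.nuhLayerId << 3) |
                               h.temporalIdPlus1);
}

// Parse the two header bytes back into the four fields.
inline NalHeaderFields parseNalHeader(uint8_t byte0, uint8_t byte1)
{
  const uint16_t v = static_cast<uint16_t>((byte0 << 8) | byte1);
  NalHeaderFields h;
  h.forbiddenZeroBit = static_cast<uint8_t>((v >> 15) & 0x1);
  h.nalUnitType      = static_cast<uint8_t>((v >> 9) & 0x3F);
  h.nuhLayerId       = static_cast<uint8_t>((v >> 3) & 0x3F);
  h.temporalIdPlus1  = static_cast<uint8_t>(v & 0x7);
  return h;
}
```

With nal_unit_type 32 (VPS), nuh_layer_id 0 and nuh_temporal_id_plus1 equal to 1, the header bytes come out as 0x40 0x01. temporalId is then temporalIdPlus1 - 1, which is 0 here.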
For some of the abbreviations and concepts used above, see my earlier posts:
http://blog.csdn.net/qq_21747841/article/details/73332394
http://blog.csdn.net/qq_21747841/article/details/75224344
Access Unit
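HEVC uses the concept of an access unit (AU). An AU is a set of NALUs, consecutive in decoding order, that decode to exactly one picture.
The AU can be regarded as the basic unit of the compressed video bitstream: the compressed stream consists of a sequence of AUs in order.
An AU must contain all VCLUs of one picture and may also contain non-VCLUs. An AU can start with an access unit delimiter NALU, an SEI NALU, or the NALU of the first SS. It can end with the NALU of the last SS, an end of sequence NALU, or an end of bitstream NALU.
Normally an AU does not contain the following NALU types: parameter sets, reserved VCL, filler data, reserved non-VCL, unspecified, and so on.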
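Transmitting NALUs over a network
NALUs are transmitted in one of two forms:
1. Byte stream.
A byte stream is built from NALUs as follows:
(1) Insert the 3-byte start code start_code_prefix_one_3bytes, equal to 0x000001, in front of every NALU.
(2) If the NALU type is VPS_NUT, SPS_NUT or PPS_NUT, or the NALU is the first one of an AU in decoding order, insert one more zero_byte, equal to 0x00, in front of its start code.
(3) Insert leading_zero_8bits, equal to 0x00, in front of the start code (which may include a zero_byte) of the first NALU of the video.
(4) If needed, append trailing_zero_8bits, equal to 0x00, after each NALU as padding.
2. Packet stream. With protocols such as RTP or RTMP, the NALU is used directly as the payload of a network packet.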
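Going the other way, a receiver of the byte stream form has to find the start codes again. The sketch below splits a byte stream into NALUs by scanning for 0x000001 and dropping the zero bytes around it. splitAnnexB is a hypothetical helper of mine, not an HM function. It relies on two properties: after escaping, 0x000001 cannot occur inside a NALU, and the last byte of a NALU is never 0x00.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the bytes of each NALU (header + payload) found in a byte stream.
std::vector<std::vector<uint8_t>> splitAnnexB(const std::vector<uint8_t>& stream)
{
  // Index of the first byte after each 0x000001 start code.
  std::vector<std::size_t> starts;
  for (std::size_t i = 0; i + 2 < stream.size(); ++i)
  {
    if (stream[i] == 0 && stream[i + 1] == 0 && stream[i + 2] == 1)
    {
      starts.push_back(i + 3);
      i += 2;  // skip the rest of the start code
    }
  }

  std::vector<std::vector<uint8_t>> nalus;
  for (std::size_t k = 0; k < starts.size(); ++k)
  {
    const std::size_t begin = starts[k];
    std::size_t end = (k + 1 < starts.size()) ? starts[k + 1] - 3 : stream.size();
    // Strip trailing_zero_8bits and the zero_byte of the next start code.
    while (end > begin && stream[end - 1] == 0)
    {
      --end;
    }
    nalus.emplace_back(stream.begin() + begin, stream.begin() + end);
  }
  return nalus;
}
```

Each returned NALU still has its 2-byte header and still contains the escape bytes, so the next step for a decoder would be to parse the header and remove the 0x03 bytes.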
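Bitstream implementation in HEVC (HM)
This is the main part. Until now I had only read the intra and inter prediction code, so I had to find where the bitstream code lives. I found it by searching for m_nalUnitType starting from the encoder entry point, TAppEncTop.
The implementation in HM:
1. NALUnit represents a NALU header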
/**
* Represents a single NALunit header and the associated RBSPayload
*/
// NAL unit header
struct NALUnit
{
// type of the NAL unit
NalUnitType m_nalUnitType; ///< nal_unit_type
UInt m_temporalId; ///< temporal_id
UInt m_nuhLayerId; ///< nuh_layer_id
/** construct an NALunit structure with given header values. */
// constructor
NALUnit(
NalUnitType nalUnitType,
Int temporalId = 0,
Int nuhLayerId = 0)
:m_nalUnitType (nalUnitType)
,m_temporalId (temporalId)
,m_nuhLayerId (nuhLayerId)
{}
/** default constructor - no initialization; must be performed by user */
NALUnit() {}
/** returns true if the NALunit is a slice NALunit */
// check whether this is a slice, i.e. whether this NAL unit carries slice data
Bool isSlice()
{
return m_nalUnitType == NAL_UNIT_CODED_SLICE_TRAIL_R
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_TRAIL_N
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_TSA_R
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_TSA_N
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_STSA_R
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_STSA_N
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_BLA_W_LP
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_BLA_W_RADL
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_BLA_N_LP
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_IDR_W_RADL
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_IDR_N_LP
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_CRA
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_RADL_N
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_RADL_R
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_RASL_N
|| m_nalUnitType == NAL_UNIT_CODED_SLICE_RASL_R;
}
// check whether this NAL unit carries SEI (supplemental enhancement information)
Bool isSei()
{
return m_nalUnitType == NAL_UNIT_PREFIX_SEI
|| m_nalUnitType == NAL_UNIT_SUFFIX_SEI;
}
// check whether this is a VCL NAL unit
Bool isVcl()
{
return ( (UInt)m_nalUnitType < 32 );
}
};
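From here on things differ slightly from the original post. The text below is quoted directly from it; looking at the HM16.3 code, it still largely applies. What matters is understanding the intent of the code.
2. OutputNALUnit represents a NALU output by the encoder. It has a TComOutputBitstream member that holds the encoder's output, and it inherits directly from NALUnit.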
struct OutputNALUnit : public NALUnit
{
/**
* construct an OutputNALunit structure with given header values and
* storage for a bitstream. Upon construction the NALunit header is
* written to the bitstream.
*/
OutputNALUnit(
NalUnitType nalUnitType,
UInt temporalID = 0,
UInt reserved_zero_6bits = 0)
: NALUnit(nalUnitType, temporalID, reserved_zero_6bits)
, m_Bitstream()
{}
OutputNALUnit& operator=(const NALUnit& src)
{
m_Bitstream.clear();
static_cast<NALUnit*>(this)->operator=(src);
return *this;
}
TComOutputBitstream m_Bitstream; // bitstream member initialized by the constructor above
};
3. NALUnitEBSP is a helper class dedicated to handling OutputNALUnit. It has a member of type ostringstream. Its constructor calls void write(std::ostream& out, OutputNALUnit& nalu);
/**
* A single NALunit, with complete payload in EBSP format.
*/
// helper class that writes a NALU into an ostringstream (which acts as a buffer)
struct NALUnitEBSP : public NALUnit
{
std::ostringstream m_nalUnitData; // the NALU header and its data are written here
/**
* convert the OutputNALUnit nalu into EBSP format by writing out
* the NALUnit header, then the rbsp_bytes including any
* emulation_prevention_three_byte symbols.
*/
NALUnitEBSP(OutputNALUnit& nalu);
};
4. AccessUnit represents an access unit. It is simply a list whose elements are pointers to NALUnitEBSP (I don't fully understand this part)
class AccessUnit : public std::list<NALUnitEBSP*> // NOTE: Should not inherit from STL.
{
public:
~AccessUnit()
{
for (AccessUnit::iterator it = this->begin(); it != this->end(); it++)
{
delete *it;
}
}
};
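As far as I can tell, the point of the destructor is ownership: the list stores raw pointers, so AccessUnit has to delete each NALUnitEBSP itself. The HM comment also warns against inheriting from an STL container. Here is a sketch of how the same idea could look with composition and std::unique_ptr. EbspNalu and AccessUnitSketch are stand-in types of my own, not HM code, and HM does not do it this way.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for NALUnitEBSP: a type plus a byte buffer.
struct EbspNalu
{
  int nalUnitType;
  std::ostringstream data;
  EbspNalu(int type, const std::string& bytes) : nalUnitType(type) { data << bytes; }
};

// The list is a member instead of a base class, and unique_ptr frees
// every element automatically, so no hand-written destructor is needed.
class AccessUnitSketch
{
public:
  void push_back(std::unique_ptr<EbspNalu> nalu) { m_nalus.push_back(std::move(nalu)); }

  std::size_t size() const { return m_nalus.size(); }

  // Total number of buffered bytes over all NALUs of this access unit.
  std::size_t totalBytes() const
  {
    std::size_t n = 0;
    for (const auto& p : m_nalus)
    {
      n += p->data.str().size();
    }
    return n;
  }

private:
  std::list<std::unique_ptr<EbspNalu>> m_nalus;
};
```

A caller would build each NALU, hand it over with push_back, and let the access unit clean everything up when it goes out of scope.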
5. TComBitIf is the common interface for bitstreams
/// pure virtual class for basic bit handling
class TComBitIf
{
public:
virtual Void writeAlignOne () {};
virtual Void writeAlignZero () {};
virtual Void write ( UInt uiBits, UInt uiNumberOfBits ) = 0;
virtual Void resetBits () = 0;
virtual UInt getNumberOfWrittenBits() const = 0;
virtual ~TComBitIf() {}
};
6. TComOutputBitstream represents the encoder's output bitstream. It stores the encoder's output in a vector of uint8_t.
class TComOutputBitstream : public TComBitIf
{
/**
* FIFO for storage of bytes. Use:
* - fifo.push_back(x) to append words
* - fifo.clear() to empty the FIFO
* - &fifo.front() to get a pointer to the data array.
* NB, this pointer is only valid until the next push_back()/clear()
*/
std::vector<uint8_t> m_fifo; // the data is stored in a vector
// number of bits currently held but not yet flushed
UInt m_num_held_bits; /// number of bits not flushed to bytestream.
// the held bits themselves (less than one byte), not yet flushed
UChar m_held_bits; /// the bits held and not flushed to bytestream.
/// this value is always msb-aligned, bigendian.
public:
// create / destroy
TComOutputBitstream();
~TComOutputBitstream();
// interface for encoding
/**
* append uiNumberOfBits least significant bits of uiBits to
* the current bitstream
*/
// write bits: the value is in uiBits and the number of bits in uiNumberOfBits
Void write ( UInt uiBits, UInt uiNumberOfBits );
/** insert one bits until the bitstream is byte-aligned */
// write 1 bits until byte-aligned
Void writeAlignOne ();
// write 0 bits until byte-aligned
/** insert zero bits until the bitstream is byte-aligned */
Void writeAlignZero ();
/** this function should never be called */
Void resetBits() { assert(0); }
// utility functions
/**
* Return a pointer to the start of the byte-stream buffer.
* Pointer is valid until the next write/flush/reset call.
* NB, data is arranged such that subsequent bytes in the
* bytestream are stored in ascending addresses.
*/
Char* getByteStream() const;
/**
* Return the number of valid bytes available from getByteStream()
*/
UInt getByteStreamLength();
/**
* Reset all internal state.
*/
Void clear();
/**
* returns the number of bits that need to be written to
* achieve byte alignment.
*/
Int getNumBitsUntilByteAligned() { return (8 - m_num_held_bits) & 0x7; }
/**
* Return the number of bits that have been written since the last clear()
*/
UInt getNumberOfWrittenBits() const { return UInt(m_fifo.size()) * 8 + m_num_held_bits; }
Void insertAt(const TComOutputBitstream& src, UInt pos);
/**
* Return a reference to the internal fifo
*/
std::vector<uint8_t>& getFIFO() { return m_fifo; }
UChar getHeldBits () { return m_held_bits; }
//TComOutputBitstream& operator= (const TComOutputBitstream& src);
/** Return a reference to the internal fifo */
const std::vector<uint8_t>& getFIFO() const { return m_fifo; }
// append a substream
Void addSubstream ( TComOutputBitstream* pcSubstream );
Void writeByteAlignment();
//! returns the number of start code emulations contained in the current buffer
Int countStartCodeEmulations();
};
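To see how m_fifo, m_num_held_bits and m_held_bits work together, here is a stripped-down bit writer of my own. BitWriterSketch is not HM code: it writes one bit at a time for clarity, which is a simplification and not how TComOutputBitstream::write is implemented, but the state it keeps mirrors the three members above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class BitWriterSketch
{
public:
  // Append the numBits least significant bits of bits, MSB first.
  void write(uint32_t bits, unsigned numBits)
  {
    assert(numBits <= 32);
    for (unsigned i = numBits; i-- > 0;)
    {
      const unsigned bit = (bits >> i) & 1u;
      // Held bits are kept MSB-aligned inside the pending byte.
      m_heldBits = static_cast<uint8_t>(m_heldBits | (bit << (7 - m_numHeldBits)));
      if (++m_numHeldBits == 8)
      {
        m_fifo.push_back(m_heldBits);  // flush a completed byte
        m_heldBits = 0;
        m_numHeldBits = 0;
      }
    }
  }

  // Number of bits still needed to reach a byte boundary.
  unsigned numBitsUntilByteAligned() const { return (8 - m_numHeldBits) & 0x7; }

  void writeAlignZero() { write(0, numBitsUntilByteAligned()); }

  void writeAlignOne()
  {
    const unsigned n = numBitsUntilByteAligned();
    write((1u << n) - 1u, n);
  }

  // Flushed bytes plus the bits that are still held.
  unsigned numberOfWrittenBits() const
  {
    return static_cast<unsigned>(m_fifo.size()) * 8 + m_numHeldBits;
  }

  const std::vector<uint8_t>& fifo() const { return m_fifo; }

private:
  std::vector<uint8_t> m_fifo;   // completed bytes
  unsigned m_numHeldBits = 0;    // bits waiting in m_heldBits
  uint8_t m_heldBits = 0;        // pending byte, MSB-aligned
};
```

Writing the four NALU header fields of a VPS (1, 6, 6 and 3 bits) produces the bytes 0x40 0x01. Writing a single 1 bit afterwards and calling writeAlignZero() produces 0x80, which is the same pattern as the stop bit plus alignment zeros at the end of an RBSP.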
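7. TComInputBitstream represents the decoder's input bitstream
8. Relationship between NALUnit (OutputNALUnit) and TComBitIf (TComOutputBitstream): NALUnit is the NALU header, while TComBitIf holds the actual coded data
9. So how are NALUnit and TComBitIf (TComOutputBitstream) tied together? To see this, start from the entropy coder:
(1) First an OutputNALUnit is defined. Since OutputNALUnit contains a TComOutputBitstream member, the bitstream exists as soon as the OutputNALUnit does;
(2) TEncEntropy has a setBitstream function that sets the bitstream the entropy coder writes to. Its argument is the TComOutputBitstream member of the OutputNALUnit, so from then on the entropy coder's output goes into that bitstream (TComOutputBitstream).
(3) Finally, AccessUnit is a queue of NALUnitEBSP. The NALUnitEBSP constructor takes an OutputNALUnit argument and internally calls
write(std::ostream& out, OutputNALUnit& nalu), which writes the NALU into the ostringstream member (effectively a buffer) of the NALUnitEBSP. The NALUnitEBSP is then added to the AccessUnit.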
Void write(ostream& out, OutputNALUnit& nalu)
{
writeNalUnitHeader(out, nalu);
/* write out rbsp_byte's, inserting any required
* emulation_prevention_three_byte's */
/* 7.4.1 ...
* emulation_prevention_three_byte is a byte equal to 0x03. When an
* emulation_prevention_three_byte is present in the NAL unit, it shall be
* discarded by the decoding process.
* The last byte of the NAL unit shall not be equal to 0x00.
* Within the NAL unit, the following three-byte sequences shall not occur at
* any byte-aligned position:
* - 0x000000
* - 0x000001
* - 0x000002
* Within the NAL unit, any four-byte sequence that starts with 0x000003
* other than the following sequences shall not occur at any byte-aligned
* position:
* - 0x00000300
* - 0x00000301
* - 0x00000302
* - 0x00000303
*/
vector<uint8_t>& rbsp = nalu.m_Bitstream.getFIFO();
vector<uint8_t> outputBuffer;
outputBuffer.resize(rbsp.size()*2+1); //there can never be enough emulation_prevention_three_bytes to require this much space
std::size_t outputAmount = 0;
Int zeroCount = 0;
for (vector<uint8_t>::iterator it = rbsp.begin(); it != rbsp.end(); it++)
{
const uint8_t v=(*it);
if (zeroCount==2 && v<=3)
{
outputBuffer[outputAmount++]=emulation_prevention_three_byte[0];
zeroCount=0;
}
if (v==0)
{
zeroCount++;
}
else
{
zeroCount=0;
}
outputBuffer[outputAmount++]=v;
}
/* 7.4.1.1
* ... when the last byte of the RBSP data is equal to 0x00 (which can
* only occur when the RBSP ends in a cabac_zero_word), a final byte equal
* to 0x03 is appended to the end of the data.
*/
if (zeroCount>0)
{
outputBuffer[outputAmount++]=emulation_prevention_three_byte[0];
}
out.write((Char*)&(*outputBuffer.begin()), outputAmount);
}
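(4) In the end we have a complete AccessUnit (containing all NALUs of one picture). The compressed data sits in the ostringstream members of the NALUnitEBSP objects in the AccessUnit.
(5) Finally the AccessUnit has to be written to the network or to local disk. This is done by the writeAnnexB function (called from xWriteOutput), which writes the data out in byte stream form.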
static std::vector<UInt> writeAnnexB(std::ostream& out, const AccessUnit& au)
{ // found by searching for writeAnnexB from the encoder entry point
std::vector<UInt> annexBsizes;
for (AccessUnit::const_iterator it = au.begin(); it != au.end(); it++)
{
const NALUnitEBSP& nalu = **it;
UInt size = 0; /* size of annexB unit in bytes */
static const Char start_code_prefix[] = {0,0,0,1};
if (it == au.begin() || nalu.m_nalUnitType == NAL_UNIT_VPS || nalu.m_nalUnitType == NAL_UNIT_SPS || nalu.m_nalUnitType == NAL_UNIT_PPS)
{
/* From AVC, When any of the following conditions are fulfilled, the
* zero_byte syntax element shall be present:
* - the nal_unit_type within the nal_unit() is equal to 7 (sequence
* parameter set) or 8 (picture parameter set),
* - the byte stream NAL unit syntax structure contains the first NAL
* unit of an access unit in decoding order, as specified by subclause
* 7.4.1.2.3.
*/
out.write(start_code_prefix, 4);
size += 4;
}
else
{
out.write(start_code_prefix+1, 3);
size += 3;
}
out << nalu.m_nalUnitData.str();
size += UInt(nalu.m_nalUnitData.str().size());
annexBsizes.push_back(size);
}
return annexBsizes;
}
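The start code rule in writeAnnexB is easy to try out in isolation. The sketch below is a self-contained imitation that uses stand-in types of my own (NaluSketch and writeAnnexBSketch are not HM names) and a plain string in place of the ostringstream member. The type values 32, 33 and 34 for VPS, SPS and PPS are the HEVC nal_unit_type values.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for NALUnitEBSP: type plus already escaped bytes.
struct NaluSketch
{
  int nalUnitType;
  std::string data;
};

enum { kVps = 32, kSps = 33, kPps = 34 };

// Write every NALU of one access unit in byte stream form and return
// the size in bytes of each unit, start code included.
std::vector<std::size_t> writeAnnexBSketch(std::ostream& out, const std::vector<NaluSketch>& au)
{
  static const char startCodePrefix[] = {0, 0, 0, 1};
  std::vector<std::size_t> sizes;
  for (std::size_t i = 0; i < au.size(); ++i)
  {
    const int t = au[i].nalUnitType;
    // zero_byte + start code for the first NALU of the AU and for parameter sets.
    const bool needsZeroByte = (i == 0) || t == kVps || t == kSps || t == kPps;
    const std::size_t prefixLen = needsZeroByte ? 4 : 3;
    out.write(startCodePrefix + (4 - prefixLen), static_cast<std::streamsize>(prefixLen));
    out.write(au[i].data.data(), static_cast<std::streamsize>(au[i].data.size()));
    sizes.push_back(prefixLen + au[i].data.size());
  }
  return sizes;
}
```

For an access unit holding a 2-byte VPS followed by a 3-byte slice NALU, the function writes 00 00 00 01 before the first unit and 00 00 01 before the second, so both units take 6 bytes in the stream.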