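In computer engineering, any knowledge that does not come from reading the source code is second-hand knowledge. I originally wanted to see whether I could find a research topic in video coding. I have not found one so far and am about to give up, so I am writing this post as a record. I had read many blog posts about encoders, and most of the explanations were quite vague. I also read some books, among which [1] is fairly well written. JM and x264, on the other hand, are too complex.
Earlier I found a fairly small H.264 encoder on GitHub [2], but I could not follow it; it is more than ten thousand lines, after all. Its reference list mentions jcodec [3], though, and after reading through that one evening I actually started to understand, because it is simple enough. Its intra prediction implements only the DC mode, and it has a simple rate-control module. Below is a brief walk through how an I-frame is encoded.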
H264Encoder.java
private void encodeSlice(SeqParameterSet sps, PictureParameterSet pps, Picture pic, ByteBuffer dup, boolean idr,
        int frameNum, SliceType sliceType, int qp) {
    // (excerpt: slice header writing and local declarations are omitted)
    for (int mbY = 0, mbAddr = 0; mbY < sps.picHeightInMapUnitsMinus1 + 1; mbY++) {
        for (int mbX = 0; mbX < sps.picWidthInMbsMinus1 + 1; mbX++, mbAddr++) {
            do {
                candidate = sliceData.fork();
                totalQpDelta += qpDelta;
                encodeMacroblock(mbType, pic, mbX, mbY, candidate, qp, totalQpDelta);
                // rate-control module: a non-zero result means "retry with a different QP"
                qpDelta = rc.accept(candidate.position() - sliceData.position());
                if (qpDelta != 0)
                    restoreMacroblock(mbType);
            } while (qpDelta != 0);
            sliceData = candidate;
            qp += totalQpDelta;
            collectPredictors(outMB.getPixels(), mbX);
            addToReference(mbX, mbY);
        }
    }
}
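encodeMacroblock encodes one 16x16 macroblock. Once the macroblock has been encoded, it is decoded again (reconstructed) into outMB, and collectPredictors(outMB.getPixels(), mbX) saves its edge pixels for predicting later macroblocks. addToReference(mbX, mbY) adds outMB to the reference frame, which is used when encoding P-frames.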
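The do/while loop in the excerpt above is a trial-encode loop: the macroblock is written to a forked writer, the rate controller looks at how many bits it took, and a non-zero QP delta triggers another attempt. The following is only a toy model of that idea. Every name in it (fakeEncodedBits, accept, encodeWithRetry) and the bit-cost formula are assumptions made for illustration; none of them is jcodec API.

```java
/** Toy model of an "encode, measure, retry" loop. Not jcodec code. */
final class TrialEncodeSketch {

    /** Hypothetical encoder stand-in: the bit cost shrinks as QP grows. */
    static int fakeEncodedBits(int qp) {
        return Math.max(8, 400 - 10 * qp);
    }

    /** Hypothetical rate controller: asks for +1 QP when the budget is exceeded. */
    static int accept(int bits, int budget) {
        return bits > budget ? 1 : 0;
    }

    /** Returns the QP at which the macroblock is finally accepted. */
    static int encodeWithRetry(int qp, int budget, int maxQp) {
        int totalQpDelta = 0;
        int qpDelta = 0;
        do {
            totalQpDelta += qpDelta;
            // Trial encode; a real encoder would write into a scratch copy of the bitstream.
            int bits = fakeEncodedBits(qp + totalQpDelta);
            // 0 means "keep this attempt"; anything else means "try again".
            qpDelta = accept(bits, budget);
            // Stop once the QP ceiling is reached, even if the budget is still exceeded.
            if (qp + totalQpDelta >= maxQp) {
                qpDelta = 0;
            }
        } while (qpDelta != 0);
        return qp + totalQpDelta;
    }

    public static void main(String[] args) {
        System.out.println(encodeWithRetry(20, 150, 51));
    }
}
```

The point of forking the writer is that a rejected attempt leaves the real bitstream untouched, so a retry only costs computation.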
H264Encoder.java
private void encodeMacroblock(MBType mbType, Picture pic, int mbX, int mbY, BitWriter candidate, int qp, int qpDelta) {
    if (mbType == MBType.I_16x16) {
        mbEncoderI16x16.save();
        mbEncoderI16x16.encodeMacroblock(pic, mbX, mbY, candidate, outMB, mbX > 0 ? topEncoded[mbX - 1] : null,
                mbY > 0 ? topEncoded[mbX] : null, qp + qpDelta, qpDelta);
    } else if (mbType == MBType.P_16x16) {
        mbEncoderP16x16.save();
        mbEncoderP16x16.encodeMacroblock(pic, mbX, mbY, candidate, outMB, mbX > 0 ? topEncoded[mbX - 1] : null,
                mbY > 0 ? topEncoded[mbX] : null, qp + qpDelta, qpDelta);
    } else
        throw new RuntimeException("Macroblock of type " + mbType + " is not supported.");
}
Only the first branch (I_16x16) is analyzed here.
MBEncoderI16x16.java
public void encodeMacroblock(Picture pic, int mbX, int mbY, BitWriter out, EncodedMB outMB,
        EncodedMB leftOutMB, EncodedMB topOutMB, int qp, int qpDelta) {
    CAVLCWriter.writeUE(out, 0); // Chroma prediction mode -- DC
    CAVLCWriter.writeSE(out, qpDelta); // MB QP delta
    outMB.setType(MBType.I_16x16);
    outMB.setQp(qp);
    // encode Y
    luma(pic, mbX, mbY, out, qp, outMB.getPixels(), cavlc[0]);
    // encode U and V
    chroma(pic, mbX, mbY, out, qp, outMB.getPixels());
    new MBDeblocker().deblockMBI(outMB, leftOutMB, topOutMB);
}
MBEncoderI16x16.java
private void luma(Picture pic, int mbX, int mbY, BitWriter out, int qp, Picture outMB, CAVLC cavlc) {
    int x = mbX << 4;
    int y = mbY << 4;
    int[][] ac = new int[16][16];
    byte[][] pred = new byte[16][16];
    // intra prediction; only the DC mode is implemented
    lumaDCPred(x, y, pred);
    // compute the residual and apply the 4x4 forward transform to each sub-block
    transform(pic, 0, ac, pred, x, y);
    int[] dc = extractDC(ac);
    writeDC(cavlc, mbX, mbY, out, qp, mbX << 2, mbY << 2, dc, I_16x16, I_16x16);
    writeAC(cavlc, mbX, mbY, out, mbX << 2, mbY << 2, ac, qp, I_16x16, I_16x16, DUMMY);
    restorePlane(dc, ac, qp);
    for (int blk = 0; blk < ac.length; blk++) {
        MBEncoderHelper.putBlk(outMB.getPlaneData(0), ac[blk], pred[blk], 4, BLK_X[blk], BLK_Y[blk], 4, 4);
    }
}

private void transform(Picture pic, int comp, int[][] ac, byte[][] pred, int x, int y) {
    for (int i = 0; i < ac.length; i++) {
        int[] coeff = ac[i];
        MBEncoderHelper.takeSubtract(pic.getPlaneData(comp), pic.getPlaneWidth(comp), pic.getPlaneHeight(comp), x
                + BLK_X[i], y + BLK_Y[i], coeff, pred[i], 4, 4);
        CoeffTransformer.fdct4x4(coeff);
    }
}
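A macroblock is 16x16 and is split into sixteen 4x4 sub-blocks, and each sub-block is transformed on its own. Judging by its name, fdct4x4 is the 4x4 integer (DCT-like) core transform rather than a Hadamard transform; in H.264 the Hadamard transform is applied afterwards, and only to the 16 DC coefficients of an Intra 16x16 macroblock, which is presumably what happens to the output of extractDC inside writeDC. The total number of coefficients after the transform still equals the number of pixels in the macroblock: one 4x4 block yields 16 coefficients, hence ac = new int[16][16]. writeAC also performs the quantization. restorePlane performs dequantization and the inverse transform, and the restored data is finally put into outMB, because encoding the next macroblock needs the reconstructed pixels of its neighbours: the column to its left and the row above it. That sounds a bit convoluted; concretely, these pixels are what lumaDCPred needs, i.e. the 16x16 DC prediction mode.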
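To make the "16 pixels in, 16 coefficients out" point concrete, here is a minimal sketch of the H.264 4x4 forward core transform, written as the plain matrix product Cf * X * Cf^T. It is not jcodec's fdct4x4, which may well use a butterfly form, and the class and method names are made up for this example. Scaling is left out because in H.264 it is folded into quantization.

```java
/** Plain matrix form of the H.264 4x4 forward core transform. Not jcodec code. */
final class CoreTransformSketch {

    // Forward core transform matrix Cf from the H.264 design.
    private static final int[][] CF = {
        {1,  1,  1,  1},
        {2,  1, -1, -2},
        {1, -1, -1,  1},
        {1, -2,  2, -1}
    };

    /** Computes Cf * X * Cf^T for a 4x4 residual block stored row by row in x[0..15]. */
    static int[] forward4x4(int[] x) {
        int[] tmp = new int[16];
        int[] out = new int[16];
        // tmp = Cf * X
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                int s = 0;
                for (int k = 0; k < 4; k++) {
                    s += CF[i][k] * x[k * 4 + j];
                }
                tmp[i * 4 + j] = s;
            }
        }
        // out = tmp * Cf^T
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                int s = 0;
                for (int k = 0; k < 4; k++) {
                    s += tmp[i * 4 + k] * CF[j][k];
                }
                out[i * 4 + j] = s;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] flat = new int[16];
        java.util.Arrays.fill(flat, 3); // a constant residual block
        System.out.println(java.util.Arrays.toString(forward4x4(flat)));
    }
}
```

For a constant residual block all the energy lands in coefficient 0, the DC coefficient. That is why, after DC prediction, the 16 DC values of a smooth macroblock are worth collecting and transforming once more.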
private void lumaDCPred(int x, int y, byte[][] pred) {
    int dc;
    if (x == 0 && y == 0)
        dc = 0; // top-left macroblock: no neighbours available
    else if (y == 0)
        dc = (ArrayUtil.sumByte(leftRow[0]) + 8) >> 4; // first row: left neighbour only
    else if (x == 0)
        dc = (ArrayUtil.sumByte3(topLine[0], x, 16) + 8) >> 4; // first column: top neighbour only
    else
        dc = (ArrayUtil.sumByte(leftRow[0]) + ArrayUtil.sumByte3(topLine[0], x, 16) + 16) >> 5;
    for (int i = 0; i < pred.length; i++)
        for (int j = 0; j < pred[i].length; j++)
            pred[i][j] += dc;
}
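The cases handled here are described on p. 71 of [1], with one difference: the book says that when neither P(x,-1) nor P(-1,y) is available the predicted value is 128, whereas the code uses 0. At first I did not know what the consequence of that would be. A likely explanation is that jcodec keeps samples in signed bytes offset by 128 (note the byte[][] types), in which case 0 in the code corresponds to 128 in the usual 0..255 range; I have not verified this against the Picture class.
How, and where, does outMB get connected to leftRow and topLine? In collectPredictors: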
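For comparison, the same four cases can be written against plain 0..255 samples, which is how the rule is usually stated. This is a sketch under that assumption; the class name, the method name dc16x16, and the use of null for "neighbour not available" are choices made for this example and are not taken from jcodec.

```java
/** 16x16 luma DC prediction on unsigned samples (0..255). Not jcodec code. */
final class LumaDcPredSketch {

    /**
     * left: the 16 reconstructed samples of the column to the left, or null if unavailable.
     * top:  the 16 reconstructed samples of the row above, or null if unavailable.
     */
    static int dc16x16(int[] left, int[] top) {
        if (left == null && top == null) {
            return 128;                          // no neighbours: mid-grey
        }
        if (top == null) {
            return (sum(left) + 8) >> 4;         // rounded mean of 16 left samples
        }
        if (left == null) {
            return (sum(top) + 8) >> 4;          // rounded mean of 16 top samples
        }
        return (sum(left) + sum(top) + 16) >> 5; // rounded mean of all 32 samples
    }

    private static int sum(int[] a) {
        int s = 0;
        for (int v : a) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        int[] left = new int[16];
        java.util.Arrays.fill(left, 100);
        System.out.println(dc16x16(left, null));
        System.out.println(dc16x16(null, null));
    }
}
```

If samples are stored with 128 subtracted, the "no neighbours" value 128 becomes 0 and the other three formulas are unchanged, since an offset commutes with taking a mean up to rounding.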
private void collectPredictors(Picture outMB, int mbX) {
    // bottom row of each reconstructed plane: offset 240 = 15 * 16 (luma), 56 = 7 * 8 (chroma)
    arraycopy(outMB.getPlaneData(0), 240, topLine[0], mbX << 4, 16);
    arraycopy(outMB.getPlaneData(1), 56, topLine[1], mbX << 3, 8);
    arraycopy(outMB.getPlaneData(2), 56, topLine[2], mbX << 3, 8);
    // rightmost column of each plane: column 15 of 16 (luma), column 7 of 8 (chroma)
    copyCol(outMB.getPlaneData(0), 15, 16, leftRow[0]);
    copyCol(outMB.getPlaneData(1), 7, 8, leftRow[1]);
    copyCol(outMB.getPlaneData(2), 7, 8, leftRow[2]);
}
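The constants 240 and 56 look arbitrary until they are read as "start of the last row" of a 16x16 and an 8x8 row-major block. The sketch below spells that out with two helpers. The names lastRowOffset, copyBottomRow and copyRightColumn are hypothetical, and the assumption that copyCol walks a column with a fixed stride is inferred from its arguments rather than checked against jcodec.

```java
/** How the edge pixels of a reconstructed block are picked out. Not jcodec code. */
final class PredictorCopySketch {

    /** Index of the first sample of the last row in a size x size row-major block. */
    static int lastRowOffset(int size) {
        return (size - 1) * size;
    }

    /** Copies the bottom row of the block into dst, starting at dstOff. */
    static void copyBottomRow(byte[] block, int size, byte[] dst, int dstOff) {
        System.arraycopy(block, lastRowOffset(size), dst, dstOff, size);
    }

    /** Copies the rightmost column of the block into dst[0..size-1]. */
    static void copyRightColumn(byte[] block, int size, byte[] dst) {
        for (int row = 0; row < size; row++) {
            dst[row] = block[row * size + (size - 1)];
        }
    }

    public static void main(String[] args) {
        System.out.println(lastRowOffset(16)); // luma block
        System.out.println(lastRowOffset(8));  // chroma block
    }
}
```

The bottom row goes into a picture-wide line buffer at the macroblock's horizontal position, so it is still there when the next macroblock row is encoded. The right column only has to survive until the next macroblock, so a single 16-sample (or 8-sample) buffer is enough.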
[1] 深入理解視頻編解碼技術-基於H.264標準及參考模型 (Understanding Video Coding Technology in Depth: Based on the H.264 Standard and Reference Model)
[2] minih264 https://github.com/lieff/minih264
[3] jcodec https://github.com/jcodec/jcodec
[4] Intra Luma Prediction https://www.cnblogs.com/TaigaCon/p/4190806.html
[5] Hadamard transform https://www.cnblogs.com/xkfz007/archive/2012/07/31/2616791.html