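[FFmpeg 3.x API in Practice, Part 1] Video Decoding

Summary

This article shows how to decode video with the FFmpeg API: read the file Sample.mkv, extract its video stream, decode it into YUV image data, and save that data either as PGM grayscale images or as a raw YUV420p video file.

Initializing FFmpeg and the FormatContext

The first step when using the FFmpeg API is initialization: av_register_all registers all available components (this call is required in FFmpeg 3.x and has been deprecated since FFmpeg 4.0). Next, avformat_open_input opens the specified media file and avformat_find_stream_info reads the stream information. Both store their results in the structure AVFormatContext *mFormatCtx.
The function av_dump_format prints information about the media file to the console.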

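// Returns false on success and true on failure, like the other methods of this class.
bool VideoDecoding::init(const char * file)
{
    // Register all muxers, demuxers and codecs (needed in FFmpeg 3.x)
    av_register_all();

    // Open the file and read its header into mFormatCtx
    if ((avformat_open_input(&mFormatCtx, file, 0, 0)) < 0) {
        printf("Failed to open input file\n");
        return true;
    }

    // Read some packets to fill in the stream information
    if ((avformat_find_stream_info(mFormatCtx, 0)) < 0) {
        printf("Failed to retrieve input stream information\n");
        return true;
    }

    // Print the container and stream details to the console
    av_dump_format(mFormatCtx, 0, file, 0);

    return false;
}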

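Finding the Stream Index

A media file usually contains one video stream plus several audio or subtitle streams, and each stream has an index. The newer API uses av_find_best_stream to look up the stream of a given type. The first argument is the initialized format context and the second is the media type:
- AVMEDIA_TYPE_VIDEO: video stream
- AVMEDIA_TYPE_AUDIO: audio stream
- AVMEDIA_TYPE_SUBTITLE: subtitle stream

The remaining arguments narrow the selection, which is useful for picking one of several audio streams, for example. This file has only one video stream, so passing -1 lets FFmpeg choose automatically and return the index of the default stream. That index is then used to pick out the packets of the stream we need.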

bool VideoDecoding::findStreamIndex()
{
    // Find video stream in the file
    mVideoStreamIndex = av_find_best_stream(mFormatCtx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (mVideoStreamIndex < 0) {
        printf("Could not find stream in input file\n");
        return true;
    }

    return false;
}

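Configuring the Decoder's CodecContext

First, avcodec_find_decoder looks up the decoder that matches the codec ID of the selected stream.
Next, avcodec_alloc_context3 allocates a CodecContext for that decoder.
Then avcodec_parameters_to_context fills the CodecContext with the parameters of the stream.
Finally, avcodec_open2 completes the initialization of the CodecContext.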

// Initialize the AVCodecContext to use the given AVCodec.
bool VideoDecoding::initCodecContext()
{
    // Find a decoder with a matching codec ID
    AVCodec *dec = avcodec_find_decoder(mFormatCtx->streams[mVideoStreamIndex]->codecpar->codec_id);
    if (!dec) {
        printf("Failed to find codec!\n");
        return true;
    }

    // Allocate a codec context for the decoder
    if (!(mCodecCtx = avcodec_alloc_context3(dec))) {
        printf("Failed to allocate the codec context\n");
        return true;
    }

    // Fill the codec context based on the supplied codec parameters.
    if (avcodec_parameters_to_context(mCodecCtx, mFormatCtx->streams[mVideoStreamIndex]->codecpar) < 0) {
        printf("Failed to copy codec parameters to decoder context!\n");
        return true;
    }

    // Initialize the AVCodecContext to use the given Codec
    if (avcodec_open2(mCodecCtx, dec, NULL) < 0) {
        printf("Failed to open codec\n");
        return true;
    }

    return false;
}

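Reading and Decoding the Video Data

Two concepts matter here: packet and frame. Put simply, a packet holds encoded data and a frame holds decoded data.
av_read_frame is called in a loop to read packets from the FormatContext. For each packet, the stream index tells us whether it belongs to the stream we want; if it is a video packet, it goes on to the decoding step.
The newer API uses one pair of functions per direction. Decoding uses avcodec_send_packet and avcodec_receive_frame to turn packets into frames; encoding uses the mirror pair avcodec_send_frame and avcodec_receive_packet. Here we are decoding: as the names suggest, a packet is sent to the decoder, the decoder processes it, and a decoded frame is then received back. The older functions such as avcodec_decode_video2 are deprecated.
This step only decodes the video; the decoded data can then be processed in any way you like.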
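
One consequence of the send/receive design is that a packet going in does not always mean a frame coming out: a decoder may hold frames back and release them later, and whatever is still buffered at end of file has to be drained (in FFmpeg this is done by sending a NULL packet). The sketch below illustrates that control flow with a toy model. ToyDecoder, sendPacket, sendFlush, receiveFrame and decodeAll are hypothetical names invented for this illustration and are not part of the FFmpeg API; integers stand in for packets and frames.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy stand-in for a decoder that holds back `delay` packets before it
// outputs anything. Hypothetical class, NOT part of the FFmpeg API.
class ToyDecoder {
public:
    explicit ToyDecoder(std::size_t delay) : mDelay(delay), mFlushing(false) {}

    // Plays the role of avcodec_send_packet(ctx, pkt).
    void sendPacket(int packet) { mQueue.push_back(packet); }

    // Plays the role of avcodec_send_packet(ctx, NULL): enter draining mode.
    void sendFlush() { mFlushing = true; }

    // Plays the role of avcodec_receive_frame(): returns false when the
    // decoder needs more input or has nothing left to output.
    bool receiveFrame(int *frame) {
        assert(frame != nullptr);
        if (mQueue.empty()) return false;
        if (!mFlushing && mQueue.size() <= mDelay) return false;
        *frame = mQueue.front();
        mQueue.pop_front();
        return true;
    }

private:
    std::size_t mDelay;
    bool mFlushing;
    std::deque<int> mQueue;
};

// Send every packet, receiving all available frames after each send,
// and optionally drain the decoder once the input is exhausted.
std::vector<int> decodeAll(const std::vector<int> &packets, std::size_t delay, bool flush) {
    ToyDecoder decoder(delay);
    std::vector<int> frames;
    int frame = 0;
    for (std::size_t i = 0; i < packets.size(); ++i) {
        decoder.sendPacket(packets[i]);
        while (decoder.receiveFrame(&frame)) frames.push_back(frame);
    }
    if (flush) {
        decoder.sendFlush();
        while (decoder.receiveFrame(&frame)) frames.push_back(frame);
    }
    return frames;
}
```

With five packets and a delay of two, decodeAll returns only three frames unless flush is true, which is why a real decoding loop should drain the decoder after the last packet.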

bool VideoDecoding::readFrameProc()
{
    AVPacket packet;
    int tmpW = mFormatCtx->streams[mVideoStreamIndex]->codecpar->width;
    int tmpH = mFormatCtx->streams[mVideoStreamIndex]->codecpar->height;
    char outFile[128] = { 0 };
    snprintf(outFile, sizeof(outFile), "../assets/Sample_%dx%d_yuv420p.yuv", tmpW, tmpH);

    FILE *fd = fopen(outFile, "wb");
    if (!fd) {
        printf("Failed to open output file %s\n", outFile);
        return true;
    }

    AVFrame *frame = av_frame_alloc();
    if (!frame) {
        printf("Failed to allocate frame\n");
        fclose(fd);
        return true;
    }

    while (av_read_frame(mFormatCtx, &packet) >= 0) {
        // Only decode packets that belong to the video stream
        if (packet.stream_index == mVideoStreamIndex) {
            decodeVideoFrame(&packet, frame, fd);
        }

        av_packet_unref(&packet);
    }

    // A NULL packet puts the decoder into draining mode so buffered frames are output
    decodeVideoFrame(NULL, frame, fd);

    av_frame_free(&frame);
    fclose(fd);

    printf("Generate video files successfully!\nUse ffplay to play the yuv420p raw video.\n");
    printf("ffplay -f rawvideo -pixel_format yuv420p -video_size %dx%d %s\n", tmpW, tmpH, outFile);

    return false;
}

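bool VideoDecoding::decodeVideoFrame(AVPacket *pkt, AVFrame *frame, FILE *fd)
{
    // Hand one compressed packet to the decoder (a NULL packet starts draining)
    if (avcodec_send_packet(mCodecCtx, pkt) < 0) {
        printf("Failed to send packet to decoder\n");
        return true;
    }

    // One packet may produce zero or more frames, so receive until none are left
    while (true) {
        int ret = avcodec_receive_frame(mCodecCtx, frame);
        if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
            return false; // decoder needs more input, or is fully drained
        }
        if (ret < 0) {
            printf("Error during decoding\n");
            return true;
        }

        // Two ways to save the YUV data

        // Save as an uncompressed YUV video file
        saveYUV(frame, fd);

        // Save as PGM grayscale image files
        //savePGM(frame);

        printf("."); // program running state
    }
}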

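Saving the Decoded YUV Data

The previous step decoded the video. To verify that decoding really worked, the YUV data has to be saved in a format that can be viewed. Each frame can be saved as a separate image, or all frames can be saved as one YUV420p video file.

Saving as YUV420p Video

The layout of the YUV420 format is shown in the figure below (from Wikipedia):

In YUV420 the Y, U, and V sample counts are in the ratio 4:1:1 (a ratio of counts, not the 4:1:1 chroma subsampling scheme): the Y plane has h*w samples, and the U and V planes have h*w/4 samples each.
The p in YUV420p stands for planar, meaning the Y, U, and V components are stored one after another as separate planes. There is also YUV420sp (semi-planar), in which the U and V samples are interleaved.
After decoding, frame->data is an array of plane pointers; here data[0], data[1], and data[2] point to the Y, U, and V data respectively.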
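
To make the plane sizes concrete, here is a small sketch that computes where each plane sits inside one tightly packed yuv420p frame. It assumes 8 bits per sample and no row padding, and it rounds the chroma dimensions up for odd sizes. Yuv420pLayout and yuv420pLayout are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Byte layout of one tightly packed yuv420p frame
// (8 bits per sample, no row padding). Hypothetical helper type.
struct Yuv420pLayout {
    std::size_t ySize;     // bytes in the Y plane
    std::size_t uvSize;    // bytes in each of the U and V planes
    std::size_t uOffset;   // offset of the U plane from the start of the frame
    std::size_t vOffset;   // offset of the V plane from the start of the frame
    std::size_t frameSize; // total bytes per frame
};

Yuv420pLayout yuv420pLayout(int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    // Chroma is subsampled by 2 horizontally and vertically, rounding up.
    const std::size_t chromaW = (w + 1) / 2;
    const std::size_t chromaH = (h + 1) / 2;

    Yuv420pLayout layout;
    layout.ySize = w * h;
    layout.uvSize = chromaW * chromaH;
    layout.uOffset = layout.ySize;
    layout.vOffset = layout.ySize + layout.uvSize;
    layout.frameSize = layout.ySize + 2 * layout.uvSize;
    return layout;
}
```

For the 1280x534 sample video this gives a Y plane of 683520 bytes, U and V planes of 170880 bytes each, and 1025280 bytes per frame.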

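bool VideoDecoding::saveYUV(AVFrame *frame, FILE *fd)
{
    // Write each plane row by row: linesize can be larger than the visible
    // width when the decoder pads rows. Assumes the frame is yuv420p.
    for (int plane = 0; plane < 3; plane++) {
        int w = plane ? (frame->width + 1) / 2 : frame->width;   // chroma is half size
        int h = plane ? (frame->height + 1) / 2 : frame->height;
        for (int y = 0; y < h; y++) {
            fwrite(frame->data[plane] + y * frame->linesize[plane], 1, w, fd);
        }
    }
    return false;
}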
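
One detail worth keeping in mind when writing planes to disk: rows in a decoded AVFrame are linesize bytes apart, and linesize may be larger than the visible width when the decoder pads rows for alignment. Writing width*height bytes in one call is only correct when there is no padding. The sketch below shows the row-by-row copy on a plain byte buffer; packPlane is a hypothetical helper name, and src and stride stand in for frame->data[i] and frame->linesize[i].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy a plane whose rows start `stride` bytes apart into a tightly packed
// buffer of width*height bytes, dropping the padding at the end of each row.
std::vector<std::uint8_t> packPlane(const std::uint8_t *src, int stride, int width, int height) {
    assert(src != nullptr && width > 0 && height > 0 && stride >= width);
    std::vector<std::uint8_t> out;
    out.reserve(static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y) {
        const std::uint8_t *row = src + static_cast<std::size_t>(y) * stride;
        out.insert(out.end(), row, row + width);
    }
    return out;
}
```

For example, a 3x2 plane stored with a stride of 4 holds 8 bytes in memory but packs down to 6 bytes of pixel data.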

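Writing the uncompressed YUV data of every frame into one file produces raw YUV420p video, which can be played directly with the ffplay tool that ships with FFmpeg: ffplay -f rawvideo -pixel_format yuv420p -video_size 1280x534 file.yuv. Raw video has no header, so the image size must be given explicitly.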
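
Because a raw file carries no header, a quick sanity check is to confirm that its size is a whole multiple of the frame size for the dimensions you pass to ffplay. The following is a minimal sketch of that check, assuming 8-bit yuv420p; rawFrameCount is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Number of whole frames in a raw 8-bit yuv420p file of `fileBytes` bytes,
// or -1 if the size is not a multiple of the frame size (which usually means
// the width/height are wrong or the file is truncated).
std::int64_t rawFrameCount(std::int64_t fileBytes, int width, int height) {
    assert(fileBytes >= 0 && width > 0 && height > 0);
    const std::int64_t luma = static_cast<std::int64_t>(width) * height;
    const std::int64_t chroma = static_cast<std::int64_t>((width + 1) / 2) * ((height + 1) / 2);
    const std::int64_t frameBytes = luma + 2 * chroma;
    return (fileBytes % frameBytes == 0) ? fileBytes / frameBytes : -1;
}
```

At 1280x534 each frame takes 1025280 bytes, so a file of 10252800 bytes holds exactly 10 frames.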

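Saving as PGM Grayscale Images

PGM (portable graymap format) is a simple uncompressed grayscale image format. Opening a PGM file in a text editor shows that the first line is the marker 'P5', the second line is the width and height, the third line is the maximum gray value, and everything after that is the pixel gray data.
Since PGM is a grayscale format, it is enough to save the Y component that the decoded frame->data[0] points to.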
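
The header described above is small enough to build by hand. As a minimal sketch, the function below assembles a binary PGM (P5) image in memory from a tightly packed 8-bit grayscale buffer; makePgm is a hypothetical helper name and is independent of FFmpeg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Build a binary PGM (P5) image in memory: a text header followed by
// width*height raw gray bytes. `gray` must be tightly packed, row by row.
std::string makePgm(const std::vector<std::uint8_t> &gray, int width, int height) {
    assert(width > 0 && height > 0);
    assert(gray.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    // Header: magic number, then width and height, then the maximum gray value.
    std::string pgm = "P5\n" + std::to_string(width) + " " + std::to_string(height) + "\n255\n";
    pgm.append(reinterpret_cast<const char *>(gray.data()), gray.size());
    return pgm;
}
```

Writing the returned string to a file opened in binary mode yields an image that common viewers can open.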

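// pgm: Portable Gray Map
bool VideoDecoding::savePGM(AVFrame * frame)
{
    static int frameNum = 0;

    char pgmFile[64];
    snprintf(pgmFile, sizeof(pgmFile), "../assets/frame%d.pgm", frameNum++);
    FILE *pFile = fopen(pgmFile, "wb");
    if (!pFile) {
        printf("Failed to open %s\n", pgmFile);
        return true;
    }

    // Header: magic number, width and height, maximum gray value
    fprintf(pFile, "P5\n%d %d\n%d\n", frame->width, frame->height, 255);

    // Write the Y plane row by row, skipping any padding at the end of each row
    for (int i = 0; i < frame->height; i++) {
        fwrite(frame->data[0] + i*frame->linesize[0], 1, frame->width, pFile);
    }

    fclose(pFile);

    return false;
}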

Releasing Resources

Finally, do not forget to release the CodecContext and the FormatContext. Here this is done in the destructor.

VideoDecoding::~VideoDecoding()
{
    avcodec_free_context(&mCodecCtx);
    avformat_close_input(&mFormatCtx);
}

Sample Code

The complete code for this example can be downloaded from GitHub: https://github.com/lmshao/FFmpeg-Basic .
