FFmpeg Development Journey (3) --- Understanding Filter Graphs and Using the Subtitle Filter

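【Preface】

First, let's set aside the file format of the subtitles themselves.

Subtitles generally come in three kinds: embedded subtitles, burned-in subtitles, and external subtitles.

This post deals with external subtitles. The main topics are:

1. FFmpeg filter graph basics.

2. Adding subtitles with the FFmpeg subtitle filter.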

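【Main Text】

As mentioned above, subtitles come in three forms:

Embedded subtitles: the subtitles are muxed into the container as a subtitle stream.

Burned-in subtitles: the subtitles are drawn into the video and become part of the picture itself.

External subtitles: the subtitles are supplied as a separate file, usually in srt, ssa, or ass format.

Of the three, external subtitles are the most flexible, and they leave the video itself untouched.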

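In practice, adding subtitles with FFmpeg is quite easy; it is done with a video overlay technique.

Internally, FFmpeg uses libass to render the subtitles into bitmaps and then overlays those bitmaps onto the video.

We don't need to care about these internals: FFmpeg provides a very simple way to overlay subtitles, although there are a few pitfalls along the way.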

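First, a quick introduction to Filter, Filter Chain, and Filter Graph:

The word Filter is used in several fields (image filters, signal filters, and so on); in FFmpeg, a Filter usually stands for a particular algorithm.

Simply put, using a Filter means processing data with some algorithm.

Several Filters linked together form a filter chain ( Filter Chain ), and several filter chains combined form a filter graph ( Filter Graph ).

FFmpeg has many Filters. Here we only need the subtitle filter ( Subtitle Filter ), and there is only one filter chain.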

Before we can use a Filter, we have to create it:

bool SubtitleDecoder::init_subtitle_filter(AVFilterContext * &buffersrcContext, AVFilterContext * &buffersinkContext,
                                           QString args, QString filterDesc)
{
    const AVFilter *buffersrc = avfilter_get_by_name("buffer");
    const AVFilter *buffersink = avfilter_get_by_name("buffersink");
    AVFilterInOut *output = avfilter_inout_alloc();
    AVFilterInOut *input = avfilter_inout_alloc();
    AVFilterGraph *filterGraph = avfilter_graph_alloc();

    // The AVFilterInOut lists are only needed while parsing, so they are always freed
    auto release = [&output, &input] {
        avfilter_inout_free(&output);
        avfilter_inout_free(&input);
    };

    // On failure, also free the graph (which frees the filters inside it) so that nothing leaks
    auto fail = [&](int line) {
        qDebug() << "Has Error: line =" << line;
        release();
        avfilter_graph_free(&filterGraph);
        buffersrcContext = nullptr;
        buffersinkContext = nullptr;
        return false;
    };

    if (!buffersrc || !buffersink || !output || !input || !filterGraph)
        return fail(__LINE__);

    // Create the buffer source; it requires args
    if (avfilter_graph_create_filter(&buffersrcContext, buffersrc, "in",
                                     args.toStdString().c_str(), nullptr, filterGraph) < 0)
        return fail(__LINE__);

    // Create the buffer sink; it takes no args
    if (avfilter_graph_create_filter(&buffersinkContext, buffersink, "out",
                                     nullptr, nullptr, filterGraph) < 0)
        return fail(__LINE__);

    // The output pad of the buffer source feeds the "in" label of the parsed description
    output->name = av_strdup("in");
    output->next = nullptr;
    output->pad_idx = 0;
    output->filter_ctx = buffersrcContext;

    // The input pad of the buffer sink is fed by the "out" label of the parsed description
    input->name = av_strdup("out");
    input->next = nullptr;
    input->pad_idx = 0;
    input->filter_ctx = buffersinkContext;

    if (avfilter_graph_parse_ptr(filterGraph, filterDesc.toStdString().c_str(),
                                 &input, &output, nullptr) < 0)
        return fail(__LINE__);

    if (avfilter_graph_config(filterGraph, nullptr) < 0)
        return fail(__LINE__);

    // On success the graph stays alive; it remains reachable through buffersrcContext->graph
    release();
    return true;
}

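1. Use avfilter_get_by_name() to look up a Filter.

buffer is a special video filter called the buffer source. It has no inputs, and its audio counterpart is abuffer. Creating a buffer source requires [ args ].

buffersink is a special video filter called the buffer sink. It has no outputs, and its audio counterpart is abuffersink.

Roughly like this: [ buffer ] + |--------Filter Graph--------| + [ buffersink ]

2. Use avfilter_inout_alloc() to allocate two AVFilterInOut structures. Because this is a simple filter graph, only two are needed ( in, out ).

3. Use avfilter_graph_alloc() to allocate a filter graph.

4. Use avfilter_graph_create_filter() to create a Filter instance ( an instance is an AVFilterContext ) and add it to the filter graph. The args for the buffer source are: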

    QString args = QString::asprintf("video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
                                     m_width, m_height, codecContext->pix_fmt, time_base.num, time_base.den,
                                     codecContext->sample_aspect_ratio.num, codecContext->sample_aspect_ratio.den);

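5. Configure the endpoints of the filter chain. The names can look reversed at first: output describes the output pad of the buffer source and is labelled "in" because it feeds the input of the parsed filter description, while input describes the input pad of the buffer sink and is labelled "out". To make this easier to follow, I drew a diagram: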

|--------------Filter Graph--------------|
out in
[data frame] ==> (input)|buffersrc| => |Filter| => (output)|buffersink|

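6. Use avfilter_graph_parse_ptr() to parse a filter graph described by a string and add it to the filter graph.

Tip: the subtitle filter used in this post corresponds to the following ffmpeg command: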

./ffmpeg -i test.mp4 -vf "subtitles=filename='D\:\\test.ass':original_size=900x600" out.mp4

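Here, "subtitles=filename='D\:\\test.ass':original_size=900x600" is exactly the string description of the filter graph; in this case it is a very simple one.

There are two subtitle-related filters: subtitles and ass. The ass filter accepts only ASS files, and only ANSI or UTF-8 encoded ones; subtitles also handles other formats such as srt and has a charenc option for input that is not UTF-8.

Note: the path is written as D\:\\test.ass. The : is preceded by a \ because : already serves as the option separator in a filter description, so it has to be escaped; the \ in the path is itself escaped as \\.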

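7. Finally, use avfilter_graph_config() to check validity and configure all the links and formats in the filter graph.

At this point, the Filter has been created.

Now we put the Filter to use. The code looks a bit long, but it is actually quite simple: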

    while (m_runnable && av_read_frame(formatContext, packet) >= 0) {
        if (packet->stream_index == videoIndex) {
            // Send the packet to the decoder
            int ret = avcodec_send_packet(codecContext, packet);

            while (ret >= 0) {
                // Receive a decoded frame from the decoder
                ret = avcodec_receive_frame(codecContext, frame);

                if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) break;
                else if (ret < 0) goto Run_End;

                // If the subtitles were opened successfully, output the image produced by the subtitle filter
                if (subtitleOpened) {
                    if (av_buffersrc_add_frame_flags(buffersrcContext, frame, AV_BUFFERSRC_FLAG_KEEP_REF) < 0)
                        break;

                    while (true) {
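                        // Use a separate variable so the ret of the outer decoding loop is not overwritten
                        int sinkRet = av_buffersink_get_frame(buffersinkContext, filter_frame);

                        if (sinkRet == AVERROR(EAGAIN) || sinkRet == AVERROR_EOF) break;
                        else if (sinkRet < 0) goto Run_End;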

                        int dst_linesize[4];
                        uint8_t *dst_data[4];
                        av_image_alloc(dst_data, dst_linesize, m_width, m_height, AV_PIX_FMT_RGB24, 1);
                        SwsContext *swsContext = sws_getContext(filter_frame->width, filter_frame->height,
                                                                AVPixelFormat(filter_frame->format), m_width,
                                                                m_height, AV_PIX_FMT_RGB24, SWS_BILINEAR, nullptr, nullptr, nullptr);
                        sws_scale(swsContext, filter_frame->data, filter_frame->linesize, 0, filter_frame->height, dst_data, dst_linesize);
                        sws_freeContext(swsContext);
                        // Pass the stride explicitly; without it QImage assumes 32-bit aligned scanlines
                        QImage image = QImage(dst_data[0], m_width, m_height, dst_linesize[0], QImage::Format_RGB888).copy();
                        av_freep(&dst_data[0]);

                        m_frameQueue.enqueue(image);
                        av_frame_unref(filter_frame);
                    }
                } else {
                    // No subtitles found; output the decoded image directly (ret was already checked above)

                    int dst_linesize[4];
                    uint8_t *dst_data[4];
                    av_image_alloc(dst_data, dst_linesize, m_width, m_height, AV_PIX_FMT_RGB24, 1);
                    SwsContext *swsContext = sws_getContext(m_width, m_height, codecContext->pix_fmt, m_width, m_height, AV_PIX_FMT_RGB24,
                                                            SWS_BILINEAR, nullptr, nullptr, nullptr);
                    sws_scale(swsContext, frame->data, frame->linesize, 0, frame->height, dst_data, dst_linesize);
                    sws_freeContext(swsContext);
                    // Pass the stride explicitly; without it QImage assumes 32-bit aligned scanlines
                    QImage image = QImage(dst_data[0], m_width, m_height, dst_linesize[0], QImage::Format_RGB888).copy();
                    av_freep(&dst_data[0]);

                    m_frameQueue.enqueue(image);

                }
                av_frame_unref(frame);
            }
        }

        av_packet_unref(packet);
    }

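1. After getting an AVFrame from the decoder, we add it to the buffer source with av_buffersrc_add_frame_flags().

2. We call av_buffersink_get_frame() to pull a filtered frame from the buffer sink. Since our filtering consists of adding subtitles, this frame is the image with the subtitles drawn on it.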

With that, we have finished adding subtitles to a video using the subtitle filter.

The result looks like this:

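【Conclusion】

Phew, finally done. This post covered not only the basics of using FFmpeg's Filter Graph but also how to use the subtitle filter ( both from the command line and through the API ).

I also have to complain that material on FFmpeg Filters is really scarce, and the official example has been rehashed over and over ( everything is a copy of the official example, with little explanation ). Subtitle-related material is almost nonexistent ( outside China as well ), which made this post unusually hard to write.

Finally, here are the project links:

GitHub: https://github.com/mengps/FFmpeg-Learn .

CSDN: https://download.csdn.net/download/u011283226/11819233 ( includes an ass file and an mp4 file for testing ).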
