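I have recently been working on a video analysis product. The basic architecture is: pull the stream with ffmpeg, decode it with CUDA, then call the algorithm to analyze the frames and generate images. Once the product was finished, however, I found that the generated images were corrupted. At first I did not pay much attention: the RTSP stream was being carried over UDP, so losing a frame or two and getting a corrupted picture seemed perfectly normal (though I overlooked the fact that all of this was on a LAN). Besides, stream reading and decoding were already decoupled with one level of buffering, so there was little room left to optimize, and with the schedule so tight I could not spare the time to fix it.
But a bug I found soon afterwards forced me to take the problem seriously: multiple instances analyzing the same video stream ended up detecting different target objects, and the difference was large, as shown below:
The cause was obvious. Each instance dropped different data, so each one analyzed different content, and therefore the final detection results differed as well.
How to fix it? The first idea was to enlarge the cache. To apply flow control to local history files, each instance cached only 2 seconds of data. So I enlarged the cache again and again, from 2 seconds up to 50 seconds, and frames were still being lost, with the following problem:
It was frustrating. I started searching Baidu for "ffmpeg pull stream frame loss" and found that most of the suggested solutions switch the receive protocol to TCP and then enlarge the socket receive buffer (I had forgotten that whenever I wrote my own code to send or receive video or images, I always called setsockopt first to enlarge the socket buffer). The code is as follows: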
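AVDictionary* options = NULL;
av_dict_set(&options, "rtsp_transport", "tcp", 0); // force TCP; over UDP, 1080p streams lose packets and the picture gets corrupted
av_dict_set(&options, "max_delay", "5000000", 0); // maximum demuxing delay in microseconds (5 s); keys and values must not contain stray spaces
av_dict_set(&options, "buffer_size", "8388608", 0); // socket receive buffer size in bytes (8 MB)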
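Considering that TCP latency can become very large on a poor network (in which case the only option is to receive over UDP and accept some packet loss), I stayed with UDP after all, that is, I did not set the rtsp_transport field. After this change the problem was solved, but ffmpeg now printed the following log:
What on earth does "attempted to set receive buffer to size 8388608 but it only ended up set as 425984" mean? Could it cause problems later? I decided to read the ffmpeg source to find the root cause, but did not know where to start. I first looked at the source of av_dict_set (dict.c) and found no trace of buffer_size there, and later found none in options_table.h either. I then searched Baidu for the code of avformat_open_input, which also led nowhere, and I was close to giving up. Finally, a search for "ffmpeg buffer_size maximum value" turned up a clue: the post "Set RTSP/UDP buffer size in FFmpeg/LibAV" suggested that the relevant code might be in udp.c under the libavformat directory.
So I opened udp.c, and the relevant code was indeed in that file: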
static const AVOption options[] = {
{ "buffer_size", "System data size (in bytes)", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, .flags = D|E },
{ "bitrate", "Bits to send per second", OFFSET(bitrate), AV_OPT_TYPE_INT64, { .i64 = 0 }, 0, INT64_MAX, .flags = E },
{ "burst_bits", "Max length of bursts in bits (when using bitrate)", OFFSET(burst_bits), AV_OPT_TYPE_INT64, { .i64 = 0 }, 0, INT64_MAX, .flags = E },
{ "localport", "Local port", OFFSET(local_port), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, D|E },
{ "local_port", "Local port", OFFSET(local_port), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, .flags = D|E },
{ "localaddr", "Local address", OFFSET(localaddr), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
{ "udplite_coverage", "choose UDPLite head size which should be validated by checksum", OFFSET(udplite_coverage), AV_OPT_TYPE_INT, {.i64 = 0}, 0, INT_MAX, D|E },
{ "pkt_size", "Maximum UDP packet size", OFFSET(pkt_size), AV_OPT_TYPE_INT, { .i64 = 1472 }, -1, INT_MAX, .flags = D|E },
{ "reuse", "explicitly allow reusing UDP sockets", OFFSET(reuse_socket), AV_OPT_TYPE_BOOL, { .i64 = -1 }, -1, 1, D|E },
{ "reuse_socket", "explicitly allow reusing UDP sockets", OFFSET(reuse_socket), AV_OPT_TYPE_BOOL, { .i64 = -1 }, -1, 1, .flags = D|E },
{ "broadcast", "explicitly allow or disallow broadcast destination", OFFSET(is_broadcast), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, E },
{ "ttl", "Time to live (multicast only)", OFFSET(ttl), AV_OPT_TYPE_INT, { .i64 = 16 }, 0, INT_MAX, E },
{ "connect", "set if connect() should be called on socket", OFFSET(is_connected), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, .flags = D|E },
{ "fifo_size", "set the UDP receiving circular buffer size, expressed as a number of packets with size of 188 bytes", OFFSET(circular_buffer_size), AV_OPT_TYPE_INT, {.i64 = 7*4096}, 0, INT_MAX, D },
{ "overrun_nonfatal", "survive in case of UDP receiving circular buffer overrun", OFFSET(overrun_nonfatal), AV_OPT_TYPE_BOOL, {.i64 = 0}, 0, 1, D },
{ "timeout", "set raise error timeout (only in read mode)", OFFSET(timeout), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, D },
{ "sources", "Source list", OFFSET(sources), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
{ "block", "Block list", OFFSET(block), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = D|E },
{ NULL }
};
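Why dict.c shows no trace of buffer_size becomes clearer with a small model. av_dict_set only stores key/value strings; the names are matched later against an option table like the one above. The sketch below is a hypothetical, heavily simplified stand-in for that lookup (ToyOption and toy_apply_option are made-up names, not FFmpeg APIs), assuming only integer options with a min/max range:

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one entry of an option table. */
typedef struct {
    const char *name;
    int default_val;
    int min;
    int max;
} ToyOption;

/* Two entries copied in spirit from the udp.c table above. */
static const ToyOption toy_options[] = {
    { "buffer_size", -1,   -1, INT_MAX },
    { "pkt_size",    1472, -1, INT_MAX },
    { NULL, 0, 0, 0 }
};

/* Look up `key` in the table and parse `value` into *out.
 * Returns 0 on success, -1 if the key is unknown,
 * -2 if the value is not a number or is out of range. */
static int toy_apply_option(const char *key, const char *value, int *out)
{
    for (const ToyOption *o = toy_options; o->name; o++) {
        if (strcmp(o->name, key) != 0)
            continue;
        char *end = NULL;
        long v = strtol(value, &end, 10);
        if (end == value || *end != '\0' || v < o->min || v > o->max)
            return -2;
        *out = (int)v;
        return 0;
    }
    return -1; /* no table entry with exactly this name */
}
```

In this model, toy_apply_option("buffer_size", "8388608", &v) succeeds, while a key that does not match a table entry exactly (for example one with a stray leading space) matches nothing and returns -1.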
The code that prints the log is in the same file:
if (is_output) {
/* limit the tx buf size to limit latency */
tmp = s->buffer_size;
if (setsockopt(udp_fd, SOL_SOCKET, SO_SNDBUF, &tmp, sizeof(tmp)) < 0) {
log_net_error(h, AV_LOG_ERROR, "setsockopt(SO_SNDBUF)");
goto fail;
}
} else {
/* set udp recv buffer size to the requested value (default 64K) */
tmp = s->buffer_size;
if (setsockopt(udp_fd, SOL_SOCKET, SO_RCVBUF, &tmp, sizeof(tmp)) < 0) {
log_net_error(h, AV_LOG_WARNING, "setsockopt(SO_RECVBUF)");
}
len = sizeof(tmp);
if (getsockopt(udp_fd, SOL_SOCKET, SO_RCVBUF, &tmp, &len) < 0) {
log_net_error(h, AV_LOG_WARNING, "getsockopt(SO_RCVBUF)");
} else {
av_log(h, AV_LOG_DEBUG, "end receive buffer size reported is %d\n", tmp);
if(tmp < s->buffer_size)
av_log(h, AV_LOG_WARNING, "attempted to set receive buffer to size %d but it only ended up set as %d", s->buffer_size, tmp);
}
/* make the socket non-blocking */
ff_socket_nonblock(udp_fd, 1);
}
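Clearly, ffmpeg first sets the socket receive buffer with setsockopt and then reads it back with getsockopt to confirm that the setting took effect. The value read back was smaller than the value requested, hence the warning. For the details of why the request fails, see the article "Default and maximum values of the socket TCP buffer size". Put simply, the requested value exceeded the upper limit the system allows: on Ubuntu 16.04 the maximum is 208 KB (212992 bytes), which the kernel doubles to 416 KB, that is, the 425984 in the log.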
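The arithmetic behind 425984 can be checked outside ffmpeg. The following is a minimal sketch, assuming Linux, where the limit is exposed as /proc/sys/net/core/rmem_max and the kernel doubles the granted value; the function names are made up for illustration, and the model ignores the kernel's lower bound on very small requests:

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Value getsockopt(SO_RCVBUF) is expected to report on Linux:
 * the request is clamped to rmem_max, then doubled by the kernel. */
static long expected_rcvbuf(long requested, long rmem_max)
{
    long granted = requested < rmem_max ? requested : rmem_max;
    return granted * 2;
}

/* Read net.core.rmem_max; returns -1 if unavailable (e.g. not Linux). */
static long read_rmem_max(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/net/core/rmem_max", "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Request a receive buffer on a throwaway UDP socket and return the
 * size the kernel reports afterwards, or -1 on any error. */
static int probe_rcvbuf(int requested)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    int actual = -1;
    socklen_t len = sizeof(actual);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        actual = -1;
    close(fd);
    return actual;
}
```

With the numbers from the log, expected_rcvbuf(8388608, 212992) gives 425984. If the full 8 MB is really wanted, the usual approach on Linux is to raise the limit first, for example with sysctl -w net.core.rmem_max=8388608 (requires root), and then compare probe_rcvbuf(8388608) before and after.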
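I still felt that a 50-second cache per instance was far too large, so I reduced it to 5 seconds, but then a problem appeared again:
After searching Baidu, I learned that the "jitter buffer" is the buffer used to smooth out jitter; evidently some part of my program was processing too slowly. So I enlarged the cache to 6, 7, 8 ... 20 seconds, with no improvement, and in frustration changed it back to 50 seconds. That approach only mitigates "jitter buffer full" to some extent, though. To eliminate it completely, the only way is to set reorder_queue_size (the size of the reordering queue for received RTP packets) through av_dict_set. Its default is 500, and it can be tuned to the situation at hand.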
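To make the role of the reordering queue concrete, here is a toy model of such a queue. It is a sketch under stated assumptions, not FFmpeg's implementation: all names are hypothetical, the capacity is 4 instead of 500, and sequence-number wraparound is ignored. Packets that arrive out of order wait in the queue; when the queue fills up while a packet is still missing, the gap is given up as lost and the queued packets are released.

```c
#include <stddef.h>

#define TOY_QUEUE_CAP 4  /* hypothetical capacity; the text cites 500 as the default */

typedef struct {
    unsigned seqs[TOY_QUEUE_CAP]; /* waiting sequence numbers, ascending */
    size_t   count;               /* number of waiting packets */
    unsigned next_seq;            /* next sequence number to deliver */
    unsigned delivered;           /* packets handed on in order */
    unsigned skipped;             /* sequence numbers given up as lost */
    unsigned overflows;           /* how often the queue filled up */
} ToyReorderQueue;

static void toy_init(ToyReorderQueue *q, unsigned first_seq)
{
    q->count = 0;
    q->next_seq = first_seq;
    q->delivered = q->skipped = q->overflows = 0;
}

/* Deliver every waiting packet that continues the sequence. */
static void toy_drain(ToyReorderQueue *q)
{
    while (q->count > 0 && q->seqs[0] == q->next_seq) {
        for (size_t i = 1; i < q->count; i++)
            q->seqs[i - 1] = q->seqs[i];
        q->count--;
        q->next_seq++;
        q->delivered++;
    }
}

static void toy_push(ToyReorderQueue *q, unsigned seq)
{
    if (seq < q->next_seq)
        return;                       /* arrived too late: drop */
    if (seq == q->next_seq) {         /* in order: deliver at once */
        q->next_seq++;
        q->delivered++;
        toy_drain(q);
        return;
    }
    for (size_t i = 0; i < q->count; i++)
        if (q->seqs[i] == seq)
            return;                   /* duplicate: drop */
    size_t pos = q->count;            /* sorted insert; count < CAP holds here */
    while (pos > 0 && q->seqs[pos - 1] > seq) {
        q->seqs[pos] = q->seqs[pos - 1];
        pos--;
    }
    q->seqs[pos] = seq;
    q->count++;
    if (q->count == TOY_QUEUE_CAP) {  /* the "jitter buffer full" case */
        q->overflows++;
        q->skipped += q->seqs[0] - q->next_seq;
        q->next_seq = q->seqs[0];     /* give up on the missing packets */
        toy_drain(q);
    }
}
```

In this model, pushing 0, 2, 1 delivers all three in order, whereas losing packet 3 and then pushing 4, 5, 6, 7 fills the queue once and skips one sequence number. A larger capacity gives a late packet more time to arrive before the gap is abandoned, which is the trade-off reorder_queue_size controls.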