In the earlier article in this series, "EasyRTSPLive Efficient Transcoding: Efficient Decoding with EasyVideoDecoder", we decoded video into raw image data (YUV/RGB), which is then re-encoded according to the transcoding requirements at hand, such as scaling the resolution, adjusting the bitrate, or producing multiple bitrate outputs. To address how time-consuming it is to encode high-resolution, high-quality, or high-compression-ratio (e.g. H265) streams during transcoding, we use Nvidia's hardware encoder, aiming for the most efficient transcode and the lowest push latency.
EasyVideoEncoder is built on EasyNvEncoder, a hardware encoding library for Nvidia discrete graphics cards.
1. The interface declaration:
class EasyNvEncoder
{
public:
//codec: encoding format, 0=h264, 1=h265/hevc
int InitNvEncoder(int width,int height,int fps=25, int bitrate=4096, int gop=50, int qp=28, int rcMode=/*NV_ENC_PARAMS_RC_2_PASS_QUALITY*/NV_ENC_PARAMS_RC_CONSTQP,
char* encoderPreset = "Default", int codec = 0,int nDeviceType=0, int nDeviceID=0 );
//H264: get the SPS and PPS
int GetSPSAndPPS(unsigned char*sps,long&spslen,unsigned char*pps,long&ppslen);
//H265: get the VPS, SPS and PPS
int GetH265VPSSPSAndPPS(unsigned char*vps, long&vpslen, unsigned char*sps, long&spslen, unsigned char*pps, long&ppslen);
// The encoder InputFormat is fixed to YUV420PL (I420); to accept NV12, YUY2, etc., add a format conversion in Init() [12/18/2016 dingshuai]
unsigned char* NvEncodeSync(unsigned char* pYUV420, int inLenth, int& outLenth, bool& bKeyFrame);
//Close the encoder and stop encoding
int CloseNvEncoder();
};
2. EasyNvEncoder calling sequence
- Step 1: initialize the encoder and its parameters
//Initialize the encoder parameters
int InitNvEncoder(int width,int height,int fps, int bitrate, int gop,
int qp, int rcMode, char* encoderPreset , int codec, int nDeviceType, int nDeviceID)
{
//Parameter setup -- Start
memset(&m_encodeConfig, 0, sizeof(EncodeConfig));
m_encodeConfig.width = width;
m_encodeConfig.height = height;
m_nVArea = width*height;
m_nCheckyuvsize = m_nVArea*3/2;
//The encoder expects the bitrate in bps, but our input is in kbps, so *1024
m_encodeConfig.bitrate = bitrate*1024;
//Multi-pass encoding for better image quality only works in low-latency mode (LOW_LATENCY)
m_encodeConfig.rcMode = rcMode;//NV_ENC_PARAMS_RC_2_PASS_QUALITY
m_encodeConfig.encoderPreset = encoderPreset; //e.g. "Default", "HQ", "LowLatencyHQ"
//Default to the low-latency preset; this also selects the compression profile (HQ, HP, LOSSLESS ......)
m_encodeConfig.presetGUID = NV_ENC_PRESET_LOW_LATENCY_HQ_GUID;
// I-frame interval [12/16/2016 dingshuai]
m_encodeConfig.gopLength = gop;//NVENC_INFINITE_GOPLENGTH;
//CUDA
m_encodeConfig.deviceType = nDeviceType;
m_encodeConfig.deviceID = nDeviceID;
m_encodeConfig.codec = codec;//NV_ENC_H264;
m_encodeConfig.fps = fps;
m_encodeConfig.qp = qp;
m_encodeConfig.i_quant_factor = DEFAULT_I_QFACTOR;
m_encodeConfig.b_quant_factor = DEFAULT_B_QFACTOR;
m_encodeConfig.i_quant_offset = DEFAULT_I_QOFFSET;
m_encodeConfig.b_quant_offset = DEFAULT_B_QOFFSET;
m_encodeConfig.pictureStruct = NV_ENC_PIC_STRUCT_FRAME;
//Encoder output mode: 1=async, 0=sync
m_encodeConfig.enableAsyncMode = 0;
//The format fed to the encoder defaults to NV12 (hence the YUV420->NV12 conversion later)
m_encodeConfig.inputFormat = NV_ENC_BUFFER_FORMAT_NV12;
//The purpose of these parameters is not yet clear
m_encodeConfig.invalidateRefFramesEnableFlag = 0;
m_encodeConfig.endFrameIdx = INT_MAX;
//No B-frames; the encoder does not support them anyway, so setting numB has no effect
m_encodeConfig.numB = 0;
if (m_encodeConfig.numB > 0)
{
//PRINTERR("B-frames are not supported\n");
return -1;
}
// Other parameters: additions welcome...... [12/18/2016 dingshuai]
//
//
//Parameter setup -- END
//Initialize the encoder -- Start
NVENCSTATUS nvStatus = NV_ENC_SUCCESS;
switch (m_encodeConfig.deviceType)
{
#if defined(NV_WINDOWS)
case NV_ENC_DX9:
nvStatus = InitD3D9(m_encodeConfig.deviceID);
break;
case NV_ENC_DX10:
nvStatus = InitD3D10(m_encodeConfig.deviceID);
break;
case NV_ENC_DX11:
nvStatus = InitD3D11(m_encodeConfig.deviceID);
break;
#endif
// initialize Cuda
case NV_ENC_CUDA:
nvStatus = InitCuda(m_encodeConfig.deviceID, 0);
break;
}
if (nvStatus != NV_ENC_SUCCESS)
return -1;
if (m_encodeConfig.deviceType != NV_ENC_CUDA)
nvStatus = m_pNvHWEncoder->Initialize(m_pDevice, NV_ENC_DEVICE_TYPE_DIRECTX);
else
nvStatus = m_pNvHWEncoder->Initialize(m_pDevice, NV_ENC_DEVICE_TYPE_CUDA);
if (nvStatus != NV_ENC_SUCCESS)
return -2;
m_encodeConfig.presetGUID = m_pNvHWEncoder->GetPresetGUID(m_encodeConfig.encoderPreset, m_encodeConfig.codec);
nvStatus = m_pNvHWEncoder->CreateEncoder(&m_encodeConfig);
if (nvStatus != NV_ENC_SUCCESS)
{
Deinitialize();
return -3;
}
// Number of buffered encode frames [12/16/2016 dingshuai]
uint32_t uEncodeBufferCount = 1;
//Allocate the encode I/O buffers
nvStatus = AllocateIOBuffers(m_pNvHWEncoder->m_uMaxWidth, m_pNvHWEncoder->m_uMaxHeight, uEncodeBufferCount);
if (nvStatus != NV_ENC_SUCCESS)
return -4;
m_spslen = 0;
m_ppslen = 0;
memset(m_sps, 0x00, 100);
memset(m_pps, 0x00, 100);
m_bWorking = true;
return 1;
}
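As a quick sanity check on the buffer-size math above: for I420 the Y plane is width*height bytes and the U and V planes are each a quarter of that, so m_nCheckyuvsize works out to width*height*3/2. A minimal standalone sketch (I420FrameSize is a hypothetical helper name, not part of the library):

```cpp
#include <cassert>
#include <cstddef>

// Expected byte count of one I420 frame: the Y plane plus two quarter-size
// chroma planes, i.e. width*height*3/2 -- the same value the class stores
// in m_nCheckyuvsize and later validates against in NvEncodeSync.
static size_t I420FrameSize(int width, int height)
{
    size_t area = static_cast<size_t>(width) * height;
    return area + area / 2;
}
```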
Here we set the codec (0=H264, 1=H265; only these two are currently supported), the video resolution, frame rate, bitrate, I-frame interval (GOP), encoding quality, and the hardware-encoder-specific parameters. In detail:
//rcMode: rate control mode (bitrate/quality control), see the enum below:
// typedef enum _NV_ENC_PARAMS_RC_MODE
// {
// NV_ENC_PARAMS_RC_CONSTQP = 0x0, /**< Constant QP mode */
// NV_ENC_PARAMS_RC_VBR = 0x1, /**< Variable bitrate mode */
// NV_ENC_PARAMS_RC_CBR = 0x2, /**< Constant bitrate mode */
// NV_ENC_PARAMS_RC_VBR_MINQP = 0x4, /**< Variable bitrate mode with MinQP */
// NV_ENC_PARAMS_RC_2_PASS_QUALITY = 0x8, /**< Multi pass encoding optimized for image quality and works only with low latency mode */
// NV_ENC_PARAMS_RC_2_PASS_FRAMESIZE_CAP = 0x10, /**< Multi pass encoding optimized for maintaining frame size and works only with low latency mode */
// }
//encoderPreset: encoder preset
// Presets trade off encoding latency against image quality
// if (encoderPreset && (stricmp(encoderPreset, "HQ") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "LowLatencyHP") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "HP") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "LowLatencyHQ") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "BD") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "LOSSLESS") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "LowLatencyDefault") == 0))
// else if (encoderPreset && (stricmp(encoderPreset, "LosslessDefault") == 0))
// See "Preset GUIDS supported by the NvEncodeAPI interface" in nvEncoderAPI.h
- Step 2: fetch the encoder header parameters
If the codec is H264, GetSPSAndPPS returns the SPS and PPS header data, as in the following snippet:
//Get the SPS and PPS
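A sketch of how such a stricmp() chain can map the preset name onto a GUID. The GUID constants live in nvEncoderAPI.h; this hypothetical ResolvePresetName therefore returns the constant's name as a string so the example stays self-contained:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Case-insensitive string equality, standing in for stricmp(a, b) == 0.
static bool IEquals(const char* a, const char* b)
{
    if (!a || !b) return false;
    for (; *a && *b; ++a, ++b)
        if (std::tolower((unsigned char)*a) != std::tolower((unsigned char)*b))
            return false;
    return *a == *b;
}

// Hypothetical mapping from encoderPreset strings to preset GUID names,
// mirroring the stricmp() chain quoted above.
static std::string ResolvePresetName(const char* encoderPreset)
{
    if (IEquals(encoderPreset, "HQ"))                return "NV_ENC_PRESET_HQ_GUID";
    if (IEquals(encoderPreset, "HP"))                return "NV_ENC_PRESET_HP_GUID";
    if (IEquals(encoderPreset, "BD"))                return "NV_ENC_PRESET_BD_GUID";
    if (IEquals(encoderPreset, "LowLatencyHQ"))      return "NV_ENC_PRESET_LOW_LATENCY_HQ_GUID";
    if (IEquals(encoderPreset, "LowLatencyHP"))      return "NV_ENC_PRESET_LOW_LATENCY_HP_GUID";
    if (IEquals(encoderPreset, "LowLatencyDefault")) return "NV_ENC_PRESET_LOW_LATENCY_DEFAULT_GUID";
    if (IEquals(encoderPreset, "LOSSLESS"))          return "NV_ENC_PRESET_LOSSLESS_HP_GUID";
    if (IEquals(encoderPreset, "LosslessDefault"))   return "NV_ENC_PRESET_LOSSLESS_DEFAULT_GUID";
    return "NV_ENC_PRESET_DEFAULT_GUID"; // "Default" and anything unrecognized
}
```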
int GetSPSAndPPS(unsigned char*sps,long&spslen,unsigned char*pps,long&ppslen)
{
if (!m_bWorking)
{
return -1;
}
if (m_spslen == 0 || m_ppslen == 0)
{
unsigned char* pEncData = NULL;
int nDataSize = 0;
bool bKeyFrame = false;
unsigned char* pTempBuffer = new unsigned char[m_nCheckyuvsize];
memset(pTempBuffer, 0x00, m_nCheckyuvsize);
pEncData = NvEncodeSync(pTempBuffer, m_nCheckyuvsize, nDataSize, bKeyFrame);
if (pEncData && nDataSize>0)
{
GetH264SPSandPPS((char*)pEncData, nDataSize, (char*)m_sps, (int*)&m_spslen, (char*)m_pps, (int*)&m_ppslen);
}
m_encPicCommand.bForceIDR = 1;
if (pTempBuffer)
{
delete[] pTempBuffer;
pTempBuffer = NULL;
}
}
if (m_spslen>0&&m_ppslen>0)
{
memcpy(sps, m_sps, m_spslen);
memcpy(pps, m_pps, m_ppslen);
spslen = m_spslen;
ppslen = m_ppslen;
}
return 1;
}
If the codec is H265, GetH265VPSSPSAndPPS returns the VPS, SPS and PPS headers, as in the following snippet:
int GetH265VPSSPSAndPPS(unsigned char*vps, long&vpslen, unsigned char*sps,
long&spslen, unsigned char*pps, long&ppslen)
{
if (!m_bWorking)
{
return -1;
}
if (m_spslen == 0 || m_ppslen == 0)
{
unsigned char* pEncData = NULL;
int nDataSize = 0;
bool bKeyFrame = false;
unsigned char* pTempBuffer = new unsigned char[m_nCheckyuvsize];
memset(pTempBuffer, 0x00, m_nCheckyuvsize);
pEncData = NvEncodeSync(pTempBuffer, m_nCheckyuvsize, nDataSize, bKeyFrame);
if (pEncData && nDataSize>0)
{
GetH265VPSandSPSandPPS((char*)pEncData, nDataSize, (char*)m_vps, (int*)&m_vpslen, (char*)m_sps, (int*)&m_spslen, (char*)m_pps, (int*)&m_ppslen);
}
m_encPicCommand.bForceIDR = 1;
if (pTempBuffer)
{
delete[] pTempBuffer;
pTempBuffer = NULL;
}
}
spslen = m_spslen;
ppslen = m_ppslen;
vpslen = m_vpslen;
if (m_spslen > 0)
memcpy(sps, m_sps, m_spslen);
if(m_ppslen>0)
memcpy(pps, m_pps, m_ppslen);
if(m_vpslen)
memcpy(vps, m_vps, m_vpslen);
return 1;
}
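The H265 case differs only in the NAL header: it is two bytes, with the unit type in bits 1..6 of the first byte, and the header units are VPS=32, SPS=33, PPS=34. A minimal sketch of the type check a helper like GetH265VPSandSPSandPPS relies on:

```cpp
#include <cassert>
#include <cstdint>

// HEVC NAL unit type: bits 1..6 of the first header byte.
// VPS=32 (first byte 0x40), SPS=33 (0x42), PPS=34 (0x44).
static int H265NalType(uint8_t firstHeaderByte)
{
    return (firstHeaderByte >> 1) & 0x3F;
}
```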
- Step 3: call the encode function on each video frame
The encoder's InputFormat is fixed to YUV420PL (I420); if the source image is in another color format such as NV12 or YUY2, it must be converted before being handed to the encoder.
unsigned char* NvEncodeSync(unsigned char* pYUV420, int inLenth, int& outLenth, bool& bKeyFrame)
{
if( !m_bWorking || inLenth !=m_nCheckyuvsize)//initialization not finished, or the input does not match one I420 frame: return an error
{
outLenth = 0;
return NULL;
}
NVENCSTATUS nvStatus = NV_ENC_SUCCESS;
bool bError = false;
EncodeBuffer* pEncodeBuffer = m_EncodeBufferQueue.GetAvailable();
EncodeFrameConfig stEncodeFrame;
memset(&stEncodeFrame, 0, sizeof(stEncodeFrame));
stEncodeFrame.yuv[0] = pYUV420;//Y
stEncodeFrame.yuv[1] = pYUV420+m_nVArea;//U
stEncodeFrame.yuv[2] = pYUV420+m_nVArea+(m_nVArea>>2);//V
int nHalfWidth = m_encodeConfig.width >> 1;
stEncodeFrame.stride[0] = m_encodeConfig.width;
stEncodeFrame.stride[1] = nHalfWidth;
stEncodeFrame.stride[2] = nHalfWidth;
stEncodeFrame.width = m_encodeConfig.width;
stEncodeFrame.height = m_encodeConfig.height;
if (m_encodeConfig.deviceType == 0)//CUDA
{
//CUDA Lock
CCudaAutoLock cuLock((CUcontext)m_pDevice);//m_cuContext
nvStatus = PreProcessInput(pEncodeBuffer, stEncodeFrame.yuv, stEncodeFrame.width, stEncodeFrame.height,
m_pNvHWEncoder->m_uCurWidth, m_pNvHWEncoder->m_uCurHeight,
m_pNvHWEncoder->m_uMaxWidth, m_pNvHWEncoder->m_uMaxHeight);
if (nvStatus != NV_ENC_SUCCESS)
{
outLenth = 0;
return NULL;
}
nvStatus = m_pNvHWEncoder->NvEncMapInputResource(pEncodeBuffer->stInputBfr.nvRegisteredResource, &pEncodeBuffer->stInputBfr.hInputSurface);
if (nvStatus != NV_ENC_SUCCESS)
{
PRINTERR("Failed to Map input buffer %p\n", pEncodeBuffer->stInputBfr.hInputSurface);
bError = true;
outLenth = 0;
return NULL;
}
}
else//DirectX or any others
{
unsigned char *pInputSurface = NULL;
uint32_t lockedPitch = 0;
while (pInputSurface == NULL)
{
nvStatus = m_pNvHWEncoder->NvEncLockInputBuffer(pEncodeBuffer->stInputBfr.hInputSurface, (void**)&pInputSurface, &lockedPitch);
if (nvStatus != NV_ENC_SUCCESS)
return NULL;
if (pInputSurface == NULL)
{
nvStatus = m_pNvHWEncoder->NvEncUnlockInputBuffer(pEncodeBuffer->stInputBfr.hInputSurface);
if (nvStatus != NV_ENC_SUCCESS)
return NULL;
Sleep(1);
}
}
if (pEncodeBuffer->stInputBfr.bufferFmt == NV_ENC_BUFFER_FORMAT_NV12_PL)
{
unsigned char *pInputSurfaceCh = pInputSurface + (pEncodeBuffer->stInputBfr.dwHeight*lockedPitch);
CmnConvertYUVtoNV12(stEncodeFrame.yuv[0], stEncodeFrame.yuv[1], stEncodeFrame.yuv[2], pInputSurface,
pInputSurfaceCh, stEncodeFrame.width, stEncodeFrame.height, stEncodeFrame.width, lockedPitch);
}
}
nvStatus = m_pNvHWEncoder->NvEncEncodeFrame(pEncodeBuffer, &m_encPicCommand, m_encodeConfig.width, m_encodeConfig.height,
NV_ENC_PIC_STRUCT_FRAME, m_qpDeltaMapArray, m_qpDeltaMapArraySize);
if (nvStatus != NV_ENC_SUCCESS)
{
bError = true;
outLenth= 0;
return NULL;
}
pEncodeBuffer = m_EncodeBufferQueue.GetAvailable();
if (!pEncodeBuffer)
{
pEncodeBuffer = m_EncodeBufferQueue.GetPending();
// Fetch the encoded h264/h265 data [12/15/2016 dingshuai]
nvStatus = m_pNvHWEncoder->ProcessOutput(pEncodeBuffer, m_pOutputBuffer, m_nOutputBufLen);
if(nvStatus != NV_ENC_SUCCESS)
{
bError = true;
outLenth= 0;
}
if (m_encodeConfig.deviceType == 0)//CUDA
{
// UnMap the input buffer after frame done
if (pEncodeBuffer->stInputBfr.hInputSurface)
{
nvStatus = m_pNvHWEncoder->NvEncUnmapInputResource(pEncodeBuffer->stInputBfr.hInputSurface);
pEncodeBuffer->stInputBfr.hInputSurface = NULL;
}
//pEncodeBuffer = m_EncodeBufferQueue.GetAvailable();
}
else
{
nvStatus = m_pNvHWEncoder->NvEncUnlockInputBuffer(pEncodeBuffer->stInputBfr.hInputSurface);
if (nvStatus != NV_ENC_SUCCESS)
return NULL;
}
}
else
{
outLenth= 0;
return NULL;
}
if (m_encPicCommand.bForceIDR)
{
m_encPicCommand.bForceIDR = 0;
}
outLenth = m_nOutputBufLen;
return m_pOutputBuffer;
}
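For the non-CUDA path above, CmnConvertYUVtoNV12 interleaves the planar I420 chroma into NV12's single UV plane. A hypothetical standalone sketch of the same idea, without the stride/pitch handling the real helper needs:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// I420 -> NV12: copy the Y plane unchanged, then interleave the I420 U and V
// planes into NV12's combined UVUV... plane. Assumes tightly packed planes.
static void I420ToNV12(const uint8_t* i420, uint8_t* nv12, int width, int height)
{
    const int area = width * height;
    const int chromaCount = area / 4;       // samples per chroma plane
    const uint8_t* u = i420 + area;
    const uint8_t* v = u + chromaCount;
    std::memcpy(nv12, i420, area);          // Y plane is identical
    uint8_t* uv = nv12 + area;
    for (int i = 0; i < chromaCount; ++i)
    {
        uv[2*i]     = u[i];
        uv[2*i + 1] = v[i];
    }
}
```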
- Step 4: close the encoder and release its memory and GPU resources
int CloseNvEncoder()
{
m_bWorking = false;
NVENCSTATUS nvStatus = NV_ENC_SUCCESS;
ReleaseIOBuffers();
m_pNvHWEncoder->NvEncDestroyEncoder();
if (m_cuContext)
{
__cu(cuCtxDestroy(m_cuContext));
}
return nvStatus;
}
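Putting the four steps together, a session looks like the sketch below. EasyNvEncoder is replaced here with a trivial mock (same method names, simplified signatures) so the calling sequence is self-contained and does not require the real SDK:

```cpp
#include <cassert>
#include <string>

// Mock with the same method names as EasyNvEncoder, used only to illustrate
// the call order: Init -> GetSPSAndPPS -> NvEncodeSync loop -> Close.
struct MockEncoder
{
    std::string trace;
    int  InitNvEncoder(int, int, int, int) { trace += "init;";  return 1; }
    int  GetSPSAndPPS()                    { trace += "hdr;";   return 1; }
    void NvEncodeSync()                    { trace += "frame;"; }
    int  CloseNvEncoder()                  { trace += "close;"; return 0; }
};

static std::string RunEncodeSession(int frames)
{
    MockEncoder enc;
    enc.InitNvEncoder(1920, 1080, 25, 2048); // width, height, fps, kbps
    enc.GetSPSAndPPS();                      // fetch the headers once, up front
    for (int i = 0; i < frames; ++i)
        enc.NvEncodeSync();                  // one compressed frame per call
    enc.CloseNvEncoder();
    return enc.trace;
}
```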
For any technical questions, feel free to get in touch:
[email protected]
You are also welcome to join the EasyPlayer streaming player QQ group for discussion:
544917793