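VCU: Porting the Linux Driver to a Bare-Metal Driver
Preface
The previous post covered how the upper-level functions call into the hardware driver, which lives in libCommon/HardwareDriver.c. This post walks through the upper-level C++ control logic. Note that most data transfers go through PostMessage.
Getting started
System configuration
1. Start with the main function: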
ConfigFile cfg;
SetDefaults(cfg);
auto& FileInfo = cfg.FileInfo;
auto& Settings = cfg.Settings;
auto& StreamFileName = cfg.BitstreamFileName;
auto& RecFileName = cfg.RecFileName;
auto& RunInfo = cfg.RunInfo;
ParseCommandLine(argc, argv, cfg);
DisplayVersionInfo();
This sets the default parameters. Next, let's look at what ConfigFile actually holds:
/*************************************************************************//*!
\brief Whole configuration file
*****************************************************************************/
AL_INTROSPECT(category = "debug") struct ConfigFile
{
// \brief YUV input file name(s)
std::string YUVFileName;
// \brief Output bitstream file name
std::string BitstreamFileName;
// \brief Reconstructed YUV output file name
std::string RecFileName;
// \brief Name of the file specifying the frame numbers where scene changes
// happen (the command file)
std::string sCmdFileName;
// \brief Name of the file specifying the region of interest per frame
std::string sRoiFileName;
// \brief Folder where qp tables files are located, if load qp enabled.
std::string sQPTablesFolder;
#if AL_ENABLE_TWOPASS
// \brief Name of the file that reads/writes video statistics for TwoPassMode
std::string sTwoPassFileName;
#endif
// \brief Information relative to YUV input file (from section INPUT)
TYUVFileInfo FileInfo;
// \brief FOURCC Code of the reconstructed picture output file
TFourCC RecFourCC;//fourcc code
// \brief Sections RATE_CONTROL and SETTINGS
AL_TEncSettings Settings;
// \brief Section RUN
TCfgRunInfo RunInfo;
// \brief control the strictness when parsing the configuration file
bool strict_mode;
};
RunInfo is defined as follows:
typedef AL_INTROSPECT (category = "debug") struct tCfgRunInfo
{
bool bUseBoard;
SCHEDULER_TYPE iSchedulerType;
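bool bLoop; // loop encoding: restart from the beginning of the input at end of file
int iMaxPict; // maximum number of frames to encode
unsigned int iFirstPict; // index of the first frame to encode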
unsigned int iScnChgLookAhead;
std::string sMd5Path;
int eVQDescr;
IpCtrlMode ipCtrlMode;
std::string logsFile = "";
bool trackDma = false;
bool printPictureType = false;
AL_64U uInputSleepInMilliseconds;
}TCfgRunInfo;
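Here are the global defaults. The options are documented fairly clearly in encode_example.cfg, so I have copied the notes I consider most important (the lines beginning with #) into the code: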
void SetDefaults(ConfigFile& cfg)
{
cfg.BitstreamFileName = "Stream.bin"; // default output file is Stream.bin
cfg.RecFourCC = FOURCC(NULL);
AL_Settings_SetDefaults(&cfg.Settings);
cfg.FileInfo.FourCC = FOURCC(I420);
cfg.FileInfo.FrameRate = 0; // 0 fps, placeholder until the cfg file is parsed
cfg.FileInfo.PictHeight = 0; // height of 0 pixels, placeholder
cfg.FileInfo.PictWidth = 0; // width of 0 pixels, placeholder
cfg.RunInfo.bUseBoard = true;
cfg.RunInfo.iSchedulerType = SCHEDULER_TYPE_MCU; // scheduling is handled by the MCU
# Loop : specifies whether the encoder should loop back to the beginning of the YUV input stream when it reaches the end of the file
cfg.RunInfo.bLoop = false; // loop encoding disabled
cfg.RunInfo.iMaxPict = INT_MAX; // ALL
cfg.RunInfo.iFirstPict = 0; // start from frame 0
cfg.RunInfo.iScnChgLookAhead = 3;
cfg.RunInfo.ipCtrlMode = IPCTRL_MODE_STANDARD;
cfg.RunInfo.uInputSleepInMilliseconds = 0;
cfg.strict_mode = false;
}
Next come the encoder settings:
typedef AL_INTROSPECT (category = "debug") struct t_EncSettings
{
// Stream
AL_TEncChanParam tChParam[MAX_NUM_LAYER];
bool bEnableAUD;
bool bEnableFillerData;
uint32_t uEnableSEI;
AL_EAspectRatio eAspectRatio; /*!< specifies the display aspect ratio */
AL_EColourDescription eColourDescription;
AL_EScalingList eScalingList;
bool bDependentSlice;
bool bDisIntra;
bool bForceLoad;
int32_t iPrefetchLevel2;
uint16_t uClipHrzRange;
uint16_t uClipVrtRange;
AL_EQpCtrlMode eQpCtrlMode;
int NumView;
int NumLayer;
uint8_t ScalingList[4][6][64];
uint8_t SclFlag[4][6];
uint8_t DcCoeff[8];
uint8_t DcCoeffFlag[8];
bool bEnableWatchdog;
#if AL_ENABLE_TWOPASS
int LookAhead;
int TwoPass;
#endif
}AL_TEncSettings;
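The function that sets the defaults is below. It is fairly detailed, and since the configuration file explains each option well, I have pasted its notes directly into the code. Note that the lines starting with # use the cfg file's comment syntax; they only work in that file and are not valid C, so they would have to be removed or turned into // comments before compiling.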
void AL_Settings_SetDefaults(AL_TEncSettings* pSettings)
{
assert(pSettings);
Rtos_Memset(pSettings, 0, sizeof(*pSettings));
# Width, Height: frame width/height in pixels
# width and height shall be multiple of 8 pixels
pSettings->tChParam[0].uWidth = 0;
pSettings->tChParam[0].uHeight = 0;
# Profile : specifies the standard/profile to which the bitstream conforms
# allowed values : AVC_BASELINE, AVC_MAIN, AVC_HIGH, AVC_HIGH10, AVC_HIGH_422,
# HEVC_MAIN, HEVC_MAIN10, HEVC_MAIN_422_10...
pSettings->tChParam[0].eProfile = AL_PROFILE_HEVC_MAIN;
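pSettings->tChParam[0].uLevel = 51; // level 5.1; I noted 53 as the maximum. A higher level raises the allowed resolution, frame rate and bitrate limits rather than improving quality by itself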
pSettings->tChParam[0].uTier = 0; // MAIN_TIER
pSettings->tChParam[0].eOptions = AL_OPT_LF | AL_OPT_LF_X_SLICE | AL_OPT_LF_X_TILE;
pSettings->tChParam[0].eOptions |= AL_OPT_RDO_COST_MODE;
# BitDepth : specifies the bit depth of the luma and chroma samples in the encoded stream
# Format : FOURCC format of input file
# typical file formats : I420, I422, I0AL, I2AL...
# hardware supported formats : NV12, NV16, P010, P210... (depends of the hw ip)
pSettings->tChParam[0].ePicFormat = AL_420_8BITS;
pSettings->tChParam[0].uSrcBitDepth = 8;
# GopCtrlMode : specifies the Group Of Pictures configuration
# allowed values : DEFAULT_GOP, LOW_DELAY_P, LOW_DELAY_B, PYRAMIDAL_GOP
# default value : DEFAULT_GOP
pSettings->tChParam[0].tGopParam.eMode = AL_GOP_MODE_DEFAULT;
pSettings->tChParam[0].tGopParam.uFreqIDR = 0x7FFFFFFF;
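# Gop.Length : GOP length in frames including the I picture. 0 = Intra only. This one matters, so a note: H.265 can compress video with intra-frame coding only, which is then equivalent to still-image compression. Gop.Length sets how many frames are encoded before the next key frame is inserted. Setting it to 0 disables inter-frame prediction, so every frame is intra coded. For video I suggest about 25, since at that spacing the eye does not notice anything; for still images keep 0.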
#Gop.FreqIDR : minimum number of frames between two IDR pictures (IDR insertion depends on the position of the GOP boundary)
# allowed values : positive value or -1 to disable IDR insertion
pSettings->tChParam[0].tGopParam.uGopLength = 30;
pSettings->tChParam[0].tGopParam.eGdrMode = AL_GDR_OFF;
AL_Settings_SetDefaultRCParam(&pSettings->tChParam[0].tRCParam);
pSettings->tChParam[0].iTcOffset = -1;
pSettings->tChParam[0].iBetaOffset = -1;
pSettings->tChParam[0].eColorSpace = UNKNOWN;
# NumSlices : number of row-based slices used for each frame
pSettings->tChParam[0].uNumCore = NUMCORE_AUTO;
pSettings->tChParam[0].uNumSlices = 1;
pSettings->uEnableSEI = SEI_NONE;
pSettings->bEnableAUD = true;
pSettings->bEnableFillerData = true;
pSettings->eAspectRatio = AL_ASPECT_RATIO_AUTO;
pSettings->eColourDescription = COLOUR_DESC_BT_470_PAL;
# QPCtrlMode : specifies how to generate the QP per coding unit
# allowed values : UNIFORM_QP, AUTO_QP, LOAD_QP, LOAD_QP | RELATIVE_QP
# default value : UNIFORM_QP
pSettings->eQpCtrlMode = UNIFORM_QP;// ADAPTIVE_AUTO_QP;
pSettings->tChParam[0].eLdaCtrlMode = AUTO_LDA;
pSettings->eScalingList = AL_SCL_DEFAULT;
pSettings->bForceLoad = true;
pSettings->tChParam[0].pMeRange[SLICE_P][0] = -1; // Horz
pSettings->tChParam[0].pMeRange[SLICE_P][1] = -1; // Vert
pSettings->tChParam[0].pMeRange[SLICE_B][0] = -1; // Horz
pSettings->tChParam[0].pMeRange[SLICE_B][1] = -1; // Vert
pSettings->tChParam[0].uMaxCuSize = 5; // 32x32
pSettings->tChParam[0].uMinCuSize = 3; // 8x8
pSettings->tChParam[0].uMaxTuSize = 5; // 32x32
pSettings->tChParam[0].uMinTuSize = 2; // 4x4
pSettings->tChParam[0].uMaxTransfoDepthIntra = 1;
pSettings->tChParam[0].uMaxTransfoDepthInter = 1;
pSettings->NumLayer = 1;
pSettings->NumView = 1;
pSettings->tChParam[0].eEntropyMode = AL_MODE_CABAC;
pSettings->tChParam[0].eWPMode = AL_WP_DEFAULT;
pSettings->tChParam[0].eSrcMode = AL_SRC_NVX;
#if AL_ENABLE_TWOPASS
pSettings->LookAhead = 0;
pSettings->TwoPass = 0;
#endif
pSettings->tChParam[0].eVideoMode = AL_VM_PROGRESSIVE;
}
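Interlude:
You will notice that a lot of the code deals with this FOURCC thing. What is it exactly?
FourCC stands for Four-Character Code. It is a 32-bit identifier, essentially typedef unsigned int FOURCC;, and it is a four-character code that uniquely identifies a video data stream format.
OK, here is the definition: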
typedef uint32_t TFourCC;
#define FOURCC(A) ((TFourCC)(((uint32_t)((# A)[0])) \
| ((uint32_t)((# A)[1]) << 8) \
| ((uint32_t)((# A)[2]) << 16) \
| ((uint32_t)((# A)[3]) << 24)))
In effect it packs a four-character string into a 32-bit value, with the first character in the lowest byte.
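To make the packing concrete, here is a small sketch that reuses the macro as quoted above and adds a helper for turning a code back into text. The helper fourccToString is my own illustration and is not part of the original sources.

```cpp
#include <cstdint>
#include <string>

typedef uint32_t TFourCC;

// Same macro as quoted above: stringify the argument and pack its first four
// characters into a 32-bit value, first character in the lowest byte.
#define FOURCC(A) ((TFourCC)(((uint32_t)((# A)[0])) \
                             | ((uint32_t)((# A)[1]) << 8) \
                             | ((uint32_t)((# A)[2]) << 16) \
                             | ((uint32_t)((# A)[3]) << 24)))

// Hypothetical helper (not in the original code): unpack a FourCC into a
// 4-character string by reading one byte at a time, lowest byte first.
std::string fourccToString(TFourCC fourcc)
{
    std::string s(4, ' ');
    for (int i = 0; i < 4; ++i)
        s[i] = static_cast<char>((fourcc >> (8 * i)) & 0xFF);
    return s;
}
```

Because the # operator stringifies its argument without expanding it, FOURCC(NULL) in SetDefaults is simply the four characters N, U, L, L and has nothing to do with a null pointer.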
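Continuing
Next, main takes references to a few important members so they can be modified later:
auto& FileInfo = cfg.FileInfo; // input file info: mainly width, height and size, plus whether the YUV file is NV12, NV16, etc.
auto& Settings = cfg.Settings; // encoder settings, parsed from the cfg file given on the command line; otherwise the defaults are used
auto& StreamFileName = cfg.BitstreamFileName; // output bitstream file name
auto& RecFileName = cfg.RecFileName; // reconstructed YUV output file name
auto& RunInfo = cfg.RunInfo; // run information
Once the defaults are set, the command line and the configuration file are parsed. The most important part of the cfg parsing is this: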
if(g_Verbosity)
cerr << warning.str();
if(cfg.FileInfo.PictWidth > UINT16_MAX)
throw runtime_error("Unsupported picture width value");
if(cfg.FileInfo.PictHeight > UINT16_MAX)
throw runtime_error("Unsupported picture height value");
// set the width and height of the source picture
AL_SetSrcWidth(&cfg.Settings.tChParam[0], cfg.FileInfo.PictWidth);
AL_SetSrcHeight(&cfg.Settings.tChParam[0], cfg.FileInfo.PictHeight);
// set the picture bit depth, format, etc.
if(ipbitdepth != -1)
{
AL_SET_BITDEPTH(cfg.Settings.tChParam[0].ePicFormat, ipbitdepth);
}
cfg.Settings.tChParam[0].uSrcBitDepth = AL_GET_BITDEPTH(cfg.Settings.tChParam[0].ePicFormat);
if(AL_IS_STILL_PROFILE(cfg.Settings.tChParam[0].eProfile))
cfg.RunInfo.iMaxPict = 1;
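Then the version information is printed.
After that the default settings are applied, which again is mostly about the contents of the cfg file.
OK, that completes the file configuration. Next comes the encoder itself.
Obtaining the encoder instance
First, get the encoder's IP control wrapper: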
function<AL_TIpCtrl* (AL_TIpCtrl*)> wrapIpCtrl = GetIpCtrlWrapper(RunInfo);
auto pIpDevice = CreateIpDevice(!RunInfo.bUseBoard, RunInfo.iSchedulerType, Settings, wrapIpCtrl, RunInfo.trackDma, RunInfo.eVQDescr);
if(!pIpDevice)
throw runtime_error("Can't create IpDevice");
First, what is std::function? Here is an example:
#include <functional>
#include <iostream>
int f(int a, int b)
{
    return a + b;
}
int main()
{
    // std::function can hold any callable whose signature matches int(int, int)
    std::function<int(int, int)> func = f;
    std::cout << func(1, 2) << std::endl; // 3
    return 0;
}
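OK, std::function is a general-purpose polymorphic function wrapper. An instance of std::function can store, copy and invoke any callable target: functions, lambda expressions, bind expressions or other function objects, as well as pointers to member functions and pointers to data members.
In other words, wrapIpCtrl above is a callable that can be invoked directly. Its signature, AL_TIpCtrl* (AL_TIpCtrl*), means it takes an AL_TIpCtrl* and returns an AL_TIpCtrl*. Let's open the getter function to see: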
function<AL_TIpCtrl* (AL_TIpCtrl*)> GetIpCtrlWrapper(TCfgRunInfo& RunInfo)
{
function<AL_TIpCtrl* (AL_TIpCtrl*)> wrapIpCtrl;
switch(RunInfo.ipCtrlMode)
{
default:
// this is a lambda expression
wrapIpCtrl = [](AL_TIpCtrl* ipCtrl) -> AL_TIpCtrl*
{
return ipCtrl;
};
break;
}
return wrapIpCtrl;
}
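The above is a lambda expression. The general form of a C++ lambda is:
*[capture list] (parameters) mutable or exception specification -> return type { body }*
So the return type is AL_TIpCtrl* and the parameter is AL_TIpCtrl* ipCtrl. In effect it is just this:
AL_TIpCtrl* wrapIpCtrl(AL_TIpCtrl* ipCtrl)
{
    return ipCtrl; // identity: hand back the same pointer
}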
Rather convoluted!
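Why bother with a wrapper that returns its argument? My guess is that it is a hook: other control modes could return a wrapper that decorates the IP control object. Below is a minimal sketch of that idea. IpCtrl, CtrlMode, the Counting mode and g_wrapCalls are all hypothetical names of mine, not taken from the original sources.

```cpp
#include <functional>

struct IpCtrl { int id; };                   // hypothetical stand-in for AL_TIpCtrl
enum class CtrlMode { Standard, Counting };  // hypothetical modes

static int g_wrapCalls = 0;                  // how often the counting wrapper ran

// Pick a wrapper by mode, in the same style as GetIpCtrlWrapper.
std::function<IpCtrl*(IpCtrl*)> getWrapper(CtrlMode mode)
{
    switch (mode)
    {
    case CtrlMode::Counting:
        // A decorating wrapper: does some extra work, then passes the object on.
        return [](IpCtrl* ctrl) -> IpCtrl* { ++g_wrapCalls; return ctrl; };
    default:
        // The identity wrapper, equivalent to the lambda shown above.
        return [](IpCtrl* ctrl) -> IpCtrl* { return ctrl; };
    }
}
```

The caller only sees a std::function with a fixed signature, so it does not need to know which of the two lambdas it received.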
Next, the IP device instance is created:
shared_ptr<CIpDevice> CreateIpDevice(bool bUseRefSoftware, int iSchedulerType, AL_TEncSettings& Settings, function<AL_TIpCtrl* (AL_TIpCtrl*)> wrapIpCtrl, bool trackDma, int eVqDescr)
{
(void)bUseRefSoftware, (void)Settings, (void)wrapIpCtrl, (void)eVqDescr, (void)trackDma;
// only one scheduler type is supported here, so why not simply make it the default?
if(iSchedulerType == SCHEDULER_TYPE_MCU)
return createMcuIpDevice();
throw runtime_error("No support for this scheduling type");
}
Tedious... moving on:
static unique_ptr<CIpDevice> createMcuIpDevice()
{
auto device = make_unique<CIpDevice>();
// create the device's DMA allocator; m_pAllocator is a smart pointer that is reset and initialized here (I will not go through that source)
device->m_pAllocator.reset(createDmaAllocator("/dev/allegroIP"), &AL_Allocator_Destroy);
if(!device->m_pAllocator)
throw runtime_error("Can't open DMA allocator");
// create the MCU scheduler, discussed below
device->m_pScheduler = AL_SchedulerMcu_Create(AL_GetHardwareDriver(), device->m_pAllocator.get());
if(!device->m_pScheduler)
throw std::runtime_error("Failed to create MCU scheduler");
return device;
}
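The reset call with two arguments deserves a quick illustration: it hands the smart pointer both a raw pointer and the function that should free it. The sketch below uses hypothetical stand-ins (Allocator, createAllocator, destroyAllocator, Device, makeDevice) rather than the real Allegro types.

```cpp
#include <memory>

struct Allocator { int handle; };   // hypothetical stand-in for AL_TAllocator

static int g_destroyed = 0;         // counts calls to the custom deleter

Allocator* createAllocator(int handle) { return new Allocator{handle}; }
void destroyAllocator(Allocator* a) { ++g_destroyed; delete a; }

struct Device
{
    std::shared_ptr<Allocator> allocator;
};

std::unique_ptr<Device> makeDevice()
{
    auto device = std::make_unique<Device>();
    // reset(ptr, deleter): take ownership of ptr and remember how to free it.
    device->allocator.reset(createAllocator(42), &destroyAllocator);
    return device;
}
```

When the last owner of the shared_ptr goes away, destroyAllocator runs automatically, which is presumably why the original passes &AL_Allocator_Destroy instead of relying on delete.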
Expanding the function that creates the MCU scheduler:
static const TSchedulerVtable McuSchedulerVtable =
{
&destroy,
&createChannel,
&destroyChannel,
&encodeOneFrame,
&putStreamBuffer,
&getRecPicture,
&releaseRecPicture,
};
TScheduler* AL_SchedulerMcu_Create(AL_TDriver* driver, AL_TAllocator* pDmaAllocator)
{
AL_TSchedulerMcu* scheduler = Rtos_Malloc(sizeof(*scheduler));
if(!scheduler)
return NULL;
scheduler->vtable = &McuSchedulerVtable;
scheduler->driver = driver;
scheduler->allocator = pDmaAllocator;
return (TScheduler*)scheduler;
}
Rtos_Malloc here is just a thin wrapper around the standard library:
void* Rtos_Malloc(size_t zSize)
{
return malloc(zSize);
}
The driver passed in here is the IP we opened, that is, our low-level IP control layer. I will leave DMA aside for now.
static AL_DriverVtable hardwareDriverVtable =
{
&Open,
&Close,
&PostMessage,
};
static AL_TDriver hardwareDriver =
{
&hardwareDriverVtable
};
AL_TDriver* AL_GetHardwareDriver()
{
return &hardwareDriver;
}
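So a diagram helps here. In words: CIpDevice holds m_pAllocator and m_pScheduler; the scheduler (AL_TSchedulerMcu) holds a vtable, the driver and the allocator; and the driver (AL_TDriver) points to a vtable containing Open, Close and PostMessage.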
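The driver structure is the classic C-style "struct of function pointers" pattern. Here is a minimal sketch of how such a table dispatches calls. Every name and signature in it (Driver, DriverVtable, Driver_Open, the fake* functions) is an assumption of mine for illustration; the real AL_Driver_* signatures may differ.

```cpp
#include <cstring>

struct Driver;

// Table of operations; each backend supplies its own functions.
struct DriverVtable
{
    int (*open)(Driver* driver, const char* device);
    void (*close)(Driver* driver, int fd);
    int (*postMessage)(Driver* driver, int fd, long msgId, void* data);
};

struct Driver
{
    const DriverVtable* vtable;
};

// A fake backend that talks to no hardware at all.
static int fakeOpen(Driver*, const char* device)
{
    return std::strlen(device) > 0 ? 3 : -1;   // pretend descriptor 3
}
static void fakeClose(Driver*, int) {}
static int fakePostMessage(Driver*, int fd, long msgId, void* data)
{
    if (fd < 0)
        return -1;
    *static_cast<long*>(data) = msgId;         // echo the message id back
    return 0;
}

static const DriverVtable fakeVtable = { &fakeOpen, &fakeClose, &fakePostMessage };
static Driver fakeDriver = { &fakeVtable };

Driver* getFakeDriver() { return &fakeDriver; }

// Thin dispatch helpers: callers never touch the table directly.
int Driver_Open(Driver* d, const char* device) { return d->vtable->open(d, device); }
void Driver_Close(Driver* d, int fd) { d->vtable->close(d, fd); }
int Driver_PostMessage(Driver* d, int fd, long msgId, void* data)
{
    return d->vtable->postMessage(d, fd, msgId, data);
}
```

For a bare-metal port this is the convenient seam: in principle only the three functions behind the table need replacing, while the scheduler code that calls through it can stay as it is.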
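The functions in the scheduler's vtable essentially call the driver to send messages to the MCU for encoding and decoding. Let's look at createChannel first: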
static AL_ERR createChannel(AL_HANDLE* hChannel, TScheduler* pScheduler, AL_TEncChanParam* pChParam, TMemDesc* pEP1, AL_TISchedulerCallBacks* pCBs)
{
AL_ERR errorCode = AL_ERROR;
AL_TSchedulerMcu* schedulerMcu = (AL_TSchedulerMcu*)pScheduler;
Channel* chan = Rtos_Malloc(sizeof(*chan));
if(!chan)
{
errorCode = AL_ERR_NO_MEMORY;
goto channel_creation_fail;
}
Rtos_Memset(chan, 0, sizeof(*chan));
chan->driver = schedulerMcu->driver;
chan->fd = AL_Driver_Open(chan->driver, deviceFile);
if(chan->fd < 0)
{
perror("Can't open driver");
goto driver_open_fail;
}
struct al5_channel_config msg = { 0 };
setChannelParam(&msg.param, pChParam, pEP1);
chan->outputRec = pChParam->eOptions & AL_OPT_FORCE_REC;
AL_EDriverError errdrv = AL_Driver_PostMessage(chan->driver, chan->fd, AL_MCU_CONFIG_CHANNEL, &msg);
if(errdrv != DRIVER_SUCCESS)
{
if(errdrv == DRIVER_ERROR_NO_MEMORY)
errorCode = AL_ERR_NO_MEMORY;
/* the ioctl might not have been called at all,
* so the error_code might no be set. leave it to AL_ERROR in this case */
if((errdrv == DRIVER_ERROR_CHANNEL) && (msg.status.error_code != 0))
errorCode = msg.status.error_code;
goto fail;
}
assert(msg.status.error_code == 0);
setChannelFeedback(pChParam, &msg.status);
setCallbacks(chan, pCBs);
chan->shouldContinue = 1;
chan->thread = Rtos_CreateThread(&WaitForStatus, chan);
if(!chan->thread)
goto fail;
SetChannelInfo(&chan->info, pChParam);
*hChannel = (AL_HANDLE)chan;
return AL_SUCCESS;
fail:
AL_Driver_Close(schedulerMcu->driver, chan->fd);
driver_open_fail:
Rtos_Free(chan);
channel_creation_fail:
*hChannel = AL_INVALID_CHANNEL;
return errorCode;
}
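Here pChParam (AL_TEncChanParam) carries the channel parameters, and pEP1 (a TMemDesc) describes an allocated buffer by its physical address, from which the virtual address is derived.
Now for the key part.
Setting the channel parameters: this copies the information into the msg structure.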
void setChannelParam(struct al5_params* msg, AL_TEncChanParam* pChParam, TMemDesc* pEP1)
{
static_assert(sizeof(*pChParam) <= sizeof(msg->opaque_params), "Driver channel_param struct is too small");
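msg->size = 0; // size = 0 marks the start of the message
write(msg, pChParam, sizeof(*pChParam)); // append the channel parameters to msg
uint32_t uEp1VirtAddr = 0; // virtual address
if(pEP1)
uEp1VirtAddr = pEP1->uPhysicalAddr + DCACHE_OFFSET; // physical address plus DCACHE_OFFSET gives the virtual address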
write(msg, &uEp1VirtAddr, sizeof(uEp1VirtAddr));
}
That fills in the message. The structure written into msg, AL_TEncChanParam, is as follows:
typedef AL_INTROSPECT (category = "debug") struct __AL_ALIGNED__ (4) AL_t_EncChanParam
{
int iLayerID;
/* Encoding resolution */
uint16_t uWidth;
uint16_t uHeight;
AL_EVideoMode eVideoMode;
/* Encoding picture format */
AL_EPicFormat ePicFormat;
AL_EColorSpace eColorSpace;
AL_ESrcMode eSrcMode;
/* Input picture bitdepth */
uint8_t uSrcBitDepth;
/* encoding profile/level */
AL_EProfile eProfile;
uint8_t uLevel;
uint8_t uTier;
uint32_t uSpsParam;
uint32_t uPpsParam;
/* Encoding tools parameters */
AL_EChEncOption eOptions;
int8_t iBetaOffset;
int8_t iTcOffset;
int8_t iCbSliceQpOffset;
int8_t iCrSliceQpOffset;
int8_t iCbPicQpOffset;
int8_t iCrPicQpOffset;
uint8_t uCuQPDeltaDepth;
uint8_t uCabacInitIdc;
uint8_t uNumCore;
uint16_t uSliceSize;
uint16_t uNumSlices;
/* L2 prefetch parameters */
uint32_t uL2PrefetchMemOffset;
uint32_t uL2PrefetchMemSize;
uint16_t uClipHrzRange;
uint16_t uClipVrtRange;
/* MV range */
int16_t pMeRange[2][2]; /*!< Allowed range for motion estimation */
/* encoding block size */
uint8_t uMaxCuSize;
uint8_t uMinCuSize;
uint8_t uMaxTuSize;
uint8_t uMinTuSize;
uint8_t uMaxTransfoDepthIntra;
uint8_t uMaxTransfoDepthInter;
// For AVC
AL_EEntropyMode eEntropyMode;
AL_EWPMode eWPMode;
/* Gop & Rate control parameters */
AL_TRCParam tRCParam;
AL_TGopParam tGopParam;
bool bSubframeLatency;
AL_ELdaCtrlMode eLdaCtrlMode;
} AL_TEncChanParam;
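In other words, when we configure a channel ourselves we also have to write this structure into the message, followed by the virtual address.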
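The write helper itself is not shown in this post, so the following is only a sketch of how I assume such an append works: copy the bytes to the current end of an opaque buffer and advance the size. Params, ChanParam, appendBytes, packChannelParams and the 128-byte buffer are hypothetical; the real struct al5_params layout may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Params                 // hypothetical stand-in for struct al5_params
{
    uint32_t size;            // number of payload bytes used so far
    uint32_t opaque[32];      // raw payload; 128 bytes chosen for this sketch
};

struct ChanParam              // hypothetical, much smaller than AL_TEncChanParam
{
    uint16_t width;
    uint16_t height;
    uint8_t level;
};

// Append len bytes at the current end of the payload; false if they do not fit.
bool appendBytes(Params* msg, const void* data, size_t len)
{
    if (len > sizeof(msg->opaque) - msg->size)
        return false;
    std::memcpy(reinterpret_cast<uint8_t*>(msg->opaque) + msg->size, data, len);
    msg->size += static_cast<uint32_t>(len);
    return true;
}

// Same order as setChannelParam: the parameter struct first, then the address.
bool packChannelParams(Params* msg, const ChanParam& param, uint32_t ep1Addr)
{
    msg->size = 0;            // start a fresh message
    return appendBytes(msg, &param, sizeof(param))
           && appendBytes(msg, &ep1Addr, sizeof(ep1Addr));
}
```

The receiving side has to read the fields back in exactly the same order and with the same struct layout, which is why the original guards the size with a static_assert.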
Continuing
Creating the mutex and the condition variable:
AL_EVENT Rtos_CreateEvent(bool bInitialState)
{
evt_t* pEvt = (evt_t*)Rtos_Malloc(sizeof(evt_t));
if(pEvt)
{
pthread_mutex_init(&pEvt->Mutex, 0);
pthread_cond_init(&pEvt->Cond, 0);
pEvt->bSignaled = bInitialState;
}
return (AL_EVENT)pEvt;
}
This one is simple enough to read on your own.
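For readers who want to see how such an event is typically used, here is a sketch of the same idea (a flag guarded by a mutex plus a condition variable) written with the C++ standard library instead of pthreads. The Event class and signalFromWorker are hypothetical, and I am assuming auto-reset behaviour, meaning a successful wait clears the flag; the real Rtos functions are not shown in this post.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

class Event
{
public:
    explicit Event(bool initialState) : signaled_(initialState) {}

    // Mark the event as signaled and wake one waiter.
    void set()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cond_.notify_one();
    }

    // Block until signaled or until the timeout expires.
    // Returns false on timeout; a successful wait clears the flag.
    bool wait(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cond_.wait_for(lock, timeout, [this] { return signaled_; }))
            return false;
        signaled_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool signaled_;
};

// A worker thread signals completion; the caller waits for it.
bool signalFromWorker()
{
    Event done(false);
    std::thread worker([&done] { done.set(); });
    bool ok = done.wait(std::chrono::seconds(5));
    worker.join();
    return ok;
}
```

The predicate passed to wait_for is what protects against spurious wakeups, and it also covers the case where set() runs before wait() is reached.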
Next, the cleanup is arranged. This part is a little different and uses a lambda expression again:
auto scopeMutex = scopeExit([&]() {
Rtos_DeleteEvent(hFinished);
});
The lambda captures the variables it uses by reference ([&]), which here means hFinished, and is then passed to scopeExit. Let's look at the class behind it:
template<typename Lambda>
class ScopeExitClass
{
public:
ScopeExitClass(Lambda fn) : m_fn(fn)
{
}
~ScopeExitClass()
{
m_fn();
}
private:
Lambda m_fn;
};
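This defines a class that stores the passed-in callable as a private member and invokes it in its destructor. So scopeMutex is an object of that class, and Rtos_DeleteEvent(hFinished) runs automatically when scopeMutex goes out of scope. OK, that is it for today.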
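The scopeExit helper itself is not shown above, so here is a sketch of how I assume it looks, together with a small demonstration of the order in which things run. The helper's body and runDemo are my own illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

template<typename Lambda>
class ScopeExitClass
{
public:
    explicit ScopeExitClass(Lambda fn) : m_fn(std::move(fn)) {}
    ~ScopeExitClass() { m_fn(); }   // run the stored callable on scope exit

private:
    Lambda m_fn;
};

// Assumed shape of the helper: deduce the lambda type and build the guard.
template<typename Lambda>
ScopeExitClass<Lambda> scopeExit(Lambda fn)
{
    return ScopeExitClass<Lambda>(std::move(fn));
}

// Records the order of events to show when the cleanup runs.
std::vector<std::string> runDemo()
{
    std::vector<std::string> log;
    {
        auto guard = scopeExit([&]() { log.push_back("cleanup"); });
        log.push_back("work");
    }   // guard is destroyed here, so "cleanup" is recorded
    log.push_back("after scope");
    return log;
}
```

One caveat with this sketch: returning the guard by value relies on copy elision, which is guaranteed from C++17 onward; without it the callable could run once per temporary.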
END
This post is getting too long, so it continues in the next one.