Implementing Document Transcoding with a Custom Thread Pool

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Background"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A long time ago, a former colleague at my company wrote a service that converts documents to images. The business flow is as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Users upload a document from the client, in formats such as PPT, Word, or PDF. Once the upload finishes, they can preview the document in the client, and the preview is rendered as images (don't tell me to preview it some other way; it's too late for that now)."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"After the user uploads the document to the cloud (Alibaba Cloud), the document's metadata is recorded in the database, and the document then waits to be transcoded."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"A transcoding service on the server (really just a Windows service) keeps polling for records awaiting transcoding. When it finds one, it reads the record from the database, downloads the document from its network address to local disk, and transcodes it into multiple images."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"When transcoding finishes, the generated images are uploaded to the cloud and their cloud locations are recorded in the database."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"When the client needs a preview, it checks the database to see whether transcoding succeeded and, if so, fetches the data to display."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is the overall preview flow. So what was wrong with the old transcoding service?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Because a document can only be transcoded by one thread at a time, the old service partitioned the pending records into channels. There were six channels, stored in the database roughly as a mapping of Id => channel ID."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"A console program read the pending documents of one channel, chosen by its configuration file, and transcoded them on a single thread."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"With six channels, six black cmd windows were running on the server..."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Occasionally a document would hang during transcoding because of its format or some other problem, and the visible symptom was that transcoding simply stopped."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Whenever the program hung, operations staff had to restart that cmd window (a painful way to maintain a service)."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Later, by a twist of fate, maintenance of this program fell to me (Caicai). After about a week of maintenance and more than ten restarts, I could not stand it any longer and decided to rebuild it. After careful analysis, and setting aside the core conversion itself, the overall transcoding flow still has many points that need attention:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The transcoding service must never get stuck; if it behaves like the old one, there is no point in redesigning it."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Avoid running multiple processes where possible; in this scenario, multiple processes and multiple threads achieve the same thing."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Each document must be transcoded only once. Transcoding a document several times wastes server resources and can also leave the data inconsistent."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Documents that fail to transcode need a limited number of retries. One failure does not mean the next attempt will fail, so a failed document must get another chance."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Because the program keeps converting documents into local image files, those files must be deleted from the server once transcoding completes; otherwise useless files pile up over time."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As all of this shows, there is quite a lot to watch out for. Taken as a whole, the transcoding flow is essentially a producer-consumer problem over a task pool: the tasks in the pool are the documents awaiting transcoding, the producer keeps putting pending documents into the pool, and the consumers keep transcoding the documents in it."}]},{"type":"heading","attrs":{"align":null,"level":3
},"content":[{"type":"link","attrs":{"href":"#線程池","title":null}},{"type":"text","text":"Thread pool"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is clearly very similar to a thread pool. I wrote an article about a thread pool a while ago; interested readers can look through my earlier posts. Today we use that thread pool to solve the transcoding problem. In essence, a thread pool starts a fixed number of threads that keep executing tasks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"using System;\nusing System.Collections.Concurrent;\nusing System.Collections.Generic;\nusing System.Linq;\nusing System.Threading;\n\n// Thread pool definition\npublic class LXThreadPool : IDisposable\n{\n    bool PoolEnable = true; // whether the pool is usable\n    List<Thread> ThreadContainer = null; // worker threads\n    ConcurrentQueue<ActionData> JobContainer = null; // pending jobs\n    int _maxJobNumber; // maximum number of queued jobs\n\n    // Ids of queued or running jobs with their enqueue time, used to reject duplicates\n    ConcurrentDictionary<string, DateTime> JobIdList = new ConcurrentDictionary<string, DateTime>();\n\n    public LXThreadPool(int threadNumber, int maxJobNumber = 1000)\n    {\n        if (threadNumber <= 0 || maxJobNumber <= 0)\n        {\n            throw new Exception(\"Thread pool initialization failed\");\n        }\n        _maxJobNumber = maxJobNumber;\n        ThreadContainer = new List<Thread>(threadNumber);\n        JobContainer = new ConcurrentQueue<ActionData>();\n        for (int i = 0; i < threadNumber; i++)\n        {\n            var t = new Thread(RunJob);\n            t.Name = $\"Transcode thread {i}\";\n            ThreadContainer.Add(t);\n            t.Start();\n        }\n        // Thread that clears the ids of timed-out jobs\n        var tTimeOutJob = new Thread(CheckTimeOutJob);\n        tTimeOutJob.Name = \"Timed-out job cleanup thread\";\n        tTimeOutJob.IsBackground = true; // must not keep the process alive\n        tTimeOutJob.Start();\n    }\n\n    // Add worker threads to the pool; returns the new thread count.\n    // Returns 0 when the pool is disabled, has no threads, or has no queued jobs.\n    public int AddThread(int number = 1)\n    {\n        if (!PoolEnable || ThreadContainer == null || !ThreadContainer.Any() || JobContainer == null || !JobContainer.Any())\n        {\n            return 0;\n        }\n        while (number > 0)\n        {\n            var t = new Thread(RunJob);\n            ThreadContainer.Add(t);\n            t.Start();\n            number--;\n        }\n        return ThreadContainer?.Count ?? 0;\n    }\n\n    // Add a job to the pool. Returns 1 on success, 0 if the pool is disposed or the queue is full,\n    // 100 if a job with the same id is already queued or running, 101 if the id could not be registered.\n    public int AddTask(Action<object> job, object obj, string actionId, Action<Exception> errorCallBack = null)\n    {\n        if (JobContainer != null)\n        {\n            if (JobContainer.Count >= _maxJobNumber)\n            {\n                return 0;\n            }\n            // First drop the ids of jobs that have not finished within 10 minutes\n            var timeoOutJobList = JobIdList.Where(s => s.Value.AddMinutes(10) < DateTime.Now).ToList();\n            foreach (var timeoutJob in timeoOutJobList)\n            {\n                JobIdList.TryRemove(timeoutJob.Key, out DateTime v);\n            }\n\n            if (!JobIdList.ContainsKey(actionId))\n            {\n                if (JobIdList.TryAdd(actionId, DateTime.Now))\n                {\n                    JobContainer.Enqueue(new ActionData { Job = job, Data = obj, ActionId = actionId, ErrorCallBack = errorCallBack });\n                    return 1;\n                }\n                else\n                {\n                    return 101;\n                }\n            }\n            else\n            {\n                return 100;\n            }\n        }\n        return 0;\n    }\n\n    private void RunJob()\n    {\n        while (JobContainer != null && PoolEnable)\n        {\n            // Take a job from the queue\n            ActionData job = null;\n            JobContainer?.TryDequeue(out job);\n            if (job == null)\n            {\n                // Sleep briefly when there is nothing to do\n                Thread.Sleep(20);\n                continue;\n            }\n            try\n            {\n                // Run the job\n                job.Job.Invoke(job.Data);\n            }\n            catch (Exception error)\n            {\n                // Report the failure through the error callback, if one was supplied\n                job.ErrorCallBack?.Invoke(error);\n            }\n            finally\n            {\n                // The job is finished, so its id may be submitted again\n                JobIdList.TryRemove(job.ActionId, out DateTime v);\n            }\n        }\n    }\n\n    // Shut the pool down\n    public void Dispose()\n    {\n        PoolEnable = false;\n        JobContainer = null;\n        if (ThreadContainer != null)\n        {\n            foreach (var t in ThreadContainer)\n            {\n                // Forcing a thread to exit is unsafe and raises exceptions, so wait for it to finish\n                t.Join();\n            }\n            ThreadContainer = null;\n        }\n    }\n\n    // Periodically clear the ids of timed-out jobs\n    private void CheckTimeOutJob()\n    {\n        while (PoolEnable)\n        {\n            // Drop the ids of jobs that have not finished within 10 minutes\n            var timeoOutJobList = JobIdList.Where(s => s.Value.AddMinutes(10) < DateTime.Now).ToList();\n            foreach (var timeoutJob in timeoOutJobList)\n            {\n                JobIdList.TryRemove(timeoutJob.Key, out DateTime v);\n            }\n            Thread.Sleep(60000);\n        }\n    }\n}\n\npublic class ActionData\n{\n    // Job id, used to reject duplicates\n    public string ActionId { get; set; }\n    // Argument passed to the job\n    public object Data { get; set; }\n    // The job to run\n    public Action<object> Job { get; set; }\n    // Callback invoked when the job throws\n    public Action<Exception> ErrorCallBack { get; set; }\n}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is a complete thread pool implementation. It is independent of any particular business logic and can be used anywhere a thread pool fits. One point to note: I added a task identifier, mainly to stop the same task from being submitted more than once (only tasks that are still queued or running are rejected). The code is certainly not optimal, so feel free to optimize it yourself."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"link","attrs":{"href":"#使用線程池","title":null}},{"type":"text","text":"Using the thread pool"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next we use this thread pool to carry out the document transcoding. At startup we initialize a thread pool and start a dedicated thread that keeps feeding tasks into the pool, plus a monitor thread that watches the feeding thread."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"string lastResId = null;\nstring lastErrorResId = null;\n\nDictionary<string, int> ResErrNumber = new Dictionary<string, int>(); // retry count per failed resource\nint MaxErrNumber = 5; // maximum number of retries for a failed resource\nThread tPutJoj = null;\nLXThreadPool pool = new LXThreadPool(4, 100);\n\npublic void OnStart()\n{\n    // Start a thread that submits transcoding tasks\n    tPutJoj = new Thread(PutJob);\n    tPutJoj.IsBackground = true;\n    tPutJoj.Start();\n\n    // Start the monitor thread\n    var tMonitor = new Thread(MonitorPutJob);\n    tMonitor.IsBackground = true;\n    tMonitor.Start();\n}\n\n// Watches the thread that submits jobs and restarts it if it has died\nprivate void MonitorPutJob()\n{\n    while (true)\n    {\n        if (tPutJoj == null || !tPutJoj.IsAlive)\n        {\n            Log.Error(\"Task submission thread stopped==========\");\n            tPutJoj = new Thread(PutJob);\n            tPutJoj.IsBackground = true;\n            tPutJoj.Start();\n            Log.Error(\"Task submission thread recreated and started==========\");\n        }\n        System.Threading.Thread.Sleep(5000);\n    }\n}\n\nprivate void PutJob()\n{\n    while (true)\n    {\n        try\n        {\n            // Look for documents waiting to be transcoded first\n            var fileList = DocResourceRegisterProxy.GetFileList(new int[] { (int)FileToImgStateEnum.Wait }, 30, lastResId);\n            Log.Error($\"Fetched pending records===lastResId:{lastResId}, count:{fileList?.Count() ?? 0}\");\n            if (fileList == null || !fileList.Any())\n            {\n                lastResId = null;\n                Log.Error(\"No pending records; fetching failed records to retry==========\");\n                // Nothing is pending, so retry documents that failed earlier\n                fileList = DocResourceRegisterProxy.GetFileList(new int[] { (int)FileToImgStateEnum.Error, (int)FileToImgStateEnum.TimeOut, (int)FileToImgStateEnum.Fail }, 1, lastErrorResId);\n                if (fileList == null || !fileList.Any())\n                {\n                    lastErrorResId = null;\n                }\n                else\n                {\n                    // Advance the cursor past every fetched record, including ones we give up on,\n                    // so the next poll moves on (assumes GetFileList returns records after the given id)\n                    lastErrorResId = fileList.Select(s => s.res_id).Max();\n                    // Log.Error($\"Retrying failed records: {JsonConvert.SerializeObject(fileList)}\");\n                    // DocResourceInfo is a placeholder for the element type returned by GetFileList\n                    List<DocResourceInfo> errFilter = new List<DocResourceInfo>();\n                    foreach (var errRes in fileList)\n                    {\n                        if (ResErrNumber.TryGetValue(errRes.res_id, out int number))\n                        {\n                            if (number > MaxErrNumber)\n                            {\n                                Log.Error($\"Resource {errRes.res_id} still fails after {MaxErrNumber} retries, giving up===========\");\n                                continue;\n                            }\n                            else\n                            {\n                                errFilter.Add(errRes);\n                                ResErrNumber[errRes.res_id] = number + 1;\n                            }\n                        }\n                        else\n                        {\n                            ResErrNumber.Add(errRes.res_id, 1);\n                            errFilter.Add(errRes);\n                        }\n                    }\n                    fileList = errFilter;\n                }\n            }\n            else\n            {\n                lastResId = fileList.Select(s => s.res_id).Max();\n            }\n\n            if (fileList != null && fileList.Any())\n            {\n                foreach (var file in fileList)\n                {\n                    // If the pool rejects the task because it is full, wait a second and submit again\n                    int poolRet = 0;\n                    while (poolRet <= 0)\n                    {\n                        poolRet = pool.AddTask(s => {\n                            AliFileService.ConvertToImg(file.res_id + $\".{file.res_ext}\", FileToImgFac.Instance(file.res_ext));\n                        }, file, file.res_id);\n                        if (poolRet <= 0 || poolRet > 1)\n                        {\n                            Log.Error($\"Failed to submit transcoding task==========pool returned:{poolRet}\");\n                            System.Threading.Thread.Sleep(1000);\n                        }\n                    }\n                }\n            }\n            // Poll the database every 3 seconds\n            System.Threading.Thread.Sleep(3000);\n        }\n        catch\n        {\n            // Swallow the error and back off briefly so a persistent failure does not spin\n            System.Threading.Thread.Sleep(3000);\n        }\n    }\n}\n\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is all the code for submitting tasks and having the thread pool run them. The actual conversion code is confidential, so it is not included here; if you need it, contact Caicai privately. I know there are better approaches, but I think the thread pool idea may help some readers. The core of the task timeout handling is shown below (it uses the Polly library):"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"var policy = Policy.Timeout(TimeSpan.FromSeconds(this.TimeOut), onTimeout: (context, timespan, task) =>\n{\n    // Mark the result as timed out\n    ret.State = Enum.FileToImgStateEnum.TimeOut;\n});\npolicy.Execute(s => {\n    .....\n});\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Share your better solution in the comments. Wishing everyone an even better 2020."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"More great articles"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342955119549267969&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Distributed High-Concurrency Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342959003139227648&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Architecture Design Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342962375443529728&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Fun with Algorithms and Data Structures Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342964237798391808&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Design Patterns Series"}]}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f8/f8af5984765a267892bf1a1272272625.png","alt":"image","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
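The article frames transcoding as a producer-consumer problem in which a document id must never be in flight twice. As one possible variation on that idea, here is a minimal sketch (not the author's implementation) that swaps the polling `ConcurrentQueue` for .NET's `BlockingCollection<T>`, so idle workers block instead of sleeping in a loop. The names `DedupJobQueue`, `TryEnqueue`, and `Demo` are made up for this illustration, and the 10-minute timeout cleanup is deliberately left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: a bounded job queue that rejects ids already queued or running.
public sealed class DedupJobQueue : IDisposable
{
    private readonly BlockingCollection<(string Id, Action Work)> _jobs;
    private readonly ConcurrentDictionary<string, byte> _active =
        new ConcurrentDictionary<string, byte>();
    private readonly Task[] _workers;

    public DedupJobQueue(int workerCount, int capacity)
    {
        _jobs = new BlockingCollection<(string Id, Action Work)>(capacity);
        _workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = Task.Factory.StartNew(Consume, TaskCreationOptions.LongRunning);
        }
    }

    // Returns false when the id is already queued or running, or when the queue is full.
    public bool TryEnqueue(string id, Action work)
    {
        if (!_active.TryAdd(id, 0)) return false;   // duplicate id
        if (_jobs.TryAdd((id, work))) return true;  // queued
        _active.TryRemove(id, out _);               // queue full: release the id
        return false;
    }

    private void Consume()
    {
        // Blocks until a job arrives; the loop ends once CompleteAdding is called
        // and the queue has drained.
        foreach (var job in _jobs.GetConsumingEnumerable())
        {
            try { job.Work(); }
            catch (Exception) { /* a real service would log or invoke an error callback */ }
            finally { _active.TryRemove(job.Id, out _); }
        }
    }

    // Stops accepting jobs, waits for queued work to finish, then releases resources.
    public void Dispose()
    {
        _jobs.CompleteAdding();
        Task.WaitAll(_workers);
        _jobs.Dispose();
    }
}

public static class Demo
{
    public static (bool first, bool duplicate, int ran) Run()
    {
        int ran = 0;
        bool first, duplicate;
        using (var gate = new ManualResetEventSlim(false))
        {
            using (var queue = new DedupJobQueue(workerCount: 2, capacity: 10))
            {
                // The first job waits on the gate, so its id is still active below.
                first = queue.TryEnqueue("doc-1", () => { gate.Wait(); Interlocked.Increment(ref ran); });
                duplicate = queue.TryEnqueue("doc-1", () => Interlocked.Increment(ref ran));
                gate.Set();
            } // Dispose waits for the queued job to finish
        }
        return (first, duplicate, ran);
    }
}
```

Calling `Demo.Run()` submits the same id twice while the first job is still in flight: the first call is accepted, the second is rejected, and exactly one job runs. Whether blocking consumers are preferable to the article's sleep-and-poll loop depends on the workload; the sketch only shows that the duplicate check and the queue bound carry over unchanged.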