Building a Data-Driven Sales Forecasting Model

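{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Editor's note"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sales forecasting is a key part of a company's production and operations, but uncertainty about future market demand and about where sales will come from makes it hard to do well, and companies are paying increasing attention to intelligent decision analysis across the full product life cycle. Over years of project work, the Percent Data Science Lab (百分點數據科學實驗室) has accumulated extensive practical experience and distilled a data-driven approach to building sales forecasting models. This article covers the forecasting objective, evaluation methods, and a case study with its results."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"I. The Value of Sales Forecasting"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. The business value of sales forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because supply chains respond with a lag, a company has to draw up a sales plan that matches market demand over the coming period as closely as possible, and then derive its production and procurement plans from that sales plan. But future demand is uncertain. If the company overestimates demand, inventory piles up and it bears inventory costs (storage fees and the cost of capital); if it underestimates demand, it runs out of stock and bears the opportunity cost of unrealized sales. Forecasting market demand accurately and efficiently, that is, sales forecasting, therefore becomes the key to reducing decision uncertainty and minimizing inventory and opportunity costs."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 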
Data-driven sales forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The traditional way for a company to forecast sales is estimation based on human experience, also known as the expert method. Taking a consumer goods manufacturer as an example, the sales plan is drawn up in the following steps:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Sales representatives in each region visit local customers to collect demand intentions, then apply their own judgment to draw up a regional sales plan."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Headquarters aggregates the regional sales plans into a national sales plan."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3) Headquarters adjusts the sales plan against quarterly or monthly performance targets, then sends it back to the regions for confirmation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(4) Once confirmed, the final sales plan is handed to the production department."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In essence, this process collects customers' demand information and then adjusts it with expert judgment to arrive at a forecast of future sales. The expert method can draw on long-accumulated business experience and human reasoning, but relying on it entirely has limitations:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Human experience can be biased, ignoring or exaggerating certain factors that affect sales. For example, headquarters may overestimate the effect of marketing policies when adjusting the sales plan."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The expert method is time-consuming and cannot cover a large number of products. For example, regional sales staff may simply ignore low-volume SKUs and not spend time collecting information on them."}]}]}]},{"type":"paragraph","attrs":
{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data-driven sales forecasting can address these problems. It uses algorithms to mine reproducible patterns from large volumes of historical data, then builds models on those patterns to predict future sales (Figure 1)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ce\/fb\/ce59f4fd79def46e002dc00e4f3276fb.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 1. Data-driven sales forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The way an algorithm mines patterns from data is in essence similar to human judgment: both establish links between sales and the factors that may affect them. These factors include the historical trend of sales, seasonality, holidays, product attributes, channel attributes, marketing spend, and competition (Figure 2)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ea\/0c\/ea700269d1fdc0832352fd75cc86ff0c.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 2. Factors that influence sales"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"II. The Difficulties of Sales Forecasting"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 
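There is no crystal ball"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Important as sales forecasting is, producing high-quality forecasts in practice is not easy, and accuracy in particular often falls short of expectations. When deep learning algorithms can already surpass humans at face recognition, why is sales forecasting still so hard? Before discussing this, we first need to be clear about where the uncertainty in future sales comes from. Uncertainty falls into three categories (Figure 3):"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/70\/1c\/70cbafc57fa9b61d239e3c00f854431c.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 3. Sources of uncertainty"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Unknown but knowable: uncertainty caused by randomness in the data, that is, noise."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Game outcomes: uncertainty caused by participants in a system forming expectations about the behavior of other participants."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3) Complex systems: “black swan” uncertainty caused by tiny parameter changes in a complex system being amplified through nonlinear transformations."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Of these three categories, forecasting models are suited only to the first. Face recognition falls into the first category: the structure and features of the human face have changed very slowly over the centuries. The second and third categories, by definition, cannot accumulate enough cases in historical data, so a forecasting model cannot learn the relevant patterns. The uncertainty in future sales, however, does not come from the first category alone. For example, a competitor's actions (pricing, new products) affect a company's sales, but those actions are themselves based on the competitor's expectations of the company's own strategy (a game outcome) and cannot be predicted from historical data. “Black swan” uncertainty is even easier to grasp: the COVID-19 pandemic that broke out last year is a vivid example. In other words, even if we could collect every factor in Figure 2 that influences sales, we could not predict sales with 100% accuracy. Therefore, when building a sales forecasting model, we should not set an ideal accuracy as the target; instead, we should compare the model against a baseline and assess the efficiency and accuracy gains it delivers."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Forecasts, goals, and plans"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond the limitations of forecasting methods, another pitfall is that companies often confuse forecasts, goals, and planning, which blurs the boundaries of forecasting and prevents it from demonstrating value in practice. According to the forecasting expert Hyndman [1]:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Forecasting is estimating the future value of a variable (future sales) as accurately as possible, based on historical data (past sales) and on future events that may occur (marketing spend)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Goals are what the company would like to happen or achieve (sales growth of 30%)."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Planning is the company's response to forecasts and goals, that is, what needs to be done (increasing marketing spend by 15%) to make the forecasts match the goals."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In real projects, a company typically draws up next month's sales plan each month. Because the sales plan is used for performance appraisal, next month's actual sales are highly correlated with it. To obtain a more accurate forecasting model, modelers therefore often add the sales plan to the model as a feature. But the whole purpose of the sales forecasting model is to help business staff set more reasonable sales plans, so which should come first, the sales forecast or the sales plan?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The root cause of this problem is a failure to distinguish forecasts, goals, and planning. In the example above, the sales plan is really a goal, namely the sales volume the company hopes to achieve. The sales forecasting model should not use the sales plan as a feature; the sales plan should be drawn up on the basis of the forecast. Likewise, when evaluating model performance, the model's forecast error should not be compared directly with the error between the sales plan and actual sales."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The reason the sales plan gets used as a feature is that it is a proxy variable for variables that usually cannot be observed. For example, to meet the sales plan, front-line sales staff visit customers more often, but the number of visits is not recorded, so the model cannot capture this information. The fundamental solution is therefore to collect data more comprehensively."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"III. A Solution for Sales Forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sales forecasting is a time series forecasting problem. Time series forecasting usually relies on classical models such as ETS and ARIMA, fitted to each series individually; to improve accuracy, several such models can be ensembled. This approach has limitations in sales forecasting, however. We start from the technical challenges of sales forecasting and use them to determine the final modeling solution."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Large-scale, multi-level, multi-series forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Problem: Sales forecasting can be viewed as a multi-level, multi-series problem. Specifically, sales can be split into many time series along dimensions such as product and geography. For a company that manages two product levels (category and SKU) and two geographic levels (region and store), the finest-grained series is region-store-category-SKU. A larger company may need to forecast tens of thousands or even hundreds of thousands of series, so the model has to handle a large number of series combinations."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A second issue is that the series are hierarchically related: SKUs belong to categories, and stores belong to regions. Modeling has to account for interactions between series and keep the hierarchy consistent: SKU sales must sum to category sales, and store sales must sum to regional sales."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solution: To capture interactions between series and let series at the same level share information, we model multiple time series jointly rather than using classical single-series models. Specifically, we feed all data for the finest-grained series (region-store-category-SKU) into one model and extract time series features through feature engineering (Figure 4). At prediction time, we aggregate the finest-grained forecasts to obtain higher-level series (such as category and store sales)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/de\/f6\/ded1c4518389d558d1c7731d76d9a5f6.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"typ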
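e":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Figure 4. Time series feature engineering"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because this approach targets the finest-grained series, the aggregated higher-level forecasts are not necessarily optimal. One improvement is to model the higher-level series (category or region) separately and then use forecast reconciliation to make the forecasts across levels consistent and improve them."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Multi-step forecasting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Problem: Multi-step forecasting means we care about several targets, for example monthly sales for each of the next one to three months. Classical time series models handle this by feeding the forecast for time T+1 in as an input for time T+2, forecasting recursively. The drawback is that forecast errors can accumulate. For example, if the model tends to over-forecast, that bias is amplified at every step."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solution: We build a separate model for each forecast horizon (T+1, T+2, and so on), which makes multi-step forecasts more stable. The cost is that the number of models to train grows in proportion to the number of horizons."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. 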
Intermittent demand"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Problem: When modeling the finest-grained series, sales are zero in some periods. This is known as intermittent demand and is common in sales forecasting. A large number of zeros in the training data biases the model and lowers accuracy."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solution: We handle this in two steps. First, we treat series with long runs of consecutive zeros as discontinued, remove them from the training data, and do not forecast them. After this filtering, some intermittent demand remains. Depending on the data, we use one or a combination of the following methods:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Train the model with a loss function that accounts for zero values, such as Tweedie loss."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Use a hurdle model: first train a classifier to predict whether sales are zero, then train a regressor to predict sales given that they are nonzero."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"IV. Evaluating Sales Forecasts"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sales forecasting models can be evaluated in many ways, which fall into two groups: technical metrics and business metrics."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 
Technical metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Technical metrics measure the model's forecast accuracy on a validation set or in production. The most common one is the mean absolute percentage error (MAPE), defined as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/1a\/ce\/1abd22c27962d4f1ac53018c3dca22ce.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MAPE's advantage is that, as a percentage error, it is easy for business staff to understand. But it has two significant problems that lead to unintuitive results in practice:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) MAPE is asymmetric: when the forecast is greater than the actual value, MAPE has no upper bound, whereas when a non-negative forecast is less than the actual value, MAPE is at most 100%."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) MAPE cannot be computed when the actual value is zero, which is a serious problem in sales forecasting, where intermittent demand is common."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address these problems, the symmetric mean absolute percentage error (sMAPE) was proposed, but sMAPE has problems of its own."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/8c\/9e\/8cef1aff6cf3035f30dd0b09268bef9e.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"i
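ndent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice we use the MAD Mean Ratio (the mean absolute deviation of the forecast divided by the mean of the actual values) as our technical evaluation metric. It is suited to intermittent demand and, like MAPE, is a percentage error that is easy to understand."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Business metrics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Business metrics measure the actual impact on the business once the model is in use, and are more intuitive and meaningful than model accuracy. They have to be designed for the specific business. Taking the consumer goods company again, business metrics relevant to the sales forecasting model include inventory turnover and the backorder rate."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"V. Implications for Business Design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From our project experience, we draw two implications for business design:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) To extract the most value from data, the design of the relevant business processes and IT systems needs to take the needs of analysis and modeling fully into account. For example, the database design of a typical business system does not keep point-in-time snapshots, so historical states of the data are unavailable for analysis and modeling, which in turn leads to problems such as temporal leakage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Sales forecasting is a technical tool and only delivers value when combined with business processes. Even if the model reaches a satisfactory accuracy, it will have no real impact on the business if forecasts, goals, and planning are confused, if expectations of the model are unrealistic, or if business staff cannot understand and accept its results."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"VI. Project Case Study"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 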
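Background and requirements"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A pharmaceutical company produces several hundred OTC drugs and sells them nationwide through a multi-tier distributor network. Supply chain management is critical to supporting such a complex business. The company's supply chain can be abstracted into a material flow and an information flow, together referred to as the production-sales coordination chain:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Material flow: raw material warehouse - production line - finished goods warehouse - logistics - channel warehouse - sales."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Information flow: demand forecast - channel orders - headquarters plan - production plan."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The company's production-sales coordination chain faces the following problems:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Data from the different stages of the production-sales coordination chain is not integrated."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Management stages are isolated from one another, and alert information is inconsistent across the supply-production-sales coordination process."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3) Sales forecasts are not fast or accurate enough, and supply-sales coordination cannot be adjusted dynamically quickly enough."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address the third problem, the project built a sales forecasting model on historical sales and inventory data, with the goals of greatly expanding the range of SKUs covered by forecasts and of providing more accurate and more frequent forecasts. Specifically, because the company's finest management granularity is region-store-category-SKU, we needed to model more than 90,000 time series; the forecast frequency is monthly, and the forecast horizon is 3-16 months."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 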
Approach and results"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These requirements match the technical challenges described in Section III, so we designed the project's modeling strategy along the lines set out there: we model all time series jointly, build a separate model for each forecast horizon, and use a hurdle model to handle intermittent demand. For features, we use several hundred features derived from sales, inventory, marketing policy, and other data. For the algorithm, we use LightGBM, which is efficient and well suited to structured data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Using time series cross-validation, we evaluated the model's MAD Mean Ratio on historical data. Compared with the manual baseline used before the model was introduced, the model reduced forecast error by 15% on the main SKUs, a good result."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"References"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Hyndman, R.J., & Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. OTexts.com\/fpp2. 
Accessed on 2021-03-23."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"This article is reprinted from the WeChat official account 百分點科技 (ID: baifendian_com)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original link"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s\/R5TWJSMCnwrcvPG-pt99pg","title":"","type":null},"content":[{"type":"text","text":"Building a Data-Driven Sales Forecasting Model"}]}]}]}
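The technical metrics discussed in the evaluation section can be illustrated with a minimal sketch. It assumes MAPE in its usual form, the mean of |actual - forecast| / |actual|, and takes the MAD Mean Ratio to be the mean absolute error divided by the mean of the actual values. The article gives its own formulas only as images, so these definitions, the function names, and the sample numbers are all assumptions made for illustration.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error; undefined if any actual value is zero."""
    if any(a == 0 for a in actual):
        raise ZeroDivisionError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def mad_mean_ratio(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute deviation of the forecast divided by the mean of actuals."""
    mad = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    mean_actual = sum(actual) / len(actual)
    if mean_actual == 0:
        raise ZeroDivisionError("ratio is undefined when all actual values are zero")
    return mad / mean_actual


# Asymmetry of MAPE: under-forecasting to zero costs 100%,
# while over-forecasting has no upper bound.
print(mape([100], [0]))    # forecast 0 against actual 100
print(mape([100], [300]))  # forecast 300 against actual 100

# Hypothetical monthly sales of one intermittent SKU (made-up numbers).
# MAPE cannot be computed here because two actual values are zero,
# but the MAD Mean Ratio is still well defined.
actual = [0, 4, 0, 6]
forecast = [1, 3, 1, 5]
print(mad_mean_ratio(actual, forecast))
```

Because the ratio divides by the mean of the whole series instead of by each actual value, individual zero-sales months do not make it undefined; it only breaks down when a series has no sales at all, which is the case the article filters out as discontinued.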