2020 MCM/ICM Finalist Paper (Part 1): Summary, Introduction, and Model Preparation

Soccer Teamwork Evaluation Models

  • 2020 MCM/ICM Problem D
  • Finalist solution

2020 Mathematical Contest in Modeling (MCM/ICM), ICM Problem D, Finalist

GitHub repository

Certificate

Summary

  • This paper combines graph theory, probability theory, and calculus with data analysis and machine learning to build models that give a soccer coach strategies for line-up arrangement and player training.

  • Firstly, a Pass Network Model (PNM) is built on graph theory, in which each edge weight measures how well a pair of players (a dyadic configuration) coordinates. A Pass Evaluation Index (PEI) scores each single pass, and the sum of the PEI values of all passes between two players defines the edge weight between them. For analysis, we build the adjacency matrix of the N players who take part in the match within a given period. Outstanding M-player configurations are then found by ranking all M-element combinations by the sum of the edge weights of the complete subgraph on those M nodes. In addition, a fitted function of passing against time is used to evaluate how time affects pass density and passing efficiency.

  • Secondly, the performance indicators that reflect successful teamwork are divided into static and dynamic indicators. Static indicators include player position arrangement and line-up, for which we build a player season heat map model and a player position model. Dynamic indicators include opponent strength, side, coach, passes, defence, attack, and fail. After data cleaning, we analyze the correlations among the dynamic indicators through visualization, label each match by its goal difference class, and use a random forest classifier as the evaluation model for the dynamic indicators. After tuning the parameters with grid search and applying cross-validation, the model reaches an accuracy of approximately 80%.

  • Thirdly, building on the models above, we focus on the role of static indicators in team performance and establish value evaluation models for different players in different positions, which combine each player's position with an evaluation of their technical statistics. To optimize the value of the 11-player permutation, we use the simulated annealing (SA) algorithm, which, after reaching a local optimum, keeps searching the cousin nodes under the same parent search tree for the global optimum. The model finally gives the starting line-up and formation with the best value. We also consider three secondary factors: tacit understanding between players, home and away influence, and coaching arrangements. Together, these analyses form our comprehensive suggestions to the coach.

  • Finally, we use the case of the Huskies to explain group dynamics, and we use the conclusions drawn from the Huskies models to explain how to design a more effective team and to supplement the team performance indicators.
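The network construction in the first point of the summary can be illustrated with a small sketch. It assumes a hypothetical pass log in which every pass already carries a PEI score, and it assumes that passes in both directions between two players are added into one undirected edge weight; the paper does not state either detail here, and the names `passes`, `build_adjacency`, and `top_configurations` are invented for illustration.

```python
from itertools import combinations

# Hypothetical pass log: (passer, receiver, PEI score of that single pass).
passes = [
    ("A", "B", 0.8), ("B", "A", 0.6), ("A", "C", 0.3),
    ("B", "C", 0.9), ("C", "D", 0.4), ("B", "D", 0.2),
    ("C", "B", 0.5),
]


def build_adjacency(passes):
    """Return the sorted player list and a symmetric edge-weight matrix."""
    players = sorted({p for a, b, _ in passes for p in (a, b)})
    index = {name: i for i, name in enumerate(players)}
    n = len(players)
    weights = [[0.0] * n for _ in range(n)]
    for passer, receiver, score in passes:
        i, j = index[passer], index[receiver]
        # Assumption: both pass directions contribute to one undirected edge.
        weights[i][j] += score
        weights[j][i] += score
    return players, weights


def top_configurations(players, weights, m, k=3):
    """Rank all m-player groups by the edge-weight sum of their complete subgraph."""
    scored = []
    for combo in combinations(range(len(players)), m):
        total = sum(weights[i][j] for i, j in combinations(combo, 2))
        scored.append((total, tuple(players[i] for i in combo)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


players, weights = build_adjacency(passes)
best_triads = top_configurations(players, weights, m=3, k=2)
print(best_triads)
```

Enumerating every combination is only practical for small N and M, which is the case for a match squad; a larger network would need a pruned search.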

Key words: Network; Graph theory; Calculus; Machine learning; Random forest classifier; Simulated annealing; Heat map; Group dynamics
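The random forest workflow named in the summary (grid search for tuning, then cross-validation) might look like the sketch below. It assumes scikit-learn and NumPy are available, uses synthetic data in place of the cleaned match table, and assumes a binary label for simplicity; the parameter grid is an arbitrary illustration and not the grid used in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the cleaned matches: 60 rows, 5 dynamic indicators.
n_matches = 60
X = rng.normal(size=(n_matches, 5))
# Assumed binary label (for example, positive goal difference or not).
noise = rng.normal(scale=0.5, size=n_matches)
y = (X[:, 2] + X[:, 3] - X[:, 4] + noise > 0).astype(int)

# Illustrative grid only; the paper's actual grid is not given in this part.
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 4, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

# Cross-validate the tuned model to estimate its accuracy.
cv_scores = cross_val_score(search.best_estimator_, X, y, cv=5)
mean_accuracy = cv_scores.mean()
print(search.best_params_, round(mean_accuracy, 3))
```

Scoring the tuned model on the same data that drove the grid search gives an optimistic estimate; a held-out set or nested cross-validation would be a stricter check.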

0 Contents

1 Introduction

  • 1.1 Background
  • 1.2 Problem Restatement

2 Preparation of the Models

  • 2.1 Processing Tools
  • 2.2 Data Cleaning

3 Establishment of PNM and Analysis of Influence Factors

  • 3.1 Pass Evaluation Index (PEI)
  • 3.2 Pass Network Model (PNM) and Recognition of Network Pattern
  • 3.3 Fluctuation of Passing State over Time

4 Soccer Team Indexes and Performance Prediction Based on ML

  • 4.1 Static Index (SI)
  • 4.2 Dynamic Index (DI)
    • 4.2.1 Data Cleaning and Feature Engineering
    • 4.2.2 Visualization Analysis
    • 4.2.3 RFC Establishment, Optimization, and Training

5 Design of Structural Strategies Driven by SA

  • 5.1 Position Evaluation Engineering (PEE)
  • 5.2 Optimization of Permutation and Combination Based on SA Algorithm
  • 5.3 Other Structural Strategy Factors
  • 5.4 Structural Strategy Conclusion

6 Model Extension Combined with Group Dynamics

  • 6.1 Group and Soccer Team
    • 6.1.1 Group Cohesiveness
    • 6.1.2 Group Standard and Group Pressure
    • 6.1.3 Individual Motivation and Group Goals
    • 6.1.4 Leadership and Group Performance
    • 6.1.5 Group Structure
  • 6.2 Other Influence Factors of Successful Teamwork

7 Evaluation

  • 7.1 Strengths
  • 7.2 Weaknesses

8 References

1 Introduction

1.1 Background

Football has a long history and has been loved all over the world since it was popularized; it can be considered the most popular sport in the world. Although it seems simple, it combines individual ability with team cooperation. As science and technology have advanced, players and coaches have continued to improve their skills and to show audiences wonderful matches. A wonderful match is inseparable from the contributions of both the players and the team. By studying the actions of every player, coordinating relationships within the team, and arranging playing minutes and the line-up reasonably, a team can achieve its best results.

1.2 Problem Restatement

Football is a sport suitable for all ages. Since its inclusion in international tournaments, people have created a variety of methods to evaluate team dynamics throughout a match and over an entire season, in order to determine specific strategies that can improve teamwork in the next season. We need to use the data provided by the ICM team to build models that solve the following four problems.

  1. Treat each player as a node and create a passing network to identify dyadic, triadic, and larger configurations. Under this network, we need to establish a value evaluation model for a single pass and an evaluation model of the total number of passes as a time-structure indicator.
  2. To identify the performance indicators that reflect successful teamwork, we need to consider static and dynamic indicators. We establish a model of the impact of each performance indicator on successful teamwork, and use one model to encompass these four sub-models.
  3. By observing and analyzing the models established in Problems 1 and 2, tell the coach which structural strategies suit the Huskies, and use the results of the model analysis to suggest how the coach can improve the team's success rate next season.
  4. Use the case of the Huskies to explain the theory of group dynamics, use the conclusions of the models built for the Huskies to explain how to design a more effective team, and supplement the team performance indicators.
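For the structural strategy in Problem 3, the summary states that the 11-player arrangement is optimized with simulated annealing. A minimal sketch of that idea follows, under stated assumptions: the value table is random placeholder data, exactly 11 players are assigned to 11 slots, a neighbouring line-up is produced by swapping two players, and the cooling schedule is geometric. None of these details come from the paper, and `make_values`, `lineup_value`, and `anneal` are hypothetical names.

```python
import math
import random


def make_values(n, seed=0):
    """Placeholder table: values[p][s] is the value of player p in slot s."""
    rng = random.Random(seed)
    return [[rng.uniform(0, 10) for _ in range(n)] for _ in range(n)]


def lineup_value(assignment, values):
    """Total value when assignment[s] is the player placed in slot s."""
    return sum(values[player][slot] for slot, player in enumerate(assignment))


def anneal(values, t_start=10.0, t_end=1e-3, cooling=0.995, seed=1):
    """Maximize line-up value by simulated annealing over permutations."""
    rng = random.Random(seed)
    n = len(values)
    current = list(range(n))
    rng.shuffle(current)
    current_score = lineup_value(current, values)
    best, best_score = current[:], current_score
    t = t_start
    while t > t_end:
        # Neighbour move (assumed): swap the players in two random slots.
        i, j = rng.sample(range(n), 2)
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_score = lineup_value(candidate, values)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability
        # exp(delta / t), which shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        t *= cooling  # geometric cooling schedule (assumed)
    return best, best_score


values = make_values(11)
best_lineup, best_score = anneal(values)
print(best_lineup, round(best_score, 2))
```

Accepting some worse moves at high temperature is what lets the search leave a local optimum; tracking `best` separately ensures the returned line-up is the best one visited rather than the last one.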

2 Preparation of the Models

2.1 Processing Tools

| Tool | Uses |
| --- | --- |
| Visual Studio Code 1.42 | Coding, Visualization |
| IPython 3.6.8 | Run Code |
| Visio | Design Flowchart |
| Excel | Arrange Dataset |
| GitHub | Synchronization, Storing |
| MindMaster | Plot Mind Map |

2.2 Data Cleaning

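A blank cell takes the value of the cell above it.

| Data Name | Processing Type | Feature Name |
| --- | --- | --- |
| Side | Map + Dummy | Side_1, Side_0 |
| Coach | Dummy | Coach_1, Coach_2, Coach_3 |
| Opponent | Strength Analysis | Oppo |
| Shots | Count | Attack |
| Dribbles | | |
| Touch | | |
| Corner | | |
| Offside | | |
| Tackle | Count | Defence |
| Dispossess | | |
| Aerial Won | | |
| Interception | | |
| Clearance | | |
| Blocks | | |
| Saves | | |
| Passes | Count | Pass |
| Possession | Search + Integrate | |
| Pass Success | Calculate | |
| Foul | Count | Fail |
| Loss of Possession | Search + Count | |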
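The mapping in the table can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the raw record layout, the encoding of a home match as Side = 1, and the choice to sum the counted statistics into one number per feature are all hypothetical, since this part of the paper does not say how the counts are combined. The Oppo and Pass features are left out because their derivation (strength analysis, search, integration) is not described here.

```python
# Counted statistics grouped by the feature they feed, following the table.
COUNT_GROUPS = {
    "Attack": ["Shots", "Dribbles", "Touch", "Corner", "Offside"],
    "Defence": ["Tackle", "Dispossess", "Aerial Won", "Interception",
                "Clearance", "Blocks", "Saves"],
    "Fail": ["Foul", "Loss of Possession"],
}


def one_hot(prefix, value, categories):
    """Dummy-encode one categorical value as 0/1 columns."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


def build_features(match):
    """Turn one raw match record into the cleaned feature dictionary."""
    features = {}
    # "Map + Dummy": map the side to 1/0 first (assumed: home = 1), then encode.
    side = 1 if match["Side"] == "home" else 0
    features.update(one_hot("Side", side, [1, 0]))
    features.update(one_hot("Coach", match["Coach"], [1, 2, 3]))
    # Assumption: counted statistics are simply summed per feature.
    for feature, columns in COUNT_GROUPS.items():
        features[feature] = sum(match[c] for c in columns)
    return features


# Hypothetical raw record for a single match.
match = {
    "Side": "home", "Coach": 2,
    "Shots": 12, "Dribbles": 8, "Touch": 30, "Corner": 5, "Offside": 2,
    "Tackle": 15, "Dispossess": 6, "Aerial Won": 10, "Interception": 9,
    "Clearance": 14, "Blocks": 3, "Saves": 4,
    "Foul": 11, "Loss of Possession": 13,
}
features = build_features(match)
print(features)
```

If the statistics sit on very different scales, a weighted or normalized combination may be more appropriate than a raw sum.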

Continued in: 2020 MCM/ICM Finalist Paper (Part 2): Establishment of PNM and Analysis of Influence Factors