Python Mini-Exercise: Linear Decay

Author: 凱魯嘎吉 - cnblogs (博客園) http://www.cnblogs.com/kailugaji/

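This post introduces the simplest decay curve: linear decay. Given schedule = [start, end, start_value, end_value], the value stays at start_value before step start, decreases linearly from step start until it reaches end_value at step end, and then stays at end_value from that point on.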
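Written as a formula, the schedule is a clamped linear interpolation. The symbols t (current step), r (clamped progress ratio), and v (scheduled value) are notation introduced here for illustration, and the sketch assumes start < end:

```latex
r(t) = \min\!\left(1,\ \max\!\left(0,\ \frac{t - \text{start}}{\text{end} - \text{start}}\right)\right),
\qquad
v(t) = \text{start\_value} + r(t)\,\bigl(\text{end\_value} - \text{start\_value}\bigr)
```

For t ≤ start the ratio is clamped to 0, so v(t) = start_value; for t ≥ end it is clamped to 1, so v(t) = end_value.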

1. get_scheduled_value_test.py

# -*- coding: utf-8 -*-
# Author: 凱魯嘎吉 Coral Gajic
# https://www.cnblogs.com/kailugaji/
# Python mini-exercise: linear decay
import numpy as np
import matplotlib.pyplot as plt
plt.rc('font', family='Times New Roman')

# Scheduled exploration noise with linear decay
def get_scheduled_value(current, schedule):
    start, end, start_value, end_value = schedule
    # Fraction of the decay window [start, end] elapsed at step `current`
    ratio = (current - start) / (end - start)
    # Clamp to [0, 1] so the value stays flat before start and after end
    ratio = max(0, min(1, ratio))
    value = (ratio * (end_value - start_value)) + start_value
    return value

start = 10  # step at which the decay begins
end = 100  # step at which the decay ends
start_value = 1  # decay from 1 down to 0.1
end_value = 0.1
schedule = [start, end, start_value, end_value]
exploration_noise = []
for i in range(int(end - start) + 1):
    value = get_scheduled_value(start + i, schedule)
    exploration_noise.append(value)

# -------------------- Plotting ------------------------
# Set the axis ranges manually
plt.xlim([0, end*1.3])
plt.ylim([0, start_value + 0.1])
my_time = np.arange(start, end+1)
exploration_noise = np.array(exploration_noise)
# Flat segment before start, decaying segment, flat segment after end
plt.plot([0, start], [start_value, start_value], color = 'red', ls = '-')
plt.plot(my_time, exploration_noise, color = 'red', ls = '-')
plt.plot([end, end*1.3], [end_value, end_value], color = 'red', ls = '-')
# Draw three faint dashed guide lines: y = end_value, x = start, x = end
plt.plot([0, end*1.3], [exploration_noise[-1], exploration_noise[-1]], color = 'gray', ls = '--', alpha = 0.3)
plt.text(end - end/3, exploration_noise[-1] + 0.03, "y = %.2f" %exploration_noise[-1], fontdict={'size': '12', 'color': 'gray'})
plt.plot([start, start], [0, start_value + 0.1], color = 'gray', ls = '--', alpha = 0.3)
plt.text(start + 0.5, start_value - 0.8, "x = %d" %start, fontdict={'size': '12', 'color': 'gray'})
plt.plot([end, end], [0, start_value + 0.1], color = 'gray', ls = '--', alpha = 0.3)
plt.text(end + 0.5, start_value - 0.8, "x = %d" %end, fontdict={'size': '12', 'color': 'gray'})
# Axis labels
plt.xlabel('Timestep')
plt.ylabel('Linear Decay')
plt.tight_layout()
plt.savefig('Linear Decay.png', bbox_inches='tight', dpi=500)
plt.show()
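As a possible alternative to evaluating the schedule one step at a time, the same clamped interpolation can be sketched with NumPy's np.interp, which holds the first and last values constant outside the given range. This is not part of the original script; the name get_scheduled_values and the sampled step range are assumptions made for illustration, and the sketch assumes start < end.

```python
import numpy as np

def get_scheduled_values(steps, schedule):
    """Vectorized linear decay (hypothetical helper, not from the original script)."""
    start, end, start_value, end_value = schedule
    # np.interp interpolates linearly between (start, start_value) and
    # (end, end_value), and returns the boundary values outside [start, end].
    return np.interp(steps, [start, end], [start_value, end_value])

schedule = [10, 100, 1.0, 0.1]
steps = np.arange(0, 131)  # assumed range: covers before, during, and after the decay
values = get_scheduled_values(steps, schedule)
```

Because the flat segments are handled by np.interp itself, a single array covers the whole time axis, and one plt.plot(steps, values) call could replace the three separate red line segments.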

2. Result
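The shape of the curve can also be checked numerically. The following minimal sketch re-declares get_scheduled_value so that it runs on its own and evaluates it at a few sample steps; the checkpoint values are chosen here for illustration and do not come from the original post.

```python
def get_scheduled_value(current, schedule):
    start, end, start_value, end_value = schedule
    ratio = (current - start) / (end - start)
    ratio = max(0, min(1, ratio))  # clamp to [0, 1]
    return ratio * (end_value - start_value) + start_value

schedule = [10, 100, 1, 0.1]
checkpoints = [0, 10, 55, 100, 130]  # hypothetical sample steps
samples = {t: get_scheduled_value(t, schedule) for t in checkpoints}
for t, v in samples.items():
    print(f"step {t:3d}: {v:.3f}")
# step   0: 1.000
# step  10: 1.000
# step  55: 0.550
# step 100: 0.100
# step 130: 0.100
```

The value is flat at 1 up to step 10, reaches the midpoint 0.55 halfway through the decay window at step 55, and stays at 0.1 from step 100 onward.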

3. References

[1] Yarats D, Fergus R, Lazaric A, et al. Mastering visual continuous control: Improved data-augmented reinforcement learning[J]. arXiv preprint arXiv:2107.09645, 2021.
