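Python Mini Exercise: Linear Decay
Author: 凱魯嘎吉 - 博客園 (cnblogs) http://www.cnblogs.com/kailugaji/
This post introduces one of the simplest decay curves: linear decay. Given schedule = [start, end, start_value, end_value], the value stays at start_value until timestep start, decays linearly from start to end, where it reaches end_value, and then stays at end_value from that point on.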
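Written as a formula, the schedule is a clamped linear interpolation. The symbols t (current timestep), r (clamped ratio), and v (scheduled value) are notation assumed here for illustration; they mirror the variables current, ratio, and value in the script of section 1.

```latex
\[
r(t) = \max\!\left(0,\ \min\!\left(1,\ \frac{t - \text{start}}{\text{end} - \text{start}}\right)\right),
\qquad
v(t) = \text{start\_value} + r(t)\,\bigl(\text{end\_value} - \text{start\_value}\bigr)
\]
% Worked example with schedule = [10, 100, 1, 0.1] and t = 55:
\[
r(55) = \frac{55 - 10}{100 - 10} = 0.5,
\qquad
v(55) = 1 + 0.5\,(0.1 - 1) = 0.55
\]
```

For t <= start the ratio clamps to 0 and v = start_value; for t >= end it clamps to 1 and v = end_value. The formula assumes end > start, since the denominator is zero otherwise.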
1. get_scheduled_value_test.py
# -*- coding: utf-8 -*-
# Author: 凱魯嘎吉 Coral Gajic
# https://www.cnblogs.com/kailugaji/
# Python mini exercise: linear decay
import numpy as np
import matplotlib.pyplot as plt
plt.rc('font', family='Times New Roman')

# Scheduled Exploration Noise
# linear decay
def get_scheduled_value(current, schedule):
    start, end, start_value, end_value = schedule
    # Fraction of the decay window [start, end] elapsed at step `current`
    # (with the values below, the window spans end - start = 90 steps)
    ratio = (current - start) / (end - start)
    ratio = max(0, min(1, ratio))  # clamp to [0, 1]
    value = (ratio * (end_value - start_value)) + start_value
    return value

start = 10  # decay starts at this timestep
end = 100  # the decay horizon
start_value = 1  # decay from 1 to 0.1
end_value = 0.1
schedule = [start, end, start_value, end_value]
exploration_noise = []
for i in range(int(end - start) + 1):
    value = get_scheduled_value(start + i, schedule)
    exploration_noise.append(value)

# -------------------- Plotting ------------------------
# Manually set the x- and y-axis ranges
plt.xlim([0, end*1.3])
plt.ylim([0, start_value + 0.1])
my_time = np.arange(start, end+1)
exploration_noise = np.array(exploration_noise)
# Flat segment before start, decaying segment, flat segment after end
plt.plot([0, start], [start_value, start_value], color = 'red', ls = '-')
plt.plot(my_time, exploration_noise, color = 'red', ls = '-')
plt.plot([end, end*1.3], [end_value, end_value], color = 'red', ls = '-')
# Draw 3 faint dashed guide lines
plt.plot([0, end*1.3], [exploration_noise[-1], exploration_noise[-1]], color = 'gray', ls = '--', alpha = 0.3)
plt.text(end - end/3, exploration_noise[-1] + 0.03, "y = %.2f" %exploration_noise[-1], fontdict={'size': '12', 'color': 'gray'})
plt.plot([start, start], [0, start_value + 0.1], color = 'gray', ls = '--', alpha = 0.3)
plt.text(start + 0.5, start_value - 0.8, "x = %d" %start, fontdict={'size': '12', 'color': 'gray'})
plt.plot([end, end], [0, start_value + 0.1], color = 'gray', ls = '--', alpha = 0.3)
plt.text(end + 0.5, start_value - 0.8, "x = %d" %end, fontdict={'size': '12', 'color': 'gray'})
# Axis labels
plt.xlabel('Timestep')
plt.ylabel('Linear Decay')
plt.tight_layout()
plt.savefig('Linear Decay.png', bbox_inches='tight', dpi=500)
plt.show()
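The same clamped schedule can also be evaluated for a whole array of timesteps at once. The sketch below is an alternative, not part of the original script: it relies on NumPy's `np.interp`, which interpolates linearly between the given points and holds the endpoint values outside them. The function name `get_scheduled_values` and the `end > start` check are assumptions added for illustration.

```python
import numpy as np

def get_scheduled_values(steps, schedule):
    """Vectorized linear decay (hypothetical helper, not from the original script)."""
    start, end, start_value, end_value = schedule
    if end <= start:
        # Assumed guard: the schedule is undefined when the window has no length
        raise ValueError("schedule requires end > start")
    # np.interp returns start_value left of `start` and end_value right of `end`
    return np.interp(steps, [start, end], [start_value, end_value])

schedule = [10, 100, 1.0, 0.1]
steps = np.arange(0, 131)  # timesteps 0..130, the same x-range as the plot
values = get_scheduled_values(steps, schedule)
# Values before, at the start of, in the middle of, at the end of, and after the decay
print(values[[0, 10, 55, 100, 130]])
```

Because the clamping happens inside `np.interp`, the three line segments that the plotting code draws separately come out of a single call here.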
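2. Results
The script saves the plot as Linear Decay.png. The curve stays at 1 up to timestep 10, falls linearly to 0.1 at timestep 100, and stays at 0.1 afterwards.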
3. References
[1] Yarats D, Fergus R, Lazaric A, et al. Mastering visual continuous control: Improved data-augmented reinforcement learning[J]. arXiv preprint arXiv:2107.09645, 2021.