Python for loop runs very slowly

Original

2020-04-26 09:09

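I was reading data from a fairly large file: an 80k × 10k matrix stored in a pickle file. The code loops over the rows, does some computation on each one, and saves the results to a file of the same matrix size. Once the code was written I started testing, and a single iteration took about 6 s. With 80k iterations in total, that is 480,000 s, or roughly 133 hours. Brutal; I could not afford that.
Here is the code first, followed by a closer look at the problem and how to solve it: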

    ...
    with open(save_path, 'rb') as fi1:
        result = pickle.load(fi1, encoding='iso-8859-1')
    n, fx = zip(*result)  # n: tuple of names, fx: tuple of rows

    # Named merged_results so the list does not shadow the merge_result() function.
    merged_results = []
    for i in range(len(result)):
        # starttime = datetime.now()
        f_x = np.array(fx)[i]  # problem is here: converts all of fx on every iteration
        name = result[i][0]
        full_result = merge_result(f_x)
        merged_results.append((name, full_result))
        if i % 1000 == 0:
            print('processing result:{}/80000\r'.format(i))
        # endtime = datetime.now()
        # print('consuming time:', (endtime - starttime))

    ...

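With the code above, one iteration takes about 5.92 s on a machine with a Titan Xp GPU. That is far too slow to be reasonable. I timed each part of the loop and found that the problem was the line f_x = np.array(fx)[i]. Because fx is a tuple, I had assumed it needed to be converted to an array before I could compute on it. What that line actually does is rebuild the whole matrix as a new NumPy array on every iteration, just to read one row from it. I later learned that core Python's general-purpose sequence types are the list and the tuple, and that the array type used here comes from NumPy rather than from the language itself. It was a basics problem: I did not understand Python well enough.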
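To make the cost of that line concrete, here is a minimal sketch that times both variants on a much smaller, made-up dataset. The names `rows_with_conversion`, `rows_direct`, and `timed` are hypothetical helpers, and the 200 × 1,000 size is an assumption chosen so the script finishes quickly; the absolute timings will differ from the ones reported in this post.

```python
import time
import numpy as np

# Hypothetical stand-in for the unpickled data: a tuple of 1-D arrays.
# Sizes are deliberately small; the data in the post is about 80k x 10k.
rng = np.random.default_rng(0)
fx = tuple(rng.random(1_000) for _ in range(200))

def rows_with_conversion(fx):
    # Rebuilds the full 2-D array from the tuple on every iteration,
    # then keeps a single row: O(rows * cols) work per iteration.
    return [np.array(fx)[i] for i in range(len(fx))]

def rows_direct(fx):
    # Indexing the tuple returns the existing row object: O(1) per iteration.
    return [fx[i] for i in range(len(fx))]

def timed(func, *args):
    # Run func once and return its result plus the elapsed wall-clock time.
    start = time.perf_counter()
    out = func(*args)
    return out, time.perf_counter() - start

slow_rows, slow_seconds = timed(rows_with_conversion, fx)
fast_rows, fast_seconds = timed(rows_direct, fx)

print(f"convert inside loop:  {slow_seconds:.4f} s")
print(f"index tuple directly: {fast_seconds:.6f} s")
```

Both functions return the same rows; only the amount of work per iteration differs, and the gap grows with the size of the matrix.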
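With the problem located, the direction of the fix was clear. It turned out there was no need to convert anything to a NumPy array at all, because indexing into fx already returns an array, so the whole matrix never has to be converted.
The revised code: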
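A small sketch of why indexing is enough, assuming each pickled entry is a (name, row) pair whose row was saved as a NumPy array. The two-entry `result` list below is made up for illustration.

```python
import numpy as np

# Hypothetical miniature of the unpickled `result`: (name, row) pairs.
result = [("a", np.array([1.0, 2.0])), ("b", np.array([3.0, 4.0]))]

n, fx = zip(*result)  # n: tuple of names, fx: tuple of rows

row = fx[1]
print(type(fx).__name__)     # tuple
print(type(row).__name__)    # ndarray
print(row is result[1][1])   # True: the same object, nothing is copied
```

Under that assumption, zip(*result) only regroups references to the existing rows, so fx[i] hands back an array that is ready to use.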

    ...
    with open(save_path, 'rb') as fi1:
        result = pickle.load(fi1, encoding='iso-8859-1')
    n, fx = zip(*result)  # n: tuple of names, fx: tuple of rows

    # Named merged_results so the list does not shadow the merge_result() function.
    merged_results = []
    for i in range(len(result)):
        # starttime = datetime.now()
        f_x = fx[i]  # fixed: index the tuple directly, no conversion
        name = result[i][0]
        full_result = merge_result(f_x)
        merged_results.append((name, full_result))
        if i % 1000 == 0:
            print('processing result:{}/80000\r'.format(i))
        # endtime = datetime.now()
        # print('consuming time:', (endtime - starttime))

    ...

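After the change I tested the speed again: one iteration now takes only about 0.127 s, so the full 80k iterations come to about 2.8 hours. That is much more comfortable.
Roughly a 47-fold speedup!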
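For completeness, here is a sketch of two other ways to write the loop. This is not the code from this post: `process_row` is a hypothetical placeholder for the per-row computation, and the tiny `result` list is made up. The first option iterates over the pairs directly; the second shows that, if a 2-D array really is needed, the conversion can be done once before the loop instead of inside it.

```python
import numpy as np

def process_row(row):
    # Hypothetical placeholder for the per-row computation.
    return row * 2.0

# Hypothetical miniature of the unpickled `result`: (name, row) pairs.
result = [("a", np.array([1.0, 2.0])), ("b", np.array([3.0, 4.0]))]

# Option 1: iterate over the (name, row) pairs directly; no indexing needed.
merged = [(name, process_row(row)) for name, row in result]

# Option 2: if a 2-D array is really needed, build it once, outside the loop.
matrix = np.array([row for _, row in result])  # shape (2, 2)
merged_from_matrix = [
    (name, process_row(matrix[i])) for i, (name, _) in enumerate(result)
]
```

Either way, the expensive conversion happens at most once rather than once per iteration.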

The root cause was still a shaky grasp of the basics. Keep learning and keep building experience; let's encourage each other.
