pymongo raises an error when reading a large data set
Traceback (most recent call last):
File "MongodbConvertMysql.py", line 91, in <module>
mcm.mongoConvertMysql()
File "MongodbConvertMysql.py", line 82, in mongoConvertMysql
ret_val = self.__delmysqlInsert(val)
File "MongodbConvertMysql.py", line 66, in __delmysqlInsert
for v in val:
File "build/bdist.linux-x86_64/egg/pymongo/cursor.py", line 1090, in next
File "build/bdist.linux-x86_64/egg/pymongo/cursor.py", line 1012, in _refresh
File "build/bdist.linux-x86_64/egg/pymongo/cursor.py", line 903, in __send_message
File "build/bdist.linux-x86_64/egg/pymongo/helpers.py", line 137, in _unpack_response
pymongo.errors.OperationFailure: database error: Plan executor error during find: Overflow sort stage buffered data usage of 33554492 bytes exceeds internal limit of 33554432 bytes
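The cause is fairly clear: "Sort operation used more than the maximum 33554432 bytes of RAM." 33554432 bytes works out to exactly 32 MB. When a sort cannot be served by an index, MongoDB pulls the data into memory and sorts it there; to save memory, sort operations are limited to 32 MB by default, so once the data to be sorted grows past 32 MB the exception is thrown.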
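The arithmetic is easy to check. Here is a small sketch in Python; the variable names are only for illustration, and the two byte counts are copied from the error message in the traceback.

```python
# Byte counts copied from the error message in the traceback.
limit_bytes = 33554432   # internal limit reported by MongoDB
used_bytes = 33554492    # buffered data when the sort stage overflowed

MIB = 1024 * 1024        # bytes per megabyte as used in the text (2**20)

limit_mib = limit_bytes / MIB
overflow_bytes = used_bytes - limit_bytes

print(limit_mib)         # 32.0
print(overflow_bytes)    # 60
```

In other words, the sort that failed was only 60 bytes over the limit, which is what you would expect when a collection grows gradually.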
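There are two ways to approach a fix: since memory is running short, change the default configuration to allocate more; or create an index, as the error message suggests.
First, changing the default memory limit. Run the following command in the MongoDB shell: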
db.adminCommand({setParameter:1, internalQueryExecMaxBlockingSortBytes:335544320})
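Since the failing script uses pymongo, the same command can probably be sent from Python as well. This is a minimal sketch, not a definitive implementation: `raise_sort_limit` is a hypothetical helper name, `client` is assumed to be a `pymongo.MongoClient` with permission to run `setParameter` on the admin database, and the server is assumed to be a version that still recognizes this parameter name.

```python
def raise_sort_limit(client, limit_bytes=335544320):
    """Send the setParameter command from the text through pymongo.

    Hypothetical helper: `client` is assumed to be a pymongo.MongoClient
    connected with enough privileges to run setParameter.
    """
    # Database.command(name, value, **kwargs) sends
    # {setParameter: 1, internalQueryExecMaxBlockingSortBytes: limit_bytes}
    # to the admin database and returns the server's reply as a dict.
    return client.admin.command(
        "setParameter", 1,
        internalQueryExecMaxBlockingSortBytes=limit_bytes,
    )

# Usage (the URI is an assumption; point it at your own server):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   print(raise_sort_limit(client))
```

The default of 335544320 bytes matches the value in the shell command above.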
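I simply raised the limit tenfold, to 320 MB. This shows that unless your server has plenty of memory, the memory used by sorts can become a serious drain on resources.
The second approach, creating an index, is also fairly simple: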
db.yourCollection.createIndex({<field>:<1 or -1>})
db.yourCollection.getIndexes() // list the current collection's indexes
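Here 1 means ascending order and -1 means descending order. The index takes effect as soon as it is created: there is no need to restart the database or the server program, and no need to change the existing queries. Creating an index does have downsides: writes become slower, and MongoDB's data takes up more storage. Even so, judged by query performance and server resource consumption, creating an index is the best way to solve this problem.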
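For a script that already talks to MongoDB through pymongo, the two shell commands above might look like the sketch below. `ensure_sort_index` is a hypothetical helper name, and `collection` is assumed to be a pymongo `Collection`; the field name in the usage comment is also made up for illustration.

```python
def ensure_sort_index(collection, field, direction=1):
    """Create a single-field index and return (index name, index info).

    Hypothetical helper: `collection` is assumed to be a pymongo Collection.
    direction is 1 for ascending or -1 for descending, as in the shell.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be 1 or -1")
    # Counterpart of db.yourCollection.createIndex({<field>: <1 or -1>})
    name = collection.create_index([(field, direction)])
    # Counterpart of db.yourCollection.getIndexes()
    return name, collection.index_information()

# Usage (database, collection, and field names are assumptions):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["mydb"]["yourCollection"]
#   name, info = ensure_sort_index(coll, "created_at", -1)
#   print(name, info)
```

The index should be on the field (and in the direction) that the failing query sorts by; otherwise the sort still happens in memory.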
Source: https://blog.csdn.net/cloume/article/details/70767061
http://blog.sina.com.cn/s/blog_4b623d4e0102wztq.html