The Python os.walk Function

Overview:
    os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up. It works on both Unix and Windows.
Syntax:
os.walk(top[, topdown=True[, onerror=None[, followlinks=False]]])
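Parameters:
  • top – the root directory of the walk. For every directory in the tree rooted at top (including top itself), os.walk() yields a 3-tuple (dirpath, dirnames, filenames): the path to the directory, the names of its subdirectories, and the names of its non-directory files.

  • topdown – optional. If True or not specified, the 3-tuple for a directory is generated before the 3-tuples for any of its subdirectories (top-down). If topdown is False, the 3-tuple for a directory is generated after the 3-tuples for all of its subdirectories (bottom-up).

  • onerror – optional; a function that is called with one argument, an OSError instance. It can report the error and let the walk continue, or raise an exception to abort the walk. By default (None), errors raised while listing a directory are ignored.

  • followlinks – optional. If set to True, the walk also descends into directories reached through symbolic links; by default (False) it does not.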
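The effect of topdown is easiest to see on a small tree. The following is a minimal sketch, not part of the original example: make_demo_tree, walk_order and walk_pruned are hypothetical helper names, and the directory names a, b and cache are made up for illustration. It also shows that, in top-down mode, editing dirnames in place controls which subdirectories os.walk() descends into.

```python
import os
import tempfile

def make_demo_tree():
    """Create a throwaway tree: <root>/a/b and <root>/cache (made-up names)."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "cache"))
    return root

def walk_order(top, topdown=True):
    """Return directory paths, relative to top, in the order os.walk yields them."""
    return [os.path.relpath(dirpath, top)
            for dirpath, dirnames, filenames in os.walk(top, topdown=topdown)]

def walk_pruned(top, skip):
    """Walk top-down, skipping any subdirectory whose name is in skip."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(top, topdown=True):
        # Slice assignment modifies the list os.walk itself holds, so the
        # removed names are never descended into. This only works top-down.
        dirnames[:] = [d for d in dirnames if d not in skip]
        visited.append(os.path.relpath(dirpath, top))
    return visited
```

With topdown=True the root comes first and a is listed before a/b; with topdown=False the order of each parent and child is reversed and the root comes last. The relative order of sibling directories depends on the file system, so it should not be relied on.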

Return value:
A generator that yields one 3-tuple (dirpath, dirnames, filenames) for each directory visited.
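Calling os.walk() does not list anything by itself: it returns a generator, and directories are only read as the generator is iterated. The sketch below illustrates this, together with the onerror callback; walk_collecting_errors is a hypothetical helper name and no_such_dir is a made-up path that is assumed not to exist.

```python
import os
import tempfile

def walk_collecting_errors(top):
    """Walk top and return (visited_dirs, errors) instead of ignoring errors."""
    errors = []
    visited = []
    # onerror is called with the OSError raised while listing a directory;
    # here it simply appends the exception to a list and lets the walk go on.
    for dirpath, dirnames, filenames in os.walk(top, onerror=errors.append):
        visited.append(dirpath)
    return visited, errors

demo_dir = tempfile.mkdtemp()        # an empty scratch directory
walker = os.walk(demo_dir)           # nothing has been read from disk yet
first = next(walker)                 # the first 3-tuple describes top itself

# A missing top is silently skipped by default; onerror makes it visible.
missing = os.path.join(demo_dir, "no_such_dir")
visited, errors = walk_collecting_errors(missing)
```

Without onerror, walking a path that cannot be listed simply yields nothing, which can hide a mistyped directory name. Passing a callback that logs or re-raises the OSError is one way to surface such problems.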
Example:
import os
from os.path import join, getsize

def getdirsize(top):
    """Return the total size, in bytes, of all files under top."""
    size = 0
    for root, dirs, files in os.walk(top):
        print('file = ', files)  # files in the current directory
        for name in files:
            full_name = join(root, name)
            print(full_name)
            size += getsize(full_name)
    return size

if __name__ == '__main__':
    filesize = getdirsize(r'/home/data/crawler_data/cs_CZK')
    # Floor division keeps whole megabytes only, matching the output below.
    print('There are %.3f' % (filesize // 1024 // 1024), 'Mbytes in /home/data/crawler_data')
Output:
file =  []
file =  ['2017051104_content', '2017051010_content', '2017051008_content', '2017051113_content', '2017051012_content', '2017051118_content', '2017051117_content', '2017051103_content', '2017051015_content', '2017051007_content', '2017051014_content', '2017051022_content', '2017051019_content', '2017051102_content', '2017051023_content', '2017051021_content', '2017051020_content', '2017051009_content', '2017051106_content', '2017051202_content', '2017051013_content', '2017051112_content', '2017051119_content', '2017051111_content', '2017051108_content', '2017051017_content', '2017051121_content', '2017051101_content', '2017051201_content', '2017051018_content', '2017051006_content', '2017051110_content', '2017051105_content', '2017051116_content', '2017051123_content', '2017051011_content', '2017051200_content', '2017051120_content', '2017051100_content', '2017051016_content', '2017051107_content', '2017051115_content', '2017051109_content', '2017051122_content', '2017051114_content']
/home/data/crawler_data/cs_CZK/74IwnIzFSn0THho0KHkTKLAHsfu-5tZy/2017051104_content
There are 1837.000 Mbytes in /home/data/crawler_data
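The same traversal can be written without the per-file printing, so that the function only computes a number and the caller decides how to report it. This is a sketch under stated assumptions rather than a replacement for the example above: dir_size_bytes and bytes_to_mib are hypothetical names, and skipping symbolic links is a design choice made here so that broken links do not make getsize() raise.

```python
import os

def dir_size_bytes(top):
    """Return the total size in bytes of regular files under top."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            full_name = os.path.join(dirpath, name)
            # Assumption: symbolic links are skipped, so a dangling link
            # cannot raise and a linked file is not counted twice.
            if os.path.islink(full_name):
                continue
            total += os.path.getsize(full_name)
    return total

def bytes_to_mib(size):
    """Convert a byte count to mebibytes using true division."""
    return size / (1024 * 1024)
```

Using true division keeps the fractional part, so a directory slightly under one mebibyte is reported as a fraction instead of being truncated to zero.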