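Python's filecmp module provides the dircmp class, which makes it easy to compare two directories. Basic dircmp usage is already covered in many articles, so it is not repeated here.
Run help(filecmp.dircmp) to see the built-in documentation. Of the methods it mentions, x.report() prints only the comparison of the two directories themselves, and x.report_partial_closure() additionally covers their common immediate subdirectories; neither goes any deeper. x.report_full_closure() does recurse through every subdirectory, but it prints far too much, because in most cases we only care about where the two directories differ.
Excerpt from help(filecmp.dircmp):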
| High level usage:
| x = dircmp(dir1, dir2)
| x.report() -> prints a report on the differences between dir1 and dir2
| or
| x.report_partial_closure() -> prints report on differences between dir1
| and dir2, and reports on common immediate subdirectories.
| x.report_full_closure() -> like report_partial_closure,
| but fully recursive.
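To make the excerpt concrete, here is a minimal sketch that builds two small throwaway trees and reads the dircmp attributes the reports are based on. The directory layout, the file names and the helper functions write_file and build_demo_tree are all hypothetical and exist only for illustration.

```python
import os
import tempfile
from filecmp import dircmp


def write_file(path, text):
    """Create parent directories as needed and write text to path."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


def build_demo_tree(root):
    """Create two small, hypothetical directory trees under root."""
    left = os.path.join(root, "left")
    right = os.path.join(root, "right")
    write_file(os.path.join(left, "same.txt"), "identical")
    write_file(os.path.join(right, "same.txt"), "identical")
    write_file(os.path.join(left, "changed.txt"), "short")
    write_file(os.path.join(right, "changed.txt"), "a much longer text")
    write_file(os.path.join(left, "only_left.txt"), "x")
    write_file(os.path.join(left, "sub", "nested.txt"), "1")
    write_file(os.path.join(right, "sub", "nested.txt"), "22")
    return left, right


with tempfile.TemporaryDirectory() as root:
    left, right = build_demo_tree(root)
    x = dircmp(left, right)
    # The attributes are computed lazily, so read them while the trees exist.
    top_left_only = sorted(x.left_only)
    top_diff_files = sorted(x.diff_files)
    top_common_dirs = sorted(x.common_dirs)
    # x.subdirs maps each common subdirectory name to its own dircmp object.
    nested_diff_files = sorted(x.subdirs["sub"].diff_files)
    # report() covers the top level only, so sub/nested.txt is not listed.
    x.report()
```

Because x.subdirs holds one dircmp object per common subdirectory, walking it recursively is enough to reach every level without calling report_full_closure().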
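The script in this article focuses on two goals: 1) recursively compare two directories and all of their subdirectories; 2) print only the differences between them, namely files that share a name (common_files) but differ in content (diff_files), plus the files or subdirectories that exist only in the left or only in the right directory.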
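One caveat about diff_files: by default dircmp decides it with a shallow comparison, meaning two files whose os.stat() signatures (file type, size, modification time) match are treated as equal without their contents being read. The sketch below shows one possible way to re-check common_files byte by byte with filecmp.cmpfiles(..., shallow=False). The helper name deep_diff_files and the demo files are hypothetical and are not part of the script in this article.

```python
import filecmp
import os
import tempfile


def deep_diff_files(dcmp):
    """Compare dcmp.common_files by content; return (mismatch, errors)."""
    _match, mismatch, errors = filecmp.cmpfiles(
        dcmp.left, dcmp.right, dcmp.common_files, shallow=False)
    return sorted(mismatch), sorted(errors)


with tempfile.TemporaryDirectory() as root:
    left = os.path.join(root, "left")
    right = os.path.join(root, "right")
    os.makedirs(left)
    os.makedirs(right)
    # Same size, different bytes.
    with open(os.path.join(left, "a.txt"), "w") as f:
        f.write("AAAA")
    with open(os.path.join(right, "a.txt"), "w") as f:
        f.write("BBBB")
    # Force identical timestamps so the os.stat() signatures match.
    for d in (left, right):
        os.utime(os.path.join(d, "a.txt"), (1_000_000_000, 1_000_000_000))

    dcmp = filecmp.dircmp(left, right)
    shallow_diff = list(dcmp.diff_files)      # signature-based result
    deep_diff, deep_errors = deep_diff_files(dcmp)  # content-based result
    print("shallow:", shallow_diff, "deep:", deep_diff)
```

The content-based check reads every common file, so it is slower on large trees. Whether that cost is worth paying depends on how far the modification times in the two trees can be trusted.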
The contents of the Python script compare_dir.py are as follows:
# -*- coding: utf-8 -*-
"""
@desc Recursively compare two directories with filecmp.dircmp and print
      the differences together with summary counts.
@author longfeiwlf
@date 2020-5-20
"""
from filecmp import dircmp
import sys

# Global counters:
number_different_files = 0  # files with the same name but different content
number_left_only = 0  # files or directories found only in the left directory
number_right_only = 0  # files or directories found only in the right directory


def print_diff(dcmp):
    """Recursively compare two directories, print every difference found
    and update the global counters."""
    global number_different_files
    global number_left_only
    global number_right_only
    for name in dcmp.diff_files:
        print("diff_file found: %s/%s" % (dcmp.left, name))
        number_different_files += 1
    for name_left in dcmp.left_only:
        print("left_only found: %s/%s" % (dcmp.left, name_left))
        number_left_only += 1
    for name_right in dcmp.right_only:
        print("right_only found: %s/%s" % (dcmp.right, name_right))
        number_right_only += 1
    for sub_dcmp in dcmp.subdirs.values():
        print_diff(sub_dcmp)  # recurse into each common subdirectory


if __name__ == '__main__':
    try:
        mydcmp = dircmp(sys.argv[1], sys.argv[2])
    except IndexError as ie:
        # Fewer than two directory arguments were given.
        print(ie)
        print("Usage: python compare_dir.py dir1 dir2")
    else:
        print("\nComparison details: ")
        print_diff(mydcmp)
        if (number_different_files == 0 and number_left_only == 0
                and number_right_only == 0):
            print("\nThe two directories are identical!")
        else:
            print("\nComparison summary:")
            print("Total Number of different files is: "
                  + str(number_different_files))
            print("Total Number of files or directories only in '"
                  + sys.argv[1] + "' is: " + str(number_left_only))
            print("Total Number of files or directories only in '"
                  + sys.argv[2] + "' is: " + str(number_right_only))
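The same traversal can also be written without global counters, which makes it easier to reuse from other code or to test. The following is only a sketch of that alternative and not the script above: the function name collect_diff, the result dictionary layout and the demo directories are assumptions made for illustration.

```python
import os
import tempfile
from filecmp import dircmp


def collect_diff(dcmp, result=None):
    """Recursively gather differing and one-sided entries as full paths."""
    if result is None:
        result = {"diff_files": [], "left_only": [], "right_only": []}
    result["diff_files"] += [os.path.join(dcmp.left, n) for n in dcmp.diff_files]
    result["left_only"] += [os.path.join(dcmp.left, n) for n in dcmp.left_only]
    result["right_only"] += [os.path.join(dcmp.right, n) for n in dcmp.right_only]
    for sub_dcmp in dcmp.subdirs.values():
        collect_diff(sub_dcmp, result)  # recurse into common subdirectories
    return result


with tempfile.TemporaryDirectory() as root:
    left = os.path.join(root, "a")
    right = os.path.join(root, "b")
    os.makedirs(os.path.join(left, "sub"))
    os.makedirs(os.path.join(right, "sub"))
    # One file that exists only on the left (nested), one only on the right.
    with open(os.path.join(left, "sub", "extra.txt"), "w") as f:
        f.write("x")
    with open(os.path.join(right, "new.txt"), "w") as f:
        f.write("y")

    result = collect_diff(dircmp(left, right))
    counts = {kind: len(paths) for kind, paths in result.items()}
    left_only_rel = [os.path.relpath(p, root) for p in result["left_only"]]
    for kind, paths in result.items():
        for p in paths:
            print(kind, os.path.relpath(p, root))
```

Returning the paths instead of printing them lets the caller decide how to present the summary, and the counts fall out of the list lengths.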
Example usage of the compare_dir.py script: