用Openpyxl做两个Excel文件的比对

最近老是要做比对Report的工作,有个想法写个Python小程序来比对两个文件。

以前只用过xlrd库,处理的是xls文件,做一些简单的数据处理

这次在写小工具的同时也学习一下新的东西


由于报表是Office 2007类型(也就是Xlsx类型的)

上网查了下,决定采用openpyxl

Python处理Excel的四个工具

到官网上按着流程装了一发

A python library: Openpyxl

由于要拖BitBucket的源码,我还先装了SourceTree(别怕麻烦

据说SourceTree还有Git的功能,下次体验一下。


Openpyxl装好以后,直接在Python里面

from openpyxl import *

就可以直接使用了



Openxyxl Tutorial

Openxyxl还提供了一个很棒的教程(不过有点太简单了

而且这个库各种文档都不是特别详细…有什么都要跑去看它的源码是怎么做的...


最终代码如下

from openpyxl import *

def compareExcel(ename1, ename2):

    print "------------------------------------"
    print "Comparing",ename1,ename2
    # A bool to verify if 2 xlsx is the same.
    fileSame = True

    # Load workbook, and get sheetname
    wb1 = load_workbook(filename = ename1)
    wb2 = load_workbook(filename = ename2)
    sn1 = wb1.get_sheet_names()
    sn2 = wb2.get_sheet_names()

    if (sn1 != sn2):
        print "Two file has different sheets."
        print ename1,"has sheet names:", sn1
        print ename2,"has sheet names:", sn2
    else:
        sn = sn1
        for wsn in sn:
            # Get worksheet
            ws1 = wb1.get_sheet_by_name(name = wsn)
            ws2 = wb2.get_sheet_by_name(name = wsn)
            c = ws1.get_highest_column()
            r = ws1.get_highest_row()
            # This can be replace by ws1.get_dimension
            if ((ws2.get_highest_column()!= c) or (ws2.get_highest_row() != r)):
                print "DIFFERDENT at SHEET-",wsn,": Rows or columns not the same!"
                fileSame = False
            else:
                #Compare every cell.
                flag = True
                for i in range(1,r+1):
                    for j in range(1,c+1):
                        c1 = ws1.cell(None,i,j)
                        c2 = ws2.cell(None,i,j)
                        if (c1):
                            if (c2):
                                if (c1.value != c2.value):
                                    if ((wsn == "Internal Info") and ((i == 4) and (j == 2)) or ((i == 5) and (j == 3))):
                                        continue
                                    print "DIFFERDENT_VALUE at SHEET-",wsn,": At (",i,",",j,")",
                                    print "diff FROM",c1.value,"TO",c2.value
                                    flag = False
                            else:
                                print "DIFFERDENT_TO_NONE at SHEET-",wsn,": At (",i,",",j,")"
                                print "diff FROM",c1.value
                                flag = False
                        else:
                            if (c2):
                                print "DIFFERDENT_TO_NONE at SHEET-",wsn,": At (",i,",",j,")"
                                print "diff FROM",c2.value
                                flag = False
                fileSame = fileSame and flag
        if fileSame:
            print "SAME_FILE:", ename1, ename2

    print "------------------------------------"
    print



發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章