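Automating docx report generation with the docxtpl library

Original

2020-03-25 16:35

Overview

Some information has to be read from an Excel spreadsheet and filled into fixed positions in a Word document so that reports can be issued in batches. Doing this by hand is tedious and repetitive, so I automated it with the docxtpl library.

The environment used is:

Windows 10
Python 3.6
docxtpl-0.6.3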

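Installation

conda install docxtpl
# or
pip install docxtpl
# if neither works
conda install -c conda-forge docxtpl
# installing docxtpl also installs its dependencies python-docx (imported as docx) and jinja2

docxtpl: official documentation, GitHub
jinja2: official documentation, Chinese documentation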

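Approach

1. jinja2 uses {{…}} to declare variables in a template, so we manually mark every piece of content to be replaced in the docx template with {{…}}.
2. Read the replacement values from the xls file and map them to the variable names defined in the docx template.
3. Use DocxTemplate.render from the docxtpl library to perform the substitution.
4. Save the resulting docx.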
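
To make step 1 concrete, here is a minimal sketch of how jinja2 fills {{…}} placeholders, using a plain string instead of a docx file. The placeholder names and values are made up for illustration; docxtpl applies the same substitution to the text inside the Word document.

```python
from jinja2 import Template

# A plain-text stand-in for the Word template; the names are illustrative only.
template = Template("Client: {{ client }}, phone: {{ telephone }}")

# Keys of the context must match the variable names used in the template.
rendered = template.render(client="ACME Lab", telephone=13800000000)
print(rendered)
```

Any placeholder whose name is missing from the context is rendered as an empty string by default, so a typo in a variable name tends to show up as a blank field rather than an error.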

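Preparing the template

Mark each position to be replaced with double curly braces and put a variable name inside.
Note that the text formatting should be applied to {{var}} itself in the template, so that the replaced text does not need separate formatting later.
Anything that does need individual formatting can be handled through docxtpl's rich text support.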

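Reading the commission data

I ran into a problem along the way. Phone numbers have to be written out as integers, but astype('int64') cannot handle data that contains NaN. I read online that pandas 0.24 and later support this, so I upgraded pandas to 1.0.1, after which Spyder would no longer open.

I then read that upgrading pandas had apparently pulled mkl up to 2018.0.3 as a dependency, and that this version is buggy (see the reference link); the advice was to reinstall mkl 2018.0.2.

conda install mkl=2018.0.2

After that install, a second problem appeared.

Fix:
The error said the module "IPython.core.inputtransformer2" was missing, so I opened the corresponding folder
"D:\anaconda3\Lib\site-packages\IPython\core"
Compared with the IPython folder of a working installation, it lacked two files, "inputtransformer2.py" and "async_helpers.py". Copying them over from the working installation let Spyder open normally.

Then came a third problem, fixed with:

conda install cloudpickle

Finally, if Spyder 4.0 reports "crashed during last session" at startup, Kite may be the cause; uninstalling Kite can resolve it.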

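import pandas as pd

# Paths of the key files
path_template = r'./templates'
path_xlsx = r'./templates/template_x.xlsx'
path_docx1 = r'./templates/template_d_entrust.docx'
path_docx2 = r'./templates/template_d_communication.docx'

# Because NaN values are present, pandas would default these columns to float and
# phone numbers would be written with a decimal point. Declare 'telephone' and
# 'instrument_numbers' as the nullable Int64 dtype; since pandas 0.24, integer
# columns can hold missing values.
dtype_dic = {'telephone': 'Int64', 'instrument_numbers': 'Int64'}
# Read the commission data from Excel (sheet_name=1 is the second sheet, 0-based)
df = pd.read_excel(path_xlsx, sheet_name=1, header=0, index_col=None, na_values=['NA'], dtype=dtype_dic)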
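
The float-versus-Int64 behaviour described in the comment can be seen on a toy Series. This is a minimal sketch with a made-up phone number, not the article's spreadsheet.

```python
import numpy as np
import pandas as pd

# A missing value forces the default dtype to float64.
phones = pd.Series([13800000000, np.nan])
print(phones.dtype)
print(str(phones[0]))  # printed with a trailing ".0"

# The nullable Int64 dtype (capital I) keeps integers and stores the gap as <NA>.
nullable = phones.astype("Int64")
first_phone = str(nullable[0])
print(nullable.dtype, first_phone)
```

Note the capital letter: 'int64' is the NumPy dtype that rejects missing values, while 'Int64' is the pandas extension dtype that accepts them.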

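Cleaning the commission data

# Index of the last row
lastrow = df.index[-1]

# Forward-fill the NaN values that merged cells leave in 'instrument_numbers'
df['instrument_numbers'] = df['instrument_numbers'].ffill()

# Split the received date ('接收日期', formatted like YYYY.MM.DD) into year, month
# and day columns so they can be filled into the document separately
date_parts = df['接收日期'].str.split('.', n=2, expand=True)
df['YY'], df['MM'], df['DD'] = date_parts[0], date_parts[1], date_parts[2]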
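
The two clean-up steps can be tried on a small hand-made frame. The values below are invented; only the column names follow the article's spreadsheet.

```python
import pandas as pd

# Merged cells in Excel leave gaps below the first row of each group.
df_demo = pd.DataFrame({
    "instrument_numbers": pd.array([3, None, None, 2], dtype="Int64"),
    "接收日期": ["2020.03.25", "2020.03.25", "2020.03.25", "2020.04.01"],
})

# Forward fill copies the last seen value into each gap.
df_demo["instrument_numbers"] = df_demo["instrument_numbers"].ffill()

# expand=True returns one column per date part instead of a list per row.
parts = df_demo["接收日期"].str.split(".", n=2, expand=True)
df_demo["YY"], df_demo["MM"], df_demo["DD"] = parts[0], parts[1], parts[2]

print(df_demo)
```

The date parts stay strings, so leading zeros such as "03" are preserved when they are written into the document.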

Filling in the commission data

from docxtpl import DocxTemplate, R  # R is docxtpl's shorthand for RichText

# Load the template documents
tpl_1 = DocxTemplate(path_docx1)
tpl_2 = DocxTemplate(path_docx2)

# Replace the variables in the Word templates.
# Each key is a variable name, each value is the text to insert.
# std, std1 and std2 hold the standard names; they are defined in a part of the
# script that is not shown here.
context = {
    'Delegate_numbers': df.Delegate_numbers[lastrow],
    'client': df.client[lastrow],
    'addr': df.addr[lastrow],
    'name': df.name[lastrow],
    'telephone': df.telephone[lastrow],
    'instrument_name': df.instrument_name[lastrow],
    'instrument_produce': df.instrument_produce[lastrow],
    'instrument_numbers': df.instrument_numbers[lastrow],
    'instrument_model': df.instrument_model[lastrow],
    'instrument_sn': df.instrument_sn[lastrow],
    'YY': df.YY[lastrow],
    'MM': df.MM[lastrow],
    'DD': df.DD[lastrow],
    # docxtpl offers 5 ways to escape characters, see https://docxtpl.readthedocs.io/en/latest/index.html
    'standard_multi': R(std),
    'standard_solo': std1 + ',' + std2,
}

# Commission form (委託單), named after the commission number and the client
tpl_1.render(context, autoescape=True)
path_save1 = '委託單_{}_{}.docx'.format(df.loc[lastrow, 'Delegate_numbers'], df.loc[lastrow, 'client'])
tpl_1.save(path_save1)

# Client communication record and review form (與客戶溝通的記錄及評審表)
tpl_2.render(context, autoescape=True)
path_save2 = '與客戶溝通的記錄及評審表_{}_{}.docx'.format(df.loc[lastrow, 'Delegate_numbers'], df.loc[lastrow, 'client'])
tpl_2.save(path_save2)

Separately, the author of one other article produced reports in batches using only the python-docx library.
