Automating docx report output with the docxtpl library

Original

2020-03-25 16:35

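Overview

Some information has to be read from an Excel spreadsheet and filled into fixed positions in a Word document so that reports can be issued in batches. Doing this by hand is tedious, repetitive work, so I automated it with the docxtpl library.

The environment is as follows: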

Windows 10
Python 3.6
docxtpl-0.6.3

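Installation

conda install docxtpl
# or
pip install docxtpl
# if neither works
conda install -c conda-forge docxtpl
# installing docxtpl automatically installs its dependencies python-docx (imported as docx) and jinja2

docxtpl: official documentation, GitHub
jinja2: official documentation, Chinese documentation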

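Approach

1. jinja2 declares template variables with {{…}}, so we manually mark each piece of content to be replaced in the docx template with {{…}}.
2. Read the replacement values from the xls file and match them to the variable names preset in the docx template.
3. Use DocxTemplate.render from the docxtpl library to perform the substitution.
4. Save the rendered docx.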
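
The four steps can be sketched roughly as follows. This is only an illustration, not the code used in this post: build_context and fill_template are hypothetical helper names, and the sample row is made up. docxtpl is imported inside fill_template so that the context-building part can be tried without a template file at hand.

```python
def build_context(row, variable_names):
    """Step 2: map template variable names to values from one data row."""
    missing = [name for name in variable_names if name not in row]
    if missing:
        raise KeyError("row lacks template variables: {}".format(missing))
    return {name: row[name] for name in variable_names}


def fill_template(template_path, context, output_path):
    """Steps 3 and 4: render the {{…}} variables and save the result."""
    from docxtpl import DocxTemplate  # third-party, imported on first use

    tpl = DocxTemplate(template_path)
    tpl.render(context)
    tpl.save(output_path)


# Hypothetical row; in practice it would come from the spreadsheet
row = {"client": "ACME", "addr": "1 Example Road", "unused": 42}
context = build_context(row, ["client", "addr"])
print(context)
```

Checking for missing variables before rendering is a design choice in this sketch: it makes a mismatch between the spreadsheet columns and the template variables fail loudly instead of leaving blanks in the report.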

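Preparing the template

Mark each position to be replaced with double curly braces and give it a variable name.
Note that the text formatting should be applied to {{var}} itself, so that the substituted text needs no separate formatting afterwards.
Anything that does need individual formatting can be handled through docxtpl's rich text support.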

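Reading the commission information

One problem came up along the way. Phone numbers must be output as integers, but astype('int64') cannot handle data containing NaN. I read that pandas 0.24 and later support this, so I upgraded pandas to 1.0.1, after which Spyder would no longer open.

Further searching suggested that upgrading pandas had pulled mkl up to 2018.0.3 as a dependency, that this version is faulty (reference link), and that mkl 2018.0.2 should be reinstalled: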

conda install mkl=2018.0.2

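After reinstalling, a second problem appeared.

Solution:
The error reported that the "IPython.core.inputtransformer2" module was missing, so I located the corresponding folder
"D:\anaconda3\Lib\site-packages\IPython\core"
Compared with the IPython folder of a working installation, it lacked two files, "inputtransformer2.py" and "async_helpers.py". After copying them over, Spyder opened normally.

Then came a third problem, resolved by installing cloudpickle: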

conda install cloudpickle

Finally, if Spyder 4.0 reports "crashed during last session" at startup, Kite may be the cause; uninstalling Kite can fix it.

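import pandas as pd
from docxtpl import DocxTemplate, R

# Paths to the key files
path_template = r'./templates'
path_xlsx = r'./templates/template_x.xlsx'
path_docx1 = r'./templates/template_d_entrust.docx'
path_docx2 = r'./templates/template_d_communication.docx'

# A column containing NaN defaults to float, so phone numbers would print with a decimal point.
# Read 'telephone' and 'instrument_numbers' as the nullable Int64 dtype, which (since pandas 0.24) can hold missing values in an integer column.
dtype_dic = {'telephone': 'Int64', 'instrument_numbers': 'Int64'}
# Read the commission information from Excel (sheet_name=1 selects the second sheet)
df = pd.read_excel(path_xlsx, sheet_name=1, header=0, index_col=None, na_values=['NA'], dtype=dtype_dic)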
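
To see why the nullable 'Int64' dtype matters here, a small sketch with made-up data (the phone number below is hypothetical) compares the plain NumPy cast with the nullable one:

```python
import pandas as pd

# Hypothetical sample: the second phone number is missing,
# so pandas stores the column as float
raw = pd.Series([13800000000.0, float("nan")])

try:
    raw.astype("int64")  # NumPy integers cannot represent NaN
    plain_cast_ok = True
except ValueError:
    plain_cast_ok = False

# The nullable extension dtype keeps the gap as a missing value
phones = raw.astype("Int64")

print(plain_cast_ok)
print(phones.tolist())
```

With 'Int64' the present values become true integers, so they are written into the document without a trailing ".0", while the missing entry stays missing.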

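Tidying the commission information

# Index of the last row
lastrow = df.index[-1]

# Forward-fill the NaN values that merged cells leave in the 'instrument_numbers' column
df['instrument_numbers'] = df['instrument_numbers'].ffill()

# Split the dot-separated receive date (column '接收日期') into year, month and day for the document
df[['YY', 'MM', 'DD']] = df['接收日期'].str.split('.', n=2, expand=True)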
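
The effect of these two steps can be checked on a small made-up frame. The column name received_date and all values below are assumptions for illustration only; the real sheet uses its own column names.

```python
import pandas as pd

# Hypothetical data: merged cells in Excel leave gaps after the first row
df_demo = pd.DataFrame({
    "instrument_numbers": pd.array([3, None, None, 1], dtype="Int64"),
    "received_date": ["2020.03.25", "2020.03.25", "2020.03.25", "2020.04.01"],
})

# Forward-fill: each gap takes the last value seen above it
df_demo["instrument_numbers"] = df_demo["instrument_numbers"].ffill()

# expand=True returns one column per piece, assigned to three new columns
df_demo[["YY", "MM", "DD"]] = df_demo["received_date"].str.split(".", n=2, expand=True)

print(df_demo)
```

The year, month and day stay strings after the split, which suits a template that only needs to print them.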

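Filling in the commission information

# Load the template documents
tpl_1 = DocxTemplate(path_docx1)
tpl_2 = DocxTemplate(path_docx2)

# Replace the variables in the Word templates
# Each key is a template variable name; each value is the text to substitute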
context = { 
    'Delegate_numbers':df.Delegate_numbers[lastrow],
    'client':df.client[lastrow],
    'addr':df.addr[lastrow],
    'name':df.name[lastrow],
    'telephone':df.telephone[lastrow],
    'instrument_name':df.instrument_name[lastrow],
    'instrument_produce':df.instrument_produce[lastrow],
    'instrument_numbers':df.instrument_numbers[lastrow],
    'instrument_model':df.instrument_model[lastrow],
    'instrument_sn':df.instrument_sn[lastrow],
    'YY':df.YY[lastrow],
    'MM':df.MM[lastrow],
    'DD':df.DD[lastrow],
    # std, std1 and std2 are not defined in this excerpt
    'standard_multi':R(std),  # docxtpl offers several ways to escape characters, see https://docxtpl.readthedocs.io/en/latest/index.html
    'standard_solo':std1+','+std2
}

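# Render and save the commission form (委托单), named after the commission number and client
tpl_1.render(context, autoescape=True)
path_save1='委托单_{}_{}.docx'.format(df.loc[lastrow,'Delegate_numbers'],df.loc[lastrow,'client'])
tpl_1.save(path_save1)

# Render and save the customer communication record and review form (与客户沟通的记录及评审表)
tpl_2.render(context, autoescape=True)
path_save2='与客户沟通的记录及评审表_{}_{}.docx'.format(df.loc[lastrow,'Delegate_numbers'],df.loc[lastrow,'client'])
tpl_2.save(path_save2)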
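
The code above fills the templates for the last row only. If every row should produce a document, one possible extension is sketched below. It is not part of the original workflow: output_name and render_all are hypothetical helpers, and the rows are assumed to be plain dictionaries, for example from df.to_dict('records').

```python
def output_name(prefix, number, client):
    """Build a file name in the same pattern as the single-row code."""
    return "{}_{}_{}.docx".format(prefix, number, client)


def render_all(rows, template_path, prefix):
    """Render one document per row and return the saved file names."""
    from docxtpl import DocxTemplate  # third-party, imported on first use

    saved = []
    for row in rows:
        # Load a fresh template object for each document
        tpl = DocxTemplate(template_path)
        tpl.render(row, autoescape=True)
        name = output_name(prefix, row["Delegate_numbers"], row["client"])
        tpl.save(name)
        saved.append(name)
    return saved


# Hypothetical values, only to show the naming pattern
print(output_name("report", "WT001", "ACME"))
```

Loading the template anew for each row keeps the documents independent of one another, at the cost of reading the file once per row.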



Separately, another author achieved batch report generation using only the python-docx library.
