ImagePart: images in a Word document
In Open XML, an image inserted into a Word document is represented by a Blip object, which is an element.
Object->OpenXmlElement->OpenXmlCompositeElement->Blip
Object->OpenXmlElement->OpenXmlCompositeElement->Paragraph
Object->OpenXmlElement->OpenXmlCompositeElement->Run
Object->OpenXmlElement->OpenXmlCompositeElement->Drawing
Object->OpenXmlElement->OpenXmlCompositeElement->OfficeMath
Object->OpenXmlElement->OpenXmlCompositeElement->SdtElement
Classes derived from SdtElement:
SdtBlock
SdtCell
SdtRow
SdtRun
SdtRunRuby
Object->OpenXmlPartContainer->OpenXmlPart->ImagePart
Object->OpenXmlPartContainer->OpenXmlPart->MainDocumentPart
Object->OpenXmlElement->OpenXmlCompositeElement->BodyType->Body
An OOXML document contains many kinds of parts:
ChartParts
DiagramDataParts
FooterParts
HeaderParts
.....
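Each of these is a collection, typed like IEnumerable<ImagePart>.
DrawingML is the language that defines the graphical objects in an OOXML document: pictures, shapes, charts, and so on. It is a declarative definition language, in the same sense that SQL is declarative.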
http://www.officeopenxml.com/drwOverview.php
A Drawing object represents an image and is embedded in a Run. Typical XML looks like this:
<w:r>
  <w:drawing>
    <wp:inline>
      …
    </wp:inline>
  </w:drawing>
</w:r>
It is embedded in the run the same way text is:
run -> Text
run -> Drawing
Getting information about a picture in a PowerPoint presentation:
// For each picture on the slide (slide is a Presentation.Slide)
foreach (var pic in slide.Descendants<Picture>())
{
    // First, get the relationship ID of the image.
    string rId = pic.BlipFill.Blip.Embed.Value;
    ImagePart imagePart = (ImagePart)slide.SlidePart.GetPartById(rId);
    // The URI of the part inside the package (not the file name the image had before it was inserted).
    Console.Out.WriteLine(imagePart.Uri.OriginalString);
    // The content type (e.g. image/jpeg).
    Console.Out.WriteLine("content-type: {0}", imagePart.ContentType);
    // GetStream() returns the image data; dispose the stream and the image when done.
    using (var stream = imagePart.GetStream())
    using (System.Drawing.Image img = System.Drawing.Image.FromStream(stream))
    {
        // Save the image to disk using the System.Drawing.Image class.
        // Note that every picture in the loop overwrites the same file here.
        img.Save(@"c:\temp\temp.jpg");
    }
}
An image in OOXML consists of the image data and an ID. The ID can be found in the body, and the image data can be replaced or overwritten.
<w:p>
  <w:r>
    <w:drawing>
      <wp:inline>
        <wp:extent cx="3200400" cy="704850" /> <!-- describes the size of the image -->
        <wp:docPr id="2" name="Picture 1" descr="filename.JPG" />
        <a:graphic>
          <a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
            <pic:pic>
              <pic:nvPicPr>
                <pic:cNvPr id="0" name="filename.JPG" />
                <pic:cNvPicPr />
              </pic:nvPicPr>
              <pic:blipFill>
                <a:blip r:embed="rId5" /> <!-- this is the ID you need to find -->
                <a:stretch>
                  <a:fillRect />
                </a:stretch>
              </pic:blipFill>
              <pic:spPr>
                <a:xfrm>
                  <a:ext cx="3200400" cy="704850" />
                </a:xfrm>
                <a:prstGeom prst="rect" />
              </pic:spPr>
            </pic:pic>
          </a:graphicData>
        </a:graphic>
      </wp:inline>
    </w:drawing>
  </w:r>
</w:p>
The ID is stored in the r:embed attribute of the Blip element.
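As a rough illustration of where that ID sits, the sketch below pulls the r:embed values out of document XML. It is written in Python with only the standard library rather than the Open XML SDK used in the C# snippets; the function name blip_embed_ids and the trimmed-down SAMPLE document are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by WordprocessingML, DrawingML and relationships.
NS = {
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}


def blip_embed_ids(document_xml):
    """Return the r:embed value of every a:blip element, in document order."""
    root = ET.fromstring(document_xml)
    embed_attr = "{%s}embed" % NS["r"]
    ids = []
    for blip in root.iter("{%s}blip" % NS["a"]):
        rel_id = blip.get(embed_attr)
        if rel_id is not None:
            ids.append(rel_id)
    return ids


# Hypothetical, heavily trimmed document used only for illustration.
SAMPLE = (
    '<w:document xmlns:w="%(w)s" xmlns:a="%(a)s" xmlns:r="%(r)s">'
    "<w:body><w:p><w:r><w:drawing>"
    '<a:blip r:embed="rId5"/>'
    "</w:drawing></w:r></w:p></w:body></w:document>"
) % NS

print(blip_embed_ids(SAMPLE))  # ['rId5']
```

The value returned is a relationship ID, not a file name; it still has to be resolved against the relationships of the main document part to reach the image data.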
The following code extracts the Inline elements from all of the Runs. It uses Descendants<Run> and produces an IEnumerable<Inline>:
https://stackoverflow.com/questions/2810138/replace-image-in-word-doc-using-openxml
using (WordprocessingDocument document = WordprocessingDocument.Open("docfilename.docx", true)) {
    // Go through the document and pull out the inline image elements.
    // Any() avoids the exception that First() throws on a run without an Inline.
    IEnumerable<Inline> imageElements = from run in document.MainDocumentPart.Document.Descendants<Run>()
                                        where run.Descendants<Inline>().Any()
                                        select run.Descendants<Inline>().First();
    // Select the image with the matching file name (the first one if there are several).
    // Assumption: the file name is held in the descr attribute of docPr, as in the XML above.
    Inline selectedImage = (from image in imageElements
                            where image.DocProperties != null &&
                                  image.DocProperties.Description != null &&
                                  image.DocProperties.Description.Value == "image filename"
                            select image).FirstOrDefault();
    // Get the relationship ID from the inline element.
    string imageId = "default value";
    Blip blipElement = selectedImage == null ? null : selectedImage.Descendants<Blip>().FirstOrDefault();
    if (blipElement != null) {
        imageId = blipElement.Embed.Value;
    }
}
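If you change the extension of a .docx file to .zip and unzip it, you will see a word/media directory that holds the image files. The mapping between those files and their IDs is kept in word/_rels/document.xml.rels.
For how to save the images in a docx to another directory, see: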
https://stackoverflow.com/questions/2810138/replace-image-in-word-doc-using-openxml
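The package layout can also be explored without the SDK. The following is a minimal sketch in Python (standard library only) that reads the relationship file, maps each image relationship ID to its part, and copies the image parts into another directory. The helper names are invented here, make_sample_package builds a tiny stand-in archive that is not a valid .docx, and the code assumes relationship targets are relative to the word/ folder, which is the usual case but not guaranteed.

```python
import io
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
RELS_PATH = "word/_rels/document.xml.rels"
IMAGE_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"


def image_targets(docx_bytes):
    """Map each image relationship ID to the path of its part in the package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        rels = ET.fromstring(package.read(RELS_PATH))
    mapping = {}
    for rel in rels.iter("{%s}Relationship" % REL_NS):
        if rel.get("Type") == IMAGE_TYPE:
            # Assumption: the target is relative to the word/ folder.
            mapping[rel.get("Id")] = "word/" + rel.get("Target")
    return mapping


def save_images(docx_bytes, out_dir):
    """Copy every image part into out_dir and return the written file paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        for _rel_id, part_path in sorted(image_targets(docx_bytes).items()):
            dest = os.path.join(out_dir, os.path.basename(part_path))
            with open(dest, "wb") as handle:
                handle.write(package.read(part_path))
            written.append(dest)
    return written


def make_sample_package():
    """Build a tiny stand-in archive (not a valid .docx) for demonstration."""
    rels = (
        '<Relationships xmlns="%s">'
        '<Relationship Id="rId5" Type="%s" Target="media/image1.jpeg"/>'
        "</Relationships>" % (REL_NS, IMAGE_TYPE)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(RELS_PATH, rels)
        package.writestr("word/media/image1.jpeg", b"fake-jpeg-bytes")
    return buffer.getvalue()


package_bytes = make_sample_package()
print(image_targets(package_bytes))  # {'rId5': 'word/media/image1.jpeg'}
with tempfile.TemporaryDirectory() as out_dir:
    print([os.path.basename(p) for p in save_images(package_bytes, out_dir)])
```

For a real file, the bytes would come from reading the .docx from disk; no renaming to .zip is needed because zipfile does not care about the extension.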
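The MathML generated by omml2mml is namespace-qualified, so that namespace has to be declared at the top of the HTML for it to work.
Section: Sections are subdivisions of a document. Once a document is divided into sections, formatting can be applied to a single section only, for example changing the page orientation or the number of columns.
Pagesize: This element specifies the properties (size and orientation) for all pages in the current section.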
oleobjectbinarypart
A twip is one twentieth of a printer's point: 1/1440 of an inch, or roughly 1/567 of a centimeter.
At 96 DPI, one pixel is (1/96)*1440 = 15 twips.
https://baike.baidu.com/item/twip/1554184?fr=aladdin
https://en.wikipedia.org/wiki/Twip
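The twip arithmetic can be checked with a few lines of Python. This is only a sketch; the function names are invented for the example, and the 96 DPI default is the setting used in the note above.

```python
TWIPS_PER_POINT = 20
POINTS_PER_INCH = 72
TWIPS_PER_INCH = TWIPS_PER_POINT * POINTS_PER_INCH  # 1440
CM_PER_INCH = 2.54


def twips_to_inches(twips):
    """Convert twips to inches."""
    return twips / TWIPS_PER_INCH


def twips_to_cm(twips):
    """Convert twips to centimeters."""
    return twips / TWIPS_PER_INCH * CM_PER_INCH


def pixels_to_twips(pixels, dpi=96):
    """Convert screen pixels to twips at the given DPI."""
    return pixels * TWIPS_PER_INCH / dpi


print(TWIPS_PER_INCH)                         # 1440
print(pixels_to_twips(1))                     # 15.0
print(round(TWIPS_PER_INCH / CM_PER_INCH, 2)) # 566.93 twips per centimeter
```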
<w:pgSz w:w="11907" w:h="16839" />
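The standard size of A4 paper is 210 x 297 mm.
In inches that is 8.27 x 11.69.
One inch is 2.54 cm, so the pgSz values above (11907 x 16839 twips, at 1440 twips per inch) describe an A4 page.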
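To confirm that a pgSz of 11907 x 16839 corresponds to A4, the attribute values can be converted from twips to millimeters. A small Python sketch, with page_size_mm as a made-up helper name:

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MM_PER_TWIP = 25.4 / 1440  # 25.4 mm per inch, 1440 twips per inch


def page_size_mm(pg_sz_xml):
    """Return (width, height) of a w:pgSz element in millimeters, rounded to 0.1 mm."""
    element = ET.fromstring(pg_sz_xml)
    width = int(element.get("{%s}w" % W))
    height = int(element.get("{%s}h" % W))
    return (round(width * MM_PER_TWIP, 1), round(height * MM_PER_TWIP, 1))


# The pgSz element from these notes, with the namespace declared so it parses on its own.
PG_SZ = '<w:pgSz xmlns:w="%s" w:w="11907" w:h="16839" />' % W

print(page_size_mm(PG_SZ))  # (210.0, 297.0)
```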
PageMargin has a header attribute: the distance from the top edge of the page to the top of the header.
<w:sectPr> <w:pgMar w:header="720" w:bottom="1440" w:top="1440" w:right="1440" w:left="1440"/> …</w:sectPr>
sectPr (Document Final Section Properties)
This element defines the section properties for the final section of the document. [Note: For any other section the properties are stored as a child element of the paragraph element corresponding to the last paragraph in the given section. end note]
[Example: Consider a document with multiple sections. For all sections except the final section, the sectPr element is stored as a child element of the last paragraph in the section. For the final section, this information is stored as the last child element of the body element, as follows:
<w:body>
  <w:p>
    …
  </w:p>
  …
  <w:sectPr>
    (final section's properties)
  </w:sectPr>
</w:body>
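If a document has several sections, the sectPr that is the last child node of body describes the final section. For every other section, the sectPr is stored as a child of the last paragraph in that section (inside that paragraph's w:pPr).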
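The two storage locations can be walked in document order to recover one sectPr per section. The sketch below does this in Python with the standard library; section_properties and the two-section SAMPLE document are hypothetical, and only paragraphs that are direct children of body are examined.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag):
    """Qualify a WordprocessingML tag or attribute name with its namespace."""
    return "{%s}%s" % (W, tag)


def section_properties(document_xml):
    """Return the sectPr element of each section, in section order."""
    body = ET.fromstring(document_xml).find(q("body"))
    sections = []
    for child in body:
        if child.tag == q("p"):
            # Non-final sections: sectPr inside the pPr of the section's last paragraph.
            sect = child.find("%s/%s" % (q("pPr"), q("sectPr")))
            if sect is not None:
                sections.append(sect)
        elif child.tag == q("sectPr"):
            # Final section: sectPr is the last child of body.
            sections.append(child)
    return sections


# Hypothetical document: a landscape section followed by a portrait final section.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    "<w:p><w:r><w:t>first section</w:t></w:r></w:p>"
    '<w:p><w:pPr><w:sectPr><w:pgSz w:w="16839" w:h="11907" w:orient="landscape"/>'
    "</w:sectPr></w:pPr></w:p>"
    "<w:p><w:r><w:t>final section</w:t></w:r></w:p>"
    '<w:sectPr><w:pgSz w:w="11907" w:h="16839"/></w:sectPr>'
    "</w:body></w:document>"
) % W

for index, sect in enumerate(section_properties(SAMPLE), start=1):
    size = sect.find(q("pgSz"))
    print(index, size.get(q("w")), size.get(q("h")), size.get(q("orient")))
```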
A summary of Word operations with OpenXml: generating Word documents without the Word component.
https://blog.csdn.net/u011394397/article/details/78142860
new ContextualSpacing() { Val = false }
This tells Word to uncheck "Don't add space between paragraphs of the same style" in the paragraph options.
Merging multiple Word documents with AltChunk