dom4j

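Its main interfaces are all defined in the org.dom4j package:

Attribute: defines an XML attribute.

Branch: defines the common behavior of nodes that can contain child nodes, such as XML elements (Element) and documents (Document).

CDATA: defines an XML CDATA section.

CharacterData: a marker interface that identifies character-based nodes such as CDATA, Comment, and Text.

Comment: defines the behavior of an XML comment.

Document: defines an XML document.

DocumentType: defines an XML DOCTYPE declaration.

Element: defines an XML element.

ElementHandler: defines a handler for Element objects.

ElementPath: used by ElementHandler to obtain information about the level of the path currently being processed.

Entity: defines an XML entity.

Node: defines the polymorphic behavior shared by all XML nodes in dom4j.

NodeFilter: defines the behavior of a filter, or predicate, applied to dom4j nodes.

ProcessingInstruction: defines an XML processing instruction.

Text: defines an XML text node.

Visitor: used to implement the Visitor pattern.

XPath: represents an XPath expression obtained by parsing a string.

The names alone give a good idea of what each interface is for.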

The key to understanding this set of interfaces is their inheritance hierarchy:

    interface java.lang.Cloneable
        interface org.dom4j.Node
            interface org.dom4j.Attribute
            interface org.dom4j.Branch
                interface org.dom4j.Document
                interface org.dom4j.Element
            interface org.dom4j.CharacterData
                interface org.dom4j.CDATA
                interface org.dom4j.Comment
                interface org.dom4j.Text
            interface org.dom4j.DocumentType
            interface org.dom4j.Entity
            interface org.dom4j.ProcessingInstruction

The hierarchy makes a lot clear at a glance: almost every interface derives from Node. Keeping these relationships in mind helps you avoid ClassCastException when casting nodes in your own code.

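Usage overview

The examples below (some taken from the documentation that ships with dom4j) briefly show how to use it.

1 Reading and parsing an XML document

Reading and writing XML documents relies mainly on the org.dom4j.io package, which provides two readers: SAXReader and DOMReader. SAXReader parses XML from a file, stream, or URL, while DOMReader converts an existing W3C DOM tree. Both return an org.dom4j.Document, so the code that consumes the result is the same either way. That is the benefit of programming against interfaces.

    // Read XML from a file: takes a file name, returns the parsed document
    public Document read(String fileName) throws MalformedURLException, DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File(fileName));
        return document;
    }

The reader's read method is overloaded and accepts several kinds of source, including InputStream, File, and URL. The Document object it returns represents the entire XML document.

In my experience, the content is decoded using the encoding declared in the XML header. If you run into garbled characters, make sure the encoding name is consistent everywhere it appears.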
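The advice about consistent encodings can be illustrated without dom4j. The following is a minimal sketch that uses only the JDK; the class name EncodingMismatchDemo and the method roundTrip are made up for illustration, and UTF-8 and ISO-8859-1 simply stand in for whichever pair of encodings is involved in practice (GBK and UTF-8, for example).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of dom4j.
class EncodingMismatchDemo {

    // Encode text with one charset, then decode the bytes with another.
    static String roundTrip(String text, Charset writeCharset, Charset readCharset) {
        byte[] bytes = text.getBytes(writeCharset);
        return new String(bytes, readCharset);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, written as escapes

        // Same charset on both sides: the text survives unchanged.
        String same = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(text.equals(same));

        // Different charsets: the bytes are misread and the text is garbled.
        String mixed = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(mixed));
    }
}
```

The same thing happens when an XML file declares one encoding in its header but was actually saved in another: the parser trusts the declaration and misreads the bytes.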

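2 Getting the root element

The second step after reading is to get the root element. As anyone familiar with XML knows, all XML processing starts from the root element.

    public Element getRootElement(Document doc) {
        return doc.getRootElement();
    }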

3 Traversing the XML tree

dom4j offers at least three ways to traverse nodes:

1) Iterators

    // Iterate over all child elements
    for (Iterator i = root.elementIterator(); i.hasNext(); ) {
        Element element = (Element) i.next();
        // do something
    }

    // Iterate over child elements named "foo"
    for (Iterator i = root.elementIterator("foo"); i.hasNext(); ) {
        Element foo = (Element) i.next();
        // do something
    }

    // Iterate over attributes
    for (Iterator i = root.attributeIterator(); i.hasNext(); ) {
        Attribute attribute = (Attribute) i.next();
        // do something
    }

2) Recursion

Recursion can also use an Iterator to enumerate children, but the documentation shows another approach that accesses nodes by index:

    public void treeWalk(Document document) {
        treeWalk(document.getRootElement());
    }

    public void treeWalk(Element element) {
        for (int i = 0, size = element.nodeCount(); i < size; i++) {
            Node node = element.node(i);
            if (node instanceof Element) {
                treeWalk((Element) node);
            } else {
                // do something....
            }
        }
    }

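3) The Visitor pattern

The most exciting feature is dom4j's support for Visitor, which can cut the amount of code considerably while keeping it easy to read. Visitor is one of the GoF design patterns. The idea is that a visitor object is passed to the accept method of each visitable node, and the node calls back the visit overload that matches its own type, so a single visitor can handle many kinds of node. Here is how it looks in dom4j (the quick-start documentation does not cover it).

All you need is a class that implements the Visitor interface, most easily by extending VisitorSupport:

    public class MyVisitor extends VisitorSupport {
        public void visit(Element element) {
            System.out.println(element.getName());
        }

        public void visit(Attribute attr) {
            System.out.println(attr.getName());
        }
    }

To use it: root.accept(new MyVisitor());

The Visitor interface declares several overloads of visit(), one for each kind of XML object, so each kind of node is handled in its own way. The example above gives simple implementations for Element and Attribute, which are the two most commonly used. VisitorSupport is the default adapter that dom4j provides for the Visitor interface (the Default Adapter pattern): it supplies empty implementations of every visit(*) overload, so you override only the ones you need.

Note that the visitor traverses all child nodes automatically: calling root.accept(new MyVisitor()) visits root and everything beneath it. The first time I used it, I assumed I had to do the traversal myself and called the visitor from inside a recursive walk, so nodes below the top level were visited more than once.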
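To make the mechanism concrete, here is a minimal sketch of the pattern in plain Java. It does not use dom4j: MiniElement, MiniAttribute, MiniVisitor, NameCollector, and VisitorSketch are hypothetical classes invented for this illustration. The accept method of the element forwards the visitor to its attributes and children, which is the automatic traversal described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature node model, not dom4j's classes.
interface MiniVisitor {
    void visit(MiniElement element);
    void visit(MiniAttribute attribute);
}

class MiniAttribute {
    final String name;
    MiniAttribute(String name) { this.name = name; }

    // The node picks the overload that matches its own type.
    void accept(MiniVisitor visitor) { visitor.visit(this); }
}

class MiniElement {
    final String name;
    final List<MiniAttribute> attributes = new ArrayList<>();
    final List<MiniElement> children = new ArrayList<>();
    MiniElement(String name) { this.name = name; }

    // Visit this element, then pass the visitor down to attributes and children.
    void accept(MiniVisitor visitor) {
        visitor.visit(this);
        for (MiniAttribute attribute : attributes) {
            attribute.accept(visitor);
        }
        for (MiniElement child : children) {
            child.accept(visitor);
        }
    }
}

// A visitor that records the names it sees, in visiting order.
class NameCollector implements MiniVisitor {
    final List<String> names = new ArrayList<>();
    public void visit(MiniElement element) { names.add(element.name); }
    public void visit(MiniAttribute attribute) { names.add("@" + attribute.name); }
}

class VisitorSketch {
    static List<String> collect() {
        MiniElement root = new MiniElement("books");
        MiniElement book = new MiniElement("book");
        book.attributes.add(new MiniAttribute("show"));
        root.children.add(book);

        NameCollector collector = new NameCollector();
        root.accept(collector); // a single call walks the whole tree
        return collector.names;
    }

    public static void main(String[] args) {
        System.out.println(collect()); // prints [books, book, @show]
    }
}
```

Because accept already recurses, calling it again from a hand-written recursive walk would visit the deeper nodes several times, which is the mistake described above.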

4 XPath support

dom4j has good XPath support: to reach a node, you can select it directly with an XPath expression.

    public void bar(Document document) {
        List list = document.selectNodes("//foo/bar");
        Node node = document.selectSingleNode("//foo/bar/author");
        String name = node.valueOf("@name");
    }

For example, the following code finds all the hyperlinks in an XHTML document:

    public void findLinks(Document document) throws DocumentException {
        List list = document.selectNodes("//a/@href");
        for (Iterator iter = list.iterator(); iter.hasNext(); ) {
            Attribute attribute = (Attribute) iter.next();
            String url = attribute.getValue();
        }
    }
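To see what the expression //a/@href selects without needing dom4j on the classpath, the sketch below evaluates it with the JDK's built-in javax.xml.xpath API on a W3C DOM tree. This is a different object model from dom4j, used here only because it ships with the JDK; the XPath expression is the same. The class name HrefXPathSketch and the sample markup are assumptions made for the illustration, and the markup deliberately has no namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK's XPath engine, not on dom4j.
class HrefXPathSketch {

    // Return the value of every href attribute that sits on an <a> element.
    static List<String> hrefs(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//a/@href", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<a href=\"x.html\">X</a>"
                + "<p><a href=\"y.html\">Y</a></p>"
                + "<a name=\"n\">no href here</a>"
                + "</body></html>";
        System.out.println(hrefs(page)); // prints [x.html, y.html]
    }
}
```

The anchor without an href attribute contributes nothing, since the expression selects attributes rather than elements.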

5 Converting between strings and XML

It is often necessary to convert a string to XML, or the other way round.

    // XML to string
    Document document = ...;
    String text = document.asXML();

    // String to XML
    String text = "<person> <name>James</name> </person>";
    Document document = DocumentHelper.parseText(text);

6 Transforming XML with XSLT

    public Document styleDocument(
        Document document,
        String stylesheet
    ) throws Exception {
        // load the transformer using JAXP
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(
            new StreamSource(stylesheet)
        );

        // now let's style the given document
        DocumentSource source = new DocumentSource(document);
        DocumentResult result = new DocumentResult();
        transformer.transform(source, result);

        // return the transformed document
        Document transformedDoc = result.getDocument();
        return transformedDoc;
    }
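The method above takes the location of a stylesheet but does not show one. As a rough illustration of the same JAXP calls, here is a self-contained sketch that keeps both the stylesheet and the input in memory and uses only JDK classes (StreamSource and StreamResult in place of dom4j's DocumentSource and DocumentResult). The class name XsltSketch, the stylesheet, and the sample input are assumptions made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example; the stylesheet below is made up for illustration.
class XsltSketch {

    // Emits each book title followed by a semicolon, as plain text.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"books/book\">"
        + "<xsl:value-of select=\"title\"/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            // Load the transformer using JAXP, as in the method above.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));

            // Transform the in-memory XML into an in-memory string.
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println(transform(xml)); // prints A;B;
    }
}
```

In the dom4j version, the only difference is the pair of objects passed to transform: a DocumentSource wrapping the input Document and a DocumentResult from which the output Document is read.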

7 Creating XML

Creating an XML document is usually the step before writing it to a file, and it is about as easy as using a StringBuffer.

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");

        Element author1 =
            root
                .addElement("author")
                .addAttribute("name", "James")
                .addAttribute("location", "UK")
                .addText("James Strachan");

        Element author2 =
            root
                .addElement("author")
                .addAttribute("name", "Bob")
                .addAttribute("location", "US")
                .addText("Bob McWhirter");

        return document;
    }

8 Writing to a file

A simple way to produce output is to call the write method of a Document, or of any Node:

    FileWriter out = new FileWriter("foo.xml");
    document.write(out);
    out.close();

To change the output format, for example to pretty-print it or make it compact, use the XMLWriter class:

    public void write(Document document) throws IOException {
        // write to a file
        XMLWriter writer = new XMLWriter(
            new FileWriter("output.xml")
        );
        writer.write(document);
        writer.close();

        // pretty-printed format
        OutputFormat format = OutputFormat.createPrettyPrint();
        writer = new XMLWriter(System.out, format);
        writer.write(document);

        // compact format
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter(System.out, format);
        writer.write(document);
    }

As you can see, dom4j is simple to use. There are of course more advanced features not covered here, such as ElementHandler. If this has caught your interest, give dom4j a try.

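Usage guide, part 2

This part covers the basics of working with XML in dom4j: creating an XML document; adding, modifying, and deleting nodes; formatting (pretty-printing) the output; and handling Chinese text. It can serve as an introduction to dom4j.

1. Download and installation

dom4j is an open-source project hosted on sourceforge.net, used mainly for parsing XML. Since the first release in July 2001 a number of versions have followed; at the time of writing, the latest was 1.5.

dom4j is designed specifically for Java and is simple and intuitive to use. It is quickly gaining popularity in the Java community.

The latest version can be downloaded from http://sourceforge.net/projects/dom4j.

The full dom4j 1.5 distribution is an archive of about 13 MB named dom4j-1.5.zip. Unpacking it yields dom4j-1.5.jar, the library your application needs on its classpath, and jaxen-1.1-beta-4.jar, which you will generally need as well; without it, execution may fail with java.lang.NoClassDefFoundError: org/jaxen/JaxenException. The other bundled jars are optional.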

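2. Sample XML document (holen.xml)

For convenience, here is a sample XML document; all the operations that follow are based on it.

holen.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <books>
        <!--This is a test for dom4j, holen, 2004.9.11-->
        <book show="yes">
            <title>Dom4j Tutorials</title>
        </book>
        <book show="yes">
            <title>Lucene Studing</title>
        </book>
        <book show="no">
            <title>Lucene in Action</title>
        </book>
        <owner>O'Reilly</owner>
    </books>

This is a very simple XML document. The scenario is an online bookstore with a number of books. Each book carries two pieces of information: its title [title] and whether it is displayed [show]. The last item records the owner of the books [owner].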

3. Creating an XML document

    /**
     * Create an XML document; the file name is given by the parameter.
     * @param filename name of the file to create
     * @return the result of the operation: 0 for failure, 1 for success
     */
    public int createXMLFile(String filename) {
        // Result of the operation: 0 for failure, 1 for success
        int returnValue = 0;

        // Create the document object
        Document document = DocumentHelper.createDocument();

        // Create the root element, books
        Element booksElement = document.addElement("books");

        // Add a comment
        booksElement.addComment("This is a test for dom4j, holen, 2004.9.11");

        // Add the first book node
        Element bookElement = booksElement.addElement("book");

        // Set its show attribute
        bookElement.addAttribute("show", "yes");

        // Add the title node
        Element titleElement = bookElement.addElement("title");

        // Set the text of title
        titleElement.setText("Dom4j Tutorials");

        // Add the other two books in the same way
        bookElement = booksElement.addElement("book");
        bookElement.addAttribute("show", "yes");
        titleElement = bookElement.addElement("title");
        titleElement.setText("Lucene Studing");

        bookElement = booksElement.addElement("book");
        bookElement.addAttribute("show", "no");
        titleElement = bookElement.addElement("title");
        titleElement.setText("Lucene in Action");

        // Add the owner node
        Element ownerElement = booksElement.addElement("owner");
        ownerElement.setText("O'Reilly");

        try {
            // Write the contents of the document to the file
            XMLWriter writer = new XMLWriter(new FileWriter(new File(filename)));
            writer.write(document);
            writer.close();

            // Success: return 1
            returnValue = 1;
        } catch (Exception ex) {
            ex.printStackTrace();
        }

        return returnValue;
    }

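Notes:

    Document document = DocumentHelper.createDocument();

This statement creates an XML document object.

    Element booksElement = document.addElement("books");

This statement creates an XML element; here it adds the root node.

Element has several important methods:

- addComment: adds a comment
- addAttribute: adds an attribute
- addElement: adds a child element

Finally, XMLWriter produces the physical file. By default the output is written compactly, with no indentation or line breaks, which makes the file hard to read. The createCompactFormat() and createPrettyPrint() methods of the OutputFormat class control the layout; this is covered in detail later.

The generated holen.xml looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <books><!--This is a test for dom4j, holen, 2004.9.11--><book show="yes"><title>Dom4j Tutorials</title></book><book show="yes"><title>Lucene Studing</title></book><book show="no"><title>Lucene in Action</title></book><owner>O'Reilly</owner></books>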

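4. Modifying an XML document

There are three modification tasks, in this order:

- If the show attribute of a book node is yes, change it to no.
- Change the content of owner to Tshinghua and add a date node.
- If the content of a title is Dom4j Tutorials, delete that node.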

    /**
     * Modify the contents of an XML file and save the result as a new file.
     * Shows how to add, modify, and delete nodes with dom4j.
     * @param filename the file to modify
     * @param newfilename the file to save the modified document to
     * @return the result of the operation: 0 for failure, 1 for success
     */
    public int ModiXMLFile(String filename, String newfilename) {
        int returnValue = 0;
        try {
            SAXReader saxReader = new SAXReader();
            Document document = saxReader.read(new File(filename));

            // Change 1: if the show attribute of a book node is "yes", change it to "no".
            // First locate the attributes with XPath.
            List list = document.selectNodes("/books/book/@show");
            Iterator iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attribute = (Attribute) iter.next();
                if (attribute.getValue().equals("yes")) {
                    attribute.setValue("no");
                }
            }

            // Change 2: set the content of owner to Tshinghua, then add a date node
            // to owner with the content 2004-09-11 and an attribute named type.
            list = document.selectNodes("/books/owner");
            iter = list.iterator();
            if (iter.hasNext()) {
                Element ownerElement = (Element) iter.next();
                ownerElement.setText("Tshinghua");
                Element dateElement = ownerElement.addElement("date");
                dateElement.setText("2004-09-11");
                dateElement.addAttribute("type", "Gregorian calendar");
            }

            // Change 3: if the content of a title is "Dom4j Tutorials", delete that node.
            list = document.selectNodes("/books/book");
            iter = list.iterator();
            while (iter.hasNext()) {
                Element bookElement = (Element) iter.next();
                Iterator iterator = bookElement.elementIterator("title");
                while (iterator.hasNext()) {
                    Element titleElement = (Element) iterator.next();
                    if (titleElement.getText().equals("Dom4j Tutorials")) {
                        bookElement.remove(titleElement);
                    }
                }
            }

            try {
                // Write the contents of the document to the new file
                XMLWriter writer = new XMLWriter(new FileWriter(new File(newfilename)));
                writer.write(document);
                writer.close();

                // Success: return 1
                returnValue = 1;
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }

        return returnValue;
    }

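Notes:

    List list = document.selectNodes("/books/book/@show");
    list = document.selectNodes("/books/book");

These statements use XPath to locate the content to change.

setValue() changes the value of an attribute, and setText() changes the text of an element.

remove() deletes a node or an attribute.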

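5. Formatted output and specifying the encoding

The default output is compact and the default encoding is UTF-8. Our applications, however, usually involve Chinese text, and we would like the output to be indented automatically. This is where the OutputFormat class comes in.

    /**
     * Format an XML document and handle Chinese text correctly.
     * @param filename the file to format in place
     * @return the result of the operation: 0 for failure, 1 for success
     */
    public int formatXMLFile(String filename) {
        int returnValue = 0;
        try {
            SAXReader saxReader = new SAXReader();
            Document document = saxReader.read(new File(filename));
            XMLWriter writer = null;

            // Indented output, similar to how IE displays XML
            OutputFormat format = OutputFormat.createPrettyPrint();

            // Specify the XML encoding
            format.setEncoding("GBK");

            // Encode the bytes with the same encoding that the XML declaration names
            writer = new XMLWriter(new OutputStreamWriter(new FileOutputStream(filename), format.getEncoding()), format);
            writer.write(document);
            writer.close();

            // Success: return 1
            returnValue = 1;
        } catch (Exception ex) {
            ex.printStackTrace();
        }

        return returnValue;
    }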

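Notes:

    OutputFormat format = OutputFormat.createPrettyPrint();

This selects indented formatting rather than compact formatting.

    format.setEncoding("GBK");

This sets the encoding to GBK.

    writer = new XMLWriter(new OutputStreamWriter(new FileOutputStream(filename), format.getEncoding()), format);

Compared with the previous two methods, this one passes an additional OutputFormat object that controls the layout and the encoding. The file is opened through an OutputStreamWriter created with the same encoding, because a plain FileWriter encodes with the platform default, which may not match the encoding named in the XML declaration.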
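The reason for pairing the declared encoding with an explicitly configured writer can be shown with the JDK alone. The following is a minimal sketch, not dom4j code: the class ExplicitEncodingWriter and its write method are hypothetical, and UTF-8 is used so that the example runs on any JDK (GBK would be handled the same way where that charset is available).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of dom4j.
class ExplicitEncodingWriter {

    // Write an XML declaration and a body, encoding the bytes with the
    // same charset that the declaration names.
    static byte[] write(String body, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n");
            writer.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String body = "<title>\u4e2d\u6587</title>"; // Chinese text, written as escapes
        byte[] data = write(body, StandardCharsets.UTF_8);

        // Decoding with the declared charset recovers the text exactly.
        String decoded = new String(data, StandardCharsets.UTF_8);
        System.out.println(decoded.endsWith(body));
    }
}
```

Since the declaration and the bytes come from the same Charset object, a parser that trusts the declaration will decode the Chinese text correctly.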

6. Complete class code

The methods above were shown as separate fragments. They all belong to one class, Dom4jDemo.java, which begins as follows:

Dom4jDemo.java

    package com.holen.dom4j;

    import java.io.File;
    import java.io.FileWriter;
    import java.util.Iterator;
    import java.util.List;

    import org.dom4j.Attribute;
    import org.dom4j.Document;
    import org.dom4j.DocumentHelper;
    import org.dom4j.Element;
    import org.dom4j.io.OutputFormat;
    import org.dom4j.io.SAXReader;
    import org.dom4j.io.XMLWriter;
