Java XML Parsers

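At work we may run into XML, for example in Java configuration files or in communication with hardware-oriented interfaces, which generally use XML rather than JSON. To work with such data conveniently, we need to convert the XML into our own entity objects. So how do we parse XML into entity class objects efficiently?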

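Four XML parsing options are currently popular in Java:

1. DOM parser

2. SAX parser

3. StAX parser

4. JAXB (not tried here; it is comparatively more complex to use)

Beyond these four, GitHub and other open-source platforms host many other XML parsing libraries. This article uses code to demonstrate DOM, SAX, and StAX, and then JDOM, one such open-source library.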

DOM Parser

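The DOM parser is the easiest Java XML parser to learn. It loads the XML file into memory, and we can then traverse it node by node to parse the XML. The DOM parser works well for small files, but as file size grows it becomes slow and consumes more memory.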

The test code is as follows.

Create a test file named employee.xml:

<?xml version="1.0"?>
<Employees>
    <Employee>
        <name>Pankaj</name>
        <age>544</age>
        <role>Java Developer</role>
        <gender>Male</gender>
    </Employee>
    <Employee>
        <name>Lisa</name>
        <age>35</age>
        <role>CSS Developer</role>
        <gender>Female</gender>
    </Employee>
</Employees>
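The parser examples in this article populate an Employee class that the article itself never lists. The following is a minimal sketch of what it could look like; the field types and the toString() format are assumptions inferred from the setters the examples call and from the output they print, not the author's actual class.

```java
// Hypothetical Employee bean: fields and toString() format are inferred
// from the setters and printed output used elsewhere in this article.
public class Employee {
    private String name;
    private int age;
    private String role;
    private String gender;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }

    // Mirrors the "Employee:: Name=... Age=... Gender=... Role=..." lines in the sample output.
    @Override
    public String toString() {
        return "Employee:: Name=" + name + " Age=" + age
                + " Gender=" + gender + " Role=" + role;
    }
}
```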

The DOMParse class:

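import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DOMParse {
    // The DOM parser suits small XML documents; because it loads the whole file into memory, prefer the SAX parser for large files.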
    public static void main(String[] args) throws Exception {
        String filePath = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        File xmlFile = new File(filePath);
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder;
        try {
            dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(xmlFile);
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nodeList = doc.getElementsByTagName("Employee");
            //now XML is loaded as Document in memory, lets convert it to Object List
            List<Employee> empList = new ArrayList<Employee>();
            for (int i = 0; i < nodeList.getLength(); i++) {
                empList.add(getEmployee(nodeList.item(i)));
            }
            //lets print Employee list information
            for (Employee emp : empList) {
                System.out.println(emp.toString());
            }
        } catch (SAXException | ParserConfigurationException | IOException e1) {
            e1.printStackTrace();
        }

    }


    private static Employee getEmployee(Node node) {
        //XMLReaderDOM domReader = new XMLReaderDOM();
        Employee emp = new Employee();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            emp.setName(getTagValue("name", element));
            emp.setAge(Integer.parseInt(getTagValue("age", element)));
            emp.setGender(getTagValue("gender", element));
            emp.setRole(getTagValue("role", element));
        }

        return emp;
    }


    private static String getTagValue(String tag, Element element) {
        NodeList nodeList = element.getElementsByTagName(tag).item(0).getChildNodes();
        Node node = (Node) nodeList.item(0);
        return node.getNodeValue();
    }

}

Output:

Root element :Employees
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
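The getTagValue helper above throws a NullPointerException when a tag is missing or has no text, because item(0) returns null in both cases. One possible way to guard against that is sketched below using Node.getTextContent(); the class name DomTextDemo, the helper childText, and the fallback parameter are illustrative names, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomTextDemo {
    // Returns the trimmed text of the first <tag> descendant,
    // or the fallback when the tag is missing or contains no text.
    static String childText(Element parent, String tag, String fallback) {
        NodeList matches = parent.getElementsByTagName(tag);
        if (matches.getLength() == 0) {
            return fallback;
        }
        String text = matches.item(0).getTextContent().trim();
        return text.isEmpty() ? fallback : text;
    }

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element emp = parseRoot("<Employee><name>Lisa</name><age/></Employee>");
        System.out.println(childText(emp, "name", "unknown")); // present tag
        System.out.println(childText(emp, "age", "0"));        // empty tag
        System.out.println(childText(emp, "role", "n/a"));     // missing tag
    }
}
```

Whether a fallback value or an exception is the better response to a missing field depends on how strictly the incoming XML should be validated.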

SAX Parser

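The Java SAX parser provides an API for parsing XML documents. Unlike the DOM parser, the SAX parser does not load the complete XML into memory; it reads the XML document sequentially. It is an event-based parser, and we need to implement our own handler class to parse the XML file. For large XML files it outperforms the DOM parser in both time and memory usage.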

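javax.xml.parsers.SAXParser provides methods for parsing XML documents with event handlers. This class wraps an implementation of the org.xml.sax.XMLReader interface and provides overloaded versions of the parse() method that read XML from a File, an InputStream, a SAX InputSource, or a String URI.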

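The actual parsing is done by the handler class. We need to create our own handler class to parse the XML document, which means implementing the org.xml.sax.ContentHandler interface. This interface contains callback methods that are notified when an event occurs, for example startDocument(), endDocument(), startElement(), endElement(), and characters().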

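org.xml.sax.helpers.DefaultHandler provides a default implementation of the ContentHandler interface, and we can extend this class to create our own handler. Extending it is recommended because we usually need to implement only a few of the methods, which keeps our code cleaner and easier to maintain.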
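To make the callback model concrete, here is a small sketch that extends DefaultHandler, overrides only four callbacks, and records the order in which they fire. It uses the parse(InputStream, DefaultHandler) overload on an in-memory string; the class names SaxEventLogDemo and EventLogHandler are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventLogDemo {
    // Overrides only the callbacks of interest; DefaultHandler supplies no-op versions of the rest.
    static class EventLogHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() { events.add("startDocument"); }

        @Override
        public void endDocument() { events.add("endDocument"); }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }
    }

    // Parses the XML string through the InputStream overload and returns the recorded events.
    static List<String> logEvents(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        EventLogHandler handler = new EventLogHandler();
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            parser.parse(in, handler);
        }
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // prints [startDocument, start:Employees, start:Employee, end:Employee, end:Employees, endDocument]
        System.out.println(logEvents("<Employees><Employee/></Employees>"));
    }
}
```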

We continue to use the same employee.xml file.

Create our own handler class, EmployeeXMLHandler:

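import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EmployeeXMLHandler extends DefaultHandler {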

    //List to hold Employees object
    private List<Employee> empList = null;
    private Employee emp = null;


    //getter method for employee list
    public List<Employee> getEmpList() {
        return empList;
    }

    boolean bAge = false;
    boolean bName = false;
    boolean bGender = false;
    boolean bRole = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {

        if (qName.equalsIgnoreCase("Employee")) {
            //start of an Employee element: create a new Employee object
            emp = new Employee();
            //initialize list
            if (empList == null)
                empList = new ArrayList<>();
        } else if (qName.equalsIgnoreCase("name")) {
            //set boolean values for fields, will be used in setting Employee variables
            bName = true;
        } else if (qName.equalsIgnoreCase("age")) {
            bAge = true;
        } else if (qName.equalsIgnoreCase("gender")) {
            bGender = true;
        } else if (qName.equalsIgnoreCase("role")) {
            bRole = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            //add Employee object to list
            empList.add(emp);
        }
    }

    @Override
    public void characters(char ch[], int start, int length) throws SAXException {

        if (bAge) {
            //age element, set Employee age
            emp.setAge(Integer.parseInt(new String(ch, start, length)));
            bAge = false;
        } else if (bName) {
            emp.setName(new String(ch, start, length));
            bName = false;
        } else if (bRole) {
            emp.setRole(new String(ch, start, length));
            bRole = false;
        } else if (bGender) {
            emp.setGender(new String(ch, start, length));
            bGender = false;
        }
    }
}

 

Test class XMLParserSAX:

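import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class XMLParserSAX {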

    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            EmployeeXMLHandler handler = new EmployeeXMLHandler();
            saxParser.parse(new File("D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml"), handler);
            //Get Employees list
            List<Employee> empList = handler.getEmpList();
            //print employee information
            for(Employee emp : empList)
                System.out.println(emp);
        } catch (ParserConfigurationException  | IOException | org.xml.sax.SAXException e) {
            e.printStackTrace();
        }
    }
}

Output:

Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer

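SAX parser methods to override

The important methods to override are startElement(), endElement(), and characters().

SAXParser starts parsing the document, and whenever it finds a start element it calls startElement(). We override this method to set the boolean variables that identify which element we are in.

Each time an Employee start element is found, we also use this method to create a new Employee object. The employee.xml used here has no id attribute; if it had one, it could be read in this method through the attributes parameter.

characters() is called when SAXParser finds character data inside an element. We use the boolean fields to set the value on the correct field of the Employee object.

endElement() is where we add the Employee object to the list whenever we reach the Employee end element tag.

SAXParserFactory provides factory methods for obtaining a SAXParser instance. We pass the File object together with an EmployeeXMLHandler instance to the parse() method to handle the callback events.

SAXParser is a little confusing at first, but if you are processing large XML documents it offers a more efficient way to read XML than the DOM parser. That covers the SAX parser in Java.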
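One caveat with the boolean-flag handler above: the SAX API allows a parser to deliver the text of a single element across several characters() calls, in which case only the first chunk would be kept. A common alternative, sketched here under assumptions, is to accumulate text in a StringBuilder and consume it in endElement(). The sketch also shows reading a hypothetical id attribute in startElement(); the class names, the Map-based record, and the id attribute are illustrative and not part of the original project.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferedSaxDemo {
    // Collects each <Employee> as a map of child tag name -> text (plus "id" when the attribute exists).
    static class EmployeeMapHandler extends DefaultHandler {
        final List<Map<String, String>> employees = new ArrayList<>();
        private Map<String, String> current;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            text.setLength(0); // start a fresh buffer for this element
            if (qName.equals("Employee")) {
                current = new LinkedHashMap<>();
                String id = attributes.getValue("id"); // hypothetical attribute; null when absent
                if (id != null) {
                    current.put("id", id);
                }
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called more than once per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("Employee")) {
                employees.add(current);
                current = null;
            } else if (current != null) {
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static List<Map<String, String>> parse(String xml) throws Exception {
        EmployeeMapHandler handler = new EmployeeMapHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.employees;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee id=\"1\"><name>Lisa</name><age>35</age></Employee></Employees>";
        System.out.println(parse(xml));
    }
}
```

Buffering costs a little extra state but removes the dependence on how the parser happens to chunk character data.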

 

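StAX Java XML Parser

The Java Streaming API for XML (Java StAX) provides an implementation for processing XML in Java. StAX consists of two sets of APIs: a cursor-based API and an iterator-based API.

Iterator-based API

We again use the employee.xml file from above for testing.

 

Create the StaxXMLReader class: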

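import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxXMLReader {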

    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for(Employee emp : empList){
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
            while(xmlEventReader.hasNext()){
                XMLEvent xmlEvent = xmlEventReader.nextEvent();
                if (xmlEvent.isStartElement()){
                    StartElement startElement = xmlEvent.asStartElement();
                    if(startElement.getName().getLocalPart().equals("Employee")){
                        emp = new Employee();
                        //Get the 'id' attribute from Employee element
                        Attribute idAttr = startElement.getAttributeByName(new QName("id"));
                        /*if(idAttr != null){
                            emp.setId(Integer.parseInt(idAttr.getValue()));
                        }*/
                    }
                    //set the other variables from xml elements
                    else if(startElement.getName().getLocalPart().equals("age")){
                        xmlEvent = xmlEventReader.nextEvent();
                        // Note: if <age> may be empty, the next event is its end tag rather than character data, so check first
                        if(xmlEvent.isEndElement()) {
                            emp.setAge(1000); // arbitrary default for an empty <age>
                        }
                        else
                        {
                            emp.setAge(Integer.parseInt(xmlEvent.asCharacters().getData()));
                        }

                    }else if(startElement.getName().getLocalPart().equals("name")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setName(xmlEvent.asCharacters().getData());
                    }else if(startElement.getName().getLocalPart().equals("gender")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setGender(xmlEvent.asCharacters().getData());
                    }else if(startElement.getName().getLocalPart().equals("role")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setRole(xmlEvent.asCharacters().getData());
                    }
                }
                //if Employee end element is reached, add employee object to list
                if(xmlEvent.isEndElement()){
                    EndElement endElement = xmlEvent.asEndElement();
                    System.out.println("End tag reached: " + endElement.getName().getLocalPart());
                    if(endElement.getName().getLocalPart().equals("Employee")){
                        empList.add(emp);
                    }
                }
            }

        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}
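The empty-element check that StaxXMLReader performs for age would have to be repeated for every field that might be empty. As a possible simplification, XMLEventReader.getElementText() reads the text of a text-only element and returns an empty string when there is none. The sketch below assumes a flat Employee element with only text children; the class name StaxEventTextDemo and the method readFields are illustrative.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class StaxEventTextDemo {
    // Reads every text-only child of <Employee> into a tag -> text map.
    static Map<String, String> readFields(String xml) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    String tag = event.asStartElement().getName().getLocalPart();
                    if (!tag.equals("Employee")) {
                        // Consumes events up to the matching end tag; yields "" for an empty element.
                        fields.put(tag, reader.getElementText());
                    }
                }
            }
        } finally {
            reader.close();
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readFields("<Employee><name>Lisa</name><age></age></Employee>"));
    }
}
```

Note that getElementText() throws an XMLStreamException if the element contains child elements, so it only fits leaf fields such as name or age.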

 

 

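Cursor-based API

When using the StAX XML parser, we first create an XMLInputFactory to read the XML file. We can then choose the cursor-based API by creating an XMLStreamReader object to read the file. The XMLStreamReader next() method fetches the next parsing event and returns an int value that identifies the event type. Common event types include Start Document, Start Element, Characters, End Element, and End Document. XMLStreamConstants contains the int constants that can be used to handle events according to their type.

Test class StaxXMLReader2: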

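import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxXMLReader2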
{
    private static boolean bName;
    private static boolean bAge;
    private static boolean bGender;
    private static boolean bRole;

    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for(Employee emp : empList){
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            int event = xmlStreamReader.getEventType();
            while(true){
                switch(event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if(xmlStreamReader.getLocalName().equals("Employee")){
                            emp = new Employee();
                           // emp.setId(Integer.parseInt(xmlStreamReader.getAttributeValue(0)));
                        }else if(xmlStreamReader.getLocalName().equals("name")){
                            bName=true;
                        }else if(xmlStreamReader.getLocalName().equals("age")){
                            bAge=true;
                        }else if(xmlStreamReader.getLocalName().equals("role")){
                            bRole=true;
                        }else if(xmlStreamReader.getLocalName().equals("gender")){
                            bGender=true;
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if(bName){
                            emp.setName(xmlStreamReader.getText());
                            bName=false;
                        }else if(bAge){
                            emp.setAge(Integer.parseInt(xmlStreamReader.getText()));
                            bAge=false;
                        }else if(bGender){
                            emp.setGender(xmlStreamReader.getText());
                            bGender=false;
                        }else if(bRole){
                            emp.setRole(xmlStreamReader.getText());
                            bRole=false;
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if(xmlStreamReader.getLocalName().equals("Employee")){
                            empList.add(emp);
                        }
                        break;
                }
                if (!xmlStreamReader.hasNext())
                    break;

                event = xmlStreamReader.next();
            }

        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}

Output:

Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
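To see the int event types that next() returns, the following sketch walks a tiny document with the cursor API and translates each code into a readable name using the XMLStreamConstants constants. The class name StaxCursorEventsDemo and the helper eventName are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCursorEventsDemo {
    // Maps the int event codes used by XMLStreamReader to readable names.
    static String eventName(int type) {
        switch (type) {
            case XMLStreamConstants.START_DOCUMENT: return "START_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:  return "START_ELEMENT";
            case XMLStreamConstants.CHARACTERS:     return "CHARACTERS";
            case XMLStreamConstants.END_ELEMENT:    return "END_ELEMENT";
            case XMLStreamConstants.END_DOCUMENT:   return "END_DOCUMENT";
            default:                                return "OTHER(" + type + ")";
        }
    }

    // Returns the event names in the order the cursor visits them.
    static List<String> listEvents(String xml) throws XMLStreamException {
        List<String> names = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            names.add(eventName(reader.getEventType())); // the cursor starts on the document start
            while (reader.hasNext()) {
                names.add(eventName(reader.next()));
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(listEvents("<name>Lisa</name>"));
    }
}
```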

 

 

 

 

Java XML Parser - JDOM

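JDOM provides an excellent Java XML parser API that makes it easy to read, edit, and write XML documents. JDOM provides wrapper classes for choosing the underlying implementation from the SAX parser, DOM parser, StAX event parser, and StAX stream parser.

Add the Maven dependency: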

    <!-- https://mvnrepository.com/artifact/org.jdom/jdom2 -->
        <dependency>
            <groupId>org.jdom</groupId>
            <artifactId>jdom2</artifactId>
            <version>2.0.6</version>
        </dependency>

 

Test class JDOMXMLReader:

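import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.DOMBuilder;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.StAXEventBuilder;
import org.jdom2.input.StAXStreamBuilder;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class JDOMXMLReader {
    // A benefit of JDOM is that you can easily switch between the SAX, DOM, and StAX parsers; a factory method can let the client application choose the implementation.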
    public static void main(String[] args) {
        final String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        org.jdom2.Document jdomDoc;
        try {
            //we can create JDOM Document from DOM, SAX and STAX Parser Builder classes
            jdomDoc = useDOMParser(fileName);
           // jdomDoc = useSAXParser(fileName);
          //  jdomDoc = useSTAXParser(fileName,"stream");
            Element root = jdomDoc.getRootElement();
            List<Element> empListElements = root.getChildren("Employee");
            List<Employee> empList = new ArrayList<>();
            for (Element empElement : empListElements) {
                Employee emp = new Employee();
               // emp.setId(Integer.parseInt(empElement.getAttributeValue("id")));
                emp.setAge(Integer.parseInt(empElement.getChildText("age")));
                emp.setName(empElement.getChildText("name"));
                emp.setRole(empElement.getChildText("role"));
                emp.setGender(empElement.getChildText("gender"));
                empList.add(emp);
            }
            //lets print Employees list information
            for (Employee emp : empList)
                System.out.println(emp);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }


    //Get JDOM document from DOM Parser
    private static org.jdom2.Document useDOMParser(String fileName)
            throws ParserConfigurationException, SAXException, IOException {
        //creating DOM Document
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder;
        dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new File(fileName));
        DOMBuilder domBuilder = new DOMBuilder();
        return domBuilder.build(doc);

    }

    //Get JDOM document from SAX Parser
    private static org.jdom2.Document useSAXParser(String fileName) throws JDOMException,
            IOException {
        SAXBuilder saxBuilder = new SAXBuilder();
        return saxBuilder.build(new File(fileName));
    }

    //Get JDOM Document from STAX Stream Parser or STAX Event Parser
    private static org.jdom2.Document useSTAXParser(String fileName, String type) throws FileNotFoundException, XMLStreamException, JDOMException{
        if(type.equalsIgnoreCase("stream")){
            StAXStreamBuilder staxBuilder = new StAXStreamBuilder();
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            return staxBuilder.build(xmlStreamReader);
        }
        StAXEventBuilder staxBuilder = new StAXEventBuilder();
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
        return staxBuilder.build(xmlEventReader);

    }
}

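A benefit of JDOM is that we can easily switch between the SAX, DOM, and StAX parsers, and we can provide an interface that lets the client application choose the implementation.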

 

 

Complete test code: https://github.com/bo-zhang-1/Xml-Parser

 

 

 

 

 
