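At work we may run into XML, for example in Java configuration files or in hardware-facing interface communication, which generally uses XML rather than JSON. To make such data easy to work with, we need to convert the XML into entity objects. So how do we efficiently parse XML into our entity classes?
Four XML parsers are currently popular in Java:
1. DOM parser
2. SAX parser
3. StAX parser
4. JAXB parser (not tried here; it is somewhat more complex to use)
Besides these four, GitHub and other open source platforms host many open source XML parsing libraries. This article uses code to show how to use DOM, SAX, and StAX, plus JDOM as one such third-party library.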
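DOM parser
The DOM parser is the easiest Java XML parser to learn. It loads the whole XML file into memory, and we traverse it node by node to parse the XML. The DOM parser suits small files; as the file size grows, it becomes slow and consumes more memory.
The test code is as follows:
Create a test file named employee.xml: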
<?xml version="1.0"?>
<Employees>
<Employee>
<name>Pankaj</name>
<age>544</age>
<role>Java Developer</role>
<gender>Male</gender>
</Employee>
<Employee>
<name>Lisa</name>
<age>35</age>
<role>CSS Developer</role>
<gender>Female</gender>
</Employee>
</Employees>
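The listings in this article rely on an Employee class that is not shown. A minimal sketch of what it might look like follows; the field names are inferred from the setters the parsers call, and the toString() format is inferred from the printed output, so treat both as assumptions rather than the author's actual class.

```java
// Hypothetical Employee bean: fields inferred from setName/setAge/setGender/setRole
// as used by the parsers in this article.
class Employee {
    private String name;
    private int age;
    private String gender;
    private String role;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }

    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }

    // Format assumed from the output lines shown in the article.
    @Override
    public String toString() {
        return "Employee:: Name=" + name + " Age=" + age
                + " Gender=" + gender + " Role=" + role;
    }
}
```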
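The DOMParse class:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class DOMParse {
//The DOM parser suits small XML documents. Because it loads the complete XML file into memory, it is a poor fit for large files; for those, use the SAX parser instead.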
public static void main(String[] args) throws Exception {
String filePath = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
File xmlFile = new File(filePath);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
try {
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nodeList = doc.getElementsByTagName("Employee");
//now XML is loaded as Document in memory, lets convert it to Object List
List<Employee> empList = new ArrayList<Employee>();
for (int i = 0; i < nodeList.getLength(); i++) {
empList.add(getEmployee(nodeList.item(i)));
}
//lets print Employee list information
for (Employee emp : empList) {
System.out.println(emp.toString());
}
} catch (SAXException | ParserConfigurationException | IOException e1) {
e1.printStackTrace();
}
}
private static Employee getEmployee(Node node) {
//XMLReaderDOM domReader = new XMLReaderDOM();
Employee emp = new Employee();
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element element = (Element) node;
emp.setName(getTagValue("name", element));
emp.setAge(Integer.parseInt(getTagValue("age", element)));
emp.setGender(getTagValue("gender", element));
emp.setRole(getTagValue("role", element));
}
return emp;
}
private static String getTagValue(String tag, Element element) {
NodeList nodeList = element.getElementsByTagName(tag).item(0).getChildNodes();
Node node = (Node) nodeList.item(0);
return node.getNodeValue();
}
}
Output:
Root element :Employees
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
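The getTagValue() helper above assumes every tag is present and contains text; a missing or empty tag would lead to a NullPointerException. As a sketch of one possible alternative, the example below uses Element.getTextContent() and falls back to a default when the tag is absent. The class name DomTextSketch and the helpers childText() and parseRoot() are hypothetical, and the XML is parsed from an in-memory string so the example does not depend on a file path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class DomTextSketch {
    // Returns the text of the first <tag> descendant, or the fallback if the tag is absent.
    // An empty element such as <age></age> yields an empty string.
    static String childText(Element parent, String tag, String fallback) {
        NodeList matches = parent.getElementsByTagName(tag);
        if (matches.getLength() == 0) {
            return fallback;
        }
        return matches.item(0).getTextContent().trim();
    }

    // Parses an XML string with the DOM parser and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee><name>Lisa</name><age></age></Employee></Employees>";
        Element employee = (Element) parseRoot(xml).getElementsByTagName("Employee").item(0);
        System.out.println("name=" + childText(employee, "name", "unknown"));
        System.out.println("role=" + childText(employee, "role", "unknown"));
    }
}
```

Whether an empty string or a default value is the right result for an empty tag depends on the data; the caller still has to guard Integer.parseInt() against empty input.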
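SAX parser
The Java SAX parser provides an API for parsing XML documents. Unlike the DOM parser, it does not load the complete XML into memory; it reads the document sequentially. It is an event-based parser, so we implement our own handler class to parse the XML file. For large XML files it outperforms the DOM parser in both time and memory use.
javax.xml.parsers.SAXParser provides methods for parsing XML documents with an event handler. This class wraps an XMLReader implementation and provides overloaded parse() methods that read XML from a File, an InputStream, a SAX InputSource, or a String URI.
The actual parsing work is done by the handler class, so we need to create our own handler to parse the XML document. To do that we implement the org.xml.sax.ContentHandler interface. It contains callback methods that are notified as events occur, for example startDocument(), endDocument(), startElement(), endElement(), and characters().
org.xml.sax.helpers.DefaultHandler provides a default implementation of the ContentHandler interface, and we can extend it to create our own handler. Extending it is recommended because we usually need to override only a few methods, which keeps the code cleaner and easier to maintain.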
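To illustrate why extending DefaultHandler is convenient, here is a minimal sketch of a handler that overrides a single callback and leaves the rest to the default no-op implementations. The class name ElementCounter and the helper countElements() are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts start tags and ignores every other event.
class ElementCounter extends DefaultHandler {
    private int count;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        count++; // called once per start tag
    }

    // Parses the given XML string and returns how many elements it contains.
    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.count;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<Employees><Employee/><Employee/></Employees>"));
    }
}
```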
We reuse the same employee.xml file.
Create our own handler class, EmployeeXMLHandler:
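import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class EmployeeXMLHandler extends DefaultHandler {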
//List to hold Employees object
private List<Employee> empList = null;
private Employee emp = null;
//getter method for employee list
public List<Employee> getEmpList() {
return empList;
}
boolean bAge = false;
boolean bName = false;
boolean bGender = false;
boolean bRole = false;
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes)
throws SAXException {
if (qName.equalsIgnoreCase("Employee")) {
//start of an Employee element: create the object that will collect its fields
emp = new Employee();
//initialize list
if (empList == null)
empList = new ArrayList<>();
} else if (qName.equalsIgnoreCase("name")) {
//set boolean values for fields, will be used in setting Employee variables
bName = true;
} else if (qName.equalsIgnoreCase("age")) {
bAge = true;
} else if (qName.equalsIgnoreCase("gender")) {
bGender = true;
} else if (qName.equalsIgnoreCase("role")) {
bRole = true;
}
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
if (qName.equalsIgnoreCase("Employee")) {
//add Employee object to list
empList.add(emp);
}
}
@Override
public void characters(char ch[], int start, int length) throws SAXException {
if (bAge) {
//age element, set Employee age
emp.setAge(Integer.parseInt(new String(ch, start, length)));
bAge = false;
} else if (bName) {
emp.setName(new String(ch, start, length));
bName = false;
} else if (bRole) {
emp.setRole(new String(ch, start, length));
bRole = false;
} else if (bGender) {
emp.setGender(new String(ch, start, length));
bGender = false;
}
}
}
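Test class XMLParserSAX:
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
public class XMLParserSAX {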
public static void main(String[] args) {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
try {
SAXParser saxParser = saxParserFactory.newSAXParser();
EmployeeXMLHandler handler = new EmployeeXMLHandler();
saxParser.parse(new File("D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml"), handler);
//Get Employees list
List<Employee> empList = handler.getEmpList();
//print employee information
for(Employee emp : empList)
System.out.println(emp);
} catch (ParserConfigurationException | IOException | org.xml.sax.SAXException e) {
e.printStackTrace();
}
}
}
Output:
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
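SAX parser methods to override
The important methods to override are startElement(), endElement(), and characters().
When SAXParser parses the document, it calls startElement() for every start tag it finds. We override this method to set the boolean flags that identify which element is being read. We also use it to create a new Employee object each time an Employee start tag is found. The sample employee.xml has no id attribute, so no attribute is read here; if it had one, the attributes parameter would be the place to read it.
SAXParser calls characters() when it finds character data inside an element. We use the boolean flags to set the value on the correct field of the Employee object.
endElement() is where we add the Employee object to the list whenever we reach an Employee end tag.
SAXParserFactory provides factory methods for obtaining a SAXParser instance. We pass the File object to the parse method together with an EmployeeXMLHandler instance, which handles the callback events.
SAXParser is a little confusing at first, but if you are working with large XML documents, it reads XML more efficiently than the DOM parser. That covers the SAX parser in Java.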
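One caveat with the flag-based handler: the SAX API allows a parser to deliver the text of a single element through several characters() calls, in which case clearing the flag after the first call would keep only the first chunk. A common way around this is to buffer text in a StringBuilder and assign it in endElement(). The sketch below shows that variation; the class name BufferingEmployeeHandler is hypothetical, and it stores fields in a Map rather than an Employee object only to stay self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers character data and assigns it when the element ends.
class BufferingEmployeeHandler extends DefaultHandler {
    private final List<Map<String, String>> employees = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // discard whitespace collected between elements
        if (qName.equals("Employee")) {
            current = new LinkedHashMap<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Employee")) {
            employees.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim()); // e.g. name -> Lisa
        }
        text.setLength(0);
    }

    // Parses the XML string and returns one map of field values per Employee element.
    static List<Map<String, String>> parse(String xml) {
        try {
            BufferingEmployeeHandler handler = new BufferingEmployeeHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.employees;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

The same idea carries over to the Employee-based handler: replace the boolean flags with one buffer and call the matching setter in endElement().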
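StAX Java XML parser
The Java Streaming API for XML (Java StAX) is a pull-parsing API for processing XML in Java. StAX consists of two sets of APIs: a cursor-based API and an iterator-based API.
Iterator-based API
We again use the employee.xml file above for testing.
Create the StaxXMLReader class:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
public class StaxXMLReader {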
public static void main(String[] args) {
String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
List<Employee> empList = parseXML(fileName);
for(Employee emp : empList){
System.out.println(emp.toString());
}
}
private static List<Employee> parseXML(String fileName) {
List<Employee> empList = new ArrayList<>();
Employee emp = null;
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
try {
XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
while(xmlEventReader.hasNext()){
XMLEvent xmlEvent = xmlEventReader.nextEvent();
if (xmlEvent.isStartElement()){
StartElement startElement = xmlEvent.asStartElement();
if(startElement.getName().getLocalPart().equals("Employee")){
emp = new Employee();
//Get the 'id' attribute from Employee element
Attribute idAttr = startElement.getAttributeByName(new QName("id"));
/*if(idAttr != null){
emp.setId(Integer.parseInt(idAttr.getValue()));
}*/
}
//set the other variables from xml elements
else if(startElement.getName().getLocalPart().equals("age")){
xmlEvent = xmlEventReader.nextEvent();
// Note: if <age> may be empty, the next event is its end element rather than text, so check for that first
if(xmlEvent.isEndElement()) {
emp.setAge(1000); // placeholder default used when <age> is empty
}
else
{
emp.setAge(Integer.parseInt(xmlEvent.asCharacters().getData()));
}
}else if(startElement.getName().getLocalPart().equals("name")){
xmlEvent = xmlEventReader.nextEvent();
emp.setName(xmlEvent.asCharacters().getData());
}else if(startElement.getName().getLocalPart().equals("gender")){
xmlEvent = xmlEventReader.nextEvent();
emp.setGender(xmlEvent.asCharacters().getData());
}else if(startElement.getName().getLocalPart().equals("role")){
xmlEvent = xmlEventReader.nextEvent();
emp.setRole(xmlEvent.asCharacters().getData());
}
}
//if Employee end element is reached, add employee object to list
if(xmlEvent.isEndElement()){
EndElement endElement = xmlEvent.asEndElement();
System.out.println("End tag reached: "+endElement.getName().getLocalPart());
if(endElement.getName().getLocalPart().equals("Employee")){
empList.add(emp);
}
}
}
} catch (FileNotFoundException | XMLStreamException e) {
e.printStackTrace();
}
return empList;
}
}
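The empty-element check for age in the class above can also be delegated to XMLEventReader.getElementText(), which reads the content of a text-only element and returns an empty string when the element has no text. The sketch below is a hypothetical variation (the class StaxElementTextSketch and method ages() are not from the original code) that reads from a string instead of a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxElementTextSketch {
    // Collects the text of every <age> element; an empty element contributes "".
    static List<String> ages(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals("age")) {
                    // Valid only right after a start element; consumes up to the end tag.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ages("<Employees><Employee><age>35</age></Employee></Employees>"));
    }
}
```

The caller still decides what an empty string means, for example skipping the setter or applying a default before calling Integer.parseInt().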
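Cursor-based API
When we use the StAX XML parser, we create an XMLInputFactory to read the XML file. We can then choose the cursor-based API by creating an XMLStreamReader object to read the file. The next() method of XMLStreamReader advances to the next parsing event and returns an int value that identifies the event type. Common event types include START_DOCUMENT, START_ELEMENT, CHARACTERS, END_ELEMENT, and END_DOCUMENT. XMLStreamConstants contains the int constants used to handle each event according to its type.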
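As a small illustration of the cursor loop described above, the sketch below records which events next() reports for a short document. The class CursorEventsSketch and its events() method are hypothetical names used only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CursorEventsSketch {
    // Walks the document with the cursor API and records a label for each event of interest.
    static List<String> events(String xml) {
        List<String> seen = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // int constant from XMLStreamConstants
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        seen.add("START " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (!reader.isWhiteSpace()) { // skip indentation between tags
                            seen.add("TEXT " + reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        seen.add("END " + reader.getLocalName());
                        break;
                    default:
                        break; // END_DOCUMENT, comments, and so on are ignored here
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        for (String label : events("<name>Lisa</name>")) {
            System.out.println(label);
        }
    }
}
```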
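Test class StaxXMLReader2:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxXMLReader2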
{
private static boolean bName;
private static boolean bAge;
private static boolean bGender;
private static boolean bRole;
public static void main(String[] args) {
String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
List<Employee> empList = parseXML(fileName);
for(Employee emp : empList){
System.out.println(emp.toString());
}
}
private static List<Employee> parseXML(String fileName) {
List<Employee> empList = new ArrayList<>();
Employee emp = null;
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
try {
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
int event = xmlStreamReader.getEventType();
while(true){
switch(event) {
case XMLStreamConstants.START_ELEMENT:
if(xmlStreamReader.getLocalName().equals("Employee")){
emp = new Employee();
// emp.setId(Integer.parseInt(xmlStreamReader.getAttributeValue(0)));
}else if(xmlStreamReader.getLocalName().equals("name")){
bName=true;
}else if(xmlStreamReader.getLocalName().equals("age")){
bAge=true;
}else if(xmlStreamReader.getLocalName().equals("role")){
bRole=true;
}else if(xmlStreamReader.getLocalName().equals("gender")){
bGender=true;
}
break;
case XMLStreamConstants.CHARACTERS:
if(bName){
emp.setName(xmlStreamReader.getText());
bName=false;
}else if(bAge){
emp.setAge(Integer.parseInt(xmlStreamReader.getText()));
bAge=false;
}else if(bGender){
emp.setGender(xmlStreamReader.getText());
bGender=false;
}else if(bRole){
emp.setRole(xmlStreamReader.getText());
bRole=false;
}
break;
case XMLStreamConstants.END_ELEMENT:
if(xmlStreamReader.getLocalName().equals("Employee")){
empList.add(emp);
}
break;
}
if (!xmlStreamReader.hasNext())
break;
event = xmlStreamReader.next();
}
} catch (FileNotFoundException | XMLStreamException e) {
e.printStackTrace();
}
return empList;
}
}
Output:
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
Java XML Parser - JDOM
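JDOM provides an excellent Java XML parser API for reading, editing, and writing XML documents with ease. It provides wrapper classes that let you choose the underlying implementation from the SAX parser, DOM parser, StAX event parser, and StAX stream parser.
Add the Maven dependency: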
<!-- https://mvnrepository.com/artifact/org.jdom/jdom2 -->
<dependency>
<groupId>org.jdom</groupId>
<artifactId>jdom2</artifactId>
<version>2.0.6</version>
</dependency>
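Test class JDOMXMLReader:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.DOMBuilder;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.StAXEventBuilder;
import org.jdom2.input.StAXStreamBuilder;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
public class JDOMXMLReader {
//The benefit of JDOM is that you can switch easily among the SAX, DOM, and StAX parsers; you can provide a factory method that lets the client application choose the implementation.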
public static void main(String[] args) {
final String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
org.jdom2.Document jdomDoc;
try {
//we can create JDOM Document from DOM, SAX and STAX Parser Builder classes
jdomDoc = useDOMParser(fileName);
// jdomDoc = useSAXParser(fileName);
// jdomDoc = useSTAXParser(fileName,"stream");
Element root = jdomDoc.getRootElement();
List<Element> empListElements = root.getChildren("Employee");
List<Employee> empList = new ArrayList<>();
for (Element empElement : empListElements) {
Employee emp = new Employee();
// emp.setId(Integer.parseInt(empElement.getAttributeValue("id")));
emp.setAge(Integer.parseInt(empElement.getChildText("age")));
emp.setName(empElement.getChildText("name"));
emp.setRole(empElement.getChildText("role"));
emp.setGender(empElement.getChildText("gender"));
empList.add(emp);
}
//lets print Employees list information
for (Employee emp : empList)
System.out.println(emp);
} catch (Exception e) {
e.printStackTrace();
}
}
//Get JDOM document from DOM Parser
private static org.jdom2.Document useDOMParser(String fileName)
throws ParserConfigurationException, SAXException, IOException {
//creating DOM Document
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File(fileName));
DOMBuilder domBuilder = new DOMBuilder();
return domBuilder.build(doc);
}
//Get JDOM document from SAX Parser
private static org.jdom2.Document useSAXParser(String fileName) throws JDOMException,
IOException {
SAXBuilder saxBuilder = new SAXBuilder();
return saxBuilder.build(new File(fileName));
}
//Get JDOM Document from STAX Stream Parser or STAX Event Parser
private static org.jdom2.Document useSTAXParser(String fileName, String type) throws FileNotFoundException, XMLStreamException, JDOMException{
if(type.equalsIgnoreCase("stream")){
StAXStreamBuilder staxBuilder = new StAXStreamBuilder();
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
return staxBuilder.build(xmlStreamReader);
}
StAXEventBuilder staxBuilder = new StAXEventBuilder();
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
return staxBuilder.build(xmlEventReader);
}
}
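The benefit of JDOM is that you can switch easily among the SAX, DOM, and StAX parsers; we can expose an interface that lets client applications choose the implementation.
Complete test code: https://github.com/bo-zhang-1/Xml-Parser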