XML Summary

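1. XML is case-sensitive.

Every element needs both a start tag and an end tag, as in <p></p>, or must be written as a self-closing empty element, as in <img src=""/>.

Attribute values must be enclosed in quotation marks (either single or double quotes).

Every attribute must have a value. For example, <input type="radio" name="language" value="Java" checked> is an error in XML; it must be written with checked="true", and the element itself must also be closed.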
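
The rules above can be checked by handing small fragments to a parser. The following is a minimal sketch, assuming the JAXP DOM parser bundled with the JDK; the class name WellFormedCheck and the helper isWellFormed are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Hypothetical helper: returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<input type=\"radio\" checked=\"true\"/>", // attribute has a value, element is closed
            "<input type=\"radio\" checked/>",          // attribute without a value
            "<p></P>",                                  // start and end tag differ in case
            "<img src=x/>"                              // attribute value is not quoted
        };
        for (String sample : samples) {
            System.out.println(isWellFormed(sample) + "  " + sample);
        }
    }
}
```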

 

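2. An XML document should start with a header (the XML declaration), such as <?xml version="1.0"?> or <?xml version="1.0" encoding="UTF-8"?>. The header is optional.
The header can be followed by a document type declaration (DTD), for example: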
<!DOCTYPE web-app PUBLIC
   "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
   "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">
The DTD is optional as well.
 
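3. The XML body consists of a single root element that contains all the other elements. The name of the root tag can be chosen freely.
An element can also have mixed content (text together with child elements), as in: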
<font>
   Helvetica
   <size>36</size>
</font>
 

4. XML elements can contain attributes, such as

<size unit="pt">36</size>

 

5. As a rule of thumb, prefer elements over attributes.
Elements:
<font>
   <name>Helvetica</name>
   <size>36</size>
</font>

Attribute:

<font name="Helvetica" size="36"/>

The problem shows up when a unit has to be added to the size:

<font name="Helvetica" size="36 pt"/>

Now the application has to parse the string "36 pt" itself.
The element form is clearer:
<font>
   <name>Helvetica</name>
   <size unit="pt">36</size>
</font>
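
The difference can be seen in the code that reads the two forms. The following is a rough sketch, assuming the JDK's DOM parser; the class FontSizeStyles and its helper methods are hypothetical names used only for this illustration. With the attribute form the application splits the string itself, while with the element form the number and the unit arrive separately.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class FontSizeStyles {
    // Hypothetical helper: parses the text and returns the root element.
    static Element root(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Attribute form: the value "36 pt" has to be split by the application.
    static String[] sizeFromAttribute(String xml) {
        return root(xml).getAttribute("size").trim().split("\\s+");
    }

    // Element form: the number is the element text, the unit is its own attribute.
    static String[] sizeFromElement(String xml) {
        Element size = (Element) root(xml).getElementsByTagName("size").item(0);
        return new String[] { size.getTextContent().trim(), size.getAttribute("unit") };
    }

    public static void main(String[] args) {
        String[] a = sizeFromAttribute("<font name=\"Helvetica\" size=\"36 pt\"/>");
        String[] e = sizeFromElement(
            "<font><name>Helvetica</name><size unit=\"pt\">36</size></font>");
        System.out.println(a[0] + " / " + a[1]);
        System.out.println(e[0] + " / " + e[1]);
    }
}
```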
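6. Special markup: character references and entity references
·         &#233; é (decimal character reference)
·         &#x2122; ™ (hexadecimal character reference)
·         &lt; less than (<)
·         &gt; greater than (>)
·         &amp; ampersand (&)
·         &quot; quotation mark (")
·         &apos; apostrophe (')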
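
A parser replaces these references with the characters they stand for, so application code never sees the escaped form. A small sketch, assuming the JDK's DOM parser; EntityDemo and textOf are hypothetical names for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class EntityDemo {
    // Hypothetical helper: returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The element text uses every reference from the list above.
        String xml = "<t>&lt;b&gt; &amp; &#233; &#x2122; &quot;q&quot; &apos;</t>";
        System.out.println(textOf(xml));
    }
}
```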
 

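7. A CDATA section is delimited by <![CDATA[ and ]]>. Special characters inside it are not interpreted as markup, and the content is passed through as is.

A CDATA section cannot contain the string ]]>.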
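
The sketch below illustrates both points, assuming the JDK's DOM parser. CDataDemo, toCData, and textOf are hypothetical names. Splitting the text into two adjacent CDATA sections whenever ]]> occurs is a common workaround, not something the notes above prescribe.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class CDataDemo {
    // Wraps text in a CDATA section. An embedded "]]>" is split across two
    // adjacent sections so that it never appears inside a single one.
    static String toCData(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Hypothetical helper: returns the text content of the root element.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String code = "if (a < b && c[d[0]]>0) {}"; // contains <, & and ]]>
        String xml = "<code>" + toCData(code) + "</code>";
        System.out.println(xml);
        System.out.println(textOf(xml)); // the original text, unchanged
    }
}
```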

 

8. Processing instructions are delimited by <? and ?>, for example:

·                <?xml-stylesheet href="mystyle.css" type="text/css"?>

9. Comments are delimited by <!-- and -->, for example:

·                <!-- This is a comment. -->
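
Processing instructions and comments are kept by a DOM parser as nodes of their own. The following sketch, assuming the JDK's default DOM parser settings, lists the top-level nodes of a small document; PrologNodes and describeTopLevel are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

public class PrologNodes {
    // Hypothetical helper: describes each direct child of the document node.
    static List<String> describeTopLevel(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node instanceof ProcessingInstruction) {
                ProcessingInstruction pi = (ProcessingInstruction) node;
                result.add("PI " + pi.getTarget() + " [" + pi.getData() + "]");
            } else if (node instanceof Comment) {
                result.add("comment [" + ((Comment) node).getData() + "]");
            } else if (node instanceof Element) {
                result.add("element " + ((Element) node).getTagName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
            + "<?xml-stylesheet href=\"mystyle.css\" type=\"text/css\"?>"
            + "<!-- This is a comment. -->"
            + "<root/>";
        describeTopLevel(xml).forEach(System.out::println);
    }
}
```

The XML declaration itself does not show up in the list, because it is not a processing instruction node.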

 

 

 

Parsing XML documents

 

1. The Java library supplies two kinds of XML parsers:

·         The Document Object Model (DOM) parser reads an XML document into a tree structure. For large files, building the tree consumes a lot of memory.

·         The Simple API for XML (SAX) parser generates events as it reads an XML document.

SAX is suited to parsing large documents, or to cases where only some of the elements matter, because it uses little memory.

DOM parser

The DOM parser interface is standardized by the W3C. The standard defines interfaces such as Document, Element, and Node, and each vendor's parser implements them.

2. The Sun Java API for XML Processing (JAXP) library actually makes it possible to plug in any of these parsers. But Sun also includes its own DOM parser in the Java SDK.

Using the parser bundled with Sun's SDK looks like this:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

 

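// Parse from a file, a URL, or an arbitrary input stream (use one of the three).
File f = . . .
Document doc = builder.parse(f);

URL u = . . .
Document doc = builder.parse(u.toString()); // DocumentBuilder has no parse(URL) overload

InputStream in = . . .
Document doc = builder.parse(in);
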
Element root = doc.getDocumentElement();
NodeList children = root.getChildNodes();
// The child list also contains the whitespace between tags as Text nodes,
// so only Element children are processed.
for (int i = 0; i < children.getLength(); i++)
{
   Node child = children.item(i);
   if (child instanceof Element)
   {
      Element childElement = (Element) child;
      // Assumes each child element contains only text.
      Text textNode = (Text) childElement.getFirstChild();
      String text = textNode.getData().trim();
      if (childElement.getTagName().equals("name"))
         name = text;
      else if (childElement.getTagName().equals("size"))
         size = Integer.parseInt(text);
   }
}
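
For reference, the traversal above can be put into a self-contained form. This is a sketch rather than the original author's program: FontDomDemo and readFont are hypothetical names, the document is parsed from a string instead of a file, and getTextContent() is used in place of the cast to Text so that an empty element does not cause a failure.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FontDomDemo {
    // Hypothetical helper: reads <name> and <size> from a <font> document.
    static String readFont(String xml) {
        Element root;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        String name = null;
        int size = 0;
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) { // skips whitespace Text nodes
                Element childElement = (Element) child;
                String text = childElement.getTextContent().trim();
                if (childElement.getTagName().equals("name")) {
                    name = text;
                } else if (childElement.getTagName().equals("size")) {
                    size = Integer.parseInt(text);
                }
            }
        }
        return name + " " + size;
    }

    public static void main(String[] args) {
        String xml = "<font>\n   <name>Helvetica</name>\n   <size>36</size>\n</font>";
        System.out.println(readFont(xml));
    }
}
```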

SAX parser

SAXParserFactory factory = SAXParserFactory.newInstance(); 
SAXParser parser = factory.newSAXParser();
parser.parse(source, handler); 
Example 12-8. SAXTest.java

import java.io.*;
import java.net.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class SAXTest
{
   public static void main(String[] args) throws Exception
   {
      String url;
      if (args.length == 0)
      {
         url = "http://www.w3c.org";
         System.out.println("Using " + url);
      }
      else
         url = args[0];

      DefaultHandler handler = new DefaultHandler()
      {
         public void startElement(String namespaceURI,
            String lname, String qname, Attributes attrs)
         {
            if (lname.equalsIgnoreCase("a") && attrs != null)
            {
               for (int i = 0; i < attrs.getLength(); i++)
               {
                  String aname = attrs.getLocalName(i);
                  if (aname.equalsIgnoreCase("href"))
                     System.out.println(attrs.getValue(i));
               }
            }
         }
      };

      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      SAXParser saxParser = factory.newSAXParser();
      InputStream in = new URL(url).openStream();
      saxParser.parse(in, handler);
   }
}
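
The same handler logic can be tried without network access. The sketch below is an assumption-laden variant of the example, not the book's listing: HrefCollector and collectHrefs are hypothetical names, the input is an in-memory string, and the links are collected into a list instead of being printed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class HrefCollector {
    // Hypothetical helper: returns the href value of every <a> element, in document order.
    static List<String> collectHrefs(String xml) {
        List<String> hrefs = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String namespaceURI, String lname,
                                     String qname, Attributes attrs) {
                if (!lname.equalsIgnoreCase("a")) {
                    return;
                }
                for (int i = 0; i < attrs.getLength(); i++) {
                    if (attrs.getLocalName(i).equalsIgnoreCase("href")) {
                        hrefs.add(attrs.getValue(i));
                    }
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so that lname is filled in
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String xml = "<html><body>"
            + "<a href=\"http://example.com/one\">1</a>"
            + "<p><a href=\"two.html\">2</a></p>"
            + "</body></html>";
        collectHrefs(xml).forEach(System.out::println);
    }
}
```

Because the handler only reacts to events as they arrive, no tree of the whole document is ever built, which is the memory advantage of SAX mentioned earlier.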

To be continued...
