XML Summary

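1. XML is case-sensitive.

Every start tag needs a matching end tag, as in <p></p>; an element without content can instead be written as a single self-closing tag such as <img src=""/>.

Attribute values must be enclosed in quotation marks, either single or double.

Every attribute must have a value. The HTML shorthand <input type="radio" name="language" value="Java" checked>

is therefore an error in XML; write checked="true" or checked="checked" instead, and close the tag with />.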
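
The rules in point 1 can be checked mechanically by handing small fragments to the JDK's DOM parser. The following is a minimal sketch; the class name WellFormedCheck and the method isWellFormed are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class WellFormedCheck
{
   // Returns true if the JDK's DOM parser accepts the text as well-formed XML.
   static boolean isWellFormed(String xml)
   {
      try
      {
         DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         // DefaultHandler keeps the parser from printing errors to stderr;
         // fatal errors are still thrown as SAXException.
         builder.setErrorHandler(new DefaultHandler());
         builder.parse(new InputSource(new StringReader(xml)));
         return true;
      }
      catch (SAXException e)
      {
         return false; // the text is not well-formed
      }
      catch (IOException | ParserConfigurationException e)
      {
         throw new IllegalStateException(e);
      }
   }

   public static void main(String[] args)
   {
      String[] samples = {
         "<p>text</p>",          // matching tags
         "<p>text</P>",          // case mismatch between start and end tag
         "<img src=\"a.png\"/>", // self-closing tag, double quotes
         "<img src='a.png'/>",   // single quotes are allowed too
         "<img src=a.png/>",     // unquoted attribute value
         "<input checked/>"      // attribute without a value
      };
      for (String s : samples)
         System.out.println(isWellFormed(s) + "  " + s);
   }
}
```

Only the first, third, and fourth samples should be accepted; the others each break one of the rules listed above.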

 

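2. An XML document should start with a header (the XML declaration), either <?xml version="1.0"?> or <?xml version="1.0" encoding="UTF-8"?>. The header is optional, but when present it must be the very first thing in the document.
The header can be followed by a document type declaration (DTD), for example: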
<!DOCTYPE web-app PUBLIC
   "-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN"
   "http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">
The DTD is optional as well.
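
To see that both the header and the DTD are optional, the same element can be parsed with and without them. This is a sketch under a few assumptions: the class PrologDemo and its methods are illustrative names, and the DTD is written as a small internal subset rather than the web-app DTD above, so that the parser does not have to fetch anything over the network.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class PrologDemo
{
   // With header and an internal DTD subset (assumed for illustration).
   static final String FULL =
      "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
      + "<!DOCTYPE font [<!ELEMENT font (#PCDATA)>]>"
      + "<font>Helvetica</font>";

   // Neither header nor DTD.
   static final String BARE = "<font>Helvetica</font>";

   static Document parse(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   // Returns the name given in <!DOCTYPE ...>, or null if the document has no DTD.
   static String doctypeName(String xml)
   {
      DocumentType doctype = parse(xml).getDoctype();
      return doctype == null ? null : doctype.getName();
   }

   public static void main(String[] args)
   {
      System.out.println("DTD name with prolog:    " + doctypeName(FULL));
      System.out.println("DTD name without prolog: " + doctypeName(BARE));
      System.out.println("Root element in both:    "
         + parse(FULL).getDocumentElement().getTagName() + " / "
         + parse(BARE).getDocumentElement().getTagName());
   }
}
```

Both variants parse to the same root element; only getDoctype() reveals whether a DTD was present.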
 
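3. The XML body consists of a single root element that contains all other elements; the name of the root tag is up to you.
An element can contain both text and child elements (mixed content), as in: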
<font>
   Helvetica
   <size>36</size>
</font>
 

4. XML elements can contain attributes, such as

<size unit="pt">36</size>

 

5. As a rule of thumb, prefer elements to attributes.
Elements:
<font>
   <name>Helvetica</name>
   <size>36</size>
</font>

Attribute:

<font name="Helvetica" size="36"/>

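The problem shows up when a unit has to be added to the size:

<font name="Helvetica" size="36 pt"/>

Now the program that reads the document has to pick the string "36 pt" apart itself.
The element form is clearer, because the unit gets an attribute of its own: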
<font>
   <name>Helvetica</name>
   <size unit="pt">36</size>
</font>
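
With the element form, a program can read the value and the unit separately and never has to split a string such as "36 pt". A minimal sketch with the JDK's DOM API follows; the class FontSizeDemo and the method sizeWithUnit are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class FontSizeDemo
{
   static Document parse(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   // Returns the size followed by its unit, e.g. "36 pt".
   // The unit is left out if the size element has no unit attribute.
   static String sizeWithUnit(String xml)
   {
      Element root = parse(xml).getDocumentElement();
      Element size = (Element) root.getElementsByTagName("size").item(0);
      String value = size.getTextContent().trim();
      String unit = size.getAttribute("unit"); // empty string when the attribute is missing
      return unit.isEmpty() ? value : value + " " + unit;
   }

   public static void main(String[] args)
   {
      String xml = "<font>\n"
         + "   <name>Helvetica</name>\n"
         + "   <size unit=\"pt\">36</size>\n"
         + "</font>";
      System.out.println(sizeWithUnit(xml));
   }
}
```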
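6. Character references and entity references
·         &#233; é (decimal character reference)
·         &#x2122; ™ (hexadecimal character reference)
·         &lt; less-than sign (<)
·         &gt; greater-than sign (>)
·         &amp; ampersand (&)
·         &quot; double quotation mark (")
·         &apos; apostrophe (')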
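
A parser resolves these references before the application sees the text. The sketch below parses one element that uses all seven references from the list and returns the decoded text; EntityDemo and textOf are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class EntityDemo
{
   static final String XML =
      "<note>caf&#233; &#x2122; &lt;b&gt; &amp; &quot;tea&quot; &apos;n&apos;</note>";

   // Returns the text content of the root element with all references resolved.
   static String textOf(String xml)
   {
      try
      {
         Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
         return doc.getDocumentElement().getTextContent();
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   public static void main(String[] args)
   {
      // The decoded text contains the literal characters, not the references.
      System.out.println(textOf(XML));
   }
}
```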
 

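7. CDATA sections are delimited by <![CDATA[ and ]]>. Special characters inside a CDATA section are treated as character data, not as markup,

so the content is passed through verbatim. The only restriction is that a CDATA section cannot contain the string ]]>.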
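
A common workaround for the ]]> restriction is to end the section in the middle of that string and open a new one right away. The sketch below assumes that approach; CDataDemo, toCData, and textOf are illustrative names. Parsing the result gives the original text back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class CDataDemo
{
   // Wraps text in CDATA. Any "]]>" is split so that "]]" closes one section
   // and ">" starts the next one.
   static String toCData(String text)
   {
      return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
   }

   // Returns the text content of the root element.
   static String textOf(String xml)
   {
      try
      {
         Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
         // getTextContent() concatenates text nodes and CDATA sections.
         return doc.getDocumentElement().getTextContent();
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   public static void main(String[] args)
   {
      String code = "if (a < b && c[d[0]]>e) { }"; // contains <, & and ]]>
      String xml = "<code>" + toCData(code) + "</code>";
      System.out.println(xml);
      System.out.println(textOf(xml).equals(code));
   }
}
```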

 

8. Processing instructions are delimited by <? and ?>, for example:

·                <?xml-stylesheet href="mystyle.css" type="text/css"?>
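
In the DOM, a processing instruction shows up as a ProcessingInstruction node with a target (the name after <?) and a data string (everything else). The sketch below looks for the xml-stylesheet instruction among the top-level nodes; PiDemo and stylesheetData are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class PiDemo
{
   static final String XML =
      "<?xml version=\"1.0\"?>"
      + "<?xml-stylesheet href=\"mystyle.css\" type=\"text/css\"?>"
      + "<doc/>";

   // Returns the data of the first xml-stylesheet instruction, or null if there is none.
   static String stylesheetData(String xml)
   {
      Document doc;
      try
      {
         doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
      // Instructions placed before the root element are children of the document node.
      NodeList nodes = doc.getChildNodes();
      for (int i = 0; i < nodes.getLength(); i++)
      {
         Node node = nodes.item(i);
         if (node instanceof ProcessingInstruction)
         {
            ProcessingInstruction pi = (ProcessingInstruction) node;
            if (pi.getTarget().equals("xml-stylesheet"))
               return pi.getData();
         }
      }
      return null;
   }

   public static void main(String[] args)
   {
      System.out.println(stylesheetData(XML));
   }
}
```

The <?xml version="1.0"?> header looks similar but is not reported as a processing instruction.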

9. Comments are delimited by <!-- and -->, for example:

·                <!-- This is a comment. -->
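
By default the JDK's DOM parser keeps comments as Comment nodes, while the text content of an element leaves them out. A small sketch, with CommentDemo and its methods as illustrative names:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class CommentDemo
{
   static final String XML = "<font><!-- This is a comment. -->Helvetica</font>";

   static Element root(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   // Returns the text of the first comment directly inside the root element, or null.
   static String firstComment(String xml)
   {
      NodeList children = root(xml).getChildNodes();
      for (int i = 0; i < children.getLength(); i++)
      {
         Node child = children.item(i);
         if (child instanceof Comment)
            return ((Comment) child).getData();
      }
      return null;
   }

   public static void main(String[] args)
   {
      System.out.println("comment: [" + firstComment(XML) + "]");
      System.out.println("text:    [" + root(XML).getTextContent() + "]");
   }
}
```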

 

 

 

Parsing XML Documents

 

1. The Java library supplies two kinds of XML parsers:

·         The Document Object Model (DOM) parser reads an XML document into a tree structure. For a large file, building the tree takes up a lot of memory.

·         The Simple API for XML (SAX) parser generates events as it reads an XML document.

SAX suits large documents, or cases where only some of the elements in the document matter, because it uses little memory.

DOM parser

The DOM parser interface is standardized by the W3C. The standard defines interfaces such as Document, Element, and Node, and each vendor's parser implements them.

2. The Sun Java API for XML Processing (JAXP) library actually makes it possible to plug in any of these parsers. But Sun also includes its own DOM parser in the Java SDK.

Using the parser that ships with the SDK looks like this:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

 

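// parse() accepts a File, a URI string, or an InputStream. Use one of the following:
File f = . . .
Document doc = builder.parse(f);

URL u = . . .
Document doc = builder.parse(u.toString());   // there is no parse(URL) overload

InputStream in = . . .
Document doc = builder.parse(in);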
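
// Walk the children of the root element and pick out <name> and <size>.
String name = null;
int size = 0;
Element root = doc.getDocumentElement();
// getChildNodes() also returns the whitespace text nodes between the child elements.
NodeList children = root.getChildNodes();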
for (int i = 0; i < children.getLength(); i++)
{
   Node child = children.item(i);
   if (child instanceof Element)
   {
      Element childElement = (Element) child;
      Text textNode = (Text) childElement.getFirstChild();
      String text = textNode.getData().trim();
      if (childElement.getTagName().equals("name"))
         name = text;
      else if (childElement.getTagName().equals("size"))
        size = Integer.parseInt(text);
   }
}
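
The instanceof Element test in the loop matters because the whitespace between tags is reported as text nodes. The following self-contained sketch parses the font document from a string and shows the difference; DomFontDemo and its methods are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original notes.
class DomFontDemo
{
   static final String XML =
      "<font>\n"
      + "   <name>Helvetica</name>\n"
      + "   <size>36</size>\n"
      + "</font>";

   static Element root(String xml)
   {
      try
      {
         return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
   }

   // Counts only the children that are elements, skipping whitespace text nodes.
   static int countChildElements(Element parent)
   {
      int count = 0;
      NodeList children = parent.getChildNodes();
      for (int i = 0; i < children.getLength(); i++)
         if (children.item(i) instanceof Element)
            count++;
      return count;
   }

   // Returns the trimmed text of the first child element with the given tag, or null.
   static String childText(Element parent, String tag)
   {
      NodeList children = parent.getChildNodes();
      for (int i = 0; i < children.getLength(); i++)
      {
         Node child = children.item(i);
         if (child instanceof Element && ((Element) child).getTagName().equals(tag))
            return child.getTextContent().trim();
      }
      return null;
   }

   public static void main(String[] args)
   {
      Element font = root(XML);
      System.out.println("child nodes:    " + font.getChildNodes().getLength());
      System.out.println("child elements: " + countChildElements(font));
      String name = childText(font, "name");
      int size = Integer.parseInt(childText(font, "size"));
      System.out.println(name + " " + size);
   }
}
```

Without a DTD the parser cannot tell ignorable whitespace from real text, so it keeps every text node and leaves the filtering to the application.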

SAX parser

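SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.parse(source, handler);

Here source can be a File, an InputStream, an InputSource, or a URI string, and handler is a DefaultHandler subclass whose callback methods receive the parsing events.

Example 12-8. SAXTest.java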

import java.io.*;
import java.net.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * Prints the href attribute of every <a> element in a page.
 * The page must be well-formed XML (for example XHTML), otherwise parsing fails.
 */
public class SAXTest
{
   public static void main(String[] args) throws Exception
   {
      String url;
      if (args.length == 0)
      {
         url = "http://www.w3c.org";
         System.out.println("Using " + url);
      }
      else
         url = args[0];

      DefaultHandler handler = new DefaultHandler()
      {
         @Override
         public void startElement(String namespaceURI,
            String lname, String qname, Attributes attrs)
         {
            // lname is the local name; it is only filled in when the parser is namespace-aware.
            if (lname.equalsIgnoreCase("a") && attrs != null)
            {
               for (int i = 0; i < attrs.getLength(); i++)
               {
                  String aname = attrs.getLocalName(i);
                  if (aname.equalsIgnoreCase("href"))
                     System.out.println(attrs.getValue(i));
               }
            }
         }
      };

      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      SAXParser saxParser = factory.newSAXParser();
      // try-with-resources closes the stream even if parsing fails.
      try (InputStream in = new URL(url).openStream())
      {
         saxParser.parse(in, handler);
      }
   }
}
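
SAXTest needs a network connection and a well-formed page. For experimenting offline, the same handler idea can be run against a string. The sketch below is such a variant, not the book's listing; SaxLinkLister and hrefs are illustrative names, and the links are collected in a list instead of being printed by the handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class SaxLinkLister
{
   // Returns the href value of every <a> element, in document order.
   static List<String> hrefs(String xml)
   {
      final List<String> found = new ArrayList<>();
      DefaultHandler handler = new DefaultHandler()
      {
         @Override
         public void startElement(String namespaceURI,
            String lname, String qname, Attributes attrs)
         {
            // Called once per start tag; nothing is kept in memory except the matches.
            if (lname.equalsIgnoreCase("a"))
            {
               String href = attrs.getValue("href"); // null if the attribute is absent
               if (href != null)
                  found.add(href);
            }
         }
      };
      try
      {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setNamespaceAware(true); // needed so that lname is filled in
         SAXParser parser = factory.newSAXParser();
         parser.parse(new InputSource(new StringReader(xml)), handler);
      }
      catch (Exception e)
      {
         throw new IllegalArgumentException("Could not parse XML", e);
      }
      return found;
   }

   public static void main(String[] args)
   {
      String page = "<html><body>"
         + "<a href=\"http://a.example/\">A</a>"
         + "<p><a href=\"b.html\">B</a></p>"
         + "<a name=\"x\"/>"
         + "</body></html>";
      for (String href : hrefs(page))
         System.out.println(href);
   }
}
```

Because the handler sees one start tag at a time, memory use stays flat no matter how large the input is, which is the trade-off against DOM described earlier.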

To be continued...
