dom4j如何獲取節點的行數,列數

在使用dom4j解析xml文件時,可能會對一些節點做檢測,判斷是否符合schema,對一些不符合的節點要作出提示。爲了使作出的提示更友好,還需要指出錯誤在哪裏。但是dom4j並沒有提供相關的功能,或者說這個功能隱藏的很深。搜索了一下,發現這個問題的答案很少。我在一個mail(https://www.mail-archive.com/[email protected]/msg02769.html)中得到了提示,摸索了出來,巨麻煩。該mail提到了主要的過程:

org.xml.sax.Locator locator = new …;
DocumentFactory documentFactory = new DocumentFactoryWithLocator(locator);
SAXContentHandler contentHandler = new SAXContentHandler(documentFactory);
contentHandler.setDocumentLocator(locator);
org.xml.sax.XMLReader reader = …;
reader.setContentHandler(contentHandler );
reader.parse(…);
Document document = contentHandler.getDocument();


在DocumentFactory中:

public Element createElement(QName qname) {
	ElementWithLocation element =  new ElementWithLocation (qname);
    element.setLocation(locator.getLineNumber(),
	locator.getColumnNumber());
    return element;
}


方向是對的,但是按照這個方法做出來,卻得不到正確的結果。原因是,SAXReader中的XMLReader在調用parse方法時,會重新賦值locator,把我傳進去的locator覆蓋掉了,這樣在DocumentFactory就不是XMLReader中的locator。dom4j使用的XMLReader是org.apache.xerces中的SAXParser,該類繼承了AbstractSAXParser,在AbstractSAXParser中有一個方法:

public void startDocument(XMLLocator locator, String encoding, 
                              NamespaceContext namespaceContext, Augmentations augs)
        throws XNIException {
        
        fNamespaceContext = namespaceContext;

        try {
            // SAX1
            if (fDocumentHandler != null) {
                if (locator != null) {
                    fDocumentHandler.setDocumentLocator(new LocatorProxy(locator));
                }
                fDocumentHandler.startDocument();
            }

            // SAX2
            if (fContentHandler != null) {
                if (locator != null) {
                    fContentHandler.setDocumentLocator(new LocatorProxy(locator));
                }
                fContentHandler.startDocument();
            }
        }
        catch (SAXException e) {
            throw new XNIException(e);
        }

    }

就是這個方法覆蓋了contentHandler的locator。我的做法就是在AbstractSAXParser重新賦值locator的時候獲取這個值,傳遞給DocumentFactory。

上代碼,首先需要擴展原來的Element類,使之可以記錄節點的位置信息(需要記錄Attribute等的同理)。

public class GokuElement extends DefaultElement {
	private int lineNum = 0, colNum = 0;

	public GokuElement(QName qname) {
		super(qname);
		// TODO Auto-generated constructor stub
	}

	public GokuElement(QName qname, int attrCount) {
		super(qname, attrCount);
	}

	public GokuElement(String name) {
		super(name);
	}

	public GokuElement(String name, Namespace namespace) {
		super(name, namespace);
	}

	public int getColumnNumber() {
		return this.colNum;
	}

	public int getLineNumber() {
		return this.lineNum;
	}

	public void setLocation(int lineNum, int colNum) {
		this.lineNum = lineNum;
		this.colNum = colNum;
	}
}

然後,擴展DocumentFactory,讓factory生成我們定義的Element:

public class DocumentFactoryWithLocator extends DocumentFactory {
	private Locator locator;

	public DocumentFactoryWithLocator(Locator locator) {
		super();
		this.locator = locator;
	}

	@Override
	public Element createElement(QName qname) {
		GokuElement element = new GokuElement(qname);
		element.setLocation(this.locator.getLineNumber(), this.locator.getColumnNumber());
		return element;
	}

	@Override
	public Element createElement(String name) {
		GokuElement element = new GokuElement(name);
		element.setLocation(this.locator.getLineNumber(), this.locator.getColumnNumber());
		return element;
	}

	public void setLocator(Locator locator) {
		this.locator = locator;
	}
}

然後擴展SAXContentHandler:

public class GokuSAXContentHandler extends SAXContentHandler {
	private DocumentFactoryWithLocator documentFactory = null;

	public GokuSAXContentHandler(DocumentFactory documentFactory2, ElementHandler dispatchHandler) {
		// TODO Auto-generated constructor stub
		super(documentFactory2, dispatchHandler);
	}

	public void setDocFactory(DocumentFactoryWithLocator fac) {
		this.documentFactory = fac;
	}

	@Override
	public void setDocumentLocator(Locator documentLocator) {
		super.setDocumentLocator(documentLocator);
		if (this.documentFactory != null)
			this.documentFactory.setLocator(documentLocator);
	}
}

最後擴展SAXReader

public class GokuSAXReader extends SAXReader {
	DocumentFactory docFactory;
	Locator locator;

	public GokuSAXReader(DocumentFactory docFactory) {
		// TODO Auto-generated constructor stub
		super(docFactory);
		this.docFactory = docFactory;
	}

	public GokuSAXReader(DocumentFactory docFactory, Locator locator) {
		// TODO Auto-generated constructor stub
		super(docFactory);
		this.locator = locator;
		this.docFactory = docFactory;
	}

	@Override
	protected SAXContentHandler createContentHandler(XMLReader reader) {
		return new GokuSAXContentHandler(this.getDocumentFactory(), super.getDispatchHandler());
	}

	@Override
	public Document read(InputSource in) throws DocumentException {
		try {
			XMLReader reader = this.getXMLReader();

			reader = this.installXMLFilter(reader);

			EntityResolver thatEntityResolver = super.getEntityResolver();

			if (thatEntityResolver == null) {
				thatEntityResolver = this.createDefaultEntityResolver(in.getSystemId());
				super.setEntityResolver(thatEntityResolver);
			}

			reader.setEntityResolver(thatEntityResolver);

			SAXContentHandler contentHandler = this.createContentHandler(reader);
			contentHandler.setEntityResolver(thatEntityResolver);
			contentHandler.setInputSource(in);

			boolean internal = this.isIncludeInternalDTDDeclarations();
			boolean external = this.isIncludeExternalDTDDeclarations();

			contentHandler.setIncludeInternalDTDDeclarations(internal);
			contentHandler.setIncludeExternalDTDDeclarations(external);
			contentHandler.setMergeAdjacentText(this.isMergeAdjacentText());
			contentHandler.setStripWhitespaceText(this.isStripWhitespaceText());
			contentHandler.setIgnoreComments(this.isIgnoreComments());
			reader.setContentHandler(contentHandler);

			this.configureReader(reader, contentHandler);
			((GokuSAXContentHandler) contentHandler).setDocFactory((DocumentFactoryWithLocator) this.docFactory);
			contentHandler.setDocumentLocator(this.locator);
			reader.parse(in);

			return contentHandler.getDocument();
		} catch (Exception e) {
			if (e instanceof SAXParseException) {
				// e.printStackTrace();
				SAXParseException parseException = (SAXParseException) e;
				String systemId = parseException.getSystemId();

				if (systemId == null) {
					systemId = "";
				}

				String message = "Error on line " + parseException.getLineNumber() + " of document " + systemId + " : "
						+ parseException.getMessage();

				throw new DocumentException(message, e);
			} else {
				throw new DocumentException(e.getMessage(), e);
			}
		}
	}
}
重寫了read(InputSource)方法,使用我們寫的SAXContentHandler。其他簽名的read方法最後都調用了這個read方法。


解析文件的代碼:

Locator locator = new LocatorImpl();
DocumentFactory docFactory = new DocumentFactoryWithLocator(locator);
SAXReader reader = new GokuSAXReader(docFactory, locator);
Document doc = reader.read(new File("goku.xml"));

需要獲取Element信息時:

System.out.println(((GokuElement) element).getLineNumber());


發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章