invalid byte 1 of 1-byte UTF-8 sequence

Original

2020-02-24 00:12

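When parsing an XML document with SAX, an "invalid byte 1 of 1-byte UTF-8 sequence" exception is thrown whenever the XML file contains Chinese characters. Debugging never revealed the cause, so I turned to the web, eventually tracked the problem down, and fixed it. Thanks to the wealth of resources online.

   The XML file declared UTF-8 in its header but was actually saved in a different encoding, so a file containing Chinese characters could not be read. Changing the declared encoding to "GB2312" let it parse normally: <?xml   version="1.0"   encoding="GB2312"?>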

My own summary:
1. Analysis and fix for the "org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence." exception:
Analysis:
The exception is thrown by the reader.read(file); statement below:
SAXReader reader = new SAXReader();
Document doc = reader.read(file);

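The cause of the exception:
The XML file being read is actually encoded in GBK or some other encoding, while its content declares utf-8 with <?xml version="1.0" encoding="utf-8"?>. The parser trusts the declaration, decodes the bytes as UTF-8, hits a byte sequence that is not valid UTF-8, and throws.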
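
To make the cause concrete, here is a minimal sketch that reproduces the mismatch. It uses the SAX parser bundled with the JDK rather than dom4j's SAXReader, so the exception type and exact message may differ from the one above; the class name EncodingMismatchDemo and the helper parses are made up for illustration, and it assumes the GBK charset is available in the running JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: the same XML text is parsed from different byte encodings.
public class EncodingMismatchDemo {

    // Returns true if the JDK's SAX parser accepts the raw bytes, false if parsing fails.
    public static boolean parses(byte[] xmlBytes) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xmlBytes), new DefaultHandler());
            return true;
        } catch (Exception e) {
            // Typically an IOException or SAXException complaining about an invalid byte.
            System.out.println("parse failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // \u4e2d\u6587 is a two-character Chinese sample, escaped to keep this source ASCII-only.
        String declaredUtf8 = "<?xml version=\"1.0\" encoding=\"utf-8\"?><root>\u4e2d\u6587</root>";
        String declaredGbk = "<?xml version=\"1.0\" encoding=\"GBK\"?><root>\u4e2d\u6587</root>";

        // Declaration and bytes agree (UTF-8): should be accepted.
        System.out.println(parses(declaredUtf8.getBytes(StandardCharsets.UTF_8)));
        // Declaration says utf-8 but the bytes are GBK: should be rejected.
        System.out.println(parses(declaredUtf8.getBytes(gbk)));
        // Declaration and bytes agree (GBK): should be accepted.
        System.out.println(parses(declaredGbk.getBytes(gbk)));
    }
}
```

The point of the sketch is only that the declaration and the actual bytes must agree; which of the two you change is a separate choice.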

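Note: according to the online article "The Ultimate Solution to Chinese-Character Problems in Java/J2EE", the root cause should be an encoding mismatch: the operating system's encoding is GBK while the XML declares utf-8. Any Java I/O step that falls back to the system default encoding (GBK) therefore disagrees with the declaration, an encoding conversion is needed, and garbled or rejected text is the natural result. Fix: make the file's encoding and the encoding used by the Java API that operates on the file the same.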

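Fix:
Case 1: the XML file is generated by dom4j.

Solution: use org.dom4j.io.XMLWriter xmlWriter = new org.dom4j.io.XMLWriter(
                    new FileOutputStream(fileName));
instead of
xmlWriter = new XMLWriter(new FileWriter(fileName));
so that the XML file is written as utf-8. A FileWriter encodes characters with the platform default charset (GBK here), whereas with a plain OutputStream the XMLWriter does the encoding itself, using utf-8 unless told otherwise.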
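
The difference between the two writers can be sketched with the JDK alone. In the example below, an OutputStreamWriter created with GBK stands in for a FileWriter on a GBK system; this is an assumption made so the effect is visible on any machine. The class WriterEncodingDemo and its helpers are hypothetical names, and no dom4j API is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: the same text written through writers with different charsets.
public class WriterEncodingDemo {

    public static final String XML =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?><root>\u4e2d\u6587</root>";

    // Encodes the text through a Writer bound to the given charset and returns the bytes.
    public static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, charset)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Strict check: a fresh decoder reports malformed input instead of replacing it.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Writer bound to UTF-8: the bytes match the utf-8 declaration.
        System.out.println(isValidUtf8(write(XML, StandardCharsets.UTF_8)));
        // Writer bound to GBK (stand-in for FileWriter on a GBK system): bytes are not valid UTF-8.
        System.out.println(isValidUtf8(write(XML, Charset.forName("GBK"))));
    }
}
```

Whichever writer is used, the bytes that reach the disk have to be in the encoding the XML declaration names.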

Detailed reference 1:
"A Thorough Fix for dom4j Encoding Problems", by lonsen
http://www.5inet.net/Develop/Java/036579,Dom4j_BianMaWenDiCheDeJieJue.aspx

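Case 2: reader.read() throws the exception while parsing an XML description that the user entered on a JSP page.

Solution:
Convert the XML content to utf-8 before calling read (use the overloads that accept an explicit encoding):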

public void validate(FacesContext context, UIComponent component, Object obj)
        throws ValidatorException {

    String xmldescription = (String) obj;
    // Encode explicitly as UTF-8 (java.nio.charset.StandardCharsets);
    // the no-argument getBytes() would use the platform default, e.g. GBK.
    byte[] bytes = xmldescription.getBytes(StandardCharsets.UTF_8);
    RelationXmlParser.isXmlOK("E://jiangcm//templateXMLSchema.xsd", bytes);
    ……
}

public static boolean isXmlOK(String xsdFile, byte[] tagetXml)
        throws SAXException, IOException, DocumentException {
    SAXReader reader = new SAXReader();
    ……
    InputStream in = new ByteArrayInputStream(tagetXml);
    // Decode the bytes as UTF-8, matching the encoding used to produce them.
    InputStreamReader utf8In = new InputStreamReader(in, "utf-8");
    ……
}

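My own fix: String.getBytes("utf-8"), which encodes the string as UTF-8 explicitly instead of relying on the platform default charset that the no-argument getBytes() uses.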
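
A small sketch of why the explicit charset matters: bytes only survive a trip through an InputStreamReader(in, "utf-8") if they were produced as UTF-8 in the first place. GBK is used here to simulate a platform default, and the class DecodeRoundTripDemo and the method decodeUtf8 are hypothetical names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encode a string, then decode the bytes as UTF-8.
public class DecodeRoundTripDemo {

    // Reads all characters from the bytes through a UTF-8 InputStreamReader.
    public static String decodeUtf8(byte[] bytes) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), "utf-8")) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>\u4e2d\u6587</root>";

        // Encoded and decoded with the same charset: the text comes back unchanged.
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(utf8Bytes).equals(xml)); // true

        // Encoded as GBK (simulating a GBK platform default) but decoded as UTF-8:
        // the Chinese characters are lost.
        byte[] gbkBytes = xml.getBytes(Charset.forName("GBK"));
        System.out.println(decodeUtf8(gbkBytes).equals(xml)); // false
    }
}
```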
