A Summary of Common Encoding Methods for Data Transfer in Java

Original

2020-06-17 10:43

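During development you often run into this situation: when one system submits data to another, it may send it as JSON, or it may first write it into a form on an HTML page and submit it from there. This raises a few problems. For example, the JSON may contain strings that look like XSS attacks, or characters that HTML would interpret as tags.

In that case the submitted string needs to be encoded, and the receiver then decodes it.

The common encoding methods are summarized below: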

1. Unicode escaping

        // import org.apache.commons.lang.StringEscapeUtils;

        String test = "我是中国人()（），,<br>\",.<script><tablke>";

        // escapeJava rewrites non-ASCII characters as backslash-u escapes and escapes the double quote
        String encoderJava = StringEscapeUtils.escapeJava(test);
        String unencoderJava = StringEscapeUtils.unescapeJava(encoderJava);
        System.out.println(encoderJava);
        System.out.println(unencoderJava);
        System.out.println("---------------------------------------------------------");



Result:
\u6211\u662F\u4E2D\u56FD\u4EBA()\uFF08\uFF09\uFF0C,<br>\",.<script><tablke>
我是中国人()（），,<br>",.<script><tablke>
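
For readers without commons-lang on the classpath, the same idea can be sketched with the JDK alone. This is only an illustration, not how StringEscapeUtils is implemented: the class name `UnicodeEscapeSketch` and its methods are made up for this example, and control characters are written as \uXXXX here rather than as `\n` or `\t`. The sample string is spelled with \u escapes so the source file stays pure ASCII.

```java
final class UnicodeEscapeSketch {

    // Escapes backslash and double quote with a leading backslash, and every
    // character outside printable ASCII as a backslash-u escape with four hex digits.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' || c == '"') {
                out.append('\\').append(c);
            } else if (c >= 0x20 && c < 0x7F) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    // Reverses escape(); malformed hex digits raise NumberFormatException.
    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                char next = input.charAt(i + 1);
                if (next == 'u' && i + 6 <= input.length()) {
                    out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                    i += 6;
                    continue;
                }
                // Escaped backslash or quote: emit the escaped character itself.
                out.append(next);
                i += 2;
                continue;
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String test = "\u6211\u662F\u4E2D\u56FD\u4EBA<br>\"";
        String escaped = escape(test);
        System.out.println(escaped);
        System.out.println(unescape(escaped).equals(test));
    }
}
```

Because the escaped form is plain ASCII, it survives any transport charset, but the receiver must always run the reverse step before using the value.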

2. Base64 and Base32

       
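        // import org.apache.commons.codec.binary.Base64;
        // import java.nio.charset.StandardCharsets;

        // Name the charset explicitly so the result does not depend on the platform default
        byte[] base64 = Base64.encodeBase64(test.getBytes(StandardCharsets.UTF_8));
        String base64Str = new String(base64, StandardCharsets.US_ASCII);
        byte[] decodeBase64 = Base64.decodeBase64(base64Str);
        String decodeBase64Str = new String(decodeBase64, StandardCharsets.UTF_8);
        System.out.println(base64Str);
        System.out.println(decodeBase64Str);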


Result:
5oiR5piv5Lit5Zu95Lq6KCnvvIjvvInvvIwsPGJyPiIsLjxzY3JpcHQ+PHRhYmxrZT4=
我是中国人()（），,<br>",.<script><tablke>

Base32 is similar.
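
Since Java 8 the JDK ships its own Base64 codec, so the Base64 case can also be written without commons-codec. The sketch below assumes UTF-8 on both sides; the class name `Base64Sketch` and its method names are invented for this example. It also shows the URL-safe alphabet, which may be convenient when the encoded value ends up in a query string or form field.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Sketch {

    // Standard alphabet with '=' padding.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), without padding.
    static String encodeUrlSafe(String text) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decodeUrlSafe(String base64) {
        return new String(Base64.getUrlDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String test = "\u6211\u662F\u4E2D\u56FD\u4EBA<script>";
        String encoded = encode(test);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(test));
        System.out.println(encodeUrlSafe(test));
    }
}
```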

3. Encoding that escapes only HTML characters

Two packages provide HTML escaping, but they behave very differently:

org.apache.commons.lang.StringEscapeUtils and org.springframework.web.util.HtmlUtils.

First, StringEscapeUtils:

        String encoderHtml = StringEscapeUtils.escapeHtml(test);
        String unencoderHtml = StringEscapeUtils.unescapeHtml(encoderHtml);
        System.out.println(encoderHtml);
        System.out.println(unencoderHtml);
        System.out.println("---------------------------------------------------------");


Result:
&#25105;&#26159;&#20013;&#22269;&#20154;()&#65288;&#65289;&#65292;,&lt;br&gt;&quot;,.&lt;script&gt;&lt;tablke&gt;
我是中国人()（），,<br>",.<script><tablke>

Now HtmlUtils:

        // import org.springframework.web.util.HtmlUtils;

        String springEncodeHtml = HtmlUtils.htmlEscape(test);
        String unspringEncodeHtml = HtmlUtils.htmlUnescape(springEncodeHtml);
        System.out.println(springEncodeHtml);
        System.out.println(unspringEncodeHtml);


Result:

我是中国人()（），,&lt;br&gt;&quot;,.&lt;script&gt;&lt;tablke&gt;
我是中国人()（），,<br>",.<script><tablke>
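
The difference between the two outputs above comes down to whether non-ASCII characters are also rewritten as numeric entities. A minimal JDK-only sketch of both styles follows. It is not the code of either library: `HtmlEscapeSketch` and the `escapeNonAscii` flag are hypothetical names, and escaping the apostrophe as `&#39;` is a choice made for this example.

```java
final class HtmlEscapeSketch {

    // Escapes the HTML markup characters. When escapeNonAscii is true, every
    // code point above 0x7F is additionally written as a decimal numeric entity.
    static String escape(String input, boolean escapeNonAscii) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:
                    if (escapeNonAscii && cp > 0x7F) {
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String test = "\u6211<br>\"";
        // Markup characters only
        System.out.println(escape(test, false));
        // Markup characters plus numeric entities for non-ASCII
        System.out.println(escape(test, true));
    }
}
```

The markup-only style keeps the text readable and short; the numeric-entity style produces pure ASCII at the cost of length.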

4. URL encoding

This uses the commons-codec package.

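        // import org.apache.commons.codec.net.URLCodec;
        // import java.util.BitSet;

        // An empty BitSet marks no byte as safe, so every byte is percent-encoded
        byte[] url = URLCodec.encodeUrl(new BitSet(), test.getBytes(StandardCharsets.UTF_8));
        byte[] decodeUrl = URLCodec.decodeUrl(url);
        String urlStr = new String(url, StandardCharsets.US_ASCII);
        String decodeUrlStr = new String(decodeUrl, StandardCharsets.UTF_8);
        System.out.println(urlStr);
        System.out.println(decodeUrlStr);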



Result:
%E6%88%91%E6%98%AF%E4%B8%AD%E5%9B%BD%E4%BA%BA%28%29%EF%BC%88%EF%BC%89%EF%BC%8C%2C%3C%62%72%3E%22%2C%2E%3C%73%63%72%69%70%74%3E%3C%74%61%62%6C%6B%65%3E
我是中国人()（），,<br>",.<script><tablke>
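
The JDK offers `java.net.URLEncoder` and `java.net.URLDecoder` for the same purpose. The sketch below is an alternative to the commons-codec call, not an equivalent: URLEncoder follows the form-encoding rules, so ASCII letters, digits and `.-*_` stay as they are and a space becomes `+`. The wrapper class `UrlEncodeSketch` is a name invented for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class UrlEncodeSketch {

    // Form-style percent-encoding of the UTF-8 bytes of the text.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String test = "\u6211\u662F\u4E2D\u56FD\u4EBA <br>\"";
        String encoded = encode(test);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(test));
    }
}
```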

5. Other encodings (http://commons.apache.org/proper/commons-codec)

commons-codec provides many other encodings as well; here is one of them.

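        // import org.apache.commons.codec.net.QuotedPrintableCodec;
        // import java.util.BitSet;

        // As in the URL example, the empty BitSet forces every byte to be escaped
        byte[] qByte = QuotedPrintableCodec.encodeQuotedPrintable(new BitSet(), test.getBytes(StandardCharsets.UTF_8));
        byte[] dByte = QuotedPrintableCodec.decodeQuotedPrintable(qByte);
        String qByteStr = new String(qByte, StandardCharsets.US_ASCII);
        String dByteStr = new String(dByte, StandardCharsets.UTF_8);
        System.out.println(qByteStr);
        System.out.println(dByteStr);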


Result:
=E6=88=91=E6=98=AF=E4=B8=AD=E5=9B=BD=E4=BA=BA=28=29=EF=BC=88=EF=BC=89=EF=BC=8C=2C=3C=62=72=3E=22=2C=2E=3C=73=63=72=69=70=74=3E=3C=74=61=62=6C=6B=65=3E
我是中国人()（），,<br>",.<script><tablke>
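
Quoted-printable is close to percent-encoding, with `=` as the escape character in place of `%`. The following JDK-only sketch escapes every byte, which mirrors what passing an empty BitSet does above; a full encoder would leave printable ASCII untouched and insert soft line breaks, both of which are left out here. `QuotedPrintableSketch` is a hypothetical name, and UTF-8 is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class QuotedPrintableSketch {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Writes every UTF-8 byte as '=' followed by two uppercase hex digits.
    static String encodeAll(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder(bytes.length * 3);
        for (byte b : bytes) {
            int v = b & 0xFF;
            out.append('=').append(HEX[v >>> 4]).append(HEX[v & 0x0F]);
        }
        return out.toString();
    }

    // Decodes "=XX" sequences and copies other ASCII characters through.
    // Soft line breaks are not handled in this sketch.
    static String decode(String encoded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '=' && i + 2 < encoded.length()) {
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String test = "\u6211<br>";
        String encoded = encodeAll(test);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(test));
    }
}
```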

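Of the five approaches above, all meet my needs except the first (Unicode escaping) and the HtmlUtils variant of the third. The shortest encoded output comes from the second (Base64) and the StringEscapeUtils variant of the third.

One more question: I would like an encoding that the receiver can use directly once it accepts the request, without an explicit decoding step. Is there one?

org.springframework.web.util.HtmlUtils escapes only the HTML special characters. When the escaped text is written into a page and later submitted to the backend, the backend reads it with request.getParameter and does not need to decode it, because the browser decodes the entities automatically. This is similar in principle to the way a browser automatically encodes and decodes Chinese characters in a URL.

For example: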

jdjg<input id =""> becomes jdjg&lt;input id =&quot;&quot;&gt; after escaping, and the HTML source then looks like this:

 value="jdjg&lt;input id =&quot;&quot;&gt;" >

The page displays the original text in the field, and the value the backend actually receives is:

jdjg<input id ="">
