Understanding Encoding Conversion

Original

2018-08-30 12:10

A brief introduction to character encodings

import java.io.IOException;
import java.io.UnsupportedEncodingException;

public class CharCode {

    public static void main(String[] args) {
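        // Simplified characters are used here because GB2312 has no mapping for the traditional form 國.
        String strChina = "中国";
        /**
        * Print the Unicode value of each character.
        * Unicode is a very large character set, with room for more than one million code points.
        * Every symbol has its own code point: U+0639 is the Arabic letter Ain,
        * U+0041 is the Latin capital letter A, and the Chinese character 汉 is U+6C49.
        */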
        for (int i = 0; i < strChina.length(); i++) {
            System.out.print(strChina.charAt(i) + " (Unicode value in hexadecimal): ");
            System.out.println(Integer.toHexString((int) strChina.charAt(i)));
        }
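        /**
        * Print the GB2312 (Chinese national standard) bytes of each character.
        * China's standards body published GB2312 as the encoding for simplified Chinese.
        * GB2312 represents one Chinese character with two bytes (8 bits each),
        * so two bytes could in theory distinguish at most 256 x 256 = 65536 values;
        * the standard itself defines far fewer (6,763 Chinese characters).
        * Characters outside GB2312, such as traditional forms, are encoded as 3f ('?').
        */
        try {
            byte[] buf = strChina.getBytes("gb2312");
            for (int i = 0; i < buf.length; i++) {
                // Mask with 0xFF so a negative byte prints as two hex digits, not ffffffXX.
                System.out.println("GB2312 byte in hexadecimal: " + Integer.toHexString(buf[i] & 0xFF));
            }
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }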
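        /**
        * UTF-8
        * UTF-8 was introduced to store Unicode text more compactly.
        * It is a variable-length encoding: the number of bytes depends on the code point.
        * An English letter, for example, needs only 1 byte.
        * Taking 汉 as an example: its code point is U+6C49 (U+00006C49 written in full),
        * and passing it through a UTF-8 encoder yields the three bytes E6 B1 89.
        */
        try {
            byte[] buf = strChina.getBytes("utf-8");
            for (int i = 0; i < buf.length; i++) {
                // Mask with 0xFF so a negative byte prints as two hex digits, not ffffffXX.
                System.out.println("UTF-8 byte in hexadecimal: " + Integer.toHexString(buf[i] & 0xFF));
            }
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }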
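        /**
        * ASCII
        * American Standard Code for Information Interchange. ASCII itself is a 7-bit code
        * with 128 characters (letters, digits, punctuation, control characters and other symbols);
        * 8-bit extensions of it can assign values to at most 256 characters.
        * The codes for letters and digits are easy to remember:
        * given one letter or digit (for example, A is 65 and 0 is 48)
        * and the fact that upper-case and lower-case letters differ by 32,
        * the codes of the remaining letters and digits can be worked out.
        * Chinese characters have no ASCII mapping, so each one is encoded as 3f ('?').
        */
        try {
            byte[] buf = strChina.getBytes("ASCII");
            for (int i = 0; i < buf.length; i++) {
                System.out.println("ASCII byte in hexadecimal: " + Integer.toHexString(buf[i] & 0xFF));
            }
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }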

        //System.setProperty(key, value) // sets a system property
        System.getProperties().list(System.out); // print the runtime system properties

        System.out.println("Enter a Chinese string and press Enter to print its Unicode values");
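        byte[] b = new byte[1024];
        String strInfo = null;
        int post = 0;
        int ch = 0;
        while (true) {
            try {
                ch = System.in.read();
            } catch (IOException e) {
                e.printStackTrace();
                return;
            }
            //System.out.println(Integer.toHexString(ch));
            switch (ch) {
            case '\r':
                // Ignore the carriage return of a Windows line ending.
                break;
            case '\n':
            case -1:
                // End of line or end of input: decode the bytes collected so far.
                // This assumes the console sends GB2312-encoded bytes.
                try {
                    strInfo = new String(b, 0, post, "gb2312");
                } catch (UnsupportedEncodingException e) {
                    e.printStackTrace();
                    return;
                }
                for (int i = 0; i < strInfo.length(); i++) {
                    System.out.println(Integer.toHexString(strInfo.charAt(i)));
                }
                return; // a break here would only leave the switch, not the loop
            default:
                if (post < b.length) { // drop input beyond the buffer size
                    b[post++] = (byte) ch;
                }
            }
        }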

    }

}
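
The UTF-8 comment in the listing gives the result for U+6C49 (E6B189) but not the steps that produce it. The following is a minimal sketch of those steps for the three-byte range only. The class name `Utf8ByHand` and the helpers `encodeThreeByte` and `toHex` are made up for this illustration; the library call used for comparison is the standard `String.getBytes(Charset)`.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByHand {

    // Encodes one code point in the range U+0800..U+FFFF (surrogates excluded)
    // into its three UTF-8 bytes. Other ranges are outside this sketch.
    static byte[] encodeThreeByte(int codePoint) {
        boolean surrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
        if (codePoint < 0x0800 || codePoint > 0xFFFF || surrogate) {
            throw new IllegalArgumentException("only the three-byte range is handled here");
        }
        return new byte[] {
            (byte) (0xE0 | (codePoint >> 12)),          // 1110xxxx: top 4 bits
            (byte) (0x80 | ((codePoint >> 6) & 0x3F)),  // 10xxxxxx: middle 6 bits
            (byte) (0x80 | (codePoint & 0x3F))          // 10xxxxxx: low 6 bits
        };
    }

    // Formats bytes as lower-case hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] manual = encodeThreeByte(0x6C49);
        byte[] library = "\u6C49".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(manual));   // prints e6b189
        System.out.println(toHex(library));  // prints e6b189
    }
}
```

Both lines print the same value because the manual bit-packing follows the same layout the library encoder uses for this range.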

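
The ASCII comment in the listing says that the remaining letter and digit codes can be derived from A = 65, 0 = 48, and the case offset of 32. A small sketch of that arithmetic follows; the class `AsciiArithmetic` and its method names are hypothetical and exist only for this example.

```java
class AsciiArithmetic {

    // Returns the ASCII code of the n-th upper-case letter, counting A as 0.
    static int codeOfUpper(int n) {
        return 65 + n;
    }

    // Returns the ASCII code of a decimal digit 0..9.
    static int codeOfDigit(int digit) {
        return 48 + digit;
    }

    // Converts an ASCII upper-case letter to lower case by adding 32.
    static char toLower(char upper) {
        if (upper < 'A' || upper > 'Z') {
            throw new IllegalArgumentException("expected an ASCII upper-case letter");
        }
        return (char) (upper + 32);
    }

    public static void main(String[] args) {
        System.out.println(codeOfUpper(2));        // code of C: prints 67
        System.out.println((int) toLower('C'));    // code of c: prints 99
        System.out.println(codeOfDigit(7));        // code of 7: prints 55
    }
}
```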