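Oracle character sets and storage space used by character types

After reading Master Le's (leshami) new post today, I ran the experiment myself:

Oracle character sets and storage space used by character types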

 http://blog.csdn.net/leshami/article/details/51416387 


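Connect to the Linux host with the Xmanager Xshell client, with its encoding set to UTF-8.

 

Set the locale on the Linux client. LANG is the operating system locale; NLS_LANG is the Oracle client environment.

 

Open another session window.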

 

 

[root@oraclebak ~]# su - oracle

[oracle@oraclebak ~]$ sqlplus shark/shark

 

SQL*Plus: Release 11.2.0.1.0 Production on 星期二 5月 17 20:29:20 2016

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

 

Check the database character set:

 

SQL> col value format a40

SQL> select * from nls_database_parameters where parameter like '%CHARACT%';

PARAMETER                      VALUE

------------------------------ ----------------------------------------

NLS_NUMERIC_CHARACTERS         .,

NLS_CHARACTERSET               AL32UTF8

NLS_NCHAR_CHARACTERSET         AL16UTF16

 

One Chinese character takes three bytes:

SQL> select dump('鯊') from  dual;

DUMP('鯊')

-------------------------

Typ=96 Len=3: 233,178,168
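
The DUMP output can be cross-checked outside the database. The sketch below is in Python and is not part of the original session. It decodes the three byte values from the dump as UTF-8, on the assumption that Oracle's AL32UTF8 matches standard UTF-8 for this character.

```python
# Byte values copied from the DUMP output above.
dump_bytes = bytes([233, 178, 168])

# Assumption: AL32UTF8 is standard UTF-8 for this character.
text = dump_bytes.decode("utf-8")

print(len(text))              # 1 character
print(len(dump_bytes))        # 3 bytes
print(f"U+{ord(text):04X}")   # U+9CA8, the simplified-Chinese form of the character
```

So the three bytes form a single character, which is what "Len=3" for a one-character literal suggests.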

 

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options


[oracle@oraclebak ~]$ env | grep LANG

NLS_LANG=SIMPLIFIED CHINESE_CHINA.AL32UTF8

LANG=zh_CN.UTF-8

[oracle@oraclebak ~]$ unset NLS_LANG

[oracle@oraclebak ~]$ env | grep LANG

LANG=zh_CN.UTF-8

[oracle@oraclebak ~]$ sqlplus shark/shark

 

SQL> select dump('鯊') from   dual;

 

DUMP('???')

-------------------------------------------------

Typ=96 Len=9:239,191,189,239,191,189,239,191,189

 

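How did it turn into 9 bytes?

 

 

This must involve NLS_LANG, because the value was never stored in the database; it was converted and returned directly. Each 239,191,189 triple is the AL32UTF8 encoding of U+FFFD, the Unicode replacement character, so the one character typed arrived as three unrecognized ones.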
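
One way to see where the 9 bytes come from is to replay the conversion in Python. This is a sketch resting on two assumptions: that with NLS_LANG unset the client character set falls back to US7ASCII, and that Python's `ascii` codec with `errors="replace"` is a fair stand-in for how Oracle substitutes bytes it cannot interpret.

```python
# The terminal sends the character as three UTF-8 bytes.
terminal_bytes = bytes([233, 178, 168])

# Assumed client character set when NLS_LANG is unset: US7ASCII.
# None of the three bytes is valid 7-bit ASCII, so each one is replaced
# by U+FFFD (the Unicode replacement character).
as_seen_by_client = terminal_bytes.decode("ascii", errors="replace")

# The database character set is AL32UTF8, modeled here as UTF-8.
converted = as_seen_by_client.encode("utf-8")

print(len(as_seen_by_client))  # 3 replacement characters
print(list(converted))         # 9 bytes: 239, 191, 189 repeated three times
```

Under these assumptions the byte sequence matches the DUMP output exactly, which supports the idea that the damage happens in the client conversion and not in storage.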

 

OK, let's create a table and store something in it.

SQL> create table tb_length(id int,col1 varchar2(20), col2 nvarchar2(20));

 

Table created.

 

SQL> insert into tb_length values(1,'海鯊','海鯊');

 

1 row created.

 

SQL> commit;

 

Commit complete.

 

SQL> select * from tb_length;

 

        ID COL1                 COL2

---------- -------------------- ----------------------------------------

         1 ??????               ??????

 

SQL> select dump(col1),dump(col2) from   tb_length;

 

DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

 

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

After exiting, we restore the language setting:

[oracle@oraclebak ~]$ export NLS_LANG="SIMPLIFIED CHINESE_CHINA.AL32UTF8"

[oracle@oraclebak ~]$ env | grep LANG

NLS_LANG=SIMPLIFIED CHINESE_CHINA.AL32UTF8

LANG=zh_CN.UTF-8

 

Log in again and take a look:

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

 

SQL> select dump(col1),dump(col2) from   tb_length;

 

DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

 

SQL> select * from tb_length;

 

        ID COL1

---------- --------------------

COL2

--------------------------------------------------------------------------------

        1 ������

������

 

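Good grief, it is still garbled. What on earth? The row was already stored as replacement characters, so restoring NLS_LANG cannot bring the original text back. Let's insert the same values again, this time with NLS_LANG set: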


SQL> insert into tb_length values(1,'海鯊','海鯊');

 

1 row created.

 

SQL> commit;

 

Commit complete.

 

SQL> select * from tb_length;

 

        ID COL1                 COL2

---------- -------------------- ----------------------------------------

         1 ������               ������

         1 海鯊                 海鯊

SQL> select dump(col1),dump(col2) from   tb_length;

 DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

Typ=1 Len=6: 230,181,183,233,178,168

Typ=1 Len=4: 109,119,156,168
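
Both rows can be reproduced with a small Python model of the insert path. The helper `simulate_insert` is hypothetical and only approximates the client-to-database conversion: it decodes the terminal's bytes with the client character set, then encodes the result as UTF-8 for the VARCHAR2 column and as big-endian UTF-16 for the NVARCHAR2 column. Mapping US7ASCII, AL32UTF8 and AL16UTF16 to Python's `ascii`, `utf-8` and `utf-16-be` codecs is an assumption.

```python
def simulate_insert(terminal_bytes: bytes, client_codec: str):
    """Return (col1_bytes, col2_bytes) as lists of ints, like DUMP prints them."""
    text = terminal_bytes.decode(client_codec, errors="replace")
    col1 = list(text.encode("utf-8"))      # VARCHAR2 in AL32UTF8
    col2 = list(text.encode("utf-16-be"))  # NVARCHAR2 in AL16UTF16
    return col1, col2

# The two characters as the UTF-8 terminal sends them (6 bytes).
typed = bytes([230, 181, 183, 233, 178, 168])

# First insert: NLS_LANG unset, client assumed to be US7ASCII.
bad_col1, bad_col2 = simulate_insert(typed, "ascii")
print(len(bad_col1), len(bad_col2))  # 18 12

# Second insert: NLS_LANG ends in .AL32UTF8.
good_col1, good_col2 = simulate_insert(typed, "utf-8")
print(good_col1)  # [230, 181, 183, 233, 178, 168]
print(good_col2)  # [109, 119, 156, 168]
```

The first row holds six replacement characters (one per misread byte), while the second row holds the two intended characters.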

 

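OK, the conclusion is that NLS_LANG is a critical language parameter, set mainly in the client environment.

If it is unset, text is stored in the database as garbage, even though Xmanager Xshell is set to UTF-8 encoding. That setting only governs what we see on the Windows side: even when the Chinese characters are typed correctly, what gets stored is wrong.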

 

SQL> select dump('鯊') from  dual;

DUMP('???')

-------------------------------------------------

Typ=96 Len=9:239,191,189,239,191,189,239,191,189

 

 

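So there are three layers, XSHELL->LINUX->DATABASE, and all three character sets must be set correctly.

On input, the conversion path is XSHELL->LINUX->DATABASE.

On output, the conversion path is DATABASE->LINUX->XSHELL.

If a tool connects directly to the database, the LINUX layer in the middle drops out.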
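
The output direction can be sketched in Python as well. Assuming that an unset NLS_LANG means a US7ASCII client, a stored replacement character cannot be represented on the way out and is shown as `?`, whereas a UTF-8 client receives it unchanged and the terminal draws it as `�`. Python's codecs stand in for Oracle's conversion here, which is an approximation.

```python
# What the first insert left in the database: six replacement characters.
stored = "\ufffd" * 6

# DATABASE -> client with an assumed US7ASCII client character set:
# U+FFFD has no ASCII equivalent, so each one degrades to "?".
to_ascii_client = stored.encode("ascii", errors="replace")
print(to_ascii_client.decode("ascii"))  # ??????

# DATABASE -> client with an AL32UTF8 client: the bytes pass through,
# and a UTF-8 terminal renders each U+FFFD as the replacement glyph.
to_utf8_client = stored.encode("utf-8")
print(len(to_utf8_client))  # 18
```

This is consistent with the two different renderings of the same row seen earlier: `??????` while NLS_LANG was unset, and `������` after it was restored.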

 

Now let's talk about character set storage.

SQL> select lengthb(col1),lengthb(col2) from tb_length;

 

LENGTHB(COL1) LENGTHB(COL2)

------------- -------------

            18                12

             6                  4

 

SQL> select length(col1),length(col2) from tb_length;

 

LENGTH(COL1) LENGTH(COL2)

------------ ------------

            6                   6

            2                   2

 

In terms of storage, one Chinese character takes 3 bytes in VARCHAR2 under AL32UTF8 and 2 bytes in NVARCHAR2 under AL16UTF16.
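
The gap between LENGTH (characters) and LENGTHB (bytes) comes down to how many bytes each encoding spends per character. A short Python sketch, assuming `utf-8` and `utf-16-be` model AL32UTF8 and AL16UTF16, also shows the flip side: an ASCII letter costs 1 byte in VARCHAR2 but 2 bytes in NVARCHAR2. The function name `byte_cost` is only illustrative.

```python
def byte_cost(ch: str):
    """Bytes needed for one character as (UTF-8, UTF-16BE)."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))

# 'a' plus the two characters from the test row (U+6D77 and U+9CA8).
for ch in ["a", "\u6d77", "\u9ca8"]:
    utf8_len, utf16_len = byte_cost(ch)
    print(f"U+{ord(ch):04X}: varchar2={utf8_len} bytes, nvarchar2={utf16_len} bytes")
```

Which column type is more compact therefore depends on the mix of text: NVARCHAR2 wins for Chinese, VARCHAR2 wins for plain ASCII.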

 

Now let's look at the character set commonly used in China:

PARAMETER                                VALUE

--------------------------------------------------------------------------------

NLS_NUMERIC_CHARACTERS                            .,

NLS_CHARACTERSET                              ZHS16GBK

NLS_NCHAR_CHARACTERSET                            AL16UTF16

 

insert into tb_length values(1,'海鯊','海鯊');

select dump(col1) a, dump(col2) b, col1, col2 from tb_length;

A                             B                              COL1   COL2

----------------------------- ------------------------------ ------ ------

Typ=1 Len=4: 186,163,246,232  Typ=1 Len=4: 109,119,156,168   海鯊   海鯊

 

In the ZHS16GBK character set, one Chinese character takes 2 bytes.
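
The ZHS16GBK row can be checked with Python's `gbk` codec, on the assumption that it agrees with Oracle's ZHS16GBK for these two characters. The characters are written as escapes for U+6D77 and U+9CA8, the code points that the AL32UTF8 dump of the same row decodes to.

```python
text = "\u6d77\u9ca8"

col1 = list(text.encode("gbk"))        # VARCHAR2 in ZHS16GBK
col2 = list(text.encode("utf-16-be"))  # NVARCHAR2 in AL16UTF16

print(col1)  # [186, 163, 246, 232]
print(col2)  # [109, 119, 156, 168]
```

Only the VARCHAR2 bytes change with the database character set; the NVARCHAR2 bytes stay the same because the national character set is still AL16UTF16.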

 

 
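Oracle character sets are a fairly troublesome business, and quite annoying. To sort it all out, I think of it as three groups, one per layer, with two parameters each.
OS layer
NLS_LANG="SIMPLIFIED CHINESE_CHINA.AL32UTF8"  -- Oracle client setting; governs conversion of input and of displayed result sets
LANG=zh_CN.UTF-8                              -- operating system locale
Database
NLS_CHARACTERSET                      AL32UTF8    -- database character set, used by VARCHAR2
NLS_NCHAR_CHARACTERSET                AL16UTF16   -- national character set, used by NVARCHAR2
Database columns
VARCHAR2()
NVARCHAR2()
In mainland China we usually set the database character set to ZHS16GBK and the national character set to AL16UTF16.
NLS_CHARACTERSET               ZHS16GBK
The Linux operating system uses a Chinese locale; Chinese character sets cover English anyway.
export NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"
export LANG=zh_CN.UTF-8
With this setup one Chinese character takes 2 bytes, so VARCHAR2(4000), which defaults to byte semantics, can hold 2,000 Chinese characters. How many can NVARCHAR2() hold?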
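
The capacity arithmetic can be written out explicitly. This sketch assumes the 4000-byte column limit of Oracle 11g, byte length semantics for VARCHAR2, a fixed 2 or 3 bytes per Chinese character depending on the character set, and that the same 4000-byte ceiling applies to NVARCHAR2. The function name is only illustrative.

```python
MAX_COLUMN_BYTES = 4000  # assumed per-column byte limit (Oracle 11g)

def max_chinese_chars(bytes_per_char: int, limit: int = MAX_COLUMN_BYTES) -> int:
    """Whole characters that fit in `limit` bytes."""
    return limit // bytes_per_char

print(max_chinese_chars(2))  # VARCHAR2 under ZHS16GBK: 2000
print(max_chinese_chars(3))  # VARCHAR2 under AL32UTF8: 1333
print(max_chinese_chars(2))  # NVARCHAR2 under AL16UTF16, if the same limit applies: 2000
```

Under these assumptions NVARCHAR2 would top out at 2,000 Chinese characters as well, and switching the database to AL32UTF8 would cut the VARCHAR2 figure to 1,333.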


Character set references: Master Le's post on character sets and globalization

http://blog.csdn.net/leshami/article/details/6030398

My own post on Oracle character sets

http://blog.csdn.net/zengmuansha/article/details/5661691

Configuring PuTTY in a zh_CN.UTF-8 environment

http://blog.csdn.net/zengmuansha/article/details/7814523

How many bytes does VARCHAR2 take? NLS_LENGTH_SEMANTICS, nls_language

http://blog.csdn.net/zengmuansha/article/details/46373443

NVARCHAR(MAXSIZE)

http://blog.csdn.net/zengmuansha/article/details/12949599

 

About DUMP

Usage of the Oracle dump function

http://blog.csdn.net/liuyuehui110/article/details/44617153

 
