Incorrect string value: '\xF0\x9F\x92\x90

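1. With MySQL older than 5.5.3, the usual setup is a utf8 database character set and a utf8 connection character set, and saving a unicode string from Django works fine. The trouble starts when the string contains special characters (emoji, or any other character that takes 4 bytes in UTF-8). The save then fails with: Incorrect string value: '\xF0\x9F\x92\x90</...' for column 'xxx' at row 1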

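Unicode is a standard, and UTF-8 is one way of encoding it. Some Unicode characters take 4 bytes in UTF-8, but MySQL's utf8 character set stores at most 3 bytes per character, and before MySQL 5.5.3 there was no 4-byte alternative (utf8mb4 was added in 5.5.3).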

mysql> show character set;
+---------+---------------+-------------------+--------+
| Charset | Description   | Default collation | Maxlen |
+---------+---------------+-------------------+--------+
| utf8    | UTF-8 Unicode | utf8_general_ci   |      3 |
+---------+---------------+-------------------+--------+

So a Unicode character that needs 4 bytes gets truncated and cannot be stored.
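To see which strings are affected, here is a minimal Python 3 sketch. The helper names `utf8_len` and `needs_utf8mb4` are made up for illustration. Any code point above U+FFFF encodes to 4 bytes in UTF-8, so it will not fit in a 3-byte utf8 column.

```python
def utf8_len(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text):
    """True if any character in text takes 4 bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


bouquet = "\U0001F490"  # the emoji behind the bytes in the error message
print(bouquet.encode("utf-8"))  # b'\xf0\x9f\x92\x90'
print(utf8_len("a"), utf8_len("\u4e2d"), utf8_len(bouquet))  # 1 3 4
print(needs_utf8mb4("hello"), needs_utf8mb4("hi " + bouquet))  # False True
```

A check like this could run before saving, to log or reject text that an old server cannot store.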


2. For older MySQL (< 5.5.3) there seems to be no good fix other than changing the column type to MEDIUMBLOB. Nothing else needs to change (keep both the database and connection character sets as utf8), and the problem goes away. See the output below:

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
With these settings, MEDIUMBLOB is enough.
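A BLOB column stores raw bytes with no character set, which is why the 3-byte limit does not apply to it. A rough Python 3 sketch of the round trip follows, assuming the application encodes before writing and decodes after reading. The function names are hypothetical and the actual database calls are left out.

```python
def to_blob(text):
    """Encode text to the UTF-8 bytes that would be written to the MEDIUMBLOB column."""
    return text.encode("utf-8")


def from_blob(data):
    """Decode bytes read back from the MEDIUMBLOB column into text."""
    return bytes(data).decode("utf-8")


original = "flowers \U0001F490"
stored = to_blob(original)
assert from_blob(stored) == original
print(len(original), len(stored))  # 9 12
```

The trade-off is that BLOB values are compared and sorted byte by byte, so case-insensitive matching on that column is lost.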


3. With MySQL >= 5.5.3, the workaround above is not needed.

3.1 Edit the MySQL configuration file and set the default character set to utf8mb4, including the collation:

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init_connect='SET NAMES utf8mb4'

3.2 Restart MySQL and confirm that the settings above took effect:

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
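The same check can be scripted. The sketch below assumes the rows of the SHOW VARIABLES query have already been fetched as (Variable_name, Value) pairs, for example through a DB-API cursor. `EXPECTED` and `find_mismatches` are names invented for this example.

```python
EXPECTED = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
}


def find_mismatches(rows):
    """Return {name: actual_value} for variables that differ from EXPECTED.

    rows is an iterable of (Variable_name, Value) pairs.
    A variable missing from rows is reported with the value None.
    """
    actual = dict(rows)
    return {
        name: actual.get(name)
        for name, want in EXPECTED.items()
        if actual.get(name) != want
    }


rows = [
    ("character_set_client", "utf8mb4"),
    ("character_set_connection", "utf8"),  # deliberately wrong for the demo
    ("character_set_database", "utf8mb4"),
    ("character_set_results", "utf8mb4"),
    ("character_set_server", "utf8mb4"),
    ("character_set_system", "utf8"),  # stays utf8, so it is not checked
]
print(find_mismatches(rows))  # {'character_set_connection': 'utf8'}
```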

Nothing else needs changing. With utf8mb4 everywhere, any Unicode character from Django can be stored in MySQL.
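If the server-side settings cannot be changed (on a shared server, say), the client can ask for a utf8mb4 connection itself. This is not part of the original fix. It is a sketch of a Django settings fragment, assuming the standard MySQL backend, whose OPTIONS are passed to the driver's connect call. The database name, user and password are placeholders.

```python
# settings.py fragment; NAME, USER and PASSWORD are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",
        "USER": "myuser",
        "PASSWORD": "change-me",
        # Handed to the MySQL driver when connecting; requests a utf8mb4
        # connection instead of relying on the server default.
        "OPTIONS": {"charset": "utf8mb4"},
    }
}
```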


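Approach: check whether the Maxlen of utf8 in your MySQL is 4.

    If it is not, check whether the server supports utf8mb4.

    If it does not, upgrade MySQL or use MEDIUMBLOB.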
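The checklist can be written down as a small function. This is only a sketch: `parse_version` and `choose_fix` are hypothetical helpers, and the 5.5.3 threshold is the one used throughout this note.

```python
import re


def parse_version(version):
    """Turn a string such as '5.5.3-log' into a tuple like (5, 5, 3)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized MySQL version: %r" % version)
    return tuple(int(part) for part in match.groups())


def choose_fix(version, utf8_maxlen=3):
    """Follow the checklist: Maxlen first, then utf8mb4 support, then fallback."""
    if utf8_maxlen >= 4:
        return "nothing to do"
    if parse_version(version) >= (5, 5, 3):
        return "switch to utf8mb4"
    return "upgrade MySQL or use MEDIUMBLOB"


print(choose_fix("5.1.73"))     # upgrade MySQL or use MEDIUMBLOB
print(choose_fix("5.5.3-log"))  # switch to utf8mb4
```

The version string would come from `SELECT VERSION()` and the Maxlen value from `show character set`.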

This problem has been covered online many times already, so there is not much new to say. This is just a note for my own record.


Credit where it is due: I consulted these two articles while fixing the bug.

http://vivisidea.iteye.com/blog/1395571

http://www.linuxidc.com/Linux/2013-05/84360.htm


Reposted from: http://blog.csdn.net/secretx/article/details/21253559
