I'm getting the following error when trying to add a row to a table: Incorrect string value: '\xF0\x90\x8D\x83\xF0\x90...' for column 'content' at row 1
MySQL's utf8 permits only the Unicode characters that can be represented with 3 bytes in UTF-8. Here you have a character that needs 4 bytes: \xF0\x90\x8D\x83 (U+10343 GOTHIC LETTER SAUIL).
If you have MySQL 5.5 or later you can change the column encoding from utf8 to utf8mb4. This encoding allows storage of characters that occupy 4 bytes in UTF-8.
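To make the 3-byte limit concrete, here is a minimal Java sketch that reports how many bytes UTF-8 needs for a few characters. The class name `Utf8Width` and the helper `utf8Length` are hypothetical names chosen for illustration. Any character that reports 4 will not fit in a MySQL utf8 column.

```java
import java.nio.charset.StandardCharsets;

class Utf8Width {
    // Number of bytes the UTF-8 encoding of s occupies.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // ASCII letter
        System.out.println(utf8Length("\u20AC"));       // U+20AC EURO SIGN
        System.out.println(utf8Length("\uD800\uDF43")); // U+10343 as a surrogate pair
    }
}
```

Run as written, it prints 1, 3 and 4.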
Answers (1)
UTF-8 encodes everything in the Basic Multilingual Plane (BMP, i.e. U+0000 to U+FFFF inclusive) in 1-3 bytes. Therefore, you just need to check whether everything in your string is in the BMP.
In Java, that means checking whether any char (which is a UTF-16 code unit) is a high or low surrogate, as Java uses surrogate pairs to encode non-BMP characters:
    public static boolean isEntirelyInBasicMultilingualPlane(String text) {
        for (int i = 0; i < text.length(); i++) {
            // Any surrogate code unit means the string holds a non-BMP character.
            if (Character.isSurrogate(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }
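When that check fails, it helps to know which characters are responsible. A minimal sketch, assuming Java 8 or later for `String.codePoints()`; the class and method names are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;

class NonBmpFinder {
    // Returns the characters outside the BMP in "U+XXXX" notation, in order of appearance.
    static List<String> supplementaryCodePoints(String text) {
        return text.codePoints()
                .filter(Character::isSupplementaryCodePoint)
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "a", then U+10343 written as a surrogate pair, then "b"
        System.out.println(supplementaryCodePoints("a\uD800\uDF43b"));
    }
}
```

For the string in `main`, the list contains the single entry U+10343, the character from the error message.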
Answers (2)
If you do not need to support characters beyond the BMP, you can just strip them before handing the string to MySQL:
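    public static String withNonBmpStripped(String input) {
        if (input == null) {
            throw new IllegalArgumentException("input");
        }
        // java.util.regex matches by code point, so the negated BMP range
        // removes each character above U+FFFF as a whole surrogate pair.
        return input.replaceAll("[^\\u0000-\\uFFFF]", "");
    }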
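An alternative sketch that strips the same characters without a regular expression, by walking the string one code point at a time. It drops every code point above U+FFFF as a whole pair and also drops any unpaired surrogate. The names are hypothetical, and `Character.isBmpCodePoint` assumes Java 7 or later:

```java
class NonBmpFilter {
    // Returns a copy of input with supplementary characters and lone surrogates removed.
    static String stripSupplementary(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            // Keep only BMP code points that are not stray surrogate code units.
            if (Character.isBmpCodePoint(cp) && !Character.isSurrogate((char) cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by 1 char for BMP code points, 2 for a surrogate pair.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripSupplementary("a\uD800\uDF43b"));
    }
}
```

Looping by code point keeps the handling of pairs explicit, at the cost of a few more lines than the one-line regex.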
If you want to support characters beyond the BMP, you need MySQL 5.5+ and you need to change everything that's utf8 to utf8mb4 (collations, charsets ...). You also need support for this in the driver, which I am not familiar with. Handling these characters in Java is also a pain because each one is spread over 2 chars and thus needs special handling in many operations.
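As an illustration of that special handling, this sketch contrasts char-based and code-point-based views of a string that holds the single character U+10343. The class name is hypothetical:

```java
class SurrogateDemo {
    // U+10343 GOTHIC LETTER SAUIL, written as its UTF-16 surrogate pair
    static final String SAUIL = "\uD800\uDF43";

    public static void main(String[] args) {
        // Counts UTF-16 code units
        System.out.println(SAUIL.length());
        // Counts Unicode code points
        System.out.println(SAUIL.codePointCount(0, SAUIL.length()));
        // The first char on its own is only half of the character
        System.out.println(Character.isHighSurrogate(SAUIL.charAt(0)));
        // Reading by code point recovers the full value
        System.out.println(Integer.toHexString(SAUIL.codePointAt(0)));
    }
}
```

Here `length()` reports 2 while `codePointCount` reports 1, so operations such as `charAt`, `substring`, or truncating to a column width can cut a character in half unless they work on code points.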
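The cause is that the current MySQL character set is utf8, which uses at most 3 bytes per character, while special characters such as emoji need 4 bytes. That is why the INSERT fails.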
    mysql> show character set;
    +----------+-----------------------------+---------------------+--------+
    | Charset  | Description                 | Default collation   | Maxlen |
    +----------+-----------------------------+---------------------+--------+
    | utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
    +----------+-----------------------------+---------------------+--------+
Solution:
1. Back up the database.
2. Upgrade the MySQL server to v5.5.3+.
3. Change the character set of the database, tables, and columns.
    # For each database:
    ALTER DATABASE database_name CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
    # For each table:
    ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    # For each column:
    ALTER TABLE table_name CHANGE column_name column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    # (Don’t blindly copy-paste this! The exact statement depends on the column type,
    # maximum length, and other properties. The above line is just an example for a
    # `VARCHAR` column.)
4. Edit my.ini (my.cnf on Linux).
    [client]
    default-character-set = utf8mb4

    [mysql]
    default-character-set = utf8mb4

    [mysqld]
    character-set-client-handshake = FALSE
    character-set-server = utf8mb4
    collation-server = utf8mb4_unicode_ci
    init_connect='SET NAMES utf8mb4'
Restart MySQL and check the character sets:
    mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
    +--------------------------+--------------------+
    | Variable_name            | Value              |
    +--------------------------+--------------------+
    | character_set_client     | utf8mb4            |
    | character_set_connection | utf8mb4            |
    | character_set_database   | utf8mb4            |
    | character_set_filesystem | binary             |
    | character_set_results    | utf8mb4            |
    | character_set_server     | utf8mb4            |
    | character_set_system     | utf8               |
    | collation_connection     | utf8mb4_unicode_ci |
    | collation_database       | utf8mb4_unicode_ci |
    | collation_server         | utf8mb4_unicode_ci |
    +--------------------------+--------------------+
    10 rows in set (0.00 sec)
5. If you connect to MySQL from Java, upgrade mysql-connector-java.jar to at least 5.1.14.
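The VARCHAR(191) in the step 3 example is not arbitrary. A sketch of the arithmetic behind it, assuming the 767-byte index key prefix limit of older InnoDB row formats. That limit is an assumption not stated in the steps above, so check the value for your own MySQL version and row format. The class and method names are hypothetical:

```java
class IndexPrefixMath {
    // Assumed InnoDB index key prefix limit in bytes (older row formats).
    static final int ASSUMED_KEY_LIMIT_BYTES = 767;

    // Largest character count that always fits under the assumed limit.
    static int maxIndexedChars(int maxBytesPerChar) {
        return ASSUMED_KEY_LIMIT_BYTES / maxBytesPerChar;
    }

    public static void main(String[] args) {
        System.out.println(maxIndexedChars(3)); // utf8: up to 3 bytes per character
        System.out.println(maxIndexedChars(4)); // utf8mb4: up to 4 bytes per character
    }
}
```

Under that assumption an indexed utf8 column can hold 255 characters but an indexed utf8mb4 column only 191, which is why a VARCHAR(255) column with an index may need to shrink during the conversion.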
References:
http://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4
http://technovergence-en.blogspot.jp/2012/03/mysql-from-utf8-to-utf8mb4.html
http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html
http://stackoverflow.com/questions/7814293/how-to-insert-utf-8-mb4-characteremoji-in-ios5-in-mysql
http://stackoverflow.com/questions/8709892/mysql-throws-incorrect-string-value-error