Environment

>>> import sys
>>> sys.version
'3.6.0 |Anaconda 4.3.1 (64-bit)| (default, Dec 23 2016, 12:22:00) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]'

Problem description

While running a script under Python 3 today, I got the following error:

Traceback (most recent call last):
  File "tmp.py", line 3, in <module>
    print(a)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

The failing code can be reduced to:

a = b'\xe5\x94\xb1\xe6\xad\x8c'
a = a.decode("utf-8")
print(a)

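Problem analysis

This section walks through how I tracked the problem down. If you only want the fix, skip to the next section.

The explanation found online

The explanation commonly given online is that this exception comes from misusing the decode and encode methods. A typical example is Python 2 code that calls decode on a string that is already Unicode, which makes Python 2 implicitly encode it with the ascii codec first: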

s = u'中文'
s.decode('utf-8')
print s

Running this example under Python 3, however, fails with a different error:

Traceback (most recent call last):
  File "tmp_2.py", line 4, in <module>
    s.decode('utf-8')
AttributeError: 'str' object has no attribute 'decode'

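Readers familiar with Python's history will know that Python 3 addressed these encoding problems by making every string a sequence of Unicode characters, stored in the single str type. Only bytes has a decode method; str does not. The explanation found online therefore does not apply to my problem.

Character encoding

To check whether character encoding was the cause, I ran the same code on another machine with Python 3: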
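The bytes/str split described above can be checked directly. The following is a minimal sketch that reuses the byte string from the failing script; the variable names are only illustrative.

```python
# Python 3: bytes hold raw encoded data, str holds Unicode text.
raw = b'\xe5\x94\xb1\xe6\xad\x8c'    # UTF-8 bytes from the failing script
text = raw.decode("utf-8")           # bytes -> str
round_trip = text.encode("utf-8")    # str -> bytes

print(type(raw).__name__, type(text).__name__)  # bytes str
print(len(raw), len(text))                      # 6 2
print(hasattr(text, "decode"))                  # False: str has no decode
print(round_trip == raw)                        # True
```

Six bytes decode to two characters, and the decode step itself never fails, which already hints that the error comes from somewhere else.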

>>> a = b'\xe5\x94\xb1\xe6\xad\x8c'
>>> a = a.decode("utf-8")
>>> print(a)
唱歌

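It printed the expected output with no error, which rules out both the byte sequence and the code itself.

Output

Since neither the data nor the code is at fault, the problem must lie in print. At this point I turned my attention to the word ascii in the error message. In a typical Python 3 environment, print encodes str to UTF-8 on output, so seeing ascii was unexpected. To settle this, I checked the encoding of standard output on the failing machine: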

>>> import sys
>>> sys.stdout.encoding
'ANSI_X3.4-1968'

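It turned out to be ANSI_X3.4-1968, which is another name for ASCII, so printing any Chinese text raises the error. Python typically ends up with this encoding when the locale is C or POSIX, for example when LANG and LC_ALL are unset. The problem was finally located.

Solutions

With the cause identified, the fix is simple. There are two options.

Using PYTHONIOENCODING

Set PYTHONIOENCODING=utf-8 when launching Python: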
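The failure can be reproduced on any machine without touching the real standard output. The sketch below is an illustration, not the original script: it writes the same text to in-memory streams, one using ascii and one using utf-8, and the helper name try_write is hypothetical.

```python
import io

text = b'\xe5\x94\xb1\xe6\xad\x8c'.decode("utf-8")

def try_write(encoding):
    """Write text to an in-memory text stream that uses the given encoding."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        # Same exception type that print raised in the traceback above.
        return "error: " + exc.reason
    return buffer.getvalue()

ascii_result = try_write("ascii")
utf8_result = try_write("utf-8")

print(ascii_result)          # error: ordinal not in range(128)
print(repr(utf8_result))     # b'\xe5\x94\xb1\xe6\xad\x8c'
```

An ascii stream behaves just like the misconfigured sys.stdout, while a utf-8 stream accepts the same text without complaint.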

PYTHONIOENCODING=utf-8 python your_script.py
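To confirm that the variable takes effect, one option is to start a child interpreter with it set and ask for its sys.stdout.encoding. This is a rough sketch; the helper name child_stdout_encoding is made up for illustration.

```python
import os
import subprocess
import sys

def child_stdout_encoding(extra_env):
    """Start a child interpreter and report its sys.stdout.encoding."""
    env = dict(os.environ)
    env.update(extra_env)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return output.decode("ascii").strip()

# With the variable set, the child should report a UTF-8 encoding
# regardless of the locale it inherits.
forced = child_stdout_encoding({"PYTHONIOENCODING": "utf-8"})
print(forced)
```

The same check without the variable shows whatever encoding the current locale selects, which is a quick way to tell whether a given machine is affected.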

Redefining standard output

Replace sys.stdout with a writer that encodes to UTF-8:

import codecs, sys
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())

Then write output, such as log messages, through it:

sys.stdout.write("Your content....")
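What the wrapper does can be seen safely by pointing it at an in-memory binary stream instead of the real standard output. In this sketch, io.BytesIO stands in for sys.stdout.detach(), and make_utf8_writer is a hypothetical helper name.

```python
import codecs
import io

def make_utf8_writer(binary_stream):
    """Wrap a binary stream so that str written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(binary_stream)

fake_stdout = io.BytesIO()    # stand-in for sys.stdout.detach()
writer = make_utf8_writer(fake_stdout)
writer.write(b'\xe5\x94\xb1\xe6\xad\x8c'.decode("utf-8") + "\n")

encoded = fake_stdout.getvalue()
print(repr(encoded))          # b'\xe5\x94\xb1\xe6\xad\x8c\n'
```

The writer encodes the text itself, so the locale's ascii setting is never consulted. On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") is another way to get the same effect, but it is not available on the 3.6 interpreter used here.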

Summary

Working through this problem deepened my understanding of Python 3. Corrections and comments are welcome!

Python (3): Python 3 UnicodeEncodeError 'ascii' codec can't encode characters in position 0-1
