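&#19996; is an HTML numeric character reference: 19996 is the decimal value of the Unicode code point of the Chinese character "东" (U+4E1C).
char t = (char)19996;
In Java, this cast converts the code value into the corresponding character "东".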
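For comparison, the same conversion can be sketched in Python 3, where the built-ins chr() and ord() map between a code point and its character. The helper name to_decimal_reference is hypothetical and only illustrates the reverse direction; it is not part of the original post.

```python
# Decimal code point -> character: the Python 3 counterpart of the Java cast.
code_point = 19996
ch = chr(code_point)

# Character -> decimal code point: the reverse direction.
assert ord(ch) == code_point


def to_decimal_reference(text):
    """Encode each character of text as an HTML decimal reference.

    Hypothetical helper, added here for illustration only.
    """
    return ''.join('&#{};'.format(ord(c)) for c in text)


print(ch)
print(to_decimal_reference(ch))
```

Note that the Java cast to char is only safe for code points up to 65535, while chr() in Python 3 accepts the full Unicode range.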
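import re

# Company name as scraped: every character is an HTML decimal reference (&#NNNNN;).
company = '&#19996;&#33694;&#24066;&#38472;&#29642;&#26381;&#39280;&#28304;&#22836;&#21378;&#23478;'
if '&#' in company:
    # Collect the decimal code points in order of appearance.
    new_a_list = re.findall(r'&#(\d+?);', company)
    # Rebuild the string from the references only; any other text is dropped.
    company = ''
    for m in new_a_list:
        # chr() is the Python 3 name; Python 2 code would call unichr().
        company += chr(int(m))
print(company)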
Output: 东莞市陈珊服饰源头厂家
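The loop above keeps only the decoded references and discards any surrounding text. As an alternative sketch, assuming Python 3, the standard library function html.unescape decodes decimal, hexadecimal, and named references while leaving the rest of the string intact. The function names decode_references and decode_decimal_only, and the sample string, are hypothetical and chosen for illustration.

```python
import html
import re


def decode_references(text):
    """Decode all HTML character references, leaving other text untouched."""
    return html.unescape(text)


def decode_decimal_only(text):
    """Regex variant: replace only decimal references (&#NNN;) in place."""
    return re.sub(r'&#(\d+);', lambda m: chr(int(m.group(1))), text)


# Hypothetical sample mixing plain text, a decimal and a hexadecimal reference.
sample = 'ID 42: &#19996;&#x4E1C;'
print(decode_references(sample))
print(decode_decimal_only(sample))
```

The regex variant stays closest to the original approach; html.unescape is the shorter choice when every kind of reference should be decoded.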