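After obtaining a string that contains HTML entity encodings, use a regular expression to extract the numeric entities (of the form &#NNNN;) and then cast each one to its normal character.
The code is as follows: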
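string strformat = item.value7;
// Convert HTML entity encodings to normal characters.
// Match only the decimal digits between "&#" and ";", so that int.Parse cannot fail on other text.
string regx = "(?<=&#)[0-9]+(?=;)";
MatchCollection matchCol = Regex.Matches(strformat, regx);
for (int i = 0; i < matchCol.Count; i++)
{
    // Keep the matched digits as text so that entities with leading zeros are still replaced.
    string digits = matchCol[i].Value;
    int asciinum = int.Parse(digits);
    // The cast is only valid for values up to 65535 (a single UTF-16 code unit).
    char c = (char) asciinum;
    strformat = strformat.Replace("&#" + digits + ";", c.ToString());
}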
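The same idea can also be done in a single pass with Regex.Replace. The following is only a minimal sketch, not the original author's method: the class name HtmlEntityDecoder and the method name DecodeNumericEntities are hypothetical, and it assumes that hexadecimal entities (&#xNNNN;) and code points above 65535 should be handled as well.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class HtmlEntityDecoder
{
    // Matches decimal (&#20013;) and hexadecimal (&#x4E2D;) numeric entities.
    // The digit counts are capped so that int.Parse cannot overflow.
    private static readonly Regex NumericEntity =
        new Regex(@"&#(?:[xX](?<hex>[0-9A-Fa-f]{1,6})|(?<dec>[0-9]{1,7}));");

    public static string DecodeNumericEntities(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        return NumericEntity.Replace(input, m =>
        {
            int codePoint = m.Groups["hex"].Success
                ? int.Parse(m.Groups["hex"].Value, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture)
                : int.Parse(m.Groups["dec"].Value, CultureInfo.InvariantCulture);

            // Leave entities that are not valid Unicode scalar values untouched.
            if (codePoint > 0x10FFFF || (codePoint >= 0xD800 && codePoint <= 0xDFFF))
                return m.Value;

            // ConvertFromUtf32 yields a surrogate pair for code points above 0xFFFF,
            // which a plain (char) cast cannot represent.
            return char.ConvertFromUtf32(codePoint);
        });
    }
}
```

For example, HtmlEntityDecoder.DecodeNumericEntities("&#20013;&#25991;") returns "中文". If named entities such as &amp;amp; also need decoding, the framework's System.Net.WebUtility.HtmlDecode may be a simpler choice than a hand-written regular expression.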
A conversion table is attached.