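I recently needed to parse a string holding a JSONArray:
[{"key":"Name","value":"XX"},{"key":"Qualifications","value":"10 years in the precious metals investment industry
National Level-2 Futures Analyst
Honorary Elder of the Shanghai Gold Exchange"},{"key":"Other","value":""}]
The value for the key "Qualifications" holds three pieces of information on separate lines, and that is where the pitfall lies: when the JSON parser meets a raw line break (\n) inside a string value, it throws an exception. So what can be done?
Luckily, I thought of a workaround: use Java's built-in String.replaceAll method to first turn each line break (\n) into the visible two-character text \n (written as "\\n" in Java source).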
System.out.println(array.replaceAll("\n","\\n"));
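But something looked off. The output was:
[{"key":"Name","value":"XX"},{"key":"Qualifications","value":"10 years in the precious metals investment industrynNational Level-2 Futures AnalystnHonorary Elder of the Shanghai Gold Exchange"},{"key":"Other","value":""}]
Whoa! Why is only an n left?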
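The behavior is easy to reproduce in isolation. The following is a minimal sketch; the class name ReplaceAllPitfall, the method name naiveEscape, and the sample string are made up for illustration and are not part of the original code.

```java
class ReplaceAllPitfall {
    // Hypothetical helper: applies the same call as above to any input.
    // The regex "\n" is a real line-break character; the replacement
    // literal "\\n" is the two characters backslash and n at runtime.
    static String naiveEscape(String s) {
        return s.replaceAll("\n", "\\n");
    }

    public static void main(String[] args) {
        String value = "line1\nline2\nline3";
        // Each line break is replaced by a bare n, not by backslash + n.
        System.out.println(naiveEscape(value)); // prints line1nline2nline3
    }
}
```

So the problem has nothing to do with JSON; it comes from how replaceAll treats the replacement string.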
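To find out what went wrong, should we turn to Baidu or Google? No, let's read the source.
String.replaceAll delegates to Matcher.replaceAll, so let's start with the source of that method: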
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
Matcher.replaceAll keeps looping until find() returns false. Each time find() matches a line break (the regex we passed to String.replaceAll was "\n"), it calls appendReplacement. So what does that method actually do?
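Before digging into appendReplacement, the find/appendReplacement/appendTail loop can be reproduced using only the public Matcher API, which confirms that the odd result comes from appendReplacement and not from the matching. This is a sketch; the class ManualReplaceAll and the method replaceAllByHand are hypothetical names introduced here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ManualReplaceAll {
    // Hypothetical helper that mirrors the loop in Matcher.replaceAll
    // with public methods only.
    static String replaceAllByHand(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "a\nb\nc";
        System.out.println(replaceAllByHand(input, "\n", "\\n")); // prints anbnc
        System.out.println(input.replaceAll("\n", "\\n"));        // prints anbnc
    }
}
```

Both calls lose the backslash in the same way, so the answer has to be in the source of appendReplacement: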
public Matcher appendReplacement(StringBuffer sb, String replacement) {
    // If no match, return error
    if (first < 0)
        throw new IllegalStateException("No match available");
    // Process substitution string to replace group references with groups
    int cursor = 0;
    StringBuilder result = new StringBuilder();
    while (cursor < replacement.length()) {
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {
            cursor++;
            nextChar = replacement.charAt(cursor);
            result.append(nextChar);
            cursor++;
        } else if (nextChar == '$') {
            // Skip past $
            cursor++;
            // A StringIndexOutOfBoundsException is thrown if
            // this "$" is the last character in replacement
            // string in current implementation, a IAE might be
            // more appropriate.
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {
                    nextChar = replacement.charAt(cursor);
                    if (ASCII.isLower(nextChar) ||
                        ASCII.isUpper(nextChar) ||
                        ASCII.isDigit(nextChar)) {
                        gsb.append(nextChar);
                        cursor++;
                    } else {
                        break;
                    }
                }
                if (gsb.length() == 0)
                    throw new IllegalArgumentException(
                        "named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException(
                        "named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (ASCII.isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException(
                        "capturing group name {" + gname +
                        "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname))
                    throw new IllegalArgumentException(
                        "No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname);
                cursor++;
            } else {
                // The first number is always a group
                refNum = (int)nextChar - '0';
                if ((refNum < 0)||(refNum > 9))
                    throw new IllegalArgumentException(
                        "Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {
                    if (cursor >= replacement.length()) {
                        break;
                    }
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                        break;
                    }
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount() < newRefNum) {
                        done = true;
                    } else {
                        refNum = newRefNum;
                        cursor++;
                    }
                }
            }
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum));
        } else {
            result.append(nextChar);
            cursor++;
        }
    }
    // Append the intervening text
    sb.append(text, lastAppendPosition, first);
    // Append the match substitution
    sb.append(result);
    lastAppendPosition = last;
    return this;
}
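Reading through the implementation, we can see that the first line inside the while loop is
char nextChar = replacement.charAt(cursor);
which fetches the current character of the replacement string, starting with the first. We wrote "\\n" in the source, which at runtime is the two characters \ and n, so the first character is '\'. Now look at the first if statement:
if (nextChar == '\\') {
    cursor++;
    nextChar = replacement.charAt(cursor);
    result.append(nextChar);
    cursor++;
}
When the character is '\', cursor is incremented, the next character (here 'n') is fetched and appended to result as a literal, and the backslash itself is dropped. This is the key point: inside a replacement string the backslash is an escape character, so the runtime text \n is reduced to a plain n. Two levels of escaping are in play: the Java compiler first turns the "\\" in the source literal into a single backslash, and appendReplacement then consumes that backslash as an escape. That explains the problem.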
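The backslash branch can be isolated into a small standalone method to see the effect on different replacement strings. This is a simplified sketch, not the JDK code: the names ReplacementEscapes and processBackslashes are hypothetical, the '$' group-reference branch is left out entirely, and a trailing backslash is kept as-is here, whereas the JDK method fails on it.

```java
class ReplacementEscapes {
    // Simplified model of the backslash handling only (an assumption-laden
    // sketch; '$' references are NOT handled and are copied through).
    static String processBackslashes(String replacement) {
        StringBuilder result = new StringBuilder();
        int cursor = 0;
        while (cursor < replacement.length()) {
            char c = replacement.charAt(cursor);
            if (c == '\\' && cursor + 1 < replacement.length()) {
                // Drop the backslash and take the next character literally.
                cursor++;
                c = replacement.charAt(cursor);
            }
            result.append(c);
            cursor++;
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Source literal "\\n" is backslash + n at runtime; the backslash is consumed.
        System.out.println(processBackslashes("\\n"));     // prints n
        // Source literal "\\\\n" is two backslashes + n at runtime; one backslash survives.
        System.out.println(processBackslashes("\\\\n"));   // prints \n
    }
}
```

Each pair of backslashes in the runtime string yields one backslash in the output, which means four backslashes are needed in the Java source literal for every backslash that should appear in the result.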
So, to end up with a visible \n in the output, the call should be written as
System.out.println(array.replaceAll("\n","\\\\n"));
and to end up with \\n, it should be written as
System.out.println(array.replaceAll("\n","\\\\\\\\n"));
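As a side note that goes beyond the original fix, the standard library offers two ways to avoid counting backslashes by hand: Matcher.quoteReplacement, which escapes a replacement string for use with replaceAll, and String.replace, which treats both arguments as literal text. The sketch below compares them with the four-backslash form; the class NewlineEscaper and its method names are hypothetical.

```java
import java.util.regex.Matcher;

class NewlineEscaper {
    // The corrected form from above: four backslashes in the source literal.
    static String viaReplaceAll(String s) {
        return s.replaceAll("\n", "\\\\n");
    }

    // quoteReplacement escapes '\' and '$' so the text is used literally.
    static String viaQuoteReplacement(String s) {
        return s.replaceAll("\n", Matcher.quoteReplacement("\\n"));
    }

    // String.replace does literal replacement with no regex or escape handling.
    static String viaReplace(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String value = "line1\nline2";
        System.out.println(viaReplaceAll(value));       // prints line1\nline2
        System.out.println(viaQuoteReplacement(value)); // prints line1\nline2
        System.out.println(viaReplace(value));          // prints line1\nline2
    }
}
```

All three produce the same text. When no regular expression is needed, as in this case, String.replace is probably the simplest choice.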