Java: using String.replaceAll to replace a newline (\n) with the literal text \n or \\n

Original

2020-07-03 08:07

I recently had to parse a string holding a JSONArray:

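[{"key":"Name","value":"XX"},{"key":"Qualifications","value":"10 years in the precious metals investment industry
National level-2 futures analyst
Honorary elder of the Shanghai Gold Exchange"},{"key":"Other","value":""}]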

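The value of the second entry (the qualifications entry) contains three items, each on its own line, and that is where the trap lies: the JSON parser throws an exception as soon as it hits a raw newline character inside a string value. So what now?
Luckily there is a workaround: use Java's built-in String.replaceAll to turn each newline character into the two visible characters \n (backslash followed by n) before parsing. My first attempt used "\\n" as the replacement: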

System.out.println(array.replaceAll("\n","\\n"));

But something was clearly off. The output looked like this:

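[{"key":"Name","value":"XX"},{"key":"Qualifications","value":"10 years in the precious metals investment industrynNational level-2 futures analystnHonorary elder of the Shanghai Gold Exchange"},{"key":"Other","value":""}]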

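Whoa, what is going on? Why is there only an n left?
To get to the bottom of it: Baidu? Google? No, let's read the source.
String.replaceAll compiles the regex and delegates to Matcher.replaceAll, so start with the source of that method: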

public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}

The method loops until find() returns false. Each time find() matches a newline (the regex we passed to String.replaceAll is "\n"), it calls appendReplacement. So what does that method actually do?
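The loop described above can be reproduced with the public Matcher API. This is only a sketch: the class name ManualReplaceAll and the helper replaceAllManually are made up for illustration, and the real JDK code differs in details between versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ManualReplaceAll {
    // Hypothetical helper mirroring the find / appendReplacement / appendTail loop.
    static String replaceAllManually(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "line1\nline2\nline3";
        // Both calls print: line1nline2nline3
        System.out.println(replaceAllManually(input, "\n", "\\n"));
        System.out.println(input.replaceAll("\n", "\\n"));
    }
}
```

Both calls lose the backslash in the same way, which suggests the behaviour comes from appendReplacement rather than from the loop itself.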

public Matcher appendReplacement(StringBuffer sb, String replacement) {

        // If no match, return error
        if (first < 0)
            throw new IllegalStateException("No match available");

        // Process substitution string to replace group references with groups
        int cursor = 0;
        StringBuilder result = new StringBuilder();

        while (cursor < replacement.length()) {
            char nextChar = replacement.charAt(cursor);
            if (nextChar == '\\') {
                cursor++;
                nextChar = replacement.charAt(cursor);
                result.append(nextChar);
                cursor++;
            } else if (nextChar == '$') {
                // Skip past $
                cursor++;
                // A StringIndexOutOfBoundsException is thrown if
                // this "$" is the last character in replacement
                // string in current implementation, a IAE might be
                // more appropriate.
                nextChar = replacement.charAt(cursor);
                int refNum = -1;
                if (nextChar == '{') {
                    cursor++;
                    StringBuilder gsb = new StringBuilder();
                    while (cursor < replacement.length()) {
                        nextChar = replacement.charAt(cursor);
                        if (ASCII.isLower(nextChar) ||
                            ASCII.isUpper(nextChar) ||
                            ASCII.isDigit(nextChar)) {
                            gsb.append(nextChar);
                            cursor++;
                        } else {
                            break;
                        }
                    }
                    if (gsb.length() == 0)
                        throw new IllegalArgumentException(
                            "named capturing group has 0 length name");
                    if (nextChar != '}')
                        throw new IllegalArgumentException(
                            "named capturing group is missing trailing '}'");
                    String gname = gsb.toString();
                    if (ASCII.isDigit(gname.charAt(0)))
                        throw new IllegalArgumentException(
                            "capturing group name {" + gname +
                            "} starts with digit character");
                    if (!parentPattern.namedGroups().containsKey(gname))
                        throw new IllegalArgumentException(
                            "No group with name {" + gname + "}");
                    refNum = parentPattern.namedGroups().get(gname);
                    cursor++;
                } else {
                    // The first number is always a group
                    refNum = (int)nextChar - '0';
                    if ((refNum < 0)||(refNum > 9))
                        throw new IllegalArgumentException(
                            "Illegal group reference");
                    cursor++;
                    // Capture the largest legal group string
                    boolean done = false;
                    while (!done) {
                        if (cursor >= replacement.length()) {
                            break;
                        }
                        int nextDigit = replacement.charAt(cursor) - '0';
                        if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                            break;
                        }
                        int newRefNum = (refNum * 10) + nextDigit;
                        if (groupCount() < newRefNum) {
                            done = true;
                        } else {
                            refNum = newRefNum;
                            cursor++;
                        }
                    }
                }
                // Append group
                if (start(refNum) != -1 && end(refNum) != -1)
                    result.append(text, start(refNum), end(refNum));
            } else {
                result.append(nextChar);
                cursor++;
            }
        }
        // Append the intervening text
        sb.append(text, lastAppendPosition, first);
        // Append the match substitution
        sb.append(result);

        lastAppendPosition = last;
        return this;
    }

Reading through the implementation, the first statement inside the while loop is

char nextChar = replacement.charAt(cursor);

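which reads the next character of the replacement string. Our replacement is written "\\n" in Java source, which at run time is the two characters \ and n, so the first character is '\'. Now look at the first if statement: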

if (nextChar == '\\') {
    cursor++;
    nextChar = replacement.charAt(cursor);
    result.append(nextChar);
    cursor++;
}

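When the character is '\', cursor is incremented, the following character is read, and that character is appended to result as-is. This is the crux: inside a replacement string the backslash is an escape character, so it is dropped and only the character after it survives. In our case the character after the backslash is n, which is why the output contained a bare n. By the same rule, two consecutive backslashes at run time collapse into a single backslash. With that, the problem is explained.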
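To make the backslash rule concrete, here is a stripped-down sketch of just that branch. The class ReplacementEscapes and the method unescape are hypothetical names for illustration; the sketch ignores $ group references and does not guard against a trailing backslash.

```java
class ReplacementEscapes {
    // Simplified imitation of the backslash branch of appendReplacement:
    // a backslash is dropped and the character after it is copied literally.
    static String unescape(String replacement) {
        StringBuilder result = new StringBuilder();
        int cursor = 0;
        while (cursor < replacement.length()) {
            char c = replacement.charAt(cursor);
            if (c == '\\') {
                cursor++;                                   // skip the backslash
                result.append(replacement.charAt(cursor));  // keep the next char
                cursor++;
            } else {
                result.append(c);
                cursor++;
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\n"));         // prints: n
        System.out.println(unescape("\\\\n"));       // prints: \n
        System.out.println(unescape("\\\\\\\\n"));   // prints: \\n
    }
}
```

Each level of escaping halves the number of backslashes: the Java compiler halves them once when reading the string literal, and the replacement processing halves them again.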

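So, to end up with \n in the output, the replacement needs two backslashes at run time, which means four in Java source: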

System.out.println(array.replaceAll("\n","\\\\n"));

And to end up with \\n in the output, it takes eight:

System.out.println(array.replaceAll("\n","\\\\\\\\n"));
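If counting backslashes feels error-prone, the JDK offers two standard ways to avoid it: String.replace, which treats both arguments as literal text, and Matcher.quoteReplacement, which escapes a replacement string for use with replaceAll. The sketch below compares them with the manual escaping; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;

class NewlineEscaper {
    // Literal replacement: no regex and no replacement-string escapes involved.
    static String withReplace(String s) {
        return s.replace("\n", "\\n");
    }

    // Regex replacement, with the replacement escaped by the JDK.
    static String withQuoteReplacement(String s) {
        return s.replaceAll("\n", Matcher.quoteReplacement("\\n"));
    }

    // Regex replacement, with the backslash escaped by hand.
    static String withManualEscaping(String s) {
        return s.replaceAll("\n", "\\\\n");
    }

    public static void main(String[] args) {
        String input = "a\nb";
        // Each line prints: a\nb
        System.out.println(withReplace(input));
        System.out.println(withQuoteReplacement(input));
        System.out.println(withManualEscaping(input));
    }
}
```

When the search text is a fixed string rather than a pattern, String.replace is probably the simplest choice here.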
