A Brief Look at the String Source Code (Part 1)

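Preface

For Java developers, String is almost certainly the most frequently used type in day-to-day coding. Many people know the String API well but have never read its implementation. In my view, the official documentation is the right place to start when learning an API, but as you gain experience you should also try reading the source, because doing so matters a great deal for improving your skills. Once you understand the source, using the API becomes much more natural.

Note: the following notes are based on JDK 8.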

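String Is Just a Class

String is really just a class, and we can examine it from the following angles in turn:

Class inheritance
Member variables
Constructors
Member methods
Related static methods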

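Inheritance

The UML class diagram for String, exported with IDEA's built-in plugin, is as follows:

The diagram shows at a glance that String implements the Serializable, Comparable, and CharSequence interfaces. Here is a brief description of each:

Serializable: a class that implements this interface can be serialized. The interface declares no methods; it serves only as a marker.
Comparable: a class that implements this interface has a natural ordering. For example, lists (and arrays) of such objects can be sorted automatically by Collections.sort (and Arrays.sort).
CharSequence: a unified interface for character sequences. It provides common, mostly read-only operations on a sequence of characters, and many character-related classes implement it, such as String and StringBuffer.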
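
To make the role of these interfaces concrete, here is a minimal sketch. The class name StringInterfacesDemo and its two helper methods are made up for illustration; only the standard library calls come from the JDK. One method accepts any CharSequence, the other relies on the natural ordering that Comparable gives String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class StringInterfacesDemo {
    // Accepts any CharSequence, so String, StringBuilder and StringBuffer all work.
    static int countUpperCase(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isUpperCase(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Relies on String implementing Comparable<String> (lexicographic order).
    static List<String> sortedCopy(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("Hello World"));
        System.out.println(countUpperCase(new StringBuilder("ABc")));
        System.out.println(sortedCopy(Arrays.asList("pear", "apple", "fig")));
    }
}
```

Declaring the parameter as CharSequence rather than String is what lets the same method serve several character-sequence types without conversion.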

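The String class is declared as follows:

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    ...
}

The final modifier means that String cannot be subclassed. String is also an immutable class, but that property comes from its private, never-modified internal state rather than from final on the class itself.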

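Member Variables

The most important member variable is value[], defined as follows:

/** The value is used for character storage. */
private final char value[];

A String is made up of char values, so internally it is essentially a character array, named value[]. Note that value[] is declared final, which means the reference cannot be reassigned after construction. final does not by itself make the array elements read-only; String is immutable because the array is private and no public method modifies it.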
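
The difference between a final reference and read-only contents is easy to miss, so here is a small sketch. FinalArrayDemo is a hypothetical class, not part of the JDK; it keeps a private final char[] the way String does, but deliberately exposes a mutating method to show what final does not prevent.

```java
class FinalArrayDemo {
    // Hypothetical holder: like String in JDK 8, it keeps a private final char[].
    private final char[] value;

    FinalArrayDemo(char[] source) {
        // Defensive copy so the caller cannot reach the internal array.
        this.value = source.clone();
    }

    // Deliberately unsafe method (String has nothing like it).
    void overwriteFirst(char c) {
        value[0] = c;            // allowed: elements of a final array are writable
        // value = new char[0];  // would not compile: the reference is final
    }

    String asString() {
        return new String(value);
    }

    public static void main(String[] args) {
        FinalArrayDemo demo = new FinalArrayDemo(new char[] {'a', 'b'});
        demo.overwriteFirst('x');
        System.out.println(demo.asString());
    }
}
```

String avoids this problem not through the final keyword alone but by never offering a method like overwriteFirst and never handing out its internal array.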

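Constructors

String has many overloaded constructors, described below.

The no-argument constructor initializes an instance that represents the empty string. In practice this constructor is rarely needed, because the literal "" already gives you an empty String object. The constructor does not copy anything; it simply reuses the value[] array of the "" literal:
```
public String() {
    this.value = "".value;
}
```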

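Constructs a String from another String. The value and hash of the argument are assigned directly to the new instance. This is not a deep copy: the new object shares the same character array as the original, which is safe because the array is never modified.

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}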

Constructs a new String from a character array. Arrays.copyOf makes a copy of the array, so later changes to the caller's array do not affect the String.

public String(char value[]) {
    this.value = Arrays.copyOf(value, value.length);
}
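
The effect of that copy can be observed from the outside. The following sketch (CharArrayCopyDemo is an illustrative name) builds a String from an array and then changes the array.

```java
class CharArrayCopyDemo {
    static String buildThenMutate() {
        char[] source = {'j', 'a', 'v', 'a'};
        String s = new String(source);  // the constructor copies the array
        source[0] = 'J';                // later change to the caller's array
        return s;                       // the String still holds the old contents
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate());
    }
}
```

If the constructor stored the caller's array directly, the assignment to source[0] would silently change the String, which would break immutability.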

Constructs a new String from part of a source character array, given an offset (the start position) and a character count.

public String(char value[], int offset, int count) {
    // A negative offset is out of bounds
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count <= 0) {
        // A negative count is out of bounds
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // count is 0 and offset is within the array: the result is the empty string
        if (offset <= value.length) {
            this.value = "".value;
            return;
        }
    }
    // Note: offset or count might be near -1>>>1.
    // offset + count would run past the end of the array
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    // Copy the range [offset, offset + count) into this instance's character array
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}
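
The bounds rules above can be exercised with a short sketch. SubArrayConstructorDemo and its helpers are illustrative names. The catch clause uses IndexOutOfBoundsException, the superclass of StringIndexOutOfBoundsException, so the example does not depend on the exact subclass thrown.

```java
class SubArrayConstructorDemo {
    static String slice(char[] chars, int offset, int count) {
        return new String(chars, offset, count);
    }

    // Returns true when the constructor rejects the given offset/count.
    static boolean isRejected(char[] chars, int offset, int count) {
        try {
            new String(chars, offset, count);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        char[] hello = "hello".toCharArray();
        System.out.println(slice(hello, 1, 3));       // characters at indices 1..3
        System.out.println(slice(hello, 5, 0));       // zero count at the end is allowed
        System.out.println(isRejected(hello, 3, 3));  // 3 + 3 runs past length 5
        System.out.println(isRejected(hello, -1, 2)); // negative offset
    }
}
```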

Constructs a new String from part of a source int array, given an offset (the start position) and a count. Each int in the array is a Unicode code point (not just an ASCII value), so one element may need two char values in the result.

public String(int[] codePoints, int offset, int count) {
    // A negative offset is out of bounds
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count <= 0) {
        // A negative count is out of bounds
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // count is 0 and offset is within the array: the result is the empty string
        if (offset <= codePoints.length) {
            this.value = "".value;
            return;
        }
    }
    // Note: offset or count might be near -1>>>1.
    // offset + count would run past the end of the array
    if (offset > codePoints.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    final int end = offset + count;
    // Pass 1: compute the exact char[] size n. A BMP code point needs one char,
    // a valid supplementary code point needs two, and anything else is rejected.
    int n = count;
    for (int i = offset; i < end; i++) {
        int c = codePoints[i];
        if (Character.isBmpCodePoint(c))
            continue;
        else if (Character.isValidCodePoint(c))
            n++;
        else throw new IllegalArgumentException(Integer.toString(c));
    }
    // Allocate the array with the size computed above
    final char[] v = new char[n];
    // Pass 2: fill the character array
    for (int i = offset, j = 0; i < end; i++, j++) {
        int c = codePoints[i];
        if (Character.isBmpCodePoint(c))
            v[j] = (char)c;
        else
            // Writes the high and low surrogate into v[j] and v[j + 1]
            Character.toSurrogates(c, v, j++);
    }
    // Assign to this instance's character array
    this.value = v;
}
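
The two passes are easier to follow with a concrete input. In this sketch (CodePointConstructorDemo is an illustrative name), the code point U+1D56B lies outside the BMP, so it occupies two char values in the resulting String, while 0x110000 is above the Unicode range and is rejected.

```java
class CodePointConstructorDemo {
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Returns true when the constructor refuses the code point.
    static boolean isRejected(int codePoint) {
        try {
            new String(new int[] {codePoint}, 0, 1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = fromCodePoints('a', 0x1D56B);
        // Two code points went in, but the supplementary one needs a surrogate pair.
        System.out.println(s.length());
        System.out.println(isRejected(0x110000));
    }
}
```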

Constructs a String by decoding length bytes of a source byte array, starting at offset, using the charset with the given name.

public String(byte bytes[], int offset, int length, String charsetName)
        throws UnsupportedEncodingException {
    // A null charset name is rejected with a NullPointerException
    if (charsetName == null)
        throw new NullPointerException("charsetName");
    // Static helper that checks offset and length against the byte array
    checkBounds(bytes, offset, length);
    // StringCoding.decode turns the bytes in [offset, offset + length) into a char array
    this.value = StringCoding.decode(charsetName, bytes, offset, length);
}

Same as String(byte[], int, int, String), except that the charset is passed as a Charset object instead of a name, so no UnsupportedEncodingException is declared.

public String(byte bytes[], int offset, int length, Charset charset) {
    if (charset == null)
        throw new NullPointerException("charset");
    checkBounds(bytes, offset, length);
    this.value =  StringCoding.decode(charset, bytes, offset, length);
}

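Constructs a String by decoding an entire byte array with the named charset. It delegates to String(byte[], int, int, String) with offset 0 and the length of the byte array.

public String(byte bytes[], String charsetName)
        throws UnsupportedEncodingException {
    this(bytes, 0, bytes.length, charsetName);
}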

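Constructs a String by decoding an entire byte array with the given Charset. It delegates to String(byte[], int, int, Charset) with offset 0 and the length of the byte array.
```
public String(byte bytes[], Charset charset) {
    this(bytes, 0, bytes.length, charset);
}
```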
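
The charset-aware constructors above matter because the same bytes decode to different text under different charsets. Here is a minimal sketch; ByteDecodingDemo and the name "no-such-charset" are illustrative, while StandardCharsets is the JDK's own class.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class ByteDecodingDemo {
    // The UTF-8 encoding of U+00E9 (e with acute accent) is the bytes 0xC3 0xA9.
    static final byte[] E_ACUTE_UTF8 = {(byte) 0xC3, (byte) 0xA9};

    static String decodeUtf8() {
        return new String(E_ACUTE_UTF8, StandardCharsets.UTF_8);
    }

    static String decodeLatin1() {
        // Same bytes, different charset: each byte becomes its own character.
        return new String(E_ACUTE_UTF8, StandardCharsets.ISO_8859_1);
    }

    // The String-name overload reports an unknown charset as a checked exception.
    static boolean isSupported(String charsetName) {
        try {
            new String(E_ACUTE_UTF8, charsetName);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeUtf8().length());
        System.out.println(decodeLatin1().length());
        System.out.println(isSupported("no-such-charset"));
    }
}
```

Passing a Charset constant is usually more convenient than passing a name, since there is no checked exception to handle.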

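Constructs a String by decoding length bytes of a source byte array, starting at offset. Unlike String(byte[], int, int, String), it uses the platform's default charset.

public String(byte bytes[], int offset, int length) {
    // Check offset and length against the byte array
    checkBounds(bytes, offset, length);
    // Decode the bytes into a char array using the platform's default charset
    this.value = StringCoding.decode(bytes, offset, length);
}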

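Constructs a String by decoding an entire byte array with the platform's default charset. It delegates to String(byte[], int, int) with offset 0 and the length of the byte array.
```
public String(byte bytes[]) {
    this(bytes, 0, bytes.length);
}
```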

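Constructs a new String from a StringBuffer. What is notable here is the synchronized block: the copy is taken while holding the buffer's lock, so no other thread can change the buffer through its synchronized methods during the copy. This makes the constructor thread-safe.

public String(StringBuffer buffer) {
    // Lock the StringBuffer while reading it
    synchronized(buffer) {
        // Copy the StringBuffer's characters into this instance's character array
        this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
    }
}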

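Constructs a new String from a StringBuilder. Unlike the StringBuffer constructor, this one takes no lock and is not thread-safe, because StringBuilder itself is not synchronized.

public String(StringBuilder builder) {
    this.value = Arrays.copyOf(builder.getValue(), builder.length());
}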
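
Because both constructors copy the characters, the resulting String is a snapshot that later edits to the buffer or builder cannot change. A small sketch, with BuilderSnapshotDemo as an illustrative name:

```java
class BuilderSnapshotDemo {
    static String fromBuilder() {
        StringBuilder builder = new StringBuilder("abc");
        String snapshot = new String(builder);  // copies the current characters
        builder.append("def");                  // does not affect the snapshot
        return snapshot;
    }

    static String fromBuffer() {
        StringBuffer buffer = new StringBuffer("abc");
        String snapshot = new String(buffer);   // copy taken under the buffer's lock
        buffer.reverse();                       // does not affect the snapshot
        return snapshot;
    }

    public static void main(String[] args) {
        System.out.println(fromBuilder());
        System.out.println(fromBuffer());
    }
}
```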

Member Methods

Returns the length of the string, which is simply the length of the character array.

public int length() {
    return value.length;
}

Checks whether the string is empty, that is, whether the length of the character array is 0.

public boolean isEmpty() {
    return value.length == 0;
}

Returns the char at the given index.

public char charAt(int index) {
    // An index that is negative or not less than the array length is out of bounds
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    // Return the char at that position in the array
    return value[index];
}

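Returns the Unicode code point (an int) at the given index. This is not limited to ASCII: if the char at index is a high surrogate followed by a low surrogate, the result is the supplementary code point that the pair represents.

public int codePointAt(int index) {
    // An index that is negative or not less than the array length is out of bounds
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    // Return the code point (an int) that starts at index
    return Character.codePointAtImpl(value, index, value.length);
}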

Returns the Unicode code point (an int) that ends just before the given index.

public int codePointBefore(int index) {
    // Index of the char immediately before index
    int i = index - 1;
    // Check that it is within bounds
    if ((i < 0) || (i >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBeforeImpl(value, index, 0);
}
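
The difference between charAt, codePointAt and codePointBefore only shows up when a string contains a surrogate pair. The sketch below (CodePointAccessDemo is an illustrative name) uses the letter a followed by U+1D56B, which is stored as the two code units D835 and DD6B.

```java
class CodePointAccessDemo {
    // 'a' followed by the supplementary character U+1D56B (a surrogate pair).
    static final String TEXT = "a\uD835\uDD6B";

    // charAt returns a single UTF-16 code unit, here the high surrogate.
    static int codeUnitAt1() {
        return TEXT.charAt(1);
    }

    // codePointAt combines the pair that starts at index 1.
    static int codePointAt1() {
        return TEXT.codePointAt(1);
    }

    // Index 2 points into the middle of the pair, so only the low surrogate is returned.
    static int codePointAt2() {
        return TEXT.codePointAt(2);
    }

    // codePointBefore combines the pair that ends just before index 3.
    static int codePointBefore3() {
        return TEXT.codePointBefore(3);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(codeUnitAt1()));
        System.out.println(Integer.toHexString(codePointAt1()));
        System.out.println(Integer.toHexString(codePointAt2()));
        System.out.println(Integer.toHexString(codePointBefore3()));
    }
}
```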

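Returns the number of Unicode code points in the given range, which is the number of actual characters. It is similar to length(): for strings that contain only BMP characters, length() and codePointCount return the same number. They differ when the string contains supplementary characters, each of which is stored as a surrogate pair of two char values. For example, for String str = "\uD835\uDD6B" (the single character U+1D56B, a double-struck small z), length() is 2 but codePointCount(0, str.length()) is 1.

public int codePointCount(int beginIndex, int endIndex) {
    if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
        throw new IndexOutOfBoundsException();
    }
    return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
}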
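
The example from the text can be checked directly. CodePointCountDemo and its two helpers are illustrative names that wrap length() and codePointCount.

```java
class CodePointCountDemo {
    // Number of UTF-16 code units (char values).
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points over the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String z = "\uD835\uDD6B";  // U+1D56B as a surrogate pair
        System.out.println(codeUnits(z) + " code units, " + codePoints(z) + " code point(s)");
        System.out.println(codeUnits("abc") + " code units, " + codePoints("abc") + " code point(s)");
    }
}
```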

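This method also works in terms of Unicode code points. Starting at index, it moves by codePointOffset code points and returns the resulting index. For example, with index = 2 and codePointOffset = 3 the result is 5 when all three code points are in the BMP, and larger when any of them is a surrogate pair.

public int offsetByCodePoints(int index, int codePointOffset) {
    if (index < 0 || index > value.length) {
        throw new IndexOutOfBoundsException();
    }
    return Character.offsetByCodePointsImpl(value, 0, value.length,
            index, codePointOffset);
}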
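
A short sketch of both cases follows. OffsetByCodePointsDemo and firstCodePoints are illustrative names; the second helper shows one plausible use, cutting a prefix by code points without splitting a surrogate pair.

```java
class OffsetByCodePointsDemo {
    static int advance(String s, int index, int codePointOffset) {
        return s.offsetByCodePoints(index, codePointOffset);
    }

    // Returns the prefix of s that contains exactly n code points.
    static String firstCodePoints(String s, int n) {
        return s.substring(0, s.offsetByCodePoints(0, n));
    }

    public static void main(String[] args) {
        // All BMP: three code points are three char positions.
        System.out.println(advance("abcdef", 2, 3));
        // The pair at indices 2 and 3 counts as one code point.
        System.out.println(advance("ab\uD835\uDD6Bcd", 2, 3));
        System.out.println(firstCodePoints("\uD835\uDD6Bbc", 2).length());
    }
}
```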

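This method is not part of the public API. It has no access modifier, so it is package-private and can be called only by classes in the same package. Parameters: dst[] is the destination array, and dstBegin is the offset in the destination array at which copying starts. The method copies the String's entire value array into dst, beginning at position dstBegin.
```
void getChars(char dst[], int dstBegin) {
    System.arraycopy(value, 0, dst, dstBegin, value.length);
}
```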

Copies characters from this string into a destination character array. Parameters: srcBegin is the index of the first character to copy; srcEnd is the index after the last character to copy (the character at srcEnd is not included); dst[] is the destination array; dstBegin is the offset in dst at which the copied characters start to be written.

public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}
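
The four parameters are easiest to understand with a concrete call. In this sketch (GetCharsDemo is an illustrative name), three characters of "hello" are copied into the middle of a prefilled array.

```java
class GetCharsDemo {
    static char[] copyInto() {
        char[] dst = {'-', '-', '-', '-', '-', '-'};
        // Copy the chars at indices 1, 2 and 3 (srcEnd = 4 is exclusive)
        // into dst, starting at dst[2].
        "hello".getChars(1, 4, dst, 2);
        return dst;
    }

    public static void main(String[] args) {
        System.out.println(new String(copyInto()));
    }
}
```

Positions of dst outside the copied range keep their previous contents, because System.arraycopy overwrites only the target range.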

Returns the string as a byte array, encoding its characters with the charset of the given name.

public byte[] getBytes(String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null) throw new NullPointerException();
    return StringCoding.encode(charsetName, value, 0, value.length);
}

Returns the string as a byte array, encoding its characters with the given Charset.

public byte[] getBytes(Charset charset) {
    if (charset == null) throw new NullPointerException();
    return StringCoding.encode(charset, value, 0, value.length);
}

Returns the string as a byte array, encoding its characters with the platform's default charset.

public byte[] getBytes() {
    return StringCoding.encode(value, 0, value.length);
}
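
Encoding is the inverse of the byte-array constructors, and the number of bytes depends on the charset. A minimal sketch, with GetBytesDemo as an illustrative name and an explicit charset used throughout so the result does not depend on the platform default:

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // UTF_16BE writes no byte-order mark, so the count is 2 bytes per char.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encode and decode with the same charset to get the original text back.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "a\u00E9\uD835\uDD6B";  // 1-byte, 2-byte and 4-byte UTF-8 characters
        System.out.println(utf8Length(text));
        System.out.println(utf16Length(text));
        System.out.println(roundTrip(text).equals(text));
    }
}
```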

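Summary

String is declared final, so it cannot be subclassed; it is also an immutable class.
String implements Serializable, so it can be serialized.
String implements Comparable, so instances can be compared and ordered.
String implements CharSequence, which represents an ordered sequence of characters and defines the common character-sequence methods.
String is a character sequence whose internal data structure (in JDK 8) is a character array; all of its methods operate on that array.
String relies heavily on System.arraycopy to copy character arrays, either directly (as in getChars) or through Arrays.copyOf and Arrays.copyOfRange.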
