
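Validating strings

Development work often requires checking that a string is well formed.

For example, validating the format of an email string entered by a user.

Writing the validation logic by hand

We can write a piece of code that validates the format of the email string: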

// 6 to 18 characters; letters, digits, and underscores allowed; must start with a letter
public static boolean validate(String email) {
	if (email == null) {
		System.out.println("Must not be null");
		return false;
	}
	char[] chars = email.toCharArray();
	if (chars.length < 6 || chars.length > 18) {
		System.out.println("Must be 6 to 18 characters long");
		return false;
	}
	if (!isLetter(chars[0])) {
		System.out.println("Must start with a letter");
		return false;
	}
	for (int i = 1; i < chars.length; i++) {
		char c = chars[i];
		if (isLetter(c) || isDigit(c) || c == '_') continue;
		System.out.println("Must contain only letters, digits, and underscores");
		return false;
	}
	return true;
}
// Checks whether c is an ASCII letter
public static boolean isLetter(char c) {
	return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
// Checks whether c is an ASCII digit
public static boolean isDigit(char c) {
	return c >= '0' && c <= '9';
}

public static void main(String[] args) {
	// Prints: Must be 6 to 18 characters long
	validate("12345");
	// Prints: Must start with a letter
	validate("123456");
	// Returns true (prints nothing)
	validate("vv123_456");
	// Prints: Must contain only letters, digits, and underscores
	validate("vv123+/?456");
}

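Using a regular expression

The validation code above is fairly long; a regular expression does the same job in one line (what a regular expression is will become clear as you read on):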

// String regex = "[a-zA-Z][a-zA-Z0-9_]{5,17}"; // the two forms are equivalent
String regex = "[a-zA-Z]\\w{5,17}";
"12345".matches(regex); // false
"123456".matches(regex); // false
"vv123_456".matches(regex); // true
"vv123+/?456".matches(regex); // false

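[a-zA-Z]\\w{5,17} is a regular expression (written here as a Java string literal, hence the doubled backslash).

It replaces the lengthy validation logic with a very compact syntax
and greatly improves development efficiency.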
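
As a minimal sketch of how the same check might be packaged for reuse, the class below compiles the pattern once and exposes a single method. The names AccountValidator, ACCOUNT, and isValid are hypothetical and not part of the original notes; treating null as invalid mirrors the hand-written version above. Pattern and Matcher themselves are covered later in these notes.

```java
import java.util.regex.Pattern;

class AccountValidator {
    // Hypothetical constant; the pattern is the one shown above, compiled once for reuse.
    private static final Pattern ACCOUNT = Pattern.compile("[a-zA-Z]\\w{5,17}");

    // Returns true only when the whole input matches; null counts as invalid.
    static boolean isValid(String input) {
        return input != null && ACCOUNT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("12345"));       // false
        System.out.println(isValid("vv123_456"));   // true
        System.out.println(isValid("vv123+/?456")); // false
    }
}
```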

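Regular expressions are a general-purpose technique, available in most popular programming languages.

// Regular expression in JavaScript
// ^ and $ anchor the pattern so that test() checks the whole string, like matches() in Java
const regex = /^[a-zA-Z]\w{5,17}$/;
regex.test('12345'); // false
regex.test('123456'); // false
regex.test('vv123_456'); // true
regex.test('vv123+/?456'); // false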

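Single-character matching

// The first character must be b, c, or r, followed by "at"
// Equivalent to (b|c|r)at; inside parentheses the "|" cannot be omitted
// (inside square brackets "|" is a literal character, so [b|c|r]at would also match "|at")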
String regex = "[bcr]at";
"bat".matches(regex); // true
"cat".matches(regex); // true
"rat".matches(regex); // true
"hat".matches(regex); // false

// The first character must be something other than b, c, or r, followed by "at"
String regex = "[^bcr]at"; 
"bat".matches(regex); // false
"cat".matches(regex); // false
"rat".matches(regex); // false
"hat".matches(regex); // true

// "foo" must be followed by one digit from 1 to 5
String regex = "foo[1-5]";
"foo3".matches(regex); // true
"foo6".matches(regex); // false

// "foo" must be followed by one character other than 1 to 5
String regex = "foo[^1-5]";
"foo3".matches(regex); // false
"foo6".matches(regex); // true

// Outside square brackets "-" is a literal, so only the exact text "foo1-5" matches
String regex = "foo1-5";
"foo1-5".matches(regex); // true
"foo1".matches(regex); // false
"foo5".matches(regex); // false

// Union of two ranges; equivalent to "[0-46-8]"
String regex = "[0-4[6-8]]";
"5".matches(regex); // false
"7".matches(regex); // true
"9".matches(regex); // false

// Intersection of [0-9] and [^345], i.e. [0-2[6-9]]
String regex = "[0-9&&[^345]]";
"2".matches(regex); // true
"3".matches(regex); // false
"4".matches(regex); // false
"5".matches(regex); // false
"6".matches(regex); // true

// Intersection of [0-9] and [345]; equivalent to [3-5]
String regex = "[0-9&&[345]]";
"2".matches(regex); // false
"3".matches(regex); // true
"4".matches(regex); // true
"5".matches(regex); // true
"6".matches(regex); // false

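The union and intersection syntax above can be combined into slightly more practical classes. The following is only an illustrative sketch: the class name CharClassDemo and its two methods are hypothetical, chosen to show "[a-z] minus the vowels" and a three-range union.

```java
class CharClassDemo {
    // Lowercase consonants: [a-z] intersected with "anything that is not a vowel".
    static final String CONSONANT = "[a-z&&[^aeiou]]";
    // One hexadecimal digit: the union of three ranges.
    static final String HEX_DIGIT = "[0-9[a-f][A-F]]";

    static boolean isConsonant(String s) {
        return s.matches(CONSONANT);
    }

    static boolean isHexDigit(String s) {
        return s.matches(HEX_DIGIT);
    }

    public static void main(String[] args) {
        System.out.println(isConsonant("b")); // true
        System.out.println(isConsonant("a")); // false
        System.out.println(isHexDigit("F"));  // true
        System.out.println(isHexDigit("g"));  // false
    }
}
```
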
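Predefined character classes

In a Java string literal, a character preceded by a single backslash \ is treated as an escape sequence.

So, to pass a predefined character class to the regex engine intact, it must be written with two backslashes \\, for example "\\d".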
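
To make the doubled backslash concrete, here is a small sketch that classifies a single character with three predefined classes from java.util.regex: \d (a digit), \w (a word character, [a-zA-Z_0-9]), and \s (whitespace). The class name PredefinedDemo and the method classify are hypothetical. The uppercase forms \D, \W, and \S are the corresponding negations.

```java
class PredefinedDemo {
    // Returns a label for the character; \d is tested first because digits are also \w.
    static String classify(char c) {
        String s = String.valueOf(c);
        if (s.matches("\\d")) return "digit";
        if (s.matches("\\w")) return "word";
        if (s.matches("\\s")) return "space";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify('7'));  // digit
        System.out.println(classify('_'));  // word
        System.out.println(classify('\t')); // space
        System.out.println(classify('+'));  // other
    }
}
```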

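// Matches any single character (by default, anything except a line terminator)
String regex = ".";
"@".matches(regex); // true
"c".matches(regex); // true
"6".matches(regex); // true
".".matches(regex); // true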

// Matches only "."
String regex = "\\.";
"@".matches(regex); // false
"c".matches(regex); // false
"6".matches(regex); // false
".".matches(regex); // true

// Matches the literal text "[123]"
String regex = "\\[123\\]";
"1".matches(regex); // false
"2".matches(regex); // false
"3".matches(regex); // false
"[123]".matches(regex); // true

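When a whole piece of text should be taken literally, escaping each metacharacter by hand gets tedious. One option, sketched below, is Pattern.quote from java.util.regex, which returns a pattern string that matches its argument verbatim. The class name QuoteDemo and the method equalsLiterally are hypothetical.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // True when input is exactly the given literal; metacharacters in literal have no special meaning.
    static boolean equalsLiterally(String input, String literal) {
        return input.matches(Pattern.quote(literal));
    }

    public static void main(String[] args) {
        System.out.println(equalsLiterally("[123]", "[123]")); // true
        System.out.println(equalsLiterally("1", "[123]"));     // false
        System.out.println(equalsLiterally("abc", "a.c"));     // false
    }
}
```
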
// Matches a digit; equivalent to [0-9]
String regex = "\\d";
"c".matches(regex); // false
"6".matches(regex); // true

// Matches a non-digit; equivalent to [^0-9]
String regex = "\\D";
"c".matches(regex); // true
"6".matches(regex); // false

Quantifiers

// Matches exactly "666"
String regex = "6{3}";
"66".matches(regex);   // false
"666".matches(regex);  // true 
"6666".matches(regex); // false

// "6" occurs 2 to 4 times: "66", "666", "6666"
String regex = "6{2,4}";
"6".matches(regex); // false
"66".matches(regex); // true
"666".matches(regex); // true
"6666".matches(regex); // true
"66666".matches(regex); // false

// "6" occurs at least 2 times
String regex = "6{2,}";
"6".matches(regex); // false
"66".matches(regex); // true
"666".matches(regex); // true
"6666".matches(regex); // true
"66666".matches(regex); // true

// "6" occurs 0 or 1 time
String regex = "6?";
"".matches(regex); // true
"6".matches(regex); // true
"66".matches(regex); // false

// "6" occurs any number of times (including zero)
String regex = "6*";
"".matches(regex); // true
"6".matches(regex); // true
"66".matches(regex); // true
"67".matches(regex); // false, the whole string must match

// "6" occurs at least once
String regex = "6+";
"".matches(regex); // false
"6".matches(regex); // true
"66".matches(regex); // true

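Quantifiers become more useful once they are combined with character classes. The sketch below is only an illustration: the class QuantifierDemo and its methods are hypothetical, and looksLikeDate checks the shape of the text only, not whether the date actually exists.

```java
class QuantifierDemo {
    // An optional sign (?) followed by one or more digits (+).
    static boolean isInteger(String s) {
        return s.matches("[+-]?\\d+");
    }

    // Exactly 4 digits, a dash, 2 digits, a dash, 2 digits ({n}).
    static boolean looksLikeDate(String s) {
        return s.matches("\\d{4}-\\d{2}-\\d{2}");
    }

    public static void main(String[] args) {
        System.out.println(isInteger("-42"));            // true
        System.out.println(isInteger("4.2"));            // false
        System.out.println(looksLikeDate("2024-01-15")); // true
        System.out.println(looksLikeDate("2024-1-15"));  // false
    }
}
```
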
Pattern and Matcher

Under the hood, String's matches method relies on two classes, Pattern and Matcher:

// Source of java.lang.String:
public boolean matches(String regex) {
    return Pattern.matches(regex, this);
}

// Source of java.util.regex.Pattern:
public static boolean matches(String regex, CharSequence input) {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(input);
    return m.matches();
}

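Commonly used Matcher methods

// Returns true if the entire input matches the regex
public boolean matches();

// Returns true if a subsequence of the input that matches the regex is found
// After a successful match, start, end, and group provide more details
// Each call resumes searching after the end of the previous match
public boolean find();

// Returns the start index of the previous match
public int start();

// Returns the end index of the previous match (exclusive: the offset just past the last matched character)
public int end();

// Returns the input subsequence matched by the previous match
public String group();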

[Matcher utility] Finding all matching subsequences

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void findAll(String regex, String input) {
	findAll(regex, input, 0);
}

public static void findAll(String regex, String input, int flags) {
	if (regex == null || input == null) return;
	// Compile the regex (throws PatternSyntaxException if it is invalid); flags select the match modes
	Pattern p = Pattern.compile(regex, flags);
	Matcher m = p.matcher(input); // Create a matcher for this input
	boolean found = false;
	while (m.find()) {
		found = true;
		// Print the matched text and its [start, end) range
		System.out.format("\"%s\", [%d, %d)%n", m.group(), m.start(), m.end());
	}
	if (!found) {
		System.out.println("No match.");
	}
}
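
If the matches are needed as data rather than printed, the same find loop can collect them into a list. This is a sketch under assumptions: the class MatchCollector, the method collect, and the text@[start,end) entry format are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchCollector {
    // Returns one entry per match, formatted as text@[start,end).
    static List<String> collect(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group() + "@[" + m.start() + "," + m.end() + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collect("\\d{3}", "111_222")); // [111@[0,3), 222@[4,7)]
        System.out.println(collect("a+", ""));            // []
    }
}
```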

Matcher examples:

findAll("\\d{3}", "111_222_333_444_555");
/*
"111", [0, 3)
"222", [4, 7)
"333", [8, 11)
"444", [12, 15)
"555", [16, 19)
*/

String regex = "123";

findAll(regex, "123");
// "123", [0, 3)

findAll(regex, "6_123_123_123_7");
/*
"123", [2, 5)
"123", [6, 9)
"123", [10, 13)
*/

String regex = "[abc]{3}";

findAll(regex, "abccabaaaccbbbc");
/*
"abc", [0, 3)
"cab", [3, 6)
"aaa", [6, 9)
"ccb", [9, 12)
"bbc", [12, 15)
*/

String regex = "\\d{2}";

findAll(regex, "0_12_345_67_8");
/*
"12", [2, 4)
"34", [5, 7)
"67", [9, 11)
*/

String input = "";

findAll("a?", input);
// "", [0, 0)

findAll("a*", input);
// "", [0, 0)

findAll("a+", input);
// No match.

String input = "a";

findAll("a?", input);
// "a", [0, 1)
// "", [1, 1)

findAll("a*", input);
// "a", [0, 1)
// "", [1, 1)

findAll("a+", input);
// "a", [0, 1)

String input = "abbaaa";

findAll("a?", input);
/*
"a", [0, 1)
"", [1, 1)
"", [2, 2)
"a", [3, 4)
"a", [4, 5)
"a", [5, 6)
"", [6, 6)
*/

findAll("a*", input);
/*
"a", [0, 1)
"", [1, 1)
"", [2, 2)
"aaa", [3, 6)
"", [6, 6)
*/

findAll("a+", input);
// "a", [0, 1)
// "aaa", [3, 6)

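Matcher: greedy, reluctant, and possessive quantifiers

A plain quantifier is greedy; appending ? makes it reluctant, and appending + makes it possessive.

String input = "afooaaaaaafooa";
findAll(".*foo", input); // greedy: takes as much as possible, then backtracks
// "afooaaaaaafoo", [0, 13)

findAll(".*?foo", input); // reluctant: takes as little as possible
// "afoo", [0, 4)
// "aaaaaafoo", [4, 13)

findAll(".*+foo", input); // possessive: takes as much as possible and never backtracks
// No match.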
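
The difference shows up quickly when matching delimited text such as tags. The sketch below is illustrative only (regular expressions are not a general HTML parser); the class TagFinder and its find method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagFinder {
    // Collects the text of every match of regex in input.
    static List<String> find(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";
        System.out.println(find("<.+>", input));  // greedy:     [<b>bold</b>]
        System.out.println(find("<.+?>", input)); // reluctant:  [<b>, </b>]
        System.out.println(find("<.++>", input)); // possessive: []
    }
}
```

The possessive form finds nothing because .++ consumes the final > as well and never gives it back.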

Capturing groups

Put simply, whatever is enclosed in a pair of parentheses is a capturing group.
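
Besides grouping a sub-pattern, a capturing group lets you read back the text it matched through Matcher.group(int), where group 0 is the whole match and groups are numbered from 1. A minimal sketch follows; the class DateParts and the method parts are hypothetical, and the pattern checks only the shape of the date.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateParts {
    // Three capturing groups: year, month, day.
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Returns {year, month, day}, or null when the input does not have the expected shape.
    static String[] parts(String date) {
        Matcher m = DATE.matcher(date);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] p = parts("2024-01-15");
        System.out.println(p[0] + " " + p[1] + " " + p[2]); // 2024 01 15
        System.out.println(parts("2024/01/15") == null);    // true
    }
}
```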

String regex1 = "dog{3}";
"doggg".matches(regex1); // true

String regex2 = "[dog]{3}";
"ddd".matches(regex2); // true
"ooo".matches(regex2); // true
"ggg".matches(regex2); // true
"dog".matches(regex2); // true
"gog".matches(regex2); // true
"gdo".matches(regex2); // true
// ... 3 * 3 * 3 = 27 possibilities in total

// (dog) is a capturing group
String regex3 = "(dog){3}";
"dogdogdog".matches(regex3); // true

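Capturing groups: backreferences

Backreference: a backslash \ followed by a group number (starting from 1) refers to the text captured by that group.

// (\\d\\d) is a capturing group; \\1 refers to the first group (the text must be identical)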
String regex = "(\\d\\d)\\1";
"1212".matches(regex); // true
"1234".matches(regex); // false

// Two groups in total
// Group 1: ([a-z]{2})
// Group 2: ([A-Z]{2})
// \\2\\1 refers to group 2 first, then to group 1
String regex = "([a-z]{2})([A-Z]{2})\\2\\1";
"mjPKPKmj".matches(regex); // true
"mjPKmjPK".matches(regex); // false

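// Four groups in total, numbered by the position of their opening parenthesis
// Group 1: ((I)( Love( You)))
// Group 2: (I)
// Group 3: ( Love( You))
// Group 4: ( You)
// \\3{2} repeats group 3 twice, i.e. the match must continue with " Love You Love You"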
String regex = "((I)( Love( You)))\\3{2}";
"I Love You Love You Love You".matches(regex); // true

String input = "aaabbbb";
// The regex below is equivalent to "([a-z])\\1{3}"
String regex = "([a-z])\\1\\1\\1";
findAll(regex, input);
// "bbbb", [3, 7)
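
A common practical use of backreferences is spotting accidentally doubled words. The sketch below is one possible way to do it, not a definitive implementation: the class RepeatedWords and its methods are hypothetical. Inside the regex the group is referenced as \\1, while in the replacement string of replaceAll it is referenced as $1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedWords {
    // A word, some whitespace, then the same word again.
    private static final Pattern DOUBLED = Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Returns every doubled-word occurrence found in text.
    static List<String> find(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = DOUBLED.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Replaces each doubled word with a single copy.
    static String collapse(String text) {
        return DOUBLED.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        String text = "this is is a test test";
        System.out.println(find(text));     // [is is, test test]
        System.out.println(collapse(text)); // this is a test
    }
}
```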

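Boundary matchers

Basic concepts (terminator, input, line, word boundary)

Terminator (final terminator, line terminator)

\r (carriage return), \n (line feed), \r\n (carriage return followed by line feed)

Input: the entire string

Line: a segment of the input that ends with a terminator (or with the end of the input)

If the input is dog\ndog\rdog,
then each of the 3 dog is a line
(the terminators only take effect in multiline mode; otherwise the whole input is still treated as a single line).

Word boundary:

// What counts as a word boundary?
// Any character other than English letters (upper or lower case), digits, underscores, and ordinary letters of other languages
// In the result below, "_", "6", and "哈" are treated as word characters, so only the dog between "+" and "-" matches
String input = "dog_dog6dog+dog-dog哈";
findAll("\\bdog\\b", input);
// "dog", [12, 15)

\b stands for a word boundary:

// \\b is a word boundary; both the left and the right side of dog must be word boundaries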
String regex = "\\bdog\\b";

// " " and "." are word boundaries
findAll(regex, "This is a dog.");
// "dog", [10, 13)

findAll(regex, "This is a doggie.");
// No match.

// The start of the input counts as a word boundary
findAll(regex, "dog is cute");
// "dog", [0, 3)

// "," is a word boundary
findAll(regex, "I love cat,dog,pig.");
// "dog", [11, 14)

\B stands for a non-word boundary:

// A word boundary on the left of dog, but not on the right
String regex = "\\bdog\\B";

findAll(regex, "This is a dog.");
// No match.

findAll(regex, "This is a doggie.");
// "dog", [10, 13)

findAll(regex, "dog is cute");
// No match.

findAll(regex, "I love cat,dog,pig.");
// No match.
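
Word boundaries are handy for counting whole-word occurrences. The following sketch assumes the word itself contains only word characters; the class WordCounter and the method count are hypothetical, and Pattern.quote is used so that the word is taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCounter {
    // Counts occurrences of word that have a word boundary on both sides.
    static int count(String word, String text) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("dog", "dog doggie hotdog dog.")); // 2
        System.out.println(count("dog", "This is a doggie."));      // 0
    }
}
```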

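^ stands for the start of a line, $ for the end of a line:

// ^ is the start of a line, $ is the end of a line
// Requires dog, with d at the start of the line and g at the end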
String regex = "^dog$";
findAll(regex, "dog");
// "dog", [0, 3)

findAll(regex, "     dog");
// No match.

// -------------------------------------

findAll("\\s*dog$", "    dog");
// "    dog", [0, 7)

findAll("^dog\\w*", "dogblahblah");
// "dogblahblah", [0, 11)

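\A stands for the start of the input, \z for the end of the input, and \Z for the end of the input apart from an optional final terminator:

// "\A": the start of the input
// "\z": the end of the input
// "\Z": the end of the input (a final terminator may follow)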
String regex1 = "\\Adog\\z";
String regex2 = "\\Adog\\Z";

findAll(regex1, "dog");
// "dog", [0, 3)
findAll(regex2, "dog");
// "dog", [0, 3)

findAll(regex1, "dog\n");
// No match.
findAll(regex2, "dog\n");
// "dog", [0, 3)

findAll(regex1, "dog\ndog\rdog");
// No match.
findAll(regex2, "dog\ndog\rdog");
// No match.

findAll(regex1, "dog\ndog\rdog", Pattern.MULTILINE);
// No match.
findAll(regex2, "dog\ndog\rdog", Pattern.MULTILINE);
// No match.

\G stands for the end of the previous match (rarely used):

// The start of the input counts as the end of a previous match
String regex = "\\Gdog";
findAll(regex, "dog");
// "dog", [0, 3)

findAll(regex, "dog dog");
// "dog", [0, 3)

findAll(regex, "dogdog");
// "dog", [0, 3)
// "dog", [3, 6)

Common modes (CASE_INSENSITIVE, DOTALL, MULTILINE)

CASE_INSENSITIVE, case-insensitive mode:

String regex = "dog";
String input = "Dog_dog_DOG";

// Default: case-sensitive
findAll(regex, input);
// "dog", [4, 7)

// Enable case-insensitive mode
findAll(regex, input, Pattern.CASE_INSENSITIVE);
// "Dog", [0, 3)
// "dog", [4, 7)
// "DOG", [8, 11)

// Case-insensitive mode, written as an embedded flag in the regex
findAll("(?i)dog", input);
// "Dog", [0, 3)
// "dog", [4, 7)
// "DOG", [8, 11)

DOTALL, single-line mode:

// "." matches any single character
String regex = ".";
String input = "\r\n";

findAll(regex, input); // by default "." does not match terminators
// No match.

// Single-line mode ("." matches any character, including terminators)
findAll(regex, input, Pattern.DOTALL);
// "\r", [0, 1)
// "\n", [1, 2)

// Multiline mode (^ and $ match the start and end of each line; "." still does not match terminators)
findAll(regex, input, Pattern.MULTILINE);
// No match.

findAll(regex, input, Pattern.MULTILINE | Pattern.DOTALL);
// "\r", [0, 1)
// "\n", [1, 2)

MULTILINE, multiline mode:

// d at the start of a line, g at the end of the line, o in between
String regex = "^dog$";
String input = "dog\ndog\rdog";

findAll(regex, input);
// No match.

findAll(regex, input, Pattern.DOTALL); // single-line mode lets "." match terminators but does not change ^ and $
// No match.

findAll(regex, input, Pattern.MULTILINE); // multiline mode (^ and $ now match the start and end of every line)
// "dog", [0, 3)
// "dog", [4, 7)
// "dog", [8, 11)

// Both modes combined: the effects of single-line mode and multiline mode apply together
findAll(regex, input, Pattern.DOTALL | Pattern.MULTILINE);
// "dog", [0, 3)
// "dog", [4, 7)
// "dog", [8, 11)
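
Like (?i) for CASE_INSENSITIVE, the other two modes also have embedded flags: (?m) for MULTILINE and (?s) for DOTALL. As a small sketch of multiline mode in practice, the code below pulls out the lines that begin with #; the class CommentLines, the method extract, and the sample text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentLines {
    // (?m) makes ^ and $ match at every line; "." still stops at the line terminator.
    private static final Pattern COMMENT = Pattern.compile("(?m)^#.*$");

    // Returns every line of text that starts with '#'.
    static List<String> extract(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = COMMENT.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("# title\nbody\n# footer")); // [# title, # footer]
    }
}
```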

Commonly used regular expressions

Online regular expression tester: https://c.runoob.com/front-end/854

For example:
18-digit ID card number: \d{17}[\dXx]
Chinese characters: [\u4e00-\u9fa5]
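
Written as Java string literals, the backslashes in these two patterns must be doubled. The sketch below is illustrative: the class CommonPatterns and its methods are hypothetical, isIdNumber checks only the shape (it does not verify the check digit), and the sample ID number is made up.

```java
class CommonPatterns {
    // 17 digits followed by a digit or X/x.
    static boolean isIdNumber(String s) {
        return s.matches("\\d{17}[\\dXx]");
    }

    // One or more characters in the range \u4e00 to \u9fa5.
    static boolean isChinese(String s) {
        return s.matches("[\\u4e00-\\u9fa5]+");
    }

    public static void main(String[] args) {
        System.out.println(isIdNumber("11010519491231002X")); // true
        System.out.println(isIdNumber("1101051949123100"));   // false
        System.out.println(isChinese("\u4e2d\u6587"));        // true
        System.out.println(isChinese("abc"));                 // false
    }
}
```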

String and regular expressions (replaceAll, replaceFirst, split)

Commonly used String methods that take a regular expression as a parameter:

public String replaceAll(String regex, String replacement)

public String replaceFirst(String regex, String replacement)

public String[] split(String regex)
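
As a quick sketch of the first two methods: the replacement string may refer to capturing groups with $1, $2, and so on, and replaceFirst changes only the first match. The class ReplaceDemo and its methods are hypothetical names used for illustration.

```java
class ReplaceDemo {
    // Rewrites every yyyy-MM-dd date as dd/MM/yyyy by reordering the captured groups.
    static String toDayFirst(String text) {
        return text.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // Replaces only the first run of digits with '#'.
    static String maskFirstNumber(String text) {
        return text.replaceFirst("\\d+", "#");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("due 2024-01-15")); // due 15/01/2024
        System.out.println(maskFirstNumber("a1b22c333")); // a#b22c333
    }
}
```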

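[Exercise] Replacing a word in a string

Replace the word row with the word line.

replace performs plain text replacement (it does not accept a regular expression), so it is less powerful than the regex-based methods.
For the following input, though, both approaches give the same result.

// Replace the word row with the word line
String s1 = "The row we are looking for is row 8.";

String s2 = s1.replace("row", "line"); // replaced correctly
// The line we are looking for is line 8.
String s3 = s1.replaceAll("\\brow\\b", "line"); // replaced correctly
// The line we are looking for is line 8.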

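replaceAll accepts a regular expression, which makes it more powerful.
For the following input, plain text replacement cannot produce the desired result; a regular expression is needed.

// Replace the word row with the word line
String s1 = "Tomorrow I will wear in brown standing in row 10.";

String s2 = s1.replace("row", "line"); // wrong result: "row" inside other words is replaced too
// Tomorline I will wear in blinen standing in line 10.
String s3 = s1.replaceAll("\\brow\\b", "line"); // replaced correctly
// Tomorrow I will wear in brown standing in line 10.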

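[Exercise] Replacing digits in a string

Replace every run of consecutive digits with **.

// Replace every run of consecutive digits with "**"
String s1 = "ab12c3d456efg7h89i1011jk12lmn";

// replaceAll is required here; replace would search for the literal text \d+
String s2 = s1.replaceAll("\\d+", "**");
// ab**c**d**efg**h**i**jk**lmn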

[Exercise] Splitting a string on digits

String s1 = "ab12c3d456efg7h89i1011jk12lmn";
String[] strs = s1.split("\\d+");
// [ab, c, d, efg, h, i, jk, lmn]
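
Two edge cases of split are worth knowing: a delimiter at the very start of the input produces a leading empty string, while trailing empty strings are dropped unless a negative limit is passed to the two-argument overload split(String regex, int limit). A short sketch follows; the class SplitDemo and its methods are hypothetical.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on runs of digits; trailing empty strings are removed.
    static String[] byDigits(String s) {
        return s.split("\\d+");
    }

    // Same split, but a negative limit keeps trailing empty strings.
    static String[] byDigitsKeepTrailing(String s) {
        return s.split("\\d+", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(byDigits("12ab3cd")));            // [, ab, cd]
        System.out.println(Arrays.toString(byDigits("ab1cd2")));             // [ab, cd]
        System.out.println(Arrays.toString(byDigitsKeepTrailing("ab1cd2"))); // [ab, cd, ]
    }
}
```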

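[Exercise] Extracting doubled letters and digits

Extract the letter and the digit from substrings of the form "the same lowercase letter twice, followed by the same digit twice".

For example, from "aa33" extract a and 3;
"aa12", "ab44", "aabb", and "5566" do not qualify.

// Extract the letter and the digit from substrings of the form
// "the same lowercase letter twice, followed by the same digit twice"
// For example, from "aa33" extract a and 3
// "aa12", "ab44", "aabb", and "5566" do not qualify
String input = "aa11+bb23-mj33*dd44/5566%ff77";
String regex = "([a-z])\\1(\\d)\\2";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
while (m.find()) {
	System.out.println(m.group() + "_" + m.group(1) + ", " + m.group(2));
}

aa11_a, 1
dd44_d, 4
ff77_f, 7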

// Extract the last digit from substrings of the form "two lowercase letters followed by two digits"
// For example, from "ab12" extract 2
String input = "aa12+bb34-m56j*dd78/9900";
String regex = "[a-z]{2}\\d(\\d)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
while (m.find()) {
	System.out.println(m.group() + "_" + m.group(1));
}

aa12_2
bb34_4
dd78_8
