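MySQL Notes 5: Full-Text Search

Reference: 《MySQL必知必會》 (MySQL Crash Course) by Ben Forta, Chapter 18, Full-Text Search

Official script download -> https://forta.com/books/0672327120/ 
If the link above is unreachable -> https://download.csdn.net/download/wy_hhxx/12277619
===========================
First import the SQL scripts create.sql and populate.sql.
create.sql creates the tables -> I added two statements at the top of the script so that the tables are created in a database named supply: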
CREATE DATABASE IF NOT EXISTS `supply`;
USE `supply`;

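populate.sql fills the tables with data -> likewise, I added one statement at the top of the script: USE `supply`;

Then import both scripts into the database with the following commands: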

[root@xxx bin]# ./mysql -uroot -p < create.sql
Enter password:
[root@xxx bin]# ./mysql -uroot -p < populate.sql
Enter password:
[root@xxx bin]#

After the import, a new supply database exists, containing the following 6 tables.

mysql> use supply;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>
mysql> show tables;
+------------------+
| Tables_in_supply |
+------------------+
| customers        |
| orderitems       |
| orders           |
| productnotes     |
| products         |
| vendors          |
+------------------+
6 rows in set (0.00 sec)

mysql>

 

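1. The two most commonly used MySQL storage engines are MyISAM and InnoDB. According to the book (written for MySQL 5.x), MyISAM supports full-text search and InnoDB does not; note that InnoDB has supported FULLTEXT indexes since MySQL 5.6.
Full-text search is usually enabled when the table is created. The CREATE TABLE statement accepts a FULLTEXT clause, which takes a comma-separated list of the columns to index.

[Example] productnotes is created with FULLTEXT(note_text), so full-text searches can then be run against the note_text column.
Note: once the index is defined, MySQL maintains it automatically. Whenever rows are added, updated, or deleted, the index is updated accordingly.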

mysql> SHOW CREATE TABLE productnotes\G
*************************** 1. row ***************************
       Table: productnotes
Create Table: CREATE TABLE `productnotes` (
  `note_id` int(11) NOT NULL AUTO_INCREMENT,
  `prod_id` char(10) NOT NULL,
  `note_date` datetime NOT NULL,
  `note_text` text,
  PRIMARY KEY (`note_id`),
  FULLTEXT KEY `note_text` (`note_text`)
) ENGINE=MyISAM AUTO_INCREMENT=115 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> SELECT * FROM productnotes LIMIT 1\G
*************************** 1. row ***************************
  note_id: 101
  prod_id: TNT2
note_date: 2005-08-17 00:00:00
note_text: Customer complaint:
Sticks not individually wrapped, too easy to mistakenly detonate all at once.
Recommend individual wrapping.
1 row in set (0.00 sec)

mysql>
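
The FULLTEXT index does not have to be declared in CREATE TABLE; it can also be added to an existing table. A minimal sketch, assuming a hypothetical table demo_notes and index name ft_note_text (neither is part of the book's scripts):

```sql
-- Hypothetical demo table, not part of create.sql.
CREATE TABLE demo_notes (
  note_id   INT  NOT NULL AUTO_INCREMENT,
  note_text TEXT NULL,
  PRIMARY KEY (note_id)
) ENGINE=MyISAM;

-- Add the full-text index after the table already exists.
ALTER TABLE demo_notes ADD FULLTEXT INDEX ft_note_text (note_text);

-- List the indexes; the new one should show Index_type = FULLTEXT.
SHOW INDEX FROM demo_notes;
```

When loading a large amount of data, creating the index after the import is usually faster than maintaining it row by row during the load.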

 

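2. Full-text searches use two functions, Match() and Against(): Match() specifies the columns to search, and Against() specifies the search expression.

[Example] Match(note_text) tells MySQL to search the named column, and Against('rabbit') specifies the word rabbit as the search text.
Note: two rows contain the word rabbit, and both are returned. The search is case-insensitive (unless BINARY mode is used).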

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('rabbit');
+----------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                         |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+----------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('Rabbit');
+----------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                         |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+----------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
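
Match() ... Against() can also appear in the SELECT list, where it returns a relevance score for each row instead of filtering. A small sketch against the same productnotes table; the alias relevance is my own choice:

```sql
-- Each row gets a relevance score for the word rabbit;
-- rows that do not contain the word score 0.
SELECT note_id,
       Match(note_text) Against('rabbit') AS relevance
FROM productnotes
ORDER BY relevance DESC;
```

This makes it easier to see why the two matching rows come back in the order they do: in a natural-language search used in the WHERE clause, rows with higher relevance are normally returned first.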

 

3. Boolean text search

mysql> SELECT COUNT(*) FROM productnotes WHERE Match(note_text) Against('heavy' IN BOOLEAN MODE);
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to' IN BOOLEAN MODE);
Empty set (0.00 sec)

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy' IN BOOLEAN MODE);
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Item is extremely heavy. Designed for dropping, not recommended for use with slings, ropes, pulleys, or tightropes.                                     |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead. |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql>

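Note: the book states: "Many words occur so frequently that searching for them is useless (too many results are returned). MySQL therefore applies a 50% rule: if a word appears in more than 50% of the rows, it is treated as a stopword and ignored [1]. The 50% rule does not apply to IN BOOLEAN MODE. [2]"

[1] A search for "to" does return nothing, as shown below. (A LIKE '%to%' check matches 12 rows, although that pattern also matches substrings such as "customer" or "too", so it overcounts the word itself.)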

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to');
Empty set (0.00 sec)

mysql> SELECT COUNT(*) FROM productnotes WHERE note_text LIKE '%to%';
+----------+
| COUNT(*) |
+----------+
|       12 |
+----------+
1 row in set (0.00 sec)

mysql> 

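[2] In my test, the search for "to" also returns nothing IN BOOLEAN MODE. This is most likely not the 50% rule at work: "to" is shorter than the MyISAM minimum indexed word length (ft_min_word_len, 4 by default) and is also in the default stopword list, and both of these filters still apply in boolean mode.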

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to' IN BOOLEAN MODE);
Empty set (0.00 sec)
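
To check whether an empty result like this comes from the word-length or stopword settings rather than from the 50% rule, the relevant server variables can be inspected. A minimal sketch, assuming a MyISAM table as in this note (InnoDB uses different variables, such as innodb_ft_min_token_size):

```sql
-- Shortest word that MyISAM full-text indexes keep (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Which stopword list is in use ("(built-in)" means the default list).
SHOW VARIABLES LIKE 'ft_stopword_file';

-- If ft_min_word_len is changed in the server configuration and the server
-- is restarted, existing MyISAM full-text indexes must be rebuilt, e.g.:
REPAIR TABLE productnotes QUICK;
```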

[Example] -rope* tells MySQL to exclude rows containing rope* (any word beginning with rope, including ropes).

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy -rope*' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
note_text: Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead.
1 row in set (0.00 sec)

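[Example] Match rows containing at least one of rabbit and bait: Against('rabbit bait' IN BOOLEAN MODE)
Match rows containing both rabbit and bait: Against('+rabbit +bait' IN BOOLEAN MODE)
Match the phrase rabbit bait rather than the two separate words: Against('"rabbit bait"' IN BOOLEAN MODE)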

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('rabbit bait' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
note_text: Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait.
*************************** 2. row ***************************
note_text: Customer complaint: rabbit has been able to detect trap, food apparently less effective now.
2 rows in set (0.00 sec)

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('+rabbit +bait' IN BOOLEAN MODE)\G
*************************** 1. row ***************************
note_text: Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait.
1 row in set (0.00 sec)

mysql>
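
The phrase form and the relevance operators have no transcript in this note. A sketch of how they could be tried against the same table; I have not recorded output for these here, so none is shown:

```sql
-- Phrase search: the words must appear next to each other, in this order.
SELECT note_text
FROM productnotes
WHERE Match(note_text) Against('"rabbit bait"' IN BOOLEAN MODE);

-- Neither word is required; > raises and < lowers a word's
-- contribution to the relevance value of a row.
SELECT note_text,
       Match(note_text) Against('>rabbit <carrot' IN BOOLEAN MODE) AS relevance
FROM productnotes
WHERE Match(note_text) Against('>rabbit <carrot' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```

Boolean-mode results are not sorted by relevance automatically, which is why the second query orders them explicitly.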

 
