MySQL Notes 5: Full-Text Search

Reference: MySQL Crash Course (《MySQL必知必会》) by Ben Forta, Chapter 18, Full-Text Searching

Script download link -> https://forta.com/books/0672327120/
If the link above is unreachable -> https://download.csdn.net/download/wy_hhxx/12277619
===========================
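First import the SQL scripts, create.sql and populate.sql.
create.sql creates the tables -> I added two statements at the top of the script so that the tables are created in a database named supply: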
CREATE DATABASE IF NOT EXISTS `supply`;
USE `supply`;

populate.sql fills the tables with data -> likewise, I added one statement at the top of the script: USE `supply`;

Then load the scripts into the database with the following commands:

[root@xxx bin]# ./mysql -uroot -p < create.sql
Enter password:
[root@xxx bin]# ./mysql -uroot -p < populate.sql
Enter password:
[root@xxx bin]#

After the import there is a new supply database containing the following 6 tables.

mysql> use supply;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>
mysql> show tables;
+------------------+
| Tables_in_supply |
+------------------+
| customers        |
| orderitems       |
| orders           |
| productnotes     |
| products         |
| vendors          |
+------------------+
6 rows in set (0.00 sec)

mysql>
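
Because full-text support depends on the storage engine, it may be worth confirming which engine each imported table uses. The query below is a minimal sketch, assuming the scripts were loaded into the supply database as described above; it only reads the standard information_schema.TABLES view.

```sql
-- List the storage engine of every table in the supply database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'supply'
ORDER BY TABLE_NAME;
```

productnotes is the table to look for here, since it is the one that carries the FULLTEXT index used in the rest of these notes.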

 

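1. MyISAM and InnoDB are the two most commonly used MySQL engines. In the MySQL versions the book covers, MyISAM supports full-text search and InnoDB does not; InnoDB gained FULLTEXT index support in MySQL 5.6.
Full-text search is usually enabled when the table is created. The CREATE TABLE statement accepts a FULLTEXT clause, which takes a comma-separated list of the columns to index.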

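[Example] The productnotes table is created with FULLTEXT(note_text), so the note_text column can afterwards be searched with full-text search.
Note: once the index is defined, MySQL maintains it automatically; it is updated whenever rows are inserted, updated, or deleted.
(In the sessions in this post, the `;` typed after `\G` is redundant: `\G` already terminates the statement, and the stray `;` is what produces "ERROR: No query specified".)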

mysql> SHOW CREATE TABLE productnotes\G;
*************************** 1. row ***************************
       Table: productnotes
Create Table: CREATE TABLE `productnotes` (
  `note_id` int(11) NOT NULL AUTO_INCREMENT,
  `prod_id` char(10) NOT NULL,
  `note_date` datetime NOT NULL,
  `note_text` text,
  PRIMARY KEY (`note_id`),
  FULLTEXT KEY `note_text` (`note_text`)
) ENGINE=MyISAM AUTO_INCREMENT=115 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

ERROR:
No query specified

mysql> SELECT * FROM productnotes LIMIT 1\G;
*************************** 1. row ***************************
  note_id: 101
  prod_id: TNT2
note_date: 2005-08-17 00:00:00
note_text: Customer complaint:
Sticks not individually wrapped, too easy to mistakenly detonate all at once.
Recommend individual wrapping.
1 row in set (0.00 sec)

ERROR:
No query specified

mysql>
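
As a sketch of the two usual ways to get a FULLTEXT index, the statements below use two hypothetical tables, demo_notes and demo_notes_later, which are not part of the book's scripts. The index name ft_note_text is also made up for illustration. ENGINE=MyISAM mirrors productnotes; on MySQL 5.6 or later InnoDB should work as well.

```sql
-- Option 1: define the index when the table is created.
CREATE TABLE demo_notes (
  note_id   INT  NOT NULL AUTO_INCREMENT,
  note_text TEXT NULL,
  PRIMARY KEY (note_id),
  FULLTEXT (note_text)
) ENGINE=MyISAM;

-- Option 2: create the table without the index ...
CREATE TABLE demo_notes_later (
  note_id   INT  NOT NULL AUTO_INCREMENT,
  note_text TEXT NULL,
  PRIMARY KEY (note_id)
) ENGINE=MyISAM;

-- ... and add the FULLTEXT index afterwards.
ALTER TABLE demo_notes_later ADD FULLTEXT ft_note_text (note_text);
```

Option 2 tends to be the better fit for bulk loads: importing the rows first and building the index once afterwards avoids updating the index on every inserted row.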

 

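2. Full-text searches are performed with two functions, Match() and Against(): Match() specifies the columns to search, and Against() specifies the search expression to use.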

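[Example] Match(note_text) tells MySQL to search the named column, and Against('rabbit') specifies the word rabbit as the search text.
Note: two rows contain the word rabbit, and both are returned. The search is case-insensitive (unless BINARY mode is used).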

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('rabbit');
+----------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                         |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+----------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('Rabbit');
+----------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                            |
+----------------------------------------------------------------------------------------------------------------------+
| Customer complaint: rabbit has been able to detect trap, food apparently less effective now.                         |
| Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait. |
+----------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
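
Match() and Against() can also appear in the SELECT list, where they return a relevance score instead of filtering rows. The query below is a minimal sketch against the same productnotes table; the alias relevance is my own choice (the name rank is avoided because it is a reserved word in recent MySQL 8.0 releases).

```sql
-- Show every note together with its relevance score for 'rabbit'.
-- Rows that do not contain the word get a score of 0.
SELECT note_text,
       Match(note_text) Against('rabbit') AS relevance
FROM productnotes
ORDER BY relevance DESC;
```

Since there is no WHERE clause, all rows are returned; the two rabbit rows should be the only ones with a non-zero score.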

 

3. Boolean text search

mysql> SELECT COUNT(*) FROM productnotes WHERE Match(note_text) Against('heavy' IN BOOLEAN MODE);
+----------+
| COUNT(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to' IN BOOLEAN MODE);
Empty set (0.00 sec)

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy' IN BOOLEAN MODE);
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| note_text                                                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| Item is extremely heavy. Designed for dropping, not recommended for use with slings, ropes, pulleys, or tightropes.                                     |
| Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead. |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql>

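Note: the book points out that many words occur so frequently that searching for them is useless (too many results would be returned). MySQL therefore applies a 50% rule: if a word appears in more than 50% of the rows, it is treated as a stopword and ignored [1]. The 50% rule does not apply IN BOOLEAN MODE. [2]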

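[1] A search for "to" does return nothing, even though 12 rows match LIKE '%to%'. That LIKE count is a substring match (it also counts words such as "detonate"), so it overstates how often the word itself appears. The empty result is also not proof of the 50% rule: with MyISAM, "to" is shorter than the default ft_min_word_len of 4 and is on the built-in stopword list, so it is never indexed in the first place.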

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to');
Empty set (0.00 sec)

mysql> SELECT COUNT(*) FROM productnotes WHERE note_text LIKE '%to%';
+----------+
| COUNT(*) |
+----------+
|       12 |
+----------+
1 row in set (0.00 sec)

mysql> 

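[2] The search for "to" also returns nothing IN BOOLEAN MODE, which at first looks as if the 50% rule applied there too. The more likely cause is the stopword list and the minimum word length: boolean mode skips the 50% threshold but, for MyISAM, still ignores stopwords and words shorter than ft_min_word_len.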

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('to' IN BOOLEAN MODE);
Empty set (0.00 sec)
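
To check whether server settings explain the empty result for a short word such as "to", the two MyISAM full-text variables can be inspected. This is a minimal sketch; it assumes a MyISAM table like productnotes (InnoDB uses different variables).

```sql
-- Minimum length of words that MyISAM full-text indexes include (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Stopword file in use; '(built-in)' means the compiled-in default list.
SHOW VARIABLES LIKE 'ft_stopword_file';
```

Changing either setting requires a server restart and a rebuild of the existing FULLTEXT indexes (for example with REPAIR TABLE productnotes QUICK) before searches reflect the new values.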

[Example] -rope* tells MySQL to exclude rows that contain rope* (any word beginning with rope, including ropes).

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('heavy -rope*' IN BOOLEAN MODE)\G;
*************************** 1. row ***************************
note_text: Customer complaint:
Not heavy enough to generate flying stars around head of victim. If being purchased for dropping, recommend ANV02 or ANV03 instead.
1 row in set (0.00 sec)

ERROR:
No query specified

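[Example] Match rows containing at least one of the words rabbit and bait: Against('rabbit bait' IN BOOLEAN MODE)
Match rows containing both rabbit and bait: Against('+rabbit +bait' IN BOOLEAN MODE)
Match the phrase rabbit bait rather than the two separate words: Against('"rabbit bait"' IN BOOLEAN MODE)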

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('rabbit bait' IN BOOLEAN MODE)\G;
*************************** 1. row ***************************
note_text: Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait.
*************************** 2. row ***************************
note_text: Customer complaint: rabbit has been able to detect trap, food apparently less effective now.
2 rows in set (0.00 sec)

ERROR:
No query specified

mysql> SELECT note_text FROM productnotes WHERE Match(note_text) Against('+rabbit +bait' IN BOOLEAN MODE)\G;
*************************** 1. row ***************************
note_text: Quantity varies, sold by the sack load.
All guaranteed to be bright and orange, and suitable for use as rabbit bait.
1 row in set (0.00 sec)

ERROR:
No query specified

mysql>
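
The phrase form has no session of its own above, so here is a sketch of it together with two more boolean operators, `>` and `<`, which raise or lower a word's contribution to the relevance score. The search words come from the book's examples; no output is shown because I have not captured it from this dataset.

```sql
-- Phrase search: only rows containing the exact phrase "rabbit bait".
SELECT note_text
FROM productnotes
WHERE Match(note_text) Against('"rabbit bait"' IN BOOLEAN MODE);

-- Either word may match; rabbit is ranked up and carrot is ranked down.
-- Boolean mode does not sort rows by relevance on its own,
-- so the score is selected and ordered on explicitly.
SELECT note_text,
       Match(note_text) Against('>rabbit <carrot' IN BOOLEAN MODE) AS relevance
FROM productnotes
WHERE Match(note_text) Against('>rabbit <carrot' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```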

 
