A Case of Optimizing Duplicate-Data Deletion for a Mobile Phone Company (XXX)

Original

2018-09-12 05:39

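This is SQL written by one of their developers. The goal is to delete duplicate rows while keeping, for each set of duplicates, the row with the smallest id: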

delete from jd_chapter a where a.`id` in
(select `id` from jd_chapter group by book_id,chapter_id  having count(*)>1)   
and a.`id` not in 
(select min(`id`) from jd_chapter group by book_id,chapter_id  having count(*)>1);

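Because the table is large (tens of millions of rows) and the statement uses two subqueries, it ran for a very long time without finishing.

--------------------------Approach----------------------------

Use a temporary table and a join instead. The steps are:

1. First, export the duplicate rows on the slave, so that the master is not put under excessive load.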

select id from jd_chapter group by book_id,chapter_id having count(*)>1 order by id asc 
into outfile '/tmp/jd_chapter.sql' FIELDS TERMINATED BY ',';
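
One thing to watch in the export above: id is neither aggregated nor part of the GROUP BY, so the query yields a single id per duplicate group. When the ONLY_FULL_GROUP_BY SQL mode is enabled the statement is rejected, and when it is disabled MySQL is free to pick any id from each group. As a hedged alternative, the sketch below lists every id that is not the smallest in its group, which is exactly the set of rows the original requirement wants removed. The table jd_chapter_demo and its sample rows are made up for illustration and are not from the original system.

```sql
-- Toy copy of the table (hypothetical name and sample rows, illustration only)
create table jd_chapter_demo (
  id int primary key,
  book_id int not null,
  chapter_id int not null
);

insert into jd_chapter_demo (id, book_id, chapter_id) values
  (1, 10, 1), (2, 10, 1), (3, 10, 1),  -- three copies of (10, 1)
  (4, 10, 2),                          -- no duplicate
  (5, 20, 1), (6, 20, 1);              -- two copies of (20, 1)

-- Every id that is NOT the smallest id of its (book_id, chapter_id) group
select a.id
from jd_chapter_demo a
join (
  select book_id, chapter_id, min(id) as min_id
  from jd_chapter_demo
  group by book_id, chapter_id
  having count(*) > 1
) g on a.book_id = g.book_id and a.chapter_id = g.chapter_id
where a.id > g.min_id
order by a.id;
-- rows returned for the sample data: 2, 3, 6
```

On the real table, the same SELECT could be pointed at jd_chapter and given the INTO OUTFILE clause shown above. With this variant the exported file already contains only ids that should be deleted, so nothing would need to be removed from the temporary table before the final join.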

2. Copy the exported file (/tmp/jd_chapter.sql) to the /tmp/ directory on the master.

3. On the master, create a temporary table with a primary key:

mysql> create TEMPORARY table tmp(id int,primary key(id)); 
Query OK, 0 rows affected (0.07 sec)

4. On the master, import the file into the temporary table with LOAD DATA:

load data infile '/tmp/jd_chapter.sql' into table tmp FIELDS TERMINATED BY ',';

5. On the master, delete the smallest id from the temporary table:

delete from tmp order by id limit 1;
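
Before a destructive delete on a table this size, it may be worth previewing what the join against tmp would match. The queries below are a minimal sketch that only reads data. They assume the jd_chapter and tmp tables from the steps above and use the same join condition as the delete that follows.

```sql
-- Preview a sample of the rows that the join-based delete would remove
select a.id, a.book_id, a.chapter_id
from jd_chapter a
join tmp b on a.id = b.id
order by a.id
limit 20;

-- Count the rows that the join-based delete would remove
select count(*) as rows_to_delete
from jd_chapter a
join tmp b on a.id = b.id;
```

If the count differs from the number of rows in tmp, some exported ids no longer exist on the master, for example because rows changed between the export on the slave and the load.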

6. Join against the temporary table to delete the duplicate rows from jd_chapter:

delete a from jd_chapter a join tmp b on a.id=b.id;
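
After the delete, a quick check along the following lines can confirm whether any duplicates are left. This is a sketch, not part of the original procedure. It reuses the grouping columns from the statements above.

```sql
-- Any row returned is a (book_id, chapter_id) pair that still has more than one copy
select book_id, chapter_id, count(*) as copies
from jd_chapter
group by book_id, chapter_id
having count(*) > 1
limit 10;
```

An empty result means every (book_id, chapter_id) pair is now unique. Because the export in step 1 produces one id per duplicate group, groups that started with three or more copies would still appear here and would need another pass.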

This article comes from the blog “賀春暘的技術專欄” (He Chunyang's Technical Column). Please keep this attribution: http://hcymysql.blog.51cto.com/5223301/1129629
