An example of optimizing duplicate-data deletion for a mobile phone company (XXX)

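This is the SQL written by one of their developers. The goal is to delete duplicate rows while keeping, for each set of duplicates, the row with the smallest id: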

  delete from jd_chapter a where a.`id` in
  (select `id` from jd_chapter group by book_id,chapter_id having count(*)>1)
  and a.`id` not in
  (select min(`id`) from jd_chapter group by book_id,chapter_id having count(*)>1);

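Because the table is large (tens of millions of rows) and the statement uses two subqueries, it ran for a long time without finishing.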
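Before choosing an approach it can help to know how much duplication there actually is. The following is a minimal sketch, assuming that duplicates are defined by the (book_id, chapter_id) pair as in the statement above; the column aliases duplicate_groups and surplus_rows are illustrative names only. On a table of this size it is probably best run on a slave.

```sql
-- Count the duplicated (book_id, chapter_id) groups and the number of
-- rows that would have to go if one row per group is kept.
SELECT COUNT(*)     AS duplicate_groups,
       SUM(cnt - 1) AS surplus_rows
FROM (
    SELECT COUNT(*) AS cnt
    FROM jd_chapter
    GROUP BY book_id, chapter_id
    HAVING COUNT(*) > 1
) AS d;
```

If surplus_rows is small compared with the table, exporting only the ids to delete and joining against them keeps the work on the master short.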
 

--------------------------Approach----------------------------

Use a temporary table and a join instead. The steps are as follows:

Step 1: Export the duplicate data on the slave first, so that the master is not put under excessive load.

  select id from jd_chapter group by book_id,chapter_id having count(*)>1 order by id asc
  into outfile '/tmp/jd_chapter.sql' FIELDS TERMINATED BY ',';
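The export above returns one id per duplicated group, and because id is neither aggregated nor part of the GROUP BY, MySQL does not define which id of the group is returned (with ONLY_FULL_GROUP_BY enabled the query is rejected outright). If the goal stated at the top is to hold, namely keeping the smallest id of each group and deleting the rest, one possible variant is to export every duplicate id except the minimum. This is a sketch under assumptions, not the author's original query: it assumes book_id and chapter_id are never NULL, and the output file name is hypothetical.

```sql
-- Run on the slave. Exports the id of every duplicate row EXCEPT the
-- smallest id in its (book_id, chapter_id) group, i.e. the rows to delete.
SELECT a.id
FROM jd_chapter AS a
JOIN (
    SELECT book_id, chapter_id, MIN(id) AS min_id
    FROM jd_chapter
    GROUP BY book_id, chapter_id
    HAVING COUNT(*) > 1
) AS m
  ON  a.book_id    = m.book_id
  AND a.chapter_id = m.chapter_id
  AND a.id > m.min_id                         -- skip the row that is kept
ORDER BY a.id
INTO OUTFILE '/tmp/jd_chapter_surplus.sql'    -- hypothetical file name
FIELDS TERMINATED BY ',';
```

With a file produced this way, every id loaded into tmp is a row to delete, so nothing would need to be removed from tmp before the final join.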

Step 2: Copy the exported file to the /tmp/ directory on the master.

Step 3: On the master, create a temporary table with a primary key:

  mysql> create TEMPORARY table tmp(id int,primary key(id));
  Query OK, 0 rows affected (0.07 sec)

Step 4: On the master, import the file into the temporary table with LOAD DATA:

  load data infile '/tmp/jd_chapter.sql' into table tmp FIELDS TERMINATED BY ',';
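After the load, a quick sanity check can confirm that the file arrived intact before anything is deleted. This is a minimal sketch; the aliases loaded_rows, smallest_id, largest_id and matching_rows are illustrative names. It has to run in the same session that created tmp, because a TEMPORARY table is visible only to the session that created it.

```sql
-- How many ids were loaded, and what range do they span?
SELECT COUNT(*) AS loaded_rows,
       MIN(id)  AS smallest_id,
       MAX(id)  AS largest_id
FROM tmp;

-- How many of the loaded ids still exist in jd_chapter on the master?
-- Ideally this equals loaded_rows.
SELECT COUNT(*) AS matching_rows
FROM jd_chapter AS a
JOIN tmp AS b ON a.id = b.id;
```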

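Step 5: On the master, delete the smallest id from the temporary table. Note that this removes a single row from tmp, not one row per duplicate group:

  delete from tmp order by id limit 1;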

Step 6: Join against the temporary table to delete the duplicate rows from jd_chapter:

  delete a from jd_chapter a join tmp b on a.id=b.id;
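If tmp holds a large number of ids, a single join delete can still be a long transaction on the master. One way to bound it is to delete in id ranges, sketched below. The multi-table DELETE syntax does not accept LIMIT, so each batch is restricted by a range on tmp.id instead. The range boundaries here are hypothetical and would be chosen from the actual MIN(id) and MAX(id) in tmp.

```sql
-- Batch 1: only ids in the first (hypothetical) range.
DELETE a
FROM jd_chapter AS a
JOIN tmp AS b ON a.id = b.id
WHERE b.id BETWEEN 1 AND 100000;

-- Batch 2: the next range; repeat until MAX(id) of tmp is covered.
DELETE a
FROM jd_chapter AS a
JOIN tmp AS b ON a.id = b.id
WHERE b.id BETWEEN 100001 AND 200000;

-- Afterwards, list any (book_id, chapter_id) groups that still have
-- more than one row. An empty result means no duplicates remain.
SELECT book_id, chapter_id, COUNT(*) AS cnt
FROM jd_chapter
GROUP BY book_id, chapter_id
HAVING COUNT(*) > 1
LIMIT 10;
```

The final check scans the whole table, so it may be preferable to run it on a slave once replication has caught up.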

 

 

This article comes from the blog “賀春暘的技術專欄”. Please be sure to retain this attribution: http://hcymysql.blog.51cto.com/5223301/1129629
