Adding indexes and columns to a large table

Original

名明鸣冥

2020-06-17 10:22

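Recently I noticed that queries against a user table of 8 million rows were very slow. The query used ORDER BY on a string column, and although that column was indexed, the result was still not satisfactory.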

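From earlier experience with a 50-million-row table, ORDER BY on a timestamp column is very fast for a page of 10 rows as long as the column has a B-tree index. So I decided to add an index on the timestamp column of this 8-million-row table.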
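A quick way to check whether a paged ORDER BY like this is served from the index is EXPLAIN. The sketch below borrows the table, column and index names that appear in the SQL later in this post (tx_xxxxx_user, update_time, idx_update_time) and assumes the index already exists; the page size of 10 matches the case described above.

```sql
-- Assumes idx_update_time on update_time has already been created.
EXPLAIN
SELECT *
FROM tx_xxxxx_user
ORDER BY update_time DESC
LIMIT 10;
```

If the index is being used for the sort, the key column of the EXPLAIN output should name it and the Extra column should not show "Using filesort".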

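But the table is live and takes a heavy stream of updates. Stopping the service to do the work would be very risky, and since the service is distributed and called by several other services, taking the application down for the change is not realistic either.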

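I asked an expert, who suggested this plan: create an identical tmp table and add the indexes to it; rename the two tables so that the tmp table takes the main table's name; add the indexes to the main table; RENAME the tables back; finally, INSERT into the main table any rows that were added to the tmp table in the meantime and do not yet exist in the main table.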

Without further ado, here is the SQL:


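-- Create the tmp table (CREATE TABLE ... LIKE copies the column definitions and existing indexes, not the rows)
create table tx_xxxxx_user_tmp_data like tx_xxxxx_user;
-- Add the new indexes to the empty tmp table
ALTER TABLE tx_xxxxx_user_tmp_data ADD INDEX idx_device_guid (device_guid);
ALTER TABLE tx_xxxxx_user_tmp_data ADD INDEX idx_update_time (update_time);
-- Optional, left disabled: copy the existing rows into the tmp table and check the count
-- insert into tx_xxxxx_user_tmp_data select * from tx_xxxxx_user;
-- select count(id) from tx_xxxxx_user_tmp_data;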
-- Swap the tables: the empty tmp table takes over the live name
RENAME TABLE tx_xxxxx_user TO tx_xxxxx_user2, tx_xxxxx_user_tmp_data TO tx_xxxxx_user;

-- Add the indexes to the big table, now named tx_xxxxx_user2
ALTER TABLE tx_xxxxx_user2 ADD INDEX idx_device_guid (device_guid);
ALTER TABLE tx_xxxxx_user2 ADD INDEX idx_update_time (update_time);

-- Swap back
RENAME TABLE tx_xxxxx_user TO tx_xxxxx_user_tmp_data, tx_xxxxx_user2 TO tx_xxxxx_user;
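-- Check the totals. After swapping back, the rows written during the swap are in tx_xxxxx_user_tmp_data
SELECT count(1) from tx_xxxxx_user t;
SELECT count(1) from tx_xxxxx_user_tmp_data t;
SELECT * from tx_xxxxx_user_tmp_data t limit 10;
SELECT * from tx_xxxxx_user t limit 10;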

-- Find the user rows that were added during the swap
-- (prefix with EXPLAIN to inspect the plan)
SELECT * FROM tx_xxxxx_user_tmp_data o WHERE o.id NOT IN (
    SELECT t.id
    FROM tx_xxxxx_user_tmp_data t
    JOIN tx_xxxxx_user t2
      ON t.login_name = t2.login_name
     AND t.source = t2.source
);
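The plan also calls for inserting those new rows back into the main table, a step the script above stops short of. A minimal sketch of that backfill follows. It assumes id is an AUTO_INCREMENT primary key. A table created with CREATE TABLE ... LIKE most likely starts its counter over, so ids generated in the tmp table may collide with existing ones, and the sketch leaves id out so the main table assigns fresh values. The column list is hypothetical: only login_name, source, device_guid and update_time appear in this post, and the real table presumably has more.

```sql
-- Sketch only: list every real column except id.
-- The four columns below are the only ones named in this post.
INSERT INTO tx_xxxxx_user (login_name, source, device_guid, update_time)
SELECT o.login_name, o.source, o.device_guid, o.update_time
FROM tx_xxxxx_user_tmp_data o
WHERE NOT EXISTS (
    -- Skip users that already exist in the main table
    SELECT 1
    FROM tx_xxxxx_user t2
    WHERE t2.login_name = o.login_name
      AND t2.source = o.source
);
```

NOT EXISTS is used here in place of the NOT IN form above. It treats (login_name, source) as the user's identity, the same matching rule as the check query.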

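In practice, 8 million rows put no real strain on the database. Of the two indexes, one took 140 s to build and the other 70 s. I scheduled the work for the early morning and it was finished quickly.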

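On reflection, though, the swap does not account for how the system uses the table. If anything needs to query those 8 million rows while the empty tmp table is standing in for the main table, it will not find them, so the plan is still incomplete.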
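One alternative that avoids the swap altogether is to ask MySQL for an in-place, non-locking index build. This assumes the table uses InnoDB on MySQL 5.6 or later, and the post does not say which engine or version is in use. It is a sketch of that option, not what was run here:

```sql
-- Assumes InnoDB on MySQL 5.6+; builds both indexes while reads and writes continue.
ALTER TABLE tx_xxxxx_user
    ADD INDEX idx_device_guid (device_guid),
    ADD INDEX idx_update_time (update_time),
    ALGORITHM=INPLACE, LOCK=NONE;
```

If the server cannot honor the requested ALGORITHM or LOCK level, the statement fails with an error instead of quietly falling back to a blocking table copy, so it is safe to try against a live table.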
