[SQL] Fetching each user's most recent record (top n per group)

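I never had time to write this up until Singles' Day (Double Eleven) arrived and a release freeze blocked every new requirement, which finally gave me time to organize the SQL I had been collecting for several days.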

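Requirement: find each user's most recent login record (about 1.1 million rows).

Premise: both the timestamp and the id increase monotonically, so finding the latest time reduces to finding the largest id.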
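
Since everything below depends on this premise, it may be worth checking it on real data first. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL, a handful of made-up sample rows, and the column names used in the queries below (id, user_id, login_ip_str, created_date). It counts pairs of rows for the same user where the smaller id carries the later timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

# Pairs of rows for one user where the smaller id has the later timestamp.
CHECK_SQL = (
    "SELECT COUNT(*) FROM user_login_log a "
    "JOIN user_login_log b "
    "ON a.user_id = b.user_id AND a.id < b.id "
    "AND a.created_date > b.created_date"
)

violations_before = conn.execute(CHECK_SQL).fetchone()[0]

# A backdated row breaks the premise: its id is the largest, its time the oldest.
conn.execute(
    "INSERT INTO user_login_log VALUES (6, 100, '10.0.0.6', '2020-10-01 08:00:00')"
)
violations_after = conn.execute(CHECK_SQL).fetchone()[0]

print(violations_before, violations_after)
```

A non-zero count means the largest id is not guaranteed to be the latest login. The check is itself a self-join, so on a large table it is probably best run on a sample.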

Query 1: select * from user_login_log where id in (select max(id) from user_login_log group by user_id); took 6.35s

Query 2: select * from user_login_log a where exists (select 1 from (select max(id) as id from user_login_log group by user_id) b where a.id = b.id); took 3.47s

Query 3: select * from user_login_log a join (select max(id) as id from user_login_log group by user_id) b on a.id = b.id; took 3.65s

Query 4: select * from user_login_log a, (select max(id) as id from user_login_log group by user_id) b where a.id = b.id; took 3.65s

Query 5: select * from user_login_log a where 1 >= (select count(1) from user_login_log b where a.user_id = b.user_id and a.id <= b.id) order by user_id, id desc; took too long (62.0s on just 10,000 rows)

Query 6: select a.* from user_login_log a left join user_login_log b on a.user_id = b.user_id and a.id <= b.id group by a.id, a.user_id having count(1) = 1 order by a.user_id, a.id desc; took too long (15.8s on just 10,000 rows)

Query 7: select user_id, SUBSTRING_INDEX(GROUP_CONCAT(login_ip_str order by id desc), ',', 1) as login_ip_str from user_login_log group by user_id; took 200.32s

Query 8: select * from (select (@row_number := CASE WHEN @customer_no = user_id THEN @row_number + 1 ELSE 1 END) AS num, @customer_no := user_id AS user_id, id, login_ip_str, created_date FROM user_login_log ORDER BY user_id, id desc) x where x.num = 1; took 25.5s

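Approach 1: in + max()

Explanation: select each user's largest id, then filter with in. This works on every database, but it treats the largest id as the most recent time, so it is only valid when both the timestamp and the id are increasing.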
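
To make the idea concrete, here is a small runnable sketch of in + max(). It assumes SQLite via Python's sqlite3 module in place of MySQL and a few made-up rows; only the table and column names come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

# The inner query yields one max(id) per user; the outer query keeps those rows.
latest = conn.execute(
    "SELECT id, user_id, login_ip_str FROM user_login_log "
    "WHERE id IN (SELECT MAX(id) FROM user_login_log GROUP BY user_id) "
    "ORDER BY user_id"
).fetchall()

print(latest)
```

Each user contributes exactly one row, the one with the largest id.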

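Approach 2: exists + max()

Explanation: select each user's largest id, then compare that id with the table's id in the where condition of a select 1 subquery, and use that subquery as the exists predicate.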
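
The where condition that ties the subquery to the outer row is what does the filtering here. As a sketch (assuming SQLite via Python's sqlite3 and made-up sample rows), the example below runs a correlated exists next to an uncorrelated one for contrast. The uncorrelated form is true for every row as long as the table is not empty, so it filters nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

# Correlated: the subquery only finds a row when a.id is some user's max(id).
correlated = conn.execute(
    "SELECT a.id, a.user_id FROM user_login_log a "
    "WHERE EXISTS ("
    "  SELECT 1 FROM (SELECT MAX(id) AS id FROM user_login_log "
    "                 GROUP BY user_id) b "
    "  WHERE b.id = a.id) "
    "ORDER BY a.user_id"
).fetchall()

# Uncorrelated: the subquery does not mention the outer row, so it is
# non-empty for every row and every row passes.
uncorrelated_count = conn.execute(
    "SELECT COUNT(*) FROM user_login_log "
    "WHERE EXISTS (SELECT MAX(id) FROM user_login_log GROUP BY user_id)"
).fetchone()[0]

print(correlated, uncorrelated_count)
```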

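Approach 3: join + max()

Explanation: select each user's largest id, then join back to the full table on id to fetch the complete row for each of those ids.

Approach 4: where + max()

Explanation: a comma-style join filtered by where is optimized into a join, so it is equivalent to approach 3.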
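
The equivalence of approaches 3 and 4 can be checked on a small data set. This is a minimal sketch assuming SQLite via Python's sqlite3 and made-up rows. It compares results only and says nothing about the execution plan MySQL would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

MAX_IDS = "(SELECT MAX(id) AS id FROM user_login_log GROUP BY user_id)"

# Approach 3: explicit join on the per-user max id.
join_rows = conn.execute(
    "SELECT a.id, a.user_id, a.login_ip_str FROM user_login_log a "
    f"JOIN {MAX_IDS} b ON a.id = b.id ORDER BY a.user_id"
).fetchall()

# Approach 4: comma-style join with the same condition in where.
comma_rows = conn.execute(
    "SELECT a.id, a.user_id, a.login_ip_str "
    f"FROM user_login_log a, {MAX_IDS} b "
    "WHERE a.id = b.id ORDER BY a.user_id"
).fetchall()

print(join_rows == comma_rows, join_rows)
```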

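Approach 5: self-correlation + count()

Explanation: for every record, count the records of the same user whose id is greater than or equal to its own. If the count is 1 (only the record itself), nothing is newer and the record qualifies. The qualifying records are then sorted by group and value. Because this runs one count per row, n counts in total, performance is extremely poor.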
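
Replacing the constant 1 with n turns the same query into a top n per group. The sketch below assumes SQLite via Python's sqlite3 and made-up rows; top_n_per_user is a hypothetical helper name introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)


def top_n_per_user(connection, n):
    """Keep rows that have at most n rows (itself included) with id >= theirs."""
    return connection.execute(
        "SELECT a.id, a.user_id FROM user_login_log a "
        "WHERE ? >= (SELECT COUNT(1) FROM user_login_log b "
        "            WHERE a.user_id = b.user_id AND a.id <= b.id) "
        "ORDER BY a.user_id, a.id DESC",
        (n,),
    ).fetchall()


top1 = top_n_per_user(conn, 1)
top2 = top_n_per_user(conn, 2)
print(top1)
print(top2)
```

The correlated subquery is evaluated once per outer row, which is where the quadratic cost comes from.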

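Approach 6: outer join + having count()

Explanation: self-join the table so that each record is paired with every record of the same user whose id is not smaller (a restricted Cartesian product), group by record, and use having count(1) to keep only the records that have exactly one such pairing, namely with themselves.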
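
It can help to look at the pairing counts before having filters them. The following sketch (SQLite via Python's sqlite3 and made-up rows, both assumptions) first lists how many pairings each record gets, then applies the having count(1) = 1 filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

SELF_JOIN = (
    "FROM user_login_log a "
    "LEFT JOIN user_login_log b "
    "ON a.user_id = b.user_id AND a.id <= b.id "
)

# Number of same-user rows with id >= a.id, per record.
pair_counts = conn.execute(
    "SELECT a.id, COUNT(1) " + SELF_JOIN + "GROUP BY a.id ORDER BY a.id"
).fetchall()

# Only the newest record of each user pairs with itself alone.
latest = conn.execute(
    "SELECT a.id, a.user_id " + SELF_JOIN
    + "GROUP BY a.id, a.user_id HAVING COUNT(1) = 1 "
    + "ORDER BY a.user_id, a.id DESC"
).fetchall()

print(pair_counts)
print(latest)
```

Changing the filter to having count(1) <= n would keep the n newest records per user.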

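Approach 7: GROUP_CONCAT + truncation

Explanation: use GROUP_CONCAT's ordering to sort each user's records and concatenate them into a single field, then cut out the first item. This is MySQL-specific and very slow, and it returns only one column of the most recent record; returning the whole row would require concatenating every column inside the function. It is generally useful only as an idea.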

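Approach 8: user variables

Explanation: check whether the current row's user_id equals the previous row's user_id, and restart the numbering at 1 when it differs, which yields sequential numbering within each group. (The case condition is evaluated before @customer_no is assigned.) When computing the most recent few records, adding a limit can shorten the run time to about 1s.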
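
The numbering logic is easier to follow outside SQL. The sketch below re-creates it in plain Python over made-up rows; number_rows_per_user is a hypothetical helper, and the two local variables stand in for the @customer_no and @row_number user variables. It illustrates the technique and is not how MySQL evaluates the query internally.

```python
# Made-up sample rows as (id, user_id, login_ip_str), for illustration only.
rows = [
    (1, 100, "10.0.0.1"),
    (2, 200, "10.0.0.2"),
    (3, 100, "10.0.0.3"),
    (4, 200, "10.0.0.4"),
    (5, 100, "10.0.0.5"),
]


def number_rows_per_user(sorted_rows):
    """Number rows 1, 2, 3, ... within each run of equal user_id."""
    prev_user = None  # plays the role of @customer_no
    row_number = 0    # plays the role of @row_number
    numbered = []
    for row in sorted_rows:
        user_id = row[1]
        # Compare first, assign afterwards, as in the case expression.
        row_number = row_number + 1 if user_id == prev_user else 1
        prev_user = user_id
        numbered.append((row_number, row))
    return numbered


# Equivalent of ORDER BY user_id, id DESC.
ordered = sorted(rows, key=lambda r: (r[1], -r[0]))
numbered = number_rows_per_user(ordered)

# Equivalent of WHERE x.num = 1.
latest = [row for num, row in numbered if num == 1]

print([num for num, _ in numbered])
print(latest)
```

The numbering is only correct because the rows arrive sorted by user_id; the SQL version relies on its order by for the same reason.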

Time comparison: approach 2 < approach 3 = approach 4 < approach 1 < approach 8 < approach 7 < approach 6 < approach 5

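1. Approaches 5, 6, 7 and 8 can return not just the most recent record but also the top n, so any of them can stand in for ROW_NUMBER() over (); of these, approach 8 performs best.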

2. Hive, Oracle and SQL Server usually solve top n per group with ROW_NUMBER() over (PARTITION BY xx ORDER BY ** DESC).
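
For comparison, this is roughly what the window-function form looks like. It is a sketch assuming SQLite 3.25 or later (the first SQLite release with window functions) through Python's sqlite3, plus made-up rows; the syntax on Hive, Oracle or SQL Server may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_login_log ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, "
    "login_ip_str TEXT, created_date TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_login_log VALUES (?, ?, ?, ?)",
    [
        (1, 100, "10.0.0.1", "2020-11-01 08:00:00"),
        (2, 200, "10.0.0.2", "2020-11-01 09:00:00"),
        (3, 100, "10.0.0.3", "2020-11-02 08:00:00"),
        (4, 200, "10.0.0.4", "2020-11-03 09:00:00"),
        (5, 100, "10.0.0.5", "2020-11-04 08:00:00"),
    ],
)

# Number rows within each user, newest first, then keep the first n.
TOP_N_SQL = (
    "SELECT id, user_id FROM ("
    "  SELECT id, user_id, "
    "         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id DESC) AS num "
    "  FROM user_login_log"
    ") x WHERE x.num <= ? ORDER BY user_id, id DESC"
)

top1 = conn.execute(TOP_N_SQL, (1,)).fetchall()
top2 = conn.execute(TOP_N_SQL, (2,)).fetchall()
print(top1)
print(top2)
```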

3. MySQL before version 8 has no window functions, so there top n per group can only be done with approaches 5 through 8.

Additions are welcome!
