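The goal is to count, for each month, the number of Stack Overflow posts about MongoDB and their total ViewCount.
We already have a mirror of the Stack Overflow posts table (a 2014 snapshot), so the statistics are computed by tag (tags).
(1) First, split the tags out into a separate table, sof_tags_database.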
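The post does not show how the tags were split, so the following is only one possible sketch. It assumes that Posts_stackoverflow has a Tags column in the data-dump format '<mongodb><php>', that a post carries at most five tags, and that sof_tags_database has just the post_id and tags columns used later on. The column types and the helper table tag_positions are hypothetical.

```sql
-- Assumed layout: only the column names post_id and tags come from the
-- queries in this post; the types are guesses.
CREATE TABLE sof_tags_database (
  post_id INT NOT NULL,
  tags    VARCHAR(64) NOT NULL
);

-- Hypothetical helper table: one row per possible tag position.
CREATE TABLE tag_positions (n INT PRIMARY KEY);
INSERT INTO tag_positions VALUES (1), (2), (3), (4), (5);

-- For Tags = '<mongodb><php>' and n = 2:
--   SUBSTRING_INDEX(Tags, '>', 2)  gives '<mongodb><php'
--   SUBSTRING_INDEX(..., '<', -1)  gives 'php'
INSERT INTO sof_tags_database (post_id, tags)
SELECT p.Id,
       SUBSTRING_INDEX(SUBSTRING_INDEX(p.Tags, '>', t.n), '<', -1)
FROM Posts_stackoverflow p
JOIN tag_positions t
  -- the number of '>' characters equals the number of tags on the post
  ON t.n <= CHAR_LENGTH(p.Tags) - CHAR_LENGTH(REPLACE(p.Tags, '>', ''))
WHERE p.Tags IS NOT NULL;
```

Each post then contributes one row per tag, which is what makes the equality test on s.tags in step (3) possible.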
(2) Then create a working table to hold the MongoDB-related posts, with the same structure as the Stack Overflow posts table:
CREATE TABLE sof_mongodb LIKE Posts_stackoverflow;
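(3) After building indexes, pull out the posts whose tags include "mongodb". Select p.* rather than *, because sof_mongodb only has the columns of the posts table; the WHERE filter on s.tags already discards unmatched rows, so a plain INNER JOIN is enough:
INSERT INTO sof_mongodb SELECT p.* FROM Posts_stackoverflow p INNER JOIN sof_tags_database s ON p.Id = s.post_id
WHERE s.tags = 'mongodb';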
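The post mentions building indexes but does not say which ones. A plausible choice, assuming Id is already the primary key of Posts_stackoverflow and tags is a VARCHAR column, is a composite index on the tag table. The index name below is hypothetical.

```sql
-- Hypothetical index: the tags = 'mongodb' filter and the join on
-- post_id can both be answered from this one index.
CREATE INDEX idx_tags_post_id ON sof_tags_database (tags, post_id);

-- Check the plan before running the real INSERT ... SELECT.
EXPLAIN
SELECT p.*
FROM Posts_stackoverflow p
INNER JOIN sof_tags_database s ON p.Id = s.post_id
WHERE s.tags = 'mongodb';
```

If the plan still shows a full scan of sof_tags_database, the index is not being used and the column types are worth a second look.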
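(4) After building indexes, count the posts and sum the ViewCount for each year and month.
The key piece here is MySQL's EXTRACT(unit FROM datetime) function, which returns a specified part of a date. Group by year and month, then aggregate. COUNT(*) is used because sof_mongodb inherits the Id column from the posts table and has no post_id column:
SELECT EXTRACT(YEAR_MONTH FROM CreationDate) AS YearMonth, COUNT(*) AS PostNum, SUM(ViewCount) AS ViewCount
FROM sof_mongodb
WHERE CreationDate < '2015-01-01'
GROUP BY EXTRACT(YEAR_MONTH FROM CreationDate)
ORDER BY YearMonth;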
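To see what the grouping key looks like, EXTRACT can be run on a literal value. The timestamp below is made up for illustration and is not taken from the data set.

```sql
-- YEAR_MONTH packs year and month into a single integer, so rows
-- from the same calendar month share one group key.
SELECT EXTRACT(YEAR_MONTH FROM '2014-05-03 10:15:00');  -- 201405
SELECT EXTRACT(YEAR FROM '2014-05-03 10:15:00');        -- 2014
SELECT EXTRACT(MONTH FROM '2014-05-03 10:15:00');       -- 5
```

Because the key is a plain integer of the form YYYYMM, sorting it numerically also sorts it chronologically.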
(5) The results are as follows:
YearMonth | PostNum | ViewCount |
200809 | 1 | 8875 |
200905 | 2 | 127902 |
200907 | 2 | 8814 |
200908 | 2 | 21682 |
200909 | 4 | 109726 |
200910 | 2 | 8520 |
200911 | 11 | 53092 |
200912 | 13 | 92333 |
201001 | 34 | 181973 |
201002 | 40 | 190208 |
201003 | 40 | 147643 |
201004 | 41 | 136315 |
201005 | 65 | 435551 |
201006 | 77 | 212016 |
201007 | 72 | 373212 |
201008 | 110 | 268109 |
201009 | 112 | 269758 |
201010 | 135 | 344647 |
201011 | 136 | 412603 |
201012 | 173 | 502789 |
201101 | 213 | 684384 |
201102 | 226 | 657619 |
201103 | 245 | 562895 |
201104 | 255 | 674638 |
201105 | 273 | 499982 |
201106 | 285 | 535738 |
201107 | 333 | 637016 |
201108 | 333 | 499327 |
201109 | 324 | 492788 |
201110 | 397 | 594212 |
201111 | 437 | 626991 |
201112 | 414 | 460230 |
201201 | 500 | 867172 |
201202 | 486 | 583519 |
201203 | 568 | 724225 |
201204 | 632 | 705364 |
201205 | 582 | 630233 |
201206 | 547 | 518286 |
201207 | 596 | 512650 |
201208 | 669 | 600473 |
201209 | 594 | 459296 |
201210 | 635 | 502791 |
201211 | 637 | 421906 |
201212 | 568 | 432391 |
201301 | 726 | 457281 |
201302 | 683 | 369917 |
201303 | 743 | 432477 |
201304 | 707 | 413095 |
201305 | 790 | 333086 |
201306 | 698 | 293294 |
201307 | 849 | 288062 |
201308 | 895 | 312353 |
201309 | 811 | 268926 |
201310 | 853 | 221314 |
201311 | 787 | 154976 |
201312 | 809 | 123046 |
201401 | 928 | 88421 |
201402 | 1003 | 67433 |
201403 | 984 | 50066 |
201404 | 948 | 36567 |
201405 | 73 | 1581 |