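In an earlier post,
"Implementing Share, Year-over-Year, and Period-over-Period Metric Analysis in SQL",
I listed two ways to compute a percentage share in MySQL and Oracle: one builds a Cartesian product with ON 1 = 1, the other with CROSS JOIN.
The basic syntax is as follows: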
SELECT
`status`,
number,
concat(round(number / total * 100.00, 2), '%') percent
FROM
(
SELECT
*
FROM
(
SELECT
`status`,
COUNT(1) number
FROM
`user_tasks`
GROUP BY
`status`
) t1
INNER JOIN(
SELECT
COUNT(1) total
FROM
`user_tasks`
) t2 ON 1 = 1
) t3;
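The query above shows the ON 1 = 1 form. For comparison, here is a minimal sketch of the CROSS JOIN form mentioned earlier. It assumes the same `user_tasks` table and should return the same rows, since both forms pair every status row with the single total row.

```sql
-- Sketch: same share calculation, using CROSS JOIN instead of
-- INNER JOIN ... ON 1 = 1. Assumes the `user_tasks` table used above.
SELECT
  t1.`status`,
  t1.number,
  CONCAT(ROUND(t1.number / t2.total * 100.00, 2), '%') AS percent
FROM (
  -- one row per status
  SELECT `status`, COUNT(1) AS number
  FROM `user_tasks`
  GROUP BY `status`
) t1
CROSS JOIN (
  -- exactly one row: the overall count
  SELECT COUNT(1) AS total
  FROM `user_tasks`
) t2;
```

Because t2 always has exactly one row, the Cartesian product is harmless here: each status row gets the same total attached to it.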
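This pattern handles most share calculations, but it needs extra care when the share is computed over groups based on time.
Consider the following example.
Create the table: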
CREATE TABLE `order` (
`order_id` int(11) NOT NULL,
`order_time` datetime DEFAULT NULL,
`order_num` int(11) DEFAULT NULL,
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Insert the data:
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (1, '2019-01-02 15:02:42', 100);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (2, '2019-01-24 15:03:18', 200);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (3, '2018-01-04 15:03:37', 50);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (4, '2018-01-26 15:12:12', 120);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (5, '2019-02-01 15:12:48', 300);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (6, '2018-02-20 15:12:58', 180);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (7, '2019-03-12 15:13:08', 260);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (8, '2018-03-22 15:13:14', 220);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (9, '2019-04-17 15:13:27', 350);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (10, '2018-04-19 15:13:59', 280);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (11, '2019-04-17 15:21:45', 260);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (12, '2019-05-21 15:21:54', 200);
INSERT INTO `order`(`order_id`, `order_time`, `order_num`) VALUES (13, '2018-05-10 15:22:03', 220);
As shown in the figure below, the table records order volumes for January through May of 2018 and January through May of 2019.
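2. Pitfall: not grouping by month and by year separately
At this point we notice that the aggregation needs two different groupings: the monthly figures must be grouped by month and the yearly totals by year. Let's build the Cartesian product of the two and look at the result.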
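One way this pitfall can arise is sketched below. It is an assumption about how the basic pattern might be reused unchanged, not a query taken from the original post: the monthly sums are divided by a single overall total, so every month is measured against both years combined. The aliases ym, t1, and t2 are illustrative.

```sql
-- Sketch of the pitfall (illustrative, not from the original post):
-- the denominator is the total over ALL years, not the month's own year.
SELECT
  t1.ym,
  t1.number,
  CONCAT(ROUND(t1.number / t2.total * 100.00, 2), '%') AS percent
FROM (
  -- monthly order volume, e.g. '2019-01'
  SELECT DATE_FORMAT(order_time, '%Y-%m') AS ym, SUM(order_num) AS number
  FROM `order`
  GROUP BY ym
) t1
CROSS JOIN (
  -- a single grand total across 2018 and 2019
  SELECT SUM(order_num) AS total
  FROM `order`
) t2;
```

With the sample data the grand total is 2740, while the 2019 total alone is 1670, so each 2019 month would be reported with a smaller share than it has within its own year.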
3. Pitfall: the Cartesian product mismatches rows
SELECT
*
FROM
( SELECT DATE_FORMAT( order_time, '%Y-%m' ) AS MONTH,
sum( order_num ) AS number FROM `order`
GROUP BY MONTH ) t1
JOIN
( SELECT DATE_FORMAT( order_time, '%Y' ) AS YEAR,
sum( order_num ) AS total FROM `order`
GROUP BY YEAR ) t2
ON 1 = 1
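Now let's look at the result.
The aggregated values of number and total are themselves correct, one grouped by month and the other by year, but the Cartesian product pairs them incorrectly: every monthly row is joined to every yearly row, including the year it does not belong to.
This happens because the join has no condition that matches a month to its year.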
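A quick way to confirm the mismatch is to count the joined rows. The sketch below reuses the two subqueries with illustrative aliases (ym, yr, joined_rows), which are assumptions made to keep the example short.

```sql
-- Sketch: count how many rows the unconditioned join produces.
SELECT COUNT(*) AS joined_rows
FROM (
  SELECT DATE_FORMAT(order_time, '%Y-%m') AS ym, SUM(order_num) AS number
  FROM `order`
  GROUP BY ym
) t1
JOIN (
  SELECT DATE_FORMAT(order_time, '%Y') AS yr, SUM(order_num) AS total
  FROM `order`
  GROUP BY yr
) t2 ON 1 = 1;
-- With the sample data: 10 monthly rows x 2 yearly rows = 20 joined rows,
-- although only 10 (one per month) are wanted.
```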
4. Pitfall: the added join condition fails
SELECT
*
FROM
( SELECT DATE_FORMAT( order_time, '%Y-%m' ) AS MONTH, sum( order_num ) AS number FROM `order`
GROUP BY MONTH ) t1
JOIN
( SELECT DATE_FORMAT( order_time, '%Y' ) AS YEAR,sum( order_num ) AS total FROM `order`
GROUP BY YEAR ) t2
ON 1 = 1 AND date_format( t1.MONTH, '%Y' ) = t2.YEAR
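Here we add a condition that requires the year of the month to equal the year of the total, yet the result comes back empty. Why?
Let's look at the values of date_format( t1.MONTH, '%Y' ) to see what is wrong with them and why the join cannot match.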
SELECT
t1.month,date_format( t1.MONTH, '%Y' ),t2.YEAR
FROM
( SELECT DATE_FORMAT( order_time, '%Y-%m' ) AS MONTH,
sum( order_num ) AS number FROM `order`
GROUP BY MONTH ) t1
JOIN
( SELECT DATE_FORMAT( order_time, '%Y' ) AS YEAR,
sum( order_num ) AS total FROM `order`
GROUP BY YEAR ) t2
ON 1 = 1
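So the culprit is date_format( t1.MONTH, '%Y' ).
Once the timestamp has been formatted down to a month, the resulting string (for example '2019-01') is no longer recognized as a date when it is formatted a second time. The fix is string concatenation: append a day to the month string, following the format that the first DATE_FORMAT produced.
For example, date_format( concat( t1.MONTH, '-01' ), '%Y' ) works, because the concatenated value is recognized as a date again.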
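The behavior described above can be checked in isolation with literal values. This is a minimal sketch: the column aliases are illustrative, and the LEFT() column is an alternative not used in the original post, shown only because it extracts the year without reparsing the string as a date.

```sql
-- Sketch: compare the three ways of getting a year out of a 'YYYY-MM' string.
SELECT
  -- 'YYYY-MM' is not a complete date, so this should come back as NULL
  DATE_FORMAT('2019-01', '%Y') AS reformatted_month,
  -- appending a day makes it a valid date string again
  DATE_FORMAT(CONCAT('2019-01', '-01'), '%Y') AS padded_month,
  -- plain string handling: the first four characters, '2019'
  LEFT('2019-01', 4) AS year_prefix;
```

A join condition written as LEFT( t1.MONTH, 4 ) = t2.YEAR would be another option under the same assumption that the month string always starts with a four-digit year.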
The correct query is:
SELECT
MONTH,
YEAR,
number,
concat( round( number / total * 100.00, 2 ), '%' ) percent
FROM
(
SELECT
*
FROM
( SELECT DATE_FORMAT( order_time, '%Y-%m' ) AS MONTH,
sum( order_num ) AS number FROM `order`
GROUP BY MONTH ) t1
JOIN
( SELECT DATE_FORMAT( order_time, '%Y' ) AS YEAR,
sum( order_num ) AS total FROM `order`
GROUP BY YEAR ) t2
ON 1 = 1
AND date_format( concat( t1.MONTH, '-01' ), '%Y' ) = t2.YEAR
) t3;
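The result is as follows.
Each month is now matched one-to-one with its own year. For example, 2019-01 has a number of 300 against a 2019 total of 1670, which gives 17.96%.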
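As an alternative to the self-join, the same per-year share can be sketched with a window function. This assumes MySQL 8.0 or later (window functions are not available in 5.7) and is a different technique from the one used in this post. The aliases ym, yr, and m are illustrative.

```sql
-- Sketch (assumes MySQL 8.0+): share of each month within its own year,
-- computed with SUM() OVER (PARTITION BY ...) instead of a join.
SELECT
  ym,
  yr,
  number,
  CONCAT(ROUND(number / SUM(number) OVER (PARTITION BY yr) * 100.00, 2), '%') AS percent
FROM (
  -- one row per month, carrying its year alongside
  SELECT
    DATE_FORMAT(order_time, '%Y-%m') AS ym,
    DATE_FORMAT(order_time, '%Y')    AS yr,
    SUM(order_num)                   AS number
  FROM `order`
  GROUP BY ym, yr
) m
ORDER BY ym;
```

Because the yearly total is computed over the partition that the row already belongs to, there is no join condition to get wrong, and the year never has to be parsed back out of the month string.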