I recently needed to compute the success rate of calls between systems. The ORM I use is SQLAlchemy, and my database table looks like this:
+-----------+---------------+----------+---------------------+
| caller | callee | success | time |
+-----------+---------------+----------+---------------------+
| Bar::baz | Dolor::sit | 0 | 2016-01-07 00:00:00 |
| Bar::baz | Dolor::sit | 0 | 2016-01-05 00:00:00 |
| Bar::baz | Lorem::ipsum | 0 | 2016-01-01 00:00:00 |
| Bar::baz | Lorem::ipsum | 1 | 2016-01-04 00:00:00 |
| Bar::baz | Lorem::ipsum | 1 | 2016-01-09 00:00:00 |
| Bar::baz | Lorem::ipsum | 1 | 2016-01-08 00:00:00 |
| Bar::baz | Lorem::ipsum | 1 | 2016-01-04 00:00:00 |
| Bar::baz | Qux::foo | 0 | 2016-01-05 00:00:00 |
| Bar::baz | Qux::foo | 0 | 2016-01-01 00:00:00 |
| Bar::baz | Qux::foo | 1 | 2016-01-05 00:00:00 |
| Foo::bar | Dolor::sit | 0 | 2016-01-06 00:00:00 |
| Foo::bar | Lorem::ipsum | 0 | 2016-01-08 00:00:00 |
| Foo::bar | Lorem::ipsum | 1 | 2016-01-03 00:00:00 |
| Foo::bar | Lorem::ipsum | 1 | 2016-01-05 00:00:00 |
| Foo::bar | Lorem::ipsum | 1 | 2016-01-07 00:00:00 |
| Foo::bar | Qux::foo | 0 | 2016-01-07 00:00:00 |
| Foo::bar | Qux::foo | 0 | 2016-01-04 00:00:00 |
+-----------+---------------+----------+---------------------+
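For each caller/callee pair I need the total number of calls, the number of successful calls, and the success rate, with the rows sorted by success rate. The result I expect has this shape:
result(caller, callee, success_count, total_count, success_ratio)
I could not think of a good way to do this at first, so I asked on Stack Overflow, where an expert gave a detailed answer: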
from sqlalchemy import Integer, cast, desc, func

# cs refers to the call-state table shown above, accessed through its .c columns
result = session.query(
    cs.c.caller,
    cs.c.callee,
    func.sum(cast(cs.c.success, Integer)).label('success_count'),
    func.count().label('total_count'),
    (func.sum(cast(cs.c.success, Integer)) / func.count()).label('success_ratio')
).group_by(cs.c.caller, cs.c.callee).order_by(desc('success_ratio'))
The answer also included the corresponding SQL statement:
SELECT caller,
callee,
sum(success) AS 'success_count',
count(*) AS 'total_count',
sum(success) / count(*) AS 'success_ratio'
FROM callstate
GROUP BY caller, callee
ORDER BY success_ratio DESC
That got me out of a tight spot.
I am using Python with Flask-SQLAlchemy.
Here is my table definition:
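To see the grouped query run end to end, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or later. The table definition, its `id` column, and the inserted rows (copied from the Foo::bar rows of the table above, without the `time` column) are stand-ins for illustration, not the real schema. One deliberate difference from the answer: the sum is cast to Float before dividing, because some databases truncate the result when both operands are integers.

```python
from sqlalchemy import (Boolean, Column, Float, Integer, MetaData, String,
                        Table, cast, create_engine, desc, func, select)

metadata = MetaData()

# Hypothetical stand-in for the call-state table (the time column is omitted).
callstate = Table(
    "callstate", metadata,
    Column("id", Integer, primary_key=True),
    Column("caller", String(64), nullable=False),
    Column("callee", String(64), nullable=False),
    Column("success", Boolean, nullable=False),
)

engine = create_engine("sqlite://")  # in-memory database
metadata.create_all(engine)

# The Foo::bar rows from the sample table.
sample_rows = [
    ("Foo::bar", "Dolor::sit", False),
    ("Foo::bar", "Lorem::ipsum", False),
    ("Foo::bar", "Lorem::ipsum", True),
    ("Foo::bar", "Lorem::ipsum", True),
    ("Foo::bar", "Lorem::ipsum", True),
    ("Foo::bar", "Qux::foo", False),
    ("Foo::bar", "Qux::foo", False),
]

successes = func.sum(cast(callstate.c.success, Integer))
success_count = successes.label("success_count")
total_count = func.count().label("total_count")
# Cast to Float so the ratio is not truncated by integer division.
success_ratio = (cast(successes, Float) / func.count()).label("success_ratio")

stmt = (
    select(callstate.c.caller, callstate.c.callee,
           success_count, total_count, success_ratio)
    .group_by(callstate.c.caller, callstate.c.callee)
    .order_by(desc(success_ratio))
)

with engine.begin() as conn:
    conn.execute(
        callstate.insert(),
        [{"caller": a, "callee": b, "success": ok} for a, b, ok in sample_rows],
    )
    result = conn.execute(stmt).fetchall()

for row in result:
    print(row.caller, row.callee, row.success_count, row.total_count, row.success_ratio)
```

The row with the highest success rate comes first; pairs with the same rate have no guaranteed relative order unless a second sort key is added.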
class State(db.Model):
    __tablename__ = 'states'
    _id = db.Column(db.Integer, primary_key=True)
    caller = db.Column(db.String(64), nullable=False)
    caller_fn = db.Column(db.String(64), nullable=False)
    callee = db.Column(db.String(64), nullable=False)
    callee_fn = db.Column(db.String(64), nullable=False)
    success = db.Column(db.Boolean, nullable=False)
    time = db.Column(db.DateTime, nullable=False, default=datetime.utcnow)
    msg = db.Column(db.String(256))
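My queries then solve the problem with three labels:
from sqlalchemy import case, func
from sqlalchemy.sql.expression import label
all_count = label('all_count', func.count(State._id))
# SQLAlchemy 1.4+ takes the pairs positionally: case((State.success, 1), else_=0)
success_count = label('success_count', func.sum(case([(State.success, 1)], else_=0)))
success_ratio = label('success_ratio', success_count / all_count)
These three labels can be reused in almost any query that needs the total number of calls, the number of successful calls, or the success rate. They do not depend on the query's filter conditions.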
For example, like this:
query = db.session.query(State.caller, State.callee, all_count, success_count, success_ratio) \
    .group_by(State.caller, State.callee)
or like this:
query = db.session.query(State.caller_fn, all_count, success_count, success_ratio) \
    .group_by(State.caller_fn) \
    .filter_by(caller=caller)
or like this:
query = db.session.query(State.callee, State.callee_fn, all_count, success_count, success_ratio) \
    .group_by(State.callee, State.callee_fn) \
    .filter_by(caller=caller, caller_fn=caller_fn)
and so on.
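The reuse idea can be sketched as one small helper that takes the grouping columns and any filter conditions, while the three labels stay fixed. This is only an illustration under a few assumptions: it uses plain SQLAlchemy 1.4+ with an in-memory SQLite database instead of Flask-SQLAlchemy, the model is trimmed to three data columns, and `call_stats` and the sample rows are hypothetical. The ratio is cast to Float here to avoid integer division on databases that truncate.

```python
from sqlalchemy import (Boolean, Column, Float, Integer, String, case, cast,
                        create_engine, func)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class State(Base):
    """Trimmed-down version of the states model (assumption: fewer columns)."""
    __tablename__ = "states"
    _id = Column(Integer, primary_key=True)
    caller = Column(String(64), nullable=False)
    callee = Column(String(64), nullable=False)
    success = Column(Boolean, nullable=False)


# The three reusable labels, defined once.
_successes = func.sum(case((State.success, 1), else_=0))
all_count = func.count(State._id).label("all_count")
success_count = _successes.label("success_count")
success_ratio = (cast(_successes, Float) / func.count(State._id)).label("success_ratio")


def call_stats(session, group_cols, *criteria):
    """Hypothetical helper: group by group_cols, optionally filtered by criteria."""
    return (
        session.query(*group_cols, all_count, success_count, success_ratio)
        .filter(*criteria)
        .group_by(*group_cols)
        .all()
    )


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        State(caller="Foo", callee="Lorem", success=True),
        State(caller="Foo", callee="Lorem", success=False),
        State(caller="Foo", callee="Qux", success=False),
        State(caller="Bar", callee="Lorem", success=True),
    ])
    session.commit()

    # Same labels, two different groupings.
    per_caller = {r.caller: r for r in call_stats(session, [State.caller])}
    foo_callees = {
        r.callee: r
        for r in call_stats(session, [State.callee], State.caller == "Foo")
    }
```

Only the grouping columns and the filter change between the two calls; the counting and ratio logic is written once.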