OpenMLDB: A Detailed Look at Parameterized Query Statements

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Background","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In database management systems (DBMS), a prepared statement or parameterized statement is a feature used to execute the same or similar database statements repeatedly with high efficiency. Typically used with SQL statements such as queries or updates, the prepared statement takes the form of a template into which certain constant values are substituted during each execution. (","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//en.wikipedia.org/wiki/Prepared_statement","title":null,"type":null},"content":[{"type":"text","text":"https://en.wikipedia.org/wiki/Prepared_statement","attrs":{}}]},{"type":"text","text":")","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a database system, a parameterized statement serves two purposes. First, it can be precompiled, so the same statement runs efficiently when it is executed repeatedly. Second, it guards against SQL injection, which makes it safer. These two points are the main reasons traditional database systems support parameterized statements.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From a database-system perspective, ","attrs":{}},{"type":"link","attrs":{"href":"https://sourl.cn/guDiLJ","title":"","type":null},"content":[{"type":"text","text":"OpenMLDB","attrs":{}}]},{"type":"text","text":" support for parameterized query statements further completes its query capabilities. From a business perspective, it allows OpenMLDB to compute rule features in rule-engine scenarios.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Example scenario: feature computation for a rule engine","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"SELECT \nSUM(trans_amount) as F_TRANS_AMOUNT_SUM, \nCOUNT(user) as F_TRANS_COUNT,\nMAX(trans_amount) as F_TRANS_AMOUNT_MAX,\nMIN(trans_amount) as F_TRANS_AMOUNT_MIN\nFROM t1 where user = 
'ABC123456789' and trans_time between 1590115420000 and 1592707420000;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this example, we compute the following for user `ABC123456789` over the period from `2020-05-22 02:43:40` to `2020-06-21 02:43:40` (UTC): ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"total transaction amount","attrs":{}},{"type":"text","text":", ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"transaction count","attrs":{}},{"type":"text","text":", ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"maximum transaction amount","attrs":{}},{"type":"text","text":", and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"minimum transaction amount","attrs":{}},{"type":"text","text":". These features are then passed to a downstream component (the rule engine).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In practice, we cannot write a separate SQL query for every user. We need a template for rule-feature computation in which the user and the time range vary dynamically.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The simplest approach is to write a program like the one below, which splices the user name and the time range into the SQL text as variables.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"String query = \"SELECT \"+\n    \"SUM(trans_amount) as F_TRANS_AMOUNT_SUM, \"+\n    \"COUNT(user) as F_TRANS_COUNT,\"+\n    \"MAX(trans_amount) as F_TRANS_AMOUNT_MAX,\"+\n    \"MIN(trans_amount) as F_TRANS_AMOUNT_MIN \"+\n    \"FROM t1 where user = '\"+ user +\"' and trans_time between \" \n    + (System.currentTimeMillis() - 30L * 86400000L) + \" and \" + System.currentTimeMillis();\n\nexecutor.execute(query);","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This approach is straightforward, but every distinct SQL string has to be compiled again, so query performance is poor, and it is open to SQL injection. The recommended approach is a parameterized query:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"PreparedStatement stmt = conn.prepareStatement(\"SELECT \"+\n    \"SUM(trans_amount) as 
F_TRANS_AMOUNT_SUM, \"+\n    \"COUNT(user) as F_TRANS_COUNT,\"+\n    \"MAX(trans_amount) as F_TRANS_AMOUNT_MAX,\"+\n    \"MIN(trans_amount) as F_TRANS_AMOUNT_MIN \"+\n    \"FROM t1 where user = ? and trans_time between ? and ? \");\n\nstmt.setString(1, user);\n// trans_time holds epoch milliseconds, so bind it as a long\nstmt.setLong(2, System.currentTimeMillis() - 30L * 86400000L);\nstmt.setLong(3, System.currentTimeMillis());\nResultSet rs = stmt.executeQuery();\nrs.next();\n","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Implementation details","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//sourl.cn/qFpzK7","title":null,"type":null},"content":[{"type":"text","text":"OpenMLDB","attrs":{}}]},{"type":"text","text":", supporting a new syntax feature usually means working through the following steps in turn: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"syntax parsing, plan generation and optimization, expression codegen, and query execution.","attrs":{}},{"type":"text","text":" Where necessary, client interfaces also have to be added or refactored. Supporting `Parameterized Query` touched essentially all of these modules, so walking through its implementation is a quick way to get to know OpenMLDB development, especially OpenMLDB 
Engine development.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure below shows the flow of executing a parameterized query.","attrs":{}}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"In the application `JavaApplication`, the user runs a parameterized query through JDBC (PreparedStatement).","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The client (TabletClient) provides the `ExecuteSQLParameterized` interface to handle parameterized queries, and calls the `Query` service on the server (Tablet) over RPC.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"The server (Tablet) relies on the Engine module to compile and execute the query.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Compiling a query goes through three main stages: SQL parsing, plan generation and optimization, and expression codegen. On success, the compilation result is stored in the SQL context (SqlContext) of the current run session (RunSession). If the statement has already been compiled, it is not compiled again; the corresponding compilation artifact is fetched from the compile cache and placed in the RunSession's SqlContext.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Executing the query calls the `Run` interface of the RunSession. The execution result `run 
output` is stored in the response attachment and sent back to TabletClient, where it is finally wrapped in a `ResultSet` and returned to `JavaApplication`.","attrs":{}}]}]}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8d/8d2567ac2f52b281e54897394c15c3cc.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. JDBC PreparedStatement","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1.1 JDBC Prepared Statements overview","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sometimes it is more convenient to use a `PreparedStatement` object for sending SQL statements to the database. This special type of statement is derived from the more general class, `Statement`, that you already know.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you want to execute a `Statement` object many times, it usually reduces execution time to use a `PreparedStatement` object instead.[[2]](","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html%23overview_ps","title":null,"type":null},"content":[{"type":"text","text":"Using Prepared 
Statements","attrs":{}}]},{"type":"text","text":")","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"JDBC provides `PreparedStatement` for executing parameterized SQL statements. Users can use a `PreparedStatement` to run parameterized queries, inserts, updates, and so on. This section describes in detail how OpenMLDB's `PreparedStatement` executes a parameterized query.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1.2 Using the OpenMLDB PreparedStatement","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public void parameterizedQueryDemo() {\n    SdkOption option = new SdkOption();\n    option.setZkPath(TestConfig.ZK_PATH);\n    option.setZkCluster(TestConfig.ZK_CLUSTER);\n    option.setSessionTimeout(200000);\n    try {\n        SqlExecutor executor = new SqlClusterExecutor(option);\n        String dbname = \"demo_db\";\n        boolean ok = executor.createDB(dbname);\n        // create table\n        ok = executor.executeDDL(dbname, \"create table t1(user string, trans_amount double, trans_time bigint, index(key=user, ts=trans_time));\");\n        // insert sample rows\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user1', 1.0, 1592707420000);\");\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user1', 2.0, 1592707410000);\");\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user1', 3.0, 1592707400000);\");\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user2', 4.0, 1592707420000);\");\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user2', 5.0, 1592707410000);\");\n        ok = executor.executeInsert(dbname, \"insert into t1 values('user2', 6.0, 1592707400000);\");\n        ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"PreparedStatement query_statement \n    = executor.getPreparedStatement(dbname, \"select SUM(trans_amount), 
COUNT(trans_amount), MAX(trans_amount) from t1 where user=? and trans_time between ? and ?\");\n ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"query_statement.setString(1, \"user1\");\nquery_statement.setLong(2, 1592707410000L);\nquery_statement.setLong(3, 1592707420000L);\ncom._4paradigm.openmldb.jdbc.SQLResultSet rs1\n    = (com._4paradigm.openmldb.jdbc.SQLResultSet) query_statement.executeQuery();\n\nquery_statement.setString(1, \"user2\");\nquery_statement.setLong(2, 1592707410000L);\nquery_statement.setLong(3, 1592707420000L);\ncom._4paradigm.openmldb.jdbc.SQLResultSet rs2\n    = (com._4paradigm.openmldb.jdbc.SQLResultSet) query_statement.executeQuery();\n\nquery_statement.setString(1, \"user3\");\nquery_statement.setLong(2, 1592707410000L);\nquery_statement.setLong(3, 1592707420000L);\ncom._4paradigm.openmldb.jdbc.SQLResultSet rs3\n    = (com._4paradigm.openmldb.jdbc.SQLResultSet) query_statement.executeQuery();\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"    } catch (Exception e) {\n        e.printStackTrace();\n    }\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 1: Construct the executor and prepare the database, the table, and the table data (if needed).","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 2: Create a PreparedStatement instance from the parameterized query statement.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"PreparedStatement query_statement \n    = 
executor.getPreparedStatement(dbname, \"select SUM(trans_amount), COUNT(trans_amount), MAX(trans_amount) from t1 where user=? and trans_time between ? and ?\");\n","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 3: Set the parameter value at each position.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"query_statement.setString(1, \"user1\");\nquery_statement.setLong(2, 1592707410000L);\nquery_statement.setLong(3, 1592707420000L);","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 4: Execute the query and fetch the result. Note that after each query, the parameter data held by the PreparedStatement is cleared automatically, so you can set new parameter values directly and run the next query.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"com._4paradigm.openmldb.jdbc.SQLResultSet rs1\n    = (com._4paradigm.openmldb.jdbc.SQLResultSet) query_statement.executeQuery();\nquery_statement.setString(1, \"user2\");\nquery_statement.setLong(2, 1592707410000L);\nquery_statement.setLong(3, 1592707420000L);\ncom._4paradigm.openmldb.jdbc.SQLResultSet rs2\n    = (com._4paradigm.openmldb.jdbc.SQLResultSet) query_statement.executeQuery();\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1.3 Implementation details of PreparedStatement","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"public class PreparedStatement implements java.sql.PreparedStatement {\n    //...\n    // query statement\n    protected String currentSql;\n    // parameter data\n    protected TreeMap currentDatas;\n    // parameter types\n    protected TreeMap types;\n    // parameter types of the previous query\n    protected TreeMap 
orgTypes;\n    //...\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"`PreparedStatement` implements the standard JDBC interface `java.sql.PreparedStatement`. It maintains the basic elements needed to compile and execute a query: the query statement (`currentSql`), the parameter data (`currentDatas`), the parameter types (`types`), and so on.","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a `PreparedStatement` is constructed, it is initialized and `currentSql` is set.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Setting parameter values updates `currentDatas` and `types`.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To execute the query, `query_statement.executeQuery()` is called:","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"@Override\npublic SQLResultSet executeQuery() throws SQLException {\n    checkClosed();\n    dataBuild();\n    Status status = new Status();\n    com._4paradigm.openmldb.ResultSet resultSet = router.ExecuteSQLParameterized(db, currentSql, currentRow, status);\n    // ... 
omitted here\n    return rs;\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, `dataBuild` runs: it encodes the parameter data into `currentRow` according to parameter type and position. Note that if the parameter types have not changed, the existing `currentRow` instance can be reused.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"protected void dataBuild() throws SQLException {\n    // types has been updated, create new row container for currentRow\n    if (null == this.currentRow || orgTypes != types) {\n        // ... omitted here\n        this.currentRow = SQLRequestRow.CreateSQLRequestRowFromColumnTypes(columnTypes);\n        this.currentSchema = this.currentRow.GetSchema();\n        this.orgTypes = this.types;\n    }\n\n    // ... code that initializes currentRow\n\n    for (int i = 0; i < this.currentSchema.GetColumnCnt(); i++) {\n        DataType dataType = this.currentSchema.GetColumnType(i);\n        Object data = this.currentDatas.get(i+1);\n        if (data == null) {\n            ok = this.currentRow.AppendNULL();\n        } else {\n            // encoding details omitted\n            // if (DataType.kTypeInt64.equals(dataType)) {\n            //     ok = this.currentRow.AppendInt64((long) data);\n            // }\n            // ...\n        }\n        if (!ok) {\n            throw new SQLException(\"append data failed, idx is \" + i);\n        }\n    }\n    if (!this.currentRow.Build()) {\n        throw new SQLException(\"build request row failed\");\n    }\n    clearParameters();\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, the client's parameterized query interface `ExecuteSQLParameterized` is called.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. 
TabletClient and Tablet","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"2.1 Client side: tablet_client","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The client provides the `ExecuteSQLParameterized` interface to support parameterized queries.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"/// Execute batch SQL with parameter row\nstd::shared_ptr<hybridse::sdk::ResultSet> ExecuteSQLParameterized(const std::string& db, const std::string& sql,\n        std::shared_ptr<SQLRequestRow> parameter,\n        ::hybridse::sdk::Status* status) override;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"`ExecuteSQLParameterized` extracts the parameter types, the parameter row size, and related information from the parameter row `parameter` and puts them into the `QueryRequest`; the parameter data row itself goes into the RPC attachment. The client then issues the RPC, and the query is compiled and run on the server.","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Put the parameter row size, the slice count, and the parameter type list into the `QueryRequest`:","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"request.set_parameter_row_size(parameter_row.size());\nrequest.set_parameter_row_slices(1);\nfor (auto& type : parameter_types) {\n    
request.add_parameter_types(type);\n}","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The parameter data row is stored in the RPC attachment, `cntl->request_attachment()`:","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"auto& io_buf = cntl->request_attachment();\nif (!codec::EncodeRpcRow(reinterpret_cast(parameter_row.data()), parameter_row.size(), &io_buf)) {\n    LOG(WARNING) << \"Encode parameter buffer failed\";\n    return false;\n}\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Issue the RPC:","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"bool ok = client_.SendRequest(&::openmldb::api::TabletServer_Stub::Query, cntl, &request, response);","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"2.2 
Server side: Tablet","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The server-side tablet `Query` function reads the parameter row information from the `QueryRequest`, then calls `engine_->Get()` to compile the query and `session.Run()` to execute it.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"::hybridse::vm::BatchRunSession session;\nif (request->is_debug()) {\n    session.EnableDebug();\n}\nsession.SetParameterSchema(parameter_schema);\n{\n    bool ok = engine_->Get(request->sql(), request->db(), session, status);\n    // ...\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"::hybridse::codec::Row parameter_row;\nauto& request_buf = static_cast<brpc::Controller*>(ctrl)->request_attachment();\nif (request->parameter_row_size() > 0 &&\n    !codec::DecodeRpcRow(request_buf, 0, request->parameter_row_size(), request->parameter_row_slices(),\n                         &parameter_row)) {\n    response->set_code(::openmldb::base::kSQLRunError);\n    response->set_msg(\"fail to decode parameter row\");\n    return;\n}\nstd::vector<::hybridse::codec::Row> output_rows;\nint32_t run_ret = session.Run(parameter_row, output_rows);","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For more details, read the ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/4paradigm/OpenMLDB/blob/main/src/client/tablet_client.cc","title":null,"type":null},"content":[{"type":"text","text":"tablet client","attrs":{}}]},{"type":"text","text":" and ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/4paradigm/OpenMLDB/blob/main/src/tablet/tablet_impl.cc","title":null,"type":null},"content":[{"type":"text","text":"tablet","attrs":{}}]},{"type":"text","text":" source code.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. Compile: compiling the query statement","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"3.1 Compiling the query statement","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 1: Unlike an ordinary query, a parameterized query needs its parameter type information configured before compilation.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"session.SetParameterSchema(parameter_schema);\n","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 2: After the parameter list is configured, call `engine.Get(...)` to compile the SQL statement.","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compiling a query goes through three main stages: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"SQL parsing","attrs":{}},{"type":"text","text":" (3.2. 
Parser: syntax parsing), ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"plan generation","attrs":{}},{"type":"text","text":" (3.3 Planner: plan generation), and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"expression codegen","attrs":{}},{"type":"text","text":" (3.4 Codegen: code generation for expressions). On success, the compilation result is stored in the SQL context (SqlContext) of the current run session (RunSession). The following subsections walk through the compilation of a parameterized query in that order.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If the statement has already been compiled, it is not compiled again; the corresponding artifact is fetched from the compile cache and placed in the RunSession's SqlContext. The change to the compile cache design deserves particular attention: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"for a parameterized query, a cache hit requires both the SQL text and the parameter type list to match.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"// check if parameter types matches with target compile result or not.\nfor (int i = 0; i < batch_sess->GetParameterSchema().size(); i++) {\n    if (cache_ctx.parameter_types.Get(i).type() != batch_sess->GetParameterSchema().Get(i).type()) {\n        status = Status(common::kEngineCacheError, \"Inconsistent cache parameter type, expect \" +\n            batch_sess->GetParameterSchema().Get(i).DebugString(),\" but get \", cache_ctx.parameter_types.Get(i).DebugString());\n        return false;\n    }\n}\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"3.2. 
Parser: syntax parsing","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"OpenMLDB's SQL parser is built on the `ZetaSQL` parser. Besides covering ZetaSQL's existing syntax, it also supports syntax specific to OpenMLDB, such as the special join type `LastJoin` and the window type `ROWS_RANGE`, both introduced for AI scenarios. OpenMLDB's parsing and its new syntax features will be covered in future technical articles.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A parameterized SQL statement uses `?` as the parameter placeholder, which the ZetaSQL parser parses into a `zetasql::ASTParameterExpr`. Because `ZetaSQL` already parses parameterized query statements, the parsing module needs little extra work: we only lift the previous restriction, recognize this parameter expression, convert it into an OpenMLDB expression node of type `ParameterExpr`, and store it in the syntax tree.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"/// Convert zetasql::ASTExpression into ExprNode\nbase::Status ConvertExprNode(const zetasql::ASTExpression* ast_expression, node::NodeManager* node_manager,\n                             node::ExprNode** output) {\n    //...\n    base::Status status;\n    switch (ast_expression->node_kind()) {\n        //...\n        case zetasql::AST_PARAMETER_EXPR: {\n            const zetasql::ASTParameterExpr* parameter_expr =\n                ast_expression->GetAsOrNull<zetasql::ASTParameterExpr>();\n            CHECK_TRUE(nullptr != parameter_expr, common::kSqlAstError, \"not an ASTParameterExpr\")\n            // Only support anonymous parameter (e.g, ?) so far.\n            CHECK_TRUE(nullptr == parameter_expr->name(), common::kSqlAstError,\n                       \"Un-support Named Parameter Expression \", parameter_expr->name()->GetAsString());\n            *output = node_manager->MakeParameterExpr(parameter_expr->position());\n            return base::Status::OK();\n        }\n        //...\n    }\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, take the following parameterized query:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"SELECT col0 FROM t1 where col1 <= ?;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After parsing, it yields the following query syntax tree:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"+-list[list]:\n  +-0:\n    +-node[kQuery]: kQuerySelect\n      +-distinct_opt: false\n      +-where_expr:\n      |  +-expr[binary]\n      |    +-<=[list]:\n      |      +-0:\n      |      |  +-expr[column ref]\n      |      |    +-relation_name: \n      |      |    +-column_name: col1\n      |      +-1:\n      |        +-expr[parameter]\n      |          +-position: 1\n      +-group_expr_list: null\n      +-having_expr: null\n      +-order_expr_list: null\n      +-limit: null\n      +-select_list[list]:\n      |  +-0:\n      |    +-node[kResTarget]\n      |      +-val:\n      |      |  +-expr[column ref]\n      |      |    +-relation_name: \n      |      |    +-column_name: col0\n      |      +-name: \n      +-tableref_list[list]:\n      |  +-0:\n      |    +-node[kTableRef]: kTable\n      |      +-table: t1\n      |      +-alias: \n      +-window_list: []","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The filter condition is worth a closer look: ` where col1 <= ?` is parsed as:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"+-where_expr:\n|  +-expr[binary]\n|    +-<=[list]:\n|      +-0:\n|      |  +-expr[column ref]\n|      |    +-relation_name: \n|      |    +-column_name: col1\n|      +-1:\n|        +-expr[parameter]\n|          +-position: 1","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"3.3 
Planner: plan generation","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Logical plan","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the logical-plan stage, a parameterized query is no different from an ordinary query, so this article does not go into logical-plan details. For the following parameterized query:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"SELECT col0 FROM t1 where col1 <= ?;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"the logical plan is:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":": +-[kQueryPlan]\n +-[kProjectPlan]\n +-table: t1\n +-project_list_vec[list]:\n +-[kProjectList]\n +-projects on table [list]:\n | +-[kProjectNode]\n | +-[0]col0: col0\n +-[kFilterPlan]\n +-condition: col1 <= ?1\n 
+-[kTablePlan]","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Readers interested in the details of logical and physical plans can follow our column; a series of articles on the engine's internals will be published there.","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Physical plan","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During physical plan generation, two things are needed to support parameterized queries:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"maintain the parameter type list","attrs":{}},{"type":"text","text":" in the physical plan context, the expression analysis context, and the CodeGen context.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a parameterized query, the parameters used at execution time are supplied dynamically by the user, so the parameter types are also supplied dynamically from outside. For this we provide interfaces that let the user configure the ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"parameter type list","attrs":{}},{"type":"text","text":" (if there are parameters) when compiling the SQL. The list ends up in the physical plan context, the expression analysis context, and the CodeGen context.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"// physical plan context\nclass PhysicalPlanContext {\n    // ...\n private:\n    const codec::Schema* parameter_types_;\n};\n// expression analysis context\nclass ExprAnalysisContext {\n    // ...\n private:\n    const codec::Schema* parameter_types_;\n};\n// codegen context\nclass CodeGenContext {\n    // ...\n private:\n    const codec::Schema* 
parameter_types_;\n};\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, perform ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"type inference for parameter expressions","attrs":{}},{"type":"text","text":" based on the parameter type list.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once parsed, a parameterized query statement yields a syntax tree that is almost the same as an ordinary query's. The only difference is that it contains parameter expression nodes (`ParameterExpr`). The type of a parameter depends neither on the schema of the upstream table nor on a constant, so we cannot directly infer ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"the type of the parameter expression","attrs":{}},{"type":"text","text":". This means `ParameterExpr` needs special handling during plan generation, in particular during expression type inference: when inferring the output type of a `ParameterExpr`, the type is looked up in the ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"parameter type list","attrs":{}},{"type":"text","text":" by the parameter's position.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Status ParameterExpr::InferAttr(ExprAnalysisContext *ctx) {\n    // ignore code including boundary check and nullptr check\n    // ...\n    type::Type parameter_type = ctx->parameter_types()->Get(position()-1).type();\n    node::DataType dtype;\n    CHECK_TRUE(vm::SchemaType2DataType(parameter_type, &dtype), kTypeError,\n               \"Fail to convert type: \", parameter_type);\n    SetOutputType(ctx->node_manager()->MakeTypeNode(dtype));\n    return Status::OK();\n}\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the same SQL statement as before, the generated physical plan is:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"SIMPLE_PROJECT(sources=(col0))\n  FILTER_BY(condition=col1 <= ?1, left_keys=, right_keys=, index_keys=)\n    DATA_PROVIDER(table=auto_t0)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here the filter condition of the `FILTER_BY` node contains the parameter expression `condition=(col1 <= 
?1)`","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"3.4 Codegen: code generation for expressions","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Codegen module analyzes the expression list of each plan node and then generates code for the expressions and functions involved. After codegen, every plan node that has to evaluate expressions has at least one codegen function, and these functions carry out the evaluation.","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adding a parameter to the codegen function","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"OpenMLDB uses LLVM to generate intermediate code (IR) for every node that evaluates expressions. Concretely, it generates a function such as `@__internal_sql_codegen_6` for the expression list of each node. These functions are ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"called","attrs":{}},{"type":"text","text":" while the statement executes (see 4. Run: executing the query statement):","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"; ModuleID = 'sql'\nsource_filename = \"sql\"\ndefine i32 @__internal_sql_codegen_6(i64 /*row key id*/, \n i8* /*row ptr*/, \n i8* /*rows ptr*/, \n i8** /*output row ptr ptr*/) {\n__fn_entry__:\n// omitted\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most of this function's arguments are `i8*` pointers that point to a data row (`row ptr`) or a data set (`rows ptr`; aggregations depend on the data set). The function body evaluates each expression, encodes the results in order into a row, and writes the address of the encoded row to the final `i8**` output argument.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When the expression list contains parameter expressions, we additionally need the parameter data. All that is required is to add one more pointer to the existing function signature, pointing to the parameter row (`parameter_row ptr`).","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Status RowFnLetIRBuilder::Build(...) {\n    // omitted\n    std::vector<std::string> args;\n    std::vector<::llvm::Type*> args_llvm_type;\n    args_llvm_type.push_back(::llvm::Type::getInt64Ty(module->getContext()));\n    args_llvm_type.push_back(::llvm::Type::getInt8PtrTy(module->getContext()));\n    args_llvm_type.push_back(::llvm::Type::getInt8PtrTy(module->getContext()));\n    args_llvm_type.push_back(::llvm::Type::getInt8PtrTy(module->getContext()));  // newly added int8 pointer argument for the parameter row\n    args_llvm_type.push_back(::llvm::Type::getInt8PtrTy(module->getContext())->getPointerTo());\n    // ...\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"parameter expressions supported","attrs":{}},{"type":"text","text":", the codegen function then looks like this:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"; ModuleID = 'sql'\nsource_filename = \"sql\"\ndefine i32 @__internal_sql_codegen_6(i64 /*row key id*/, \n i8* /*row ptr*/, \n i8* /*rows ptr*/, \n i8* /*parameter row ptr*/, \n i8** /*output row ptr ptr*/) {\n__fn_entry__:\n// omitted\n}","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Codegen for parameter expressions","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The parameter row follows the same OpenMLDB encoding format as an ordinary data row: element 0 of the parameter row is the first parameter of the query, element 1 is the second, and so on. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Evaluating a parameter expression","attrs":{}},{"type":"text","text":" therefore amounts to ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"reading the parameter at the corresponding position from the parameter row","attrs":{}},{"type":"text","text":".","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"// Get parameter item from parameter row\n// param parameter\n// param output\n// return\nStatus ExprIRBuilder::BuildParameterExpr(const ::hybridse::node::ParameterExpr* parameter, NativeValue* output) {\n    // ...\n    VariableIRBuilder variable_ir_builder(ctx_->GetCurrentBlock(), ctx_->GetCurrentScope()->sv());\n    NativeValue parameter_row;\n    // Load the parameter row parameter_row from the global scope\n    CHECK_TRUE(variable_ir_builder.LoadParameter(&parameter_row, status), kCodegenError, status.msg);\n    // ...\n    // Read the parameter at the corresponding position from the parameter row\n    CHECK_TRUE(\n        buf_builder.BuildGetField(parameter->position()-1, slice_ptr, slice_size, output),\n        kCodegenError, \"Fail to get \", parameter->position(), \"th parameter value\");\n    return base::Status::OK();\n}\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the query in the earlier example, the condition `col1 <= ?1` of the `FILTER_BY` node thus generates the following code:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"; ModuleID = 'sql'\nsource_filename = \"sql\"\ndefine i32 @__internal_sql_codegen_6(i64, i8*, i8*, i8*, i8**) {\n__fn_entry__:\n  %is_null_addr1 = alloca i8, align 1\n  %is_null_addr = alloca i8, align 1\n  // Get the row pointer, row = {col0, col1, col2, col3, col4, col5}\n  %5 = call i8* @hybridse_storage_get_row_slice(i8* %1, i64 0)\n  %6 = call i64 @hybridse_storage_get_row_slice_size(i8* %1, i64 0)\n  // Get field row[1], i.e. col1\n  %7 = call i32 @hybridse_storage_get_int32_field(i8* %5, i32 1, i32 7, i8* nonnull %is_null_addr)\n  %8 = load i8, i8* %is_null_addr, align 1\n  // Get the parameter row pointer, parameter_row = {?1}\n  %9 = call i8* @hybridse_storage_get_row_slice(i8* %3, i64 0)\n  %10 = call i64 @hybridse_storage_get_row_slice_size(i8* %3, i64 0)\n  // Get field parameter_row[0], i.e. the first parameter\n  %11 = call i32 @hybridse_storage_get_int32_field(i8* %9, i32 0, i32 7, i8* nonnull %is_null_addr1)\n  %12 = load i8, i8* %is_null_addr1, align 1\n  %13 = 
or i8 %12, %8\n  // Compare col1 <= ?1\n  %14 = icmp sle i32 %7, %11\n  // ... several lines omitted\n  // Encode and output the comparison result %14\n  store i1 %14, i1* %20, align 1\n  ret i32 0\n\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We do not intend to go into the details of codegen here. Articles on Codegen design and optimization will follow; if you are interested, keep an eye on the OpenMLDB technical column.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. Run: executing the query statement","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After a query statement is compiled, the compilation artifacts are stored in the current run session (RunSession).","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RunSession provides a `Run` interface for executing the query. Compared with an ordinary query, executing a parameterized query additionally requires passing in the parameter row.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"session.run(parameter_row, outputs)","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The parameter row `parameter_row` is stored in the ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"run context","attrs":{}},{"type":"text","text":" (`RunnerContext`):","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"RunnerContext ctx(&sql_ctx.cluster_job, parameter_row, 
is_debug_);","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a parameterized query, evaluating an expression may depend on parameters passed in at run time. When executing the plan we therefore fetch the parameter row from the run context and pass it to the expression function. Take the TableProject node as an example:","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For an ordinary query, TableProject iterates over every row of the table and applies `RowProject` to each one. In a parameterized query, evaluating the expressions may depend on the parameters as well as on the data row, so we fetch the parameter row from the run context and then call `project_gen_.Gen(iter->GetValue(), parameter)`.","attrs":{}}]}]}],"attrs":{}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"std::shared_ptr<DataHandler> TableProjectRunner::Run(\n    RunnerContext& ctx,\n    const std::vector<std::shared_ptr<DataHandler>>& inputs) {\n    // ... some code omitted\n    // Fetch the parameter row from the run context (or an empty row pointer if there is none)\n    auto& parameter = ctx.GetParameterRow();\n    iter->SeekToFirst();\n    int32_t cnt = 0;\n    while (iter->Valid()) {\n        if (limit_cnt_ > 0 && cnt++ >= limit_cnt_) {\n            break;\n        }\n        // Iterate over every row of the table and evaluate its expression list\n        output_table->AddRow(project_gen_.Gen(iter->GetValue(), parameter));\n        iter->Next();\n    }\n    return output_table;\n}\n\nconst Row ProjectGenerator::Gen(const Row& row, const Row& parameter) {\n    return CoreAPI::RowProject(fn_, row, parameter, false);\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"`CoreAPI::RowProject` evaluates the expression list from the data row and the parameter row. Its most important job is to call the function fn, which is generated by codegen from the expression list when the query is compiled. As described in the section ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"code generation for expressions","attrs":{}},{"type":"text","text":" (3.4 Codegen: code generation for expressions), we added a parameter row pointer to the argument list of the codegen function.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"// Evaluate the expression list from the input data row and the parameter row, and output the result\nhybridse::codec::Row CoreAPI::RowProject(const RawPtrHandle fn,\n    const hybridse::codec::Row 
row,\n    const hybridse::codec::Row parameter,\n    const bool need_free) {\n    // some code omitted\n    auto udf = reinterpret_cast<int32_t (*)(const int64_t, const int8_t*, const int8_t*, const int8_t*, int8_t**)>(\n        const_cast<int8_t*>(fn));\n\n    auto row_ptr = reinterpret_cast<const int8_t*>(&row);\n    auto parameter_ptr = reinterpret_cast<const int8_t*>(&parameter);\n    int8_t* buf = nullptr;\n    // Arguments: row key id, data row, data set (unused here), parameter row, output row\n    uint32_t ret = udf(0, row_ptr, nullptr, parameter_ptr, &buf);\n    // some code omitted\n    return Row(base::RefCountedSlice::CreateManaged(\n        buf, hybridse::codec::RowView::GetSize(buf)));\n}","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Future work","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"PreparedStatement precompilation is done on the server-side tablet, and the compilation result is cached there. On the next query the cached result can be reused as long as the SQL statement and the parameter types match. This means, however, that every query the client executes has to send the SQL statement and the parameter types to the tablet. For long query statements this overhead is considerable, so the design still has room for optimization. One option is to have the server generate a unique precompiled-query ID (QID) that is returned to the client and kept in the PreparedStatement context. As long as the parameter types do not change, the client can then execute the query with just the QID and the parameters, which reduces the cost of transmitting the query statement.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"std::shared_ptr<hybridse::sdk::ResultSet> ExecuteSQLParameterized(\n    const std::string& db, const std::string& qid,\n    std::shared_ptr<SQLRequestRow> parameter,\n    ::hybridse::sdk::Status* status) override;","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"More developers are welcome to follow and contribute to the ","attrs":{}},{"type":"link","attrs":{"href":"https://sourl.cn/guDiLJ","title":"","type":null},"content":[{"type":"text","text":"OpenMLDB","attrs":{}}]},{"type":"text","text":" open source project.","attrs":{}}]}]}
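The QID idea described under future work could look roughly like the following server-side cache. This is a minimal sketch under stated assumptions, not OpenMLDB code: `PreparedQueryCache`, `CompiledQuery`, `ParamType`, `Prepare`, and `Find` are hypothetical names, and "compilation" is reduced to storing the SQL text. It only illustrates how a result keyed by (SQL, parameter types) can be handed out as a QID and found again while the parameter types still match.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parameter type; not an OpenMLDB type.
enum class ParamType { kInt32, kInt64, kDouble, kString };

// Hypothetical stand-in for a compilation result.
struct CompiledQuery {
    std::string sql;
    std::vector<ParamType> parameter_types;
};

class PreparedQueryCache {
 public:
    // Compile (or reuse) a query and return its QID.
    // The same SQL text with the same parameter types maps to the same QID.
    uint64_t Prepare(const std::string& sql, const std::vector<ParamType>& types) {
        auto key = std::make_pair(sql, types);
        auto it = qid_by_key_.find(key);
        if (it != qid_by_key_.end()) {
            return it->second;
        }
        uint64_t qid = next_qid_++;
        qid_by_key_.emplace(key, qid);
        compiled_by_qid_.emplace(
            qid, std::make_shared<CompiledQuery>(CompiledQuery{sql, types}));
        return qid;
    }

    // Look up a compilation result by QID.
    // Returns nullptr if the QID is unknown or the parameter types differ,
    // in which case the client would have to send the SQL text again.
    std::shared_ptr<const CompiledQuery> Find(
        uint64_t qid, const std::vector<ParamType>& types) const {
        auto it = compiled_by_qid_.find(qid);
        if (it == compiled_by_qid_.end() ||
            it->second->parameter_types != types) {
            return nullptr;
        }
        return it->second;
    }

 private:
    uint64_t next_qid_ = 1;
    std::map<std::pair<std::string, std::vector<ParamType>>, uint64_t> qid_by_key_;
    std::map<uint64_t, std::shared_ptr<CompiledQuery>> compiled_by_qid_;
};
```

With such a cache, the client would send the SQL text once to obtain a QID and afterwards send only the QID and the parameter row. Eviction and invalidation when a table schema changes are left out of the sketch.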