Walkthrough of the SQL statement execution flow
The whole flow is handled in the exec_simple_query function. Its code skeleton is as follows:
/*
 * exec_simple_query
 *
 * Execute a "simple Query" protocol message.
 */
static void
exec_simple_query(const char *query_string)
{
    ...
    // Obtain the raw parse tree(s)
    /*
     * Do basic parsing of the query or queries (this should be safe even if
     * we are in aborted transaction state!)
     */
    parsetree_list = pg_parse_query(query_string);
    ...
    // Loop over the SQL statements
    /*
     * Run through the raw parsetree(s) and process each one.
     */
    foreach(parsetree_item, parsetree_list)
    {
        ...
        // Analyze and rewrite the raw parse tree to produce query trees
        querytree_list = pg_analyze_and_rewrite(parsetree, query_string,
                                                NULL, 0, NULL);
        // Optimize the query trees to produce execution plans
        plantree_list = pg_plan_queries(querytree_list,
                                        CURSOR_OPT_PARALLEL_OK, NULL);
        ...
        // Execute the statement
        /*
         * Run the portal to completion, and then drop it (and the receiver).
         */
        (void) PortalRun(portal,
                         FETCH_ALL,
                         true,    /* always top level */
                         true,
                         receiver,
                         receiver,
                         completionTag);
        ...
    }
    ...
}
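Query analysis and rewriting
Lexical and syntax parsing
Lexing and parsing are done with Flex and Bison; for details see https://my.oschina.net/Greedxuji/blog/4290160
Query analysis and rewriting
After lexical and syntax parsing, a SQL statement yields a raw parse tree. Query analysis analyzes and rewrites that raw parse tree, converting it into one or more query trees.
This work is done mainly in the pg_analyze_and_rewrite function, in two steps: parse analysis, followed by rule rewriting.
The code skeleton is as follows: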
/*
 * Given a raw parsetree (gram.y output), and optionally information about
 * types of parameter symbols ($n), perform parse analysis and rule rewriting.
 *
 * A list of Query nodes is returned, since either the analyzer or the
 * rewriter might expand one query to several.
 *
 * NOTE: for reasons mentioned above, this must be separate from raw parsing.
 */
List *
pg_analyze_and_rewrite(RawStmt *parsetree, const char *query_string,
                       Oid *paramTypes, int numParams,
                       QueryEnvironment *queryEnv)
{
    Query      *query;
    List       *querytree_list;

    TRACE_POSTGRESQL_QUERY_REWRITE_START(query_string);

    /*
     * (1) Perform parse analysis.
     */
    if (log_parser_stats)
        ResetUsage();

    // Analyze the raw parse tree
    query = parse_analyze(parsetree, query_string, paramTypes, numParams,
                          queryEnv);

    if (log_parser_stats)
        ShowUsage("PARSE ANALYSIS STATISTICS");

    // Rewrite the resulting query tree
    /*
     * (2) Rewrite the queries, as necessary
     */
    querytree_list = pg_rewrite_query(query);

    TRACE_POSTGRESQL_QUERY_REWRITE_DONE(query_string);

    return querytree_list;
}
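Query analysis: parse_analyze
Query analysis converts the raw parse tree into a query tree. Because the raw parse tree is a tree structure, the analyzer walks its nodes and applies the matching handler to each one.
The basic call stack is shown below. Every statement type that needs analysis, SELECT included, has its own transform function, so each SQL statement is simply handled by the branch that matches its node type.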
parse_analyze
  ->transformTopLevelStmt
    ->transformOptionalSelectInto
      ->transformStmt
        ->transformInsertStmt
        ->transformDeleteStmt
        ->transformUpdateStmt
        ->transformSelectStmt
        ->transformDeclareCursorStmt
        ->transformExplainStmt
        ->transformCreateTableAsStmt
        ->transformCallStmt
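The dispatch step at the bottom of this call stack can be sketched as a switch on the node type. The following is a minimal, self-contained illustration, not PostgreSQL's code: the ToyTag enum, the ToyNode struct, and toy_transform_stmt are hypothetical names, and the function only returns the name of the handler that would be chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node tags; PostgreSQL's real NodeTag enum is far larger. */
typedef enum
{
    TOY_T_InsertStmt,
    TOY_T_DeleteStmt,
    TOY_T_UpdateStmt,
    TOY_T_SelectStmt,
    TOY_T_OtherStmt
} ToyTag;

/* Hypothetical raw-statement node: only the tag matters for dispatch. */
typedef struct
{
    ToyTag      tag;
} ToyNode;

/*
 * Return the name of the transform routine that a transformStmt-style
 * dispatcher would pick for this node. Statement types without a
 * dedicated routine fall through to the default branch.
 */
static const char *
toy_transform_stmt(const ToyNode *node)
{
    assert(node != NULL);

    switch (node->tag)
    {
        case TOY_T_InsertStmt:
            return "transformInsertStmt";
        case TOY_T_DeleteStmt:
            return "transformDeleteStmt";
        case TOY_T_UpdateStmt:
            return "transformUpdateStmt";
        case TOY_T_SelectStmt:
            return "transformSelectStmt";
        default:
            return "no dedicated transform";
    }
}
```

Calling toy_transform_stmt on a node tagged TOY_T_SelectStmt selects "transformSelectStmt", which mirrors how the SELECT example in the next section reaches transformSelectStmt.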
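Key functions
Take "SELECT * FROM A_TBL,B_TBL WHERE xx = xx" as the example.
When the command is analyzed, transformSelectStmt is called first.
A SELECT command carries eight kinds of information covered here: WITH, FROM, the target list, WHERE, HAVING, ORDER BY, GROUP BY, and DISTINCT. Each one has a corresponding handler function. The code is as follows: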
static Query *
transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
{
    Query      *qry = makeNode(Query);
    Node       *qual;
    ListCell   *l;

    qry->commandType = CMD_SELECT;

    /* process the WITH clause independently of all else */
    if (stmt->withClause)
    {
        qry->hasRecursive = stmt->withClause->recursive;
        qry->cteList = transformWithClause(pstate, stmt->withClause);
        qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
    }

    /* Complain if we get called from someplace where INTO is not allowed */
    if (stmt->intoClause)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("SELECT ... INTO is not allowed here"),
                 parser_errposition(pstate,
                                    exprLocation((Node *) stmt->intoClause))));

    /* make FOR UPDATE/FOR SHARE info available to addRangeTableEntry */
    pstate->p_locking_clause = stmt->lockingClause;

    /* make WINDOW info available for window functions, too */
    pstate->p_windowdefs = stmt->windowClause;

    /* process the FROM clause */
    transformFromClause(pstate, stmt->fromClause);

    /* transform targetlist */
    qry->targetList = transformTargetList(pstate, stmt->targetList,
                                          EXPR_KIND_SELECT_TARGET);

    /* mark column origins */
    markTargetListOrigins(pstate, qry->targetList);

    /* transform WHERE */
    qual = transformWhereClause(pstate, stmt->whereClause,
                                EXPR_KIND_WHERE, "WHERE");

    /* initial processing of HAVING clause is much like WHERE clause */
    qry->havingQual = transformWhereClause(pstate, stmt->havingClause,
                                           EXPR_KIND_HAVING, "HAVING");

    /*
     * Transform sorting/grouping stuff. Do ORDER BY first because both
     * transformGroupClause and transformDistinctClause need the results. Note
     * that these functions can also change the targetList, so it's passed to
     * them by reference.
     */
    qry->sortClause = transformSortClause(pstate,
                                          stmt->sortClause,
                                          &qry->targetList,
                                          EXPR_KIND_ORDER_BY,
                                          false /* allow SQL92 rules */ );

    qry->groupClause = transformGroupClause(pstate,
                                            stmt->groupClause,
                                            &qry->groupingSets,
                                            &qry->targetList,
                                            qry->sortClause,
                                            EXPR_KIND_GROUP_BY,
                                            false /* allow SQL92 rules */ );

    if (stmt->distinctClause == NIL)
    {
        qry->distinctClause = NIL;
        qry->hasDistinctOn = false;
    }
    else if (linitial(stmt->distinctClause) == NULL)
    {
        /* We had SELECT DISTINCT */
        qry->distinctClause = transformDistinctClause(pstate,
                                                      &qry->targetList,
                                                      qry->sortClause,
                                                      false);
        qry->hasDistinctOn = false;
    }
    else
    {
        /* We had SELECT DISTINCT ON */
        qry->distinctClause = transformDistinctOnClause(pstate,
                                                        stmt->distinctClause,
                                                        &qry->targetList,
                                                        qry->sortClause);
        qry->hasDistinctOn = true;
    }

    /* transform LIMIT */
    qry->limitOffset = transformLimitClause(pstate, stmt->limitOffset,
                                            EXPR_KIND_OFFSET, "OFFSET");
    qry->limitCount = transformLimitClause(pstate, stmt->limitCount,
                                           EXPR_KIND_LIMIT, "LIMIT");

    /* transform window clauses after we have seen all window functions */
    qry->windowClause = transformWindowDefinitions(pstate,
                                                   pstate->p_windowdefs,
                                                   &qry->targetList);

    /* resolve any still-unresolved output columns as being type text */
    if (pstate->p_resolve_unknowns)
        resolveTargetListUnknowns(pstate, qry->targetList);

    qry->rtable = pstate->p_rtable;
    qry->jointree = makeFromExpr(pstate->p_joinlist, qual);

    qry->hasSubLinks = pstate->p_hasSubLinks;
    qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
    qry->hasTargetSRFs = pstate->p_hasTargetSRFs;
    qry->hasAggs = pstate->p_hasAggs;

    foreach(l, stmt->lockingClause)
    {
        transformLockingClause(pstate, qry,
                               (LockingClause *) lfirst(l), false);
    }

    assign_query_collations(pstate, qry);

    /* this must be done after collations, for reliable comparison of exprs */
    if (pstate->p_hasAggs || qry->groupClause || qry->groupingSets || qry->havingQual)
        parseCheckAggregates(pstate, qry);

    return qry;
}
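FROM handling: transformFromClause
To handle FROM, transformFromClause iterates over the from-list and passes each item to transformFromClauseItem. An item may be a plain base table or a subquery, for example: select * from aa,(select * from bb) as BB;
Items are therefore handled according to their type:
- RangeVar, handled by transformTableEntry: an ordinary base table. Its information is stored directly in the pstate->p_rtable list, and later output follows the order of that list.
- RangeSubselect, handled by transformRangeSubselect: a subquery used as a table. Because it is a complete SELECT statement, it is ultimately analyzed by calling transformStmt again. The result is stored in pstate->p_rtable, with the rtekind field set to RTE_SUBQUERY to distinguish it.
- RangeFunction, handled by transformRangeFunction: looks up the function and finally calls addRangeTableEntryForFunction. The result is stored in pstate->p_rtable, with rtekind set to RTE_FUNCTION.
- RangeTableFunc, handled by transformRangeTableFunc: handles XMLTABLE-style table functions. The result is stored in pstate->p_rtable, with rtekind set to RTE_TABLEFUNC.
- RangeTableSample, handled inside transformFromClauseItem: the underlying relation is transformed by a recursive call, and the TABLESAMPLE information is then attached to its range table entry.
- JoinExpr, handled inside transformFromClauseItem: a join. The left and right nodes are analyzed to obtain the base table information, and a new RTE is created and stored in pstate->p_rtable, with rtekind set to RTE_JOIN.
Once processing is complete, all table names have been added to pstate->p_namespace. This list is used later to resolve column names: either to expand select * into every column, or to check whether a referenced column exists.
The relevant code is as follows: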
/*
 * transformFromClause -
 *    Process the FROM clause and add items to the query's range table,
 *    joinlist, and namespace.
 *
 * Note: we assume that the pstate's p_rtable, p_joinlist, and p_namespace
 * lists were initialized to NIL when the pstate was created.
 * We will add onto any entries already present --- this is needed for rule
 * processing, as well as for UPDATE and DELETE.
 */
void
transformFromClause(ParseState *pstate, List *frmList)
{
    ListCell   *fl;

    /*
     * The grammar will have produced a list of RangeVars, RangeSubselects,
     * RangeFunctions, and/or JoinExprs. Transform each one (possibly adding
     * entries to the rtable), check for duplicate refnames, and then add it
     * to the joinlist and namespace.
     *
     * Note we must process the items left-to-right for proper handling of
     * LATERAL references.
     */
    foreach(fl, frmList)
    {
        Node       *n = lfirst(fl);
        RangeTblEntry *rte;
        int         rtindex;
        List       *namespace;

        n = transformFromClauseItem(pstate, n,
                                    &rte,
                                    &rtindex,
                                    &namespace);

        checkNameSpaceConflicts(pstate, pstate->p_namespace, namespace);

        /* Mark the new namespace items as visible only to LATERAL */
        setNamespaceLateralState(namespace, true, true);

        pstate->p_joinlist = lappend(pstate->p_joinlist, n);
        pstate->p_namespace = list_concat(pstate->p_namespace, namespace);
    }

    /*
     * We're done parsing the FROM list, so make all namespace items
     * unconditionally visible. Note that this will also reset lateral_only
     * for any namespace items that were already present when we were called;
     * but those should have been that way already.
     */
    setNamespaceLateralState(pstate->p_namespace, false, true);
}

static Node *
transformFromClauseItem(ParseState *pstate, Node *n,
                        RangeTblEntry **top_rte, int *top_rti,
                        List **namespace)
{
    if (IsA(n, RangeVar))
    {
        ...
    }
    else if (IsA(n, RangeSubselect))
    {
        ...
    }
    else if (IsA(n, RangeFunction))
    {
        ...
    }
    else if (IsA(n, RangeTableFunc))
    {
        ...
    }
    else if (IsA(n, RangeTableSample))
    {
        ...
    }
    else if (IsA(n, JoinExpr))
    {
        ...
    }
}
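The bookkeeping described for the FROM clause (append an entry tagged with its kind, reject duplicate reference names, look names up later) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not PostgreSQL's implementation: ToyRte, ToyParseState, toy_add_rte, and toy_lookup are hypothetical names, and the range table and the namespace are collapsed into one fixed-size array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry kinds, loosely modeled on the rtekind values above. */
typedef enum
{
    TOY_RTE_RELATION,
    TOY_RTE_SUBQUERY,
    TOY_RTE_FUNCTION,
    TOY_RTE_JOIN
} ToyRteKind;

typedef struct
{
    ToyRteKind  kind;
    const char *refname;        /* table name or alias used in the query */
} ToyRte;

#define TOY_MAX_RTES 8

/* Hypothetical parse state: one array stands in for rtable and namespace. */
typedef struct
{
    ToyRte      rtable[TOY_MAX_RTES];
    int         nrtes;
} ToyParseState;

/*
 * Append an entry and return its 1-based index. Returns 0 when the
 * reference name is already present or the array is full.
 */
static int
toy_add_rte(ToyParseState *ps, ToyRteKind kind, const char *refname)
{
    assert(ps != NULL && refname != NULL);

    for (int i = 0; i < ps->nrtes; i++)
    {
        if (strcmp(ps->rtable[i].refname, refname) == 0)
            return 0;           /* duplicate reference name */
    }
    if (ps->nrtes == TOY_MAX_RTES)
        return 0;               /* no room left */

    ps->rtable[ps->nrtes].kind = kind;
    ps->rtable[ps->nrtes].refname = refname;
    ps->nrtes++;
    return ps->nrtes;
}

/* Return the 1-based index of the entry named refname, or 0 if absent. */
static int
toy_lookup(const ToyParseState *ps, const char *refname)
{
    assert(ps != NULL && refname != NULL);

    for (int i = 0; i < ps->nrtes; i++)
    {
        if (strcmp(ps->rtable[i].refname, refname) == 0)
            return i + 1;
    }
    return 0;
}
```

For the example select * from aa,(select * from bb) as BB, the sketch would record aa as a TOY_RTE_RELATION entry at index 1 and BB as a TOY_RTE_SUBQUERY entry at index 2; a second item named aa would be rejected.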
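Target list handling: transformTargetList
When all columns are selected, * has to be expanded into the full list of column names, as in "SELECT * FROM A_TBL,B_TBL WHERE xx = xx". During column-name resolution, each name is checked against the tables in pstate->p_namespace; if no table contains that column, an error is raised.
Target entries are either plain string column names or dot (.) style names, and they are handled as follows:
- Plain column names only: stored directly in qry->targetList.
- Names containing a * star, handled by ExpandColumnRefStar: a bare * is expanded into all column names (SELECT *, dname FROM emp, dept). For a table-qualified * (SELECT emp.*, dname FROM emp, dept), the qualified name is checked to have no more than 4 parts. The result is stored in qry->targetList.
- Dot (.) style entries, handled by ExpandIndirectionStar: the expression is analyzed and the column names are verified to exist; if they do, they are stored in qry->targetList.
The corresponding code is as follows: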
List *
transformTargetList(ParseState *pstate, List *targetlist,
                    ParseExprKind exprKind)
{
    List       *p_target = NIL;
    bool        expand_star;
    ListCell   *o_target;

    /* Shouldn't have any leftover multiassign items at start */
    Assert(pstate->p_multiassign_exprs == NIL);

    /* Expand "something.*" in SELECT and RETURNING, but not UPDATE */
    expand_star = (exprKind != EXPR_KIND_UPDATE_SOURCE);

    foreach(o_target, targetlist)
    {
        ResTarget  *res = (ResTarget *) lfirst(o_target);

        /*
         * Check for "something.*". Depending on the complexity of the
         * "something", the star could appear as the last field in ColumnRef,
         * or as the last indirection item in A_Indirection.
         */
        if (expand_star)
        {
            if (IsA(res->val, ColumnRef))
            {
                ColumnRef  *cref = (ColumnRef *) res->val;

                if (IsA(llast(cref->fields), A_Star))
                {
                    /* It is something.*, expand into multiple items */
                    p_target = list_concat(p_target,
                                           ExpandColumnRefStar(pstate,
                                                               cref,
                                                               true));
                    continue;
                }
            }
            else if (IsA(res->val, A_Indirection))
            {
                A_Indirection *ind = (A_Indirection *) res->val;

                if (IsA(llast(ind->indirection), A_Star))
                {
                    /* It is something.*, expand into multiple items */
                    p_target = list_concat(p_target,
                                           ExpandIndirectionStar(pstate,
                                                                 ind,
                                                                 true,
                                                                 exprKind));
                    continue;
                }
            }
        }

        /*
         * Not "something.*", or we want to treat that as a plain whole-row
         * variable, so transform as a single expression
         */
        p_target = lappend(p_target,
                           transformTargetEntry(pstate,
                                                res->val,
                                                NULL,
                                                exprKind,
                                                res->name,
                                                false));
    }
    ...
}
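The effect of star expansion can be illustrated with a small stand-alone sketch. It is not how ExpandColumnRefStar is implemented: ToyTable and toy_expand_star are hypothetical names, the column lists are invented for the emp/dept example (only dname appears in the text above), and real target entries carry far more than a column name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_COLS 4

/* Hypothetical table description: a name plus its column names. */
typedef struct
{
    const char *name;
    const char *cols[TOY_MAX_COLS];
    int         ncols;
} ToyTable;

/*
 * Expand "*" (qualifier == NULL) or "qualifier.*" into column names.
 * Columns are written to out in FROM-list order. Returns the number of
 * columns written, or -1 if no table matched or out is too small.
 */
static int
toy_expand_star(const ToyTable *tables, int ntables,
                const char *qualifier,
                const char **out, int max_out)
{
    int         n = 0;
    int         matched = 0;

    assert(tables != NULL && out != NULL);

    for (int t = 0; t < ntables; t++)
    {
        /* With a qualifier, skip every table whose name differs. */
        if (qualifier != NULL && strcmp(tables[t].name, qualifier) != 0)
            continue;

        matched = 1;
        for (int c = 0; c < tables[t].ncols; c++)
        {
            if (n == max_out)
                return -1;      /* output buffer too small */
            out[n++] = tables[t].cols[c];
        }
    }
    return matched ? n : -1;
}
```

With emp(ename, deptno) and dept(deptno, dname) as assumed columns, a bare * yields all four columns in FROM order, while emp.* yields only the two columns of emp; an unknown qualifier is reported as an error by the -1 return.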
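WHERE handling: transformWhereClause
This function handles the WHERE clause. There is no WHERE-specific expression routine; the clause is processed by the general transformExpr function. When WHERE holds a single comparison, transformExpr handles it in the T_A_Expr branch, and the column operands go through the T_ColumnRef branch. When WHERE combines several conditions with AND or OR, transformExpr takes the T_BoolExpr branch, and transformBoolExpr transforms each argument in turn until the T_ColumnRef level is reached.
The final WHERE result is stored in the jointree by qry->jointree = makeFromExpr(pstate->p_joinlist, qual); so when the plan tree is optimized later, the optimizer works on the jointree.
The code is as follows: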
Node *
transformWhereClause(ParseState *pstate, Node *clause,
                     ParseExprKind exprKind, const char *constructName)
{
    Node       *qual;

    if (clause == NULL)
        return NULL;

    qual = transformExpr(pstate, clause, exprKind);

    qual = coerce_to_boolean(pstate, qual, constructName);

    return qual;
}
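The recursive descent from a boolean combination down to its column references can be pictured with a toy expression tree. This is only a sketch: ToyExpr, its three node kinds, and toy_count_column_refs are hypothetical simplifications, and the function merely counts the column references it reaches instead of transforming them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds, a tiny subset of what a real parser handles. */
typedef enum
{
    TOY_COLUMN_REF,             /* leaf: a column name */
    TOY_OP_EXPR,                /* binary operator such as "=" */
    TOY_BOOL_EXPR               /* AND / OR over two conditions */
} ToyExprKind;

typedef struct ToyExpr
{
    ToyExprKind kind;
    const struct ToyExpr *left;
    const struct ToyExpr *right;
} ToyExpr;

/*
 * Walk the tree the way a recursive transform routine would: inner nodes
 * recurse into their operands, leaves are handled directly. Returns the
 * number of column references visited.
 */
static int
toy_count_column_refs(const ToyExpr *e)
{
    if (e == NULL)
        return 0;

    switch (e->kind)
    {
        case TOY_COLUMN_REF:
            return 1;
        case TOY_OP_EXPR:
        case TOY_BOOL_EXPR:
            return toy_count_column_refs(e->left) +
                   toy_count_column_refs(e->right);
    }
    return 0;                   /* unknown kind: nothing visited */
}
```

A condition shaped like a = b visits two column references; a = b AND a = b visits four, because the boolean node first recurses into each comparison.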
HAVING handling: transformWhereClause
HAVING is processed in the same way as the WHERE clause:
    /* initial processing of HAVING clause is much like WHERE clause */
    qry->havingQual = transformWhereClause(pstate, stmt->havingClause,
                                           EXPR_KIND_HAVING, "HAVING");
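GROUP BY handling: transformGroupClause
The GROUP BY clause has to be handled together with the ORDER BY clause. During analysis the ORDER BY clause is transformed first, and the resulting sortClause is then passed to transformGroupClause when the GROUP BY clause is transformed.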
ORDER BY handling:
DISTINCT handling:
These two are not covered here.
Rewriting: pg_rewrite_query
The query is rewritten according to the rules defined in the pg_rewrite system catalog.