plpgsql, functions and operators, expressions, and their internal implementation in lightdb/postgresql

PG_PROC

PG_OPERATOR

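pg_opclass defines the operators that an index uses, generally for one class of data type. pg_opfamily groups the operators of mutually compatible data types; for how the two relate, see https://www.postgresql.org/docs/9.1/catalog-pg-opclass.html. pg_opfamily was introduced in PostgreSQL 8.3, for this reason: Create "operator families" to improve planning of queries involving cross-data-type comparisons (Tom)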

https://www.postgresql.org/docs/current/btree-behavior.html

https://www.postgresql.org/docs/current/indexes-opclass.html

PG_LANGUAGE

For operator expressions, PostgreSQL actually converts every operator into a call to its corresponding function.
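The operator-to-function mapping can be pictured with a small lookup table. In PostgreSQL the pg_operator catalog records the implementing function in its oprcode column (for example, integer `+` is implemented by int4pl). The sketch below is only an illustration in plain C: every `demo_` name is hypothetical, and the table is a stand-in for the catalog, not PostgreSQL code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the functions behind the int4 operators. */
typedef int (*demo_binary_fn)(int, int);

static int demo_int4pl(int a, int b)  { return a + b; }
static int demo_int4mi(int a, int b)  { return a - b; }
static int demo_int4mul(int a, int b) { return a * b; }

/* One row per operator, loosely modeled on pg_operator (oprname, oprcode). */
struct demo_operator {
    const char    *oprname;  /* operator symbol */
    const char    *oprcode;  /* name of the implementing function */
    demo_binary_fn fn;       /* resolved function pointer */
};

static const struct demo_operator demo_operators[] = {
    {"+", "int4pl",  demo_int4pl},
    {"-", "int4mi",  demo_int4mi},
    {"*", "int4mul", demo_int4mul},
};

/* Find the table row for an operator symbol, or NULL if it is unknown. */
static const struct demo_operator *demo_lookup_operator(const char *oprname)
{
    size_t n = sizeof(demo_operators) / sizeof(demo_operators[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(demo_operators[i].oprname, oprname) == 0)
            return &demo_operators[i];
    return NULL;
}

/* Evaluate "left op right" by calling the function behind the operator.
 * Returns 0 on success, -1 if the operator is not in the table. */
static int demo_eval_operator(const char *oprname, int left, int right,
                              int *result)
{
    const struct demo_operator *op = demo_lookup_operator(oprname);

    assert(result != NULL);
    if (op == NULL)
        return -1;
    *result = op->fn(left, right);
    return 0;
}
```

Calling `demo_eval_operator("+", 1, 2, &r)` goes through the same two steps as the real system in miniature: resolve the operator to a function, then call the function.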

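By execution time, that is, in ExecMakeTableFunctionResult/ExecMakeFunctionResultSet, the function information (fcinfo/flinfo) and the function pointer have already been determined.

Expression implementation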

https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL

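plpgsql implementation

Initialization

Compilation

By default, compilation happens on the first call, in plpgsql_compile() (pl_comp.c). Functions are always compiled the first time they are executed; the only exception is a stored procedure written in the SQL language (functions have no such special case).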

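In theory, when syntax is parsed into data structures, a statement's expressions can be represented as a binary tree that is traversed depth-first. Each node stores its node type, its operands and value, and possibly its child operands (left/right). Binary trees are usually processed recursively, and recursion is comparatively slow, so the tree can be flattened into an array (the key question is how to represent it; PG implements this). For SQL statements this step is the same whether the input is a hard-coded literal, a bind variable, a column, a function, an expression, an aggregate function, or an analytic function, because everything in the targetlist has already been turned into arrays and expressions. This includes case when xxx=ssss then; case xxx when sss then; between and; interval ''. Both statements and expressions can be evaluated with a binary tree.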

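bison generates LALR(1) parsers by default (Look-Ahead, Left-to-right scan, Rightmost derivation in reverse), which normally look ahead one token. bison also supports GLR (flex and bison, chapter 9), which pursues alternative parses in parallel and can in effect look ahead an unlimited number of tokens. Because the parser looks one token ahead, operators can be assigned precedences, and the input can then be converted into a depth-first tree. Reverse Polish notation solves the parenthesis problem, since it encodes precedence without parentheses, but it is hard for people to read and better suited to machine representation.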
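How precedence removes the need for parentheses can be shown with a small infix-to-RPN converter. This is a sketch of the classic shunting-yard technique, not of what bison generates and not PostgreSQL code. It assumes single-digit operands, the four left-associative operators `+ - * /`, and parentheses; all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Binding strength: a larger value binds tighter; 0 means "not an operator". */
static int demo_precedence(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

/* Append one character to out, keeping room for the terminating NUL. */
static int demo_emit(char *out, size_t outsz, size_t *n, char c)
{
    if (*n + 1 >= outsz)
        return -1;
    out[(*n)++] = c;
    return 0;
}

/*
 * Convert infix to RPN. Returns 0 on success, -1 on an unknown character,
 * unbalanced parentheses, or insufficient space. Operand/operator
 * alternation is not validated in this sketch.
 */
static int demo_infix_to_rpn(const char *in, char *out, size_t outsz)
{
    char   stack[64];   /* pending operators and '(' */
    size_t sp = 0, n = 0;

    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    for (; *in != '\0'; in++) {
        char c = *in;

        if (isspace((unsigned char) c))
            continue;
        if (isdigit((unsigned char) c)) {
            if (demo_emit(out, outsz, &n, c) != 0)
                return -1;
        } else if (c == '(') {
            if (sp >= sizeof(stack))
                return -1;
            stack[sp++] = c;
        } else if (c == ')') {
            while (sp > 0 && stack[sp - 1] != '(')
                if (demo_emit(out, outsz, &n, stack[--sp]) != 0)
                    return -1;
            if (sp == 0)
                return -1;          /* unmatched ')' */
            sp--;                   /* discard the '(' */
        } else if (demo_precedence(c) > 0) {
            /* Left-associative: pop operators of equal or higher precedence. */
            while (sp > 0 && demo_precedence(stack[sp - 1]) >= demo_precedence(c))
                if (demo_emit(out, outsz, &n, stack[--sp]) != 0)
                    return -1;
            if (sp >= sizeof(stack))
                return -1;
            stack[sp++] = c;
        } else {
            return -1;              /* unknown character */
        }
    }
    while (sp > 0) {
        if (stack[sp - 1] == '(')
            return -1;              /* unmatched '(' */
        if (demo_emit(out, outsz, &n, stack[--sp]) != 0)
            return -1;
    }
    out[n] = '\0';
    return 0;
}
```

With these rules `1+2*3` and `(1+2)*3` produce different operator orders in the output, and the parentheses themselves never appear in it.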

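Invocation

During the analyze (semantic analysis) phase the function is resolved, which determines the fixed parts of fcinfo/flinfo such as the function name and the function pointer. The call stack of that lookup is: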

FuncnameGetCandidates
lt_func_get_detail
ParseFuncOrColumn
transformFuncCall
transformExprRecurse
transformExpr
transformRangeFunction
transformFromClauseItem
transformFromClause
transformSelectStmt
transformStmt
transformTopLevelStmt
parse_analyze
pg_analyze_and_rewrite
exec_simple_query
PostgresMain
BackendRun
BackendStartup
ServerLoop
PostmasterMain
main

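The function address is stored in the fn_addr field. For C-language functions it is looked up in lookup_C_func().

So how does a function address get into that hash table the first time? On a lookup miss, fmgr_info_C_lang() resolves the symbol with load_external_function() and caches it with record_C_func().

Completion

Exception cleanup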

/*-------------------------------------------------------------------------
 *        Support struct to ease writing Set Returning Functions (SRFs)
 *-------------------------------------------------------------------------
 *
 * This struct holds function context for Set Returning Functions.
 * Use fn_extra to hold a pointer to it across calls
 */
typedef struct FuncCallContext
{
    /*
     * Number of times we've been called before
     *
     * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
     * incremented for you every time SRF_RETURN_NEXT() is called.
     */
    uint64        call_cntr;

    /*
     * OPTIONAL maximum number of calls
     *
     * max_calls is here for convenience only and setting it is optional. If
     * not set, you must provide alternative means to know when the function
     * is done.
     */
    uint64        max_calls;

    /*
     * OPTIONAL pointer to miscellaneous user-provided context information
     *
     * user_fctx is for use as a pointer to your own struct to retain
     * arbitrary context information between calls of your function.
     */
    void       *user_fctx;

    /*
     * OPTIONAL pointer to struct containing attribute type input metadata
     *
     * attinmeta is for use when returning tuples (i.e. composite data types)
     * and is not used when returning base data types. It is only needed if
     * you intend to use BuildTupleFromCStrings() to create the return tuple.
     */
    AttInMetadata *attinmeta;

    /*
     * memory context used for structures that must live for multiple calls
     *
     * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
     * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
     * context for any memory that is to be reused across multiple calls of
     * the SRF.
     */
    MemoryContext multi_call_memory_ctx;

    /*
     * OPTIONAL pointer to struct containing tuple description
     *
     * tuple_desc is for use when returning tuples (i.e. composite data types)
     * and is only needed if you are going to build the tuples with
     * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
     * the TupleDesc pointer stored here should usually have been run through
     * BlessTupleDesc() first.
     */
    TupleDesc    tuple_desc;

} FuncCallContext;

For functions that do not span multiple calls (context-free, usually scalar functions), an example is as follows:

SRF

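FmgrInfo: function information

TupleDesc: record definition

HeapTuple: record

A record tuple is built from C strings with extern HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values);. The actual work is done in heap_form_tuple, shown below: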

/*
 * heap_form_tuple
 *        construct a tuple from the given values[] and isnull[] arrays,
 *        which are of the length indicated by tupleDescriptor->natts
 *
 * The result is allocated in the current memory context.
 */
HeapTuple
heap_form_tuple(TupleDesc tupleDescriptor,
                Datum *values,
                bool *isnull)

/*
 * heap_fill_tuple
 *        Load data portion of a tuple from values/isnull arrays
 *
 * We also fill the null bitmap (if any) and set the infomask bits
 * that reflect the tuple's data contents.
 *
 * NOTE: it is now REQUIRED that the caller have pre-zeroed the data area.
 */
void
heap_fill_tuple(TupleDesc tupleDesc,
                Datum *values, bool *isnull,
                char *data, Size data_size,
                uint16 *infomask, bits8 *bit)

Implementation of generate_series and returning set types

CREATE OR REPLACE FUNCTION fibonacci_seq(num integer)
RETURNS SETOF integer AS $$
DECLARE
    a int := 0;
    b int := 1;
BEGIN
    IF (num <= 0) THEN
        RETURN;
    END IF;
    RETURN NEXT a;
    LOOP
        EXIT WHEN num <= 1;
        RETURN NEXT b;
        num = num - 1;
        SELECT b, a + b INTO a, b;
    END LOOP;
END;
$$ language plpgsql;

zjh@postgres-# (SELECT fibonacci_seq(3));
 fibonacci_seq 
---------------
             0
             1
             1
(3 rows)

-- This pattern looks a little odd, but by design PL/pgSQL cannot support syntax such as RETURN a,b,c.
CREATE FUNCTION permutations(INOUT a int,
INOUT b int,
INOUT c int)
RETURNS SETOF RECORD
AS $$
BEGIN
RETURN NEXT;
SELECT b,c INTO c,b; RETURN NEXT;
SELECT a,b INTO b,a; RETURN NEXT;
SELECT b,c INTO c,b; RETURN NEXT;
SELECT a,b INTO b,a; RETURN NEXT;
SELECT b,c INTO c,b; RETURN NEXT;
END;
$$ LANGUAGE plpgsql;

zjh@postgres=# SELECT * FROM permutations(1, 2, 3);
 a | b | c 
---+---+---
 1 | 2 | 3
 1 | 3 | 2
 3 | 1 | 2
 3 | 2 | 1
 2 | 3 | 1
 2 | 1 | 3
(6 rows)

zjh@postgres=# CREATE OR REPLACE FUNCTION permutations2(a int, b int, c int)
RETURNS SETOF abc
AS $$
BEGIN
RETURN NEXT a,b,c;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION
zjh@postgres=# 
zjh@postgres=# 
zjh@postgres=# select * from permutations2(1,2,3);
ERROR:  query "SELECT a,b,c" returned 3 columns
CONTEXT:  PL/pgSQL function permutations2(integer,integer,integer) line 3 at RETURN NEXT

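generate_series itself is implemented in C, but its structure is similar to the plpgsql implementations above.

Debugging pl/pgsql code

At present there are a few main plpgsql debugger plugins: https://github.com/OmniDB/plpgsql_debugger and plugin_debugger (written by EDB). The mainstream pg IDEs, including dbeaver, pgadmin 4, and navicat, all support them. The lightdb 22.4 release ships with plugin_debugger built in, and its redistributed dbeaver also supports debugging plpgsql and pgorasql out of the box.

In addition, as in oracle, pg can print the call stack; see https://www.cybertec-postgresql.com/en/debugging-pl-pgsql-get-stacked-diagnostics/.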
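The value-per-call protocol that FuncCallContext supports can be imitated in plain C to show the control flow: set up state on the first call, return one value per call, clean up when the set is exhausted. This is a hedged sketch only. It is not the source of generate_series, it uses malloc instead of memory contexts, and the structs and `demo_` names are hypothetical simplifications. It assumes a positive step and a small range (no overflow handling).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down analogue of FuncCallContext. */
struct demo_call_ctx {
    uint64_t call_cntr;  /* number of values returned so far */
    uint64_t max_calls;  /* total number of values to return */
    void    *user_fctx;  /* function-specific state kept across calls */
};

/* State that user_fctx points to for this series. */
struct demo_series_state {
    int current;
    int step;
};

/*
 * Each call yields at most one value. *ctxp plays the role of fn_extra:
 * it is NULL before the first call and holds the context afterwards.
 * Returns 1 and sets *value when a row is produced, 0 when the set is
 * exhausted (state is freed and *ctxp reset), -1 on allocation failure.
 */
static int demo_generate_series(struct demo_call_ctx **ctxp,
                                int start, int stop, int step, int *value)
{
    struct demo_call_ctx     *ctx;
    struct demo_series_state *st;

    assert(ctxp != NULL && value != NULL && step > 0);

    if (*ctxp == NULL) {
        /* First call: what SRF_FIRSTCALL_INIT() would set up. */
        ctx = malloc(sizeof(*ctx));
        st = malloc(sizeof(*st));
        if (ctx == NULL || st == NULL) {
            free(ctx);
            free(st);
            return -1;
        }
        st->current = start;
        st->step = step;
        ctx->call_cntr = 0;
        ctx->max_calls = (stop >= start)
            ? (uint64_t) ((stop - start) / step) + 1
            : 0;
        ctx->user_fctx = st;
        *ctxp = ctx;
    }

    ctx = *ctxp;
    st = ctx->user_fctx;

    if (ctx->call_cntr < ctx->max_calls) {
        /* Counterpart of SRF_RETURN_NEXT(): hand back one value. */
        *value = st->current;
        st->current += st->step;
        ctx->call_cntr++;
        return 1;
    }

    /* Counterpart of SRF_RETURN_DONE(): release cross-call state. */
    free(st);
    free(ctx);
    *ctxp = NULL;
    return 0;
}
```

A caller keeps invoking the function with the same `ctxp` until it returns 0, which mirrors how the executor calls a set-returning function repeatedly and how RETURN NEXT accumulates rows in the PL/pgSQL examples.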

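Implementation of PL/pgSQL

For PL/pgSQL stored procedure examples, see https://blog.csdn.net/kmblack1/article/details/92786900.

plpgsql supports transactions (in stored procedures, not in functions), implements expressions and statements with the expression engine, and executes SQL statements through SPI. To understand the pl/pgsql implementation you therefore need to be familiar with transaction snapshots, expressions, and how SPI works first; otherwise there will be large blind spots.

Compilation, validation, execution, language.

SQL execution engine, PL/pgSQL engine (no separate expression parser).

In postgresql, the execution of PL/pgSQL procedures and functions is somewhat like Javascript and python. The first time a session loads one, it is parsed and semantically analyzed and an in-memory structure is built; later executions run the cached in-memory instruction structure directly. This is rather troublesome: to be sure there are no problems you have to call the routine once first, and for a very complex stored procedure that means a lot of preparation. (Note: SQL procedures are compiled when they are created.) Because the code is ultimately compiled by the plXXXsql procedural parser into instructions that run in an expression-engine-like interpreter, performance tends to vary considerably and is usually lower than that of functions written in C. For details see https://www.postgresql.org/docs/13/plpgsql-implementation.html and https://www.percona.com/live/19/sites/default/files/slides/Introduction%20to%20PL_pgSQL%20Development%20-%20FileId%20-%20187790.pdf.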

zjh@postgres=# CREATE OR REPLACE FUNCTION ambiguous(parameter varchar) RETURNS
zjh@postgres-# integer AS $$
zjh@postgres$# DECLARE retval integer;
zjh@postgres$# BEGIN
zjh@postgres$# INSERT INTO parameter (parameter) VALUES (parameter) RETURNING id
zjh@postgres$# INTO retval;
zjh@postgres$# RETURN retval;
zjh@postgres$# END
zjh@postgres$# $$
zjh@postgres-# language plpgsql;
CREATE FUNCTION
zjh@postgres=# 
zjh@postgres=# SELECT ambiguous ('parameter');
ERROR:  relation "parameter" does not exist
LINE 1: INSERT INTO parameter (parameter) VALUES (parameter) RETURNI...
                    ^
QUERY:  INSERT INTO parameter (parameter) VALUES (parameter) RETURNING id
CONTEXT:  PL/pgSQL function ambiguous(character varying) line 4 at SQL statement

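Core design architecture of expressions

The expression engine is less straightforward to implement than functions. The core design goal is run-time performance: an expression is usually evaluated once per row, and deeply recursive functions are worse than plain iteration in both resource consumption and speed. In PG, expressions are therefore designed as follows. At parse time, recurse downward through the nested function list. At expression initialization, recurse in the same way from the outside in and convert the depth-first binary tree into an array. The flattening is done in ExprEvalPushStep(ExprState *es, const ExprEvalStep *s).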
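The tree-to-array idea can be sketched in a few dozen lines of plain C: compile the tree once with a recursive post-order walk that appends steps to an array, then evaluate by looping over the array with no recursion. This is an illustration under simplifying assumptions, not PostgreSQL's code. The sketch uses a small value stack, whereas PostgreSQL's steps write into result locations; all `demo_` names and the 64-step limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_STEPS 64

enum demo_node_kind { DEMO_CONST, DEMO_ADD, DEMO_MUL };

/* Expression tree node as the parser might produce it. */
struct demo_node {
    enum demo_node_kind     kind;
    int                     value;        /* used by DEMO_CONST only */
    const struct demo_node *left, *right; /* used by operators only */
};

/* One flat instruction. */
struct demo_step {
    enum demo_node_kind opcode;
    int                 value;
};

struct demo_program {
    struct demo_step steps[DEMO_MAX_STEPS];
    size_t           nsteps;
};

/* Append one step, similar in spirit to ExprEvalPushStep(). */
static int demo_push_step(struct demo_program *p, struct demo_step s)
{
    if (p->nsteps >= DEMO_MAX_STEPS)
        return -1;
    p->steps[p->nsteps++] = s;
    return 0;
}

/* Done once at initialization: depth-first, children before operator. */
static int demo_compile(const struct demo_node *n, struct demo_program *p)
{
    struct demo_step s;

    assert(n != NULL && p != NULL);
    if (n->kind != DEMO_CONST) {
        if (demo_compile(n->left, p) != 0 || demo_compile(n->right, p) != 0)
            return -1;
    }
    s.opcode = n->kind;
    s.value = n->value;
    return demo_push_step(p, s);
}

/* Done once per row: a flat loop over the steps, no recursion.
 * Expects a non-empty program produced by demo_compile(). */
static int demo_run(const struct demo_program *p)
{
    int    stack[DEMO_MAX_STEPS];
    size_t sp = 0;

    assert(p != NULL && p->nsteps > 0);
    for (size_t i = 0; i < p->nsteps; i++) {
        const struct demo_step *s = &p->steps[i];

        if (s->opcode == DEMO_CONST) {
            stack[sp++] = s->value;
        } else {
            int r = stack[--sp];
            int l = stack[--sp];

            stack[sp++] = (s->opcode == DEMO_ADD) ? l + r : l * r;
        }
    }
    return stack[sp - 1];
}
```

The recursion cost is paid once in `demo_compile`; `demo_run`, which would execute for every row, is a simple loop over a contiguous array.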

Reference: https://www.postgresql.eu/events/pgconfeu2018/sessions/session/2130/slides/113/Stored%20Procedures.pdf

Cursors and cross-transaction cursors: https://www.cybertec-postgresql.com/en/declare-cursor-in-postgresql-or-how-to-reduce-memory-consumption/ and https://www.postgresql.org/docs/10/sql-declare.html. The macro definitions corresponding to the cursor options are:

#define CURSOR_OPT_BINARY  0x0001 /* BINARY */
#define CURSOR_OPT_SCROLL   0x0002 /* SCROLL explicitly given */
#define CURSOR_OPT_NO_SCROLL 0x0004 /* NO SCROLL explicitly given */
#define CURSOR_OPT_INSENSITIVE 0x0008 /* INSENSITIVE */
#define CURSOR_OPT_HOLD    0x0010 /* WITH HOLD */
/* these planner-control flags do not correspond to any SQL grammar: */
#define CURSOR_OPT_FAST_PLAN 0x0020 /* prefer fast-start plan */
#define CURSOR_OPT_GENERIC_PLAN 0x0040 /* force use of generic plan */
#define CURSOR_OPT_CUSTOM_PLAN 0x0080 /* force use of custom plan */
#define CURSOR_OPT_PARALLEL_OK 0x0100 /* parallel mode OK */
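Because each option is a distinct bit, a set of cursor options is a single integer that is combined with `|` and tested with `&`. The sketch below repeats three of the macro values listed above so that it compiles on its own; the two `demo_` helpers are hypothetical and only illustrate how such flags are typically checked, including the rule that SCROLL and NO SCROLL cannot both be given.

```c
#include <assert.h>

/* Values copied from the macro list above. */
#define CURSOR_OPT_SCROLL    0x0002 /* SCROLL explicitly given */
#define CURSOR_OPT_NO_SCROLL 0x0004 /* NO SCROLL explicitly given */
#define CURSOR_OPT_HOLD      0x0010 /* WITH HOLD */

/* Returns 1 if the combination is acceptable, 0 if SCROLL and
 * NO SCROLL were both requested. */
static int demo_cursor_options_valid(int options)
{
    return !((options & CURSOR_OPT_SCROLL) &&
             (options & CURSOR_OPT_NO_SCROLL));
}

/* Returns 1 if the cursor was declared WITH HOLD, that is, it is meant
 * to survive the transaction that created it. */
static int demo_cursor_is_holdable(int options)
{
    return (options & CURSOR_OPT_HOLD) != 0;
}
```

For example, `CURSOR_OPT_SCROLL | CURSOR_OPT_HOLD` describes a scrollable cursor declared WITH HOLD.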
https://wiki.postgresql.org/wiki/Debugging_the_PostgreSQL_grammar_(Bison)
