Understanding PostgreSQL PipelineDB and Improving Statistics Performance in Practice

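Overview

1. Built on PostgreSQL: the database's functions, expressions, stored procedures and other features are all available, which already makes it very capable. It also supports sharding plugins such as plproxy.

2. Data input and querying: everything starts with a stream. Create the stream first; only then can a view or transform be defined to query it. Data is written into a stream with INSERT.

3. Persistence: a view stores its computed results in the database, whereas a transform stores nothing and relies on a trigger to send its results elsewhere.

4. Note: a view can be queried from inside a stored procedure, but a transform cannot, because its output has already been redirected.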

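Background

pipelinedb is a stream-computing database built on PostgreSQL. It is written in pure C and is very efficient (in the test later in this article, a single 32-core machine handled about 25.056 billion rows per day). It also inherits PostgreSQL's powerful feature set, and it is stirring up a fierce contest for dominance among stream-computing databases.

It fits a wide range of Internet of Things (IoT) scenarios, and more and more users are migrating to pipelineDB from other stream-computing platforms.

Using pipelinedb is simple: first define a stream, then define transforms (event triggers) and Continuous Views (real-time statistics) on top of it.

As data is inserted into the stream, the transforms and Continuous Views process it in real time behind the scenes, which is both friendly and efficient for developers.

Better still, every interface is plain SQL, which is convenient and greatly lowers the development effort.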

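pipelinedb basic concepts

1. What is a stream?
The stream is the foundation; Continuous Views and transforms are the means of processing the data that flows through it.

For a given data set, define one stream and write the data once.

If several statistics over the same data can be computed in a single SQL statement (for example aggregates over the same dimension, or multi-dimensional computations that window functions can express), one Continuous View or transform is enough. If they cannot, define several Continuous Views or transforms.

If there are several data sources (for example, data that was already split into different tables at design time), define a separate stream for each.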

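2. What is a continuous view?

A continuous view is simply the definition of an analytic query. For example, select id, count(*), avg(x), ... from stream_1 group by ...; is a continuous view.

Once it is defined, every row inserted into the stream (stream_1) updates the view's statistics incrementally, and querying the view returns the current result in real time.

The database stores only the running result (increments are merged in memory and persisted incrementally).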
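
To make the idea of incremental merging concrete, here is a minimal Python sketch. It is not pipelinedb's implementation; GroupState and consume are hypothetical names. It only shows that count, sum, min, max and avg can be maintained per group without keeping the raw events.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Running aggregates for one group; the raw events are not retained."""
    count: int = 0
    total: int = 0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def update(self, val):
        self.count += 1
        self.total += val
        self.minimum = min(self.minimum, val)
        self.maximum = max(self.maximum, val)

    @property
    def avg(self):
        return self.total / self.count


def consume(view, events):
    """Fold a batch of (id, val) events into the per-id state held in `view`."""
    for key, val in events:
        view.setdefault(key, GroupState()).update(val)
    return view


view = {}
consume(view, [(1, 10), (2, 5), (1, 30)])
consume(view, [(1, 20)])  # a later batch only touches the stored state
```

Each batch changes only the small per-group state, which is why the stored result stays compact however many rows pass through the stream.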

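3. What are transforms?

Unlike a continuous view, a transform exists to trigger events, so it need not retain any data. It carries a condition, and an event fires whenever a record satisfies it.

For example, it can watch a sensor value and, when the value leaves the allowed range, raise an alert (say, by calling a REST endpoint on a given server) or record the alert (through a trigger function).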
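
The condition-then-event pattern can be sketched in a few lines of Python. This is an illustration only: make_transform, the 0..100 range and the event layout are assumptions, not part of pipelinedb.

```python
def make_transform(predicate, on_match):
    """Return a function that hands an event to `on_match` when `predicate` holds."""
    def process(event):
        if predicate(event):
            on_match(event)
    return process


alerts = []
# Hypothetical rule: a sensor reading outside 0..100 is out of range.
check = make_transform(lambda e: not 0 <= e["val"] <= 100, alerts.append)

for event in [{"id": 1, "val": 42}, {"id": 2, "val": 130}, {"id": 3, "val": -5}]:
    check(event)
```

Nothing is stored for events that pass the check; only the matching ones reach the handler, which here simply appends them to a list.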

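4. pipelinedb inherits PostgreSQL's extensibility and adds probabilistic data structures such as HLL. They are a pleasure to use, for example for a site's unique visitors (UV), the number of distinct vehicles passing a traffic light, or time-bucketed unique visitors within a given radius of a base station derived from mobile phone signals.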

Bloom Filter    
Count-Min Sketch    
Filtered-Space Saving Top-K    
HyperLogLog    
T-Digest
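
As a rough illustration of how such structures trade exactness for space, here is a minimal Bloom filter in Python. The sizes and the salted SHA-256 hashing are assumptions chosen for the sketch, not pipelinedb's implementation.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))


seen = BloomFilter()
for user_id in range(100):
    seen.add(user_id)
```

An item that was added is always reported as present; an item that was never added may occasionally be reported as present too, which is the price of the fixed 1024-bit footprint.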

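5. Sliding Windows

Data in many scenarios is time-sensitive or naturally tied to a time window, so pipelinedb provides sliding windows that let you define how long data stays relevant.

For example, compute statistics only over the most recent minute.

A heat map that shows activity over the last minute has no use for older data; defining it with a sliding window (SW) keeps less data, demands less of the machine, and runs faster.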
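
A sliding-window count can be sketched in Python as follows. SlidingCount is a hypothetical helper, timestamps are plain seconds, and events are assumed to arrive in time order.

```python
from collections import deque


class SlidingCount:
    """Count events whose timestamp lies within the last `window` seconds."""

    def __init__(self, window):
        self.window = window
        self.timestamps = deque()

    def add(self, ts):
        self.timestamps.append(ts)

    def count(self, now):
        # Drop events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        return len(self.timestamps)


sw = SlidingCount(window=60)
for ts in (0, 10, 50, 65):
    sw.add(ts)
```

Expired events are discarded as soon as the window moves past them, so memory use is bounded by the event rate times the window length rather than by the total history.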

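6. Continuous views support JOIN. This matters enough to say it three times: JOIN, JOIN, JOIN.

Stream JOIN stream (planned for a future release; for now it can be done indirectly through a transform)

Stream JOIN table (supported)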
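
Conceptually, a stream-table join enriches each arriving event with a row looked up in an ordinary table. A minimal Python sketch, using a hypothetical sensors lookup table and an inner join:

```python
# Hypothetical lookup table: sensor id -> road name.
sensors = {1: "Road A", 2: "Road B"}


def join_stream(events, table):
    """Inner-join each (sensor_id, value) event with the lookup table."""
    for sensor_id, value in events:
        if sensor_id in table:  # events without a match are dropped
            yield sensor_id, table[sensor_id], value


joined = list(join_stream([(1, 30), (3, 12), (2, 8)], sensors))
```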

For details on pipelineDB, see

http://docs.pipelinedb.com/

To learn more about PostgreSQL, see

"PostgreSQL: Past and Present" (《PostgreSQL 前世今生》)

pipelinedb can be downloaded from GitHub.

https://github.com/pipelinedb/pipelinedb/releases

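Where pipelinedb fits

pipelinedb suits any scenario that calls for stream processing, for example:

1. Traffic

Stream-process the data reported by traffic sensors (road sensors, traffic lights) to reflect conditions such as traffic flow in real time (done in a continuous view), and respond dynamically to events such as accidents (done in a transform).

2. Hydrological monitoring

Stream-monitor sensor data for changes in water quality, and respond dynamically to events such as pollution (done in a transform).

3. Connected vehicles

Combined with PostGIS, track vehicle positions in real time, merge trajectories, and draw dashboard data dynamically (by time slice, by regional distribution of vehicles).

4. Logistics tracking

Track each parcel through every stage and aggregate the results, so a query no longer has to pick many scattered rows out of a large data set (fewer random scans).

5. Real-time financial data processing

For example, a user sets a price at which a stock should be bought or sold; the transform's event mechanism can execute the trade quickly.

Another example is running mathematical models over stock indicators in real time, emitting the results as they are computed, and plotting market dashboards.

6. Criminal investigation

For example, when the licence plate of a suspect vehicle is known, a transform rule applied to the plates that surveillance cameras capture and upload in real time fires an event as soon as the plate appears, revealing the vehicle's whereabouts quickly.

7. Real-time analysis of app tracking (feed) data

Many apps embed tracking points to follow user behaviour or business logic. With heavy traffic the data volume can be enormous; before stream processing, it would typically be collected into a large data warehouse for offline analysis.

Offline analysis is sometimes not enough, though. Making dynamic recommendations or running marketing based on a user's live behaviour, or on overall live behaviour, calls for real-time stream processing.

8. Network-layer traffic analysis

For example, traffic analysis for office networks, carrier gateways, or particular servers.

Many more scenarios are waiting to be discovered.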

Some examples from the pipelinedb documentation

Real-time daily unique visitors (UV) per referrer
CREATE CONTINUOUS VIEW uniques AS
SELECT date_trunc('day', arrival_timestamp) AS day,
  referrer, COUNT(DISTINCT user_id)
FROM users_stream GROUP BY day, referrer;

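Real-time monitoring of the linear relationship (regression slope and intercept) between two columns, such as humidity and temperature, the market index and Kweichow Moutai, traffic at junction A and junction B, or a mall's footfall and its sales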
CREATE CONTINUOUS VIEW lreg AS
SELECT date_trunc('minute', arrival_timestamp) AS minute,
  regr_slope(y, x) AS mx,
  regr_intercept(y, x) AS b
FROM datapoints_stream GROUP BY minute;
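
For reference, the slope and intercept of a least-squares fit of y on x can be derived from five running sums, which is what makes them suitable for incremental computation. A small Python sketch (linear_fit is a hypothetical helper, not a pipelinedb function):

```python
def linear_fit(points):
    """Least-squares slope and intercept of y on x, computed from running sums."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in points:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Points on the line y = 2x + 1
mx, b = linear_fit([(0, 1), (1, 3), (2, 5), (3, 7)])
```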

Count over the last 5 minutes
CREATE CONTINUOUS VIEW imps AS
SELECT COUNT(*) FROM imps_stream
WHERE (arrival_timestamp > clock_timestamp() - interval '5 minutes');

Website access quality: the 90th, 95th and 99th percentile of user latency
CREATE CONTINUOUS VIEW latency AS
SELECT percentile_cont(array[0.90, 0.95, 0.99]) WITHIN GROUP (ORDER BY latency)
FROM latency_stream;
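
In PostgreSQL, percentile_cont takes fractions between 0 and 1 and interpolates linearly between neighbouring values. The Python sketch below shows the exact computation over a full list; a streaming system cannot keep every value and would have to rely on an approximation such as T-Digest instead.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation; fraction is in [0, 1]."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    pos = fraction * (len(ordered) - 1)  # fractional row index
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


latencies = [10, 20, 30, 40]
median = percentile_cont(latencies, 0.5)
```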

How many sensors (vehicles, for instance) are within 1,000 metres of San Francisco
-- PipelineDB ships natively with geospatial support
CREATE CONTINUOUS VIEW sf_proximity_count AS
SELECT COUNT(DISTINCT sensor_id)
FROM geo_stream WHERE ST_DWithin(

  -- Approximate SF coordinates; WKT order is POINT(longitude latitude)
  ST_GeographyFromText('SRID=4326;POINT(-122 37)'), sensor_coords, 1000);
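
For geography values, the distance passed to ST_DWithin is in metres. As a rough cross-check outside the database, the same kind of test can be sketched in Python with the haversine formula on a spherical Earth (an approximation; PostGIS geography calculations use a spheroid by default). The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within(lon1, lat1, lon2, lat2, metres):
    """True when the two points are no more than `metres` apart."""
    return haversine_m(lon1, lat1, lon2, lat2) <= metres
```

One degree of latitude is roughly 111 km, so a point 0.005 degrees north of (-122, 37) is within 1,000 metres, while a point one degree north is not.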

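Strengths of pipelinedb

In an age where pedigree counts, pipelinedb has a formidable parent in PostgreSQL. PostgreSQL came out of UC Berkeley, rests on solid theory, and after 43 years of evolution remains among the leaders in features, performance, extensibility and theoretical grounding.

In stream computing, computation is the soul, so the algorithms and supported functions matter a great deal.

PostgreSQL's strength lies in its extremely rich set of statistical dimensions and data types.

Built-in data types:

https://www.postgresql.org/docs/9.6/static/datatype.html

Built-in aggregate, window and mathematical functions:

https://www.postgresql.org/docs/9.6/static/functions.html

Extensions are supported as well; common examples include PostGIS, wavelet, and types for genomics, chemistry and graphs,

and many more (why so many? PostgreSQL's BSD-like licence has let its ecosystem grow very large, reaching into all kinds of industries).

Nearly anything you can think of, and plenty you cannot, can be stream-processed in pipelinedb, which greatly improves development efficiency.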

Deploying pipelinedb quickly

Recommended OS setup

"PostgreSQL on Linux Best-Practice Deployment Guide" (《PostgreSQL on Linux 最佳部署手冊》)

Dependencies

Install zeromq

http://zeromq.org/intro:get-the-software

wget https://github.com/zeromq/libzmq/releases/download/v4.2.0/zeromq-4.2.0.tar.gz    

tar -zxvf zeromq-4.2.0.tar.gz    

cd zeromq-4.2.0    

./configure    
make    
make install    


vi /etc/ld.so.conf    
/usr/local/lib    

ldconfig

RHEL 6 needs a newer libcheck

Remove the old version of check

yum remove check

Install check

http://check.sourceforge.net/

https://libcheck.github.io/check/web/install.html#linuxsource

https://github.com/libcheck/check/releases

wget -O check-0.10.0.tar.gz "http://downloads.sourceforge.net/project/check/check/0.10.0/check-0.10.0.tar.gz?r=&ts=1482216800&use_mirror=ncu"    

tar -zxvf check-0.10.0.tar.gz    

cd check-0.10.0    

./configure    
make     
make install

Download pipelinedb

wget https://github.com/pipelinedb/pipelinedb/archive/0.9.6.tar.gz    

tar -zxvf 0.9.6.tar.gz    

cd pipelinedb-0.9.6

On RHEL 6 or CentOS 6, pipelinedb has a few bugs that need fixing before it will build

On RHEL 6, the unit tests need an explicit include of check.h

vi src/test/unit/test_hll.c     
vi src/test/unit/test_tdigest.c     
vi src/test/unit/test_bloom.c     
vi src/test/unit/test_cmsketch.c     
vi src/test/unit/test_fss.c     

Add this line to each of the files above:    
#include "check.h"

On RHEL 6, correct the wrong libzmq.a path

vi src/Makefile.global.in    

LIBS := -lpthread /usr/local/lib/libzmq.a -lstdc++ $(LIBS)

Fix the test_decoding error

cd contrib/test_decoding    

mv specs test    

cd ../../

Build pipelinedb

export C_INCLUDE_PATH=/usr/local/include:$C_INCLUDE_PATH    
export LIBRARY_PATH=/usr/local/lib:$LIBRARY_PATH    

export USE_NAMED_POSIX_SEMAPHORES=1    

LIBS=-lpthread CC="/home/digoal/gcc6.2.0/bin/gcc" CFLAGS="-O3 -flto" ./configure --prefix=/home/digoal/pgsql_pipe    

make world -j 32    

make install-world

Initializing the cluster

Set up the environment variables

vi env_pipe.sh     

export PS1="$USER@`/bin/hostname -s`-> "    
export PGPORT=$1    
export PGDATA=/$2/digoal/pg_root$PGPORT    
export LANG=en_US.utf8    
export PGHOME=/home/digoal/pgsql_pipe    
export LD_LIBRARY_PATH=/home/digoal/gcc6.2.0/lib:/home/digoal/gcc6.2.0/lib64:/home/digoal/python2.7.12/lib:$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH    
export PATH=/home/digoal/cmake3.6.3/bin:/home/digoal/gcc6.2.0/bin:/home/digoal/python2.7.12/bin:/home/digoal/cmake3.6.3/bin:$PGHOME/bin:$PATH:.    
export DATE=`date +"%Y%m%d%H%M"`    
export MANPATH=$PGHOME/share/man:$MANPATH    
export PGHOST=$PGDATA    
export PGUSER=postgres    
export PGDATABASE=pipeline    
alias rm='rm -i'    
alias ll='ls -lh'    
unalias vi

Assume the port is 1922 and the data directory lives under /u01

. ./env_pipe.sh 1922 u01

Initialize the data directory

pipeline-init -D $PGDATA -U postgres -E SQL_ASCII --locale=C

Edit the configuration

cd $PGDATA    

vi pipelinedb.conf    

listen_addresses = '0.0.0.0'    
port = 1922      
max_connections = 2000    
superuser_reserved_connections = 13    
unix_socket_directories = '.'    
shared_buffers = 64GB    
maintenance_work_mem = 1GB    
dynamic_shared_memory_type = posix    
vacuum_cost_delay = 0    
bgwriter_delay = 10ms    
bgwriter_lru_maxpages = 1000    
bgwriter_lru_multiplier = 5.0    
synchronous_commit = off    
full_page_writes = off    
checkpoint_timeout = 35min    
checkpoint_completion_target = 0.1    
random_page_cost = 1.0    
effective_cache_size = 400GB    
log_destination = 'csvlog'    
logging_collector = on    
log_truncate_on_rotation = on    
log_checkpoints = on    
log_connections = on    
log_disconnections = on    
log_error_verbosity = verbose       
log_timezone = 'PRC'    
autovacuum = on    
log_autovacuum_min_duration = 0    
datestyle = 'iso, mdy'    
timezone = 'PRC'    
lc_messages = 'C'    
lc_monetary = 'C'    
lc_numeric = 'C'    
lc_time = 'C'    
default_text_search_config = 'pg_catalog.english'    
continuous_query_combiner_synchronous_commit = off    
continuous_query_combiner_work_mem = 2GB    
continuous_view_fillfactor = 50    
continuous_query_max_wait = 10    
continuous_query_commit_interval = 500    
continuous_query_batch_size = 500000    
continuous_query_num_combiners = 12    
continuous_query_num_workers = 8

Configuration options added by pipelinedb

#------------------------------------------------------------------------------    
# PIPELINEDB OPTIONS    
#------------------------------------------------------------------------------    

# synchronization level for combiner commits; off, local, remote_write, or on    
continuous_query_combiner_synchronous_commit = off    

# maximum amount of memory to use for combiner query executions    
continuous_query_combiner_work_mem = 512MB    

# the default fillfactor to use for continuous views    
continuous_view_fillfactor = 50    

# the time in milliseconds a continuous query process will wait for a batch    
# to accumulate    
continuous_query_max_wait = 10    

# time in milliseconds after which a combiner process will commit state to    
# disk    
continuous_query_commit_interval = 50    

# the maximum number of events to accumulate before executing a continuous query    
# plan on them    
continuous_query_batch_size = 50000    

# the number of parallel continuous query combiner processes to use for    
# each database    
continuous_query_num_combiners = 2    

# the number of parallel continuous query worker processes to use for    
# each database    
continuous_query_num_workers = 2    

# allow direct changes to be made to materialization tables?    
#continuous_query_materialization_table_updatable = off    

# synchronization level for stream inserts    
#stream_insert_level = sync_read    

# continuous views that should be affected when writing to streams.    
# it is string with comma separated values for continuous view names.    
#stream_targets = ''    

# the default step factor for sliding window continuous queries (as a percentage    
# of the total window size)    
#sliding_window_step_factor = 5    

# allow continuous queries?    
#continuous_queries_enabled = on    

# allow anonymous statistics collection and version checks?    
#anonymous_update_checks = on

Start pipelinedb

pipeline-ctl start

Connecting

Connect to pipelinedb exactly as you would to PostgreSQL; the two are fully compatible.

psql    
psql (9.5.3)    
Type "help" for help.    

pipeline=# \dt    
No relations found.    
pipeline=# \l    
                             List of databases    
   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges       
-----------+----------+-----------+---------+-------+-----------------------    
 pipeline  | postgres | SQL_ASCII | C       | C     |     
 template0 | postgres | SQL_ASCII | C       | C     | =c/postgres          +    
           |          |           |         |       | postgres=CTc/postgres    
 template1 | postgres | SQL_ASCII | C       | C     | =c/postgres          +    
           |          |           |         |       | postgres=CTc/postgres    
(3 rows)    

pipeline=#

Testing

Create the stream

id is the key and val holds the value; statistics are aggregated by id

CREATE STREAM s1 (id int, val int);

Create the continuous view

The view computes the common aggregates count, avg, min, max and sum

CREATE CONTINUOUS VIEW cv1 AS    
SELECT id,count(*),avg(val),min(val),max(val),sum(val)    
FROM s1 GROUP BY id;

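Beyond the built-in aggregates, PostgreSQL extensions are available too, commonly PostGIS, wavelet, and types for genomics, chemistry, graphs and so on.

Nearly anything you can think of, and plenty you cannot, can be stream-processed in pipelinedb, which greatly improves development efficiency.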

Activate continuous queries

activate ;

Insert stress test

1 million random groups; the inserted values are random numbers up to 5 million

vi test.sql    

\setrandom id 1 1000000    
\setrandom val 1 5000000    
insert into s1(id,val) values (:id, :val);

Run the stress test with 1,000 connections; throughput is about 240,000 rows per second

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 1000 -j 1000 -T 100    

...    
progress: 2.0 s, 243282.2 tps, lat 4.116 ms stddev 5.182    
progress: 3.0 s, 237077.6 tps, lat 4.211 ms stddev 5.794    
progress: 4.0 s, 252376.8 tps, lat 3.967 ms stddev 4.998    
...

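If the host has many disks and a strong CPU, two or more pipelinedb instances can be deployed on it to split the load.

For example, with two pipelinedb instances on a 32-core machine I reached 290,000 rows per second, which is 25.056 billion rows per day.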
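
The daily figure follows directly from the per-second rate, assuming the rate is sustained for 24 hours:

```python
SECONDS_PER_DAY = 24 * 60 * 60

rate_per_second = 290_000  # combined rate of the two instances, from the text
per_day = rate_per_second * SECONDS_PER_DAY
print(f"{per_day:,} rows/day")  # 25,056,000,000, i.e. about 25.056 billion
```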

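My colleagues were stunned.

By my estimate, handling 25.056 billion rows per day with the JStorm framework would take tens or even a hundred times the hardware that pipelinedb needs.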

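Deploying pipelinedb as a cluster

pipelinedb performs very well (about 25.056 billion rows per day on the 32-core machine tested above), but a single machine always hits a ceiling eventually, so a clustered deployment is still worth planning for.

For writes, haproxy is enough to distribute the load when no particular sharding rule is needed. When a sharding rule is required, use plproxy.

For aggregating query results, use plproxy; it is simple and only takes a dynamic function.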
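
The two halves of this setup, routing writes by key and merging per-instance results at query time, can be sketched in Python. This is only an illustration of the idea: shard_for and merge_partials are hypothetical names, and a real deployment would express the routing and the merge as plproxy functions.

```python
import zlib


def shard_for(key, num_shards):
    """Pick a shard by hashing the key (a stand-in for a sharding rule)."""
    return zlib.crc32(str(key).encode()) % num_shards


def merge_partials(partials):
    """Combine per-shard (count, sum) pairs per key into a global count and average."""
    merged = {}
    for shard_result in partials:
        for key, (cnt, total) in shard_result.items():
            c, t = merged.get(key, (0, 0))
            merged[key] = (c + cnt, t + total)
    return {k: (c, t / c) for k, (c, t) in merged.items()}


# Hypothetical partial results returned by two instances.
result = merge_partials([{1: (2, 30)}, {1: (1, 30), 2: (4, 8)}])
```

Averages cannot be merged directly, which is why each instance returns a count and a sum. When writes are spread by haproxy without a sharding rule, the same id can land on several instances, so the query side has to merge in this way.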

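Articles on plproxy

"Designing a Distributed PostgreSQL Database with Plproxy" (《使用Plproxy設計PostgreSQL分佈式數據庫》)

《A Smart PostgreSQL extension plproxy 2.2 practices》

"PostgreSQL Best Practices: Horizontal Sharding with plproxy" (《PostgreSQL 最佳實踐 - 水平分庫(基於plproxy)》)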

Structure of the pipelinedb documentation

http://docs.pipelinedb.com/

The table of contents gives a quick picture of what pipelinedb can do, what it integrates with, and which kinds of problems it handles.

1. Introduction

What PipelineDB is    
What PipelineDB is not

2. Continuous Views

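Defining a continuous view means defining an analytic query, for example select id, count(*), avg(x), ... from stream group by ...;

Once it is defined, every row inserted into the stream updates the view's statistics incrementally, and querying the view returns the current result in real time.

The database stores only the running result (increments are merged in memory and persisted incrementally).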

CREATE CONTINUOUS VIEW    
DROP CONTINUOUS VIEW    
TRUNCATE CONTINUOUS VIEW    
Viewing Continuous Views    
Data Retrieval    
Time-to-Live (TTL) Expiration    
Activation and Deactivation    
Examples

3. Continuous Transforms

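Unlike a continuous view, a transform exists to trigger events, so it need not retain any data. It carries a condition, and an event fires whenever a record satisfies it.

For example, it can watch a sensor value and, when the value leaves the allowed range, raise an alert (say, by calling a REST endpoint on a given server) or record the alert (through a trigger function).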

CREATE CONTINUOUS TRANSFORM    
DROP CONTINUOUS TRANSFORM    
Viewing Continuous Transforms    
Built-in Transform Triggers    
Creating Your Own Trigger

4. Streams

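Continuous views and transforms are both built on streams, so the stream is the foundation.

First define a stream and write data into it; continuous views or transforms then process the flowing data in real time.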

Writing To Streams    
Output Streams    
stream_targets    
Arrival Ordering    
Event Expiration

5. Built-in Functionality

Built-in functions

General    
Aggregates    
PipelineDB-specific Types    
PipelineDB-specific Functions    
Miscellaneous Functions

6. Continuous Aggregates

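An introduction to aggregates. Stream processing generally falls into the two kinds described earlier:

Continuous views (usually real-time aggregate results), for example plotting per-minute traffic counts at a traffic light, or per-minute charts of live stock data.

Transforms (the event mechanism), for example monitoring water quality and, when a sensor value leaves its range, writing a log entry and raising an alert (sent to a server) at the same time.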

PipelineDB-specific Aggregates    
Combine    
CREATE AGGREGATE    
General Aggregates    
Statistical Aggregates    
Ordered-set Aggregates    
Hypothetical-set Aggregates    
Unsupported Aggregates

7. Clients

Usage from a few common clients. In practice, anything that can talk to PostgreSQL can talk to pipelinedb, because the wire protocol is the same.

Python    
Ruby    
Java

8. Probabilistic Data Structures & Algorithms

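Probabilistic data structures such as HLL. They are a pleasure to use, for example for a site's unique visitors (UV), the number of distinct vehicles passing a traffic light, or time-bucketed unique visitors within a given radius of a base station derived from mobile phone signals.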

Bloom Filter    
Count-Min Sketch    
Filtered-Space Saving Top-K    
HyperLogLog    
T-Digest

9. Sliding Windows

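Data in many scenarios is time-sensitive or naturally tied to a time window, so pipelinedb provides sliding windows that let you define how long data stays relevant.

For example, compute statistics only over the most recent minute.

A heat map that shows activity over the last minute has no use for older data; defining it with a sliding window (SW) keeps less data, demands less of the machine, and runs faster.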

Examples    
Sliding Aggregates    
Temporal Invalidation    
Multiple Windows    
step_factor

10. Continuous JOINs

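Continuous views support JOIN. This matters enough to say it three times: JOIN, JOIN, JOIN.

Stream JOIN stream (planned for a future release; for now it can be done indirectly through a transform)

Stream JOIN table (supported)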

Stream-table JOINs    
Supported Join Types    
Examples    
Stream-stream JOINs

11. Integrations

pipelinedb inherits PostgreSQL's extensibility, so supporting Kafka and AWS Kinesis comes easily and opens up still more scenarios.

https://aws.amazon.com/cn/kinesis/streams/

Apache Kafka    
Amazon Kinesis

12. Statistics

Statistics, which are a great help to DBAs

pipeline_proc_stats    
pipeline_query_stats    
pipeline_stream_stats    
pipeline_stats

13. Configuration

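Connecting to pipelinedb from a program

	The method is the same as connecting to PostgreSQL
	Java needs the corresponding JDBC driver jar
	For Python, psycopg2 is a convenient choice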

Python

import random

import psycopg2

# 'ip' is a placeholder for the server address
conn = psycopg2.connect("dbname='pipeline' user='pipe' password='pipeline' host='ip' port=5432")
pipeline = conn.cursor()
# The stream page_views must already exist; otherwise execute a
# CREATE STREAM statement before creating the view
query = """
CREATE CONTINUOUS VIEW v AS
SELECT
    url::text,
    count(*) AS total_count,
    count(DISTINCT cookie::text) AS uniques,
    percentile_cont(0.99) WITHIN GROUP (ORDER BY latency::integer) AS p99_latency
FROM page_views GROUP BY url
"""
pipeline.execute(query)
conn.commit()
# Now let's simulate some page views
for n in range(10000):
    # 10 unique urls
    url = '/some/url/%d' % (n % 10)
    # 1000 unique cookies
    cookie = '%032d' % (n % 1000)
    # latency uniformly distributed between 1 and 100
    latency = random.randint(1, 100)
    # Bound parameters let the driver handle quoting
    pipeline.execute(
        "INSERT INTO page_views (url, cookie, latency) VALUES (%s, %s, %s)",
        (url, cookie, latency))
# The output of a continuous view can be queried like any other table or view
pipeline.execute('SELECT * FROM v ORDER BY url')
rows = pipeline.fetchall()
for row in rows:
    print(row)
pipeline.execute('DROP CONTINUOUS VIEW v')
conn.commit()

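import psycopg2

# Same connection settings as in the previous example; 'ip' is a placeholder
conn = psycopg2.connect("dbname='pipeline' user='pipe' password='pipeline' host='ip' port=5432")
pipeline = conn.cursor()

namedict = {}
aa = 'aa'
bb = 'bb'
cc = 'cc'
# The if/else branches only generate sample values and can be ignored
for n in range(100000):
    rows1 = {}
    x = n % 10
    rows1[aa] = 100000 - n
    rows1[bb] = x
    if 0 < n < 500:
        rows1[cc] = n * n - 200
    elif n < 10000:
        rows1[cc] = n * 20 - 500
    else:
        rows1[cc] = n * 3 - 200
    namedict[n] = rows1

for i in range(10):
    print(namedict[i])
    print(namedict[i]['aa'], namedict[i]['bb'], namedict[i]['cc'])

# executemany() takes a sequence of parameter mappings
tup = tuple(namedict.values())

pipeline.executemany("""INSERT INTO test_stream (key,value,x)
    VALUES (%(aa)s,%(bb)s,%(cc)s)""", tup)
# test_view is a placeholder name for a continuous view that was defined
# over test_stream beforehand (its definition is not shown here)
pipeline.execute('SELECT * FROM test_view')
rows = pipeline.fetchall()
for row in rows:
    print(row)
pipeline.execute('DROP CONTINUOUS VIEW test_view')
conn.commit()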

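Using executemany()

  The example is meant as practice with executemany(); creating the stream and the view is not covered again
  The first for loop builds inner dictionaries keyed by aa, bb and cc, so namedict has a shape like:
  {1:{'aa':10,'bb':50,'cc':100},2:{'aa':10,'bb':50,'cc':100},3:{'aa':10,'bb':50,'cc':100}}
  and tup has a shape like: ({'aa':10,'bb':50,'cc':100},{'aa':10,'bb':50,'cc':100},{'aa':10,'bb':50,'cc':100})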

Java

  The JDBC driver jar must be added to the classpath first

import java.util.Properties;
import java.sql.*;

public class example {
	static final String HOST = "ip**";
	static final String DATABASE = "pipeline";
	static final String USER = "pipe";
	static final String PASSWORD = "pipeline";

	public static void main(String[] args) throws SQLException {
		String url = "jdbc:postgresql://" + HOST + ":5432/" + DATABASE;
		ResultSet rs;
		// Properties props = new Properties();
		// props.setProperty("user", USER);
		Connection conn = DriverManager.getConnection(url, USER, PASSWORD);
		Statement stmt = conn.createStatement();
		// A Java string literal cannot span lines, so the SQL is concatenated
		stmt.executeUpdate("CREATE CONTINUOUS VIEW v AS SELECT x::integer, "
			+ "COUNT(*) FROM stream GROUP BY x");
		// Queue the inserts and send them to the server as one batch
		for (int i = 0; i < 100000; i++) {
			int x = i % 10;
			stmt.addBatch("INSERT INTO stream (x) VALUES (" + Integer.toString(x) + ")");
		}
		stmt.executeBatch();
		rs = stmt.executeQuery("SELECT * FROM v");
		while (rs.next()) {
			int id = rs.getInt("x");
			int count = rs.getInt("count");
			System.out.println(id + " = " + count);
		}
		// stmt.executeUpdate("DROP CONTINUOUS VIEW v");
		conn.close();
	}
}

References

https://yq.aliyun.com/articles/166

https://blog.csdn.net/ercengsha/article/details/82775680