Building and Deploying a Real-Time Search Engine on the Kafka Stack in Practice

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At Koverhoop, we are building several large-scale projects in insurance, healthcare, real estate, and offline analytics. For one of them, our "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"multi-tenant group insurance brokerage platform"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "},{"type":"link","attrs":{"href":"https:\/\/klient.ca\/","title":null,"type":null},"content":[{"type":"text","text":"klient.ca"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", we set out to build a "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"powerful search feature"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" that shows results while the user is still typing. The animation below shows what we achieved. In this post I discuss the core infrastructure behind the feature, including how its deployment is fully automated and how it can be built quickly."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fb\/41\/fb25631ee6e143bf5593239650230241.gif","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"GIF by the author: the search capability"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This series has "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"two parts"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", covering the following:"}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Part 1"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": Understand the technology stack that powers this search capability, and deploy it with Docker and Docker Compose (this post)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Part 2"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":": A scalable production deployment of these services on Kubernetes (to be published)"}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Problem Definition and Decisions"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To build a fast, real-time search engine, we had to make certain design decisions. We use "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Postgres"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" as our primary database, which left us with the following options:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":"Query the Postgres database directly for every character typed into the search bar. 😐"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":"Use an efficient search database such as Elasticsearch. 🤔"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Given that we are already a "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"multi-tenant application"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", that the entities being searched could require many "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"join operations"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (if we used Postgres), and that the expected scale is considerable, we decided against the first option of querying the database directly."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We therefore had to decide on a reliable, efficient way to move data from Postgres to Elasticsearch in "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"real time"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". That required a second decision:"}]},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":"Use "},{"type":"link","attrs":{"href":"https:\/\/www.elastic.co\/logstash","title":null,"type":null},"content":[{"type":"text","text":"Logstash"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":" to query the Postgres database periodically and send the data to Elasticsearch. 😶"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":"Use the "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}},{"type":"strong"}],"text":"Elasticsearch client"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":" in our application and perform CRUD operations on the data in Postgres and Elasticsearch at the same time. 🧐"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":"Use an "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}},{"type":"strong"}],"text":"event-based streaming engine"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":" that captures events from the Postgres "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}},{"type":"strong"}],"text":"write-ahead log"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#292929","name":"user"}}],"text":", streams them to a stream-processing server, and sinks them into Elasticsearch. 🤯"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Option 1 was ruled out quickly because it is not real time; even if we queried at short intervals, it would put "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"noticeable load"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" on the Postgres server. The choice between the other two could go either way for different companies. In our case, we could foresee problems with option 2: if Elasticsearch were "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"slow"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" to acknowledge an update, it could slow down our application; and in case of an "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"inconsistency"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", how would we retry the insertion of a single event or a group of events?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"So we decided to build an infrastructure based on an event queue, also because we had already planned future use cases and services that suit an event-based design, such as "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"a notification service, a data warehouse, and a microservice architecture"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Without further ado, let's jump into an overview of the solution and the services it uses."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Service Overview"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To implement the event-based streaming infrastructure, we decided to use the Confluent Kafka stack."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"These are the services we put together:"}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/33\/3c\/3371212b89a287e8903b080554f4f93c.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Source: "},{"type":"link","attrs":{"href":"https:\/\/confluent.io\/","title":null,"type":null},"content":[{"type":"text","text":"Confluent"}],"marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Inc."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Kafka:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Kafka is the core of the Confluent Platform. It is an open-source distributed event-streaming platform, and it serves as the primary store for our database events (inserts, updates, and deletes)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Kafka Connect:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" We use Kafka Connect to ingest data into Kafka through the "},{"type":"link","attrs":{"href":"https:\/\/debezium.io\/documentation\/reference\/connectors\/postgresql.html","title":null,"type":null},"content":[{"type":"text","text":"Debezium"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Postgres connector, which reads events from the Postgres "},{"type":"link","attrs":{"href":"https:\/\/www.postgresql.org\/docs\/9.0\/wal-intro.html","title":null,"type":null},"content":[{"type":"text","text":"WAL"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" files."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"On the sink side, we use the Elasticsearch connector to process the data and load it into Elasticsearch. Connect can run either as a standalone application or as a fault-tolerant, scalable service for production."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"ksqlDB:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" ksqlDB lets you build stream-processing applications on top of Kafka. It uses Kafka Streams internally and transforms events as they arrive. We use it to enrich the events of a given stream with events from other tables that are already persisted in Kafka and relevant to search, for example the "},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" from the root table."}]},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/ea\/b3\/ea82aff73ff4f774672b55f90f58c7b3.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Image by the author: ksqlDB on top of Apache Kafka"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"With ksqlDB, it is enough to write "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"SQL"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" queries to "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"filter, aggregate, join, and enrich"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" data. For example, suppose we receive event streams on two topics, carrying information about "},{"type":"codeinline","content":[{"type":"text","text":"brands"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" and "},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". Because this is a multi-tenant data source, we need the "},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" to enrich "},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", but "},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" is currently associated only with "},{"type":"codeinline","content":[{"type":"text","text":"brands"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":". We can then take these enriched records and store them in Elasticsearch in denormalized form (to make search work)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We can set up a "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KStream"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" from a topic:"}]},
{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"CREATE STREAM \"brands\"\nWITH (\n  kafka_topic = 'store.public.brands',\n  value_format = 'avro'\n);"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To use only a few of the columns and partition the stream by "},{"type":"codeinline","content":[{"type":"text","text":"id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", we can create a new stream called "},{"type":"codeinline","content":[{"type":"text","text":"enriched_brands"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":":"}]},
{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"CREATE STREAM \"enriched_brands\"\nWITH (\n  kafka_topic = 'enriched_brands'\n)\nAS\n  SELECT\n    CAST(brand.id AS VARCHAR) as \"id\",\n    brand.tenant_id as \"tenant_id\",\n    brand.name as \"name\"\n  FROM\n    \"brands\" brand\n  PARTITION BY\n    CAST(brand.id AS VARCHAR)\n  EMIT 
CHANGES;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We can then materialize the events into a "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KTable"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", keeping the value at the "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"latest offset"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" for each key. We use this so that the current state of "},{"type":"codeinline","content":[{"type":"text","text":"brand"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" events can be joined with other streams."}]},
{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"CREATE TABLE \"brands_table\"\nAS\n  SELECT\n    id as \"id\",\n    latest_by_offset(tenant_id) as \"tenant_id\"\n  FROM\n    \"brands\"\n  GROUP BY id\n  EMIT CHANGES;"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Now we add a new stream for "},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", which has a "},{"type":"codeinline","content":[{"type":"text","text":"brand_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" field but no "},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" field."}]},
{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"CREATE STREAM \"brand_products\"\nWITH (\n  kafka_topic = 'store.public.brand_products',\n  value_format = 'avro'\n);"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We can enrich "},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" with the "},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" using the following join query:"}]},
{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"CREATE STREAM \"enriched_brand_products\"\nWITH (\n  kafka_topic = 'enriched_brand_products'\n) AS\n  SELECT\n    \"brand\".\"id\" as \"brand_id\",\n    \"brand\".\"tenant_id\" as \"tenant_id\",\n    CAST(brand_product.id AS VARCHAR) as \"id\",\n    brand_product.name AS \"name\"\n  FROM\n    \"brand_products\" AS brand_product\n  INNER JOIN \"brands_table\" \"brand\"\n  ON\n    brand_product.brand_id = \"brand\".\"id\"\n  PARTITION BY\n    CAST(brand_product.id AS VARCHAR)\n  EMIT CHANGES;"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Schema 
Registry:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" It is a layer on top of Kafka that stores metadata for the events you ingest into Kafka. It is based on Avro schemas and provides a REST interface for storing and retrieving them. It helps enforce schema compatibility checks and manage how schemas evolve over time."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Configuring the Stack"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We use Docker and Docker Compose to configure and deploy the services. Below are the services from the docker-compose file we wrote, which runs Postgres, Elasticsearch, and the Kafka-related services. I explain each service as we go."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Postgres and Elasticsearch"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"postgres:\n build: services\/postgres\n container_name: oeso_postgres\n volumes:\n   - database:\/var\/lib\/postgresql\/data\n env_file:\n   - .env\n ports:\n   - \"5432:5432\"\n networks:\n   - project_network"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Docker Compose service for Postgres"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null}},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"elasticsearch:\n image: docker.elastic.co\/elasticsearch\/elasticsearch:7.10.0\n container_name: elasticsearch\n volumes:\n   - .\/services\/elasticsearch\/config\/elasticsearch.yml:\/usr\/share\/elasticsearch\/config\/elasticsearch.yml:ro\n   - elasticsearch-database:\/usr\/share\/elasticsearch\/data\n env_file:\n   - .env\n ports:\n   - \"9200:9200\"\n   - \"9300:9300\"\n networks:\n   - 
project_network"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Docker Compose service for Elasticsearch"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To stream events out of the source database, we need to enable logical decoding so that changes can be replicated from its log. In Postgres this log is called the "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Write-Ahead Log (WAL)"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", and it is written to files. We also need a logical decoding plugin, in our case wal2json, to extract the persisted database changes in an easy-to-read form so that they can be emitted as events to Kafka."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To set up the required extension, you can refer to this Postgres "},{"type":"link","attrs":{"href":"https:\/\/github.com\/behindthescenes-group\/oesophagus\/blob\/master\/services\/postgres\/Dockerfile","title":null,"type":null},"content":[{"type":"text","text":"Dockerfile"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For both Elasticsearch and Postgres, we specify the variables they need for setup, such as usernames and passwords, in the environment file."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Zookeeper"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"zookeeper:\n image: confluentinc\/cp-zookeeper:6.0.0\n hostname: zookeeper\n container_name: zookeeper\n ports:\n   - \"2181:2181\"\n environment:\n   ZOOKEEPER_CLIENT_PORT: 2181\n   ZOOKEEPER_TICK_TIME: 2000\n networks:\n   - project_network"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In general, Zookeeper acts as a centralized service for distributed platforms such as Kafka: it stores all the metadata, such as the status of Kafka nodes, and keeps track of topics and partitions."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Although there is a plan to "},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum","title":null,"type":null},"content":[{"type":"text","text":"run Kafka without Zookeeper"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", Zookeeper is still required to manage the cluster in the version used here (Confluent Platform 6.0.0)."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Kafka Broker"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"broker:\n image: 
confluentinc\/cp-enterprise-kafka:6.0.0\n hostname: broker\n container_name: broker\n depends_on:\n   - zookeeper\n ports:\n   - \"29092:29092\"\n environment:\n   KAFKA_BROKER_ID: 1\n   KAFKA_ZOOKEEPER_CONNECT: \"zookeeper:2181\"\n   KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT\n   KAFKA_ADVERTISED_LISTENERS: PLAINTEXT:\/\/broker:9092,PLAINTEXT_HOST:\/\/localhost:29092\n   KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1\n   KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0\n   KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1\n   KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1\n networks:\n   - project_network"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For simplicity, we set up a single-node Kafka cluster. I will cover multi-node clusters in more detail in Part 2 of this series."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"It is particularly important to understand some of the configuration we applied to the Kafka broker."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Listeners"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Because Kafka is designed as a distributed platform, we need a well-defined way for Kafka brokers to communicate with each other internally and with clients externally, depending on your network layout. Listeners serve this purpose; a listener is a combination of host, port, and protocol."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KAFKA_LISTENERS"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This is the list of listeners, each a combination of host, port, and protocol, that the Kafka broker binds to. By default the host is "},{"type":"codeinline","content":[{"type":"text","text":"0.0.0.0"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", which means the broker listens on all network interfaces."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KAFKA_ADVERTISED_LISTENERS"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This value is likewise a list of host and port combinations, and it is what clients use to connect to the Kafka broker. A client inside Docker can reach the broker at "},{"type":"codeinline","content":[{"type":"text","text":"broker:9092"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":", while a client outside Docker is given "},{"type":"codeinline","content":[{"type":"text","text":"localhost:29092"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" to establish the connection. Each entry is also prefixed with a listener name so that it can be mapped to the appropriate protocol for the connection."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KAFKA_LISTENER_SECURITY_PROTOCOL_MAP"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Here we map each user-defined listener name to the protocol we want to use for communication, which can be "},{"type":"codeinline","content":[{"type":"text","text":"PLAINTEXT"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (unencrypted) or "},{"type":"codeinline","content":[{"type":"text","text":"SSL"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (encrypted). These names are then used together with the host\/IP in "},{"type":"codeinline","content":[{"type":"text","text":"KAFKA_LISTENERS"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" and "},{"type":"codeinline","content":[{"type":"text","text":"KAFKA_ADVERTISED_LISTENERS"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" so that the right protocol is applied."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Because we configured only a single-node Kafka cluster, the advertised address returned to any client points to that "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"same broker"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Schema Registry"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"schema-registry:\n image: confluentinc\/cp-schema-registry:6.0.0\n hostname: schema-registry\n container_name: schema-registry\n depends_on:\n   - zookeeper\n   - broker\n ports:\n   - \"8081:8081\"\n environment:\n   SCHEMA_REGISTRY_HOST_NAME: schema-registry\n   SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: \"zookeeper:2181\"\n networks:\n   - project_network"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For a single-node Schema Registry, we specify the Zookeeper connection string used by the Kafka cluster, through which the registry stores schema-related data."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Kafka Connect"}]},
{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"connect:\n image: confluentinc\/cp-kafka-connect:6.0.0\n hostname: connect\n container_name: connect\n volumes:\n   - \".\/producers\/debezium-debezium-connector-postgresql\/:\/usr\/share\/confluent-hub-components\/debezium-debezium-connector-postgresql\/\"\n   - \".\/consumers\/confluentinc-kafka-connect-elasticsearch\/:\/usr\/share\/confluent-hub-components\/confluentinc-kafka-connect-elasticsearch\/\"\n depends_on:\n   - zookeeper\n   - broker\n   - schema-registry\n ports:\n   - \"8083:8083\"\n environment:\n   CONNECT_BOOTSTRAP_SERVERS: \"broker:9092\"\n   KAFKA_HEAP_OPTS: \"-Xms256M -Xmx512M\"\n   CONNECT_REST_ADVERTISED_HOST_NAME: connect\n   CONNECT_REST_PORT: 8083\n   CONNECT_GROUP_ID: compose-connect-group\n   CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs\n   CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1\n   CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000\n   CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets\n   CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1\n   CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status\n   CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1\n   CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter\n   CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter\n   CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http:\/\/schema-registry:8081\n   CONNECT_INTERNAL_KEY_CONVERTER: \"org.apache.kafka.connect.json.JsonConverter\"\n   CONNECT_INTERNAL_VALUE_CONVERTER: \"org.apache.kafka.connect.json.JsonConverter\"\n   CONNECT_ZOOKEEPER_CONNECT: \"zookeeper:2181\"\n   CLASSPATH: \/usr\/share\/java\/monitoring-interceptors\/monitoring-interceptors-5.5.1.jar\n   CONNECT_PRODUCER_INTERCEPTOR_CLASSES: \"io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor\"\n   CONNECT_CONSUMER_INTERCEPTOR_CLASSES: 
\"io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor\"\n CONNECT_PLUGIN_PATH: \"\/usr\/share\/java,\/usr\/share\/confluent-hub-components\"\n CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR\n networks:\n - project_network"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們看到一些新的參數,比如:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"CONNECT_BOOTSTRAP_SERVERS:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一組主機和端口組合,用於建立到 Kafka 集羣的初始連接"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"CONNECT_KEY_CONVERTER:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"用於將鍵(key)從"},{"type":"codeinline","content":[{"type":"text","text":"connect"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"格式序列化爲與 Kafka 兼容的格式。類似地,對於 "},{"type":"codeinline","content":[{"type":"text","text":"CONNECT_VALUE_CONVERTER"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",我們使用 AvroConverter 
進行序列化。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"映射大量 source 和 sink 連接器插件並在 "},{"type":"codeinline","content":[{"type":"text","text":"CONNECT_PLUGIN_PATH"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 中指定它們是非常的重要。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"ksqlDB"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"ksqldb-server:\n image: confluentinc\/cp-ksqldb-server:6.0.0\n hostname: ksqldb-server\n container_name: ksqldb-server\n depends_on:\n - broker\n - schema-registry\n ports:\n - \"8088:8088\"\n volumes:\n - \".\/producers\/debezium-debezium-connector-postgresql\/:\/usr\/share\/kafka\/plugins\/debezium-debezium-connector-postgresql\/\"\n - \".\/consumers\/confluentinc-kafka-connect-elasticsearch\/:\/usr\/share\/kafka\/plugins\/confluentinc-kafka-connect-elasticsearch\/\"\n environment:\n KSQL_LISTENERS: \"http:\/\/0.0.0.0:8088\"\n KSQL_BOOTSTRAP_SERVERS: \"broker:9092\"\n KSQL_KSQL_SCHEMA_REGISTRY_URL: \"http:\/\/schema-registry:8081\"\n KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: \"true\"\n KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: \"true\"\n KSQL_KSQL_STREAMS_MAX_TASK_IDLE_MS: 2000\n KSQL_CONNECT_GROUP_ID: \"ksql-connect-cluster\"\n KSQL_CONNECT_BOOTSTRAP_SERVERS: \"broker:9092\"\n KSQL_CONNECT_KEY_CONVERTER: \"io.confluent.connect.avro.AvroConverter\"\n KSQL_CONNECT_VALUE_CONVERTER: \"io.confluent.connect.avro.AvroConverter\"\n KSQL_CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: \"http:\/\/schema-registry:8081\"\n KSQL_CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 
\"http:\/\/schema-registry:8081\"\n KSQL_CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: \"false\"\n KSQL_CONNECT_CONFIG_STORAGE_TOPIC: \"ksql-connect-configs\"\n KSQL_CONNECT_OFFSET_STORAGE_TOPIC: \"ksql-connect-offsets\"\n KSQL_CONNECT_STATUS_STORAGE_TOPIC: \"ksql-connect-statuses\"\n KSQL_CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1\n KSQL_CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1\n KSQL_CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1\n KSQL_CONNECT_PLUGIN_PATH: \"\/usr\/share\/kafka\/plugins\"\n networks:\n - project_network"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如果不打算使用 "},{"type":"codeinline","content":[{"type":"text","text":"Kafka-Connect"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",並且不需要獨立於 "},{"type":"codeinline","content":[{"type":"text","text":"ksql"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"擴展 "},{"type":"codeinline","content":[{"type":"text","text":"Kafka-Connect"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",那麼可以爲 "},{"type":"codeinline","content":[{"type":"text","text":"ksql"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"設置 "},{"type":"codeinline","content":[{"type":"text","text":"embedded-connect"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"配置,這將暴露來自 
"},{"type":"codeinline","content":[{"type":"text","text":"ksqldb-server"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的連接點。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"除此之外,還有一個環境變量需要考慮:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"KSQL_KSQL_STREAMS_MAX_TASK_IDLE_MS"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":":在當前版本的 
ksqlDB,流與表的關聯(stream-table join)結果可能是不確定的:如果表中對應的記錄在流事件到達之前尚未創建或更新,關聯就可能失敗。配置這個環境變量後,當流中某個事件在特定時間戳到達時,處理任務會等待一段時間,讓表中的對應記錄先加載進來。這提高了關聯的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"可預測性"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",但可能會導致某些"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"性能下降"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。相關的改進工作正在進行中,詳見"},{"type":"link","attrs":{"href":"https:\/\/cwiki.apache.org\/confluence\/display\/KAFKA\/KIP-695%3A+Further+Improve+Kafka+Streams+Timestamp+Synchronization","title":null,"type":null},"content":[{"type":"text","text":"這裏"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"如果你還不能完全理解上面的內容,建議暫且沿用這個配置,因爲它確實有效;要講清楚"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"時間同步"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"需要另寫一篇文章。如果你仍然好奇,可以觀看來自 Confluent 的 Matthias J. 
Sax 製作的"},{"type":"link","attrs":{"href":"https:\/\/www.confluent.io\/resources\/kafka-summit-2020\/the-flux-capacitor-of-kafka-streams-and-ksqldb\/","title":null,"type":null},"content":[{"type":"text","text":"視頻"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"ksqldb-cli:\n image: confluentinc\/cp-ksqldb-cli:6.0.0\n container_name: ksqldb-cli\n depends_on:\n - broker\n - ksqldb-server\n entrypoint: \/bin\/sh\n tty: true\n networks:\n - project_network"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在測試或開發環境中,使用 "},{"type":"codeinline","content":[{"type":"text","text":"ksqldb-cli"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"服務來嘗試和測試流非常方便。即使在生產環境中,如果您想探索事件流或 Ktables,或者手動創建或過濾流,也可以這樣做。儘管如此,還是建議您使用 ksql 或 kafka 客戶端或其 REST 
端點自動創建流、表或主題,這些我們將在下面進行討論。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b3\/e2\/b3820925253c5288f3fff27030d153e2.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"圖片由作者提供:目前爲止對我們的架構進行的更詳細觀察"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"初始化數據"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"流"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"streams-init:\n build: jobs\/streams-init\n container_name: streams-init\n depends_on:\n - zookeeper\n - broker\n - schema-registry\n - ksqldb-server\n - ksqldb-cli\n - postgres\n - elasticsearch\n - connect\n env_file:\n - .env\n environment:\n ZOOKEEPER_HOSTS: \"zookeeper:2181\"\n KAFKA_TOPICS: \"brands, brand_products\"\n networks:\n - project_network"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這個服務的目的是進行流初始化和 Kafka 內部配置,以及我們正在使用的其他服務。在部署時,我們不希望在服務器上手動創建主題、流、連接等。因此,我們使用爲每個服務提供的 REST 服務,並編寫 shell 腳本來自動化這個過程。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的配置腳本如下所示:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"#!\/bin\/bash\n\n\n# Setup ENV variables in connectors json files\nsed -i \"s\/POSTGRES_USER\/${POSTGRES_USER}\/g\" connectors\/postgres.json\nsed -i \"s\/POSTGRES_PASSWORD\/${POSTGRES_PASSWORD}\/g\" connectors\/postgres.json\nsed -i \"s\/POSTGRES_DB\/${POSTGRES_DB}\/g\" connectors\/postgres.json\nsed -i \"s\/ELASTIC_PASSWORD\/${ELASTIC_PASSWORD}\/g\" connectors\/elasticsearch.json\n\n\n# Simply wait until original kafka container and zookeeper are started.\nexport WAIT_HOSTS=zookeeper:2181,broker:9092,schema-registry:8081,ksqldb-server:8088,elasticsearch:9200,connect:8083\nexport WAIT_HOSTS_TIMEOUT=300\n\/wait\n\n\n# Parse string of kafka topics into an array\n# https:\/\/stackoverflow.com\/a\/10586169\/4587961\nkafkatopicsArrayString=\"$KAFKA_TOPICS\"\nIFS=', ' read -r -a kafkaTopicsArray <<< \"$kafkatopicsArrayString\"\n\n\n# A separate variable for zookeeper hosts.\nzookeeperHostsValue=$ZOOKEEPER_HOSTS\n\n\n# Terminate all queries\ncurl -s -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d '{\"ksql\": \"SHOW QUERIES;\"}' | \\\n jq '.[].queries[].id' | \\\n xargs -Ifoo curl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d '{\"ksql\": \"TERMINATE 'foo';\"}'\n \n\n\n# Drop All Tables\ncurl -s -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d '{\"ksql\": \"SHOW TABLES;\"}' | \\\n jq '.[].tables[].name' | \\\n xargs -Ifoo curl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d 
'{\"ksql\": \"DROP TABLE \\\"foo\\\";\"}'\n\n\n\n\n# Drop All Streams\ncurl -s -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d '{\"ksql\": \"SHOW STREAMS;\"}' | \\\n jq '.[].streams[].name' | \\\n xargs -Ifoo curl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" \\\n -H \"Content-Type: application\/vnd.ksql.v1+json; charset=utf-8\" \\\n -d '{\"ksql\": \"DROP STREAM \\\"foo\\\";\"}'\n \n\n\n# Create kafka topic for each topic item from split array of topics.\nfor newTopic in \"${kafkaTopicsArray[@]}\"; do\n # https:\/\/kafka.apache.org\/quickstart\n curl -X DELETE http:\/\/elasticsearch:9200\/enriched_$newTopic --user elastic:${ELASTIC_PASSWORD}\n curl -X DELETE http:\/\/schema-registry:8081\/subjects\/store.public.$newTopic-value\n kafka-topics --create --topic \"store.public.$newTopic\" --partitions 1 --replication-factor 1 --if-not-exists --zookeeper \"$zookeeperHostsValue\"\n curl -X POST -H \"Content-Type: application\/vnd.schemaregistry.v1+json\" --data @schemas\/$newTopic.json http:\/\/schema-registry:8081\/subjects\/store.public.$newTopic-value\/versions\n\n\ndone\n\n\ncurl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" -H \"Accept: application\/vnd.ksql.v1+json\" -d $'{ \"ksql\": \"CREATE STREAM \\\\\"brands\\\\\" WITH (kafka_topic = \\'store.public.brands\\', value_format = \\'avro\\');\", \"streamsProperties\": {} }'\ncurl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" -H \"Accept: application\/vnd.ksql.v1+json\" -d $'{ \"ksql\": \"CREATE STREAM \\\\\"enriched_brands\\\\\" WITH ( kafka_topic = \\'enriched_brands\\' ) AS SELECT CAST(brand.id AS VARCHAR) as \\\\\"id\\\\\", brand.tenant_id as \\\\\"tenant_id\\\\\", brand.name as \\\\\"name\\\\\" from \\\\\"brands\\\\\" brand partition by CAST(brand.id AS VARCHAR) EMIT CHANGES;\", \"streamsProperties\": {} }'\n\n\ncurl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" -H \"Accept: application\/vnd.ksql.v1+json\" -d 
$'{ \"ksql\": \"CREATE STREAM \\\\\"brand_products\\\\\" WITH ( kafka_topic = \\'store.public.brand_products\\', value_format = \\'avro\\' );\", \"streamsProperties\": {} }'\ncurl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" -H \"Accept: application\/vnd.ksql.v1+json\" -d $'{ \"ksql\": \"CREATE TABLE \\\\\"brands_table\\\\\" AS SELECT id as \\\\\"id\\\\\", latest_by_offset(tenant_id) as \\\\\"tenant_id\\\\\" FROM \\\\\"brands\\\\\" group by id EMIT CHANGES;\", \"streamsProperties\": {} }'\ncurl -X \"POST\" \"http:\/\/ksqldb-server:8088\/ksql\" -H \"Accept: application\/vnd.ksql.v1+json\" -d $'{ \"ksql\": \"CREATE STREAM \\\\\"enriched_brand_products\\\\\" WITH ( kafka_topic = \\'enriched_brand_products\\' ) AS SELECT \\\\\"brand\\\\\".\\\\\"id\\\\\" as \\\\\"brand_id\\\\\", \\\\\"brand\\\\\".\\\\\"tenant_id\\\\\" as \\\\\"tenant_id\\\\\", CAST(brand_product.id AS VARCHAR) as \\\\\"id\\\\\", brand_product.name AS \\\\\"name\\\\\" FROM \\\\\"brand_products\\\\\" AS brand_product INNER JOIN \\\\\"brands_table\\\\\" \\\\\"brand\\\\\" ON brand_product.brand_id = \\\\\"brand\\\\\".\\\\\"id\\\\\" partition by CAST(brand_product.id AS VARCHAR) EMIT CHANGES;\", \"streamsProperties\": {} }'\n\n\ncurl -X DELETE http:\/\/connect:8083\/connectors\/enriched_writer\ncurl -X \"POST\" -H \"Content-Type: application\/json\" --data @connectors\/elasticsearch.json http:\/\/connect:8083\/connectors\n\n\ncurl -X DELETE http:\/\/connect:8083\/connectors\/event_reader\ncurl -X \"POST\" -H \"Content-Type: application\/json\" --data @connectors\/postgres.json 
http:\/\/connect:80"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這就是我們目前的工作方式:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在運行任何任務之前,我們確保所有的服務都"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"準備好了"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":";"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們需要確保主題在 Kafka 上"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"已存在"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",或者我們創建新的主題;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"即使有 "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"schema 更新"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",我們的數據流也應該是可用的;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當底層數據 source 或 sink 
的密碼或版本發生更改時,需要重新創建連接器。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"共享這個配置腳本的目的只是爲了"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"演示"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"一種自動化這些 pipeline 的方法。完全相同的配置可能並不適合您,但是自動化工作流、避免在任何環境中進行手工部署的思路始終是一樣的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"爲了讓這個數據基礎設施能夠真正快速地運行起來,請參考 GitHub 倉庫:"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/github.com\/behindthescenes-group\/oesophagus","title":null,"type":null},"content":[{"type":"text","text":"behindthescenes-group\/oesophagus"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在你的終端中克隆代碼庫並執行以下操作:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"cp default.env .env\ndocker-compose up 
-d"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在Postgres 數據庫 "},{"type":"codeinline","content":[{"type":"text","text":"store"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"中創建 "},{"type":"codeinline","content":[{"type":"text","text":"brands"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 表:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"CREATE TABLE brands (\n id serial PRIMARY KEY,\n name VARCHAR (50),\n tenant_id INTEGER\n);\nCREATE TABLE brand_products (\n id serial PRIMARY KEY,\n brand_id INTEGER,\n name VARCHAR(50)\n);"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在"},{"type":"codeinline","content":[{"type":"text","text":"brands"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"表中插入一些記錄:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"INSERT INTO brands VALUES(1, 'Brand Name 1', 1);\nINSERT INTO brands VALUES(2, 'Brand Name 2', 1);\nINSERT INTO brands 
VALUES(3, 'Brand Name 3', 2);\nINSERT INTO brands VALUES(4, 'Brand Name 4', 2);"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"然後在"},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"表中插入一些記錄:"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"INSERT INTO brand_products VALUES(1, 1, 'Product Name 1');\nINSERT INTO brand_products VALUES(2, 2, 'Product Name 2');\nINSERT INTO brand_products VALUES(3, 3, 'Product Name 3');\nINSERT INTO brand_products VALUES(4, 4, 'Product Name 4');\nINSERT INTO brand_products VALUES(5, 1, 'Product Name 5');"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在 Elasticsearch 中查看填充了"},{"type":"codeinline","content":[{"type":"text","text":"tenant_id"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 的"},{"type":"codeinline","content":[{"type":"text","text":"brand_products"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" :"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"curl localhost:9200\/enriched_brand_products\/_search --user elastic:your_password"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我將持續爲上述代碼庫做出貢獻:添加在 
"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Kubernetes"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 部署多節點 Kafka 基礎設施的配置,編寫更多連接器,使用期望的服務實現"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"即插即用"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"架構的框架。請在"},{"type":"link","attrs":{"href":"https:\/\/forms.gle\/GGg2hvnEpG6r4bgg7","title":null,"type":null},"content":[{"type":"text","text":"這裏"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"自由的提交貢獻,或讓我知道在你在當前配置中所遇到的任何"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"數據工程問題"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"下一步"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我希望這篇文章能給你一個關於部署和運行完整 Kafka 技術棧的清晰思路,這是一個構建實時流處理應用程序的基礎且有效的示例。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"根據產品或公司的自身特點,部署過程根據需要可能會有所不同。我還計劃在本系列的下一部分中就這樣一個系統在可伸縮性方面進行探討,那將是關於在相同使用場景下如何在 
"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Kubernetes"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 上部署這樣的基礎設施的討論。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"英文原文鏈接"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":":"},{"type":"link","attrs":{"href":"https:\/\/towardsdatascience.com\/enabling-a-powerful-search-capability-building-and-deploying-a-real-time-stream-processing-etl-a27ecb0ab0ae","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/towardsdatascience.com\/enabling-a-powerful-search-capability-building-and-deploying-a-real-time-stream-processing-etl-a27ecb0ab0ae"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]}