Pulsar Flink Connector 2.5.0 Released

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After sustained effort, the community has released Pulsar Flink Connector 2.5.0. Pulsar Flink Connector integrates Apache Pulsar with Apache Flink (a data processing engine), allowing Apache Flink to read data from and write data to Apache Pulsar."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Project repository: https://github.com/streamnative/pulsar-flink/tree/release-2.5.0"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The sections below describe the new features introduced in Pulsar Flink Connector 2.5.0 in detail, to help you better understand the connector."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Background"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink is a rapidly evolving distributed computing engine. Version 1.11 adds the following features:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The core engine introduces unaligned checkpoints. This mechanism significantly improves Flink's fault tolerance by speeding up checkpoints for jobs under heavy backpressure."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A new Source API. By unifying how batch and streaming sources run and by providing common internals such as event-time handling, watermark generation, and idleness detection, the new Source API greatly reduces the complexity of developing a new source."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink SQL supports Change Data Capture (CDC), which lets Flink easily interpret and consume database changelogs through tools such as Debezium. The Table API and SQL also extend the file system connector to more use cases and formats, enabling scenarios such as writing streaming data from Pulsar into Hive."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"PyFlink improves performance in several areas, including support for vectorized Python user-defined functions (Python UDFs). These changes let the Flink Python API interoperate with common Python libraries such as Pandas and NumPy, making Flink a better fit for data processing and machine learning workloads."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After Flink 1.11 was released, we upgraded Pulsar Flink Connector so that users could start using a connector that supports Flink 1.11 as soon as possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The upgrade turned out to be difficult. Flink 1.11 added and removed public APIs (the basic FieldsDataType type, the package change for StreamTableEnvironment, and changes to the execute method), moved Table schema checks to startup, and switched the connector runtime to Catalog. Together, these changes make the old and new versions incompatible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After weighing the options, we decided to add a new pulsar-flink-1.11 module to support Flink 1.11. Many thanks to Chen Hang and Wu Zhanpeng of the BIGO team, who contributed the technical work for the Flink 1.11 compatibility upgrade."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar Schema carries the type and structure information of messages, so it integrates well with Flink Table. In Flink 1.9, a SQL type could be bound to a physical type, which was then used for Pulsar's SchemaType."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After the Table changes in Flink 1.11, however, a SQL type can only use its default physical type, and Pulsar's SchemaType did not support the default physical types of Flink's date and time types. We added new native types to Pulsar Schema so that it can integrate with the Flink SQL type system."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"New features in Pulsar Flink Connector"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main features added in Pulsar Flink Connector 2.5.0 are listed below."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"pulsar-flink"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"📣 Support for Flink 1.11 and flink-sql DDL"},{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink 1.11 is a large upgrade in which some public APIs were added or removed, so a single Pulsar connector cannot be compatible with both Flink 1.9 and Flink 1.11. With this change, the project is split into two modules that support the different Flink versions. Chen Hang and Wu Zhanpeng of BIGO put a great deal of effort into this feature."}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Support for Flink 1.11"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"New flink-sql DDL support"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Updated topic partition strategy so that consumption is more evenly distributed"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compatibility between Flink 1.11 and Pulsar schema"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For implementation details, see PR-115: https://github.com/streamnative/pulsar-flink/pull/115"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"📣 Add the PulsarDeserializationSchema interface"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The new PulsarDeserializationSchema abstraction lets users implement custom decoding and access more source information. For implementation details, see PR-95: https://github.com/streamnative/pulsar-flink/pull/95"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Contributor: @wuzhanpeng"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"📣 JSON support in Flink Sink"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Flink Sink implementation now supports the JSON Pulsar Schema type."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For implementation details, see PR-116: https://github.com/streamnative/pulsar-flink/pull/116"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Contributor: @jianyun8023"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"📣 PulsarCatalog is now based on GenericInMemoryCatalog"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The PulsarCatalog implementation now extends GenericInMemoryCatalog."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For implementation details, see PR-91: https://github.com/streamnative/pulsar-flink/pull/91"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Contributor: @sijie"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Pulsar Schema"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"📣 Add Java 8 date and time types as native Pulsar Schema types"},{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adds Pulsar Schema support for commonly used Java 8 types such as Instant, LocalDate, LocalTime, and LocalDateTime."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For implementation details, see PR-7874: https://github.com/apache/pulsar/pull/7874"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Contributor: @jianyun8023"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The release of Pulsar Flink Connector 2.5.0 is a major milestone for this fast-growing project. Special thanks to Chen Hang, Wu Zhanpeng, Guo Sijie, and Zhao Jianyun for their contributions to this release."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you have ideas or would like to become a contributor, you are welcome to open an issue. You can also refer to our contribution guide: https://github.com/streamnative/pulsar-flink/issues"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Related links"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?_biz=MzU3Mzg4OTMyNQ==&mid=2247488124&idx=1&sn=002d1acbfa47887a4a28f14a4459f7ea&scene=21#wechatredirect","title":""},"content":[{"type":"text","text":"Flink 1.11 new features (Flink-China)"}]},{"type":"text","text":" "}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pulsar Flink Connector: https://github.com/streamnative/pulsar-flink/tree/release-2.5.0"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"streamnative/pulsar-flink: https://github.com/streamnative/pulsar-flink/issues"}]}]}]}]}