简单的Kafka:没有ZooKeeper的Kafka

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Apache Kafka®的核心是日志。日志是一个简单的数据结构,它通过顺序读写与底层硬件密切配合。以日志为中心的设计利用了高效的磁盘缓冲、CPU缓存、预读、零拷贝等许多特性,从而带来了众所周知的高效率和吞吐量。对于那些刚接触Kafka的人来说,这些主题以及它作为提交日志的底层实现,通常是他们学习Apache Kafka的第一件事。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"但是日志本身的代码在整个系统中只占相对较小的一部分。Kafka的代码库中有很大一部分是负责在集群中多个broker之间分配分区(即日志)、分配领导权、处理故障等。这些代码使Kafka成为一个可靠和可信的分布式系统。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"从历史上看,Apache ZooKeeper是分布式代码工作的关键部分。ZooKeeper提供了可靠的元数据存储,这些元数据存储了系统中最重要的信息:比如分区在哪里,哪个副本是Leader等等。项目早期使用ZooKeeper是有意义的,因为它是一个强大且经过验证的工具。但归根结底,ZooKeeper是一个基于一致性日志的特殊文件系统\/触发器API。Kafka是一个建立在一致性日志之上的发布\/订阅API。这使得操作系统的人员可以跨两个日志实现、两个网络层和两个安全实现(每个实现都有不同的工具和监视钩子)对通信和性能进行调优、配置、监视、保护和评估。它变得不必要的复杂。这种固有的和不可避免的复杂性促使了最近的一个倡议,即用一个完全运行在Kafka内部的仲裁服务来取代ZooKeeper。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"当然,更换ZooKeeper是一项相当大的工作,去年4月,我们启动了一个社区倡议,以加快进度,并在年底前交付一个工作系统。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2f\/2f675fdbdcdb9f5278ff0ae888b7c9c3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我刚刚和Jason, Colin以及KIP-500团队坐在一起,经历了Kafka服务器的完整生命周期,生产,消费和所有zookeeper免费。非常甜蜜!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/twitter.com\/benstopford?ref_src=twsrc^tfw|twcamp^tweetembed|twterm^1338931076979372038|twgr^|twcon^s1_&ref_url=https%3A%2F%2Fwww.confluent.io%2Fblog%2Fkafka-without-zookeeper-a-sneak-peek%2F","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"Ben Stopford@benstopford"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"所以我们很高兴地说,KIP-500代码的早期访问已经提交到trunk,预计将包括在即将发布的2.8版本中。第一次,你可以在没有ZooKeeper的情况下运行Kafka。我们称之为 Kafka Raft 元数据模式,通常缩写为KRaft(发音像craft)模式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"注意,有一些特性在这个早期版本中是不可用的。我们还不支持使用acl和其他安全特性或事务。而且,在KRaft模式下,不支持分区重分配和JBOD(预计在今年晚些时候的Apache Kafka版本中会提供这些功能)。因此,考虑 Quorum 控制器是一个实验性的功能,我们不建议将其置于生产工作负载之下。然而,如果你确实尝试过这个软件,你会发现它有很多新的优点:它的部署和操作更简单,你可以把Kafka作为一个单独的进程来运行,而且它可以在每个集群中容纳更多的分区(见下面的数据信息)。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"Quorum 控制器: 事件驱动的共识"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"如果你选择使用新的 Quorum 控制器运行Kafka,所有以前由Kafka控制器和ZooKeeper承担的元数据功能,都会合并到这个新的服务中,运行在Kafka集群中。如果有需要的话,Quorum 控制器还可以在专用硬件上运行。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/10\/109860232d5405aadf758a512d35500c.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"但在内部,它变得有趣起来。Quorum 控制器使用新的 KRaft 协议来确保元数据在仲裁中被精确地复制。这个协议在很多方面与ZooKeeper的ZAB协议和Raft相似,但有一些重要的区别,其中一个显著且合适的区别是它使用了事件驱动的架构。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Quorm控制器使用事件源存储模型存储其状态,这确保始终可以准确地重新创建内部状态机。用于存储此状态的事件日志(也称为元数据主题)通过快照定期地进行压缩,以确保日志不会无限增长。Quorm中的其他控制器通过响应活动控制器,创建并存储在其日志中的事件来跟踪活动控制器。因此,如果一个节点由于分区事件而暂停,那么它可以在重新登录时通过访问日志来快速地赶上它错过的任何事件。这大大减少了不可用窗口,改善了系统的最坏情况恢复时间。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/03\/0365ed50557e411cc4343d7ebbc89d1a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"事件驱动的内部共识"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"KRaft协议的事件驱动特性意味着,与基于ZooKeeper的控制器不同,仲裁控制器在成为活动状态之前不需要从ZooKeeper加载状态。当领导权发生变化时,新的活动控制器已经在内存中拥有所有提交的元数据记录。此外,KRaft协议中使用的事件驱动机制也用于跨集群跟踪元数据。以前使用rpc处理的任务现在得益于事件驱动以及使用实际日志进行通信。这些改变带来的一个令人愉快的结果是,Kafka现在可以比以前支持更多的分区。让我们更详细地讨论一下。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"扩展Kafka:支持数百万个分区"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Kafka集群可以支持的分区数由两个属性决定: 每个节点的分区数限制和集群范围的分区限制。两者都很有趣,但是到目前为止,元数据管理一直是集群范围限制的主要瓶颈。以前的Kafka改进建议(KIPs)已经改进了每个节点的限制,尽管总有更多的事情可以做。但是Kafka的可伸缩性主要依赖于增加节点来获得更多的容量。这就使集群范围限制变得重要的地方,因为它定义了系统内可伸缩性的上限。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"新的Quorum控制器旨在处理每个集群中更多的分区。为了评估这一点,我们进行了类似于2018年运行的那些测试,以公布Kafka固有的分区限制。这些测试测量关闭和恢复所花费的时间,这是指旧控制器的O(#partitions)操作。正是这个操作为Kafka在单个集群中所能支持的分区数量设定了上限。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"之前的实现,正如Jun Rao在上面的文章中解释的那样,可以达到200K分区,限制因素是在外部共识(ZooKeeper)和内部leader管理(Kafka controller)之间移动关键元数据所花费的时间。使用新的仲裁控制器,这两个角色由相同的组件提供服务。事件驱动的方法意味着控制器故障转移现在几乎是即时的。下面是在我们的实验室中运行200万个分区(是上一个上限的10倍)的集群:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/19\/19fb676d80f6d4b7e4bfeda377c89158.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

With ZooKeeper-Based Controller

With Quorum Controller

Controlled Shutdown Time (2 million partitions)

135 sec.

32 sec.

Recovery from Uncontrolled Shutdown (2 million partitions)

503 sec.

37 sec."}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"控制和不控制停机的两种措施都很重要。受控关闭会影响常见的操作场景,如滚动重启:部署软件更改的标准过程,同时保持整个过程的可用性。从不受控制的关闭中恢复可能更重要,因为它设置了系统的恢复时间目标(RTO),例如在发生意外故障后,例如VM或pod崩溃或数据中心不可用。虽然这些度量只是更广泛的系统性能指标,但它们直接度量了众所周知的ZooKeeper使用带来的瓶颈。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"注意,控制和非控制的测量是不能直接比较的。不受控制的政府停摆案包括了选出新领导人所需的时间,而控制案则没有。这种差异是故意的,以保持控制病例与Jun Rao的原始测量。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"集群规模下降: 单一进程运行Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"Kafka经常被认为是重量级的基础设施,比如管理zookeeper的复杂性(因为它是一个单独的分布式系统) 就是这种看法存在的重要原因。这通常会导致项目在开始时选择更轻量级的消息队列,比如ActiveMQ或rabbitmq这样的传统队列,然后在规模需要时转移到Kafka。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"这是不幸的,因为Kafka提供的抽象,形成了一个提交日志,是适用于小规模的工作负载,你可能看到在一个初创公司,因为它是在Netflix或Instagram的高吞吐量。更重要的是,如果你想添加流处理,你需要Kafka和它的提交日志抽象,不管它是使用Kafka Streams, ksqlDB,还是其他的流处理框架。但是由于管理两个独立系统kafka和zookeeper的复杂性,用户常常觉得他们必须在规模和入门的方便性之间做出选择。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"现在已经不是这样了。KIP-500和KRaft模式提供了一种很棒的、轻量级的方式来开始使用Kafka,或者使用它作为ActiveMQ或RabbitMQ等单片代理的替代方案。轻量级的单进程部署也更适合于边缘场景和那些使用轻量级硬件的场景。云为这个问题增加了一个有趣的切入角度。像融合云这样的托管服务完全消除了操作负担。因此,无论您是希望运行自己的集群,还是让它为您运行,都可以从小规模开始,随着底层用例的扩展(可能)扩展到大规模——所有这些都使用相同的基础设施。让我们看看单进程部署是什么样子的。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"带着没有ZooKeeper的的Kafka兜风"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"新的Quorm控制器今天在trunk中已经以试验性的功能提供出来,预计将包含在即将发布的Apache Kafka 2.8版本中。那么你能用它做什么呢? 如上所述,一个简单但非常酷的新特性是创建单个进程Kafka集群的能力,如下面的简短演示所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"演示文档地址: "},{"type":"link","attrs":{"href":"https:\/\/asciinema.org\/a\/403794\/embed","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"https:\/\/asciinema.org\/a\/403794\/embed"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"当然,如果您想要扩展它以支持更高的吞吐量并添加复制以容错,您只需要添加新的代理进程。如你所知,这是基于kraft的 Quorm 控制器的早期访问版本。请不要将它用于高负载的工作环境中。在接下来的几个月里,我们将添加最后缺失的部分,执行协议的TLA+建模,并在融合云中完善 Quorm控制器。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"您现在可以自己尝试新的Quorm 控制器。在GitHub上"},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/kafka\/blob\/6d1d68617ecd023b787f54aafc24a4232663428d\/config\/kraft\/README.md","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"查看完整的描述"}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"。"},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/kafka\/blob\/6d1d68617ecd023b787f54aafc24a4232663428d\/config\/kraft\/README.md","title":null,"type":null},"content":[{"type":"text","marks":[{"type":"underline"}],"text":"现在去尝试"}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#1A1A1A","name":"user"}}],"text":"背后的团队"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"如果没有Apache Kafka社区和一群分布式系统工程师在大流行期间不知疲倦地工作,在大约9个月的时间里将它从零变成一个正常工作的系统,这将是(并将继续是)一个巨大的努力。我们想扩展特别感谢 Colin McCabe, Jason Gustafson, Ron Dagostino, Boyang Chen, David Arthur, Jose Garcia Sancio, Guozhang Wang, Alok Nikhil, Deng Zi Ming, Sagar Rao, Feyman, Chia-Ping Tsai, Jun Rao, Heidi Howard,和Apache Kafka社区的所有成员帮助实现这一目标。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"原文网址"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https:\/\/www.confluent.io\/blog\/kafka-without-zookeeper-a-sneak-peek\/"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"译者:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"韩欣"},{"type":"text","text":",腾讯云中间件-微服务产品中心技术总监,微服务平台TSF、消息队列CKafka \/ TDMQ、微服务观测平台TSW等中间件产品的负责人。负责中间件相关产品的规划,架构和落地实施,有超过十三年的研发架构经验。目前关注在云计算中间件相关领域,致力于整合PaaS技术资源,构建基于微服务的技术中台,为企业的数字化转型提供基础支持。"}]}]}

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章