如何使用Kafka、MongoDB和Maxwell’s Daemon构建SQL数据库的审计系统

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"本文要点"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"审计日志系统有很多应用场景,而不仅仅是存储用于审计目的的数据。除了合规性和安全性的目的之外,它还能够被市场营销团队使用,以便于锁定目标用户,也可以用来生成重要的告警。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"数据库内置的审计日志功能可能并不够用,要处理所有的用户场景,它肯定不是理想的方式。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"目前,有很多的开源工具,如"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemons"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/debezium.io\/","title":"","type":null},"content":[{"type":"text","text":"Debezium"}]},{"type":"text","text":",它们能够以最少的基础设施和时间需求支持这些需求。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Maxwell’s daemons能够读取SQL bin日志并发送事件到各种生产者,比如"},{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Kafka"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/kinesis\/","title":"","type":null},"content":[{"type":"text","text":"Amazon 
Kinesis"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sqs\/","title":"","type":null},"content":[{"type":"text","text":"SQS"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.rabbitmq.com\/","title":"","type":null},"content":[{"type":"text","text":"Rabbit MQ"}]},{"type":"text","text":"等。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SQL数据库生成的bin日志必须是基于ROW的格式,这样才能使整个环境运行起来。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"假设你正在使用关系型数据来维护事务性数据并且你需要存储某些数据的审计跟踪信息,而这些数据本身是以表的形式存在的。如果你像大多数开发人员那样,那么最终所采用的方案可能如下所示:"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. 使用数据库的审计日志功能"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大多数数据库都提供了插件来支持审计日志。这些插件可以很容易地安装和配置,以便于记录数据。但是,这种方式存在如下的问题:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"完整的审计日志插件一般只有企业级版本才提供。社区版可能会缺失这样的插件。以MySQL为例,"},{"type":"link","attrs":{"href":"https:\/\/dev.mysql.com\/doc\/refman\/5.7\/en\/audit-log.html","title":"","type":null},"content":[{"type":"text","text":"审计日志插件"}]},{"type":"text","text":"只有企业版中才能使用。值得一提的是,MySQL社区版的用户依然可以安装来自MariaDB或Percona的其他审计日志组件以绕过这个限制。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"数据库级别的审计日志会导致数据库服务器10-20%的额外负载,正如"},{"type":"link","attrs":{"href":"http:\/\/blog.syme
dia.pl\/2016\/10\/performance-impact-general-slow-query-log.html","title":"","type":null},"content":[{"type":"text","text":"该文"}]},{"type":"text","text":"和"},{"type":"link","attrs":{"href":"https:\/\/www.percona.com\/blog\/2009\/02\/10\/impact-of-logging-on-mysql%E2%80%99s-performance","title":"","type":null},"content":[{"type":"text","text":"该文"}]},{"type":"text","text":"所讨论的。通常,对于高负载的系统,我们可能想要仅对较慢的查询启用审计日志,而不是针对所有的查询。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"审计日志会写入到日志文件中,数据不易于搜索。为了实现数据分析和审计的目的,我们可能想要审计数据能够遵循可搜索的格式。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大量的审计归档文件会消耗非常重要的数据库存储,因为它们存储在与数据库相同的服务器上。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. 使用应用程序来负责审计日志"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"要实现这一点,你可以采用如下的方案之一:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"a.在更新现有的数据之前,复制现有的数据到另外一个表中,然后再更新当前表中的数据。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"b.为数据添加一个版本号,然后每次更新都会插入一条已递增版本号的数据。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"c.写入到两个数据库表中,其中一张表包含最新的数据,另外一张表包含审计跟踪信息。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"
content":[{"type":"text","text":"作为设计可扩展系统的一项原则,我们必须要避免多次写入相同的数据,因为这不仅会降低系统的性能,还会引发各种数据不同步的问题。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"那么企业为什么需要审计数据呢?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在开始介绍审计日志系统的架构之前,我们首先看一下各种组织对审计日志系统的一些需求。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"合规性和审计:审计人员需要从他们的角度出发,以有意义和相关的方式获取数据。数据库审计日志适用于DBA团队,但并不适合审计人员。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"对于任何大型软件来说,一个最基本的需求就是能够在遇到安全漏洞的时候生成重要的告警。审计日志可以用来实现这一点。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你必须回答各种问题,比如谁访问了数据,数据在此之前的状态是什么,在更新的时候都修改了哪些内容以及内部用户是否滥用了权限等等。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"还有很重要的一点需要注意,因为审计跟踪信息能够有助于识别渗透者,这能够强化对“内部人员”的威慑力。人们如果知道自己的行为会被审查,那么他们就不太可能会访问未经授权的数据库或篡改特定的数据。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所有的行业,从金融和能源到餐饮服务和公共项目,都需要分析数据访问情况,并定期向各种政府机构提交详细的报告。根据“健康保险流通与责任法案(Health Insurance Portability and Accountability Act,HIPAA)”,该法案要求医疗服务供应商提供所有接触他们数据记录的每个人的审计跟踪数据,这个要求要到数据行和记录级别。新的欧盟通用数据保护条例(European Union General Data Protection Regulation,GDPR)也有类似的需求。萨班斯-奥克斯利法案(Sarbanes-Oxley 
Act,SOX)对公众公司提出了广泛的会计法规。这些组织需要定期分析数据访问情况并生成详细的报告。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在本文中,我将会使用像Maxwell’s Daemon和Kafka这样的技术提供一个可扩展的方案,以管理审计跟踪数据。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"问题陈述"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"构建一个独立于应用程序和数据模型的审计系统。该系统必须要具备可扩展性并且经济划算。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"架构"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"重要提示:本系统只适用于使用MySQL数据库的情况,并且使用基于ROW的"},{"type":"link","attrs":{"href":"https:\/\/dev.mysql.com\/doc\/refman\/5.7\/en\/binary-log-formats.html","title":"","type":null},"content":[{"type":"text","text":"binlog日志格式"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在我们讨论解决方案的细节之前,我们先快速看一下本文中所讨论的每项技术。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"text","text":"(MD)是一个来自"},{"type":"link","attrs":{"href":"https:\/\/www.zendesk.com\/","title":"","type":null},"content":[{"type":"text","text":"Zendesk"}]},{"type":"text","text":"的开源项目,它会读取MySQL 
bin日志并将ROW更新以JSON的格式写入到Kafka、Kinesis或其他流平台上。Maxwell的运维开销非常低,除了MySQL和一些写入数据的地方之外,就没有其他的需求了,如"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"项目网站"}]},{"type":"text","text":"所述。简而言之,MD是一个数据变化捕获(Change-Data-Capture,CDC)的工具。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"市场上有很多可用的CDC变种,比如Red Hat的Debezium、Netflix的DBLog以及LinkedIn的Brooklin。我们这里的环境可以采用这些工具中的任意一个来实现。但是,Netflix的DBLog以及LinkedIn的Brooklin是为了满足不同的使用场景而开发的,正如上述的链接中所阐述的那样。不过,Debezium与MD非常类似,可以用来取代我们的架构中的MD。关于该选择MD还是Debezium,我简要列出了几件需要考虑的事情。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Debezium只能写入数据到Kafka中,至少这是它支持的主要的生产者。而MD支持各种生产者,包括Kafka。MD支持的生产者是Kafka、"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/kinesis\/","title":"","type":null},"content":[{"type":"text","text":"Kinesis"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/cloud.google.com\/pubsub\/docs\/overview","title":"","type":null},"content":[{"type":"text","text":"Google Cloud Pub\/Sub"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sqs\/","title":"","type":null},"content":[{"type":"text","text":"SQS"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.rabbitmq.com\/","title":"","type":null},"content":[{"type":"text","text":"Rabbit 
MQ"}]},{"type":"text","text":"和Redis。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MD提供了编写自己的生产者并对其进行配置的方案。详情可参考该"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/producers\/","title":"","type":null},"content":[{"type":"text","text":"文档"}]},{"type":"text","text":"。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Debezium的优势在于它可以从多个源读取变化数据,比如"},{"type":"link","attrs":{"href":"https:\/\/www.mysql.com\/","title":"","type":null},"content":[{"type":"text","text":"MySQL"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.mongodb.com\/","title":"","type":null},"content":[{"type":"text","text":"MongoDB"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.postgresql.org\/","title":"","type":null},"content":[{"type":"text","text":"PostgreSQL"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.microsoft.com\/en-in\/sql-server\/","title":"","type":null},"content":[{"type":"text","text":"SQL 
Server"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"http:\/\/cassandra.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Cassandra"}]},{"type":"text","text":"、"},{"type":"link","attrs":{"href":"https:\/\/www.ibm.com\/in-en\/products\/db2-database","title":"","type":null},"content":[{"type":"text","text":"DB2"}]},{"type":"text","text":"和"},{"type":"link","attrs":{"href":"https:\/\/www.oracle.com\/index.html","title":"","type":null},"content":[{"type":"text","text":"Oracle"}]},{"type":"text","text":"。在添加新的数据源方面,他们非常活跃。而MD目前只支持MySQL数据源。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Apache Kafka"}]},{"type":"text","text":"是一个开源的分布式事件流平台,能够用于高性能的数据管道、流分析、数据集成和任务关键型的应用。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"MongoDB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.mongodb.com\/","title":"","type":null},"content":[{"type":"text","text":"MongoDB"}]},{"type":"text","text":"是一个通用的、基于文档的分布式数据库,它是为现代应用开发人员和云时代所构建的。我们使用MongoDB只是为了进行阐述,你可以选择其他的方案,比如"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/s3\/","title":"","type":null},"content":[{"type":"text","text":"S3"}]},{"type":"text","text":",也可以选择其他的时序数据库如"},{"type":"link","attrs":{"href":"https:\/\/www.influxdata.com\/","title":"","type":null},"content":[{"type":"text","text":"InfluxDB"}]},{"type":"text","text":"或"},{"type":"link","attrs":{"href":"http:\/\/cassandra.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Cassandra"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","a
ttrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下图展示了审计跟踪方案的数据流图。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/res.infoq.com\/articles\/database-audit-system-kafka\/en\/resources\/1Figure-1-Data-flow-diagram-1609154417022.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"图1 数据流图"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在审计跟踪管理系统中,要涉及到如下几个步骤。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"应用程序执行数据库写入、更新或删除操作。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"SQL数据库将会以ROW格式为这些操作生成bin日志。这是SQL数据库相关的配置。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Maxwell’s Daemon轮询SQL 
bin日志,读取新的条目并将其写入到Kafka主题中。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"消费者应用轮询Kafka主题以读取数据并进行处理。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"消费者将处理后的数据写入到新的数据存储中。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"环境搭建"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"为了实现简便的环境搭建,我们在所有可能的地方都尽可能使用Docker容器。如果你的机器还没有安装docker的话,那么可以考虑安装"},{"type":"link","attrs":{"href":"https:\/\/www.docker.com\/products\/docker-desktop","title":"","type":null},"content":[{"type":"text","text":"Docker Desktop"}]},{"type":"text","text":"。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"MySQL数据库"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1.在本地运行mysql服务器。如下的命令将会在3307端口启动一个mysql容器。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker run -p 3307:3306 -p 33061:33060 --name=mysql83 -d mysql\/mysql-server:latest\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2.如果这是全新安装的话,我们并不知道root密码,运行如下的命令在控制台打印密码出来。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker logs mysql83 2>&1 | grep 
GENERATED\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3.如果需要的话,登录容器并更改密码。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec -it mysql83 mysql -uroot -p\nalter user 'root'@'localhost' IDENTIFIED BY 'abcd1234';\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4.出于安全的原因,mysql docker容器默认不允许从外部应用进行连接。我们需要运行如下的命令改变这一点。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"update mysql.user set host = '%' where user='root';\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"5.从mysql提示窗口退出并重启docker容器。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker container restart mysql83\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"6.重新登录mysql客户端并运行如下的命令为maxwell’s daemon创建用户。关于该步骤的详细信息,请参考"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/quickstart\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon的快速指南"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec 
-it mysql83 mysql -uroot -p\nset global binlog_format=ROW;\nset global binlog_row_image=FULL;\nCREATE USER 'maxwell'@'%' IDENTIFIED BY 'pmaxwell';\nGRANT ALL ON maxwell.* TO 'maxwell'@'%';\nGRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%';\nCREATE USER 'maxwell'@'localhost' IDENTIFIED BY 'pmaxwell';\nGRANT ALL ON maxwell.* TO 'maxwell'@'localhost';\nGRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'localhost';\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka代理"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"搭建Kafka是一项非常简单直接的任务。从"},{"type":"link","attrs":{"href":"https:\/\/www.apache.org\/dyn\/closer.cgi?path=\/kafka\/2.6.0\/kafka_2.13-2.6.0.tgz","title":"","type":null},"content":[{"type":"text","text":"该链接"}]},{"type":"text","text":"下载Kafka。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"运行如下的命令:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"提取Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"tar -xzf kafka_2.13-2.6.0.tgz\ncd kafka_2.13-2.6.0\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"运行Zookeeper,这是目前使用Kafka所需要的"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/zookeeper-server-start.sh 
config\/zookeeper.properties\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在一个单独的终端启动Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-server-start.sh config\/server.properties\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在一个单独的终端创建主题"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-topics.sh --create --topic maxwell-events --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上述的命令会启动一个Kafka代理并在其中创建一个名为“"},{"type":"text","marks":[{"type":"strong"}],"text":"maxwell-events"},{"type":"text","text":"”的主题。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"要推送消息到该Kafka主题,我们可以在新的终端运行如下的命令"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-producer.sh --topic maxwell-events --broker-list 
localhost:9092\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上述的命令会给我们显示一个提示,从中可以输入消息内容,然后按回车键,以便于发送消息到Kafka中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"消费来自Kafka主题的消息"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-consumer.sh --topic maxwell-events --from-beginning --bootstrap-server localhost:9092\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"通过该"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/quickstart\/#download","title":"","type":null},"content":[{"type":"text","text":"地址"}]},{"type":"text","text":"下载maxwell’s daemon。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"将其解压并运行如下的命令。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/maxwell --user=maxwell --password=pmaxwell --host=localhost --port=3307 --producer=kafka --kafka.bootstrap.servers=localhost:9092 
--kafka_topic=maxwell-events\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这样的话,我们就建立好了Maxwell来监控前面所搭建的数据库的bin日志。当然,我们也可以只监控几个数据库或几个表。关于这方面的更多信息,请参考"},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/bootstrapping\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon配置"}]},{"type":"text","text":"文档。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"测试环境"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"要测试搭建的环境是否正确的话,我们可以连接MySQL,并在一张表中插入一些数据。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec -it mysql83 mysql -uroot -p\n\nCREATE DATABASE maxwelltest;\n\nUSE maxwelltest;\n\nCREATE TABLE Persons (\n PersonId int NOT NULL AUTO_INCREMENT,\n LastName varchar(255),\n FirstName varchar(255),\n City varchar(255),\n primary key (PersonId)\n\n);\n\nINSERT INTO Persons (LastName, FirstName, City) VALUES ('Erichsen', 'Tom', 'Stavanger');\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"现在,在另外一个终端中,运行如下的命令:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-consumer.sh --topic maxwell-events --from-beginning --bootstrap-server 
localhost:9092\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在终端中,你应该能够看到如下所示的内容:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"{\"database\":\"maxwelltest\",\"table\":\"Persons\",\"type\":\"insert\",\"ts\":1602904030,\"xid\":17358,\"commit\":true,\"data\":{\"PersonId\":1,\"LastName\":\"Erichsen\",\"FirstName\":\"Tom\",\"City\":\"Stavanger\"}}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"正如我们所看到的,Maxwell’s Daemon捕获到了数据库插入事件并写入一个JSON字符串到Kafka主题中,其中包含了事件的详情。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"搭建MongoDB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"要在本地运行MongoDB,可以运行如下的命令:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker run --name mongolocal -p 27017:27017 
mongo:latest\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka消费者"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Kafka-consumer的代码可以通过"},{"type":"link","attrs":{"href":"https:\/\/github.com\/vishalsinha27\/kmaxwell","title":"","type":null},"content":[{"type":"text","text":"GitHub项目"}]},{"type":"text","text":"获取。下载源码并参考README文档以了解如何运行。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"最终测试"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最后,我们的环境搭建终于完成了。登录MySQL数据库并运行任意的插入、删除或更新命令。如果环境搭建正确的话,将会在mongodb auditlog数据库中看到相应的条目。我们可以愉快地开始进行审计了!"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"结论"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在本文中所描述的系统在实际部署中能够很好地运行,为我们提供了一个用户数据之外的额外数据源,但是在采用这种架构之前,有些权衡你必须要注意。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"基础设施成本:要运行这种环境,需要额外的基础设施。数据要经历网络上的多次跳转,从数据库到Kafka,再到另外一个数据库,后面可能还会到一个备份中。这会增加基础设施的成本。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"因为数据要经历多次跳转,审计日志无法以实时的形式进行维护。它可能会延迟几秒到几分钟。我们可能会反问“谁能需要实时的审计日志呢?”但是,如果你计划使用这种数据进行实时监控的话,必须要考虑到这一点。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"在这个架构中,我们捕获了数据的变化,而不是谁改变了数据。如果你还关心哪个数据库用户改变了数据的话,那么这种设计就不能提供直接的支持了。"}]}]}]},{"type":"paragrap
h","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在强调完这种架构的一些权衡之后,我想重申一下这种环境的收益,它的主要好处在于:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"这种环境减少了数据库在审计日志方面的性能损耗,并且满足传统数据源在市场营销和告警方面的需要。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"易于搭建,并且比较健壮:环境中任意组件的任意问题都不会造成数据的丢失。例如,如果MD出现故障的话,数据依然会保存在bin日志文件中,当daemon下次运行的时候,能够从上次处理的地方继续读取。如果Kafka代理出现故障的话,MD能够探测到并且会停止从bin日志中读取数据。如果Kafka消费者崩溃的话,数据会依然保留在Kafka代理中。所以,在最糟糕的情况下,审计日志会延迟但是不会出现数据丢失。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"环境搭建过程非常简单,并不需要耗费太多的开发精力。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"作者简介"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Vishal Sinha 
是一位充满激情的技术专家,对分布式计算和大型可扩展系统有着专业的知识和浓厚的兴趣。目前,他在一家领先的印度独角兽公司担任技术总监。在16年的软件行业生涯中,他曾在多家跨国公司和创业公司工作,开发过各种大规模的系统,并领导过一个由众多软件工程师组成的团队。他喜欢解决复杂的问题和尝试新技术。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"原文链接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/articles\/database-audit-system-kafka\/","title":"","type":null},"content":[{"type":"text","text":"Building a SQL Database Audit System using Kafka, MongoDB and Maxwell's Daemon"}]}]}]}