How to Build a SQL Database Audit System Using Kafka, MongoDB, and Maxwell’s Daemon

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Key Takeaways"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Audit logging systems have many uses beyond storing data for audit purposes. Besides compliance and security, marketing teams can use them to target users, and they can be used to generate critical alerts."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A database's built-in audit logging may not be sufficient, and it is certainly not the ideal way to handle every use case."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many open source tools, such as "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/debezium.io\/","title":"","type":null},"content":[{"type":"text","text":"Debezium"}]},{"type":"text","text":", can support these needs with minimal infrastructure and setup time."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Maxwell’s Daemon reads the SQL binlog and sends events to various producers, such as "},{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Kafka"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/kinesis\/","title":"","type":null},"content":[{"type":"text","text":"Amazon 
Kinesis"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sqs\/","title":"","type":null},"content":[{"type":"text","text":"SQS"}]},{"type":"text","text":", and "},{"type":"link","attrs":{"href":"https:\/\/www.rabbitmq.com\/","title":"","type":null},"content":[{"type":"text","text":"Rabbit MQ"}]},{"type":"text","text":"."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The binlog generated by the SQL database must be in ROW-based format for the whole setup to work."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Suppose you use a relational database to maintain transactional data and need to store an audit trail for some of that data, which itself lives in tables. If you are like most developers, you will probably end up with one of the following approaches:"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Use the Database's Audit Logging Feature"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most databases provide plugins that support audit logging. These plugins are easy to install and configure to log data. However, this approach has the following problems:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Full-featured audit log plugins are generally available only in enterprise editions; community editions may lack them. In MySQL, for example, the "},{"type":"link","attrs":{"href":"https:\/\/dev.mysql.com\/doc\/refman\/5.7\/en\/audit-log.html","title":"","type":null},"content":[{"type":"text","text":"audit log plugin"}]},{"type":"text","text":" is available only in the Enterprise Edition. Note that MySQL Community Edition users can still install alternative audit log plugins from MariaDB or Percona to work around this limitation."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Database-level audit logging adds 10-20% overhead on the database server, as discussed in "},{"type":"link","attrs":{"href":"http:\/\/blog.syme
dia.pl\/2016\/10\/performance-impact-general-slow-query-log.html","title":"","type":null},"content":[{"type":"text","text":"this article"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/www.percona.com\/blog\/2009\/02\/10\/impact-of-logging-on-mysql%E2%80%99s-performance","title":"","type":null},"content":[{"type":"text","text":"this one"}]},{"type":"text","text":". On heavily loaded systems, we may therefore want to enable audit logging only for slow queries rather than for all queries."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Audit logs are written to log files, where the data is not easy to search. For data analysis and auditing, we may want the audit data in a searchable format."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Large audit archive files consume valuable database storage, because they are stored on the same server as the database."}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Make the Application Responsible for Audit Logging"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To do this, you can take one of the following approaches:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"a. Before updating existing data, copy it to another table, then update the data in the current table."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"b. Add a version number to the data, so that every update inserts a row with an incremented version number."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"c. Write to two database tables, one holding the latest data and the other holding the audit trail."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"
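content":[{"type":"text","text":"As a principle of designing scalable systems, we should avoid writing the same data multiple times, because doing so not only degrades performance but also causes all kinds of data synchronization problems."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Why Do Enterprises Need Audit Data?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before describing the architecture of the audit logging system, let's look at some of the requirements organizations have for such a system."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compliance and auditing: Auditors need data presented in a way that is meaningful and relevant from their perspective. Database audit logs suit DBA teams, but not auditors."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A basic requirement for any large piece of software is the ability to generate critical alerts when a security breach occurs. Audit logs can be used to do this."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You must be able to answer questions such as who accessed the data, what state the data was in beforehand, what was changed in an update, and whether internal users abused their privileges."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is also important to note that, because audit trails help identify infiltrators, they strengthen deterrence against “insiders”. People who know their actions are being scrutinized are less likely to access unauthorized databases or tamper with specific data."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All industries, from finance and energy to food service and public works, need to analyze data access and regularly submit detailed reports to various government agencies. The Health Insurance Portability and Accountability Act (HIPAA) requires healthcare providers to deliver audit trails covering everyone who touches their data records, down to the row and record level. The new European Union General Data Protection Regulation (GDPR) has similar requirements. Another example is SOX (Sarbanes-Oxley 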
Act), which imposes broad accounting regulations on public companies. These organizations need to analyze data access regularly and produce detailed reports."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this article, I will present a scalable solution for managing audit trail data using technologies such as Maxwell’s Daemon and Kafka."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Problem Statement"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Build an audit system that is independent of the application and the data model. The system must be scalable and cost-effective."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Architecture"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Important note: This system works only with MySQL databases that use the ROW-based "},{"type":"link","attrs":{"href":"https:\/\/dev.mysql.com\/doc\/refman\/5.7\/en\/binary-log-formats.html","title":"","type":null},"content":[{"type":"text","text":"binlog format"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before we get into the details of the solution, let's take a quick look at each of the technologies discussed in this article."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"text","text":" (MD) is an open source project from "},{"type":"link","attrs":{"href":"https:\/\/www.zendesk.com\/","title":"","type":null},"content":[{"type":"text","text":"Zendesk"}]},{"type":"text","text":" that reads the MySQL 
binlog and writes ROW updates as JSON to Kafka, Kinesis, or other streaming platforms. Maxwell has low operational overhead, requiring nothing but MySQL and a place to write to, as stated on the "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/","title":"","type":null},"content":[{"type":"text","text":"project website"}]},{"type":"text","text":". In short, MD is a Change-Data-Capture (CDC) tool."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many CDC variants are available, such as Red Hat's Debezium, Netflix's DBLog, and LinkedIn's Brooklin. This setup could be built with any of these tools. However, Netflix's DBLog and LinkedIn's Brooklin were developed for different use cases, as their respective project pages explain. Debezium, on the other hand, is very similar to MD and could replace MD in our architecture. Here are a few things to consider when choosing between MD and Debezium."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Debezium can write data only to Kafka, or at least that is the main producer it supports. MD supports a variety of producers, including Kafka. The producers MD supports are Kafka, "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/kinesis\/","title":"","type":null},"content":[{"type":"text","text":"Kinesis"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/cloud.google.com\/pubsub\/docs\/overview","title":"","type":null},"content":[{"type":"text","text":"Google Cloud Pub\/Sub"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/sqs\/","title":"","type":null},"content":[{"type":"text","text":"SQS"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.rabbitmq.com\/","title":"","type":null},"content":[{"type":"text","text":"Rabbit 
MQ"}]},{"type":"text","text":", and Redis."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MD lets you write and configure your own producer. See the "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/producers\/","title":"","type":null},"content":[{"type":"text","text":"documentation"}]},{"type":"text","text":" for details."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Debezium's advantage is that it can read change data from multiple sources, such as "},{"type":"link","attrs":{"href":"https:\/\/www.mysql.com\/","title":"","type":null},"content":[{"type":"text","text":"MySQL"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.mongodb.com\/","title":"","type":null},"content":[{"type":"text","text":"MongoDB"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.postgresql.org\/","title":"","type":null},"content":[{"type":"text","text":"PostgreSQL"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.microsoft.com\/en-in\/sql-server\/","title":"","type":null},"content":[{"type":"text","text":"SQL 
Server"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"http:\/\/cassandra.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Cassandra"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.ibm.com\/in-en\/products\/db2-database","title":"","type":null},"content":[{"type":"text","text":"DB2"}]},{"type":"text","text":", and "},{"type":"link","attrs":{"href":"https:\/\/www.oracle.com\/index.html","title":"","type":null},"content":[{"type":"text","text":"Oracle"}]},{"type":"text","text":". The Debezium maintainers are very active in adding new data sources, whereas MD currently supports only MySQL as a data source."}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Apache Kafka"}]},{"type":"text","text":" is an open source distributed event streaming platform used for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"MongoDB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.mongodb.com\/","title":"","type":null},"content":[{"type":"text","text":"MongoDB"}]},{"type":"text","text":" is a general-purpose, document-based, distributed database built for modern application developers and for the cloud era. We use MongoDB only for illustration; you could choose other options such as "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/s3\/","title":"","type":null},"content":[{"type":"text","text":"S3"}]},{"type":"text","text":", or a database suited to time-series data, such as "},{"type":"link","attrs":{"href":"https:\/\/www.influxdata.com\/","title":"","type":null},"content":[{"type":"text","text":"InfluxDB"}]},{"type":"text","text":" or "},{"type":"link","attrs":{"href":"http:\/\/cassandra.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"Cassandra"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","a
ttrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following figure shows the data flow of the audit trail solution."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/res.infoq.com\/articles\/database-audit-system-kafka\/en\/resources\/1Figure-1-Data-flow-diagram-1609154417022.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Figure 1: Data flow diagram"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The audit trail management system involves the following steps."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The application performs a database write, update, or delete."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The SQL database generates binlog entries for these operations in ROW format. This is a SQL database configuration setting."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Maxwell’s Daemon polls the SQL 
binlog, reads new entries, and writes them to a Kafka topic."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"The consumer application polls the Kafka topic to read and process the data."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"The consumer writes the processed data to a new data store."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Environment Setup"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To keep the setup simple, we use Docker containers wherever possible. If Docker is not yet installed on your machine, consider installing "},{"type":"link","attrs":{"href":"https:\/\/www.docker.com\/products\/docker-desktop","title":"","type":null},"content":[{"type":"text","text":"Docker Desktop"}]},{"type":"text","text":"."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"MySQL Database"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Run a MySQL server locally. The following command starts a MySQL container on port 3307."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker run -p 3307:3306 -p 33061:33060 --name=mysql83 -d mysql\/mysql-server:latest\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. On a fresh installation, we do not know the root password. Run the following command to print it to the console."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker logs mysql83 2>&1 | grep 
GENERATED\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. If needed, log in to the container and change the password."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec -it mysql83 mysql -uroot -p\nalter user 'root'@'localhost' IDENTIFIED BY 'abcd1234';\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. For security reasons, the MySQL Docker container does not allow connections from external applications by default. We need to run the following command to change that."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"update mysql.user set host = '%' where user='root';\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"5. Exit the mysql prompt and restart the Docker container."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker container restart mysql83\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"6. Log back in to the mysql client and run the following commands to create a user for Maxwell’s Daemon. For details on this step, see the "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/quickstart\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon quick start guide"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec 
-it mysql83 mysql -uroot -p\nset global binlog_format=ROW;\nset global binlog_row_image=FULL;\nCREATE USER 'maxwell'@'%' IDENTIFIED BY 'pmaxwell';\nGRANT ALL ON maxwell.* TO 'maxwell'@'%';\nGRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%';\nCREATE USER 'maxwell'@'localhost' IDENTIFIED BY 'pmaxwell';\nGRANT ALL ON maxwell.* TO 'maxwell'@'localhost';\nGRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'localhost';\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka Broker"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Setting up Kafka is straightforward. Download Kafka from "},{"type":"link","attrs":{"href":"https:\/\/www.apache.org\/dyn\/closer.cgi?path=\/kafka\/2.6.0\/kafka_2.13-2.6.0.tgz","title":"","type":null},"content":[{"type":"text","text":"this link"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Run the following commands:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Extract Kafka"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"tar -xzf kafka_2.13-2.6.0.tgz\ncd kafka_2.13-2.6.0\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Start ZooKeeper, which Kafka 2.6.0 requires"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/zookeeper-server-start.sh 
config\/zookeeper.properties\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Start Kafka in a separate terminal"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-server-start.sh config\/server.properties\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Create the topic in a separate terminal"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-topics.sh --create --topic maxwell-events --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The commands above start a Kafka broker and create a topic named “"},{"type":"text","marks":[{"type":"strong"}],"text":"maxwell-events"},{"type":"text","text":"” in it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To push messages to this Kafka topic, we can run the following command in a new terminal"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-producer.sh --topic maxwell-events --broker-list 
localhost:9092\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The command above gives us a prompt where we can type a message and press Enter to send it to Kafka."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consume messages from the Kafka topic"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-consumer.sh --topic maxwell-events --from-beginning --bootstrap-server localhost:9092\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Maxwell’s Daemon"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Download Maxwell’s Daemon from "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/quickstart\/#download","title":"","type":null},"content":[{"type":"text","text":"this address"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Extract it and run the following command."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/maxwell --user=maxwell --password=pmaxwell --host=localhost --port=3307 --producer=kafka --kafka.bootstrap.servers=localhost:9092 
--kafka_topic=maxwell-events\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This sets up Maxwell to monitor the binlog of the database we set up earlier. We can, of course, monitor only a few databases or a few tables. For more information, see the "},{"type":"link","attrs":{"href":"https:\/\/maxwells-daemon.io\/bootstrapping\/","title":"","type":null},"content":[{"type":"text","text":"Maxwell’s Daemon configuration"}]},{"type":"text","text":" documentation."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Testing the Setup"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To test whether the setup works correctly, we can connect to MySQL and insert some data into a table."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker exec -it mysql83 mysql -uroot -p\n\nCREATE DATABASE maxwelltest;\n\nUSE maxwelltest;\n\nCREATE TABLE Persons (\n PersonId int NOT NULL AUTO_INCREMENT,\n LastName varchar(255),\n FirstName varchar(255),\n City varchar(255),\n primary key (PersonId)\n\n);\n\nINSERT INTO Persons (LastName, FirstName, City) VALUES ('Erichsen', 'Tom', 'Stavanger');\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now, in another terminal, run the following command:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"bin\/kafka-console-consumer.sh --topic maxwell-events --from-beginning --bootstrap-server 
localhost:9092\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the terminal, you should see output like the following:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"{\"database\":\"maxwelltest\",\"table\":\"Persons\",\"type\":\"insert\",\"ts\":1602904030,\"xid\":17358,\"commit\":true,\"data\":{\"PersonId\":1,\"LastName\":\"Erichsen\",\"FirstName\":\"Tom\",\"City\":\"Stavanger\"}}\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As we can see, Maxwell’s Daemon captured the database insert event and wrote a JSON string containing the event details to the Kafka topic."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Setting Up MongoDB"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To run MongoDB locally, run the following command:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"docker run --name mongolocal -p 27017:27017 
mongo:latest\n"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kafka Consumer"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Kafka consumer code is available in this "},{"type":"link","attrs":{"href":"https:\/\/github.com\/vishalsinha27\/kmaxwell","title":"","type":null},"content":[{"type":"text","text":"GitHub project"}]},{"type":"text","text":". Download the source and follow the README to run it."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Final Test"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, our setup is complete. Log in to the MySQL database and run any insert, delete, or update command. If the setup is correct, you will see corresponding entries in the mongodb auditlog database. Happy auditing!"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The system described in this article works well in real deployments and gives us an additional data source beyond user data, but there are some trade-offs you should be aware of before adopting this architecture."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Infrastructure cost: Running this setup requires additional infrastructure. Data makes multiple hops across the network, from the database to Kafka, then to another database, and possibly later to a backup. This adds to infrastructure cost."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Because the data makes multiple hops, the audit log cannot be maintained in real time. It may lag by a few seconds to a few minutes. One could ask, “Who needs audit logs in real time?” But if you plan to use this data for real-time monitoring, you must take this into account."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"In this architecture, we capture the changes to the data, not who changed it. If you also care about which database user changed the data, this design does not support that directly."}]}]}]},{"type":"paragrap
h","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Having highlighted some trade-offs of this architecture, I would like to restate the benefits of this setup. Its main benefits are:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This setup reduces the performance overhead that audit logging places on the database, and it serves as a data source for marketing and alerting needs."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Easy to set up and robust: A problem in any component of the setup does not cause data loss. For example, if MD fails, the data remains in the binlog files, and the daemon resumes reading from where it left off the next time it runs. If the Kafka broker fails, MD detects it and stops reading from the binlog. If the Kafka consumer crashes, the data remains in the Kafka broker. So, in the worst case, the audit log is delayed but no data is lost."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The setup process is simple and does not require much development effort."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the Author"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Vishal Sinha 
is a passionate technologist with expertise and a strong interest in distributed computing and large scalable systems. He currently works as a director of technology at a leading Indian unicorn. Over 16 years in the software industry, he has worked at several multinational companies and startups, built various large-scale systems, and led a team of many software engineers. He enjoys solving complex problems and experimenting with new technologies."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/articles\/database-audit-system-kafka\/","title":"","type":null},"content":[{"type":"text","text":"Building a SQL Database Audit System using Kafka, MongoDB and Maxwell's Daemon"}]}]}]}