A Deep Dive into the HDFS (Hadoop Distributed File System) Data Read Flow

## 1. Introduction

HDFS (the Hadoop Distributed File System) is the foundational data storage layer of the Hadoop big-data ecosystem. Because it combines distributed storage of massive datasets with the capacity to serve high-throughput data to all kinds of batch-processing workloads, its overall complexity is far higher than that of most other storage systems.

Studying HDFS in depth, including its architecture, read/write flows, partitioning model, high-availability design, and data storage planning, is therefore of great value when learning big-data technology, and it pays off especially when you have to develop against a production environment.

This article approaches the subject from the angle of a client reading data from HDFS. By tracing the Hadoop source code, we will peel back the layers one at a time until the internals of the read flow become clear.

## 2. Overall Architecture of the HDFS Read Flow

![HDFS data access: overall architecture](https://static001.geekbang.org/infoq/a2/a29439d647f693a701769d73f0d1e98a.png)

The figure above shows a simplified overall flow of a client accessing HDFS data.

(1) The client sends the HDFS namenode a data-access request for a file path (Path).

(2) The namenode collects the location information of every block of that file and, ordered by each block's position within the file, assembles them into a set of located blocks, which it returns to the client.

(3) With the located blocks in hand, the client creates an HDFS input stream, locates the first block, and reads its data stream from a datanode. It then uses the current read offset to locate the next block on its datanode, creates a new block read stream, and so on until the entire HDFS file has been read.
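The offset-to-block mapping used in step (3) can be sketched with a small, self-contained model. This is not Hadoop's actual API; the `Block` record and `blockIndexFor` method below are invented for illustration. Given located blocks ordered by their start offset in the file, it finds the block that covers a given read offset:

```java
import java.util.List;

public class LocatedBlocksModel {
    // Simplified stand-in for a located block: the block's start offset
    // within the file and its length in bytes.
    record Block(long startOffset, long length) {
        boolean contains(long offset) {
            return offset >= startOffset && offset < startOffset + length;
        }
    }

    // Walk the ordered block list to find the block covering the offset,
    // the way the client advances block by block as the read progresses.
    static int blockIndexFor(List<Block> blocks, long offset) {
        for (int i = 0; i < blocks.size(); i++) {
            if (blocks.get(i).contains(offset)) return i;
        }
        throw new IllegalArgumentException("offset beyond end of file: " + offset);
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // a common HDFS block size
        // Two full blocks followed by a short 10-byte tail block.
        List<Block> blocks = List.of(
            new Block(0, blockSize),
            new Block(blockSize, blockSize),
            new Block(2 * blockSize, 10));
        System.out.println(blockIndexFor(blocks, 0));                  // 0
        System.out.println(blockIndexFor(blocks, blockSize));          // 1
        System.out.println(blockIndexFor(blocks, 2 * blockSize + 5));  // 2
    }
}
```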
## 3. Hadoop Source Code Analysis

With the overall picture from the brief description above in place, this section digs into HDFS's internal data-access mechanism by tracing the source code.

**(I) How the namenode proxy is created**

Why start with the creation of the namenode proxy? Because once the back-and-forth between the client and the namenode is clear, the data-retrieval process that follows becomes much easier to trace.

(1) Let's start with a configuration item in hdfs-site.xml:

```xml
<property>
    <name>dfs.client.failover.proxy.provider.fszx</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

This configuration designates ConfiguredFailoverProxyProvider as the provider of the namenode proxy. What is a namenode proxy? In essence, it is the client-side network communication object for the namenode service, through which the client and the namenode talk to each other.

(2) Next, look at the inheritance hierarchy of ConfiguredFailoverProxyProvider:

![ConfiguredFailoverProxyProvider inheritance diagram](https://static001.geekbang.org/infoq/02/02946eeea933fbff9bc896ba1b7566c1.png)

At the top of the hierarchy sits the FailoverProxyProvider interface, which declares:

```java
/**
 * Get the proxy object which should be used until the next failover event
 * occurs.
 * @return the proxy object to invoke methods upon
 */
public ProxyInfo getProxy();
```

The ProxyInfo returned by this method is the namenode proxy object. The full process by which the client obtains the ProxyInfo is quite involved and even relies on dynamic proxies, but in essence this interface is how the namenode proxy is obtained.
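The failover idea behind a proxy provider can be illustrated with a small sketch built on java.lang.reflect.Proxy. This is not Hadoop's implementation; the `NameNodeProtocol` interface and `failoverProxy` method below are invented for illustration. The invocation handler retries a failed call against the next target, which is the essence of failing over from a standby namenode to the active one:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

public class FailoverSketch {
    // Illustrative stand-in for the namenode's client-facing protocol.
    interface NameNodeProtocol {
        String getBlockLocations(String path);
    }

    // Wraps an ordered list of targets; when a call fails, it fails over
    // to the next target and retries, up to one full pass over the list.
    static NameNodeProtocol failoverProxy(List<NameNodeProtocol> targets) {
        InvocationHandler handler = new InvocationHandler() {
            private int current = 0;

            @Override
            public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
                for (int attempt = 0; attempt < targets.size(); attempt++) {
                    try {
                        return m.invoke(targets.get(current), args);
                    } catch (Exception e) {
                        current = (current + 1) % targets.size(); // fail over
                    }
                }
                throw new IllegalStateException("all namenodes failed");
            }
        };
        return (NameNodeProtocol) Proxy.newProxyInstance(
            NameNodeProtocol.class.getClassLoader(),
            new Class<?>[]{NameNodeProtocol.class}, handler);
    }

    public static void main(String[] args) {
        NameNodeProtocol standby = path -> { throw new RuntimeException("standby"); };
        NameNodeProtocol active = path -> "blocks-of:" + path;
        NameNodeProtocol nn = failoverProxy(List.of(standby, active));
        // The first target throws, so the call is transparently retried
        // against the second one.
        System.out.println(nn.getBlockLocations("/data/file")); // blocks-of:/data/file
    }
}
```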
(3) At this point the class relationships have evolved into the following:

![Class relationships during namenode proxy creation](https://static001.geekbang.org/infoq/1c/1cfd58ea29192a69ff092a46a1207c3f.png)

In the figure, ProxyInfo is the namenode proxy class, and its subclass NNProxyInfo is the specialization for high availability.

(4) So after all this effort to understand the namenode proxy, where does it actually come into play?

This is where an extremely important object, DFSClient, enters the picture: it is the starting point for every input/output stream a client opens against HDFS, as shown below:

![Class relationships during DFSClient initialization](https://static001.geekbang.org/infoq/6a/6aa0c2ef2c01321a146e7515707a443a.png)

In the figure, solid lines represent actual call paths and dashed lines indirect relationships between objects. DFSClient is the key player: it is initialized by the distributed file system object (DistributedFileSystem), and during initialization it invokes NameNodeProxiesClient and a series of related operations to create the high-availability NNProxyInfo object, i.e. the namenode proxy. That proxy ultimately becomes a member of the DFSClient object and is used later when data streams are created.

**(II) A deeper source dive into the file read stream**

(1) As before, we first need an entry point. Consider a simple scenario that downloads a file from HDFS to the local file system:

```java
……
// Open the HDFS file input stream
input = fileSystem.open(new Path(hdfs_file_path));
// Create the local file output stream
output = new FileOutputStream(local_file_path);
// Copy the stream byte by byte with the IOUtils helper
IOUtils.copyBytes(input, output, 4096, true);
……
```

Now look at the stream copy method inside IOUtils:

```java
/**
 * Copies from one stream to another.
 *
 * @param in InputStream to read from
 * @param out OutputStream to write to
 * @param buffSize the size of the buffer
 */
public static void copyBytes(InputStream in, OutputStream out, int buffSize)
    throws IOException {
  PrintStream ps = out instanceof PrintStream ? (PrintStream)out : null;
  byte buf[] = new byte[buffSize];
  int bytesRead = in.read(buf);
  while (bytesRead >= 0) {
    out.write(buf, 0, bytesRead);
    if ((ps != null) && ps.checkError()) {
      throw new IOException("Unable to write to output stream.");
    }
    bytesRead = in.read(buf);
  }
}
```

This is a standard loop that reads from the HDFS InputStream and writes the bytes out to the local file OutputStream. Our goal is to dig into how that HDFS InputStream is created and used.

(2) Next, let's analyze how the InputStream is produced, as shown below:

![Class relationships when opening the InputStream](https://static001.geekbang.org/infoq/ec/ec8a36797060fa31ba13f7fc985e9d40.png)

Solid lines represent actual call paths and dashed lines indirect relationships between objects. The code involved is extremely complex internally; this diagram simplifies it as much as possible so the principle can be grasped quickly.

Let me walk through the process briefly:
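The copy loop above is easy to reproduce with plain JDK streams. Here is a minimal, self-contained version of the same read-until-EOF pattern (omitting the PrintStream special case):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyLoop {
    // Same pattern as IOUtils.copyBytes: read() returns -1 at end of
    // stream, and each write emits exactly the bytes just read.
    static long copyBytes(InputStream in, OutputStream out, int buffSize) throws IOException {
        byte[] buf = new byte[buffSize];
        long total = 0;
        int bytesRead = in.read(buf);
        while (bytesRead >= 0) {
            out.write(buf, 0, bytesRead);
            total += bytesRead;
            bytesRead = in.read(buf);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A 4-byte buffer forces several loop iterations.
        long n = copyBytes(new ByteArrayInputStream(data), out, 4);
        System.out.println(n);              // 10
        System.out.println(out.toString()); // hello hdfs
    }
}
```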
The first step: DistributedFileSystem calls the open method of the DFSClient object, which creates a DFSInputStream object. DFSInputStream holds the real logic for reading data blocks (LocatedBlock) and interacting with datanodes; it is the true core class.

The second step: while creating the DFSInputStream, DFSClient must pass in the block collection (LocatedBlocks) obtained by calling through the namenode proxy.

The third step: DFSClient creates a decorator class, HdfsDataInputStream, which wraps the DFSInputStream. The decorator's parent class, FSDataInputStream, is what is ultimately returned to DistributedFileSystem and used by the client developer.
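The decorator relationship in the third step can be sketched in miniature. The `CountingStream` class below is invented for illustration; the point is that the outer stream simply delegates to the wrapped core stream while adding behavior on top, the way FSDataInputStream wraps DFSInputStream without reimplementing the block-reading logic:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DecoratorSketch {
    // Illustrative decorator: adds a byte counter on top of the wrapped
    // stream, leaving the actual reading to the inner stream.
    static class CountingStream extends FilterInputStream {
        long bytesRead = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read(); // delegate to the wrapped stream
            if (b >= 0) bytesRead++;
            return b;
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream core = new ByteArrayInputStream(new byte[]{1, 2, 3});
        CountingStream decorated = new CountingStream(core);
        while (decorated.read() >= 0) { /* drain the stream */ }
        System.out.println(decorated.bytesRead); // 3
    }
}
```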
(3) Finally, let's go deeper into the block-reading mechanism itself, as shown below:

![Class relationships in the DFSInputStream read flow](https://static001.geekbang.org/infoq/c1/c1e86a454da4a48215a319d3523b4486.png)

Solid lines represent actual call paths and dashed lines indirect relationships between objects. The real code logic is fairly complex; the diagram is again simplified as much as possible for easier understanding.

As before, a brief walkthrough of the process:

The first step: the FSDataInputStream decorator receives the client's read call and invokes the read(...) method of the DFSInputStream object.

The second step: DFSInputStream calls its own blockSeekTo(long offset) method. Based on the data offset, it determines whether a new data block (LocatedBlock) must be read; once the new block has been found in the block collection (LocatedBlocks), it looks for the best datanode, following Hadoop's locality principle: first check whether the local node holds a replica, and otherwise pick the nearest replica by network distance.

The third step: a BlockReader object is built for the block on the chosen replica; this is the object that actually reads the block data. BlockReader has several implementations, and BlockReaderFactory.build selects the optimal one for the circumstances. BlockReaderLocal and BlockReaderLocalLegacy (based on HDFS-2246) are the preferred choices: these are the short-circuit block readers, which amount to reading straight from the local file system. If short-circuit reads are unavailable for security or other reasons, the client tries the UNIX domain socket optimization, and only then falls back to BlockReaderRemote, which opens a TCP socket connection. The inner workings of BlockReader are well worth a deep dive of their own; I plan to devote a dedicated article to them next time.

## 4. Closing

Thank you very much for reading to the end. In my next piece I will publish a deep-dive analysis of the "[HDFS data write flow](https://xie.infoq.cn/article/4438a7b40a9d832e9eb7c4e67)". I hope you will follow along.

**Author: Fang Shun, founder of Xi'an Shouhushi Information Technology, dedicated to helping IT engineers advance in the big-data field**

[Visit my Zhihu column for more big-data knowledge](https://www.zhihu.com/column/c_151487501)
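The replica selection in the second step can be sketched as follows. This is a simplified model, not Hadoop's actual code: each replica carries a network distance from the client (0 = same node, 2 = same rack, 4 = different rack, mirroring Hadoop's usual distance convention), and the reader picks the replica with the smallest distance, which subsumes the "local node first" rule:

```java
import java.util.Comparator;
import java.util.List;

public class ReplicaChooser {
    // A replica location with its network distance from the client.
    record Replica(String datanode, int distance) {}

    // Pick the nearest replica: a local replica (distance 0) always wins,
    // then a same-rack one, then a remote-rack one.
    static Replica choose(List<Replica> replicas) {
        return replicas.stream()
            .min(Comparator.comparingInt(Replica::distance))
            .orElseThrow(() -> new IllegalStateException("no replicas"));
    }

    public static void main(String[] args) {
        List<Replica> replicas = List.of(
            new Replica("dn-remote-rack", 4),
            new Replica("dn-same-rack", 2),
            new Replica("dn-local", 0));
        System.out.println(choose(replicas).datanode()); // dn-local
    }
}
```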