Security Series: An Introduction to Mainstream Hash Algorithms and Their Use

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Everyone living in society carries some marker that distinguishes them from everyone else. A name usually suffices, but a name cannot fully identify a person, because many people share the same name. An ID number or a fingerprint identifies one person uniquely."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the internet it is just as important to have a token that identifies one thing uniquely. When we download a file, it passes through many servers and routers along the way. How do we know that nothing was lost in transit and the file arrived complete? We cannot compare the file byte by byte against the original, and file name and file size are trivially easy to fake. What we need is a fingerprint-like marker for checking the file's reliability, and that fingerprint is produced by a hash algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/16/1645db008af11482a3b91e92be9a6f96.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the MySQL download page shows an MD5 hash at the lower right of each package entry. What is it for? It lets us verify that the package we downloaded is complete. After the download finishes, we can compute the hash of the downloaded package, for example with the "},{"type":"codeinline","content":[{"type":"text","text":"md5sum"}]},{"type":"text","text":" command on Linux, and compare it with the value published on the site. If the two match, the file was downloaded intact."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data integrity means that data was neither tampered with nor partially lost in transit: if the receiver gets exactly what the sender sent, the data is intact. Integrity is usually checked with a hash function, whose main job is exactly this verification. The value a hash function produces is called a hash value, and it is often called the fingerprint of the data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. Introduction to Hash Algorithms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In short, hashing converts input text into an irreversible digest string of fixed length (also called a message digest)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Encryption, by contrast, converts input text into reversible ciphertext whose length varies with the input. Strictly speaking, a hash algorithm is not an encryption algorithm; it is a separate kind of algorithm that sits alongside encryption."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Where there is encryption there is decryption, but a hash algorithm is "},{"type":"text","marks":[{"type":"strong"}],"text":"irreversible"},{"type":"text","text":", so it does not count as encryption. Irreversible here means two things: the original text cannot be recovered from the result, and two inputs that produce the same result are not necessarily the same input. The reason is that the domain of a hash function is an infinite set while its range is finite. Mapping an infinite set onto a finite one means every hash value corresponds to infinitely many possible inputs, so hashing is a many-to-one mapping and has no inverse. For an encryption algorithm under a fixed key, by contrast, the ciphertext determines the plaintext: the mapping is one-to-one, and a one-to-one mapping is invertible in principle."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Common hash algorithms include MD5 and SHA-1, along with keyed constructions built on them such as HMAC-MD5 and HMAC-SHA1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. Properties of Hash Algorithms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A good hash algorithm has several important properties:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"conten
t":[{"type":"text","text":"1. Fixed length. A hash function accepts input of any size and outputs a hash value of fixed length. With MD5, for example, the hash value is always 128 bits no matter how large the input is."}]}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Avalanche effect. Changing even a single byte of the input produces a drastically different hash value."}]}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. One-way. The hash value can be computed from the input, but the input cannot be computed from the hash value. This is why hashing is not encryption: encryption is reversible, hashing is not."}]}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. Collision resistance. It should be practically impossible to find a second input with the same hash value as a given input, which is what lets a hash value stand in for the data itself. For an ideal 128-bit hash, which is what MD5 was designed to be, the chance that two given random inputs collide is 1 in 2^128. That probability is small enough to ignore for accidental collisions, which is why MD5 values are used in practice to identify data. Deliberately constructed collisions are a different matter, discussed in section 5."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. Where Hash Algorithms Are Used"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.1 File Transfer"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In file transfer, a hash is a short marker that identifies a file. The marker depends on every byte of the file, and no practical way is known to work backwards from it. When the file changes, its hash changes too, telling the user that this is no longer the file they wanted."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this scenario speed matters more than collision resistance, because with a large file the time spent computing the hash dominates."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.2 Message Digests"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In cryptography, hash algorithms are mainly used to produce a message digest, which verifies the integrity of a whole message. Password handling is a related example. We enter a password every time we log in to Bilibili; does Bilibili's database store that password in plaintext? If it did, the site's DBAs could read everyone's password, which is clearly unsafe. And if the account name and password also travelled over the network in plaintext at registration and login, they could be intercepted. Both are serious security problems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Systems generally do not store passwords in plaintext. In the scheme described here, at registration the client computes a hash of the plaintext password before submitting it and sends the hash to the backend, which records that hash in the database. At login the client hashes the password with the same algorithm and sends the result; the backend compares it with the stored hash, and if they match the login succeeds. The plaintext never leaves the client, so neither network eavesdroppers nor the company's DBAs see it, and only the user knows it. Note that a hash sent this way effectively becomes the password as far as the server is concerned, so the connection still needs protecting (for example with TLS)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These use cases demand strong resistance to collisions and tampering, with speed a secondary concern. A well-designed hash algorithm is highly collision resistant. MD5, for example, has a 128-bit output, so the chance that two given random inputs produce the same value is 1 in 2^128."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3.3 Data Structures"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data structures organised by hash care about speed more than collision resistance; it is enough for the hash values to be evenly distributed. In a HashMap, for example, the hash of the key exists to speed up the lookup of key-value pairs: it decides which bucket an entry goes into, so collision resistance matters much less. In other words, the hash only needs to spread entries roughly evenly across the buckets. The performance of every put, however, depends directly on how fast the hash is computed, so hash speed is especially important here:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"/**\n * hash() from HashMap\n */\nstatic final int hash(Object key) {\n    int h;\n    // XOR the hashCode with its own upper 16 bits (unsigned shift) so the high bits affect the bucket index\n    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);\n}\n\n/**\n * hashCode() from String\n */\npublic int hashCode() {\n    int h = hash;\n    // hash defaults to 0, meaning not yet computed\n    if (h == 0 && value.length > 0) {\n        // value: the char array backing the String\n        char val[] = value;\n        for (int i = 0; i < value.length; i++) {\n            h = 31 * h + val[i];\n        }\n        hash = h;\n    }\n    return h;\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is a very compact multiply-and-add iteration; many other hash algorithms iterate with XOR and addition instead."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4. Using Hash Algorithms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4.1. The MD5 Algorithm"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MD5 (Message-Digest Algorithm 5) is a hash function widely used in computer security to protect message integrity. It maps data of any kind (Chinese text, for instance) to a value of fixed length, which is the basic principle of every hash algorithm. Its predecessors include MD2, MD3 and MD4."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MD5 has the following characteristics:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Compression: data of any length yields an MD5 value of fixed length."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Easy to compute: calculating the MD5 value of the input is cheap."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. Sensitivity to modification: any change to the input, even a single byte, yields a very different MD5 value."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. Collision resistance (a design goal): given some data and its MD5 value, finding other data with the same MD5 value (that is, forging the data) was intended to be very hard. As section 5 explains, practical collision attacks on MD5 now exist."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Typical uses of MD5:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Consistency checks"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Digital signatures"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. Secure access authentication"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MD5 test code:"}]},{"type":"paragraph","attrs":{"indent":0,"numb
er":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"package com.wuxiaolong.EncrypteDecrypt;\n\nimport java.io.File;\nimport java.io.FileInputStream;\nimport java.io.IOException;\nimport java.io.UnsupportedEncodingException;\nimport java.security.MessageDigest;\nimport java.security.NoSuchAlgorithmException;\n\n/**\n * Description:\n *\n * @author 諸葛小猿\n * @date 2020-08-25\n */\npublic class MD5Test {\n\n    public static void main(String[] args) {\n\n        // MD5 of a string\n        String content = \"諸葛小猿\";\n        String hashStr = md5(content);\n        System.out.println(content +\" => \"+ hashStr);\n\n        // MD5 of a file\n        String filePath = \"G:\\\\FiddlerSetup.exe\";\n        File file = new File(filePath);\n        String fileHash = md5(file);\n        System.out.println(\"file hash =>\" + fileHash);\n    }\n\n    /**\n     * Compute the MD5 value of a string\n     * @param string the plaintext\n     * @return the MD5 value as lowercase hex\n     */\n    public static String md5(String string) {\n        if (string.isEmpty()) {\n            return \"\";\n        }\n        MessageDigest md5 = null;\n        try {\n            md5 = MessageDigest.getInstance(\"MD5\");\n            byte[] bytes = md5.digest(string.getBytes(\"UTF-8\"));\n            String result = \"\";\n            for (byte b : bytes) {\n                // mask to 0..255, then left-pad to two hex digits\n                String temp = Integer.toHexString(b & 0xff);\n                if (temp.length() == 1) {\n                    temp = \"0\" + temp;\n                }\n                result += temp;\n            }\n            return result;\n        } catch (NoSuchAlgorithmException e) {\n            e.printStackTrace();\n        } catch (UnsupportedEncodingException e) {\n            e.printStackTrace();\n        }\n        return \"\";\n    }\n\n    /**\n     * Compute the MD5 value of a file\n     * @param file the file\n     * @return the MD5 value as lowercase hex\n     */\n    public static String md5(File file) {\n        if (file == null || !file.isFile() || !file.exists()) {\n            return \"\";\n        }\n        FileInputStream in = null;\n        String result = \"\";\n        // 1024-byte read buffer\n        byte[] buffer = new byte[1024];\n        int len;\n        try {\n            MessageDigest md5 = MessageDigest.getInstance(\"MD5\");\n            in = new FileInputStream(file);\n            // feed the file to the digest chunk by chunk\n            while ((len = in.read(buffer)) != -1) {\n                md5.update(buffer, 0, len);\n            }\n            byte[] bytes = md5.digest();\n\n            for (byte b : bytes) {\n                String temp = 
Integer.toHexString(b & 0xff);\n                if (temp.length() == 1) {\n                    temp = \"0\" + temp;\n                }\n                result += temp;\n            }\n        } catch (Exception e) {\n            e.printStackTrace();\n        } finally {\n            if (null != in) {\n                try {\n                    in.close();\n                } catch (IOException e) {\n                    e.printStackTrace();\n                }\n            }\n        }\n        return result;\n    }\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Output:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"諸葛小猿 => 591a998db81be040be8591d7e2c2ddc2\nfile hash =>8adb254960a54c08c4453775d10574ba"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4.2. The SHA1 Algorithm"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Secure Hash Algorithm (SHA) was designed primarily for use with the Digital Signature Algorithm (DSA) defined in the Digital Signature Standard. For a message shorter than 2^64 bits, SHA1 produces a 160-bit message digest. The receiver can use this digest to verify the integrity of the data: if the data changed in transit, the digest computed on arrival will differ."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How SHA1 works:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first stage is padding and block splitting. For plaintext of any length SHA1 produces a 160-bit digest; the plaintext is prepared as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1) Append the 0x80 marker to the end of the data. Plaintext of any length is first padded so that its total length is congruent to 448 modulo 512 bits. The padding is a single 1 bit followed by 0 bits, which for byte-oriented data means the byte 0x80 followed by zero bytes. Enough zeros are added that, once the 64-bit (8-byte) length field of the next step is appended, the data is a multiple of 64 bytes (512 bits)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2) Append the length of the original plaintext (before padding) as a 64-bit value. The padded plaintext is now an exact multiple of 512 bits. If the plaintext length exceeds 2^64, only the low 64 bits of the length are used, appended to the end of the last block."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3) The padded plaintext, now an exact multiple of 512 bits long, is split into 512-bit blocks. This yields L blocks, denoted Y0, Y1, ..., YL-1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4) SHA1 treats the data as big-endian."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After splitting, the blocks are processed one after another. The digest computation for each block is:"}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1) Split the 512-bit block into 16 words of 32 bits each."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2) Set up five chaining variables a, b, c, d, e, initialised to H0, H1, H2, H3, H4."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3) Expand the 16 words to 80 words."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4) Run the 80 words through 4 rounds of computation, producing new values of a, b, c, d, e."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"5) Add the new chaining variables to the values they had at the start of the block."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"6) Use the result as the initial chaining variables for the next block."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"7) After the last block, the five chaining variables together form the SHA1 digest."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SHA1 has the following properties:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The message cannot be recovered from its digest, and it is designed to be computationally hard, though not impossible, to find two different messages with the same digest."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SHA1 test code:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":nu
ll}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Java code for computing a SHA1 value is almost identical to the MD5 code; the only difference is the algorithm name:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"MessageDigest.getInstance(\"MD5\"); // replace \"MD5\" with \"SHA1\""}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With a small change, the two MD5 functions above, "},{"type":"codeinline","content":[{"type":"text","text":"md5(String string)"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"md5(File file)"}]},{"type":"text","text":", can take the algorithm as a parameter, becoming "},{"type":"codeinline","content":[{"type":"text","text":"hash(String string, String algorithm)"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"hash(File file, String algorithm)"}]},{"type":"text","text":", and then support both MD5 and SHA1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"4.3. The MurmurHash Algorithm"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MurmurHash is a non-cryptographic hash function suited to general hash-based lookup. It was created by Austin Appleby in 2008 and exists in several variants, all released into the public domain. Compared with other popular hash functions, MurmurHash distributes highly regular keys more randomly. Redis, Memcached, Cassandra, HBase and Lucene all use it, so it is well worth covering."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Redis has used two different hash algorithms in its dictionary implementation, and MurmurHash is one of them (the other is djb). It is used throughout Redis, including the database, cluster, hash keys and blocking operations. The author of the algorithm was later invited to work at Google. The latest version is MurmurHash3, which fixes some minor flaws in MurmurHash2 and is faster. It comes in 32-bit (low latency) and 128-bit variants and gives well-balanced output with a low collision rate, especially on large blocks of data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unlike digest algorithms such as MD5 that were designed with security in mind, MurmurHash makes no security claims. Inside Redis, for instance, it merely hashes keys, where security is not needed. As a non-cryptographic hash, MurmurHash can be tens of times faster than secure hash algorithms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many Java libraries ship MurmurHash, including Guava, Jedis and Cassandra."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MurmurHash in summary: high throughput, low collision rate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"package com.wuxiaolong.EncrypteDecrypt;\n\nimport com.google.common.base.Charsets;\nimport com.google.common.base.Stopwatch;\nimport com.google.common.collect.Sets;\nimport com.google.common.hash.HashCode;\nimport com.google.common.hash.HashFunction;\nimport com.google.common.hash.Hashing;\nimport org.apache.commons.codec.digest.DigestUtils;\n\nimport java.nio.charset.StandardCharsets;\nimport java.util.Set;\n\n/**\n * Description:\n * Required jar:\n * \n * com.google.guava\n * 
guava\n * 23.0\n * \n *\n * @author 諸葛小猿\n * @date 2020-08-25\n */\npublic class MurmurHashTest {\n\n\n    public static void main(String[] args) {\n\n        int testCount = 100000;\n        String msg = \"諸葛小猿\";\n\n        long totalTime = 0L;\n\n        long start = System.currentTimeMillis();\n\n        // Hashing.murmur3_32(seed)\n        HashFunction murmur3 = Hashing.murmur3_32();\n\n        Set set = Sets.newHashSet();\n        for (int i = 0; i < testCount; i++) {\n\n            // time each individual hash\n            Stopwatch w = Stopwatch.createStarted();\n\n            HashCode murmur3HashCode = murmur3.hashString(msg + i, Charsets.UTF_8);\n            String murmur3HashCodeStr = murmur3HashCode.toString();\n\n            System.out.println(String.format(\"murmur3's hashCode:%s,length:%s,it consumes:%s\", murmur3HashCodeStr, murmur3HashCodeStr.length(), w));\n\n            set.add(murmur3HashCodeStr);\n        }\n\n        long end = System.currentTimeMillis();\n\n        totalTime = end - start;\n\n        // number of distinct hash values collected in the set\n        System.out.println(\"set的元素個數\" + set.size());\n\n        // total elapsed time in milliseconds\n        System.out.println(\"總耗時\" + totalTime);\n\n    }\n\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Partial output:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"murmur3's hashCode:a2e96039,length:8,it consumes:2.500 μs\nmurmur3's hashCode:12c86afb,length:8,it consumes:2.400 μs\nmurmur3's hashCode:ad6ff257,length:8,it consumes:1.200 μs\nmurmur3's hashCode:b71c537d,length:8,it consumes:1.200 μs\nmurmur3's hashCode:92508a73,length:8,it consumes:1.100 
μs\nset的元素個數99998\n總耗時984"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The loop ran 100000 times, but after deduplication the Set holds only 99998 elements, which is in line with what a 32-bit hash is expected to produce over this many inputs. Passing a seed via "},{"type":"codeinline","content":[{"type":"text","text":"Hashing.murmur3_32(seed)"}]},{"type":"text","text":" changes which inputs collide, but it does not reduce the expected number of collisions; a wider output, such as the 128-bit variant, does."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond Redis, MurmurHash is useful in many other places, such as generating short links in a URL-shortening service, or in a Bloom filter."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"5. Security of Hash Algorithms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As irreversible algorithms, MD5, SHA1 and similar hashes protect passwords to a degree. But are they completely secure? They are not."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By the pigeonhole principle, any 2^128 + 1 distinct inputs must include two with the same MD5 value. But how big is 2^128? About 3.4 × 10^38; even the fastest supercomputer in the world would need billions of years or more to enumerate that many. Nevertheless, Professor Wang Xiaoyun broke MD5. Broken here does not mean that, handed an MD5 value, she can compute the original text; her results do not allow the input to be recovered from an MD5 value. What her work showed is how to construct two different messages "},{"type":"codeinline","content":[{"type":"text","text":"M1"}]},{"type":"text","text":" and "},{"type":"codeinline","content":[{"type":"text","text":"M2"}]},{"type":"text","text":" such that the hash of "},{"type":"codeinline","content":[{"type":"text","text":"M2"}]},{"type":"text","text":" equals the hash of "},{"type":"codeinline","content":[{"type":"text","text":"M1"}]},{"type":"text","text":". MD5 therefore no longer provides collision resistance and is no longer a secure hash algorithm. This makes MD5 seriously problematic for digital signatures, because a signed message can be swapped for a different one with the same hash value. Industry experts, among them Princeton computer science professor Edward Felten, strongly urged system designers to replace their signature algorithms quickly, stressing that the problem needed immediate attention."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The US National Institute of Standards and Technology (NIST) issued a comment on 24 August 2004, the gist of which was: “At the recent international cryptology conference Crypto 2004, researchers announced that they had found ways to break several hash algorithms, including MD4, MD5, HAVAL-128, RIPEMD and SHA-0. Analysis indicates that reduced variants of SHA-1, which replaced SHA-0 as a Federal Information Processing Standard in 1994, can be broken; but the full SHA-1 has not been broken, and no SHA-1 collision has been found. The results suggest that SHA-1 is secure for the time being, but as technology advances NIST plans to phase out SHA-1 by 2010 in favour of longer and more secure algorithms such as SHA-224, SHA-256, SHA-384 and SHA-512.” A practical SHA-1 collision was eventually published in 2017."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All this shows that a single plain hash still has serious weaknesses. How can hash-based protection be strengthened further? The following techniques help:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"5.1. Hash + Salt"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A salt is simply a randomly generated string of some length that is combined with the password so that the hash comes out different. In effect, new password = old password + random salt, and the new password is what gets hashed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Advantage: salting largely defeats rainbow-table attacks. Even if an attacker has built a rainbow table, you stored hash(password + salt), so the hash(password) values in the attacker's table differ from the hashes in your database."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"5.2. Increasing Computation Time (Hash + Salt + Iteration)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Iterating the hash raises the cost of computing each password hash. Keep the iteration count within a delay users will accept; the attacker's cost in computation and time then rises sharply."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point the hash is reasonably safe. If more security is needed, the resulting hash can additionally be encrypted with an encryption algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"References:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://blog.csdn.net/qq_33408113/article/details/82635009"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://www.oschina.net/translate/state-of-hash-functions"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://blog.csdn.net/qq_24280381/article/details/72024860"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"https://blog.csdn.net/weiliangliang111/article/details/51457874"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Follow the WeChat official account and send “"},{"type":"text","marks":[{"type":"strong"}],"text":"java-summary"},{"type":"text","text":"” to get the source code."}]},{"type":"paragraph","att
rs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Done, that's a wrap!"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/77/77e2a53cabfaaac7d98ee0a860144e29.gif","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"【"},{"type":"text","marks":[{"type":"strong"}],"text":"Spreading knowledge, sharing value"},{"type":"text","text":"】 Thanks to everyone for following and supporting. I am 【"},{"type":"text","marks":[{"type":"strong"}],"text":"諸葛小猿"},{"type":"text","text":"】, an internet worker struggling on through the uncertainty."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6a/6acafea3f4c9b96373b3f566ec7078e2.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}