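[C++] The MD5 Message-Digest Algorithm: How It Works and How to Implement It

References:
1. RFC 1321, R. Rivest
2. Web security course slides by Cai Guoyang (蔡国扬), Sun Yat-sen University

Algorithm Overview

  • MD5 works in little-endian byte order. It accepts a message of arbitrary length, processes it in 512-bit blocks, and maintains four 32-bit words that are concatenated at the end into a fixed 128-bit message digest.
  • The basic procedure is: pad the message, append its length, initialize the chaining variables, run the compression function over each block, and output the result.

RFC 1321 divides the algorithm into five steps. Each step below comes with an example to make the details easier to follow. Unless stated otherwise, all lengths in this article are given in bits.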

[Figure: basic flowchart, overall control flow]

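Step 1: Append Padding Bits

Pad the end of the original message so that the padded length in bits, taken modulo 512, equals 448.
The padding consists of a single 1 bit followed by 0 bits. Padding is always performed, even when the original length is already congruent to 448 modulo 512. The padding is therefore at least 1 bit and at most 512 bits long.

For example, the message 12345678 (eight ASCII characters) is 8 * 8 = 64 bits long, so 448 - 64 = 384 padding bits are needed: a single 1 bit followed by 383 zero bits.

In hexadecimal (0x prefix omitted), the padded message is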
31 32 33 34 35 36 37 38 80 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
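The padding rule can be sketched in a few lines of C++. This is only an illustration: the names `padding_bits` and `pad_message` are made up for this example and do not come from RFC 1321, and the message is assumed to be a whole number of bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of padding bits needed so that (message_bits + padding) mod 512 == 448.
// The result is always in the range [1, 512].
std::uint64_t padding_bits(std::uint64_t message_bits) {
    std::uint64_t r = message_bits % 512;
    return r < 448 ? 448 - r : 960 - r;
}

// Returns the message followed by the step 1 padding:
// one 0x80 byte (a 1 bit plus seven 0 bits), then zero bytes.
std::vector<std::uint8_t> pad_message(const std::string& message) {
    std::vector<std::uint8_t> out(message.begin(), message.end());
    std::size_t pad_bytes =
        static_cast<std::size_t>(padding_bits(message.size() * 8ULL) / 8);
    out.push_back(0x80);
    out.resize(out.size() + pad_bytes - 1, 0x00);
    return out;
}
```

For the message 12345678, `pad_message` returns 56 bytes: the 8 message bytes, the byte 0x80, and 47 zero bytes, matching the dump above.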

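Step 2: Append Length

Compute the length of the original message (before padding) and represent it as a 64-bit value b. If the length is $2^{64}$ or greater, so that 64 bits cannot hold it, only the low-order 64 bits are used. Append b to the message produced by step 1.

The message length is now a multiple of 512 bits, that is, a multiple of sixteen 32-bit words. Split the message into L blocks of 512 bits each: $Y_0, Y_1, \dots, Y_{L-1}$.

Note that the 64-bit length is not appended as a plain big-endian binary number. It is treated as two 32-bit words: the low-order word is appended first, then the high-order word, and each word is written in little-endian byte order.

Little-endian: the least significant byte is stored at the lowest memory address and the most significant byte at the highest.

For example, the message 12345678 is 8 * 8 = 64 bits long. As a 64-bit binary number this is 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000000. Split into two 32-bit words:

  • High-order word: 00000000 00000000 00000000 00000000
  • Low-order word: 00000000 00000000 00000000 01000000

The little-endian representation of the low-order word is 01000000 00000000 00000000 00000000. The high-order word is all zeros, so its byte order makes no difference here.

The 64 bits to append are therefore 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000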
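A minimal sketch of this length encoding follows. The function name `encode_length` is hypothetical. Writing a 64-bit value least significant byte first gives the same 8 bytes as writing the low-order word and then the high-order word, each in little-endian order.

```cpp
#include <array>
#include <cstdint>

// Encodes the original message length (in bits) as the 8 bytes appended in step 2.
// A std::uint64_t already keeps only the low-order 64 bits of the length.
std::array<std::uint8_t, 8> encode_length(std::uint64_t message_bits) {
    std::array<std::uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) {
        // Byte i holds bits [8i, 8i + 7] of the length.
        out[i] = static_cast<std::uint8_t>(message_bits >> (8 * i));
    }
    return out;
}
```

For a 64-bit message, `encode_length(64)` yields the byte 0x40 followed by seven zero bytes, which is the bit pattern shown above.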

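Step 3: Initialize MD Buffer

Initialize a 128-bit MD buffer, represented as four 32-bit registers (A, B, C, D), which holds the intermediate and final values of the digest.

The four registers are initialized to the following hexadecimal values, stored in little-endian order:

Register   Bytes (low address first)   32-bit value
A          01 23 45 67                 0x67452301
B          89 AB CD EF                 0xEFCDAB89
C          FE DC BA 98                 0x98BADCFE
D          76 54 32 10                 0x10325476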

Step 4: Process Message in 16-Word Blocks

First, define four round functions. Each takes three 32-bit words as input and produces one 32-bit word.

Function    Returns
F(X,Y,Z)    $(X \land Y) \lor (\lnot X \land Z)$
G(X,Y,Z)    $(X \land Z) \lor (Y \land \lnot Z)$
H(X,Y,Z)    $X \oplus Y \oplus Z$
I(X,Y,Z)    $Y \oplus (X \lor \lnot Z)$
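In C++ the four round functions map directly onto bitwise operators. The sketch below uses a `Word` alias, which is an assumption of this example, and shows how F and G act as bitwise selectors.

```cpp
#include <cstdint>

using Word = std::uint32_t;

// F is a bitwise "if x then y else z".
Word F(Word x, Word y, Word z) { return (x & y) | (~x & z); }

// G is a bitwise "if z then x else y".
Word G(Word x, Word y, Word z) { return (x & z) | (y & ~z); }

// H is the bitwise XOR (parity) of the three inputs.
Word H(Word x, Word y, Word z) { return x ^ y ^ z; }

// I combines y with (x OR NOT z) by XOR.
Word I(Word x, Word y, Word z) { return y ^ (x | ~z); }
```

For instance, with x set to all ones, F returns y unchanged, and with x set to zero it returns z.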

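Processing works on the 512-bit blocks produced in step 2. Each block $Y_q$ ($q = 0, 1, \dots, L-1$) goes through a four-round compression function, written $H_{MD5}$, which updates the MD buffer initialized in step 3. The initial buffer is $CV_0 = IV$; the buffer after the first $q$ blocks have been processed is $CV_q = H_{MD5}(Y_{q-1}, CV_{q-1})$; and the final output is $CV_L$.

Each 512-bit block is further split into sixteen 32-bit words, X[0], X[1], ..., X[15].


Important: X[k] is not formed by reading 32 bits in order. The 4 bytes read for each word must be interpreted in little-endian order.

For example, after the padding of steps 1 and 2, the message 12345678 in hexadecimal (0x prefix omitted) is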
31 32 33 34 35 36 37 38 80 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 40 0 0 0 0 0 0 0

For X[0], reading the 32 bits in order would give 0x31323334, but the value actually used in the computation is 0x34333231.

X[0…7] = {0x34333231, 0x38373635, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000}
X[8…15] = {0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000040, 0x00000000}
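The little-endian decoding of a block into X[0..15] might look like the following sketch. The function name `decode_block` is illustrative. Because the bytes are combined with shifts, the result does not depend on the byte order of the host CPU.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Decodes one 64-byte block into the sixteen 32-bit words X[0..15].
// Each group of 4 bytes is read in little-endian order:
// the first byte becomes the least significant byte of the word.
std::array<std::uint32_t, 16> decode_block(const std::uint8_t* block) {
    std::array<std::uint32_t, 16> x{};
    for (std::size_t k = 0; k < 16; ++k) {
        const std::uint8_t* p = block + 4 * k;
        x[k] = static_cast<std::uint32_t>(p[0])
             | static_cast<std::uint32_t>(p[1]) << 8
             | static_cast<std::uint32_t>(p[2]) << 16
             | static_cast<std::uint32_t>(p[3]) << 24;
    }
    return x;
}
```

Applied to the padded block for 12345678, this gives X[0] = 0x34333231, X[1] = 0x38373635, X[2] = 0x00000080 and X[14] = 0x00000040, in agreement with the values listed above.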


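$H_{MD5}$ proceeds roughly as follows:

  1. Take as input the previous 128-bit result $CV_{q-1}$ and the block $Y_{q-1}$
  2. Using round function F, entries [1…16] of the T table, and $X[i]$, run 16 iterations on the previous result $CV_{q-1}$
  3. Using round function G, entries [17…32] of the T table, and $X[\rho_2(i)]$, run 16 iterations on the result of step 2
  4. Using round function H, entries [33…48] of the T table, and $X[\rho_3(i)]$, run 16 iterations on the result of step 3
  5. Using round function I, entries [49…64] of the T table, and $X[\rho_4(i)]$, run 16 iterations on the result of step 4
  6. Add each of the four 32-bit words of $CV_{q-1}$ to the corresponding word produced by step 5, modulo $2^{32}$, to obtain $CV_q$

Addition modulo $2^{32}$ means adding two 32-bit words and discarding any carry out of the top bit (the would-be bit 33). For example, 0xFFFFFFFF + 0x00000001 = 0x00000000.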
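In C++, unsigned fixed-width arithmetic already behaves this way, so no explicit masking is needed. A tiny sketch follows; the helper name `add_mod32` is made up for illustration.

```cpp
#include <cstdint>

// Addition modulo 2^32. Converting the sum back to std::uint32_t
// discards any carry out of bit 31.
std::uint32_t add_mod32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a + b);
}
```

This is why implementations can simply write `a + b` on 32-bit unsigned values, provided the type really is 32 bits wide.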

[Figure: $H_{MD5}$ flowchart]

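In steps 2 to 5 of $H_{MD5}$, every iteration applies the following operation:

$a \leftarrow b + ((a + g(b, c, d) + X[k] + T[i]) \lll s)$

Where:
  • a, b, c, d: the current values of the MD buffer (A, B, C, D)
  • g: the round function (one of F, G, H, I)
  • $\lll s$ (written <<< s in code comments): rotate the 32-bit input left by s bits
  • X[k]: the k-th 32-bit word of the message block being processed
  • T[i]: the i-th entry of the T table, a 32-bit word
  • +: addition modulo $2^{32}$

After each operation, the MD buffer is rotated right by one register. If the buffer after the operation is (AA, BB, CC, DD), the rotation sets A = DD, B = AA, C = BB, D = CC.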
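One step of the compression function, including the register rotation, can be sketched as below. The names `State`, `rotate_left` and `md5_step` are assumptions of this example. The round function is passed in already evaluated on (b, c, d) to keep the sketch short.

```cpp
#include <array>
#include <cstdint>

using Word = std::uint32_t;
using State = std::array<Word, 4>;  // {A, B, C, D}

// Rotates a 32-bit word left by s bits; s must be in the range 1..31.
Word rotate_left(Word v, unsigned s) {
    return (v << s) | (v >> (32 - s));
}

// Computes b + ((a + g(b,c,d) + X[k] + T[i]) <<< s), stores it in place of a,
// then rotates the registers right: A = D, B = new value, C = B, D = C.
// g_value is the round function already evaluated on (b, c, d).
State md5_step(const State& st, Word g_value, Word xk, Word ti, unsigned s) {
    Word a = st[0], b = st[1], c = st[2], d = st[3];
    Word updated = b + rotate_left(a + g_value + xk + ti, s);
    return State{d, updated, b, c};
}
```

As a small hand-checkable case: starting from (1, 2, 3, 4) with g = H(2, 3, 4) = 5, X[k] = 0, T[i] = 0 and s = 1, the new value is 2 + ((1 + 5) <<< 1) = 14, and the buffer becomes (4, 14, 2, 3).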

[Figure: logic of a single step]

The message word X[k] used in each round

  1. Round F: $X[i]$, $i = 0, 1, \dots, 15$
  2. Round G: $X[\rho_2(i)]$, $\rho_2(i) = (1 + 5i) \bmod 16$, $i = 0, 1, \dots, 15$
  3. Round H: $X[\rho_3(i)]$, $\rho_3(i) = (5 + 3i) \bmod 16$, $i = 0, 1, \dots, 15$
  4. Round I: $X[\rho_4(i)]$, $\rho_4(i) = 7i \bmod 16$, $i = 0, 1, \dots, 15$

The same information as a table:

Round function   Values of k in X[k], in order
F                [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
G                [1, 6, 11, 0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12]
H                [5, 8, 11, 14, 1, 4, 7, 10, 13, 0, 3, 6, 9, 12, 15, 2]
I                [0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
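The table can be regenerated from the four index formulas, which is a convenient way to check it. The sketch below assumes rounds are numbered 0 to 3 for F, G, H, I; the function name `build_index_table` is hypothetical.

```cpp
#include <array>

// Builds the 4 x 16 table of message-word indices.
// Row 0 = round F, row 1 = round G, row 2 = round H, row 3 = round I.
std::array<std::array<int, 16>, 4> build_index_table() {
    std::array<std::array<int, 16>, 4> table{};
    for (int i = 0; i < 16; ++i) {
        table[0][i] = i;
        table[1][i] = (1 + 5 * i) % 16;
        table[2][i] = (5 + 3 * i) % 16;
        table[3][i] = (7 * i) % 16;
    }
    return table;
}
```

Since 5, 3 and 7 are all coprime to 16, every row is a permutation of 0..15, so each round uses every message word exactly once.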

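Generating the T table

$T[i] = \mathrm{int}(2^{32} \times |\sin(i)|)$, where $i = 1, \dots, 64$ is in radians and int takes the integer part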

T[1…8] = {0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501}
T[9…16] = {0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821}
T[17…24] = {0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8}
T[25…32] = {0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a}
T[33…40] = {0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70}
T[41…48] = {0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665}
T[49…56] = {0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1}
T[57…64] = {0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391}
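The constants can also be computed at run time from the sine formula. The sketch below uses a hypothetical function name, `md5_constant`, and assumes that the platform's `std::sin` is accurate enough for the truncated result to match the table. Production code usually hard-codes the 64 values instead.

```cpp
#include <cmath>
#include <cstdint>

// Returns T[i] for i in 1..64: the integer part of 2^32 * |sin(i)|,
// with i in radians. |sin(i)| < 1 for these i, so the product fits in 32 bits.
std::uint32_t md5_constant(int i) {
    double scaled = 4294967296.0 * std::fabs(std::sin(static_cast<double>(i)));
    return static_cast<std::uint32_t>(scaled);  // the conversion truncates toward zero
}
```

For example, `md5_constant(1)` corresponds to T[1] = 0xd76aa478, because $2^{32} \times \sin(1) \approx 3614090360.28$.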

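Step 5: Output

The message digest is produced from the final value of the MD buffer (A, B, C, D). The registers are output in order from A to D, each from its low-order byte to its high-order byte.

For example, for the message 12345678 the steps above give (A, B, C, D) = {0xd25ad525, 0x0a40aa83, 0x6dc764f4, 0xad073c71}

  • A: outputs 25 d5 5a d2
  • B: outputs 83 aa 40 0a
  • C: outputs f4 64 c7 6d
  • D: outputs 71 3c 07 ad

The final output is 25d55ad283aa400af464c76d713c07ad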
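The output step amounts to serializing each register least significant byte first. A minimal sketch follows; `digest_to_hex` is an illustrative name, and lowercase hex digits are an arbitrary choice.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Serializes (A, B, C, D) into the 32-character hex digest,
// emitting the bytes of each register from least to most significant.
std::string digest_to_hex(const std::array<std::uint32_t, 4>& abcd) {
    static const char kHex[] = "0123456789abcdef";
    std::string out;
    out.reserve(32);
    for (std::uint32_t word : abcd) {
        for (int j = 0; j < 4; ++j) {
            std::uint8_t byte = static_cast<std::uint8_t>(word >> (8 * j));
            out += kHex[byte >> 4];    // high nibble first
            out += kHex[byte & 0x0F];  // then low nibble
        }
    }
    return out;
}
```

Passing the four register values from the example above produces the string 25d55ad283aa400af464c76d713c07ad.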

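C++ Implementation

Implementation 1: my own (cannot hash a message of unknown length)

My implementation follows the five steps of the RFC one by one, without consulting the code in its appendix. It only works if the entire message and its length are known up front, because the padding of steps 1 and 2 is applied to the whole message before any block is processed. It therefore cannot hash a message whose length is not known in advance, which is a real limitation. For that reason, L. Peter Deutsch's implementation is given afterwards.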

MD5.hpp

/*
 *   file: MD5.hpp
 *   author: Els-y
 *   time: 2017-10-16 21:08:21
*/
#ifndef MD5_HPP
#define MD5_HPP

#include <string>
#include <vector>
#include <cstring>
#include <cmath>
#include <iostream>
#include <bitset>
using std::string;
using std::vector;
using std::bitset;
using std::cout;
using std::endl;
using std::sin;
using std::abs;

// MD5 works in little-endian byte order; unsigned int is assumed to be 32 bits wide
class MD5 {
public:
    MD5();
    ~MD5();
    // The object owns a raw buffer, so copying is disabled
    MD5(const MD5&) = delete;
    MD5& operator=(const MD5&) = delete;
    // Returns the MD5 digest of plain as a hexadecimal string
    string encrypt(string plain);
    // Prints the padded message
    void print_buff();

private:
    // 128-bit MD buffer, md[0...3] = {A, B, C, D}
    vector<unsigned int> md;
    // Holds the padded message
    unsigned char* buffer;
    // Length of the padded message, in bytes
    unsigned int buffer_len;
    // Table of the four round functions
    unsigned int (MD5::*round_funcs[4])(unsigned int, unsigned int, unsigned int);

    // Initializes the MD buffer
    void init_md();
    // Appends the padding bits and the length
    void padding(string plain);
    void clear();
    void h_md5(int groupid);
    // The four round functions
    unsigned int f_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int g_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int h_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int i_rf(unsigned int x, unsigned int y, unsigned int z);
    // Returns the MD buffer converted to a hexadecimal digest string
    string md2str();
    // Returns the word X formed from the four bytes buffer[pos .. pos + 3] in little-endian order
    unsigned int uchar2uint(int pos);
    // Returns the hexadecimal string for an unsigned char
    string uchar2hex(unsigned char uch);
    // Returns val rotated left by bits positions
    unsigned int cycle_left_shift(unsigned int val, int bits);
    // Returns the index of the X word used at step `step` of round `round`
    int get_x_index(int round, int step);
};

#endif

MD5.cpp

/*
 *   file: md5.cpp
 *   author: Els-y
 *   time: 2017-10-16 21:08:21
*/
#include "MD5.hpp"

/* -- public --*/
MD5::MD5() {
    buffer = NULL;
    buffer_len = 0;  // keeps print_buff() safe before the first encrypt()
    round_funcs[0] = &MD5::f_rf;
    round_funcs[1] = &MD5::g_rf;
    round_funcs[2] = &MD5::h_rf;
    round_funcs[3] = &MD5::i_rf;
}

MD5::~MD5() {
    clear();
}

string MD5::encrypt(string plain) {
    init_md();
    clear();
    padding(plain);

    int group_len = buffer_len / 64;

    for (int i = 0; i < group_len; ++i) h_md5(i);

    return md2str();
}

void MD5::print_buff() {
    cout << "buffer_len = " << buffer_len << endl;
    for (unsigned int i = 0; i < buffer_len; ++i) {
        bitset<8> ch = buffer[i];
        cout << ch << " ";
    }
    cout << endl;
}

/* -- private --*/
// Initializes the MD buffer
void MD5::init_md() {
    md = vector<unsigned int>({0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476});
}

// Appends the padding bits and the length
void MD5::padding(string plain) {
    unsigned int plain_len = plain.size();
    unsigned long long plain_bits_len = plain.size() * 8ULL;
    unsigned int fill_bits_len = plain_bits_len % 512 == 448 ? 512 : (960 - plain_bits_len % 512) % 512;
    unsigned int fill_len = fill_bits_len / 8;
    buffer_len = plain_len + fill_len + 8;
    buffer = new unsigned char[buffer_len];

    // Copy the original message
    for (unsigned int i = 0; i < plain_len; ++i) buffer[i] = plain[i];

    // Padding: a 1 bit (the byte 0x80) followed by zeros
    buffer[plain_len] = 0x80;
    for (unsigned int i = 1; i < fill_len; ++i) buffer[plain_len + i] = 0;

    // Append the original length in bits, least significant byte first
    for (int i = 0; i < 8; ++i) {
        unsigned char ch = plain_bits_len;
        buffer[plain_len + fill_len + i] = ch;
        plain_bits_len >>= 8;
    }
}

void MD5::clear() {
    if (buffer != NULL) {
        delete []buffer;
        buffer = NULL;
    }
}

void MD5::h_md5(int groupid) {
    int buff_begin = 64 * groupid;

    unsigned int next;
    vector<unsigned int> last_md(md);

    const unsigned int CYCLE_BITS[4][4] = {
        {7, 12, 17, 22},
        {5, 9, 14, 20},
        {4, 11, 16, 23},
        {6, 10, 15, 21}
    };

    // round = [0, 1, 2, 3] corresponds to rounds [F, G, H, I]
    for (int round = 0; round < 4; ++round) {
        for (int i = 0; i < 16; ++i) {
            unsigned int x = uchar2uint(buff_begin + get_x_index(round, i) * 4);
            // T[round * 16 + i + 1]: integer part of 2^32 * |sin(index)|, index in radians
            unsigned int t = static_cast<unsigned int>(4294967296.0 * std::fabs(std::sin(static_cast<double>(round * 16 + i + 1))));
            next = md[1] + cycle_left_shift(md[0] + (this->*round_funcs[round])(md[1], md[2], md[3]) + x + t, CYCLE_BITS[round][i % 4]);
            // Rotate (A, B, C, D) right by one register
            md[0] = md[3];
            md[3] = md[2];
            md[2] = md[1];
            md[1] = next;
        }
    }

    for (int i = 0; i < 4; ++i) md[i] += last_md[i];
}

// The four round functions
unsigned int MD5::f_rf(unsigned int x, unsigned int y, unsigned int z) {
    return (x & y) | (~x & z);
}

unsigned int MD5::g_rf(unsigned int x, unsigned int y, unsigned int z) {
    return (x & z) | (y & ~z);
}

unsigned int MD5::h_rf(unsigned int x, unsigned int y, unsigned int z) {
    return x ^ y ^ z;
}

unsigned int MD5::i_rf(unsigned int x, unsigned int y, unsigned int z) {
    return y ^ (x | ~z);
}

// Returns the MD buffer converted to a hexadecimal digest string
string MD5::md2str() {
    string res;

    for (int i = 0; i < 4; ++i) {
        unsigned int val = md[i];
        for (int j = 0; j < 4; ++j) {
            unsigned char ch = val;
            res += uchar2hex(ch);
            val >>= 8;
        }
    }

    return res;
}

// Returns the word X formed from the four bytes buffer[pos .. pos + 3] in little-endian order
unsigned int MD5::uchar2uint(int pos) {
    unsigned int val = 0;
    int end = pos + 3;
    for (int i = end; i >= pos; --i) {
        val |= buffer[i];
        if (i != pos) val <<= 8;
    }
    return val;
}

// Returns the hexadecimal string for an unsigned char
string MD5::uchar2hex(unsigned char uch) {
    string res;
    unsigned char mask = 0x0F;

    for (int i = 1; i >= 0; --i) {
        char ch = uch >> (i << 2) & mask;
        if (ch < 10) ch += '0';
        else ch += 'A' - 10;
        res += ch;
    }

    return res;
}

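// Returns val rotated left by bits positions (assumes a 32-bit unsigned int)
unsigned int MD5::cycle_left_shift(unsigned int val, int bits) {
    bits %= 32;
    if (bits == 0) return val;  // a shift by 32 below would be undefined behavior
    return (val << bits) | (val >> (32 - bits));
}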

// Returns the index of the X word used at step `step` of round `round`
int MD5::get_x_index(int round, int step) {
    if (round == 0) {
        return step;
    } else if (round == 1) {
        return (1 + 5 * step) % 16;
    } else if (round == 2) {
        return (5 + 3 * step) % 16;
    } else {
        return (7 * step) % 16;
    }
}

main.cpp

#include <iostream>
#include "MD5.hpp"
using namespace std;

int main() {
    MD5 md5;

    string plain = "12345678";
    string cipher = md5.encrypt(plain);

    cout << "plain: " << plain << endl;
    cout << "cipher: " << cipher << endl;

    return 0;
}

Output:

plain: 12345678
cipher: 25D55AD283AA400AF464C76D713C07AD

Implementation 2: handles messages of unknown length

md5.h

/*
  Copyright (C) 1999, 2002 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  [email protected]

 */
/* $Id: md5.h,v 1.2 2007/12/24 05:58:37 lilyco Exp $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321, whose
  text is available at
    http://www.ietf.org/rfc/rfc1321.txt
  The code is derived from the text of the RFC, including the test suite
  (section A.5) but excluding the rest of Appendix A.  It does not include
  any code or documentation that is identified in the RFC as being
  copyrighted.

  The original and principal author of md5.h is L. Peter Deutsch
  <[email protected]>.  Other authors are noted in the change history
  that follows (in reverse chronological order):

  2002-04-13 lpd Removed support for non-ANSI compilers; removed
    references to Ghostscript; clarified derivation from RFC 1321;
    now handles byte order either statically or dynamically.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5);
    added conditionalization for C++ compilation from Martin
    Purschke <[email protected]>.
  1999-05-03 lpd Original version.
 */

#ifndef md5_INCLUDED
#  define md5_INCLUDED

/*
 * This package supports both compile-time and run-time determination of CPU
 * byte order.  If ARCH_IS_BIG_ENDIAN is defined as 0, the code will be
 * compiled to run only on little-endian CPUs; if ARCH_IS_BIG_ENDIAN is
 * defined as non-zero, the code will be compiled to run only on big-endian
 * CPUs; if ARCH_IS_BIG_ENDIAN is not defined, the code will be compiled to
 * run on either big- or little-endian CPUs, but will run slightly less
 * efficiently on either one than if ARCH_IS_BIG_ENDIAN is defined.
 */

typedef unsigned char md5_byte_t; /* 8-bit byte */
typedef unsigned int md5_word_t; /* 32-bit word */

/* Define the state of the MD5 Algorithm. */
typedef struct md5_state_s {
    md5_word_t count[2];    /* message length in bits, lsw first */
    md5_word_t abcd[4];        /* digest buffer */
    md5_byte_t buf[64];        /* accumulate block */
} md5_state_t;

#ifdef __cplusplus
extern "C" 
{
#endif

/* Initialize the algorithm. */
void md5_init(md5_state_t *pms);

/* Append a string to the message. */
void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes);

/* Finish the message and return the digest. */
void md5_finish(md5_state_t *pms, md5_byte_t digest[16]);

#ifdef __cplusplus
}  /* end extern "C" */
#endif

#endif /* md5_INCLUDED */

md5.cpp

/*
  Copyright (C) 1999, 2000, 2002 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  [email protected]

 */
/* $Id: md5.cpp,v 1.3 2008/01/20 22:52:04 lilyco Exp $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321, whose
  text is available at
    http://www.ietf.org/rfc/rfc1321.txt
  The code is derived from the text of the RFC, including the test suite
  (section A.5) but excluding the rest of Appendix A.  It does not include
  any code or documentation that is identified in the RFC as being
  copyrighted.

  The original and principal author of md5.c is L. Peter Deutsch
  <[email protected]>.  Other authors are noted in the change history
  that follows (in reverse chronological order):

  2002-04-13 lpd Clarified derivation from RFC 1321; now handles byte order
    either statically or dynamically; added missing #include <string.h>
    in library.
  2002-03-11 lpd Corrected argument list for main(), and added int return
    type, in test program and T value program.
  2002-02-21 lpd Added missing #include <stdio.h> in test program.
  2000-07-03 lpd Patched to eliminate warnings about "constant is
    unsigned in ANSI C, signed in traditional"; made test program
    self-checking.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5).
  1999-05-03 lpd Original version.
 */

#include "md5.h"
#include <string.h>

#undef BYTE_ORDER    /* 1 = big-endian, -1 = little-endian, 0 = unknown */
#ifdef ARCH_IS_BIG_ENDIAN
#  define BYTE_ORDER (ARCH_IS_BIG_ENDIAN ? 1 : -1)
#else
#  define BYTE_ORDER 0
#endif

#define T_MASK ((md5_word_t)~0)
#define T1 /* 0xd76aa478 */ (T_MASK ^ 0x28955b87)
#define T2 /* 0xe8c7b756 */ (T_MASK ^ 0x173848a9)
#define T3    0x242070db
#define T4 /* 0xc1bdceee */ (T_MASK ^ 0x3e423111)
#define T5 /* 0xf57c0faf */ (T_MASK ^ 0x0a83f050)
#define T6    0x4787c62a
#define T7 /* 0xa8304613 */ (T_MASK ^ 0x57cfb9ec)
#define T8 /* 0xfd469501 */ (T_MASK ^ 0x02b96afe)
#define T9    0x698098d8
#define T10 /* 0x8b44f7af */ (T_MASK ^ 0x74bb0850)
#define T11 /* 0xffff5bb1 */ (T_MASK ^ 0x0000a44e)
#define T12 /* 0x895cd7be */ (T_MASK ^ 0x76a32841)
#define T13    0x6b901122
#define T14 /* 0xfd987193 */ (T_MASK ^ 0x02678e6c)
#define T15 /* 0xa679438e */ (T_MASK ^ 0x5986bc71)
#define T16    0x49b40821
#define T17 /* 0xf61e2562 */ (T_MASK ^ 0x09e1da9d)
#define T18 /* 0xc040b340 */ (T_MASK ^ 0x3fbf4cbf)
#define T19    0x265e5a51
#define T20 /* 0xe9b6c7aa */ (T_MASK ^ 0x16493855)
#define T21 /* 0xd62f105d */ (T_MASK ^ 0x29d0efa2)
#define T22    0x02441453
#define T23 /* 0xd8a1e681 */ (T_MASK ^ 0x275e197e)
#define T24 /* 0xe7d3fbc8 */ (T_MASK ^ 0x182c0437)
#define T25    0x21e1cde6
#define T26 /* 0xc33707d6 */ (T_MASK ^ 0x3cc8f829)
#define T27 /* 0xf4d50d87 */ (T_MASK ^ 0x0b2af278)
#define T28    0x455a14ed
#define T29 /* 0xa9e3e905 */ (T_MASK ^ 0x561c16fa)
#define T30 /* 0xfcefa3f8 */ (T_MASK ^ 0x03105c07)
#define T31    0x676f02d9
#define T32 /* 0x8d2a4c8a */ (T_MASK ^ 0x72d5b375)
#define T33 /* 0xfffa3942 */ (T_MASK ^ 0x0005c6bd)
#define T34 /* 0x8771f681 */ (T_MASK ^ 0x788e097e)
#define T35    0x6d9d6122
#define T36 /* 0xfde5380c */ (T_MASK ^ 0x021ac7f3)
#define T37 /* 0xa4beea44 */ (T_MASK ^ 0x5b4115bb)
#define T38    0x4bdecfa9
#define T39 /* 0xf6bb4b60 */ (T_MASK ^ 0x0944b49f)
#define T40 /* 0xbebfbc70 */ (T_MASK ^ 0x4140438f)
#define T41    0x289b7ec6
#define T42 /* 0xeaa127fa */ (T_MASK ^ 0x155ed805)
#define T43 /* 0xd4ef3085 */ (T_MASK ^ 0x2b10cf7a)
#define T44    0x04881d05
#define T45 /* 0xd9d4d039 */ (T_MASK ^ 0x262b2fc6)
#define T46 /* 0xe6db99e5 */ (T_MASK ^ 0x1924661a)
#define T47    0x1fa27cf8
#define T48 /* 0xc4ac5665 */ (T_MASK ^ 0x3b53a99a)
#define T49 /* 0xf4292244 */ (T_MASK ^ 0x0bd6ddbb)
#define T50    0x432aff97
#define T51 /* 0xab9423a7 */ (T_MASK ^ 0x546bdc58)
#define T52 /* 0xfc93a039 */ (T_MASK ^ 0x036c5fc6)
#define T53    0x655b59c3
#define T54 /* 0x8f0ccc92 */ (T_MASK ^ 0x70f3336d)
#define T55 /* 0xffeff47d */ (T_MASK ^ 0x00100b82)
#define T56 /* 0x85845dd1 */ (T_MASK ^ 0x7a7ba22e)
#define T57    0x6fa87e4f
#define T58 /* 0xfe2ce6e0 */ (T_MASK ^ 0x01d3191f)
#define T59 /* 0xa3014314 */ (T_MASK ^ 0x5cfebceb)
#define T60    0x4e0811a1
#define T61 /* 0xf7537e82 */ (T_MASK ^ 0x08ac817d)
#define T62 /* 0xbd3af235 */ (T_MASK ^ 0x42c50dca)
#define T63    0x2ad7d2bb
#define T64 /* 0xeb86d391 */ (T_MASK ^ 0x14792c6e)


static void
md5_process(md5_state_t *pms, const md5_byte_t *data /*[64]*/)
{
    md5_word_t
    a = pms->abcd[0], b = pms->abcd[1],
    c = pms->abcd[2], d = pms->abcd[3];
    md5_word_t t;
#if BYTE_ORDER > 0
    /* Define storage only for big-endian CPUs. */
    md5_word_t X[16];
#else
    /* Define storage for little-endian or both types of CPUs. */
    md5_word_t xbuf[16];
    const md5_word_t *X;
#endif

    {
#if BYTE_ORDER == 0
    /*
     * Determine dynamically whether this is a big-endian or
     * little-endian machine, since we can use a more efficient
     * algorithm on the latter.
     */
    static const int w = 1;

    if (*((const md5_byte_t *)&w)) /* dynamic little-endian */
#endif
#if BYTE_ORDER <= 0        /* little-endian */
    {
        /*
         * On little-endian machines, we can process properly aligned
         * data without copying it.
         */
        if (!((data - (const md5_byte_t *)0) & 3)) {
        /* data are properly aligned */
        X = (const md5_word_t *)data;
        } else {
        /* not aligned */
        memcpy(xbuf, data, 64);
        X = xbuf;
        }
    }
#endif
#if BYTE_ORDER == 0
    else            /* dynamic big-endian */
#endif
#if BYTE_ORDER >= 0        /* big-endian */
    {
        /*
         * On big-endian machines, we must arrange the bytes in the
         * right order.
         */
        const md5_byte_t *xp = data;
        int i;

#  if BYTE_ORDER == 0
        X = xbuf;        /* (dynamic only) */
#  else
#    define xbuf X        /* (static only) */
#  endif
        for (i = 0; i < 16; ++i, xp += 4)
        xbuf[i] = xp[0] + (xp[1] << 8) + (xp[2] << 16) + (xp[3] << 24);
    }
#endif
    }

#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

    /* Round 1. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + F(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations. */
    SET(a, b, c, d,  0,  7,  T1);
    SET(d, a, b, c,  1, 12,  T2);
    SET(c, d, a, b,  2, 17,  T3);
    SET(b, c, d, a,  3, 22,  T4);
    SET(a, b, c, d,  4,  7,  T5);
    SET(d, a, b, c,  5, 12,  T6);
    SET(c, d, a, b,  6, 17,  T7);
    SET(b, c, d, a,  7, 22,  T8);
    SET(a, b, c, d,  8,  7,  T9);
    SET(d, a, b, c,  9, 12, T10);
    SET(c, d, a, b, 10, 17, T11);
    SET(b, c, d, a, 11, 22, T12);
    SET(a, b, c, d, 12,  7, T13);
    SET(d, a, b, c, 13, 12, T14);
    SET(c, d, a, b, 14, 17, T15);
    SET(b, c, d, a, 15, 22, T16);
#undef SET

     /* Round 2. */
     /* Let [abcd k s i] denote the operation
          a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + G(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  1,  5, T17);
    SET(d, a, b, c,  6,  9, T18);
    SET(c, d, a, b, 11, 14, T19);
    SET(b, c, d, a,  0, 20, T20);
    SET(a, b, c, d,  5,  5, T21);
    SET(d, a, b, c, 10,  9, T22);
    SET(c, d, a, b, 15, 14, T23);
    SET(b, c, d, a,  4, 20, T24);
    SET(a, b, c, d,  9,  5, T25);
    SET(d, a, b, c, 14,  9, T26);
    SET(c, d, a, b,  3, 14, T27);
    SET(b, c, d, a,  8, 20, T28);
    SET(a, b, c, d, 13,  5, T29);
    SET(d, a, b, c,  2,  9, T30);
    SET(c, d, a, b,  7, 14, T31);
    SET(b, c, d, a, 12, 20, T32);
#undef SET

     /* Round 3. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + H(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  5,  4, T33);
    SET(d, a, b, c,  8, 11, T34);
    SET(c, d, a, b, 11, 16, T35);
    SET(b, c, d, a, 14, 23, T36);
    SET(a, b, c, d,  1,  4, T37);
    SET(d, a, b, c,  4, 11, T38);
    SET(c, d, a, b,  7, 16, T39);
    SET(b, c, d, a, 10, 23, T40);
    SET(a, b, c, d, 13,  4, T41);
    SET(d, a, b, c,  0, 11, T42);
    SET(c, d, a, b,  3, 16, T43);
    SET(b, c, d, a,  6, 23, T44);
    SET(a, b, c, d,  9,  4, T45);
    SET(d, a, b, c, 12, 11, T46);
    SET(c, d, a, b, 15, 16, T47);
    SET(b, c, d, a,  2, 23, T48);
#undef SET

     /* Round 4. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
#define I(x, y, z) ((y) ^ ((x) | ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + I(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  0,  6, T49);
    SET(d, a, b, c,  7, 10, T50);
    SET(c, d, a, b, 14, 15, T51);
    SET(b, c, d, a,  5, 21, T52);
    SET(a, b, c, d, 12,  6, T53);
    SET(d, a, b, c,  3, 10, T54);
    SET(c, d, a, b, 10, 15, T55);
    SET(b, c, d, a,  1, 21, T56);
    SET(a, b, c, d,  8,  6, T57);
    SET(d, a, b, c, 15, 10, T58);
    SET(c, d, a, b,  6, 15, T59);
    SET(b, c, d, a, 13, 21, T60);
    SET(a, b, c, d,  4,  6, T61);
    SET(d, a, b, c, 11, 10, T62);
    SET(c, d, a, b,  2, 15, T63);
    SET(b, c, d, a,  9, 21, T64);
#undef SET

     /* Then perform the following additions. (That is increment each
        of the four registers by the value it had before this block
        was started.) */
    pms->abcd[0] += a;
    pms->abcd[1] += b;
    pms->abcd[2] += c;
    pms->abcd[3] += d;
}

void
md5_init(md5_state_t *pms)
{
    pms->count[0] = pms->count[1] = 0;
    pms->abcd[0] = 0x67452301;
    pms->abcd[1] = /*0xefcdab89*/ T_MASK ^ 0x10325476;
    pms->abcd[2] = /*0x98badcfe*/ T_MASK ^ 0x67452301;
    pms->abcd[3] = 0x10325476;
}

void
md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes)
{
    const md5_byte_t *p = data;
    int left = nbytes;
    int offset = (pms->count[0] >> 3) & 63;
    md5_word_t nbits = (md5_word_t)(nbytes << 3);

    if (nbytes <= 0)
    return;

    /* Update the message length. */
    pms->count[1] += nbytes >> 29;
    pms->count[0] += nbits;
    if (pms->count[0] < nbits)
    pms->count[1]++;

    /* Process an initial partial block. */
    if (offset) {
    int copy = (offset + nbytes > 64 ? 64 - offset : nbytes);

    memcpy(pms->buf + offset, p, copy);
    if (offset + copy < 64)
        return;
    p += copy;
    left -= copy;
    md5_process(pms, pms->buf);
    }

    /* Process full blocks. */
    for (; left >= 64; p += 64, left -= 64)
    md5_process(pms, p);

    /* Process a final partial block. */
    if (left)
    memcpy(pms->buf, p, left);
}

void
md5_finish(md5_state_t *pms, md5_byte_t digest[16])
{
    static const md5_byte_t pad[64] = {
    0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    };
    md5_byte_t data[8];
    int i;

    /* Save the length before padding. */
    for (i = 0; i < 8; ++i)
    data[i] = (md5_byte_t)(pms->count[i >> 2] >> ((i & 3) << 3));
    /* Pad to 56 bytes mod 64. */
    md5_append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1);
    /* Append the length. */
    md5_append(pms, data, 8);
    for (i = 0; i < 16; ++i)
    digest[i] = (md5_byte_t)(pms->abcd[i >> 2] >> ((i & 3) << 3));
}

main.cpp

#include <cstdio>
#include <cstring>
#include "md5.h"

int main() {
    md5_state_t s;
    char ss[] = "12345678";
    unsigned char result[16];
    md5_init(&s);
    md5_append(&s, (const unsigned char *)ss, strlen(ss));
    md5_finish(&s, (unsigned char *)result);
    for (int i = 0; i < 16; ++i) {
        printf("%x%x", (result[i] >> 4) & 0x0f, result[i] & 0x0f);
    }
    printf("\n");
    return 0;
}