Globally Unique Identifiers (UID): Several Implementation Approaches

Original source: http://www.tanjp.com/archives/132 (corrections and updates are posted there first)

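Contents

64-bit globally unique ID

Base-62 string unique ID

32-bit unique ID (128 unique values per second over about one year, 388 days)

32-bit unique ID (1024 unique values per second over about one month, 48 days)

32-bit unique ID (4096 unique values per second over about one week, 12 days)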


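64-bit globally unique ID

Snowflake represents a sequence number as a 64-bit integer. It looks much like an ordinary timestamp, but its epoch is user-defined rather than 1970. The timestamp field stores only the number of milliseconds elapsed since that custom epoch, so it does not need all 64 bits, and the remaining bits can carry other information. The layout is: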

00000.......000 00000 00000 000000000000
|_____________| |___| |___| |__________|
       |          |     |        |
     42bit      5bit  5bit     12bit

A modified layout based on the snowflake algorithm:

00000.......000 0000000000 000000000000
|_____________| |________| |__________|
       |            |           |
     42bit        10bit       12bit

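This 64-bit globally unique ID provides 4096 unique values per millisecond for roughly 139 years (2^42 ms).

Part 1: 42 bits. A custom millisecond timestamp, computed as the current time minus a chosen start time. It can cover roughly 139 years.

Part 2: 10 bits. Distinguishes business IDs, range [0, 1023].

Part 3: 12 bits. Auto-increment counter, range [0, 4095].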
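To make the three-part layout concrete, here is a minimal sketch of packing and unpacking the fields with shifts and masks. The names `Uid64Fields`, `pack_uid64` and `unpack_uid64` are hypothetical helpers introduced for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <cstdint>

// Field widths from the layout above: 42 + 10 + 12 = 64 bits.
constexpr unsigned kBusinessBits = 10;
constexpr unsigned kSequenceBits = 12;

// Hypothetical holder for the three parts of an ID.
struct Uid64Fields {
    std::uint64_t elapsed_ms;   // milliseconds since the custom epoch
    std::uint32_t business_id;  // [0, 1023]
    std::uint32_t sequence;     // [0, 4095]
};

// Pack the three fields; business_id and sequence are masked to their width.
inline std::uint64_t pack_uid64(const Uid64Fields& f) {
    const std::uint64_t business = f.business_id & ((1u << kBusinessBits) - 1);
    const std::uint64_t sequence = f.sequence & ((1u << kSequenceBits) - 1);
    return (f.elapsed_ms << (kBusinessBits + kSequenceBits))
         | (business << kSequenceBits)
         | sequence;
}

// Recover the three fields from a packed ID.
inline Uid64Fields unpack_uid64(std::uint64_t id) {
    Uid64Fields f;
    f.elapsed_ms = id >> (kBusinessBits + kSequenceBits);
    f.business_id = static_cast<std::uint32_t>(
        (id >> kSequenceBits) & ((1u << kBusinessBits) - 1));
    f.sequence = static_cast<std::uint32_t>(id & ((1u << kSequenceBits) - 1));
    return f;
}
```

Because the timestamp sits in the highest bits, IDs produced later compare greater than earlier ones as long as the clock does not move backwards.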

C++ implementation:

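#include <cstdint>
#include <boost/date_time/posix_time/posix_time.hpp>

typedef std::uint32_t uint32;
typedef std::uint64_t uint64;

class Uid64
{
public:
	explicit Uid64(uint32 pn_1st_from_year = 2000, uint32 pn_2nd_business_id = 0);
	uint64 generate(uint32 pn_2nd_business_id);
private:
	uint32 mn_1st_from_year;   // epoch year for the millisecond timestamp
	uint32 mn_2nd_business_id; // business ID
	uint32 mn_3rd_index;       // auto-increment counter
};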
Uid64::Uid64(uint32 pn_1st_from_year, uint32 pn_2nd_business_id)
	: mn_1st_from_year(pn_1st_from_year)
	, mn_2nd_business_id(pn_2nd_business_id)
	, mn_3rd_index(0)
{
	namespace pt = boost::posix_time;
	// Fall back to 1970 if the epoch year is in the future or before 1970.
	if ( (mn_1st_from_year > pt::microsec_clock::universal_time().date().year())
		|| (mn_1st_from_year < 1970) )
	{
		mn_1st_from_year = 1970;
	}
	// An out-of-range business ID falls back to 0.
	if (mn_2nd_business_id > 1023)
	{
		mn_2nd_business_id = 0;
	}
}
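uint64	Uid64::generate(uint32 pn_2nd_business_id)
{
	namespace pt = boost::posix_time;
	// An out-of-range business ID falls back to 0.
	mn_2nd_business_id = pn_2nd_business_id;
	if (mn_2nd_business_id > 1023)
	{
		mn_2nd_business_id = 0;
	}
	// The counter wraps at 4095 regardless of the clock, so IDs stay unique
	// only while fewer than 4096 are generated within one millisecond.
	++mn_3rd_index;
	if (mn_3rd_index > 4095)
	{
		mn_3rd_index = 0;
	}
	// Milliseconds elapsed since January 1 of the epoch year (UTC).
	pt::ptime now = pt::microsec_clock::universal_time();
	pt::ptime from_time(boost::gregorian::date(mn_1st_from_year, 1, 1));
	pt::time_duration time_span = now - from_time;
	uint64 zn_1st = time_span.total_milliseconds();
	uint64 zn_2nd = mn_2nd_business_id;
	uint64 zn_3rd = mn_3rd_index;
	// Layout: 42-bit time | 10-bit business ID | 12-bit counter.
	uint64 zn_result = (zn_1st << 22) | (zn_2nd << 12) | zn_3rd;
	return zn_result;
}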

Test example:

#include <iostream>
#include <unordered_set>
#include <vector>

int main()
{
	const uint32 kLen = 100000;
	Uid64 zo_u64(2019, 1);
	// Heap storage: 100000 64-bit values are too large for some default stacks.
	std::vector<uint64> zc_vec(kLen);
	for (uint32 i = 0; i < kLen; ++i)
	{
		zc_vec[i] = zo_u64.generate(5);
	}
	// Count duplicates among the generated IDs.
	uint32 zn_repeat_count = 0;
	std::unordered_set<uint64> zc_check_repeat;
	for (uint32 i = 0; i < kLen; ++i)
	{
		if (!zc_check_repeat.insert(zc_vec[i]).second)
		{
			++zn_repeat_count;
			std::cout << "repeat:" << zc_vec[i] << std::endl;
		}
	}
	std::cout << std::endl
		<< "repeat count : " << zn_repeat_count << std::endl;
	return 0;
}
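The class above wraps its counter without looking at the clock, so a burst of more than 4096 calls in one millisecond can repeat an ID. As a hedged alternative, the sketch below resets the counter every millisecond and waits for the next millisecond when the 4096 values are used up. It is not the author's implementation: the class name `Uid64PerMs` and its parameters are hypothetical, it uses `std::chrono` instead of Boost, and it assumes `std::chrono::system_clock` counts from the Unix epoch (guaranteed only from C++20, though common in practice).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical variant: 42-bit time | 10-bit business ID | 12-bit per-ms counter.
class Uid64PerMs {
public:
    // epoch_unix_ms is the custom epoch in Unix milliseconds; assumed to be in the past.
    explicit Uid64PerMs(std::uint64_t epoch_unix_ms, std::uint32_t business_id = 0)
        : epoch_ms_(epoch_unix_ms), business_id_(business_id & 1023u),
          last_ms_(0), sequence_(0) {}

    std::uint64_t generate() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::uint64_t now = now_unix_ms();
        if (now < last_ms_) now = last_ms_;      // clock stepped back: reuse the last tick
        if (now == last_ms_) {
            sequence_ = (sequence_ + 1) & 4095u;
            if (sequence_ == 0) {                // all 4096 values used in this millisecond
                while ((now = now_unix_ms()) <= last_ms_) { /* spin until the next ms */ }
            }
        } else {
            sequence_ = 0;                       // new millisecond: restart the counter
        }
        last_ms_ = now;
        return ((now - epoch_ms_) << 22)
             | (static_cast<std::uint64_t>(business_id_) << 12)
             | sequence_;
    }

private:
    static std::uint64_t now_unix_ms() {
        using namespace std::chrono;
        return static_cast<std::uint64_t>(
            duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count());
    }

    std::uint64_t epoch_ms_;
    std::uint32_t business_id_;
    std::uint64_t last_ms_;
    std::uint32_t sequence_;
    std::mutex mutex_;
};
```

The trade-off is that a caller may block briefly under heavy load, and a large backwards clock jump makes the spin loop wait until the clock catches up.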

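Base-62 string unique ID

A globally unique ID often must not contain punctuation, so this scheme converts values into printable characters with no punctuation. The valid characters are 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz, which map to the values 0 to 61 in that order. An ID is built by converting several numbers to strings and concatenating them. To parse the result unambiguously and keep it unique, concatenating N numbers requires at least N-1 of them to carry a length marker.

Concatenation formula: counter + thread ID + thread ID length + process ID + process ID length + relative timestamp + relative timestamp length (in the code below, the thread ID and process ID fields are the pn_actorid and pn_serverid parameters)

For example: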

Uid62 zo_u62;
std::cout << zo_u62.new_uid(135, 61) << std::endl;

Result: 1z1B22RlGj25

Breakdown: 1 (counter) z (encoding of 61) 1 (length of that encoding) B2 (encoding of 135) 2 (length of that encoding) RlGj2 (encoding of the relative timestamp) 5 (length of that encoding)
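The breakdown above can be reproduced with a small standalone sketch. The function names `encode_with_len` and `decode_with_len` are hypothetical and chosen for illustration. Digits are written least significant first, followed by one character that gives the digit count, which matches the example fields z1 and B22.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

static const std::string kBase62 =
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Encode value as base-62 digits, least significant first, then append
// one character giving the number of digits.
inline std::string encode_with_len(std::uint32_t value) {
    std::string out;
    do {
        out.push_back(kBase62[value % 62]);
        value /= 62;
    } while (value != 0);
    out.push_back(kBase62[out.size()]);  // a 32-bit value needs at most 6 digits
    return out;
}

// Decode one field produced by encode_with_len; returns false on malformed input.
inline bool decode_with_len(const std::string& field, std::uint32_t& value) {
    if (field.size() < 2) return false;
    const std::size_t len = kBase62.find(field.back());
    if (len == std::string::npos || len != field.size() - 1 || len > 6) return false;
    std::uint64_t result = 0;
    std::uint64_t weight = 1;            // 62^i for digit i
    for (std::size_t i = 0; i < len; ++i) {
        const std::size_t digit = kBase62.find(field[i]);
        if (digit == std::string::npos) return false;
        result += digit * weight;
        weight *= 62;
    }
    if (result > 0xFFFFFFFFull) return false;  // does not fit in 32 bits
    value = static_cast<std::uint32_t>(result);
    return true;
}
```

Keeping the length character at the end of each field is what lets a parser walk the ID from right to left, peeling off one field at a time.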

Implementation:

 

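#include <cstdint>
#include <ctime>
#include <sstream>
#include <string>
#include <unordered_map>

typedef std::uint8_t uint8;
typedef std::uint32_t uint32;
typedef std::uint64_t uint64;

class Uid62
{
public:
	explicit Uid62(uint32 pn_basetime = 1514736000U);
	std::string new_uid(uint32 pn_serverid, uint32 pn_actorid);
	std::string split_uid(const std::string & ps_uid);
private:
	Uid62(const Uid62&) = delete;
	Uid62(Uid62&&) = delete;
	Uid62& operator=(const Uid62&) = delete;
	Uid62& operator=(Uid62&&) = delete;
	void convert_to_str(std::stringstream & po_result, uint32 pn_val, bool pb_append_len = true);
	bool convert_to_uint32(uint32 & pn_result, const std::string & ps_str);
private:
	uint32 mn_basetime;  // base time in Unix seconds
	uint32 mn_calc;      // counter within the current second
	uint32 mn_lasttime;  // second of the most recent call
	std::string ms_code; // the 62 valid characters
	std::unordered_map<uint8, uint8> mc_codemap; // character -> value
};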
Uid62::Uid62(uint32 pn_basetime)
	: mn_basetime(pn_basetime)
	, mn_calc(0)
	, mn_lasttime(0)
	, ms_code("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
{
	for (uint8 i = 0; i < ms_code.size(); ++i)
	{
		mc_codemap[ms_code[i]] = i;
	}
}
std::string Uid62::new_uid(uint32 pn_serverid, uint32 pn_actorid)
{
	uint32 zn_now = (uint32)time(NULL);
	if ( (pn_serverid > 999999999U) || (pn_actorid > 999999999U) || (mn_basetime > zn_now) )
	{
		return std::string();
	}
	if (zn_now == mn_lasttime)
	{
		// same second: increment the counter
		mn_calc++;
	}
	else
	{
		mn_lasttime = zn_now;
		mn_calc = 1;
	}
	std::stringstream ss;
	convert_to_str(ss, mn_calc, false);
	convert_to_str(ss, pn_actorid, true);
	convert_to_str(ss, pn_serverid, true);
	convert_to_str(ss, zn_now - mn_basetime, true);
	return ss.str();
}

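std::string Uid62::split_uid(const std::string & ps_uid)
{
	// The index arithmetic in this function uses uint8, so reject empty or overlong input first.
	if (ps_uid.empty() || ps_uid.size() > 255)
	{
		return "error='invalid uid:" + ps_uid + "'";
	}
	for (size_t i = 0; i < ps_uid.size(); ++i)
	{
		if (mc_codemap.find(ps_uid[i]) == mc_codemap.end())
		{
			return "error='invalid uid:" + ps_uid + "'";
		}
	}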
	//time
	uint8 len_char_index = (uint8)ps_uid.size() - 1;
	char len_char = ps_uid[len_char_index];
	uint8 len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		// length mismatch
		return "error='invalid time len char : " + ps_uid + "'";
	}
	uint8 start_index = len_char_index - len;
	std::string time_str = ps_uid.substr(start_index, len + 1);
	if (start_index < 2)
	{
		// too few characters left for the serverid field
		return "error='invalid formation no serverid : " + ps_uid + "'";
	}

	//serverid
	len_char_index = start_index - 1;
	len_char = ps_uid[len_char_index];
	len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		// length mismatch
		return "error='invalid serverid len char : " + ps_uid + "'";
	}
	start_index = len_char_index - len;
	std::string serverid_str = ps_uid.substr(start_index, len + 1);
	if (start_index < 2)
	{
		// too few characters left for the actorid field
		return "error='invalid formation no actorid : " + ps_uid + "'";
	}

	//actorid
	len_char_index = start_index - 1;
	len_char = ps_uid[len_char_index];
	len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		// length mismatch
		return "error='invalid actorid len char : " + ps_uid + "'";
	}
	start_index = len_char_index - len;
	std::string actorid_str = ps_uid.substr(start_index, len + 1);
	if ((start_index < 1) || (start_index >= ms_code.size()))
	{
		// no characters left for the counter, or the counter is too long
		return "error='invalid formation no calc : " + ps_uid + "'";
	}

	//calc
	len = start_index;
	std::string calc_str = ps_uid.substr(0, len);
	calc_str.append(1, ms_code[len]);
	uint32 zn_serverid = 0;
	uint32 zn_actorid = 0;
	uint32 zn_calc = 0;
	uint32 zn_time = 0;

	if (!convert_to_uint32(zn_serverid, serverid_str))
	{
		return "error='convert serverid failed : " + serverid_str + "'";
	}
	if (!convert_to_uint32(zn_actorid, actorid_str))
	{
		return "error='convert actorid failed : " + actorid_str + "'";
	}
	if (!convert_to_uint32(zn_calc, calc_str))
	{
		return "error='convert calc failed : " + calc_str + "'";
	}
	if (!convert_to_uint32(zn_time, time_str))
	{
		return "error='convert time failed : " + time_str + "'";
	}
	std::stringstream ss;
	ss << "serverid=" << zn_serverid << ",actorid=" << zn_actorid
		<< ",tps=" << zn_calc << ",timespan=" << zn_time;
	return ss.str();
}

void Uid62::convert_to_str(std::stringstream & po_ss, uint32 pn_val, bool pb_append_len)
{
	// Emit base-62 digits, least significant first.
	uint32 len = 0;
	while (pn_val >= ms_code.length())
	{
		po_ss << ms_code[pn_val % ms_code.length()];
		pn_val /= (uint32)ms_code.length();
		++len;
	}
	// The remaining value is a single digit in 0 ~ 61.
	po_ss << ms_code[pn_val];
	++len;
	if (pb_append_len)
	{
		// Append the digit count as one base-62 character.
		po_ss << ms_code[len];
	}
}

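bool Uid62::convert_to_uint32(uint32 & pn_result, const std::string & ps_str)
{
	if (ps_str.size() < 2)
		return false; // malformed: needs at least one digit plus the length character

	char len_char = ps_str[ps_str.size() - 1];
	if (mc_codemap.find(len_char) == mc_codemap.end())
		return false; // invalid length character

	uint8 len = mc_codemap[len_char];
	if (len > (ps_str.size() - 1) )
		return false; // length mismatch
	if (len > 6)
		return false; // a 32-bit value needs at most 6 base-62 digits

	// Digits are stored least significant first, so digit i has weight 62^i.
	// Integer arithmetic avoids the rounding risk of std::pow.
	uint64 zn_result = 0;
	uint64 zn_weight = 1;
	for (uint8 i = 0; i < len; ++i)
	{
		char c = ps_str[ps_str.size() - 1 - (len - i)];
		if (mc_codemap.find(c) == mc_codemap.end())
			return false; // invalid character

		zn_result += mc_codemap[c] * zn_weight;
		zn_weight *= ms_code.length();
	}
	pn_result = (uint32)zn_result;
	return true;
}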

 

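32-bit unique ID (128 unique values per second over about one year, 388 days)

Split the 32-bit integer into two parts, laid out as follows:

 00000........000 0000000
 |______________| |_____|
        |            |
      25bit         7bit

Part 1: 25 bits. A timestamp with one-second resolution, computed as the current time minus the program start time. It can cover about one year (388 days).
Part 2: 7 bits. Auto-increment counter, range [0, 127].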
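The day counts and per-second capacities quoted for this split, and for the 22/10 and 20/12 splits listed in the contents, follow directly from the bit widths. A small sketch that checks the arithmetic at compile time (the helper names `days_covered` and `ids_per_second` are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Whole days covered by a seconds counter that is time_bits wide.
constexpr std::uint32_t days_covered(unsigned time_bits) {
    return static_cast<std::uint32_t>((std::uint64_t(1) << time_bits) / 86400u);
}

// Distinct counter values left in the low bits of a 32-bit ID.
constexpr std::uint32_t ids_per_second(unsigned time_bits) {
    return std::uint32_t(1) << (32 - time_bits);
}

static_assert(days_covered(25) == 388 && ids_per_second(25) == 128, "25/7 split");
static_assert(days_covered(22) == 48 && ids_per_second(22) == 1024, "22/10 split");
static_assert(days_covered(20) == 12 && ids_per_second(20) == 4096, "20/12 split");
```

Each bit moved from the timestamp to the counter doubles the IDs available per second and halves the period before the timestamp wraps.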

Implementation:

 

 

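#include <cstdint>
#include <ctime>

typedef std::uint32_t uint32;

class Uid32YearPS128
{
public:
	Uid32YearPS128();
	uint32 generate();
private:
	uint32 mn_1st_from_time; // start time, in seconds
	uint32 mn_2nd_index;     // auto-increment counter, [0, 127]
	uint32 mn_limit_counter; // quota counter for the per-second limit
	uint32 mn_reset_time;    // time at which the quota counter was last reset
};
Uid32YearPS128::Uid32YearPS128()
	: mn_1st_from_time((uint32)std::time(0) - 1) // start from the previous second, so an ID is never 0
	, mn_2nd_index(0)
	, mn_limit_counter(0)
	, mn_reset_time((uint32)std::time(0)){}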
uint32	Uid32YearPS128::generate()
{
	uint32 now = (uint32)std::time(0);
	if (mn_reset_time != now)
	{
		// a new second has started: reset the per-second quota
		mn_reset_time = now;
		mn_limit_counter = 0;
	}
	if (mn_limit_counter > 127)
	{
		return 0; // more than 128 IDs requested within one second: generation fails
	}
	++mn_limit_counter;
	uint32 zn_1st = now - mn_1st_from_time;
	// a 25-bit field holds at most 2^25 - 1 = 33554431; 33523200 s = 388 days
	if (zn_1st > 33523200)
	{
		// period exceeded: restart from the previous second
		mn_1st_from_time = now - 1;
		zn_1st = now - mn_1st_from_time;
	}
	++mn_2nd_index;
	if (mn_2nd_index > 127)
	{
		mn_2nd_index = 0;
	}
	return ((zn_1st << 7) | mn_2nd_index);
}

 

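32-bit unique ID (1024 unique values per second over about one month, 48 days)

Split the 32-bit integer into two parts, laid out as follows:

 00000........000 0000000000
 |______________| |________|
        |             |
      22bit         10bit

Part 1: 22 bits. A timestamp with one-second resolution, computed as the current time minus the program start time. It can cover about one month (48 days).
Part 2: 10 bits. Auto-increment counter, range [0, 1023].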

 

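32-bit unique ID (4096 unique values per second over about one week, 12 days)

Split the 32-bit integer into two parts, laid out as follows:

 00000........000 000000000000
 |______________| |__________|
        |              |
      20bit          12bit

Part 1: 20 bits. A timestamp with one-second resolution, computed as the current time minus the program start time. It can cover about one week (12 days).
Part 2: 12 bits. Auto-increment counter, range [0, 4095].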
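The source gives no code for the 22/10 and 20/12 splits. The sketch below shows one way a single template could cover them, with the counter width as a template parameter. This is an assumption-laden illustration, not the author's implementation: the class name `Uid32`, the injectable start time, and the `generate_at` method are hypothetical, and unlike the class above it restarts the counter at 0 every second.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>

// Hypothetical generic 32-bit ID: (32 - SeqBits) bits of elapsed seconds,
// followed by SeqBits bits of per-second counter.
template <unsigned SeqBits>
class Uid32 {
    static_assert(SeqBits > 0 && SeqBits < 32, "SeqBits must be in [1, 31]");
public:
    // start_s is the start time in Unix seconds; elapsed time counts from start_s - 1
    // so that a valid ID is never 0.
    explicit Uid32(std::uint32_t start_s = static_cast<std::uint32_t>(std::time(nullptr)))
        : start_(start_s - 1), last_(0), used_(0) {}

    // Returns 0 when the per-second quota is exhausted.
    std::uint32_t generate() {
        return generate_at(static_cast<std::uint32_t>(std::time(nullptr)));
    }

    // Same as generate(), with the clock passed in (useful for tests).
    std::uint32_t generate_at(std::uint32_t now_s) {
        if (now_s != last_) {            // new second: restart the counter
            last_ = now_s;
            used_ = 0;
        }
        if (used_ >= kPerSecond) return 0;
        std::uint32_t elapsed = now_s - start_;
        if (elapsed > kMaxElapsed) {     // period exhausted: restart the epoch
            start_ = now_s - 1;
            elapsed = 1;
        }
        return (elapsed << SeqBits) | used_++;
    }

private:
    static constexpr std::uint32_t kPerSecond = std::uint32_t(1) << SeqBits;
    static constexpr std::uint32_t kMaxElapsed =
        (std::uint32_t(1) << (32 - SeqBits)) - 1;

    std::uint32_t start_;  // epoch for the elapsed-seconds field
    std::uint32_t last_;   // second of the most recent call
    std::uint32_t used_;   // IDs issued in that second
};
```

Under these assumptions, `Uid32<10>` would correspond to the one-month variant and `Uid32<12>` to the one-week variant. As in the 25/7 class, IDs issued after the epoch restarts can collide with IDs from the previous period.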

 

 
