全局唯一标识(UID)-多种实现方案

原文转自:http://www.tanjp.com/archives/132 (即时修正和更新)

目录

64位全局唯一标识

62进制字符串唯一ID

32位唯一ID(一年(388天)内的每秒内有128个唯一值)

32位唯一ID(一个月(48天)内的每秒内有1024个唯一值)

32位唯一ID(一周(12天)内的每秒内有4096个唯一值)


64位全局唯一标识

snowflake是用64位整型来表示一个序列号的,看上去和我们的时间戳很像,但是它的起始元年不是1970年,而是自定义的。然后时间戳里面只是记录下当前毫秒与自定义的元年时间差,这样起始就用不到64位的bit来记录整个时间戳,多出来的几位就可以做其他的事情,来看下面的结构:

00000.......000 00000 00000 000000000000
|___________| |___| |___| |__________|
            |               |        |              |
        42bit        5bit    5bit       12bit

参考snowflake算法改造:

00000.......000 0000000000 000000000000
|___________| |________| |__________|
            |                     |             |
        42bit              10bit      12bit

64位的全局唯一ID,保证在138年内的每1毫秒内有4095个唯一值。

第一部分 长度为42位,精确到毫秒级的自定义时间戳。当前时间戳减去指定时间开始的时间戳,最大可记录138年。

第二部分 长度为10位,根据不同业务ID做区分,取值范围[0,1023]。

第三部分 长度为12位,自增ID,取值范围[0,4095]。

C++代码实现:

class Uid64
{
public:
    	explicit Uid64(uint32 pn_1st_from_year = 2000, uint32 pn_2nd_business_id = 0);
    	uint64 generate(uint32 pn_2nd_business_id);
private:
    	uint32 mn_1st_from_year; //毫秒级时间
    	uint32 mn_2nd_business_id; //业务ID
    	uint32 mn_3rd_index; //自增
};
Uid64::Uid64(uint32 pn_1st_from_year, uint32 pn_2nd_business_id)
	: mn_1st_from_year(pn_1st_from_year)
	, mn_2nd_business_id(pn_2nd_business_id)
	, mn_3rd_index(0)
{
    namespace pt = boost::posix_time;
	if ( (mn_1st_from_year > pt::microsec_clock::universal_time().date().year())
		|| (mn_1st_from_year < 1970) )
	{
		mn_1st_from_year = 1970;
	}
	if (pn_2nd_business_id > 1023)
	{
		pn_2nd_business_id = 0;
	}
}
uint64	Uid64::generate(uint32 pn_2nd_business_id)
{
	namespace pt = boost::posix_time;
	mn_2nd_business_id = pn_2nd_business_id;
	if (mn_2nd_business_id > 1023)
	{
		mn_2nd_business_id = 0;
	}
	++mn_3rd_index;
	if (mn_3rd_index > 4095)
	{
		mn_3rd_index = 0;
	}
	pt::ptime now = pt::microsec_clock::universal_time();
	pt::ptime from_time(boost::gregorian::date(mn_1st_from_year, 1, 1));
	pt::time_duration time_span = now - from_time;
	uint64 zn_1st = time_span.total_milliseconds();
	uint64 zn_2nd = mn_2nd_business_id;
	uint64 zn_3rd = mn_3rd_index;
	uint64 zn_result = (zn_1st << 22) | (zn_2nd << 12) | zn_3rd;
	return zn_result;
}

测试例子:

void main()
{	
    const uint32 kLen = 100000;
	Uid64 zo_u64(2019, 1);
	uint64 zc_vec[kLen];
	for (uint32 i = 0; i < kLen; ++i)
	{
		zc_vec[i] = zo_u64.generate(5);
	}
    uint32 zn_repeat_count = 0;
    std::unordered_set<uint64> zc_check_repeat;
    for (uint32 i = 0; i < kLen; ++i)
    {
        auto it = zc_check_repeat.find(zc_vec[i]);
        if (it == zc_check_repeat.end())
        {
            zc_check_repeat.insert(zc_vec[i]);
        }
        else
        {
            ++zn_repeat_count;
            std::cout << "repeat:" << zc_vec[i] << std::endl;
        }
    }
        std::cout << std::endl 
            << "repeat count : " << zn_repeat_count << std::endl;
}

62进制字符串唯一ID

全局唯一ID在程序很多时候不能包含标点符号,于是设计一种不包含标点符号的可见字符转换方案。非标点有效字符包括 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,分别对应数值 0~61,唯一ID的生成由多个数值转成字符串后拼接起来,为了正确解析和保证唯一性,N个数值拼接,需要N-1个以上带字符串的长度信息。

拼接公式: 自增+线程ID+线程ID字符长度+进程ID+进程ID字符长度+相对时间戳+相对时间戳字符长度

例如:

Uid62 zo_u62;
std::cout << zo_u62.new_uid(135, 61) << std::endl;

结果: 1z1B22RlGj25

解析: 1(自增) z(61相应的字符) 1(61相应的字符的长度) B2(135相应的字符) 2(135相应的字符的长度) RlGj2(相对时间戳相应的字符) 5(相对时间戳相应的字符长度)

代码实现:

 

class Uid62
{
public:
    explicit Uid62(uint32 pn_basetime = 1514736000U);
    std::string new_uid(uint32 pn_serverid, uint32 pn_actorid);
    std::string split_uid(const std::string & ps_uid);
private:
    Uid62(const Uid62&) = delete;
    Uid62(Uid62&&) = delete;
    Uid62& operator=(const Uid62&) = delete;
	Uid62& operator=(Uid62&&) = delete;
	void convert_to_str(std::stringstream & po_result, uint32 pn_val, bool pb_append_len = true);
	bool convert_to_uint32(uint32 & pn_result, const std::string & ps_str);
private:
	uint32 mn_basetime;
	uint32 mn_calc;
	uint32 mn_lasttime;
	std::string ms_code;
	std::unordered_map<uint8, uint8> mc_codemap;
};
Uid62::Uid62(uint32 pn_basetime)
	: mn_basetime(pn_basetime)
	, mn_calc(0)
	, mn_lasttime(0)
	, ms_code("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
{
	for (uint8 i = 0; i < ms_code.size(); ++i)
	{
		mc_codemap[ms_code[i]] = i;
	}
}
std::string Uid62::new_uid(uint32 pn_serverid, uint32 pn_actorid)
{
	uint32 zn_now = (uint32)time(NULL);
	if ( (pn_serverid > 999999999U) || (pn_actorid > 999999999U) || (mn_basetime > zn_now) )
	{
		return std::string();
	}
	if (zn_now == mn_lasttime)
	{
		// 同1s内, 累加
		mn_calc++;
	}
	else
	{
		mn_lasttime = zn_now;
		mn_calc = 1;
	}
	std::stringstream ss;
	convert_to_str(ss, mn_calc, false);
	convert_to_str(ss, pn_actorid, true);
	convert_to_str(ss, pn_serverid, true);
	convert_to_str(ss, zn_now - mn_basetime, true);	
	return std::move(ss.str());
}

std::string Uid62::split_uid(const std::string & ps_uid)
{
	for (uint8 i = 0; i < ps_uid.size(); ++i)
	{
		if (mc_codemap.find(ps_uid[i]) == mc_codemap.end())
		{
			return "error='invalid uid:" + ps_uid + "'";
		}
	}
	//time
	uint8 len_char_index = (uint8)ps_uid.size() - 1;
	char len_char = ps_uid[len_char_index];
	uint8 len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		//长度不匹配
		return "error='invalid time len char : " + ps_uid + "'";
	}
	uint8 start_index = len_char_index - len;
	std::string time_str = ps_uid.substr(start_index, len + 1);
	if (start_index < 2)
	{
		//长度不匹配
		return "error='invalid formation no serverid : " + ps_uid + "'";
	}

	//serverid
	len_char_index = start_index - 1;
	len_char = ps_uid[len_char_index];
	len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		//长度不匹配
		return "error='invalid serverid len char : " + ps_uid + "'";
	}
	start_index = len_char_index - len;
	std::string serverid_str = ps_uid.substr(start_index, len + 1);
	if (start_index < 2)
	{
		//长度不匹配
		return "error='invalid formation no actorid : " + ps_uid + "'";
	}

	//actorid
	len_char_index = start_index - 1;
	len_char = ps_uid[len_char_index];
	len = mc_codemap[len_char];
	if (len > len_char_index)
	{
		//长度不匹配
		return "error='invalid actorid len char : " + ps_uid + "'";
	}
	start_index = len_char_index - len;
	std::string actorid_str = ps_uid.substr(start_index, len + 1);
	if ((start_index < 1) || (start_index >= ms_code.size()))
	{
		//长度不匹配
		return "error='invalid formation no calc : " + ps_uid + "'";
	}

	//calc
	len = start_index;
	std::string calc_str = ps_uid.substr(0, len);
	calc_str.append(1, ms_code[len]);
	uint32 zn_serverid = 0;
	uint32 zn_actorid = 0;
	uint32 zn_calc = 0;
	uint32 zn_time = 0;

	if (!convert_to_uint32(zn_serverid, serverid_str))
	{
		return "error='convert serverid failed : " + serverid_str + "'";
	}
	if (!convert_to_uint32(zn_actorid, actorid_str))
	{
		return "error='convert actorid failed : " + actorid_str + "'";
	}
	if (!convert_to_uint32(zn_calc, calc_str))
	{
		return "error='convert calc failed : " + calc_str + "'";
	}
	if (!convert_to_uint32(zn_time, time_str))
	{
		return "error='convert time failed : " + time_str + "'";
	}
	std::stringstream ss;
	ss << "serverid=" << zn_serverid << ",actorid=" << zn_actorid
		<< ",tps=" << zn_calc << ",timespan=" << zn_time;
	return ss.str();
}

void Uid62::convert_to_str(std::stringstream & po_ss, uint32 pn_val, bool pb_append_len)
{
	uint32 len = 0;
	while (pn_val >= ms_code.length())
	{
		po_ss << ms_code[pn_val % ms_code.length()];
		pn_val /= (uint32)ms_code.length();
		++len;
	}
	if (pn_val >= 0)
	{
		//0 ~ 61
		po_ss << ms_code[pn_val];
		++len;		
	}
	if (pb_append_len)
	{	
		po_ss << ms_code[len];
	}
}

bool Uid62::convert_to_uint32(uint32 & pn_result, const std::string & ps_str)
{
	if (ps_str.size() < 2)
		return false; //非法结构

	char len_char = ps_str[ps_str.size() - 1];
	if (mc_codemap.find(len_char) == mc_codemap.end())
		return false; //非法长度

	uint8 len = mc_codemap[len_char];
	if (len > (ps_str.size() - 1) )
		return false; //长度不匹配

	char c = 0;
	uint32 zn_result = 0;
	for (uint8 i = 0; i < len; ++i)
	{
		c = ps_str[ps_str.size() - 1 - (len - i)];
		if (mc_codemap.find(c) == mc_codemap.end())
			return false; //非法字符
   
		uint32 val = mc_codemap[c];
		zn_result += val * (uint32)std::pow(ms_code.length(), i);
	}
	pn_result = zn_result;
	return true;
}

 

32位唯一ID(一年(388天)内的每秒内有128个唯一值)

将32位的整型划分为两部分,存储结构如下:
 00000........000 0000000
 |___________| |_____|
              |                |
          25bit          7bit

 第一部分 长度为25位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1年(388天)。
 第二部分 长度为7位, 自增ID,取值范围[0,127]。

代码实现:

 

 

class Uid32YearPS128
{
public:
    	Uid32YearPS128();
    	uint32 generate();
private:
    	uint32 mn_1st_from_time;
    	uint32 mn_2nd_index;
    	uint32 mn_limit_counter;
	uint32 mn_reset_time;
};
Uid32YearPS128::Uid32YearPS128()
	: mn_1st_from_time(std::time(0)-1) //从上一秒开始
	, mn_2nd_index(0)
	, mn_limit_counter(0)
	, mn_reset_time((uint32)std::time(0)){}
uint32	Uid32YearPS128::generate()
{
	uint32 now = (uint32)std::time(0);
	if (mn_limit_counter > 127)
	{
		if (mn_reset_time == now)
		{
			return 0; //同一秒内超过数量上限, 生成失败
		}		
		mn_reset_time = now;
		mn_limit_counter = 0;
	}
	++mn_limit_counter;
	uint32 zn_1st = now - mn_1st_from_time;
	// 长度为22位数值大小: 2^25 - 1 = 33554431, 33523200=388天
	if (zn_1st > 33523200)
	{
		//超过周期大小,回归
		mn_1st_from_time = now - 1;
		zn_1st = now - mn_1st_from_time;
	}
	++mn_2nd_index;
	if (mn_2nd_index > 127)
	{
		mn_2nd_index = 0;
	}
	return ((zn_1st << 7) | mn_2nd_index);
}

 

32位唯一ID(一个月(48天)内的每秒内有1024个唯一值)

将32位的整型划分为两部分,存储结构如下:

 

 00000........000 0000000000
 |___________| |________|
                |              |
            22bit         10bit

 第一部分 长度为22位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1个月(48天)。
 第二部分 长度为10位, 自增ID,取值范围[0,1023]。

 

32位唯一ID(一周(12天)内的每秒内有4096个唯一值)

将32位的整型划分为两部分,存储结构如下:

 

 00000........000 000000000000
 |___________| |__________|
             |                     |
         20bit              12bit

 第一部分 长度为20位, 精确到秒级的时间戳。当前时间戳减去程序启动时间开始的时间戳,最大可记录1周(12天)。
 第二部分 长度为20位, 自增ID,取值范围[0,4095]。

 

 

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章