hashtable overview

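The earlier section on the RB-tree showed that a binary search tree inserts, finds, and erases in O(log n) time on average. For a plain binary search tree that figure rests on an assumption: the input data must be sufficiently random. A red-black tree keeps the bound by rebalancing itself. A hash table, by contrast, offers "constant average time", O(1), for insertion, deletion, and lookup; this behaviour is statistical in nature and does not depend on the randomness of the input data.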

A hashtable implementation has to resolve collisions, and the main strategies are linear probing, quadratic probing, separate chaining, and so on.

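SGI STL uses separate chaining. Each slot of the hash table is called a bucket, and each bucket heads a linked list. That list is not an STL list or slist, but a chain built from the hashtable node shown below. The collection of buckets is held in a vector.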

template <class Value>
struct __hashtable_node
{
    __hashtable_node* next;
    Value val;
};
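
To make the layout concrete, here is a minimal sketch of a vector of bucket heads, each heading a singly linked chain of nodes. The names Node, BucketArray, push_front, and chain_length are assumptions made for illustration; they are not part of SGI STL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same shape as the __hashtable_node above; the name Node is ours.
template <class Value>
struct Node {
    Node* next;
    Value val;
};

// Hypothetical minimal container: one head pointer per bucket.
template <class Value>
class BucketArray {
public:
    explicit BucketArray(std::size_t n) : buckets_(n, nullptr) { assert(n > 0); }

    ~BucketArray() {
        // Free every chain, one node at a time.
        for (Node<Value>* head : buckets_) {
            while (head) {
                Node<Value>* next = head->next;
                delete head;
                head = next;
            }
        }
    }

    BucketArray(const BucketArray&) = delete;
    BucketArray& operator=(const BucketArray&) = delete;

    // Links a new node at the front of the chosen bucket's chain.
    void push_front(std::size_t bucket, const Value& v) {
        assert(bucket < buckets_.size());
        buckets_[bucket] = new Node<Value>{buckets_[bucket], v};
    }

    // Counts the nodes chained in one bucket.
    std::size_t chain_length(std::size_t bucket) const {
        assert(bucket < buckets_.size());
        std::size_t len = 0;
        for (const Node<Value>* cur = buckets_[bucket]; cur; cur = cur->next) ++len;
        return len;
    }

    std::size_t bucket_count() const { return buckets_.size(); }

private:
    std::vector<Node<Value>*> buckets_;
};
```

An empty bucket is simply a null head pointer, so an empty table costs one pointer per bucket and nothing more.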

Usage notes

1. Do not include <stl_hashtable.h> directly when using hashtable. Include the header of a container that uses hashtable instead, for example <hash_set.h> or <hash_map.h>:

#include<hash_set>
#include<hash_map>

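2. The built-in hash functions handle only the integral types (char, short, int, long, and their unsigned variants) and C strings (char* and const char*). They do not handle string, double, or float; for these three types the user must supply a hash function.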

3. Elements with equal keys always land in the same bucket list; elements with different keys may also land in the same bucket list.
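
Notes 2 and 3 can be sketched together. The functor StringHash and the helper bucket_of below are hypothetical names; the loop is modelled on the h = 5*h + c scheme that SGI applies to char*, and std::unordered_set stands in for the non-standard hash_set so that the sketch compiles with a current standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical user-supplied hash functor for std::string (note 2).
// Modelled on the h = 5*h + c loop SGI uses for char*.
struct StringHash {
    std::size_t operator()(const std::string& s) const {
        unsigned long h = 0;
        for (unsigned char c : s) h = 5 * h + c;
        return static_cast<std::size_t>(h);
    }
};

// Assumption: std::unordered_set plays the role of hash_set here.
using StringSet = std::unordered_set<std::string, StringHash>;

// Hypothetical helper for note 3: a hash value is reduced modulo
// the number of buckets to pick a bucket.
inline std::size_t bucket_of(std::size_t hash_value, std::size_t bucket_count) {
    assert(bucket_count > 0);
    return hash_value % bucket_count;
}
```

SGI's hash for integers returns the value itself, so with 53 buckets the distinct keys 2, 55, and 108 all reduce to bucket 2, while equal keys can never end up in different buckets.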

Insertion and table rebuilding

Every insertion into a hashtable first checks whether the table, that is, the buckets vector, needs to be rebuilt.

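The book (《STL源碼剖析》) notes that the rule for deciding whether to rebuild the table is rather peculiar: it compares the number of elements (counting the element about to be added) with the size of the buckets vector, and rebuilds the table if the former is larger. It follows that the maximum capacity of each bucket (list) equals the size of the buckets vector.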

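One way to read this: in the worst case every element lands in the first bucket, so that list holds at most as many elements as there are buckets. In the best case every bucket holds exactly one element, which makes lookup very fast. In practice some buckets hold more than one element, which leaves other buckets empty.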
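
The rule amounts to keeping the average chain length at or below one. A small sketch, with the hypothetical helper names needs_rebuild and load_factor:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the comparison described above:
// count the incoming element, then compare with the bucket count.
inline bool needs_rebuild(std::size_t num_elements, std::size_t bucket_count) {
    return num_elements + 1 > bucket_count;
}

// Average number of elements per bucket.
inline double load_factor(std::size_t num_elements, std::size_t bucket_count) {
    assert(bucket_count > 0);
    return static_cast<double>(num_elements) / static_cast<double>(bucket_count);
}
```

With 53 buckets the 53rd element still fits and the 54th triggers a rebuild, so the load factor never exceeds 1 as long as a larger table size is available.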

Insertion without duplicates

pair<iterator, bool> insert_unique(const value_type& obj)
{
    resize(num_elements + 1); // decides whether the table must be rebuilt, and rebuilds it if so
    return insert_unique_noresize(obj);
}

Table rebuilding

template <class V, class K, class HF, class Ex, class Eq, class A>
void hashtable<V, K, HF, Ex, Eq, A>::resize(size_type num_elements_hint)
{
  const size_type old_n = buckets.size(); // current size of the buckets vector
  /* Rebuild only if the element count (counting the new element) exceeds the bucket count. */
  if (num_elements_hint > old_n) {
      const size_type n = next_size(num_elements_hint); // next prime from the prime table

      if (n > old_n) { // grow only if old_n is not already the largest prime in the table
        vector<node*, A> tmp(n, (node*)0); // new buckets vector of size n
        // Process each old bucket in turn.
        for (size_type bucket = 0; bucket < old_n; ++bucket) {
            node* first = buckets[bucket]; // head of this bucket's chain
            while (first) { // walk the chain of this bucket
                size_type new_bucket = bkt_num(first->val, n); // bucket the node maps to in the new table
                buckets[bucket] = first->next; // old bucket now points at the next node, ready for the next iteration
                /* Push the current node onto the front of its new chain. Unlike an
                   ordinary insertion, nothing is allocated or copied: the node already
                   exists, so only pointers are rewritten. */
                first->next = tmp[new_bucket];
                tmp[new_bucket] = first;
                first = buckets[bucket]; // back to the old chain for the next node
            }
        }
        buckets.swap(tmp); // vector::swap exchanges the two vectors' storage without copying elements
        /* After the swap, buckets holds the new, larger array and tmp holds the old one;
           the memory of the local tmp is released on leaving this scope. */
      }
  }
}
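
The pointer relinking at the heart of resize() can be tried in isolation. The following is a sketch only, assuming non-negative int keys and an identity hash; Node, bucket_of, and rehash are hypothetical names standing in for node, bkt_num, and the inner loop of resize().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    Node* next;
    int val;
};

// Hypothetical stand-in for bkt_num(): identity hash, non-negative keys only.
inline std::size_t bucket_of(int key, std::size_t n) {
    assert(key >= 0 && n > 0);
    return static_cast<std::size_t>(key) % n;
}

// Moves every node into a fresh array of n buckets.
// No node is allocated, copied, or freed; only pointers change.
inline void rehash(std::vector<Node*>& buckets, std::size_t n) {
    std::vector<Node*> tmp(n, nullptr);
    for (std::size_t b = 0; b < buckets.size(); ++b) {
        while (Node* first = buckets[b]) {
            buckets[b] = first->next;              // unlink from the old chain
            const std::size_t nb = bucket_of(first->val, n);
            first->next = tmp[nb];                 // push onto the new chain
            tmp[nb] = first;
        }
    }
    buckets.swap(tmp);                             // old array dies with tmp
}
```

Because nodes are pushed onto the front of their new chains, the relative order of nodes that stay together is reversed by a rehash. Keys 2 and 55 share bucket 2 in a 53-bucket table and separate into buckets 2 and 55 once the table has 97 buckets.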

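Note: the table does not grow by plain doubling. Its size is always a prime taken from the table below, where each prime is roughly twice the previous one.

/* Prime table */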
// Note: assumes long is at least 32 bits.
static const int __stl_num_primes = 28;
static const unsigned long __stl_prime_list[__stl_num_primes] =
{
	53, 97, 193, 389, 769,
	1543, 3079, 6151, 12289, 24593,
	49157, 98317, 196613, 393241, 786433,
	1572869, 3145739, 6291469, 12582917, 25165843,
	50331653, 100663319, 201326611, 402653189, 805306457,
	1610612741, 3221225473, 4294967291
};
 
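/* Finds, among the 28 primes above, the smallest one that is not less than n;
   if there is none, returns the largest. */
inline unsigned long __stl_next_prime(unsigned long n)
{
	const unsigned long* first = __stl_prime_list; // first element
	const unsigned long* last = __stl_prime_list + __stl_num_primes; // one past the last element
	/* Generic algorithm: returns an iterator to the first element not less than n */
	const unsigned long* pos = lower_bound(first, last, n);
	return pos == last ? *(last - 1) : *pos; // if no entry is large enough, take the largest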
}

size_type next_size (size_type n) const {return __stl_next_prime(n);}
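
For experimenting with the lookup outside the library, here is a self-contained sketch of the same search. The names kNumPrimes, kPrimeList, and next_prime are ours; the values are the ones listed above.

```cpp
#include <algorithm>
#include <cstddef>

static const int kNumPrimes = 28;
static const unsigned long kPrimeList[kNumPrimes] = {
    53ul, 97ul, 193ul, 389ul, 769ul,
    1543ul, 3079ul, 6151ul, 12289ul, 24593ul,
    49157ul, 98317ul, 196613ul, 393241ul, 786433ul,
    1572869ul, 3145739ul, 6291469ul, 12582917ul, 25165843ul,
    50331653ul, 100663319ul, 201326611ul, 402653189ul, 805306457ul,
    1610612741ul, 3221225473ul, 4294967291ul
};

// Returns the smallest listed prime that is not less than n,
// or the largest listed prime if n exceeds all of them.
inline unsigned long next_prime(unsigned long n) {
    const unsigned long* first = kPrimeList;
    const unsigned long* last = kPrimeList + kNumPrimes;
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}
```

Since lower_bound looks for "not less than", asking for 53 returns 53 itself, while asking for 54 (the 54th element arriving in a 53-bucket table) returns 97.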

Insertion without duplicates, no rebuild needed

/* Insert an element; duplicates are not allowed */
template <class V, class K, class HF, class Ex, class Eq, class A>
pair<typename hashtable<V, K, HF, Ex, Eq, A>::iterator, bool>
hashtable<V, K, HF, Ex, Eq, A>::insert_unique_noresize(const value_type& obj)
{
	const size_type n = bkt_num(obj); // locate the bucket
	node* first = buckets[n];

	/* Check whether an element with the same key is already present */
	for (node* cur = first; cur; cur = cur->next)
		if (equals(get_key(cur->val), get_key(obj)))
			return pair<iterator, bool>(iterator(cur, this), false);

	node* tmp = new_node(obj); // create the new node via node_allocator::allocate()
	/* Link the new node at the front of the chain */
	tmp->next = first;
	buckets[n] = tmp;
	++num_elements; // one more element
	return pair<iterator, bool>(iterator(tmp, this), true);
}
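
The same logic can be exercised on a stripped-down table of ints. This is a sketch under stated assumptions: IntTable and bucket_head are hypothetical, the hash is the identity, and a raw node pointer is returned where the real code returns an iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Node {
    Node* next;
    int val;
};

// Hypothetical fixed-size table of ints; it never rebuilds itself.
class IntTable {
public:
    explicit IntTable(std::size_t n) : buckets_(n, nullptr), num_elements_(0) {
        assert(n > 0);
    }
    ~IntTable() {
        for (Node* head : buckets_) {
            while (head) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }
    IntTable(const IntTable&) = delete;
    IntTable& operator=(const IntTable&) = delete;

    // Returns {node, true} after inserting v, or {existing node, false}
    // if v is already present.
    std::pair<Node*, bool> insert_unique_noresize(int v) {
        const std::size_t n = bucket_of(v);
        Node* first = buckets_[n];
        for (Node* cur = first; cur; cur = cur->next)
            if (cur->val == v) return std::make_pair(cur, false);
        Node* tmp = new Node{first, v};   // new node goes to the front
        buckets_[n] = tmp;
        ++num_elements_;
        return std::make_pair(tmp, true);
    }

    std::size_t size() const { return num_elements_; }
    const Node* bucket_head(std::size_t n) const { return buckets_[n]; }

private:
    std::size_t bucket_of(int v) const {
        return static_cast<std::size_t>(static_cast<unsigned int>(v)) % buckets_.size();
    }
    std::vector<Node*> buckets_;
    std::size_t num_elements_;
};
```

Inserting 2, then 55, then 2 again into a 53-bucket table leaves two elements, with 55 ahead of 2 in bucket 2, because each successful insertion goes to the front of the chain.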

Insertion with duplicates allowed

iterator insert_equal(const value_type& obj)
{
    resize(num_elements + 1); // decides whether the table must be rebuilt, and rebuilds it if so
    return insert_equal_noresize(obj);
}

Insertion with duplicates allowed, no rebuild needed

/* Insert an element; duplicates are allowed */
template <class V, class K, class HF, class Ex, class Eq, class A>
typename hashtable<V, K, HF, Ex, Eq, A>::iterator
hashtable<V, K, HF, Ex, Eq, A>::insert_equal_noresize(const value_type& obj)
{
	const size_type n = bkt_num(obj); // locate the bucket
	node* first = buckets[n]; // head of the chain

	for (node* cur = first; cur; cur = cur->next)
		if (equals(get_key(cur->val), get_key(obj))) { // obj has the same key as cur->val
			node* tmp = new_node(obj);
			tmp->next = cur->next; // link the new node right after the matching one
			cur->next = tmp;
			++num_elements;
			return iterator(tmp, this);
		}
	// No equal key found: proceed as in insert_unique_noresize()
	node* tmp = new_node(obj);
	tmp->next = first;
	buckets[n] = tmp;
	++num_elements;
	return iterator(tmp, this);
}
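
A matching sketch for the duplicate-friendly variant, again on a hypothetical table of ints (IntMultiTable and bucket_head are illustrative names, the hash is the identity, and a node pointer replaces the iterator):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    Node* next;
    int val;
};

// Hypothetical fixed-size table of ints that accepts duplicates.
class IntMultiTable {
public:
    explicit IntMultiTable(std::size_t n) : buckets_(n, nullptr), num_elements_(0) {
        assert(n > 0);
    }
    ~IntMultiTable() {
        for (Node* head : buckets_) {
            while (head) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }
    IntMultiTable(const IntMultiTable&) = delete;
    IntMultiTable& operator=(const IntMultiTable&) = delete;

    // Always inserts v and returns its node. An equal value already in
    // the chain gets the new node linked directly behind it.
    Node* insert_equal_noresize(int v) {
        const std::size_t n = bucket_of(v);
        Node* first = buckets_[n];
        for (Node* cur = first; cur; cur = cur->next) {
            if (cur->val == v) {
                Node* tmp = new Node{cur->next, v};
                cur->next = tmp;
                ++num_elements_;
                return tmp;
            }
        }
        Node* tmp = new Node{first, v};   // no match: front of the chain
        buckets_[n] = tmp;
        ++num_elements_;
        return tmp;
    }

    std::size_t size() const { return num_elements_; }
    const Node* bucket_head(std::size_t n) const { return buckets_[n]; }

private:
    std::size_t bucket_of(int v) const {
        return static_cast<std::size_t>(static_cast<unsigned int>(v)) % buckets_.size();
    }
    std::vector<Node*> buckets_;
    std::size_t num_elements_;
};
```

Linking a duplicate right behind its match keeps equal keys contiguous within a chain, so all elements with one key can be visited in a single uninterrupted run.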

