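HashSet is a collection of elements that implements the Set interface. It offers fast operations and does not store duplicate elements. Its inheritance hierarchy is shown in Figure 1.
Figure 1. HashSet inheritance hierarchy
HashSet implements the Set interface. Other members of the Set family include the SortedSet interface, which extends Set, and the TreeSet class, which implements it. They all share one basic, standard property: they contain no duplicate elements. As the name suggests, HashSet relies on hashing both to eliminate duplicates and to make its operations on elements fast.
How does HashSet achieve this? Its member variables and constructors give the first clues.
Constructors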
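To make the "no duplicates" property concrete, here is a minimal sketch using the standard java.util.HashSet. The class name DedupDemo and the helper countDistinct are hypothetical names chosen for this illustration. The two "a" strings are deliberately created as separate objects to show that equality is decided by equals() and hashCode(), not by object identity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class DedupDemo {
    // Returns how many distinct elements the list contains.
    static int countDistinct(List<String> items) {
        Set<String> unique = new HashSet<>(items); // equal elements collapse into one
        return unique.size();
    }

    public static void main(String[] args) {
        // new String("a") creates two distinct objects that are equal by equals()/hashCode().
        List<String> items = Arrays.asList(new String("a"), new String("a"), "b");
        System.out.println(countDistinct(items)); // prints 2
    }
}
```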
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
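Look first at these two member variables. HashSet holds a HashMap, which already hints at a relationship between the two classes. The second variable, PRESENT, supports that guess: as its comment says, it is a dummy value meant to be stored in the HashMap.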
/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}

/**
 * Constructs a new set containing the elements in the specified
 * collection. The <tt>HashMap</tt> is created with default load factor
 * (0.75) and an initial capacity sufficient to contain the elements in
 * the specified collection.
 *
 * @param c the collection whose elements are to be placed into this set
 * @throws NullPointerException if the specified collection is null
 */
public HashSet(Collection<? extends E> c) {
    map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and the specified load factor.
 *
 * @param initialCapacity the initial capacity of the hash map
 * @param loadFactor the load factor of the hash map
 * @throws IllegalArgumentException if the initial capacity is less
 *         than zero, or if the load factor is nonpositive
 */
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * the specified initial capacity and default load factor (0.75).
 *
 * @param initialCapacity the initial capacity of the hash table
 * @throws IllegalArgumentException if the initial capacity is less
 *         than zero
 */
public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}
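These HashSet constructors should look familiar: they closely mirror the HashMap constructors, and in fact each one simply constructs a HashMap. That is because HashSet is implemented on top of HashMap, which becomes clearer when we look at its key methods.
Key methods
On construction, HashSet first initializes a HashMap and uses it as the container for its data. As introduced earlier, HashMap is a key-value container whose entries are organized as Node<K,V> objects. HashMap computes the hash from the key and stores the associated data in the value. HashSet stores only single elements, so the keys of the HashMap are all it needs.
The key methods of HashSet are listed below. As you can see, every operation works on the keys. The PRESENT variable defined earlier exists to fill the value part of each Node in the HashMap.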
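The collection constructor is the only one that computes a capacity itself, so a short worked example may help. The sketch below copies the expression `Math.max((int) (c.size()/.75f) + 1, 16)` from the listing above into a hypothetical helper, initialCapacityFor, in a hypothetical class CapacityDemo. Dividing by the 0.75 load factor and adding 1 requests enough room that the copied elements should not push the map past its resize threshold. As far as I know, HashMap then rounds the requested capacity up to a power of two internally, so the value computed here is a request rather than the final table size.

```java
// Hypothetical demo class, not part of the JDK.
class CapacityDemo {
    // Same expression as in HashSet(Collection) above, with c.size() passed in as size.
    static int initialCapacityFor(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(3));   // prints 16  (4 + 1 = 5, raised to the minimum of 16)
        System.out.println(initialCapacityFor(12));  // prints 17  (16 + 1)
        System.out.println(initialCapacityFor(100)); // prints 134 (133 + 1)
    }
}
```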
/**
 * Returns an iterator over the elements in this set. The elements
 * are returned in no particular order.
 *
 * @return an Iterator over the elements in this set
 * @see ConcurrentModificationException
 */
public Iterator<E> iterator() {
    return map.keySet().iterator();
}

/**
 * Returns the number of elements in this set (its cardinality).
 *
 * @return the number of elements in this set (its cardinality)
 */
public int size() {
    return map.size();
}

/**
 * Returns <tt>true</tt> if this set contains no elements.
 *
 * @return <tt>true</tt> if this set contains no elements
 */
public boolean isEmpty() {
    return map.isEmpty();
}

/**
 * Returns <tt>true</tt> if this set contains the specified element.
 * More formally, returns <tt>true</tt> if and only if this set
 * contains an element <tt>e</tt> such that
 * <tt>(o==null ? e==null : o.equals(e))</tt>.
 *
 * @param o element whose presence in this set is to be tested
 * @return <tt>true</tt> if this set contains the specified element
 */
public boolean contains(Object o) {
    return map.containsKey(o);
}

/**
 * Adds the specified element to this set if it is not already present.
 * More formally, adds the specified element <tt>e</tt> to this set if
 * this set contains no element <tt>e2</tt> such that
 * <tt>(e==null ? e2==null : e.equals(e2))</tt>.
 * If this set already contains the element, the call leaves the set
 * unchanged and returns <tt>false</tt>.
 *
 * @param e element to be added to this set
 * @return <tt>true</tt> if this set did not already contain the specified
 *         element
 */
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

/**
 * Removes the specified element from this set if it is present.
 * More formally, removes an element <tt>e</tt> such that
 * <tt>(o==null ? e==null : o.equals(e))</tt>,
 * if this set contains such an element. Returns <tt>true</tt> if
 * this set contained the element (or equivalently, if this set
 * changed as a result of the call). (This set will not contain the
 * element once the call returns.)
 *
 * @param o object to be removed from this set, if present
 * @return <tt>true</tt> if the set contained the specified element
 */
public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

/**
 * Removes all of the elements from this set.
 * The set will be empty after this call returns.
 */
public void clear() {
    map.clear();
}
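HashSet does not need to reimplement these methods; it borrows them directly from HashMap. The outer HashSet is essentially a thin wrapper that delegates every operation to the HashMap instance it creates internally. This makes the HashSet data structure simpler and more efficient to implement, and it avoids duplicating nearly identical hashing logic in a second class.
Summary
HashSet is a data structure that supports fast operations: when elements hash evenly, add, remove, and contains run in constant time on average, which makes it the usual choice for storing distinct elements. HashSet itself is not thread-safe, and unsynchronized use from multiple threads can lead to concurrent-access problems. The JDK does not ship a ConcurrentHashSet class; to get a thread-safe Set, one option is to build a set on top of ConcurrentHashMap, for example with ConcurrentHashMap.newKeySet() or Collections.newSetFromMap(new ConcurrentHashMap<>()).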
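The boolean results of add and remove in the listing above follow from what HashMap.put and HashMap.remove return: the previous value for the key, or null if the key was absent. The sketch below replays that logic on a plain HashMap. The class PutReturnDemo and the method addAndRemoveTwice are hypothetical names for this illustration; PRESENT plays the same role as the field in HashSet.

```java
import java.util.Arrays;
import java.util.HashMap;

// Hypothetical demo class, not part of the JDK.
class PutReturnDemo {
    // Shared dummy value, mirroring the PRESENT field in HashSet.
    private static final Object PRESENT = new Object();

    // Returns {firstAdd, secondAdd, firstRemove, secondRemove} for a single key.
    static boolean[] addAndRemoveTwice(String key) {
        HashMap<String, Object> map = new HashMap<>();
        boolean firstAdd = map.put(key, PRESENT) == null;   // key absent: put returns null
        boolean secondAdd = map.put(key, PRESENT) == null;  // key present: put returns PRESENT
        boolean firstRemove = map.remove(key) == PRESENT;   // key present: remove returns PRESENT
        boolean secondRemove = map.remove(key) == PRESENT;  // key absent: remove returns null
        return new boolean[] {firstAdd, secondAdd, firstRemove, secondRemove};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(addAndRemoveTwice("a")));
        // prints [true, false, true, false]
    }
}
```

Because every key maps to the same PRESENT object, the set spends no extra memory on per-element values; only the reference inside each Node is stored.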
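As a minimal sketch of the thread-safe alternative, the example below uses ConcurrentHashMap.newKeySet(), which is available since Java 8 and returns a Set backed by a ConcurrentHashMap. The class ConcurrentSetDemo, the method fillFromThreads, and the thread counts are assumptions made for this illustration only.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of the JDK.
class ConcurrentSetDemo {
    // Starts `threads` workers that each add `perThread` distinct integers
    // to one shared set, then returns the final size of the set.
    static int fillFromThreads(int threads, int perThread) {
        Set<Integer> set = ConcurrentHashMap.newKeySet(); // thread-safe set view
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // disjoint value range per worker
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromThreads(4, 1000)); // prints 4000
    }
}
```

Running the same workload against a plain HashSet without synchronization gives no such guarantee: elements may be lost or the internal table may be corrupted.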