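7-1 Set basics and a set implementation based on the binary search tree
- A set cannot contain duplicate elements
- Recall that the binary search tree from the previous section cannot store duplicate elements either, which makes it a very good underlying data structure for implementing a set
- The Set interface: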
public interface Set<E> {
void add(E e);
void remove(E e);
boolean contains(E e);
int getSize();
boolean isEmpty();
}
- Typical applications: customer statistics (counting distinct customers) and counting the vocabulary of a text
public class BSTSet<E extends Comparable<E>> implements Set<E> {
private BST<E> bst;
public BSTSet(){
bst = new BST<>();
}
@Override
public int getSize(){
return bst.size();
}
@Override
public boolean isEmpty(){
return bst.isEmpty();
}
@Override
public void add(E e){
bst.add(e);
}
@Override
public boolean contains(E e){
return bst.contains(e);
}
@Override
public void remove(E e){
bst.remove(e);
}
}
As we can see, with almost no changes the binary search tree implemented in the previous chapter supports the Set interface.
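The BST class that BSTSet wraps is not reproduced in these notes. As a rough illustration of why a binary search tree gives set behaviour for free, here is a minimal sketch; TinyBstSet and its members are hypothetical names chosen for this example, not the course's BST class, and remove is left out to keep it short.

```java
// Minimal sketch of a set backed by an unbalanced binary search tree.
// TinyBstSet is a hypothetical name, not the BST class from the notes.
class TinyBstSet<E extends Comparable<E>> {
    private class Node {
        E e;
        Node left, right;
        Node(E e) { this.e = e; }
    }

    private Node root;
    private int size;

    public int getSize() { return size; }

    // Insert e; an element equal to an existing one is silently ignored.
    public void add(E e) { root = add(root, e); }

    private Node add(Node node, E e) {
        if (node == null) {
            size++;
            return new Node(e);
        }
        int cmp = e.compareTo(node.e);
        if (cmp < 0) node.left = add(node.left, e);
        else if (cmp > 0) node.right = add(node.right, e);
        // cmp == 0: duplicate, nothing to do
        return node;
    }

    // Walk down from the root, going left or right by comparison.
    public boolean contains(E e) {
        Node cur = root;
        while (cur != null) {
            int cmp = e.compareTo(cur.e);
            if (cmp == 0) return true;
            cur = cmp < 0 ? cur.left : cur.right;
        }
        return false;
    }

    public static void main(String[] args) {
        TinyBstSet<String> set = new TinyBstSet<>();
        for (String w : new String[]{"pride", "and", "prejudice", "and", "pride"})
            set.add(w);
        System.out.println(set.getSize());        // 3
        System.out.println(set.contains("and"));  // true
    }
}
```

Because the equal-key branch does nothing, the duplicate check costs no extra pass over the data.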
Next, we test the BSTSet data structure:
import java.util.ArrayList;
public class Main {
public static void main(String[] args) {
System.out.println("pride and prejudice");
ArrayList<String> words1 = new ArrayList<>();
FileOperation.readFile("pride-and-prejudice.txt",words1);
System.out.println("Total words:"+words1.size());
BSTSet<String> set1 = new BSTSet<>();
for (String word : words1)
set1.add(word);
System.out.println("Total different words:"+set1.getSize());
}
}
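The FileOperation helper used in the test is not shown in these notes. The sketch below is only a guess at what such a helper could look like, assuming it lower-cases the text and treats every run of ASCII letters as one word; the class name FileOperationSketch and that tokenisation rule are assumptions, not the course's actual code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Locale;

// Hypothetical stand-in for the FileOperation helper used in the notes.
class FileOperationSketch {
    // Reads filename, lower-cases the text, and appends every run of
    // ASCII letters to words. Returns false if the file cannot be read.
    public static boolean readFile(String filename, ArrayList<String> words) {
        if (filename == null || words == null)
            return false;
        String text;
        try {
            text = new String(Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            return false;
        }
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+"))
            if (!token.isEmpty())   // split may yield a leading empty token
                words.add(token);
        return true;
    }
}
```

Returning a boolean instead of throwing lets callers guard the whole test with a single if, which is how the later test programs in these notes call it.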
7-2 A set implementation based on the linked list
Note that a linked-list-based set does not require its elements to be comparable.
import java.util.ArrayList;
public class LinkedListSet<E> implements Set<E> {
private LinkedList<E> list;
public LinkedListSet(){
list = new LinkedList<>();
}
@Override
public int getSize(){
return list.getSize();
}
@Override
public boolean isEmpty(){
return list.isEmpty();
}
@Override
public void add(E e){
if (!list.contains(e))
list.addFirst(e);
}
@Override
public boolean contains(E e){
return list.contains(e);
}
@Override
public void remove(E e){
list.removeElement(e);
}
public static void main(String[] args) {
System.out.println("Pride and Prejudice");
ArrayList<String> words1 = new ArrayList<>();
if (FileOperation.readFile("pride-and-prejudice.txt",words1)){
System.out.println("Total words :"+words1.size());
LinkedListSet<String> set1 = new LinkedListSet<>();
for (String word:words1)
set1.add(word);
System.out.println("Total different words:"+set1.getSize());
}
}
}
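To illustrate the point that a linked-list set needs only equality, not ordering, here is a small sketch with an element type that does not implement Comparable. java.util.LinkedList stands in for the course's own LinkedList class, and Point and EqualsOnlySetDemo are hypothetical names for this example.

```java
import java.util.LinkedList;
import java.util.Objects;

class EqualsOnlySetDemo {
    // A type with equals/hashCode but no natural ordering (not Comparable).
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    // Same add logic as LinkedListSet.add: scan first, insert only if absent.
    static int countDistinct(Point[] points) {
        LinkedList<Point> list = new LinkedList<>();
        for (Point p : points)
            if (!list.contains(p))   // linear scan using equals()
                list.addFirst(p);
        return list.size();
    }

    public static void main(String[] args) {
        Point[] pts = { new Point(1, 2), new Point(3, 4), new Point(1, 2) };
        System.out.println(countDistinct(pts)); // 2
    }
}
```

A tree-based set could not hold Point as written, because it has no compareTo to decide left from right.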
7-3 Complexity analysis of the set classes
Compare the performance of the two implementations:
import java.util.ArrayList;
public class Main {
private static double testSet(Set<String> set,String filename){
long startTime = System.nanoTime();
System.out.println(filename);
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile(filename, words)) {
System.out.println("Total words: " + words.size());
for (String word : words)
set.add(word);
System.out.println("Total different words: " + set.getSize());
}
long endTime = System.nanoTime();
return (endTime - startTime)/ 1000000000.0;
}
public static void main(String[] args) {
String filename = "pride-and-prejudice.txt";
BSTSet<String> bstSet = new BSTSet<>();
double time1 = testSet(bstSet,filename);
System.out.println("BST set:"+time1+" s");
System.out.println();
LinkedListSet<String> linkedListSet = new LinkedListSet<>();
double time2 = testSet(linkedListSet, filename);
System.out.println("Linked List Set: " + time2 + " s");
}
}
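The binary search tree set (BSTSet) is clearly faster than the linked list set (LinkedListSet): add, contains and remove cost O(h) in a tree of height h, which is O(log n) on average, while each of them costs O(n) in a linked list.
The worst case for a binary search tree occurs when the elements are inserted in sorted order, ascending or descending (the case the accompanying figure illustrates). The tree then degenerates into a linked list and every operation is O(n). The remedy is to build a balanced binary tree.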
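The effect of insertion order on tree height can be checked with a small experiment. The sketch below uses a stripped-down integer BST (BstHeightDemo, a hypothetical class, not the course's BST) and compares inserting 1..127 in ascending order with inserting the middle key of each range first.

```java
class BstHeightDemo {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Iterative insert; duplicates are ignored.
    void add(int key) {
        if (root == null) { root = new Node(key); return; }
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return; }
                cur = cur.left;
            } else if (key > cur.key) {
                if (cur.right == null) { cur.right = new Node(key); return; }
                cur = cur.right;
            } else {
                return;
            }
        }
    }

    // Number of nodes on the longest root-to-leaf path.
    int height() { return height(root); }

    private static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    // Insert lo..hi with the middle key first, so both halves stay equal in size.
    private void addMiddleFirst(int lo, int hi) {
        if (lo > hi) return;
        int mid = lo + (hi - lo) / 2;
        add(mid);
        addMiddleFirst(lo, mid - 1);
        addMiddleFirst(mid + 1, hi);
    }

    static int heightSortedInsert(int n) {
        BstHeightDemo t = new BstHeightDemo();
        for (int i = 1; i <= n; i++) t.add(i);
        return t.height();
    }

    static int heightMiddleFirstInsert(int n) {
        BstHeightDemo t = new BstHeightDemo();
        t.addMiddleFirst(1, n);
        return t.height();
    }

    public static void main(String[] args) {
        System.out.println(heightSortedInsert(127));      // 127: a chain
        System.out.println(heightMiddleFirstInsert(127)); // 7: a perfect tree
    }
}
```

The same 127 keys give a height of 127 or 7 depending only on insertion order, which is the gap between O(n) and O(log n) per operation.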
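7-4 Set problems on Leetcode and more set-related topics
- Leetcode 804. Unique Morse Code Words
- https://leetcode.com/problems/unique-morse-code-words/description/
We first solve it with TreeSet from the Java standard library. Java's TreeSet is built on a red-black tree, which is a balanced tree, so the worst case described in the previous section cannot occur.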
import java.util.TreeSet;
public class Solution {
public int uniqueMorseRepresentations(String[] words){
String[] codes = {".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--.."};
TreeSet<String> set = new TreeSet<>();
for (String word:words){
StringBuilder res = new StringBuilder();
for (int i=0;i<word.length();i++)
res.append(codes[word.charAt(i)-'a']);
set.add(res.toString());
}
return set.size();
}
}
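Of course, we can also use our own set implementations instead: declare the set as BSTSet<String> set = new BSTSet<>(); or as LinkedListSet<String> set = new LinkedListSet<>(); and return set.getSize() in place of set.size().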
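As a quick check of the encoding logic, the sketch below runs it on four sample words. It swaps in java.util.HashSet, since only the number of distinct encodings matters and ordering is not needed; MorseCheck is a hypothetical class name.

```java
import java.util.HashSet;
import java.util.Set;

class MorseCheck {
    // Morse code for 'a'..'z', same table as in the solution above.
    static final String[] CODES = {
        ".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....", "..", ".---",
        "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.", "...", "-",
        "..-", "...-", ".--", "-..-", "-.--", "--.."
    };

    // Counts distinct Morse encodings; words must be lowercase a-z.
    static int uniqueMorseRepresentations(String[] words) {
        Set<String> seen = new HashSet<>();
        for (String word : words) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < word.length(); i++)
                sb.append(CODES[word.charAt(i) - 'a']);
            seen.add(sb.toString());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // "gin" and "zen" both encode to --...-. ; "gig" and "msg" to --...--.
        String[] words = {"gin", "zen", "gig", "msg"};
        System.out.println(uniqueMorseRepresentations(words)); // 2
    }
}
```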
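Ordered and unordered sets
- The elements of an ordered set are kept in order <-- implemented on a search tree
- The elements of an unordered set have no order <-- implemented on a hash table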
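The difference shows up when iterating. A minimal sketch with the standard library, where java.util.TreeSet plays the ordered set and java.util.HashSet the unordered one (OrderedVsUnorderedDemo is a hypothetical class name):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class OrderedVsUnorderedDemo {
    // A TreeSet iterates in ascending order, so copying it yields a sorted list.
    static List<String> sortedDistinct(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pride", "and", "prejudice", "and");
        System.out.println(sortedDistinct(words));       // [and, prejudice, pride]
        // A HashSet removes duplicates too, but promises no iteration order.
        System.out.println(new HashSet<>(words).size()); // 3
    }
}
```

Choosing the hash-based set gives up ordered queries such as smallest element or next larger element.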
Multisets
- A multiset may contain duplicate elements
7-5 Map basics
public interface Map<K,V> {
void add(K key,V value);
V remove(K key);
void set(K key,V newValue);
boolean contains(K key);
V get(K key);
int getSize();
boolean isEmpty();
}
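7-6 A map implementation based on the linked list
First we implement the map class with a linked list as the underlying data structure.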
public class LinkedListMap<K,V> implements Map<K,V> {
private class Node {
public K key;
public V value;
public Node next;
public Node(K key, V value, Node next) {
this.key = key;
this.value = value;
this.next = next;
}
public Node(K key, V value) {
this(key, value, null);
}
private Node() {
this(null, null, null);
}
@Override
public String toString() {
return key.toString() + " : " + value.toString();
}
}
private Node dummyHead;
private int size;
public LinkedListMap(){
dummyHead = new Node();
size = 0;
}
@Override
public int getSize(){
return size;
}
@Override
public boolean isEmpty(){
return size == 0;
}
// Helper: return the node that holds key, or null if the key is absent
private Node getNode(K key){
Node cur = dummyHead.next;
while (cur!=null){
if (cur.key.equals(key))
return cur;
cur = cur.next;
}
return null;
}
@Override
public void add(K key,V value){
Node node = getNode(key);
if (node == null){
dummyHead.next = new Node(key,value,dummyHead.next);
size++;
}else
node.value = value;
}
@Override
public V remove(K key){
Node prev = dummyHead;
while(prev.next!=null){
if (prev.next.key.equals(key))
break;
prev = prev.next;
}
if (prev.next!= null){
Node delNode =prev.next;
prev.next = delNode.next;
delNode.next = null;
size--; // keep the element count in sync with the list
return delNode.value;
}
return null;
}
@Override
public void set(K key,V newValue){
Node node = getNode(key);
if (node == null)
throw new IllegalArgumentException(key + " doesn't exist!");
node.value = newValue;
}
@Override
public V get(K key){
Node node = getNode(key);
return node == null ? null : node.value; // null for an absent key, as in BSTMap.get
}
@Override
public boolean contains(K key){
return getNode(key)!=null;
}
}
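Next we test it. With the set we counted how many distinct words the text contains; with a map we can count how many times each word occurs, that is, the word frequency.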
public static void main(String[] args){
System.out.println("Pride and Prejudice");
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile("pride-and-prejudice.txt", words)) {
System.out.println("Total words: " + words.size());
LinkedListMap<String, Integer> map = new LinkedListMap<>();
for (String word : words) {
if (map.contains(word))
map.set(word, map.get(word) + 1);
else
map.add(word, 1);
}
System.out.println("Total different words: " + map.getSize());
System.out.println("Frequency of PRIDE: " + map.get("pride"));
System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
}
System.out.println();
}
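The contains / set / add pattern in the loop above is the usual frequency-count idiom. For comparison only, here is a sketch of the same count with the standard library, where java.util.Map.merge folds the three calls into one; WordFrequencyDemo is a hypothetical class name, and TreeMap is used so the output is sorted by word.

```java
import java.util.Map;
import java.util.TreeMap;

class WordFrequencyDemo {
    // Maps each word to the number of times it occurs in words.
    static Map<String, Integer> frequencies(String[] words) {
        Map<String, Integer> freq = new TreeMap<>();
        for (String w : words)
            freq.merge(w, 1, Integer::sum); // insert 1, or add 1 to the old count
        return freq;
    }

    public static void main(String[] args) {
        String[] words = {"pride", "and", "prejudice", "and", "pride"};
        System.out.println(frequencies(words)); // {and=2, prejudice=1, pride=2}
    }
}
```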
7-7 A map implementation based on the binary search tree
import java.util.ArrayList;
public class BSTMap <K extends Comparable<K>,V> implements Map<K,V> {
private class Node{
public K key;
public V value;
private Node left,right;
public Node(K key,V value){
this.key = key;
this.value = value;
left = null;
right = null;
}
}
private Node root;
private int size;
public BSTMap(){
root = null;
size = 0;
}
@Override
public int getSize(){
return size;
}
@Override
public boolean isEmpty(){
return size == 0;
}
// Add a new entry (key, value) to the binary search tree
@Override
public void add(K key,V value){
root = add(root,key,value);
}
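// Insert (key, value) into the binary search tree rooted at node (recursive); an existing key has its value overwritten
// Return the root of the tree after the insertion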
private Node add(Node node,K key,V value){
if (node == null){
size++;
return new Node(key,value);
}
if (key.compareTo(node.key)<0)
node.left = add(node.left,key,value);
else if (key.compareTo(node.key)>0)
node.right = add(node.right,key,value);
else
node.value = value;
return node;
}
// Helper
// Return the node holding key in the binary search tree rooted at node, or null if absent
private Node getNode(Node node,K key){
if (node == null)
return null;
if (key.compareTo(node.key) == 0) // use compareTo throughout, consistent with add and remove
return node;
else if (key.compareTo(node.key)<0)
return getNode(node.left,key);
else
return getNode(node.right,key);
}
@Override
public V get(K key){
Node node = getNode(root,key);
return node == null? null:node.value;
}
@Override
public boolean contains(K key){
return getNode(root,key)!=null;
}
@Override
public void set(K key,V newValue){
Node node = getNode(root,key);
if (node == null)
throw new IllegalArgumentException(key + " doesn't exist!");
node.value = newValue;
}
// Return the node holding the minimum key in the binary search tree rooted at node
private Node minimum(Node node){
if(node.left == null)
return node;
return minimum(node.left);
}
// Remove the minimum node from the binary search tree rooted at node
// Return the root of the tree after the removal
private Node removeMin(Node node){
if(node.left == null){
Node rightNode = node.right;
node.right = null;
size --;
return rightNode;
}
node.left = removeMin(node.left);
return node;
}
// Remove the node whose key is key from the binary search tree and return its value (null if absent)
@Override
public V remove(K key){
Node node = getNode(root,key);
if(node != null){
root = remove(root, key);
return node.value;
}
return null;
}
private Node remove(Node node, K key){
if( node == null )
return null;
if( key.compareTo(node.key) < 0 ){
node.left = remove(node.left , key);
return node;
}
else if(key.compareTo(node.key) > 0 ){
node.right = remove(node.right, key);
return node;
}
else{ // key.compareTo(node.key) == 0
// Case: the node to remove has no left subtree
if(node.left == null){
Node rightNode = node.right;
node.right = null;
size --;
return rightNode;
}
// Case: the node to remove has no right subtree
if(node.right == null){
Node leftNode = node.left;
node.left = null;
size --;
return leftNode;
}
// Case: the node to remove has both subtrees
// Find the smallest node larger than it, i.e. the minimum of its right subtree,
// and put that node in its place
Node successor = minimum(node.right);
successor.right = removeMin(node.right);
successor.left = node.left;
node.left = node.right = null;
return successor;
}
}
public static void main(String[] args){
System.out.println("Pride and Prejudice");
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile("pride-and-prejudice.txt", words)) {
System.out.println("Total words: " + words.size());
BSTMap<String, Integer> map = new BSTMap<>();
for (String word : words) {
if (map.contains(word))
map.set(word, map.get(word) + 1);
else
map.add(word, 1);
}
System.out.println("Total different words: " + map.getSize());
System.out.println("Frequency of PRIDE: " + map.get("pride"));
System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
}
System.out.println();
}
}
7-8 Complexity analysis of maps and more map-related topics
Test the performance difference between the two map implementations:
import java.util.ArrayList;
public class testMap {
private static double testMap(Map<String, Integer> map, String filename){
long startTime = System.nanoTime();
System.out.println(filename);
ArrayList<String> words = new ArrayList<>();
if(FileOperation.readFile(filename, words)) {
System.out.println("Total words: " + words.size());
for (String word : words){
if(map.contains(word))
map.set(word, map.get(word) + 1);
else
map.add(word, 1);
}
System.out.println("Total different words: " + map.getSize());
System.out.println("Frequency of PRIDE: " + map.get("pride"));
System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
}
long endTime = System.nanoTime();
return (endTime - startTime) / 1000000000.0;
}
public static void main(String[] args) {
String filename = "pride-and-prejudice.txt";
BSTMap<String, Integer> bstMap = new BSTMap<>();
double time1 = testMap(bstMap, filename);
System.out.println("BST Map: " + time1 + " s");
System.out.println();
LinkedListMap<String, Integer> linkedListMap = new LinkedListMap<>();
double time2 = testMap(linkedListMap, filename);
System.out.println("Linked List Map: " + time2 + " s");
}
}
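As the timings show, the map implemented with a binary search tree performs better: its operations cost O(h) for a tree of height h (O(log n) on average), while every lookup in the linked list map is O(n).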
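Ordered and unordered maps
- The keys of an ordered map are kept in order <-- implemented on a search tree
- The keys of an unordered map have no order <-- implemented on a hash table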
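Keeping keys in order is what makes queries such as smallest key or nearest key possible. A short sketch with java.util.TreeMap, the standard library's ordered map; OrderedMapDemo and the sample counts are made up for illustration.

```java
import java.util.TreeMap;

class OrderedMapDemo {
    // Sample data; the counts are made up for illustration.
    static TreeMap<String, Integer> sample() {
        TreeMap<String, Integer> m = new TreeMap<>();
        m.put("pride", 3);
        m.put("and", 10);
        m.put("prejudice", 2);
        return m;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> m = sample();
        System.out.println(m.firstKey());       // and
        System.out.println(m.lastKey());        // pride
        System.out.println(m.ceilingKey("p"));  // prejudice (smallest key >= "p")
        System.out.println(m.headMap("pride")); // {and=10, prejudice=2}
    }
}
```

A hash-based map such as java.util.HashMap offers none of these queries, because it does not track key order.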
Multimaps
- The keys of a multimap may repeat
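One common way to get multimap behaviour, sketched here under the assumption that an ordered map from each key to a list of values is acceptable, is to wrap java.util.TreeMap. MultiMapSketch and its method names are hypothetical and loosely follow the Map interface from these notes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical multimap: one key may be associated with several values.
class MultiMapSketch<K extends Comparable<K>, V> {
    private final TreeMap<K, List<V>> map = new TreeMap<>();
    private int size; // number of (key, value) pairs, not distinct keys

    // Adding the same key again appends to its list instead of overwriting.
    public void add(K key, V value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        size++;
    }

    // All values stored under key, in insertion order; empty if the key is absent.
    public List<V> get(K key) {
        List<V> values = map.get(key);
        if (values == null)
            return new ArrayList<>();
        return values;
    }

    public int getSize() { return size; }

    public static void main(String[] args) {
        MultiMapSketch<String, Integer> positions = new MultiMapSketch<>();
        positions.add("and", 1);
        positions.add("pride", 0);
        positions.add("and", 3);
        System.out.println(positions.get("and"));  // [1, 3]
        System.out.println(positions.getSize());   // 3
    }
}
```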
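7-9 More set and map problems on Leetcode
- Leetcode 349. Intersection of Two Arrays: each common element appears only once in the result, so a set is enough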
import java.util.ArrayList;
import java.util.TreeSet;
public class Solution349 {
public int[] intersection(int[] nums1,int[] nums2){
TreeSet<Integer> set = new TreeSet<>();
for (int num:nums1)
set.add(num);
ArrayList<Integer> list = new ArrayList<>();
for (int num:nums2){
if (set.contains(num)){
list.add(num);
set.remove(num);
}
}
int[] res = new int[list.size()];
for (int i=0;i<list.size();i++)
res[i]=list.get(i);
return res;
}
}
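An alternative sketch of the same intersection uses two hash sets and java.util.Set.retainAll, which keeps only the elements also present in the other set. IntersectionRetainAll is a hypothetical class name; the order of the returned elements is unspecified here because HashSet is unordered.

```java
import java.util.HashSet;
import java.util.Set;

class IntersectionRetainAll {
    // Distinct values that occur in both arrays, in no particular order.
    static int[] intersection(int[] nums1, int[] nums2) {
        Set<Integer> a = new HashSet<>();
        for (int n : nums1) a.add(n);
        Set<Integer> b = new HashSet<>();
        for (int n : nums2) b.add(n);

        a.retainAll(b); // a now holds only elements that are also in b

        int[] res = new int[a.size()];
        int i = 0;
        for (int n : a) res[i++] = n;
        return res;
    }

    public static void main(String[] args) {
        int[] res = intersection(new int[]{1, 2, 2, 1}, new int[]{2, 2});
        System.out.println(res.length + " element: " + res[0]); // 1 element: 2
    }
}
```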
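When repeated elements must be kept in the result (the intersect method below matches Leetcode 350, Intersection of Two Arrays II), there are two options:
- Use a self-implemented set that allows duplicate elements (a multiset)
- Since the duplicates have to be counted, design a map from each element to its frequency
An implementation of the second approach: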
import java.util.ArrayList;
import java.util.TreeMap;
public class Solution {
public int[] intersect(int[] nums1,int[] nums2){
TreeMap<Integer,Integer> map = new TreeMap<>();
for (int num:nums1){
if (!map.containsKey(num))
map.put(num,1);
else
map.put(num,map.get(num)+1);
}
ArrayList<Integer> res = new ArrayList<>();
for (int num:nums2){
if (map.containsKey(num)){
res.add(num);
map.put(num,map.get(num)-1);
if (map.get(num)==0)
map.remove(num);
}
}
int[] ret = new int[res.size()];
for (int i=0;i<res.size();i++)
ret[i] = res.get(i);
return ret;
}
}
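For comparison, the same result can be obtained without a map by sorting copies of both arrays and walking them with two indices. This is a different technique from the frequency-map solution above, shown only as a sketch; IntersectSorted is a hypothetical class name, and the result comes out in ascending order.

```java
import java.util.Arrays;

class IntersectSorted {
    // Intersection that keeps duplicates, computed on sorted copies of the inputs.
    static int[] intersect(int[] nums1, int[] nums2) {
        int[] a = nums1.clone();
        int[] b = nums2.clone();
        Arrays.sort(a);
        Arrays.sort(b);

        int[] buf = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;              // a[i] cannot match anything later in b
            } else if (a[i] > b[j]) {
                j++;              // b[j] cannot match anything later in a
            } else {
                buf[k++] = a[i];  // common element: consume one copy from each side
                i++;
                j++;
            }
        }
        return Arrays.copyOf(buf, k);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intersect(new int[]{1, 2, 2, 1}, new int[]{2, 2})));       // [2, 2]
        System.out.println(Arrays.toString(intersect(new int[]{4, 9, 5}, new int[]{9, 4, 9, 8, 4}))); // [4, 9]
    }
}
```

Sorting costs O(n log n + m log m) but needs no auxiliary map, which can matter when the inputs are already sorted.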