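The name "trie" comes from the English word "retrieval". A trie is a tree of characters (a prefix tree), and its core idea is to trade space for time.
Here is a simple example.
You are given 100,000 words, each at most 10 letters long. For each word, decide whether it has already appeared and, if it has, at which position it first appeared.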
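This problem can of course be solved with a hash table, but what I want to introduce here is the trie, which is more useful in some respects. For example, given a word, I may want to ask whether one of its prefixes has already appeared. That is awkward with a hash table but still simple with a trie.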
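To make the prefix question concrete, here is a minimal sketch of how a trie might answer it. The class name PrefixTrie and the methods insert and hasStoredPrefix are made up for this illustration, and the sketch assumes every word consists only of the lowercase letters a-z.

```java
/** Minimal trie sketch for lowercase a-z words; all names are illustrative. */
class PrefixTrie {
    private static final class Node {
        final Node[] next = new Node[26]; // one slot per letter a-z
        boolean isWord;                   // true if an inserted word ends here
    }

    private final Node root = new Node();

    /** Inserts a word made only of the letters a-z (assumption of this sketch). */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int index = word.charAt(i) - 'a';
            if (node.next[index] == null) {
                node.next[index] = new Node(); // create the path on demand
            }
            node = node.next[index];
        }
        node.isWord = true;
    }

    /** Returns true if some proper prefix of the word was inserted as a whole word. */
    boolean hasStoredPrefix(String word) {
        Node node = root;
        for (int i = 0; i < word.length() - 1; i++) {
            node = node.next[word.charAt(i) - 'a'];
            if (node == null) {
                return false; // nothing stored continues along this path
            }
            if (node.isWord) {
                return true;  // word.substring(0, i + 1) is a stored word
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrefixTrie trie = new PrefixTrie();
        trie.insert("ab");
        trie.insert("xyz");
        System.out.println(trie.hasStoredPrefix("abcd")); // true: "ab" was inserted
        System.out.println(trie.hasStoredPrefix("ab"));   // false: only proper prefixes count
        System.out.println(trie.hasStoredPrefix("xy"));   // false: "x" was never inserted as a word
    }
}
```

With a plain hash set you would have to look up each prefix of the word separately. The trie answers the same question in one walk down the tree.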
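Now back to the example. With the most naive method, for every word we search all the words before it to see whether it is among them. That algorithm is O(n^2), which is clearly unacceptable for n = 100,000. So let us think about it differently. Suppose the word I am looking up is abcd. Among the earlier words, I obviously need not consider those that start with b, c, d, f and so on; I only need to check whether abcd is among the words that start with a. Likewise, among the words that start with a, I only need to consider those whose second letter is b, and so on. In this way a tree model gradually takes shape.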
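The narrowing-down described above might be sketched as follows for the example problem. FirstSeenTrie and firstSeen are hypothetical names chosen for this sketch; positions are assumed to be 1-based, and words are assumed to use only the letters a-z.

```java
/** Sketch: report where each word first appeared, using a trie. Names are illustrative. */
class FirstSeenTrie {
    private static final class Node {
        final Node[] next = new Node[26]; // children for the letters a-z
        int firstPosition = 0;            // 0 = no word ends here yet (positions are 1-based)
    }

    private final Node root = new Node();

    /**
     * Looks up the word and records it if it is new.
     * Returns the 1-based position of its first occurrence, or 0 if it has not been seen before.
     */
    int firstSeen(String word, int position) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            int index = word.charAt(i) - 'a'; // follow only the branch for this letter
            if (node.next[index] == null) {
                node.next[index] = new Node();
            }
            node = node.next[index];
        }
        int earlier = node.firstPosition;
        if (earlier == 0) {
            node.firstPosition = position;    // remember the first occurrence
        }
        return earlier;
    }

    public static void main(String[] args) {
        String[] words = {"abcd", "abc", "abcd", "b", "abc"};
        FirstSeenTrie trie = new FirstSeenTrie();
        for (int i = 0; i < words.length; i++) {
            // prints 0, 0, 1, 0, 2 (one value per line)
            System.out.println(trie.firstSeen(words[i], i + 1));
        }
    }
}
```

Each call walks at most as many nodes as the word has letters, so 100,000 words of length 10 cost on the order of a million steps, instead of the roughly n^2 comparisons of the naive method.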
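As we can see, level i of a trie can hold up to 26^i nodes. So, to save space, we create nodes only when they are needed, using either dynamically linked nodes or an array that simulates dynamic allocation. The number of nodes created (not counting the root) then never exceeds the number of words × the word length. (Reposted from an expert.)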
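As an illustration of the "array that simulates dynamic allocation" idea, here is a rough sketch. The class ArrayTrie, its fields, and its constructor parameters are assumptions made for this example; node 0 is taken to be the root, so a 0 entry in the table can double as "no child".

```java
/** Sketch of a trie stored in arrays instead of linked objects. Names are illustrative. */
class ArrayTrie {
    private final int[][] next;     // next[node][letter] = child id, 0 = no child
    private final boolean[] isWord; // isWord[node] = an inserted word ends at this node
    private int nodeCount = 1;      // node 0 is the root and is always present

    /** Reserves room for the worst case: maxWords * maxLength nodes plus the root. */
    ArrayTrie(int maxWords, int maxLength) {
        int capacity = maxWords * maxLength + 1;
        next = new int[capacity][26];
        isWord = new boolean[capacity];
    }

    /** Inserts a word made only of the letters a-z (assumption of this sketch). */
    void insert(String word) {
        int node = 0;
        for (int i = 0; i < word.length(); i++) {
            int c = word.charAt(i) - 'a';
            if (next[node][c] == 0) {
                next[node][c] = nodeCount++; // "allocate" the next unused row
            }
            node = next[node][c];
        }
        isWord[node] = true;
    }

    /** Returns true if exactly this word was inserted. */
    boolean contains(String word) {
        int node = 0;
        for (int i = 0; i < word.length(); i++) {
            node = next[node][word.charAt(i) - 'a'];
            if (node == 0) {
                return false; // fell off the tree
            }
        }
        return isWord[node];
    }

    /** Number of nodes in use, including the root. */
    int size() {
        return nodeCount;
    }

    public static void main(String[] args) {
        ArrayTrie trie = new ArrayTrie(3, 4);
        trie.insert("abc");
        trie.insert("abd");
        trie.insert("b");
        System.out.println(trie.contains("abc")); // true
        System.out.println(trie.contains("ab"));  // false: "ab" is only a path, not a word
        System.out.println(trie.size());          // 6: the root plus a, ab, abc, abd, b
    }
}
```

Note that at the example's full limits (100,000 words of length 10) this table would hold about 26 million int entries, roughly 100 MB, so in practice one would probably size it from the actual total number of letters rather than the worst case.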
A Java implementation of a trie that counts words and prefixes:
- import java.util.ArrayList;
- import java.util.Iterator;
- import java.util.List;
- /**
-  * A word trie that can only deal with the 26 letters of the English alphabet.
-  * @author Leeclipse
-  * @since 2007-11-21
-  */
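- public class Trie {
-     private Vertex root; // a trie has exactly one root vertex
-     // Inner class that represents one vertex (node) of the trie
-     protected class Vertex {
-         protected int words;      // number of added words that end at this vertex
-         protected int prefixes;   // number of added words that pass through this vertex and continue further
-         protected Vertex[] edges; // up to 26 child vertices, one per letter
-         Vertex() {
-             words = 0;
-             prefixes = 0;
-             edges = new Vertex[26]; // Java initialises every slot to null
-         }
-     }
-     public Trie() {
-         root = new Vertex();
-     }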
-     /**
-      * List all distinct words in the Trie, in alphabetical order.
-      *
-      * @return the words stored in the trie
-      */
-     public List<String> listAllWords() {
-         List<String> words = new ArrayList<String>();
-         depthFirstSearchWords(words, root, "");
-         return words;
-     }
-     /**
-      * Depth-first search from the given vertex, adding every word found to the list.
-      *
-      * @param words the list that collects the words
-      * @param vertex the vertex to search from
-      * @param wordSegment the letters on the path from the root to this vertex
-      */
-     private void depthFirstSearchWords(List<String> words, Vertex vertex, String wordSegment) {
-         // A word ends here whenever words > 0, even when longer words continue below it.
-         // Checking only for leaves would miss "eat" once "eaten" has been added.
-         if (vertex.words > 0) {
-             words.add(wordSegment);
-         }
-         Vertex[] edges = vertex.edges;
-         for (int i = 0; i < edges.length; i++) {
-             if (edges[i] != null) {
-                 depthFirstSearchWords(words, edges[i], wordSegment + (char) ('a' + i));
-             }
-         }
-     }
-     /** Return the prefix counter of the vertex reached by the given prefix (case-insensitive), or 0 if there is no such path. */
-     public int countPrefixes(String prefix) {
-         return countPrefixes(root, prefix);
-     }
-     private int countPrefixes(Vertex vertex, String prefixSegment) {
-         if (prefixSegment.length() == 0) { // every character of the prefix has been consumed
-             return vertex.prefixes;
-         }
-         // Lower-case the character, as addWord does, so "Ch" and "ch" give the same answer
-         int index = Character.toLowerCase(prefixSegment.charAt(0)) - 'a';
-         if (index < 0 || index >= 26 || vertex.edges[index] == null) { // no word follows this path
-             return 0;
-         }
-         return countPrefixes(vertex.edges[index], prefixSegment.substring(1));
-     }
-     /** Return how many times the given word was added (case-insensitive). */
-     public int countWords(String word) {
-         return countWords(root, word);
-     }
-     private int countWords(Vertex vertex, String wordSegment) {
-         if (wordSegment.length() == 0) { // every character of the word has been consumed
-             return vertex.words;
-         }
-         int index = Character.toLowerCase(wordSegment.charAt(0)) - 'a';
-         if (index < 0 || index >= 26 || vertex.edges[index] == null) { // the word does NOT exist
-             return 0;
-         }
-         return countWords(vertex.edges[index], wordSegment.substring(1));
-     }
-     /**
-      * Add a word to the Trie.
-      *
-      * @param word The word to be added; it is lower-cased and may contain only the letters a-z.
-      */
-     public void addWord(String word) {
-         // Validate first so that a bad character cannot leave a half-inserted word behind
-         for (int i = 0; i < word.length(); i++) {
-             char c = Character.toLowerCase(word.charAt(i));
-             if (c < 'a' || c > 'z') {
-                 throw new IllegalArgumentException("Only the letters a-z are supported: " + word);
-             }
-         }
-         addWord(root, word);
-     }
-     /**
-      * Add the word from the specified vertex.
-      * @param vertex The specified vertex.
-      * @param word The remaining part of the word to be added.
-      */
-     private void addWord(Vertex vertex, String word) {
-         if (word.length() == 0) { // all characters of the word have been added
-             vertex.words++;
-         } else {
-             vertex.prefixes++;
-             char c = Character.toLowerCase(word.charAt(0));
-             int index = c - 'a';
-             if (vertex.edges[index] == null) { // the edge does NOT exist yet
-                 vertex.edges[index] = new Vertex();
-             }
-             addWord(vertex.edges[index], word.substring(1)); // go to the next character
-         }
-     }
-     public static void main(String[] args) { // used only for testing
-         Trie trie = new Trie();
-         trie.addWord("China");
-         trie.addWord("China");
-         trie.addWord("China");
-         trie.addWord("crawl");
-         trie.addWord("crime");
-         trie.addWord("ban");
-         trie.addWord("China");
-         trie.addWord("english");
-         trie.addWord("establish");
-         trie.addWord("eat");
-         System.out.println(trie.root.prefixes);
-         System.out.println(trie.root.words);
-         List<String> list = trie.listAllWords();
-         Iterator<String> listiterator = list.iterator();
-         while (listiterator.hasNext()) {
-             System.out.println(listiterator.next());
-         }
-         int count = trie.countPrefixes("ch");
-         int count1 = trie.countWords("china");
-         System.out.println("the count of ch prefixes:" + count);
-         System.out.println("the count of china countWords:" + count1);
-     }
- }
- Running it:
- C:\test>java Trie
- 10
- 0
- ban
- china
- crawl
- crime
- eat
- english
- establish
- the count of ch prefixes:4
- the count of china countWords:4