[LC] 347. Top K Frequent Elements

Original

2019-08-15 09:58

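There is more than one way to solve this problem. The most intuitive is a heap, which in Java means PriorityQueue. Count how many times each number occurs, then use a min-heap of size k to keep the k numbers with the highest counts. What you put in the PriorityQueue can't be a plain Integer, though: knowing the order of the counts isn't enough, you also need to know which number each count belongs to. So you have to override the default comparator. The PriorityQueue can hold any data structure that expresses a key-value pair, where the key is a number from the array and the value is its count, and the overridden comparator simply orders by value. Here let's try writing the comparator with the Lambda Driver newly introduced in Java 8 (the "lamb" system from Full Metal Panic)... no wait, with a lambda expression. See https://segmentfault.com/a/1190000009186509 or similar for details... The code is as follows: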

    // Needs: import java.util.*;
    public List<Integer> topKFrequent(int[] nums, int k) {
        HashMap<Integer, Integer> countMap = new HashMap<>();
        // Min-heap of {number, count} pairs, ordered by count.
        Queue<Integer[]> countQueue = new PriorityQueue<>((a, b) -> (a[1] - b[1]));
        for (int i : nums) {
            countMap.put(i, countMap.getOrDefault(i, 0) + 1);
        }
        
        for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
            Integer num = entry.getKey();
            Integer count = entry.getValue();
            // Push only if there is still room or this count beats the current minimum.
            if (countQueue.size() < k || countQueue.peek()[1] < count) {
                Integer[] resultRow = {num, count};
                countQueue.add(resultRow);
            }
            
            if (countQueue.size() > k) {
                countQueue.poll();
            }
        }
        
        // Polling yields counts in ascending order, so prepend to get the most frequent first.
        LinkedList<Integer> result = new LinkedList<>();
        while (!countQueue.isEmpty()) {
            result.addFirst(countQueue.poll()[0]);
        }
        
        return result;
    }
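
The paragraph before this listing notes that the PriorityQueue can hold any key-value structure, not only an Integer[] pair. As a minimal sketch of that idea, the variant below stores the HashMap's own Map.Entry objects and orders them with Map.Entry.comparingByValue(). The class name TopKWithEntries is made up for this illustration, and this is just one possible way to write it, not the code this post was tested with.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class name, used only for this illustration.
class TopKWithEntries {
    static List<Integer> topKFrequent(int[] nums, int k) {
        // Count occurrences of each number.
        Map<Integer, Integer> countMap = new HashMap<>();
        for (int num : nums) {
            countMap.merge(num, 1, Integer::sum);
        }

        // Min-heap ordered by the entry's value, i.e. the count.
        Comparator<Map.Entry<Integer, Integer>> byCount = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<Integer, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
            heap.add(entry);
            if (heap.size() > k) {
                heap.poll(); // evict the least frequent entry
            }
        }

        // Polling yields ascending counts, so prepend to get descending order.
        LinkedList<Integer> result = new LinkedList<>();
        while (!heap.isEmpty()) {
            result.addFirst(heap.poll().getKey());
        }
        return result;
    }
}
```

This version always adds and then evicts, instead of peeking first; the heap still never grows beyond k + 1 entries.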

Below is a similar piece of code, but with TreeSet in place of PriorityQueue. The overall flow is basically the same apart from a few small changes. Code first:

    public List<Integer> topKFrequent(int[] nums, int k) {
        HashMap<Integer, Integer[]> countMap = new HashMap<>();
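        // Order by count, then by number. Compare with equals/Integer.compare, not != and subtraction:
        // != on Integer compares references, which goes wrong once counts exceed the cached range (127).
        TreeSet<Integer[]> countSet = new TreeSet<>((a, b) -> (!a[1].equals(b[1]) ? Integer.compare(a[1], b[1]) : Integer.compare(a[0], b[0])));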
        
        for (int num : nums) {
            Integer[] numCnt = countMap.getOrDefault(num, new Integer[]{num, 0});
            if (!countMap.containsKey(num)) {
                countMap.put(num, numCnt);
            }

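            // Take the pair out before changing its count; mutating an element while it sits in a TreeSet breaks the ordering.
            if (countSet.contains(numCnt)) {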
                countSet.remove(numCnt);
            }
            
            numCnt[1]++;
            countSet.add(numCnt);
            if (countSet.size() > k) {
                countSet.pollFirst();
            }
        }
        
        LinkedList<Integer> result = new LinkedList<>();
        for (Integer[] numCnt : countSet) {
            result.addFirst(numCnt[0]);
        }
        
        return result;
    }

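As you can see, the first difference is that this version does everything in a single for loop. The main purpose is to handle a follow-up to this problem (apparently from Dropbox): if the input is a stream rather than a fixed array, can we still get the answer? The difference between a stream and a fixed array is that a new number may arrive at any time, so the question is whether the answer can be updated quickly. The key to handling a stream is being able to update the sorted data structure on the fly. PriorityQueue can't do this efficiently, because it offers no way to update an element that is already in the queue, and its remove(Object) is a linear scan. (A heap as such can support it: after each update, re-heapify by sifting the changed element up or down. But writing a complete heap from scratch is a real hassle.) This is where TreeSet does the job: removing a node from a TreeSet and inserting a node into it are both O(log N) operations. So compared with the PriorityQueue version, I only need an extra contains-and-remove step to handle a stream. TreeSet fits the bill nicely.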
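
To make the streaming follow-up concrete, here is a minimal sketch that wraps the same TreeSet idea in a small class: add accepts one number from the stream, and topK returns the current answer at any moment. The names TopKStream, add and topK are invented for this example, and it uses int[] pairs instead of Integer[] so the comparator works on primitives; treat it as an illustration under those assumptions rather than a reference solution.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical class and method names, chosen only for this sketch.
class TopKStream {
    private final int k;
    // number -> {number, count}; the map keeps every count, even for pairs evicted from the set
    private final Map<Integer, int[]> countMap = new HashMap<>();
    // ordered by count, then by number, so two distinct numbers never compare as equal
    private final TreeSet<int[]> topSet = new TreeSet<>(
            Comparator.<int[]>comparingInt(a -> a[1]).thenComparingInt(a -> a[0]));

    TopKStream(int k) {
        this.k = k;
    }

    // Feed one number from the stream; costs O(log k) on top of the hash lookup.
    void add(int num) {
        int[] numCnt = countMap.computeIfAbsent(num, n -> new int[]{n, 0});
        topSet.remove(numCnt); // must happen before the count changes
        numCnt[1]++;
        topSet.add(numCnt);
        if (topSet.size() > k) {
            topSet.pollFirst(); // drop the pair with the smallest count
        }
    }

    // Current top k numbers, most frequent first.
    List<Integer> topK() {
        List<Integer> result = new ArrayList<>();
        for (int[] numCnt : topSet.descendingSet()) {
            result.add(numCnt[0]);
        }
        return result;
    }
}
```

With k = 2, feeding 1, 1, 2, 3 and calling topK gives [1, 3] (the tie between 2 and 3 is broken by the larger number); feeding two more 3s changes the answer to [3, 1].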

Both approaches meet LeetCode's requirement of being better than O(n log n): they run in O(n log k).

There is actually an even better approach, based on this post: https://leetcode.com/problems/top-k-frequent-elements/discuss/81602/Java-O(n)-Solution-Bucket-Sort

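As the link suggests, the core of the algorithm is bucket sort. It isn't a full bucket sort, though; it's one that needs no sorting inside the buckets. The bucket design is crude but effective: bucket i holds the numbers that occur i times (stored at index i - 1 in the code), so there are at most n buckets. As in the heap approach, first count each number's occurrences with a hash table, then drop each number into the bucket matching its count. Finally, walk the buckets from the highest to the lowest and return as soon as the result holds k numbers. The code is as follows: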

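    @SuppressWarnings("unchecked") // the generic array creation below is unchecked but safe here
    public List<Integer> topKFrequent(int[] nums, int k) {
        // numsList[i] holds the numbers that occur exactly i + 1 times.
        LinkedList<Integer>[] numsList = new LinkedList[nums.length];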
        HashMap<Integer, Integer> countMap = new HashMap<>();
        for (int num : nums) {
            countMap.put(num, countMap.getOrDefault(num, 0) + 1);
        }
        
        for (Map.Entry<Integer, Integer> countPair : countMap.entrySet()) {
            Integer num = countPair.getKey();
            Integer count = countPair.getValue();
            if (numsList[count - 1] == null) {
                numsList[count - 1] = new LinkedList<>();
            }
            numsList[count - 1].add(num);
        }
        
        LinkedList<Integer> result = new LinkedList<>();
        for (int i = numsList.length - 1; i >= 0; i--) {
            List<Integer> counts = numsList[i];
            if (counts != null) {
                for (Integer num : counts) {
                    result.add(num);
                    if (result.size() == k) {
                        return result;
                    }
                }
            }
        }
        
        return result;
    }

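This algorithm is clearly O(n): there are n buckets, but they hold at most n elements in total, so even counting the walk over empty buckets the cost is still O(n).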
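
To see why the scan stays linear, it can help to look at the bucket layout for a small input. The sketch below is only an illustration: the class name BucketLayoutDemo and the method buildBuckets are made up, and unlike the listing above it uses a List of Lists with index i meaning "occurs exactly i times" (so there are n + 1 buckets and bucket 0 is always empty).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class BucketLayoutDemo {
    // Returns buckets where index i holds the numbers that occur exactly i times.
    static List<List<Integer>> buildBuckets(int[] nums) {
        Map<Integer, Integer> countMap = new HashMap<>();
        for (int num : nums) {
            countMap.merge(num, 1, Integer::sum);
        }

        // One bucket per possible count 0..n.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i <= nums.length; i++) {
            buckets.add(new ArrayList<>());
        }

        // Each distinct number lands in exactly one bucket.
        for (Map.Entry<Integer, Integer> entry : countMap.entrySet()) {
            buckets.get(entry.getValue()).add(entry.getKey());
        }
        return buckets;
    }

    public static void main(String[] args) {
        // prints [[], [3], [2], [1], [], [], []]
        System.out.println(buildBuckets(new int[]{1, 1, 1, 2, 2, 3}));
    }
}
```

For six input numbers there are seven buckets but only three stored values, so walking from the last bucket down touches at most n + 1 buckets plus at most n values.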
