LeetCode 17.26. Sparse Similarity

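The similarity of two documents (each made up of distinct words) is the number of elements in their intersection divided by the number of elements in their union. For example, the similarity of {1, 5, 3} and {1, 7, 2, 3} is 0.4, because the intersection has 2 elements and the union has 5. We are given a list of long documents; the elements within each document are distinct, and each document is associated with an ID. The similarity is very "sparse": for any two documents picked at random, the similarity is close to 0. Design an algorithm that returns each pair of document IDs together with their similarity. Only pairs with similarity greater than 0 need to be output. Ignore empty documents. For simplicity, assume each document is represented as an array of distinct integers.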
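
As a quick check of the definition, here is a minimal sketch that computes the similarity of two documents directly. The helper name `jaccard` is hypothetical and not part of the original solutions; it also assumes that a pair of empty documents counts as similarity 0.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original post): similarity of two
// documents, each given as a list of distinct integers.
double jaccard(const std::vector<int>& a, const std::vector<int>& b) {
    if (a.empty() && b.empty()) return 0.0;  // assumption: empty pair is dissimilar
    std::unordered_set<int> seen(a.begin(), a.end());
    std::size_t common = 0;  // size of the intersection
    for (int x : b) {
        if (seen.count(x)) ++common;
    }
    // |A union B| = |A| + |B| - |A intersect B|
    std::size_t uni = a.size() + b.size() - common;
    return static_cast<double>(common) / static_cast<double>(uni);
}
```

Calling `jaccard({1, 5, 3}, {1, 7, 2, 3})` gives 2 / 5, matching the 0.4 in the statement.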

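The input is a two-dimensional array docs, where docs[i] is the document with id i. Return an array of strings, one for each pair of documents whose similarity is greater than 0, in the format {id1},{id2}: {similarity}, where id1 is the smaller of the two ids and similarity is given to 4 decimal places. The array may be returned in any order.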

Example:

Input:
[
[14, 15, 100, 9, 3],
[32, 1, 9, 3, 5],
[15, 29, 2, 6, 8, 7],
[7, 10]
]
Output:
[
"0,1: 0.2500",
"0,2: 0.1000",
"2,3: 0.1429"
]
Constraints:

docs.length <= 500
docs[i].length <= 500
The number of document pairs with similarity greater than 0 does not exceed 1000

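Approach

Brute-force enumeration of every pair of documents is O(n^2*c), for n documents of length up to c. It barely gets accepted, and the strictness of the precision check had me questioning my sanity.
Alternatively, for each number, enumerate the group of documents that contain it and count one shared element for every pair in that group.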
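
Since precision is the painful part, here is a minimal sketch of one way to sidestep floating point when formatting the result. It assumes the expected output rounds half up to 4 decimal places; the helper name `formatRatio` is hypothetical and does not appear in the solutions below.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format common / uni with 4 decimal places, rounding
// half up, using integer arithmetic only. Requires uni > 0.
std::string formatRatio(int common, int uni) {
    // floor(x + 0.5) with x = common * 10000 / uni, computed exactly.
    long long scaled = (2LL * common * 10000 + uni) / (2LL * uni);
    char buf[32];
    std::snprintf(buf, sizeof buf, "%lld.%04lld", scaled / 10000, scaled % 10000);
    return buf;
}
```

For instance, `formatRatio(1, 7)` yields "0.1429", and the exact tie 1/32 = 0.03125 yields "0.0313" rather than "0.0312".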

Code

Enumerate every pair

class Solution {
public:
    using ve2 = vector<vector<int>>;
    using ve1 = vector<int>;
    using si = ve1::size_type;
    vector<string> ans;
    
    inline string itoa(int a) {
        string s;
        if (a == 0) return "0";
        while (a) {
            s += (a%10)+'0';
            a/=10;
        }
        reverse(s.begin(), s.end());
        return s;
    }
    string out(int a) {
        string s = ": ";
        s += itoa(a/10000);
        s += '.';
        a %= 10000;
        string ss;
        for (int i = 0; i < 4; ++i) {
            ss += ((a%10)+'0');
            a/=10;
        }
        reverse(ss.begin(), ss.end());
        return s+ss;
    }
    void similarity(ve1 &a, ve1 &b, string &s) {
        int k = 0;
        for (si i = 0, j = 0; i < a.size() && j < b.size();) {
            if (a[i] == b[j]) k++,i++,j++;
            else if (a[i] > b[j]) j++;
            else i++;
        }
        if (k == 0) return ;
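        // Round k / union to 4 decimal places (half up) in integer arithmetic;
        // truncating a double product could land one unit too low.
        int u = static_cast<int>(a.size() + b.size()) - k;  // size of the union
        k = (2 * k * 10000 + u) / (2 * u);
        if (k == 0) return;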
        s += out(k);
        ans.push_back(s);
        return ;
    }
    void sol(ve2 &v) {
        for (si i = 0; i < v.size(); ++i) {
            for (si j = i+1; j < v.size(); ++j) {
                string a = itoa(i)+","+itoa(j);
                similarity(v[i], v[j], a);
            }
        }
    }

    vector<string> computeSimilarities(vector<vector<int>>& docs) {
        for (auto & i : docs)
            sort(i.begin(), i.end());
        sol(docs);
        return ans;
    }
};

Accepted code 2: for each number, enumerate the documents that contain it

class Solution {
public:
    vector<string> computeSimilarities(vector<vector<int>>& docs) {
        map<int, vector<int>> ma;
        map<pair<int, int>, int> counts;
        vector<int>cnt;
        for (int i = 0; i < docs.size(); ++i) {
            cnt.push_back(docs[i].size());
            for (auto &d : docs[i]) {
                ma[d].push_back(i);
            }
        }
        
        for (const auto &[a, b] : ma) {
            for (int i = 0; i < b.size(); ++i) {
                for (int j = i+1; j < b.size(); ++j) {
                    ++counts[{b[i],b[j]}];
                }
            }
        }
        vector<string>ans;
        for (const auto &[a, b] : counts) {
            int fm = cnt[a.first]+cnt[a.second]-b;
            double f = b*1./fm;
            char s[20];
            // The 1e-9 nudges exact ties such as 0.03125 upward before formatting.
            snprintf(s, sizeof s, "%d,%d: %.4f", a.first, a.second, f+1e-9);
            ans.push_back(s);
        }
        return ans;
    }
};
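
To check the counting step on its own, the sketch below isolates the inverted-index idea as a free function. The name `sharedCounts` is hypothetical, and it swaps the outer `map` for an `unordered_map`, which is a design choice of this sketch rather than something the accepted code does.

```cpp
#include <cstddef>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: for every pair of documents (i < j) that share at
// least one element, count how many elements they share.
std::map<std::pair<int, int>, int> sharedCounts(
        const std::vector<std::vector<int>>& docs) {
    // Inverted index: element value -> ids of the documents containing it.
    std::unordered_map<int, std::vector<int>> index;
    for (int i = 0; i < static_cast<int>(docs.size()); ++i) {
        for (int x : docs[i]) index[x].push_back(i);
    }
    std::map<std::pair<int, int>, int> counts;
    for (const auto& entry : index) {
        const std::vector<int>& ids = entry.second;  // ids are in increasing order
        for (std::size_t i = 0; i < ids.size(); ++i) {
            for (std::size_t j = i + 1; j < ids.size(); ++j) {
                ++counts[{ids[i], ids[j]}];
            }
        }
    }
    return counts;
}
```

On the example input this gives 2 shared elements for the pair (0, 1) and 1 each for (0, 2) and (2, 3); dividing by the union sizes 8, 10 and 7 reproduces 0.2500, 0.1000 and 0.1429.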
