Clustering Uncertain Data Based on Probability Distribution Similarity

Original

2019-02-28 17:15


IEEE Transactions on Knowledge and Data Engineering

Bin Jiang et al.

Key Points

Common clustering algorithms group objects by their coordinates. When each object is itself a collection of data points, however, the distribution inside the object should also be taken into account. For example, two items may both be rated 4.5 stars while their votes are distributed very differently, in which case we may want to place them in different clusters.
The overall idea remains similar to K-means, but the authors use the Kullback-Leibler (KL) divergence, also known as relative entropy, to measure the dissimilarity between distributions.
Kernel density estimation is used to estimate each object's distribution and the KL divergence between distributions, and the fast Gauss transform is used to speed up the computation.
K-medoids is used as the clustering method.
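
The points above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`kde`, `kl_divergence`, `k_medoids`), the fixed bandwidth of 0.5, the farthest-first initialization, the clamping of negative estimates to zero, and the toy rating data are all assumptions made for illustration. The KL divergence D(P || Q) is approximated by averaging log(p(x)/q(x)) over the samples of P, with p and q estimated by Gaussian kernel density estimation. The densities are evaluated directly, without the fast Gauss transform speed-up.

```python
import math

def kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return density

def kl_divergence(p_samples, q_samples, bandwidth=0.5):
    """Estimate D(P || Q) as the mean of log(p/q) over the samples of P."""
    p, q = kde(p_samples, bandwidth), kde(q_samples, bandwidth)
    eps = 1e-300  # guard against log(0) if a density underflows
    total = sum(math.log(max(p(x), eps) / max(q(x), eps)) for x in p_samples)
    # The sample estimate can be slightly negative; clamp it (an assumption).
    return max(0.0, total / len(p_samples))

def k_medoids(objects, k, max_iter=20):
    """Cluster uncertain objects (lists of samples) with KL(object || medoid)."""
    n = len(objects)
    dist = [[kl_divergence(objects[i], objects[j]) for j in range(n)] for i in range(n)]

    # Deterministic farthest-first initialization (an illustrative choice).
    medoids = [0]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(max(candidates, key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds):
        # Each object joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    labels = assign(medoids)
    for _ in range(max_iter):
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes the total divergence from its members.
            new_medoids.append(min(members, key=lambda m: sum(dist[i][m] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
        labels = assign(medoids)
    return labels, medoids

# Toy data: six items with average ratings near 4.5 stars.
items = [
    [4, 5, 4, 5, 4, 5, 4, 5],   # votes split between 4 and 5
    [4, 4, 4, 5, 5, 5, 5, 5],
    [4, 4, 4, 4, 4, 5, 5, 5],
    [5, 5, 5, 5, 5, 5, 5, 1],   # mostly 5s with a few very low votes
    [5, 5, 5, 5, 5, 5, 4, 1],
    [5] * 14 + [1, 1],
]
labels, medoids = k_medoids(items, k=2)
print(labels, medoids)
```

On this toy data the first three items and the last three items should end up in different clusters, even though their average ratings are nearly the same, which is the situation described in the first key point.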
