K-means Clustering

2020-02-22 08:39

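K-means partitions the points in a space into K clusters. Objects in the same cluster are highly similar to one another, while objects in different clusters are not.

Each cluster is represented by the mean of its members, which serves as the cluster center. The distance from a new point to each center decides which cluster the point belongs to.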
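As a small illustration of using the mean as the cluster center, the sketch below computes two centers and assigns a new point to the nearer one. The helper names `mean_point` and `nearest_center`, the sample data, and the choice of Euclidean distance are assumptions made for this example; the text above does not fix a distance measure.

```python
import math

def mean_point(points):
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

# Hypothetical sample data: two small clusters in the plane.
cluster_a = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
cluster_b = [(10.0, 10.0), (12.0, 10.0)]

centers = [mean_point(cluster_a), mean_point(cluster_b)]
new_point = (9.0, 8.0)
label = nearest_center(new_point, centers)  # index of the closest center
```

Here `label` is the index into `centers` of the cluster the new point would join.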

Initial values:

Use prior knowledge to choose K means as the starting values for the iteration.

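Iteration steps:

1: Select k of the n data objects as the initial cluster centers.

2: Assign each of the remaining n-k data objects to the cluster whose center is nearest to it.

3: Compute the center of each of the k resulting clusters (the mean of all objects in that cluster).

4: Repeat steps 2 and 3, reassigning every object to its nearest updated center.

5: Stop when the cluster centers no longer move, or move only slightly and have stabilized.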
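The steps above can be sketched in plain Python as follows. This is a minimal sketch, not a reference implementation: the function name `kmeans`, the parameters `max_iter` and `tol`, and the use of Euclidean distance are assumptions. Unlike step 2 as written, it assigns all n points on every pass, including the ones chosen as initial centers, which start at distance zero from themselves.

```python
import math

def kmeans(points, initial_centers, max_iter=100, tol=1e-6):
    """Run K-means from the given initial centers.

    Returns (centers, clusters), where clusters[i] holds the points
    assigned to centers[i] during the last assignment pass.
    """
    centers = [tuple(c) for c in initial_centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)

        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(old)

        # Step 5: stop once no center moved by more than `tol`.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, clusters

# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, initial_centers=[points[0], points[3]])
```

The `tol` threshold plays the role of the "moves only slightly" condition in step 5, and `max_iter` guards against runs that never settle.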

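The criterion function is usually the mean squared error: the average squared distance between each object and the center of its cluster.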
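One way to compute such a criterion is sketched below. It assumes squared Euclidean distance averaged over all assigned objects; the function name `mean_squared_error` and the sample data are hypothetical, and other normalizations (for example the plain sum of squared errors) are also in common use.

```python
def mean_squared_error(clusters, centers):
    """Average squared Euclidean distance from each point to its cluster center."""
    total = 0.0
    count = 0
    for members, center in zip(clusters, centers):
        for p in members:
            total += sum((a - b) ** 2 for a, b in zip(p, center))
            count += 1
    if count == 0:
        raise ValueError("no points were assigned to any cluster")
    return total / count

# Hypothetical sample data: every point lies exactly 1 unit from its center.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 7.0)]]
centers = [(1.0, 0.0), (5.0, 6.0)]
mse = mean_squared_error(clusters, centers)
```

A lower value means the objects sit closer to their centers, so the value can be compared across iterations or across different choices of K on the same data.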

Apache Mahout runs K-means from the command line as follows:

bin/mahout kmeans \
    -i <input vectors directory> \
    -c <input clusters directory> \
    -o <output working directory> \
    -k <optional number of initial clusters to sample from input vectors> \
    -dm <DistanceMeasure> \
    -x <maximum number of iterations> \
    -cd <optional convergence delta. Default is 0.5> \
    -ow <overwrite output directory if present> \
    -cl <run input vector clustering after the iterations have taken place> \
    -xm <execution method: sequential or mapreduce>

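Note: when -k is specified, all clusters in the -c directory are overwritten, and -k points are sampled at random from the input vectors to serve as the initial cluster centers.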
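The idea behind the -k option, drawing k input vectors at random as starting centers, can be illustrated in a few lines of Python. This is only a sketch of the concept and not Mahout's own sampling code; the function name `sample_initial_centers`, the `seed` parameter, and the sample vectors are assumptions.

```python
import random

def sample_initial_centers(vectors, k, seed=None):
    """Pick k distinct input vectors at random to use as initial centers."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of input vectors")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(vectors, k)

# Hypothetical input vectors.
vectors = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (5.0, 5.0)]
initial_centers = sample_initial_centers(vectors, k=2, seed=42)
```

Because the starting centers are random, different runs can converge to different clusterings, which is one reason to supply centers chosen from prior knowledge when it is available.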
