Finding centroids with the K-means clustering algorithm in Python

Original

2020-06-16 09:04

import numpy as np

# Euclidean distance between two points
def distEclud(x,y):
    return np.sqrt(np.sum((x-y)**2))  # square root of the sum of squared differences

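# Build a set of k random centroids for the given data set
def randCent(dataSet,k):
    m,n = dataSet.shape
    centroids = np.zeros((k,n))
    # Pick k distinct rows (requires k <= m) so no two centroids start at the same point
    indices = np.random.choice(m,k,replace=False)
    for i in range(k):
        centroids[i,:] = dataSet[indices[i],:]
    return centroids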
 
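# K-means clustering
def kmeans_open(dataSet,k):
    m = np.shape(dataSet)[0]  # number of samples (rows)
    # Column 0: index of the cluster each sample belongs to (-1 = not yet assigned)
    # Column 1: squared distance from the sample to its cluster centroid
    clusterAssment = np.zeros((m,2))
    clusterAssment[:,0] = -1
    clusterChange = True

    # Step 1: initialize the centroids
    centroids = randCent(dataSet,k)
    while clusterChange:
        clusterChange = False

        # Loop over all samples (rows)
        for i in range(m):
            minDist = np.inf
            minIndex = -1

            # Step 2: loop over all centroids and find the nearest one
            for j in range(k):
                # Euclidean distance from this sample to centroid j
                distance = distEclud(centroids[j,:],dataSet[i,:])
                if distance < minDist:
                    minDist = distance
                    minIndex = j
            # Step 3: update the cluster this sample belongs to
            if clusterAssment[i,0] != minIndex:
                clusterChange = True
            # Refresh the stored error on every pass, because the centroids move
            clusterAssment[i,:] = minIndex,minDist**2
        # Step 4: update the centroids
        for j in range(k):
            pointsInCluster = dataSet[clusterAssment[:,0] == j]  # all points in cluster j
            if len(pointsInCluster) > 0:  # keep the old centroid if the cluster is empty
                centroids[j,:] = np.mean(pointsInCluster,axis=0)   # mean over the cluster's rows

    return clusterAssment[:,0], centroids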
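
The same four steps (initialize, assign, check for changes, update) can also be written without the inner Python loops. The following is a minimal sketch under stated assumptions, not the original author's code: the function name `kmeans_vectorized`, the `max_iter` and `seed` parameters, and the synthetic two-blob data are all made up for illustration. It assumes `data` is a plain 2-D NumPy array.

```python
import numpy as np

def kmeans_vectorized(data, k, max_iter=100, seed=0):
    """Hypothetical vectorized variant of the loop-based k-means routine."""
    rng = np.random.default_rng(seed)
    # Step 1: start from k distinct rows of the data (assumed initialization)
    centroids = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    labels = np.full(len(data), -1)  # -1 means "not yet assigned"
    for _ in range(max_iter):
        # Step 2: distances from every sample to every centroid, shape (m, k)
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 3: stop once no sample changes cluster
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its members
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old centroid for an empty cluster
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Two synthetic blobs (made-up data, for illustration only)
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

labels, centroids = kmeans_vectorized(data, k=2)
print(centroids)
```

Like any k-means run, the result depends on the random starting centroids, so the sketch only guarantees that each returned centroid is the mean of the points assigned to it once the loop has converged.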
