python-scikit-learn-DBSCAN

Original

2020-06-15 21:57

import numpy as np
import sklearn.cluster as ske
#from sklearn.cluster import DBSCAN
from sklearn import metrics
import matplotlib.pyplot as plt

mac2id={}       # maps each MAC address to its index in onlinetimes
onlinetimes=[]  # one (start hour, online duration) pair per MAC address
with open('c:/pythonpractice/python-data/netDuration.txt') as f:
    lines=f.readlines()
# skip the header row, then store one record per MAC address
for line in lines[1:]:
    fields=line.split('\t')
    mac=fields[2]
    onlinetime=int(fields[6])
    # field 4 holds the start timestamp; keep only the hour
    starttime=int(fields[4].split(' ')[1].split(':')[0])
    if mac not in mac2id:
        mac2id[mac]=len(onlinetimes)  # index of this MAC's entry in onlinetimes
        onlinetimes.append((starttime,onlinetime))
    else:
        # a later record for the same MAC overwrites the earlier one
        onlinetimes[mac2id[mac]]=(starttime,onlinetime)
real_X=np.array(onlinetimes).reshape((-1,2))  # two columns: start hour, online duration
X=real_X[:,0:1]  # cluster on the start hour only

db=ske.DBSCAN(eps=0.01,min_samples=20).fit(X)
labels=db.labels_
print('Labels:')
print(labels) #print the labels


ratio=np.sum(labels==-1)/len(labels)  # fraction of points labelled as noise (-1)
print('noise ratio:',format(ratio,'.2%'))

n_clusters_=len(set(labels))-(1 if -1 in labels else 0)

print('estimated number of clusters: %d' % n_clusters_)
print('silhouette coefficient: %0.3f' % metrics.silhouette_score(X,labels))

#print the cluster data and the label
for i in range(n_clusters_):
    print('Cluster',i,':')
    print(list(X[labels==i].flatten()))
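
The parsing step in the script depends on the layout of netDuration.txt, which is not shown in this post. The sketch below isolates that step so it can be checked on a single row. It assumes tab-separated records with the MAC address in field 2, a start timestamp of the form `YYYY-MM-DD HH:MM:SS` in field 4, and the online duration in field 6. The helper name `parse_record` and the sample row are invented for illustration only.

```python
def parse_record(line):
    """Return (mac, start_hour, duration) from one tab-separated record."""
    fields = line.rstrip('\n').split('\t')
    mac = fields[2]
    # Assumed timestamp layout: 'YYYY-MM-DD HH:MM:SS'; keep only the hour.
    start_hour = int(fields[4].split(' ')[1].split(':')[0])
    duration = int(fields[6])
    return mac, start_hour, duration


# Invented sample row; every value here is hypothetical.
sample = ('1\tu001\tAA11BB22CC33\t10.0.0.1\t'
          '2020-06-15 21:57:00\t2020-06-15 22:30:00\t1980\n')
print(parse_record(sample))
```

Running the helper on a few real rows before clustering is a quick way to confirm that the field positions match the actual file.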

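Because X holds whole-number start hours and eps=0.01, two points can only be neighbours when they share the same hour. Under that condition, an hour should form a cluster exactly when at least min_samples records have that hour, and every other record should be labelled as noise. The sketch below reproduces that reasoning in plain Python as an aid to reading the output. It is not how scikit-learn implements DBSCAN, the function name `expected_labels` is hypothetical, and the numbering of clusters by first appearance is an assumption about scikit-learn's behaviour.

```python
from collections import Counter


def expected_labels(hours, min_samples):
    """Labels expected for integer data when eps is smaller than 1."""
    counts = Counter(hours)
    cluster_of = {}  # hour -> cluster id, assigned in order of first appearance
    labels = []
    for h in hours:
        if counts[h] < min_samples:
            labels.append(-1)  # too few identical values: noise
            continue
        if h not in cluster_of:
            cluster_of[h] = len(cluster_of)
        labels.append(cluster_of[h])
    return labels


# Invented start hours, with a small min_samples to keep the example short.
hours = [22, 22, 22, 8, 8, 23, 22]
labels = expected_labels(hours, min_samples=3)
noise_ratio = labels.count(-1) / len(labels)
print(labels)                       # [0, 0, 0, -1, -1, -1, 0]
print(format(noise_ratio, '.2%'))   # 42.86%
```

With the script's min_samples=20, this means an hour shows up as a cluster only if at least 20 distinct MAC addresses started a session in that hour. A larger eps (1 or more) would let adjacent hours merge into one cluster.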