Visualizing the Decision Boundary of a Logistic Regression Classifier

Original

呵呵镜

2020-06-09 18:12

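I have recently been reviewing machine learning material. The most basic example in the course uses LogisticRegression from sklearn to train a classifier on a data set and then plots the decision boundary; the course shows the resulting figure.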

Here is my program.

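First, load the data. The data set provided with the exercise has three columns: x1, x2, and y. x1 and x2 are the features, and y is the class label, which takes one of two values, 0 or 1, so this is a binary classification problem.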

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
data = pd.read_csv('data.csv')
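
The exercise's data.csv is not included with this post. For readers who want something to run, here is a minimal sketch that builds a stand-in table with the same three columns. The function name make_toy_data, the cluster centres, and the assumption that both features lie in [0, 1] are mine, not part of the course material.

```python
import numpy as np
import pandas as pd

def make_toy_data(n_per_class=50, seed=0):
    """Build a hypothetical stand-in for data.csv with columns x1, x2, y."""
    rng = np.random.default_rng(seed)
    # Class 0 is centred near (0.3, 0.3) and class 1 near (0.7, 0.7);
    # both centres are assumptions made only for illustration.
    class0 = rng.normal(loc=0.3, scale=0.1, size=(n_per_class, 2))
    class1 = rng.normal(loc=0.7, scale=0.1, size=(n_per_class, 2))
    # Keep both features inside [0, 1].
    points = np.clip(np.vstack([class0, class1]), 0.0, 1.0)
    labels = np.repeat([0, 1], n_per_class)
    return pd.DataFrame({"x1": points[:, 0], "x2": points[:, 1], "y": labels})

toy = make_toy_data()
print(toy.head())
print(toy["y"].value_counts())
```

Assigning the result to data (instead of calling pd.read_csv) should let the rest of the program run unchanged, although the fitted weights will of course differ from those obtained on the real file.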

After loading the data, build the feature matrix and the label vector:

X = np.array(data[['x1','x2']])
y = np.array(data['y'])

Use LogisticRegression from sklearn to train on the data:

from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X,y)

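A few notes on the attributes of the fitted LogisticRegression estimator:

coef_: the coefficients of the features in the decision function

intercept_: the intercept of the decision function

For the full description, see the official documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html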

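A brief introduction to how logistic regression works

The trained model is a set of weights $w_0, w_1, \dots, w_n$. When the data for a test sample is fed in, the weights and the sample's features are combined as a linear sum:

$$z = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$

Here $x_1, x_2, \dots, x_n$ are the $n$ features of each sample. The result is then passed through the sigmoid function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Because the sigmoid function has domain $(-\infty, +\infty)$ and range $(0, 1)$, the most basic logistic regression classifier is suited to separating two classes.

So the key problem in logistic regression is how to find this set of weights. This is done by maximum likelihood estimation.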
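
To make the maximum likelihood step concrete, here is a minimal NumPy sketch that fits the weights by plain gradient ascent on the log-likelihood. The toy data, the learning rate, and the function names are assumptions chosen for illustration; this is not how sklearn fits the model internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """Log-likelihood of labels y under weights w; X has a leading column of ones."""
    p = sigmoid(X @ w)
    eps = 1e-12  # guard against log(0)
    return np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def fit_by_gradient_ascent(X, y, lr=0.1, n_iter=2000):
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Gradient of the log-likelihood with respect to w.
        gradient = X.T @ (y - sigmoid(X @ w))
        w += lr * gradient / len(y)
    return w

# Hypothetical, non-separable toy data: one feature plus the bias column.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
y = np.array([0, 0, 1, 0, 1, 0, 1, 1])
X = np.column_stack([np.ones_like(x), x])

w_start = np.zeros(2)
w_fit = fit_by_gradient_ascent(X, y)
print("log-likelihood at w = 0:", log_likelihood(w_start, X, y))
print("log-likelihood after fitting:", log_likelihood(w_fit, X, y))
```

Note that sklearn's LogisticRegression adds an L2 penalty by default (C=1.0) and uses a different solver, so its weights on the same data will generally differ somewhat from this unpenalized sketch.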

After the code above has run, classifier.coef_ and classifier.intercept_ are available, and you can print them to inspect the values:

print(classifier.coef_)       # feature coefficients of the decision function: w1, w2, ..., wn
print(classifier.intercept_)  # intercept of the decision function: w0

Because this data set has two feature columns, classifier.coef_ is an array of shape (1, 2) and classifier.intercept_ is an array of shape (1,).
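
As a sanity check on what these two attributes mean, the sketch below unpacks them into scalars and recomputes the model's score and probability by hand. The training data here is randomly generated and purely hypothetical; only the sklearn calls themselves come from the article's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical two-feature data standing in for the article's data.csv.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.3, 0.1, size=(40, 2)),
               rng.normal(0.7, 0.1, size=(40, 2))])
y = np.repeat([0, 1], 40)

classifier = LogisticRegression()
classifier.fit(X, y)

w1, w2 = classifier.coef_[0]   # coef_ has shape (1, 2)
w0 = classifier.intercept_[0]  # intercept_ has shape (1,)

# Recompute the linear score and the class-1 probability by hand.
z = w0 + w1 * X[:, 0] + w2 * X[:, 1]
p = 1.0 / (1.0 + np.exp(-z))

# Both comparisons should agree with what the estimator reports.
print(np.allclose(z, classifier.decision_function(X)))
print(np.allclose(p, classifier.predict_proba(X)[:, 1]))
```

A sample is assigned to class 1 exactly when z is positive, which is why the line z = 0 serves as the decision boundary.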

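With only two features, the decision boundary of logistic regression is easy to draw from a formula. The boundary is where the predicted probability equals 0.5, that is, where the linear sum is zero:

$$w_0 + w_1 x_1 + w_2 x_2 = 0 \quad\Longrightarrow\quad x_2 = \frac{-w_1 x_1 - w_0}{w_2}$$

This gives the function for drawing the boundary line: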

def x2(x1):
    # Print the fitted parameters and their shapes for inspection
    print('coef_:',classifier.coef_ ,classifier.coef_[0][0])
    print("intercept_:",classifier.intercept_)
    print("coef_ shape:",classifier.coef_.shape)
    print("intercept_shape:",classifier.intercept_.shape)
    print("classes_:",classifier.classes_)
    # Solve w0 + w1*x1 + w2*x2 = 0 for x2
    return (-classifier.coef_[0][0] * x1 - classifier.intercept_[0]) / classifier.coef_[0][1]
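
The same formula can also be written as a standalone function that takes the weights as arguments, which makes it easy to test without a fitted model. This is only a sketch: the name boundary_x2 and the example weights are hypothetical, and the guard for w2 = 0 is an addition that the original function does not have.

```python
import numpy as np

def boundary_x2(x1, w0, w1, w2):
    """Return x2 on the line w0 + w1*x1 + w2*x2 = 0 (requires w2 != 0)."""
    if np.isclose(w2, 0.0):
        raise ValueError("w2 is zero: the boundary is vertical, x1 = -w0 / w1")
    return -(w1 * np.asarray(x1) + w0) / w2

# Hypothetical weights, chosen only for illustration.
w0, w1, w2 = -3.0, 2.0, 4.0
x1 = np.linspace(0.0, 1.0, 5)
x2 = boundary_x2(x1, w0, w1, w2)

# Every point on the line has a linear score of 0, so its probability is 0.5.
z = w0 + w1 * x1 + w2 * x2
print(1.0 / (1.0 + np.exp(-z)))
```

With a fitted model, the call would be boundary_x2(x1_plot, classifier.intercept_[0], classifier.coef_[0][0], classifier.coef_[0][1]).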

Finally, show the data points as a scatter plot and draw the boundary line:

X1 = data[data['y'] == 0]
X2 = data[data['y'] == 1]

plt.scatter(X2['x1'],X2['x2'],c='blue')
plt.scatter(X1['x1'],X1['x2'],c='red')

x1_plot = np.linspace(0, 1,100)
print(x1_plot)
x2_plot = x2(x1_plot)
plt.plot(x1_plot, x2_plot)

plt.show()

Running the program produces a scatter plot of the two classes with the boundary line drawn across it.
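
As an alternative to drawing the line analytically, the boundary can be visualized by evaluating the model on a grid of points and shading each side. The sketch below only computes the grid with NumPy; the weights and the plotting window are hypothetical, and the matplotlib calls are left as comments.

```python
import numpy as np

def predict_on_grid(w0, w1, w2, x1_range, x2_range, steps=200):
    """Evaluate the class-1 probability on a regular grid over the two features."""
    g1, g2 = np.meshgrid(np.linspace(*x1_range, steps),
                         np.linspace(*x2_range, steps))
    prob = 1.0 / (1.0 + np.exp(-(w0 + w1 * g1 + w2 * g2)))
    return g1, g2, prob

# Hypothetical weights and plotting window.
g1, g2, prob = predict_on_grid(-3.0, 2.0, 4.0, (0.0, 1.0), (0.0, 1.0))
labels = (prob >= 0.5).astype(int)
print(labels.mean())  # fraction of the window assigned to class 1

# With matplotlib available, the regions and the 0.5 contour could be drawn as:
#   plt.contourf(g1, g2, prob, levels=[0.0, 0.5, 1.0], alpha=0.2)
#   plt.contour(g1, g2, prob, levels=[0.5])
```

This approach costs more computation than the closed-form line, but it does not depend on solving for x2, so it still works when w2 is zero.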

References:

https://blog.csdn.net/ariessurfer/article/details/41310525

http://www.imooc.com/article/36571
