HanLP: Perceptron

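Perceptron

The perceptron is a linear model that assigns an input instance to one of two classes based on its feature vector x:

\[f(x)=sign(w\cdot x+b) \]

The perceptron model corresponds to a separating hyperplane $ w\cdot x+b=0 $ in the input space (feature space), where w is the normal vector of the hyperplane and b is its intercept.

The perceptron is therefore a linear classification model, and it is a discriminative model.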
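The decision function above can be sketched in a few lines of plain Python. The weights and the two test points are hypothetical values chosen for illustration, and the convention sign(0) = +1 is an assumption (NumPy's `np.sign` returns 0 in that case).

```python
def sign(z):
    """Return +1 for z >= 0 and -1 otherwise (convention assumed here)."""
    return 1 if z >= 0 else -1


def predict(w, b, x):
    """Evaluate f(x) = sign(w . x + b) for a single feature vector x."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sign(activation)


# Hypothetical hyperplane x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0
print(predict(w, b, [3.0, 3.0]))  # point on the positive side
print(predict(w, b, [1.0, 1.0]))  # point on the negative side
```

The side of the hyperplane a point falls on is decided only by the sign of w·x+b, so rescaling w and b by the same positive factor leaves every prediction unchanged.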

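Assumption of perceptron learning

An important prerequisite of perceptron learning is that the training dataset is linearly separable.

Perceptron learning strategy

The learning strategy of the perceptron is to minimize a loss function.

A natural choice of loss function is the total number of misclassified points. However, that loss is not a continuously differentiable function of the parameters w, b and is hard to optimize. The usual choice is therefore based on the total distance from the misclassified points to the hyperplane S, $ -\frac{1}{\|w\|}\sum_{x_i\in M}y_i(w\cdot x_i+b) $; dropping the factor $ \frac{1}{\|w\|} $ gives the loss function:

\[L(w,b)=-\sum_{x_i\in M}y_i(w\cdot x_i+b) \]

The learning strategy is to find the w and b that minimize L(w,b), where M is the set of misclassified points.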
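As a rough illustration of this loss, the sketch below evaluates L(w,b) on a three-point toy dataset. The data and the two candidate hyperplanes are assumptions made for the example; a point counts as misclassified when $ y_i(w\cdot x_i+b)\leq 0 $.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def perceptron_loss(w, b, X, y):
    """L(w, b) = -sum over misclassified points of y_i * (w . x_i + b)."""
    loss = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * (dot(w, x_i) + b)
        if margin <= 0:  # x_i belongs to the misclassified set M
            loss += -margin
    return loss


# Toy data (assumed for illustration): two positive points, one negative
X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]

print(perceptron_loss([1.0, 1.0], -3.0, X, y))  # hyperplane that separates the data
print(perceptron_loss([1.0, 1.0], 0.0, X, y))   # hyperplane that misclassifies (1, 1)
```

The loss is zero exactly when M is empty, and otherwise it grows with how far the misclassified points sit on the wrong side.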

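Perceptron learning algorithm

The perceptron learning algorithm optimizes the loss function by stochastic gradient descent. It has a primal form and a dual form, and it is simple and easy to implement.

Primal form

\[\min_{w,b}L(w,b)=-\sum_{x_i\in M}y_i(w\cdot x_i+b) \]

First, choose an arbitrary hyperplane $ w_0, b_0 $, then repeatedly minimize the objective function by gradient descent. The minimization does not descend along the gradient of all misclassified points in M at once; instead, each step randomly picks one misclassified point and descends along its gradient.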

\[\nabla_wL(w,b)=-\sum_{x_i\in M}y_ix_i \]

\[\nabla_bL(w,b)=-\sum_{x_i\in M}y_i \]

Randomly pick a misclassified point $ (x_i,y_i) $ and update w, b:

\[w\leftarrow w+\eta y_ix_i \]

\[b\leftarrow b+\eta y_i \]

where $ \eta\ (0<\eta\leq1) $ is the learning rate.
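The update rule can be traced on a tiny example. The sketch below is a minimal plain-Python version under a few assumptions: the three-point dataset is illustrative, the initial values are w = 0 and b = 0, η = 1, and the first misclassified point in index order is chosen instead of a random one so that the run is reproducible. `max_updates` is a hypothetical safety cap, not part of the algorithm.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def train_primal(X, y, eta=1.0, max_updates=1000):
    """Primal-form perceptron; returns (w, b, number_of_updates)."""
    w = [0.0] * len(X[0])
    b = 0.0
    updates = 0
    while updates < max_updates:
        # Find the first misclassified point, if any
        wrong = next((i for i in range(len(X))
                      if y[i] * (dot(w, X[i]) + b) <= 0), None)
        if wrong is None:
            break  # every point is on the correct side
        # w <- w + eta * y_i * x_i,  b <- b + eta * y_i
        w = [wj + eta * y[wrong] * xj for wj, xj in zip(w, X[wrong])]
        b += eta * y[wrong]
        updates += 1
    return w, b, updates


X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]
w, b, updates = train_primal(X, y)
print(w, b, updates)
```

With a random choice of misclassified point, as the text describes, the number of updates and the final hyperplane may differ from run to run.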

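Dual form

The basic idea of the dual form is to express w and b as linear combinations of the instances $ x_i $ and labels $ y_i $, and to obtain w and b by solving for the coefficients. Assume, without loss of generality, that the initial values $ w_0, b_0 $ are both 0. For a misclassified point $ (x_i,y_i) $, the update is: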

\[w\leftarrow w+\eta y_ix_i \]

\[b\leftarrow b+\eta y_i \]

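w and b are modified step by step. Suppose the point $ (x_i,y_i) $ has triggered $ n_i $ updates; then the increments of w and b contributed by $ (x_i,y_i) $ are $ \alpha_iy_ix_i $ and $ \alpha_iy_i $ respectively, where $ \alpha_i=n_i\eta $. The finally learned w, b can then be written as: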

\[w=\sum_{i=1}^{N}\alpha_iy_ix_i \]

\[b=\sum_{i=1}^{N}\alpha_iy_i \]

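Here $ \alpha_i\geq0, i=1,2,...,N $. When $ \eta=1 $, $ \alpha_i $ is the number of times the i-th instance was used for an update because it was misclassified. The more updates an instance triggers, the closer it lies to the separating hyperplane and the harder it is to classify correctly; such points have the greatest influence on the learned result.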
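To make the two sums concrete, the following sketch recovers w and b from a coefficient vector α. The dataset and the values of α are hypothetical (they correspond to update counts with η = 1), and `recover_primal` is a helper name introduced only for this example.

```python
def recover_primal(alpha, X, y):
    """Compute w = sum_i alpha_i * y_i * x_i and b = sum_i alpha_i * y_i."""
    n, dim = len(X), len(X[0])
    w = [sum(alpha[i] * y[i] * X[i][j] for i in range(n)) for j in range(dim)]
    b = sum(alpha[i] * y[i] for i in range(n))
    return w, b


X = [[3, 3], [4, 3], [1, 1]]
y = [1, 1, -1]
alpha = [2, 0, 5]  # hypothetical update counts: x_1 twice, x_2 never, x_3 five times

w, b = recover_primal(alpha, X, y)
print(w, b)
```

A point with $ \alpha_i=0 $ never triggered an update and contributes nothing to w or b.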

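The dual form of the perceptron model is

\[f(x)=sign(\sum_{j=1}^{N}\alpha_jy_jx_j\cdot x+b) \]

where $ \alpha=(\alpha_1,\alpha_2,...,\alpha_N)^T $.

For training, initialize $ \alpha \leftarrow 0, b \leftarrow 0 $, then look for a misclassified point in the training set, that is, a point with: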

\[y_i(\sum_{j=1}^{N}\alpha_jy_jx_j\cdot x_i+b)\leq 0 \]

Then update:

\[\alpha_i \leftarrow \alpha_i+\eta \]

\[b\leftarrow b+\eta y_i \]

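Repeat until every point in the training set is classified correctly.

In the dual form, training instances appear only in the form of inner products. For convenience, the inner products between all training instances can be computed in advance and stored as a matrix, the Gram matrix $ G=[x_i\cdot x_j]_{N\times N} $.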
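A minimal plain-Python sketch of the dual procedure with a precomputed Gram matrix follows. The three-point dataset is an assumption for illustration, η = 1, and the first misclassified point in index order is updated so the run is deterministic; `max_updates` is a hypothetical safety cap.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def train_dual(X, y, eta=1.0, max_updates=1000):
    """Dual-form perceptron; returns (alpha, b, gram)."""
    n = len(X)
    # Gram matrix: gram[i][j] = x_i . x_j, computed once up front
    gram = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    b = 0.0
    for _ in range(max_updates):
        wrong = None
        for i in range(n):
            s = sum(alpha[j] * y[j] * gram[j][i] for j in range(n))
            if y[i] * (s + b) <= 0:  # point i is misclassified
                wrong = i
                break
        if wrong is None:
            break  # no misclassified point remains
        alpha[wrong] += eta      # alpha_i <- alpha_i + eta
        b += eta * y[wrong]      # b <- b + eta * y_i
    return alpha, b, gram


X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]
alpha, b, gram = train_dual(X, y)
print(alpha, b)
```

During training only `gram` is consulted; the raw feature vectors are not touched again, which is the practical point of storing the inner products.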

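Summary

When the training dataset is linearly separable, the perceptron learning algorithm converges, and the number of misclassifications k on the training set satisfies the inequality:

\[k\leq \left(\frac{R}{\gamma}\right)^2 \]

Here $ R=\max_{1\leq i\leq N}\|\hat{x}_i\| $ with $ \hat{x}_i=(x_i^T,1)^T $, and $ \gamma>0 $ is the smallest value of $ y_i(\hat{w}_{opt}\cdot\hat{x}_i) $ over the training set for a separating hyperplane $ \hat{w}_{opt}=(w_{opt}^T,b_{opt})^T $ with $ \|\hat{w}_{opt}\|=1 $. For the proof, see Li Hang's "Statistical Learning Methods" (《統計學習方法》) or Hsuan-Tien Lin's "Machine Learning Foundations" (《機器學習基石》).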
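The bound can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the toy dataset, the separating hyperplane $ x^{(1)}+x^{(2)}-3=0 $ used to obtain a valid γ, zero initial values, η = 1, and in-order selection of misclassified points are all choices made here. Any unit-norm separating hyperplane yields a valid, possibly loose, value of γ.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def count_updates(X_hat, y, max_updates=10000):
    """Run the perceptron on augmented vectors from w_hat = 0; return the update count k."""
    w_hat = [0.0] * len(X_hat[0])
    k = 0
    while k < max_updates:
        wrong = next((i for i in range(len(X_hat))
                      if y[i] * dot(w_hat, X_hat[i]) <= 0), None)
        if wrong is None:
            break
        w_hat = [wj + y[wrong] * xj for wj, xj in zip(w_hat, X_hat[wrong])]
        k += 1
    return k


X = [[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]]
y = [1, 1, -1]
X_hat = [x + [1.0] for x in X]  # augmented vectors (x, 1)

# Assumed separating hyperplane (w, b) = (1, 1, -3), scaled to unit norm
w_sep = [1.0, 1.0, -3.0]
norm = math.sqrt(dot(w_sep, w_sep))
w_unit = [c / norm for c in w_sep]

R = max(math.sqrt(dot(xh, xh)) for xh in X_hat)
gamma = min(yi * dot(w_unit, xh) for xh, yi in zip(X_hat, y))
bound = (R / gamma) ** 2
k = count_updates(X_hat, y)
print(k, bound)
```

The bound is typically far from tight; it guarantees termination rather than predicting the actual number of updates.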

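When the training dataset is linearly separable, the perceptron learning algorithm has infinitely many solutions. The solution may differ depending on the initial values or the order of iteration; in other words, there are many separating hyperplanes that split the dataset correctly.
The perceptron learning algorithm is simple and easy to solve, but the plain perceptron cannot handle linearly inseparable problems such as XOR.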
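The XOR limitation can be observed directly. In this sketch the same training loop is run on the AND labels (linearly separable) and the XOR labels (not separable) of the four binary input points. The cap `max_updates` is a hypothetical parameter added so that the non-separable run stops; hitting the cap is evidence of, not a proof of, non-separability.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def first_misclassified(w, b, X, y):
    """Index of the first point with y_i * (w . x_i + b) <= 0, or None."""
    for i, (x_i, y_i) in enumerate(zip(X, y)):
        if y_i * (dot(w, x_i) + b) <= 0:
            return i
    return None


def train_primal(X, y, eta=1.0, max_updates=1000):
    """Primal perceptron with an update cap; returns (w, b, converged)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(max_updates):
        i = first_misclassified(w, b, X, y)
        if i is None:
            break
        w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
        b += eta * y[i]
    converged = first_misclassified(w, b, X, y) is None
    return w, b, converged


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_and = [-1, -1, -1, 1]   # AND: linearly separable
y_xor = [-1, 1, 1, -1]    # XOR: not linearly separable

_, _, and_ok = train_primal(points, y_and)
_, _, xor_ok = train_primal(points, y_xor)
print("AND converged:", and_ok)
print("XOR converged:", xor_ok)
```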

Importing packages and creating the dataset

To create a dataset quickly and conveniently, this example uses make_blobs from scikit-learn.

import numpy as np
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Create a dataset: X has two features, y takes values in {-1, 1}
X, y = make_blobs(n_samples=500, centers=2, random_state=6)
y[y==0] = -1
plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
plt.xlabel("feature_1")
plt.ylabel("feature_2")
plt.show()

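Perceptron (primal form)

Define a class for the primal form of the perceptron, train it on the training set, and run a simple check on the test set.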

import numpy as np
from sklearn.datasets import make_blobs  # make_blobs from scikit-learn creates a dataset quickly
import matplotlib.pyplot as plt

# Create a dataset: X has two features, y takes values in {-1, 1}
X, y = make_blobs(n_samples=500, centers=2, random_state=6)
y[y == 0] = -1
# plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap=plt.cm.Paired)
# plt.xlabel("feature_1")
# plt.ylabel("feature_2")
# plt.show()


class PerceptronRaw():
    def __init__(self):
        self.W = None
        self.bias = None

    def fit(self, x_train, y_train, learning_rate=0.05, n_iters=100, plot_train=True):
        # Note: n_iters is accepted but not used; training runs until no point is misclassified
        print("Training started...")
        num_samples, num_features = x_train.shape
        self.W = np.random.randn(num_features)
        self.bias = 0

        while True:
            error_examples = []
            error_examples_y = []
            # Collect the misclassified sample points
            for idx in range(num_samples):
                example = x_train[idx]
                y_idx = y_train[idx]
                # Functional margin y * (w . x + b); a value <= 0 means misclassified
                distance = y_idx * (np.dot(example, self.W) + self.bias)
                if distance <= 0:
                    error_examples.append(example)
                    error_examples_y.append(y_idx)
            if len(error_examples) == 0:
                break
            else:
                print("Updating parameters w => %s b => %s" % (self.W, self.bias))
                # Randomly pick one misclassified point and update the parameters
                random_idx = np.random.randint(0, len(error_examples))
                chosen_example = error_examples[random_idx]
                chosen_example_y = error_examples_y[random_idx]
                self.W = self.W + learning_rate * chosen_example_y * chosen_example
                self.bias = self.bias + learning_rate * chosen_example_y
        print("Training finished")

        # Plot the training result
        if plot_train is True:
            x_hyperplane = np.linspace(2, 10, 8)
            slope = -self.W[0] / self.W[1]
            intercept = -self.bias / self.W[1]
            y_hyperplane = slope * x_hyperplane + intercept

            plt.xlabel("feature_1")
            plt.ylabel("feature_2")
            plt.xlim((2, 10))
            plt.ylim((-12, 0))
            plt.title("Dataset and Decision in Training(Raw)")
            plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train, s=30, cmap=plt.cm.Paired)
            plt.plot(x_hyperplane, y_hyperplane, color='g', label='Decision_Raw')
            plt.legend(loc='upper left')
            plt.show()

    def predict(self, x):
        if self.W is None or self.bias is None:
            raise RuntimeError("Model has not been trained")
        y_predict = np.sign(np.dot(x, self.W) + self.bias)
        return y_predict


X_train = X[0:450]
y_train = y[0:450]
X_test = X[450:500]
y_test = y[450:500]

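# Instantiate the model and train it
model_raw = PerceptronRaw()
model_raw.fit(X_train, y_train)


# Test. The test set and training set come from the same linearly separable distribution, so the test accuracy here reaches 1.0
y_predict = model_raw.predict(X_test)

accuracy = np.sum(y_predict == y_test) / y_predict.shape[0]
print("Accuracy of the primal-form model on the test set: {0}".format(accuracy))
# Accuracy of the primal-form model on the test set: 1.0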

Perceptron (dual form)

Define a class for the dual form of the perceptron, train it on the training set, and run a simple check on the test set.

import numpy as np
from sklearn.datasets import make_blobs  # make_blobs from scikit-learn creates a dataset quickly
import matplotlib.pyplot as plt

# Create a dataset: X has two features, y takes values in {-1, 1}
X, y = make_blobs(n_samples=500, centers=2, random_state=6)
y[y == 0] = -1

# plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap=plt.cm.Paired)
# plt.xlabel("feature_1")
# plt.ylabel("feature_2")
# plt.show()

class PerceptronDuality():
    def __init__(self):
        self.alpha = None
        self.bias = None
        self.W = None

    def fit(self, x_train, y_train, learning_rate=1, n_iters=100, plot_train=True):
        # Note: n_iters is accepted but not used; training runs until no point is misclassified
        print("Training started...")
        num_samples, num_features = x_train.shape
        self.alpha = np.zeros((num_samples,))
        self.bias = 0

        # Compute the Gram matrix
        gram = np.dot(x_train, x_train.T)

        while True:
            error_count = 0
            for idx in range(num_samples):
                inner_product = gram[idx]
                y_idx = y_train[idx]
                # Functional margin of point idx under the current alpha and bias
                distance = y_idx * (np.sum(self.alpha * y_train * inner_product) + self.bias)
                # If the point is misclassified, update alpha and bias, leave the for loop, and rescan the data from the start
                if distance <= 0:
                    error_count += 1
                    self.alpha[idx] = self.alpha[idx] + learning_rate
                    self.bias = self.bias + learning_rate * y_idx
                    break
            # No misclassified point remains: leave the while loop
            if error_count == 0:
                break
        # Recover w = sum_i alpha_i * y_i * x_i for plotting and prediction
        self.W = np.sum(self.alpha * y_train * x_train.T, axis=1)
        print("Training finished")
        
        # Plot the training result
        if plot_train is True:
            x_hyperplane = np.linspace(2, 10, 8)
            slope = -self.W[0] / self.W[1]
            intercept = -self.bias / self.W[1]
            y_hyperplane = slope * x_hyperplane + intercept

            plt.xlabel("feature_1")
            plt.ylabel("feature_2")
            plt.xlim((2, 10))
            plt.ylim((-12, 0))
            plt.title("Dataset and Decision in Training(Duality)")
            plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train, s=30, cmap=plt.cm.Paired)
            plt.plot(x_hyperplane, y_hyperplane, color='g', label='Decision_Duality')
            plt.legend(loc='upper left')
            plt.show()
            
    def predict(self, x):
        if self.alpha is None or self.bias is None:
            raise RuntimeError("Model has not been trained")
        y_predicted = np.sign(np.dot(x, self.W) + self.bias)
        return y_predicted

X_train = X[0:450]
y_train = y[0:450]
X_test = X[450:500]
y_test = y[450:500]

# Train
model_duality = PerceptronDuality()
model_duality.fit(X_train, y_train)

# Test
y_predict_duality = model_duality.predict(X_test)
accuracy_duality = np.sum(y_predict_duality == y_test) / y_test.shape[0]

print("Accuracy of the dual-form model on the test set: {0}".format(accuracy_duality))
# Accuracy of the dual-form model on the test set: 1.0

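Comparing the two models

Reading the parameters from the primal-form and dual-form models shows that the two separating hyperplanes differ, yet both classify the data correctly. This confirms the conclusion stated in the summary:

When the training dataset is linearly separable, the perceptron learning algorithm has infinitely many solutions. The solution may differ depending on the initial values or the order of iteration; in other words, there are many separating hyperplanes that split the dataset correctly.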

print("Primal-form model parameters:")
print("W: {0}, bias: {1}".format(model_raw.W, model_raw.bias))
print()
print("Dual-form model parameters:")
print("W: {0}, bias: {1}".format(model_duality.W, model_duality.bias))

Primal-form model parameters:
W: [-1.07796999 -3.05384787], bias: -11.700000000000031

Dual-form model parameters:
W: [-25.35285228 -70.71533848], bias: -268
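The two parameter vectors differ mostly in scale, which makes them hard to compare by eye. One possible way to compare them, sketched here, is to rewrite each hyperplane $ w_1x_1+w_2x_2+b=0 $ in slope-intercept form, as the plotting code does. The numbers are the values printed above; `line_from_params` is a helper name introduced only for this example.

```python
def line_from_params(w, b):
    """Return (slope, intercept) of the line w[0]*x1 + w[1]*x2 + b = 0."""
    return -w[0] / w[1], -b / w[1]


# Parameters copied from the output printed above
raw = line_from_params([-1.07796999, -3.05384787], -11.700000000000031)
dual = line_from_params([-25.35285228, -70.71533848], -268)

print("primal: slope=%.3f intercept=%.3f" % raw)
print("dual:   slope=%.3f intercept=%.3f" % dual)
```

In this form the two lines turn out to be close to each other but not identical, which is consistent with both models separating the same data.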

Source code: https://gitee.com/VipSoft/VipPython/tree/master/perceptron
