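Introduction
A while ago I was watching Stanford's CS231n computer vision course on Bilibili. It is not an advanced course but a foundational CV course; if you want to get started in computer vision, you can search for it on Bilibili. The course description recommends CS229 and CS131 as prerequisites, but in my experience skipping them makes little difference, because CS231n itself stays quite basic. (I have taken CS229, Andrew Ng's machine learning course, but not CS131, and I did not feel that the missing background mattered much.) Basic as the course is, its three major assignments are very well regarded. The only reason I spend an extra hour or so every day going through this introductory course again is to work through its official assignments. If you are interested, you can do CS231n assignment 1 along with me. The assignment requirements and some guidance are on GitHub. We will start with assignment 1; I have currently reached assignment 2, and I will post all the code for the full set of three assignments on this blog. Let's begin.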
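The k-Nearest Neighbors (KNN) Algorithm
The nearest neighbor algorithm is easy to understand: a sample point in the space is assigned the class that appears most often among the points around it.
For example, take a sample point A and pick the 5 samples closest to it, B, C, D, E, F, with classes 1, 1, 2, 3, 3 respectively. Among these 5 points, classes 1 and 3 each appear twice, so the vote is tied; breaking the tie toward the smaller label, as the assignment code below does, we predict that A belongs to class 1. It is that simple.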
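The vote in this example can be checked with a few lines of numpy. This is only a small sketch of the counting step, using the five neighbor labels from the example; the variable names are made up for illustration.

```python
import numpy as np

# Labels of the 5 nearest neighbors B, C, D, E, F from the example.
neighbor_labels = np.array([1, 1, 2, 3, 3])

# np.bincount counts how often each non-negative integer label occurs:
# the index is the label, the value is its number of votes.
votes = np.bincount(neighbor_labels)

# np.argmax returns the first index holding the maximum, so the tie between
# labels 1 and 3 (two votes each) resolves to the smaller label.
predicted = int(np.argmax(votes))
print(votes.tolist(), predicted)
```

Because the first maximum wins, the "choose the smaller label on a tie" rule comes for free with this combination of bincount and argmax.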
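The parameters here are clear at a glance: 1. the value of k, i.e. how many nearest points to pick; 2. the distance formula used to measure closeness. We can choose the Euclidean distance, i.e. the L2 distance, or the Manhattan distance, i.e. the L1 distance, which for two points i=(X1,Y1) and j=(X2,Y2) is d(i,j)=|X1-X2|+|Y1-Y2|.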
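The Euclidean distance is a special case of the more general Lp (Minkowski) distance, which for two-dimensional points i and j is:
d(i,j)=(|X1-X2|^p+|Y1-Y2|^p)^(1/p)
When p=2 this is our Euclidean distance, and when p=1 it reduces to the Manhattan distance.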
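To make the role of p concrete, here is a minimal sketch of the Lp distance for points of any dimension. The helper name minkowski_distance and the two sample points are hypothetical and are not part of the assignment code.

```python
import numpy as np

def minkowski_distance(a, b, p=2):
    """Lp distance between two points (hypothetical helper for illustration)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Sum the p-th powers of the coordinate differences, then take the p-th root.
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

point_i = [0.0, 0.0]
point_j = [3.0, 4.0]

l1 = minkowski_distance(point_i, point_j, p=1)  # Manhattan: |3| + |4|
l2 = minkowski_distance(point_i, point_j, p=2)  # Euclidean: sqrt(9 + 16)
print(l1, l2)
```

For these two points the Manhattan distance is 7 and the Euclidean distance is 5, which is the familiar 3-4-5 triangle.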
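Now that we have settled the tunable parameters and the core algorithm (pick the most frequent class among the k nearest points), let's look at the KNN code for assignment 1.
Code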
import numpy as np


class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this is just
        memorizing the training data.

        Inputs:
        - X: A numpy array of shape (num_train, D) containing the training data
          consisting of num_train samples each of dimension D.
        - y: A numpy array of shape (N,) containing the training labels, where
          y[i] is the label for X[i].
        """
        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=0):
        """
        Predict labels for test data using this classifier.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
          of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        # num_loops selects which distance implementation to use
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)

        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        # Distance matrix of shape (num_test, num_train)
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                #####################################################################
                # TODO:                                                             #
                # Compute the l2 distance between the ith test point and the jth    #
                # training point, and store the result in dists[i, j]. You should   #
                # not use a loop over dimension.                                    #
                #####################################################################
                dists[i, j] = np.linalg.norm(X[i] - self.X_train[j], ord=2)
                #####################################################################
                #                       END OF YOUR CODE                            #
                #####################################################################
        return dists

    def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            # Compute the l2 distance between the ith test point and all training
            # points, and store the result in dists[i, :].
            # Partially vectorized: one full row of distances per iteration
            dists[i] = np.linalg.norm(X[i] - self.X_train, ord=2, axis=1)
        return dists

    def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        #########################################################################
        # TODO:                                                                 #
        # Compute the l2 distance between all test points and all training      #
        # points without using any explicit loops, and store the result in      #
        # dists.                                                                #
        #                                                                       #
        # You should implement this function using only basic array operations; #
        # in particular you should not use functions from scipy.                #
        #                                                                       #
        # HINT: Try to formulate the l2 distance using matrix multiplication    #
        #       and two broadcast sums.                                         #
        #########################################################################
        # Add the squared norms via broadcasting.
        # dists has num_test rows and num_train columns, so the squared norms
        # of X_train are reshaped to (1, num_train) and the squared norms of
        # the test data X are reshaped to (num_test, 1).
        dists += np.sum(self.X_train ** 2, axis=1).reshape(1, num_train)
        dists += np.sum(X ** 2, axis=1).reshape(num_test, 1)  # reshape for broadcasting
        dists -= 2 * np.dot(X, self.X_train.T)
        dists = np.sqrt(dists)
        #########################################################################
        #                         END OF YOUR CODE                              #
        #########################################################################
        return dists

    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in range(num_test):
            # A list of length k storing the labels of the k nearest neighbors to
            # the ith test point.
            closest_y = []
            #########################################################################
            # TODO:                                                                 #
            # Use the distance matrix to find the k nearest neighbors of the ith    #
            # testing point, and use self.y_train to find the labels of these       #
            # neighbors. Store these labels in closest_y.                           #
            # Hint: Look up the function numpy.argsort.                             #
            #########################################################################
            # Sort the distances in ascending order and take the labels at the
            # first k indices
            closest_y = self.y_train[np.argsort(dists[i])[:k]]
            #########################################################################
            # TODO:                                                                 #
            # Now that you have found the labels of the k nearest neighbors, you    #
            # need to find the most common label in the list closest_y of labels.   #
            # Store this label in y_pred[i]. Break ties by choosing the smaller     #
            # label.                                                                #
            #########################################################################
            # y_pred[i] is the label that occurs most often in closest_y,
            # i.e. the class we are looking for
            y_pred[i] = np.argmax(np.bincount(closest_y))
            #########################################################################
            #                           END OF YOUR CODE                            #
            #########################################################################
        return y_pred
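The core of this code is the three distance functions. All of them compute the Euclidean distance, but they differ in how much of the work is vectorized. compute_distances_two_loops computes the distance between one test vector and one training vector at a time. compute_distances_one_loop takes each test sample in turn and computes its distances to all training vectors at once. compute_distances_no_loops fills the entire two-dimensional matrix of distances between every test sample and every training sample in one go, using the expansion |x-y|^2 = |x|^2 + |y|^2 - 2x·y.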
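A quick way to convince yourself that the three approaches are equivalent is to run them on small random data and compare the results. The sketch below rewrites them as standalone functions (the function names and the random test data are assumptions made for illustration, not part of the assignment). The no-loop version also clamps tiny negative values before the square root, an extra safeguard against floating-point round-off that the code above does not include.

```python
import numpy as np

def dists_two_loops(X_test, X_train):
    # One (test, train) pair at a time.
    dists = np.zeros((X_test.shape[0], X_train.shape[0]))
    for i in range(X_test.shape[0]):
        for j in range(X_train.shape[0]):
            dists[i, j] = np.sqrt(np.sum((X_test[i] - X_train[j]) ** 2))
    return dists

def dists_one_loop(X_test, X_train):
    # One test point against all training points at a time.
    dists = np.zeros((X_test.shape[0], X_train.shape[0]))
    for i in range(X_test.shape[0]):
        dists[i] = np.sqrt(np.sum((X_train - X_test[i]) ** 2, axis=1))
    return dists

def dists_no_loops(X_test, X_train):
    # |x - y|^2 = |x|^2 + |y|^2 - 2 x.y, computed for all pairs at once.
    test_sq = np.sum(X_test ** 2, axis=1).reshape(-1, 1)    # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1).reshape(1, -1)  # (1, num_train)
    squared = test_sq + train_sq - 2.0 * (X_test @ X_train.T)
    # Clamp round-off negatives so np.sqrt never sees a value below zero.
    return np.sqrt(np.maximum(squared, 0.0))

rng = np.random.default_rng(0)
X_train = rng.normal(size=(6, 4))  # 6 training points, 4 dimensions
X_test = rng.normal(size=(3, 4))   # 3 test points

d2 = dists_two_loops(X_test, X_train)
d1 = dists_one_loop(X_test, X_train)
d0 = dists_no_loops(X_test, X_train)
agree = bool(np.allclose(d2, d1) and np.allclose(d2, d0))
print(d0.shape, agree)
```

The fully vectorized version trades a little numerical precision for speed, which is why the comparison uses np.allclose rather than exact equality.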
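We then sort the distances in ascending order, take the k smallest, and pick the class label that occurs most often among them. (y_train stores the correct label of each training sample.)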
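The sort-then-vote step can be tried on a tiny hand-made distance matrix. This is a sketch under assumptions: the function predict_from_dists and the toy numbers are invented for illustration, and unlike the class above it returns integer labels rather than floats.

```python
import numpy as np

def predict_from_dists(dists, y_train, k=1):
    """Predict one label per row of dists (hypothetical standalone helper)."""
    num_test = dists.shape[0]
    y_pred = np.zeros(num_test, dtype=y_train.dtype)
    for i in range(num_test):
        # Indices of the k training points closest to test point i.
        nearest = np.argsort(dists[i])[:k]
        # Majority vote over their labels; ties go to the smaller label.
        y_pred[i] = np.argmax(np.bincount(y_train[nearest]))
    return y_pred

# Labels of 5 training points.
y_train = np.array([0, 0, 1, 1, 2])

# Distances from 2 test points (rows) to the 5 training points (columns).
dists = np.array([
    [0.1, 0.9, 0.3, 0.4, 0.8],
    [0.9, 0.8, 0.2, 0.1, 0.3],
])

labels_k1 = predict_from_dists(dists, y_train, k=1)
labels_k3 = predict_from_dists(dists, y_train, k=3)
print(labels_k1.tolist(), labels_k3.tolist())
```

For the first test point the single nearest neighbor has label 0, but two of its three nearest neighbors have label 1, so the prediction changes with k. This is why k is worth tuning rather than fixing in advance.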
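The English hints here all come from the original assignment, and they are very helpful for understanding the algorithm and filling in the code. After finishing the assignment, I recommend implementing it again on your own so that the details beyond the core algorithm are clear as well.
Summary
The KNN algorithm is simple, so implementing it is easy. Tomorrow we will cover linear_classifier, the linear classifier, a simple and fundamental classification model.
Note: this series of posts differs from my articles on TensorFlow or PyTorch implementations. Models implemented with PyTorch rely on the framework, whereas here we build everything from the ground up, reinventing the wheel; as you can see, the only library we import is numpy. Both approaches are valid, but implementing things from scratch gives a better understanding of the algorithm.
In the meantime, you can take a look at "Linear Regression in PyTorch".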