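Introduction
Logistic regression (LR) is one of the most basic and most widely used models in machine learning.
LR in one sentence:
Logistic regression assumes the data follow a Bernoulli distribution and fits its parameters by maximizing the likelihood function with gradient descent, in order to classify the data into two classes.
For the full derivation of LR, see Li Hang's book "Statistical Learning Methods".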
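The one-sentence description above can be written out as a short sketch. The symbols here ($\theta$ for the parameters, $h_\theta$ for the predicted probability, $m$ for the number of samples, $\alpha$ for the learning rate) are notation chosen for this illustration, not taken from the referenced book:

```latex
\begin{aligned}
h_\theta(x) &= P(y = 1 \mid x; \theta) = \sigma(\theta^\top x) = \frac{1}{1 + e^{-\theta^\top x}} \\
P(y \mid x; \theta) &= h_\theta(x)^{y}\,\bigl(1 - h_\theta(x)\bigr)^{1 - y}, \qquad y \in \{0, 1\} \\
\ell(\theta) &= \sum_{i=1}^{m} \Bigl[\, y_i \log h_\theta(x_i) + (1 - y_i) \log\bigl(1 - h_\theta(x_i)\bigr) \Bigr] \\
J(\theta) &= -\frac{1}{m}\,\ell(\theta), \qquad
\nabla_\theta J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x_i) - y_i\bigr)\, x_i \\
\theta &\leftarrow \theta - \alpha\, \nabla_\theta J(\theta)
\end{aligned}
```

Maximizing the log-likelihood $\ell(\theta)$ is the same as minimizing $J(\theta)$, which is the cross-entropy loss discussed next.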
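Why does LR use maximum likelihood estimation (i.e. cross-entropy loss) rather than MSE as its loss function?
- Reference: https://www.cnblogs.com/smartwhite/p/9109815.html
- For linear regression we choose MSE, because its J(θ) is a convex function of θ.
- For logistic regression, however, the sigmoid nonlinearity makes the MSE loss non-convex in θ, so optimization can get stuck in a local optimum.
- So we take the log of the sigmoid output instead (the negative log-likelihood). Its second derivative with respect to θ is non-negative, which means it is convex, so gradient descent with a suitable learning rate can reach the global minimum.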
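The convexity argument above can be checked numerically in one dimension. The sketch below is only an illustration under simple assumptions: a single sample with x = 1 and y = 1, a scalar θ, and hypothetical helper names (`mse_loss`, `nll_loss`, `min_second_difference`). It evaluates both losses on a grid and looks at the sign of the second finite difference, which approximates the second derivative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_loss(theta, x=1.0, y=1.0):
    # Squared error of the sigmoid output for a single sample (x, y).
    return (sigmoid(theta * x) - y) ** 2


def nll_loss(theta, x=1.0, y=1.0):
    # Negative log-likelihood (cross-entropy) for the same sample.
    p = sigmoid(theta * x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def min_second_difference(loss, lo=-6.0, hi=6.0, h=0.1):
    # Smallest central second difference f(t-h) - 2 f(t) + f(t+h) on a grid.
    # A negative value means the curve bends downward somewhere (not convex).
    n = int(round((hi - lo) / h))
    return min(
        loss(lo + i * h - h) - 2 * loss(lo + i * h) + loss(lo + i * h + h)
        for i in range(n + 1)
    )


mse_second = min_second_difference(mse_loss)
nll_second = min_second_difference(nll_loss)
print("MSE min second difference:", mse_second)  # negative: not convex
print("NLL min second difference:", nll_second)  # positive: convex on this grid
```

On this grid the MSE curve has a region of negative curvature (for strongly negative θ, where the sigmoid saturates), while the log loss stays positively curved everywhere.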
TensorFlow implementation
Use the MNIST dataset to verify that the model works.
import os

# Requires TensorFlow 1.x: tf.placeholder, tf.Session and
# tensorflow.examples.tutorials are not available in TensorFlow 2.x.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

lr = 0.001
n_epoch = 25
batch_size = 64


def LR(mnist):
    x = tf.placeholder(tf.float32, [None, 784])
    # Labels must be float so they can be multiplied with log(pred) in the loss.
    y = tf.placeholder(tf.float32, [None, 10])
    w = tf.Variable(tf.zeros([784, 10]))
    # Pass shape by keyword: the second positional argument of tf.constant is dtype.
    b = tf.Variable(tf.constant(0.1, shape=[10]))
    pred = tf.nn.softmax(tf.matmul(x, w) + b)
    # Cross-entropy: sum over the class axis (axis=1) per sample, then average over the batch.
    cost = tf.reduce_mean(-tf.reduce_sum(tf.multiply(y, tf.log(pred)), axis=1))
    pred_res = tf.argmax(pred, axis=-1)
    result = tf.equal(pred_res, tf.argmax(y, axis=-1))
    acc = tf.reduce_mean(tf.cast(result, tf.float32))
    opt = tf.train.AdamOptimizer(lr).minimize(cost)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(n_epoch):
            loss = 0
            total_batch_num = int(mnist.train.num_examples / batch_size)
            for i in range(total_batch_num):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                _, cost_now = sess.run([opt, cost], feed_dict={x: batch_xs, y: batch_ys})
                loss += cost_now
            # Average the mini-batch losses over the epoch.
            loss /= total_batch_num
            # Evaluate on the full training and test sets every 5 epochs.
            if epoch % 5 == 0:
                feed_train = {x: mnist.train.images, y: mnist.train.labels}
                feeds_test = {x: mnist.test.images, y: mnist.test.labels}
                train_acc = sess.run(acc, feed_train)
                test_acc = sess.run(acc, feeds_test)
                print("epoch:{}, cost = {}, train acc: {}, test acc: {}".format(epoch + 1, loss, train_acc, test_acc))


if __name__ == "__main__":
    mnist = input_data.read_data_sets("/home/syd/syz/my_try/minst_data", one_hot=True)
    train_img = mnist.train.images
    train_label = mnist.train.labels
    print("Training set type:", type(train_img))
    print("Training set shape:", train_img.shape)
    test_img = mnist.test.images
    test_label = mnist.test.labels
    print("Test set type:", type(test_img))
    print("Test set shape:", test_img.shape)
    print(test_label[0])
    LR(mnist)
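The cost in the script above is a softmax cross-entropy. In its usual definition, the term −y·log(pred) is summed over the class axis for each sample and the per-sample values are then averaged over the batch. The following pure-Python sketch illustrates that definition outside TensorFlow; the function names `softmax` and `cross_entropy` are hypothetical helpers written for this example, and the max-subtraction trick is an added numerical-stability measure that the script itself does not use.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_logits, batch_onehot):
    # Per-sample loss: sum over classes of -y * log(p); then mean over the batch.
    per_sample = []
    for logits, onehot in zip(batch_logits, batch_onehot):
        probs = softmax(logits)
        per_sample.append(
            -sum(t * math.log(p) for t, p in zip(onehot, probs) if t > 0)
        )
    return sum(per_sample) / len(per_sample)


# Two samples, ten classes, all logits equal: every class gets probability 0.1.
logits = [[0.0] * 10, [0.0] * 10]
labels = [[1.0] + [0.0] * 9, [0.0] * 9 + [1.0]]
loss = cross_entropy(logits, labels)
print(loss)  # equals log(10) for uniform predictions
```

With uniform predictions over 10 classes the loss is log(10), which is a handy sanity check for the first training step of an untrained classifier.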
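Implementing LR from scratch, without libraries
I ran into this problem in NetEase's online written test during autumn campus recruiting: implement LR by hand, without using any library.
The input data given: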
0.1 10 100 5 10 10
0.105 0.956 0.876 0.133 0.249 0
0.195 0.672 0.193 0.016 0.009 0
0.059 0.282 0.709 0.139 0.478 1
0.303 0.39 0.95 0.912 0.522 1
0.59 0.57 0.141 0.959 0.036 1
0.231 0.355 0.305 0.508 0.625 1
0.896 0.415 0.771 0.197 0.826 0
0.051 0.537 0.442 0.46 0.628 0
0.737 0.583 0.09 0.337 0.774 1
0.062 0.217 0.553 0.868 0.87 0
0.13 0.972 0.845 0.737 0.492
0.016 0.009 0.432 0.41 0.092
0.257 0.327 0.451 0.18 0.62
0.774 0.143 0.879 0.123 0.222
0.885 0.114 0.352 0.484 0.367
0.439 0.227 0.675 0.654 0.323
0.778 0.191 0.633 0.628 0.929
0.958 0.231 0.07 0.739 0.34
0.015 0.115 0.154 0.75 0.649
0.283 0.853 0.752 0.915 0.937
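The first line gives the learning rate, the regularization coefficient, the number of epochs, the feature dimension of the input, the number of training samples, and the number of test samples.
Next come the training samples; the last column of each is the label.
Finally come the test samples, for which we must output the predicted labels.
The example above contains 10 training samples and 10 test samples, and the judge runs several such groups of data. I wrote the code, submitted it, and to my surprise it passed on the first try... but I am not sure whether the code below has any problems.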
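Given the learning rate and regularization coefficient from the header line, one gradient-descent step with an L2 penalty can be sketched as follows. This is the form the implementation in this post follows (no bias term, penalty divided by the batch size); the notation ($w_j$ for the weights, $h_i$ for the predicted probability of sample $i$, $m$ for the number of training samples) is chosen for this illustration:

```latex
\begin{aligned}
h_i &= \sigma\!\Bigl(\sum_{j=1}^{d} w_j\, x_{ij}\Bigr), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\
J(w) &= \frac{1}{m}\Bigl[ -\sum_{i=1}^{m} \bigl( y_i \log h_i + (1 - y_i) \log(1 - h_i) \bigr) + \frac{\lambda}{2} \sum_{j=1}^{d} w_j^{2} \Bigr] \\
w_j &\leftarrow w_j - \frac{\alpha}{m} \Bigl[ \sum_{i=1}^{m} (h_i - y_i)\, x_{ij} + \lambda\, w_j \Bigr], \qquad j = 1, \dots, d
\end{aligned}
```

All $h_i$ are computed from the weights before the step, so every $w_j$ is updated from the same predictions.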
Python implementation
import math


class lr:
    def __init__(self, alpha, lamda, epoch, dim):
        self.alpha = alpha  # learning rate
        self.lamda = lamda  # L2 regularization coefficient
        self.epoch = epoch  # number of full-batch gradient steps
        self.dim = dim      # feature dimension
        self.weights = [1 for _ in range(self.dim)]

    def get_pred(self, a, b):
        # Sigmoid of the dot product of each row of a with the weight vector b.
        result = []
        for line in a:
            cur = 0
            for i in range(len(line)):
                cur += line[i] * b[i]
            cur = 1 / (1 + math.exp(-cur))
            result.append(cur)
        return result

    def grad_descent(self, train_data, pred, train_y):
        bs = len(train_y)
        dim = len(self.weights)
        # Residuals h(x_i) - y_i, computed once from the current predictions.
        hx_y = [pred[i] - train_y[i] for i in range(bs)]
        for j in range(dim):
            cur = 0
            for i in range(bs):
                cur += hx_y[i] * train_data[i][j]
            # L2 penalty term; use the instance attributes rather than globals.
            cur += self.lamda * self.weights[j]
            cur = cur * self.alpha / bs
            self.weights[j] -= cur
        return self.weights

    def train(self, train_data, train_y):
        for epc in range(self.epoch):
            pred = self.get_pred(train_data, self.weights)
            self.weights = self.grad_descent(train_data, pred, train_y)

    def test(self, test_data):
        pred_y = self.get_pred(test_data, self.weights)
        # print(pred_y)  # uncomment to inspect the predicted probabilities
        test_y = []
        for pred in pred_y:
            if pred > 0.5:
                test_y.append(1)
            else:
                test_y.append(0)
        return test_y


# Learning rate, regularization coefficient, epochs, feature dimension,
# number of training samples, number of test samples.
alpha, lamda, epoch, dim, train_bs, test_bs = input().strip().split()
alpha = float(alpha)
lamda = float(lamda)
epoch = int(epoch)
dim = int(dim)
train_bs = int(train_bs)
test_bs = int(test_bs)

train_data = []
train_y = []
test_data = []
for i in range(train_bs):
    line = list(map(float, input().strip().split()))
    train_data.append(line[:dim])  # features
    train_y.append(line[-1])       # label is the last column
for i in range(test_bs):
    test_data.append(list(map(float, input().strip().split())))

my_lr = lr(alpha, lamda, epoch, dim)
my_lr.train(train_data, train_y)
test_y = my_lr.test(test_data)
for pred in test_y:
    print(pred)
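One possible weak spot in the code above is the expression `1/(1+math.exp(-cur))`: `math.exp` raises `OverflowError` once its argument exceeds roughly 709, which can happen if the weights grow and `cur` becomes strongly negative. A minimal sketch of a numerically safer variant follows; the function name `sigmoid` is a hypothetical helper for this example, and the post's original code does not include it.

```python
import math


def sigmoid(z):
    # Only ever exponentiate a non-positive number, so exp() cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


print(sigmoid(0.0))
print(sigmoid(-1000.0))  # the naive formula would raise OverflowError here
print(sigmoid(1000.0))
```

Swapping this in for the inline formula in `get_pred` keeps the predictions the same for moderate inputs while avoiding the overflow for extreme ones.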