An alternative to fminunc in Python

Original

2018-11-20 08:33

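I had some free time recently, so I decided to redo the exercises from Stanford's ML course on Coursera in Python. The goal is twofold: to strengthen my ML fundamentals, and to get familiar with libraries such as numpy, matplotlib, and scipy.

EX2 optimizes theta with MATLAB's fminunc function, and I did not know how to do the same in Python. After some searching, I found a suggestion on Stack Overflow to use the minimize function from scipy instead. When I called it directly with my costfunction and grad, the program raised errors saying that the dimensions (3,) and (100,1) did not match, that the gradient vector was wrong, and so on. After many attempts, I finally found the problem.

First, look at the function's documentation with np.info(minimize). The parameters it accepts include: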

fun : callable
    The objective function to be minimized.

        ``fun(x, *args) -> float``

    where x is an 1-D array with shape (n,) and `args`
    is a tuple of the fixed parameters needed to completely
    specify the function.
x0 : ndarray, shape (n,)
    Initial guess. Array of real elements of size (n,),
    where 'n' is the number of independent variables.
args : tuple, optional
    Extra arguments passed to the objective function and its
    derivatives (`fun`, `jac` and `hess` functions).
method : str or callable, optional
    Type of solver.  Should be one of

        - 'Nelder-Mead' :ref:`(see here) <optimize.minimize-neldermead>`
        - 'Powell'      :ref:`(see here) <optimize.minimize-powell>`
        - 'CG'          :ref:`(see here) <optimize.minimize-cg>`
        - 'BFGS'        :ref:`(see here) <optimize.minimize-bfgs>`
        - 'Newton-CG'   :ref:`(see here) <optimize.minimize-newtoncg>`
        - 'L-BFGS-B'    :ref:`(see here) <optimize.minimize-lbfgsb>`
        - 'TNC'         :ref:`(see here) <optimize.minimize-tnc>`
        - 'COBYLA'      :ref:`(see here) <optimize.minimize-cobyla>`
        - 'SLSQP'       :ref:`(see here) <optimize.minimize-slsqp>`
        - 'trust-constr':ref:`(see here) <optimize.minimize-trustconstr>`
        - 'dogleg'      :ref:`(see here) <optimize.minimize-dogleg>`
        - 'trust-ncg'   :ref:`(see here) <optimize.minimize-trustncg>`
        - 'trust-exact' :ref:`(see here) <optimize.minimize-trustexact>`
        - 'trust-krylov' :ref:`(see here) <optimize.minimize-trustkrylov>`
        - custom - a callable object (added in version 0.14.0),
          see below for description.

    If not given, chosen to be one of ``BFGS``, ``L-BFGS-B``, ``SLSQP``,
    depending if the problem has constraints or bounds.
jac : {callable,  '2-point', '3-point', 'cs', bool}, optional
    Method for computing the gradient vector. Only for CG, BFGS,
    Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov,
    trust-exact and trust-constr. If it is a callable, it should be a
    function that returns the gradient vector:

        ``jac(x, *args) -> array_like, shape (n,)``

    where x is an array with shape (n,) and `args` is a tuple with
    the fixed parameters. Alternatively, the keywords
    {'2-point', '3-point', 'cs'} select a finite
    difference scheme for numerical estimation of the gradient. Options
    '3-point' and 'cs' are available only to 'trust-constr'.
    If `jac` is a Boolean and is True, `fun` is assumed to return the
    gradient along with the objective function. If False, the gradient
    will be estimated using '2-point' finite difference estimation.
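
The last sentences of the docstring mention that `jac` can also be a Boolean. As a minimal sketch of that option (the toy objective and the names `cost_and_grad` and `a` are made up for illustration and are not part of the exercise), a single function returns both the cost and the gradient:

```python
import numpy as np
from scipy.optimize import minimize

def cost_and_grad(x, a):
    """Return (cost, gradient) for f(x) = sum((x - a)**2) in one pass."""
    diff = x - a
    return np.sum(diff ** 2), 2.0 * diff   # scalar cost, gradient of shape (n,)

a = np.array([1.0, -2.0, 3.0])             # hypothetical target vector
# jac=True tells minimize that the function returns the gradient as well
res = minimize(cost_and_grad, x0=np.zeros(3), args=(a,), jac=True, method='BFGS')
print(res.x)                               # should end up close to a
```

This can save work when the cost and the gradient share intermediate results, as they do with the sigmoid output in logistic regression.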

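Note that the function passed as the fun keyword argument must take the theta being optimized as its first parameter, with X and y after it. In addition, theta must be passed in as a 1-D array of shape (n,); otherwise an error occurs.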
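
To illustrate the argument order, here is a small sketch on a toy least-squares problem rather than the EX2 data. The function name `squared_error` and the three data points are assumptions chosen so that the answer is easy to check by eye:

```python
import numpy as np
from scipy.optimize import minimize

def squared_error(theta, X, y):
    # theta comes first and arrives with shape (n,); X and y come from args
    residual = X.dot(theta) - y
    return residual.dot(residual) / (2 * len(y))

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # first column is the intercept
y = np.array([1.0, 3.0, 5.0])                        # exactly y = 1 + 2*x
# args=(X, y) is appended after theta on every call: fun(theta, X, y)
res = minimize(squared_error, x0=np.zeros(2), args=(X, y))
print(res.x)                                         # should come out close to [1, 2]
```

No jac is given here, so minimize estimates the gradient by finite differences, which is fine for a problem this small.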

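Next, jac is the gradient. Two things matter here: first, the theta passed in is again a 1-D array of shape (n,); second, the returned gradient must also be a 1-D array of shape (n,).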
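
One way to sanity-check a gradient function before handing it to minimize is scipy's check_grad, which compares it against a finite-difference estimate. The sketch below uses random synthetic data and an all-1-D formulation (y has shape (m,)), which is an assumption and differs from the column-vector Y used in this post:

```python
import numpy as np
from scipy.optimize import check_grad

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X.dot(theta))
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    # X.T is (n, m) and the residual is (m,), so the result has shape (n,)
    return X.T.dot(sigmoid(X.dot(theta)) - y) / len(y)

rng = np.random.default_rng(0)
X = np.column_stack((np.ones(20), rng.normal(size=(20, 2))))  # synthetic features
y = (rng.random(20) < 0.5).astype(float)                      # synthetic 0/1 labels
theta0 = np.array([0.1, -0.2, 0.3])

print(grad(theta0, X, y).shape)              # (3,)
err = check_grad(cost, grad, theta0, X, y)   # norm of the difference, should be tiny
print(err)
```

A large value from check_grad usually points to a wrong formula or a shape problem in the gradient.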

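In short, the key is that theta must be a 1-D array of shape (n,); anything else will not work. For convenience I had earlier shaped theta into an (n,1) column vector, which is why minimize raised errors. So learning to read the documentation with help really is important.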
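
To see why mixing a (3,) theta with (100,1) column vectors causes trouble, the following sketch uses dummy arrays with the same shapes as in the error message. It shows one plausible source of the mismatch and is not necessarily the exact failure in my original code:

```python
import numpy as np

m, n = 100, 3
X = np.ones((m, n))
theta = np.zeros(n)            # shape (3,), the way minimize passes it
Y = np.zeros((m, 1))           # column vector, shape (100, 1)

h = X.dot(theta)               # shape (100,)
bad = h - Y                    # broadcasts (100,) against (100, 1)
print(bad.shape)               # (100, 100), not an elementwise difference

ok_flat = h - Y.ravel()        # keep everything 1-D
print(ok_flat.shape)           # (100,)

ok_col = X.dot(theta.reshape(n, 1)) - Y   # or reshape theta to a column first
print(ok_col.shape)            # (100, 1)
```

Either fix works, as long as the value finally returned to minimize is flattened back to shape (n,).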

import numpy as np
import pandas as pd
import scipy.optimize as op

def LoadData(filename):
    # Read the headerless CSV into a plain ndarray
    data=pd.read_csv(filename,header=None)
    data=np.array(data)
    return data

def ReshapeData(data):
    # The first two columns are the features, the third is the label
    m=np.size(data,0)
    X=data[:,0:2]
    Y=data[:,2]
    Y=Y.reshape((m,1))
    return X,Y

def InitData(X):
    # Prepend a column of ones for the intercept term;
    # initial_theta is a 1-D array of shape (n+1,), as minimize expects
    m,n=X.shape
    initial_theta = np.zeros(n + 1)
    VecOnes = np.ones((m, 1))
    X = np.column_stack((VecOnes, X))
    return X,initial_theta

def sigmoid(x):
    z=1/(1+np.exp(-x))
    return z

def costFunction(theta,X,Y):
    # theta is 1-D with shape (n,); Y is (m,1), so J comes out as a 1-element array
    m=X.shape[0]
    J = (-np.dot(Y.T, np.log(sigmoid(X.dot(theta)))) - \
         np.dot((1 - Y).T, np.log(1 - sigmoid(X.dot(theta))))) / m
    return J

def gradient(theta,X,Y):
    # minimize passes theta with shape (n,); reshape it to (n,1) for the column-vector math
    m,n=X.shape
    theta=theta.reshape((n,1))
    grad=np.dot(X.T,sigmoid(X.dot(theta))-Y)/m
    # Return the gradient as a 1-D array of shape (n,)
    return grad.flatten()

if __name__=='__main__':
    data = LoadData('ex2data1csv.csv')
    X, Y = ReshapeData(data)
    X, initial_theta = InitData(X)
    result = op.minimize(fun=costFunction, x0=initial_theta, args=(X, Y), method='TNC', jac=gradient)
    print(result)

The final result is shown below. It matches the result of optimizing with fminunc in MATLAB (fminunc: cost: 0.203, theta: -25.161, 0.206, 0.201).

     fun: array([0.2034977])
     jac: array([8.95038682e-09, 8.16149951e-08, 4.74505693e-07])
 message: 'Local minimum reached (|pg| ~= 0)'
    nfev: 36
     nit: 17
  status: 0
 success: True
       x: array([-25.16131858,   0.20623159,   0.20147149])

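In addition, since I knew the cost ends up around 0.203, I also tried plain gradient descent. It became extremely slow toward the end, so I set the loop condition to while J>0.21, and it still ran for roughly 130,000 iterations. This shows how much it pays to use a well-packaged optimization algorithm. Also, I used to think that with an unsuitable learning rate J would simply keep diverging, but yesterday's experiment showed that with some learning rates J diverges at first and still converges later.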
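
The comparison can be sketched roughly as follows. This is not my original experiment: it runs on synthetic data, and the learning rate `alpha`, the stopping tolerance, and the iteration cap `max_iter` are all assumptions, so the counts it prints will differ from the figure above.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X.dot(theta))
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    return X.T.dot(sigmoid(X.dot(theta)) - y) / len(y)

# Synthetic, noisy labels drawn from a made-up "true" theta
rng = np.random.default_rng(0)
X = np.column_stack((np.ones(200), rng.normal(size=(200, 2))))
true_theta = np.array([-1.0, 2.0, 1.0])
y = (rng.random(200) < sigmoid(X.dot(true_theta))).astype(float)

# Packaged optimizer, same method as in the post
res = minimize(cost, x0=np.zeros(3), args=(X, y), jac=grad, method='TNC')

# Plain batch gradient descent with a fixed learning rate
alpha = 0.1                  # assumed learning rate
target = res.fun + 1e-4      # stop once within 1e-4 of the optimizer's cost
max_iter = 200_000           # safety cap so the loop always terminates
theta = np.zeros(3)
iters = 0
while cost(theta, X, y) > target and iters < max_iter:
    theta -= alpha * grad(theta, X, y)
    iters += 1

print('minimize function evaluations:', res.nfev)
print('gradient descent iterations:', iters)
```

On unscaled features like the raw exam scores in EX2, fixed-step gradient descent usually needs a much smaller learning rate, which is one plausible reason it took so many iterations there.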
