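A Brief Introduction to NumPy

1. What is NumPy

NumPy is a Python library for scientific computing that provides matrix operations; it is usually used together with SciPy and matplotlib. Python's list can already represent matrix-like data, but NumPy offers many more functions on top of that. If you have used MATLAB or Scilab, NumPy will be easy to pick up. The code examples below always import numpy first, using the conventional short alias (import numpy as np):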

>>> import numpy as np
>>> print(np.version.version)
1.6.2

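2. Multidimensional arrays

The type of a multidimensional array is numpy.ndarray.

Using the numpy.array method

Passing a list or tuple as the argument produces a one-dimensional array: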

>>> print(np.array([1,2,3,4]))
[1 2 3 4]
>>> print(np.array((1.2,2,3,4)))
[ 1.2  2.   3.   4. ]
>>> print(type(np.array((1.2,2,3,4))))
<type 'numpy.ndarray'>

A list or tuple whose elements are themselves lists or tuples produces a two-dimensional or higher-dimensional array:

>>> x = np.array(((1,2,3),(4,5,6)))
>>> x
array([[1, 2, 3],
       [4, 5, 6]])
>>> y = np.array([[1,2,3],[4,5,6]])
>>> y
array([[1, 2, 3],
       [4, 5, 6]])

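Setting and converting NumPy data types

The data type of a NumPy ndarray can be set through the dtype parameter and converted with astype, which is very handy when processing files. Note that calling astype returns a new array, that is, a copy of the original data.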

numeric_strings2 = np.array(['1.23','2.34','3.45'],dtype=np.string_)
numeric_strings2
Out[32]:
array(['1.23', '2.34', '3.45'],
      dtype='|S4')
numeric_strings2.astype(float)
Out[33]: array([ 1.23,  2.34,  3.45])
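
The same two points can be sketched with numeric arrays: dtype fixes the element type at creation, and astype hands back a separate array. This is only an illustration, written for Python 3 and a recent NumPy; the names ints, floats and truncated are made up for the example.

```python
import numpy as np

# Set the element type explicitly when the array is created.
ints = np.array([1, 2, 3], dtype=np.int32)

# astype returns a new array; the original keeps its dtype and values.
floats = ints.astype(np.float64)
floats[0] = 99.5

print(ints.dtype)    # int32
print(floats.dtype)  # float64
print(ints[0])       # 1, unchanged by the write to floats

# Converting float to int drops the fractional part (truncates toward zero).
truncated = np.array([1.9, -2.7, 3.2]).astype(np.int64)
print(truncated)     # [ 1 -2  3]
```

Because astype always copies here, it is safe to modify the converted array without touching the source.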

NumPy indexing and slicing

Indexing and slicing: the first index selects the row and the second selects the column.

>>> x[1,2]
6
>>> y=x[:,1]
>>> y
array([2, 5])

This raises a question about modification: if we change y above, does x change too? This deserves special attention!

>>> y
array([2, 5])
>>> y[0] = 10
>>> y
array([10,  5])
>>> x
array([[ 1, 10,  3],
       [ 4,  5,  6]])

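The example above shows that changing y also changes x. We can therefore infer that y and x refer to the same block of memory: no new storage was allocated for y and the values of x were not copied into it.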
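
One way to check this inference directly, rather than by observing side effects, is sketched below. It assumes a NumPy version that provides np.shares_memory (1.11 or later) and rebuilds x and y so the snippet runs on its own.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])
y = x[:, 1]          # basic slicing returns a view, not a copy

# np.shares_memory reports whether two arrays overlap in memory.
shared = np.shares_memory(x, y)
print(shared)        # True

# A view records the array that owns its data in the .base attribute.
print(y.base is x)   # True

y[0] = 10
print(x[0, 1])       # 10
```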

arr = np.arange(10)
arr
Out[45]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
arr[4]
Out[46]: 4
arr[3:6]
Out[47]: array([3, 4, 5])
arr[3:6] = 12
arr
Out[49]: array([ 0,  1,  2, 12, 12, 12,  6,  7,  8,  9])

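As shown above, when a scalar is assigned to a slice, the value is automatically propagated (broadcast) to the whole slice. The most important difference from lists is that an array slice is a view of the original array, so any modification made through the view is reflected directly in the source data.

Why is it designed this way? NumPy is designed for processing large amounts of data; if slicing copied the data, it would cause severe performance and memory overhead.

If you need a copy of part of an array rather than a view, do the following: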

arr_copy = arr[3:6].copy()
arr_copy[:]=24
arr_copy
Out[54]: array([24, 24, 24])
arr
Out[55]: array([ 0,  1,  2, 12, 12, 12,  6,  7,  8,  9])

Now look at modifying a list through a slice:

l = list(range(10))
l
Out[35]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
l[5:8] = 12
Traceback (most recent call last):
  File "<ipython-input-36-022af3ddcc9b>", line 1, in <module>
    l[5:8] = 12
TypeError: can only assign an iterable
l1= l[5:8]
l1
Out[38]: [5, 6, 7]
l1[0]=12
l1
Out[40]: [12, 6, 7]
l
Out[41]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

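This involves deep and shallow copies in Python: slicing a list produces a shallow copy, a new list that holds references to the same elements. For details, see: Python deep and shallow copies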
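
What "shallow" means for a list slice only becomes visible with nested data. The sketch below uses made-up lists (nums, nested and so on) and the standard-library copy module.

```python
import copy

# Slicing a list builds a new list object...
nums = list(range(10))
part = nums[5:8]
part[0] = 12
print(nums[5])            # 5: the original list is untouched

# ...but the copy is shallow: nested objects are still shared.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
shallow[0][0] = 99
print(nested[0][0])       # 99: both lists hold the same inner list

# copy.deepcopy duplicates the nested objects as well.
deep = copy.deepcopy(nested)
deep[1][0] = -1
print(nested[1][0])       # 3
```

With flat lists of numbers, as in the session above, a shallow copy already behaves like an independent list.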

Indexing and slicing multidimensional arrays

arr2d = np.arange(1,10).reshape(3,3)
arr2d
Out[57]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
arr2d[2]
Out[58]: array([7, 8, 9])
arr2d[0][2]
Out[59]: 3
arr2d[0,2]
Out[60]: 3
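
The session above only picks single rows and elements. Slices can be given per axis as well; the following sketch reuses the same arr2d, with illustrative variable names that are not part of the original.

```python
import numpy as np

arr2d = np.arange(1, 10).reshape(3, 3)

first_two_rows = arr2d[:2]        # rows 0 and 1, all columns
top_right = arr2d[:2, 1:]         # rows 0-1, columns 1-2
second_column = arr2d[:, 1]       # every row, column 1 (1-D result)

print(first_two_rows.shape)       # (2, 3)
print(top_right.shape)            # (2, 2)
print(second_column.shape)        # (3,)

# Like 1-D slices, these are views: writing through one changes arr2d.
top_right[:] = 0
print(arr2d)
# [[1 0 0]
#  [4 0 0]
#  [7 8 9]]
```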

Boolean indexing

This kind of indexing appears often in real code, so it is worth attention.

names = np.array(['Bob','joe','Bob','will'])
names == 'Bob'
Out[70]: array([ True, False,  True, False], dtype=bool)

# assumed setup, not shown in the original: a 7x4 array of random normal values (your numbers will differ)
data = np.random.randn(7, 4)
data
Out[73]:
array([[ 0.36762706, -1.55668952,  0.84316735, -0.116842  ],
       [ 1.34023966,  1.12766186,  1.12507441, -0.68689309],
       [ 1.27392366, -0.43399617, -0.80444728,  1.60731881],
       [ 0.23361565,  1.38772715,  0.69129479, -1.19228023],
       [ 0.51353082,  0.17696698, -0.06753478,  0.80448168],
       [ 0.21773096,  0.60582802, -0.46446071,  0.83131122],
       [ 0.50569072,  0.04431685, -0.69358155, -0.9629124 ]])
data[data < 0] = 0
data
Out[75]:
array([[ 0.36762706,  0.        ,  0.84316735,  0.        ],
       [ 1.34023966,  1.12766186,  1.12507441,  0.        ],
       [ 1.27392366,  0.        ,  0.        ,  1.60731881],
       [ 0.23361565,  1.38772715,  0.69129479,  0.        ],
       [ 0.51353082,  0.17696698,  0.        ,  0.80448168],
       [ 0.21773096,  0.60582802,  0.        ,  0.83131122],
       [ 0.50569072,  0.04431685,  0.        ,  0.        ]])

The example above shows how to set values using a boolean mask.
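
A mask built from one array can also pick rows out of another array of matching length. The sketch below is a small made-up case: the data array is hypothetical, with one row per entry in names.

```python
import numpy as np

names = np.array(['Bob', 'joe', 'Bob', 'will'])
# Hypothetical data: one row per entry in names.
data = np.arange(12).reshape(4, 3)

mask = names == 'Bob'
bob_rows = data[mask]            # rows 0 and 2
print(bob_rows)

# Masks combine with & (and), | (or) and ~ (not);
# the Python keywords `and` / `or` do not work element-wise.
others = data[~mask]
bob_or_will = data[(names == 'Bob') | (names == 'will')]

# Boolean indexing used to read returns a copy,
# while used as an assignment target it writes in place.
data[mask] = -1
print(data)
```

Since bob_rows is a copy, it still holds the original values after the in-place assignment on the last lines.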

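Array file input and output

Running experiments often requires reading data from files, and NumPy already provides mature, ready-to-use functions for this.

np.save and np.load are the two main functions for writing arrays to disk and reading them back in binary form. By default, arrays are saved in an uncompressed raw binary format in files with the .npy extension.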

arr = np.arange(10)
np.save('some_array',arr)
np.load('some_array.npy')
Out[80]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
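
A self-contained version of this round trip might look like the sketch below. It writes into a temporary directory so nothing is left behind, and it also shows np.savez for storing several arrays in one file; the file name bundle.npz and the keys first and second are invented for the example.

```python
import os
import shutil
import tempfile
import numpy as np

arr = np.arange(10)
matrix = np.eye(2)

# Use a temporary directory so the example leaves no files behind.
tmp_dir = tempfile.mkdtemp()

# np.save appends the .npy extension when the name lacks it.
np.save(os.path.join(tmp_dir, 'some_array'), arr)
loaded = np.load(os.path.join(tmp_dir, 'some_array.npy'))

# np.savez stores several arrays in one .npz archive, keyed by keyword name.
np.savez(os.path.join(tmp_dir, 'bundle.npz'), first=arr, second=matrix)
archive = np.load(os.path.join(tmp_dir, 'bundle.npz'))
first_back = archive['first']
second_back = archive['second']
archive.close()

shutil.rmtree(tmp_dir)
print(np.array_equal(loaded, arr))   # True
```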

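Reading and writing text files:

The text file here holds the data needed for clustering. It can be read straight into a NumPy array, which avoids the tedium of reading the file line by line.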

arr = np.loadtxt('dataMatrix.txt',delimiter=' ')
arr
Out[82]:
array([[ 1.        ,  1.        ,  1.        ,  1.        ,  1.        ,
         0.8125    ],
       [ 0.52882353,  0.56271186,  0.48220588,  0.53384615,  0.61651376,
         0.58285714],
       [ 0.        ,  0.        ,  0.        ,  1.        ,  1.        ,
         1.        ],
       [ 1.        ,  0.92857143,  0.91857143,  1.        ,  1.        ,
         1.        ],
       [ 1.        ,  1.        ,  1.        ,  1.        ,  1.        ,
         1.        ],
       [ 0.05285714,  0.10304348,  0.068     ,  0.06512821,  0.05492308,
         0.05244898],
       [ 0.04803279,  0.08203125,  0.05516667,  0.05517241,  0.04953488,
         0.05591549],
       [ 0.04803279,  0.08203125,  0.05516667,  0.05517241,  0.04953488,
         0.05591549]])

np.savetxt performs the reverse operation. These two functions make loading data for experiments much more convenient.
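
Since dataMatrix.txt is not available here, the round trip can be sketched with a small made-up matrix and a temporary file (the name matrix.txt is hypothetical). The fmt argument controls how each number is written.

```python
import os
import shutil
import tempfile
import numpy as np

matrix = np.array([[1.0, 0.8125], [0.5, 0.25]])

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'matrix.txt')

# Write one row per line, values separated by a space, four decimals each.
np.savetxt(path, matrix, delimiter=' ', fmt='%.4f')

# Read the text file straight back into a float array.
restored = np.loadtxt(path, delimiter=' ')

shutil.rmtree(tmp_dir)
print(np.allclose(matrix, restored))   # True
```

Text files lose precision beyond what fmt writes, so np.save is usually the better choice when exact values matter.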

Using the numpy.arange method

>>> print(np.arange(15))
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
>>> print(type(np.arange(15)))
<type 'numpy.ndarray'>
>>> print(np.arange(15).reshape(3,5))
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
>>> print(type(np.arange(15).reshape(3,5)))
<type 'numpy.ndarray'>

Using the numpy.linspace method

For example, to generate 20 evenly spaced numbers from 1 to 10:

>>> print(np.linspace(1,10,20))
[  1.           1.47368421   1.94736842   2.42105263   2.89473684
   3.36842105   3.84210526   4.31578947   4.78947368   5.26315789
   5.73684211   6.21052632   6.68421053   7.15789474   7.63157895
   8.10526316   8.57894737   9.05263158   9.52631579  10.        ]
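
The spacing in that output is (10 - 1) / (20 - 1), because linspace includes both end points by default. The sketch below illustrates this and contrasts it with arange, which takes a step instead of a count; variable names are illustrative only.

```python
import numpy as np

points = np.linspace(1, 10, 20)
step = points[1] - points[0]        # (10 - 1) / (20 - 1)
print(step)

# endpoint=False leaves out the stop value, so the spacing becomes (10 - 1) / 20.
open_points = np.linspace(1, 10, 20, endpoint=False)

# retstep=True also returns the spacing that was used.
_, spacing = np.linspace(0, 1, 5, retstep=True)
print(spacing)                      # 0.25

# arange takes a step instead of a count and always excludes the stop value.
evens = np.arange(0, 10, 2)
print(evens)                        # [0 2 4 6 8]
```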

Methods such as numpy.zeros, numpy.ones and numpy.eye construct specific matrices:

>>> print(np.zeros((3,4)))
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
>>> print(np.ones((3,4)))
[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
>>> print(np.eye(3))
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

Getting the attributes of an array:

>>> a = np.zeros((2,2,2))
>>> print(a.ndim)      # number of dimensions
3
>>> print(a.shape)     # size of each dimension
(2, 2, 2)
>>> print(a.size)      # total number of elements
8
>>> print(a.dtype)     # element type
float64
>>> print(a.itemsize)  # bytes occupied by each element
8

Memory layout

The following attributes contain information about the memory layout of the array:

ndarray.flags	Information about the memory layout of the array.
ndarray.shape	Tuple of array dimensions.
ndarray.strides	Tuple of bytes to step in each dimension when traversing an array.
ndarray.ndim	Number of array dimensions.
ndarray.data	Python buffer object pointing to the start of the array’s data.
ndarray.size	Number of elements in the array.
ndarray.itemsize	Length of one array element in bytes.
ndarray.nbytes	Total bytes consumed by the elements of the array.
ndarray.base	Base object if memory is from some other object.
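
A short sketch of how a few of these attributes relate to each other, assuming the default float64 element type (8 bytes) and the default row-major layout:

```python
import numpy as np

a = np.zeros((3, 4))                 # float64, 8 bytes per element, row-major

print(a.strides)                     # (32, 8): the next row is 4 * 8 bytes away
print(a.nbytes)                      # 96 = 12 elements * 8 bytes
print(a.flags['C_CONTIGUOUS'])       # True

# Transposing swaps the strides instead of moving any data.
t = a.T
print(t.strides)                     # (8, 32)
print(t.base is a)                   # True
print(t.flags['C_CONTIGUOUS'])       # False
```

This is why a transpose is cheap: only the shape and strides change, while the underlying buffer stays where it is.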

Array methods

An ndarray object has many methods which operate on or with the array in some fashion, typically returning an array result. These methods are briefly explained below. (Each method’s docstring has a more complete description.)

For the following methods there are also corresponding functions in numpy: all, any, argmax, argmin, argpartition, argsort, choose, clip, compress, copy, cumprod, cumsum, diagonal, imag, max, mean, min, nonzero, partition, prod, ptp, put, ravel, real, repeat, reshape, round, searchsorted, sort, squeeze, std, sum, swapaxes, take, trace, transpose, var.

For more Array methods, see: http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html

Examples of frequently used functions:

>>> x = np.arange(27).reshape(3,3,3)
>>> x
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],
       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],
       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
>>> x.sum(axis=1)
array([[ 9, 12, 15],
       [36, 39, 42],
       [63, 66, 69]])
>>> x.sum(axis=2)
array([[ 3, 12, 21],
       [30, 39, 48],
       [57, 66, 75]])

>>> np.sum([[0, 1], [0, 5]])
6
>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])
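
A useful way to read the axis argument is that the reduced axis disappears from the result's shape. The sketch below illustrates this, along with the keepdims option (available in NumPy 1.7 and later), using the same 3x3x3 array as above.

```python
import numpy as np

x = np.arange(27).reshape(3, 3, 3)

# Summing over an axis removes that axis from the shape.
print(x.sum(axis=0).shape)           # (3, 3)

# keepdims=True keeps the reduced axis with length 1,
# so the result still broadcasts against x.
row_totals = x.sum(axis=2, keepdims=True)
print(row_totals.shape)              # (3, 3, 1)

# Divide every innermost row by its own total; each row then sums to 1.
fractions = x / row_totals.astype(float)
print(fractions[0, 0])
```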

Merging arrays

Use NumPy's vstack (vertical) and hstack (horizontal) functions:

>>> a = np.ones((2,2))
>>> b = np.eye(2)
>>> print(np.vstack((a,b)))
[[ 1.  1.]
 [ 1.  1.]
 [ 1.  0.]
 [ 0.  1.]]
>>> print(np.hstack((a,b)))
[[ 1.  1.  1.  0.]
 [ 1.  1.  0.  1.]]

Let's check whether these two functions involve the shallow-copy issue:

>>> c = np.hstack((a,b))
>>> print(c)
[[ 1.  1.  1.  0.]
 [ 1.  1.  0.  1.]]
>>> a[1,1] = 5
>>> b[1,1] = 5
>>> print(c)
[[ 1.  1.  1.  0.]
 [ 1.  1.  0.  1.]]

The example above shows that a deep copy is made here: c holds its own data rather than a reference to the same memory as a and b.
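
The same conclusion can be checked without modifying a and b, and the stacking functions can be related to the more general np.concatenate. This is only a sketch for the 2-D case, assuming a NumPy version that provides np.shares_memory.

```python
import numpy as np

a = np.ones((2, 2))
b = np.eye(2)

c = np.hstack((a, b))
print(np.shares_memory(c, a))        # False: c owns a fresh buffer

# For 2-D inputs, vstack and hstack match concatenate along axis 0 and axis 1.
same_v = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
same_h = np.array_equal(c, np.concatenate((a, b), axis=1))
print(same_v)                        # True
print(same_h)                        # True
```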

Deep-copying arrays

Array objects come with methods for both shallow and deep copies, but the deep copy is generally used more often:

>>> a = np.ones((2,2))
>>> b = a  # plain assignment: b is another name for a, nothing is copied
>>> b is a
True
>>> c = a.copy()  # deep copy
>>> c is a
False
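
The three cases (plain assignment, shallow copy, deep copy) can be put side by side as in the sketch below. It uses ndarray.view() for the shallow copy; the variable names are chosen for illustration.

```python
import numpy as np

a = np.ones((2, 2))

alias = a            # plain assignment: a second name for the same object
view = a.view()      # shallow copy: new array object that shares a's data
deep = a.copy()      # deep copy: new array object with its own data

print(alias is a)                      # True
print(view is a)                       # False
print(np.shares_memory(view, a))       # True
print(np.shares_memory(deep, a))       # False

a[0, 0] = 7
print(view[0, 0])                      # 7.0
print(deep[0, 0])                      # 1.0
```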

Basic matrix operations

Transpose:

>>> a = np.array([[1,0],[2,3]])
>>> print(a)
[[1 0]
 [2 3]]
>>> print(a.transpose())
[[1 2]
 [0 3]]

The numpy.linalg module has many methods for matrix operations.

Eigenvalues and eigenvectors:

>>> import numpy.linalg as nplg
>>> a = np.array([[1,0],[2,3]])
>>> nplg.eig(a)
(array([ 3.,  1.]), array([[ 0.        ,  0.70710678],
       [ 1.        , -0.70710678]]))
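
The result can be read as follows: column i of the second array is the eigenvector that belongs to the i-th eigenvalue. The sketch below checks A v = lambda v for each pair and, as a further illustration, tries two other numpy.linalg functions (inv and det) on the same matrix.

```python
import numpy as np

a = np.array([[1, 0], [2, 3]])
values, vectors = np.linalg.eig(a)

# Column i of `vectors` is the eigenvector for values[i], so A v = lambda v.
checks = []
for i in range(len(values)):
    v = vectors[:, i]
    checks.append(np.allclose(a.dot(v), values[i] * v))
print(checks)                                     # [True, True]

# Other common operations: matrix product, inverse and determinant.
inverse = np.linalg.inv(a)
print(np.allclose(a.dot(inverse), np.eye(2)))     # True
determinant = np.linalg.det(a)
print(determinant)                                # approximately 3.0
```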

zhaoyuxia517
