SciPy Quick-Start Tutorial
SciPy Tutorial
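1. Introduction
SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python.
SciPy is a Python library widely used in mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ordinary differential equation solvers, and more, and is therefore widely used in machine learning projects.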
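1.1 SciPy Organization
SciPy is organized into subpackages covering different scientific computing domains. These are summarized in the following table: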
Subpackage | Description |
---|---|
cluster | Clustering algorithms |
constants | Physical and mathematical constants |
fftpack | Fast Fourier Transform routines (legacy; scipy.fft is the newer interface) |
integrate | Integration and ordinary differential equation solvers |
interpolate | Interpolation and smoothing splines |
io | Input and output |
linalg | Linear algebra |
ndimage | N-dimensional image processing |
odr | Orthogonal distance regression |
optimize | Optimization and root-finding routines |
signal | Signal processing |
sparse | Sparse matrices and associated routines |
spatial | Spatial data structures and algorithms |
special | Special functions |
stats | Statistical distributions and functions |
SciPy subpackages need to be imported separately, for example: from scipy import linalg, optimize
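As a minimal sketch of this import pattern, each subpackage is imported by name and then used through that name. The matrix A, the vector b, and the helper f below are made-up illustrations, not taken from the tutorial itself.

```python
import numpy as np
from scipy import linalg, optimize

# Hypothetical 2x2 linear system A @ x = b, chosen only for illustration.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)  # solve the linear system with scipy.linalg

# Hypothetical scalar function whose root on [0, 2] is sqrt(2).
def f(t):
    return t**2 - 2.0

root = optimize.brentq(f, 0.0, 2.0)  # bracketing root finder from scipy.optimize
```

Here brentq needs f to change sign on the bracket, which holds because f(0) is negative and f(2) is positive.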
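2. Basic Functions
2.1 Interaction with NumPy
SciPy builds on NumPy, so for all basic array-handling needs you can use NumPy functions:
1) Index tricks
Some class instances make special use of slicing functionality to provide efficient means of array construction. This part discusses the use of numpy.mgrid, numpy.ogrid, numpy.r_, and numpy.c_ for quickly constructing arrays.
For example, the same array can be built with concatenate or, more compactly, with r_: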
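import numpy as np
# concatenate: join a sequence of arrays along an existing axis.
a = np.concatenate(([3], [0]*5, np.arange(-1, 1.002, 2/9.0)))
# r_: the same array via slice notation; the step 10j means 10 points, endpoint included.
a = np.r_[3, [0]*5, -1:1:10j]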
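This can ease typing and make for more readable code.
The "r" stands for row concatenation: if the objects between commas are 2-D arrays, they are stacked by rows (and thus must have matching numbers of columns). There is an equivalent command c_ that stacks 2-D arrays by columns. For 1-D arrays the two differ: r_ joins them end to end into one 1-D array, while c_ turns each into a column and returns a 2-D array.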
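The behaviour of r_ and c_ on 1-D and 2-D inputs can be checked with a short sketch. The arrays a, b, m, and n are arbitrary examples introduced here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.r_[a, b]   # 1-D inputs are joined end to end -> shape (6,)
cols = np.c_[a, b]   # 1-D inputs become columns -> shape (3, 2)

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

stacked_rows = np.r_[m, n]  # 2-D inputs stacked by rows -> shape (4, 2)
stacked_cols = np.c_[m, n]  # 2-D inputs stacked by columns -> shape (2, 4)
```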
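Another very useful class instance that makes use of extended slicing notation is mgrid. In the simplest case, it can be used to construct 1-D ranges as a convenient substitute for arange. It also allows complex numbers in the step size to indicate the number of points to place between the (inclusive) endpoints. Its real purpose, however, is to produce N N-D arrays that provide coordinate arrays for an N-D volume. Example: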
>>> np.mgrid[0:5,0:5]
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4]],
       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
>>> np.mgrid[0:5:4j,0:5:4j]
array([[[ 0.    ,  0.    ,  0.    ,  0.    ],
        [ 1.6667,  1.6667,  1.6667,  1.6667],
        [ 3.3333,  3.3333,  3.3333,  3.3333],
        [ 5.    ,  5.    ,  5.    ,  5.    ]],
       [[ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ],
        [ 0.    ,  1.6667,  3.3333,  5.    ]]])
Having meshed arrays like this is sometimes very useful. However, it is not always needed just to evaluate some N-D function over a grid due to the array-broadcasting rules of NumPy and SciPy. If this is the only purpose for generating a meshgrid, you should instead use the function ogrid which generates an “open” grid using newaxis judiciously to create N, N-D arrays where only one dimension in each array has length greater than 1. This will save memory and create the same result if the only purpose for the meshgrid is to generate sample points for evaluation of an N-D function.
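The memory argument for ogrid can be made concrete with a small sketch. The function f is a hypothetical example, not part of NumPy or SciPy. It is evaluated once on a dense mgrid and once on an open ogrid, and broadcasting makes the results agree.

```python
import numpy as np

def f(x, y):
    # Hypothetical 2-D function used only for illustration.
    return x**2 + y

# Dense grid: two full 5x5 coordinate arrays.
xd, yd = np.mgrid[0:5, 0:5]
# Open grid: shapes (5, 1) and (1, 5); broadcasting fills in the rest.
xo, yo = np.ogrid[0:5, 0:5]

dense = f(xd, yd)
open_ = f(xo, yo)
```

Both results have shape (5, 5), but the open grid stores 10 coordinate values instead of 50.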
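2) Shape manipulation
This category of functions includes routines for squeezing out length-one dimensions from N-D arrays, ensuring that an array is at least 1-, 2-, or 3-D, and stacking (concatenating) arrays by rows, columns, and "pages" (the third dimension). Routines for splitting arrays (roughly the opposite of stacking) are also available.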
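A brief sketch of these routines follows. The specific NumPy functions shown (squeeze, atleast_2d, vstack, column_stack, dstack, split) are one reasonable selection from this category, and the input arrays are arbitrary.

```python
import numpy as np

a = np.zeros((1, 3, 1))
squeezed = np.squeeze(a)           # drop length-one axes -> shape (3,)
at2 = np.atleast_2d([1, 2, 3])     # promote to at least 2-D -> shape (1, 3)

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
by_rows = np.vstack((u, v))        # each input becomes a row -> shape (2, 3)
by_cols = np.column_stack((u, v))  # each input becomes a column -> shape (3, 2)
by_pages = np.dstack((u, v))       # stacked along the third axis -> shape (1, 3, 2)

parts = np.split(np.arange(6), 3)  # three equal pieces of length 2
```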
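3) Polynomials
There are two (interchangeable) ways to deal with 1-D polynomials in SciPy. The first is to use the poly1d class from NumPy. This class accepts coefficients or polynomial roots to initialize a polynomial. The polynomial object can then be manipulated in algebraic expressions, integrated, differentiated, and evaluated. It even prints like a polynomial: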
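from numpy import poly1d
p = poly1d([3,4,5]) # 3x^2 + 4x + 5
print(p) # shows the polynomial with exponents on the line above the terms
p(0.5) # evaluate at x = 0.5: 3x^2 + 4x + 5 = 7.75
p*p # 9x^4 + 24x^3 + 46x^2 + 40x + 25
p.deriv() # 6x + 4, the derivative of the polynomial
p.integ(k=6) # x^3 + 2x^2 + 5x + 6, the indefinite integral with integration constant 6
p([4, 5]) # array([ 69, 100]), e.g. 3*4^2 + 4*4 + 5 = 69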
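The other way to handle polynomials is as an array of coefficients, with the first element of the array giving the coefficient of the highest power. There are explicit functions to add, subtract, multiply, divide, integrate, differentiate, and evaluate polynomials represented as sequences of coefficients.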
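A minimal sketch of the coefficient-array approach, using NumPy's polyadd, polymul, polydiv, polyder, polyint, and polyval on the same polynomial 3x^2 + 4x + 5. The second polynomial q is an arbitrary choice for illustration.

```python
import numpy as np

p = [3, 4, 5]   # 3x^2 + 4x + 5, highest power first
q = [1, 0]      # x

total = np.polyadd(p, q)                # 3x^2 + 5x + 5
product = np.polymul(p, q)              # 3x^3 + 4x^2 + 5x
quotient, remainder = np.polydiv(p, q)  # p = q*(3x + 4) + 5
derivative = np.polyder(p)              # 6x + 4
integral = np.polyint(p)                # x^3 + 2x^2 + 5x (integration constant 0)
value = np.polyval(p, 0.5)              # 7.75
```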
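4) Vectorizing functions (vectorize)
One of the features NumPy provides is the class vectorize, which converts an ordinary Python function that accepts scalars and returns scalars into a "vectorized function" with the same broadcasting rules as other NumPy functions (that is, the universal functions, or ufuncs). For example, suppose you have a Python function named addsubtract defined as: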
def addsubtract(a,b):
    if a > b:
        return a - b
    else:
        return a + b
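This defines a function of two scalar variables that returns a scalar result. The class vectorize can be used to "vectorize" this function:
vec_addsubtract = np.vectorize(addsubtract)
The result is a function that takes array arguments and returns an array result: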
>>> vec_addsubtract([0,3,6,9],[1,3,5,7])
array([1, 6, 1, 2])
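This particular function could have been written in vector form without the use of vectorize. However, functions that employ optimization or integration routines can likely only be vectorized using vectorize.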
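One possible vector form of addsubtract, written with numpy.where instead of vectorize, is sketched below. The name addsubtract_vec is introduced here for illustration only.

```python
import numpy as np

def addsubtract_vec(a, b):
    a = np.asarray(a)
    b = np.asarray(b)
    # Elementwise: a - b where a > b, otherwise a + b.
    return np.where(a > b, a - b, a + b)

result = addsubtract_vec([0, 3, 6, 9], [1, 3, 5, 7])
```

Unlike vectorize, which calls the Python function once per element, this version evaluates both branches on whole arrays and then selects between them.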
5) Type handling
Note the difference between numpy.iscomplex/numpy.isreal and numpy.iscomplexobj/numpy.isrealobj. The former pair is array-based and returns boolean arrays giving the result of the element-wise test. The latter pair is object-based and returns a single boolean describing the result of the test on the entire object.
Often it is required to get just the real and/or imaginary part of a complex number. While complex numbers and arrays have attributes that return those values, if one is not sure whether or not the object will be complex-valued, it is better to use the functional forms numpy.real and numpy.imag. These functions succeed for anything that can be turned into a NumPy array. Consider also the function numpy.real_if_close, which transforms a complex-valued number with a tiny imaginary part into a real number.
Occasionally code needs to check whether or not a value is a scalar (Python int, Python float, Python complex, or a NumPy scalar). This functionality is provided by the convenient function numpy.isscalar, which returns True or False. Note that numpy.isscalar returns False for a 0-D array.
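The distinctions in this part can be sketched with a few calls. The sample array z and the other input values are arbitrary illustrations.

```python
import numpy as np

z = np.array([1 + 0j, 2 + 3j])

elementwise = np.iscomplex(z)   # per element: is the imaginary part nonzero?
whole = np.iscomplexobj(z)      # one answer: does the object have a complex dtype?

re_part = np.real([1.0, 2.0])   # functional form also works on real input
im_part = np.imag([1.0, 2.0])   # zeros for real input

# A tiny imaginary part (well below the default tolerance) is dropped.
almost_real = np.real_if_close(np.array([2.0 + 1e-15j]))

scalar_checks = (
    np.isscalar(3.0),            # Python float
    np.isscalar(np.array(3.0)),  # 0-D array
    np.isscalar([3.0]),          # list
)
```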
6) Other useful functions
There are also several other useful functions which should be mentioned. For doing phase processing, the functions angle and unwrap are useful. Also, the linspace and logspace functions return equally spaced samples in a linear or log scale. Finally, it's useful to be aware of the indexing capabilities of NumPy. Mention should be made of the function select, which extends the functionality of where to include multiple conditions and multiple choices. The calling convention is select(condlist, choicelist, default=0). numpy.select is a vectorized form of the multiple if-statement. It allows rapid construction of a function which returns an array of results based on a list of conditions. Each element of the return array is taken from the array in choicelist corresponding to the first condition in condlist that is true. For example:
>>> x = np.arange(10)
>>> condlist = [x<3, x>5]
>>> choicelist = [x, x**2]
>>> np.select(condlist, choicelist)
array([ 0, 1, 2, 0, 0, 0, 36, 49, 64, 81])
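The sampling and phase helpers named earlier in this part can be sketched in the same spirit. The complex signal below is a made-up example whose phase makes two full turns.

```python
import numpy as np

lin = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced samples from 0 to 1, endpoints included
log = np.logspace(0, 3, 4)       # 4 samples from 10**0 to 10**3, evenly spaced in the exponent

# Hypothetical signal: unit-magnitude phasor whose phase runs from 0 to 4*pi.
true_phase = np.linspace(0.0, 4 * np.pi, 50)
signal = np.exp(1j * true_phase)

wrapped = np.angle(signal)       # phase confined to the interval (-pi, pi]
unwrapped = np.unwrap(wrapped)   # 2*pi jumps removed, recovering the smooth phase
```

unwrap can only recover the original phase when consecutive samples differ by less than pi, which holds here because the step is 4*pi/49.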
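Some additional useful functions can also be found in the module scipy.special. For example, the factorial and comb functions compute n! and n!/(k!(n-k)!), the number of ways to choose k items from n. For approximating derivatives from discrete samples, one can compute weighting coefficients for an equally spaced N-point approximation to the derivative of order o. These weights must be multiplied by the function values at those points and the results added to obtain the derivative approximation. This approach is intended for use when only samples of the function are available. When the function is an object that can be handed to a routine and evaluated, a routine such as derivative can automatically evaluate it at the correct points to obtain an N-point approximation to the o-th derivative at a given point. Older SciPy releases shipped these two helpers as scipy.misc.central_diff_weights and scipy.misc.derivative; both were deprecated in SciPy 1.10 and are absent from current releases.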
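A short sketch of both ideas: factorial and comb come from scipy.special, while the three-point central-difference weights are written out by hand as an assumption of this example rather than taken from any library routine. The helper name first_derivative and the choice of sin as the sampled function are illustrative only.

```python
import numpy as np
from scipy.special import comb, factorial

n_fact = factorial(5, exact=True)     # 5! with exact integer arithmetic
n_choose_k = comb(5, 2, exact=True)   # 5! / (2! * 3!)

# Three-point central-difference weights for the first derivative,
# hand-written for this sketch.
weights = np.array([-0.5, 0.0, 0.5])

def first_derivative(samples, h):
    # samples: function values at x0 - h, x0, x0 + h
    return np.dot(weights, samples) / h

h = 1e-3
x0 = 1.0
samples = np.sin(np.array([x0 - h, x0, x0 + h]))
approx = first_derivative(samples, h)  # approximates cos(1.0)
```

The weights are multiplied by the sampled values and summed, then divided by the spacing h, which is the multiply-and-add step described in the text for the first-derivative case.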