2017.7.11 && An analysis of NumPy broadcasting

The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. There are, however, cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation.

NumPy operations are usually done on pairs of arrays on an element-by-element basis. In the simplest case, the two arrays must have exactly the same shape, as in the following example:

>>> import numpy as np
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = np.array([2.0, 2.0, 2.0])
>>> a * b
array([ 2.,  4.,  6.])

NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain conditions. The simplest broadcasting example occurs when an array and a scalar are combined in an operation: with the same a and with b = 2.0, the expression a * b again yields the values 2., 4., 6.

The result is equivalent to the previous example where b was an array. We can think of the scalar b being stretched during the arithmetic operation into an array with the same shape as a. The new elements in b are simply copies of the original scalar. The stretching analogy is only conceptual. NumPy is smart enough to use the original scalar value without actually making copies, so that broadcasting operations are as memory and computationally efficient as possible.

The code in the second example is more efficient than that in the first because broadcasting moves less memory around during the multiplication (b is a scalar rather than an array).
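The two forms of the multiplication can be compared directly. The snippet below is a minimal sketch: the names b_array, b_scalar and stretched are chosen here for illustration, and np.broadcast_to is used only to make the "stretching without copying" idea visible, not as a claim about how NumPy performs the multiplication internally.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b_array = np.array([2.0, 2.0, 2.0])   # explicit array, as in the first example
b_scalar = 2.0                         # scalar, as in the second example

# Both forms give the same element-wise product.
same = np.array_equal(a * b_array, a * b_scalar)

# np.broadcast_to returns a read-only view of the scalar with a's shape.
# A stride of 0 means every position refers to the same stored value,
# so the "stretched" array occupies no extra memory.
stretched = np.broadcast_to(np.array(b_scalar), a.shape)

print(same)
print(stretched.shape, stretched.strides)
```

A zero stride is what makes the stretching analogy purely conceptual: the view reports three elements but stores only one.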

General Broadcasting Rules

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions and works its way forward. Two dimensions are compatible when

1. they are equal, or
2. one of them is 1

If these conditions are not met, a ValueError is raised, indicating that the arrays have incompatible shapes. Current NumPy releases report "operands could not be broadcast together"; older documentation quotes the message as "frames are not aligned". Along each dimension, the size of the resulting array is the input size that is not 1.
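The two rules can be written out by hand. The function below is a sketch for illustration only: broadcast_shape is a hypothetical helper, not a NumPy function, and it assumes that a shape with fewer dimensions is padded with 1s on the left. Its result is checked against np.broadcast, which reports the shape NumPy itself computes.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two compatibility rules by hand."""
    result = []
    # Walk both shapes from the trailing dimension forward.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        # A missing leading dimension behaves like a dimension of size 1.
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
        # The size-1 dimension gives way to the other one.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))

by_hand = broadcast_shape((4, 1), (5,))
by_numpy = np.broadcast(np.zeros((4, 1)), np.zeros(5)).shape
print(by_hand, by_numpy)
```

Calling broadcast_shape((3,), (4,)) raises ValueError, mirroring the exception NumPy raises for incompatible shapes.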

Arrays do not need to have the same number of dimensions; the comparison simply stops when the shorter shape runs out, as if its missing leading dimensions had size 1. For example, if you have a 256x256x3 array of RGB values, and you want to scale each color in the image by a different value, you can multiply the image by a one-dimensional array with 3 values. Lining up the sizes of the trailing axes of these arrays according to the broadcast rules shows that they are compatible:

Image  (3d array): 256 x 256 x 3
Scale  (1d array):             3
Result (3d array): 256 x 256 x 3
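The RGB case above can be tried out directly. This is a small sketch with placeholder data: the all-ones image and the three scale factors are made up for illustration.

```python
import numpy as np

image = np.ones((256, 256, 3))       # placeholder RGB data
scale = np.array([0.5, 1.0, 2.0])    # made-up per-channel factors

# scale has shape (3,), which lines up with the last axis of image.
scaled = image * scale

print(scaled.shape)
print(scaled[0, 0])   # the three channels of one pixel, each scaled separately
```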

When either of the dimensions compared is one, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other.

In the following example, both the A and B arrays have axes with length one that are expanded to a larger size during the broadcast operation:

A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5

Here are some more examples:

A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5
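The shape tables above can be checked mechanically. The sketch below adds zero-filled arrays of each pair of shapes and records the shape of the sum; the cases list simply restates the tables, and the variable names are chosen here for illustration.

```python
import numpy as np

# (shape of A, shape of B, expected result shape), copied from the tables.
cases = [
    ((8, 1, 6, 1), (7, 1, 5), (8, 7, 6, 5)),
    ((5, 4), (1,), (5, 4)),
    ((5, 4), (4,), (5, 4)),
    ((15, 3, 5), (15, 1, 5), (15, 3, 5)),
    ((15, 3, 5), (3, 5), (15, 3, 5)),
    ((15, 3, 5), (3, 1), (15, 3, 5)),
]

# Let NumPy broadcast each pair and record the resulting shape.
results = [(np.zeros(a) + np.zeros(b)).shape for a, b, _ in cases]

for (a, b, expected), got in zip(cases, results):
    print(a, "+", b, "->", got, "expected", expected)
```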

Here are examples of shapes that do not broadcast:

A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second-from-last dimensions mismatched
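The same two incompatible pairs can be fed to NumPy to see the failure. This is a minimal sketch; the exact wording of the error message depends on the NumPy version, so only the exception type is relied on.

```python
import numpy as np

# The two shape pairs listed above as not broadcastable.
bad_pairs = [((3,), (4,)), ((2, 1), (8, 4, 3))]

errors = []
for shape_a, shape_b in bad_pairs:
    try:
        np.zeros(shape_a) + np.zeros(shape_b)
    except ValueError as exc:
        # Keep the message so it can be inspected afterwards.
        errors.append(str(exc))

for message in errors:
    print(message)
```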

An example of broadcasting in practice:

>>> x = np.arange(4)
>>> xx = x.reshape(4,1)
>>> y = np.ones(5)
>>> z = np.ones((3,4))

>>> x.shape
(4,)

>>> y.shape
(5,)

>>> x + y
ValueError: operands could not be broadcast together with shapes (4,) (5,)

>>> xx.shape
(4, 1)

>>> y.shape
(5,)

>>> (xx + y).shape
(4, 5)

>>> xx + y
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.,  4.]])

>>> x.shape
(4,)

>>> z.shape
(3, 4)

>>> (x + z).shape
(3, 4)

>>> x + z
array([[ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.]])

Broadcasting provides a convenient way of taking the outer product (or any other outer operation) of two arrays. The following example shows an outer addition operation of two 1-d arrays:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[  1.,   2.,   3.],
       [ 11.,  12.,  13.],
       [ 21.,  22.,  23.],
       [ 31.,  32.,  33.]])

Here the newaxis index operator inserts a new axis into a, making it a two-dimensional 4x1 array. Combining the 4x1 array with b, which has shape (3,), yields a 4x3 array.
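The same pattern gives the outer product when the addition is replaced by a multiplication. The sketch below reuses a and b from the example above and compares the broadcast result with np.outer; it also shows reshape(-1, 1) as an equivalent way to obtain the 4x1 column.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# (4, 1) times (3,) broadcasts to (4, 3): every a[i] meets every b[j].
outer_product = a[:, np.newaxis] * b
same_as_outer = np.array_equal(outer_product, np.outer(a, b))

# reshape(-1, 1) builds the same 4x1 column as a[:, np.newaxis].
column = a.reshape(-1, 1)

print(outer_product.shape, same_as_outer, column.shape)
```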

See this article for illustrations of broadcasting concepts.