PyTorch 01: An Introduction to PyTorch, a Deep Learning Tool

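In this notebook, you'll get to know PyTorch, a framework for building and training neural networks. In many ways PyTorch tensors behave like NumPy arrays; after all, NumPy arrays are just tensors. PyTorch takes these tensors and makes it simple to move them to a GPU for the faster processing needed when training neural networks. It also provides a module that automatically calculates gradients (for backpropagation) and another module specifically for building neural networks. All together, PyTorch ends up being more coherent with Python and the NumPy/SciPy stack than TensorFlow and other frameworks.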

Neural Networks

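Deep learning is based on artificial neural networks, which have been around in some form since the late 1950s. A network is built from individual parts that approximate neurons, typically called units or simply "neurons". Each unit has some number of weighted inputs. These weighted inputs are summed together and then passed through an activation function to get the unit's output.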

Mathematically this looks like:

$y = f(w_1 x_1 + w_2 x_2 + b)$

$y = f\left(\sum_i w_i x_i +b \right)$

With vectors, the weighted sum is the dot (inner) product of two vectors:

$h = \begin{bmatrix} x_1 \, x_2 \cdots x_n \end{bmatrix} \cdot \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}$
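As a small worked example of the formulas above, the sketch below computes the weighted sum both as an explicit sum and as a dot product. The input values, weights, and bias are made up for illustration, and `torch.sigmoid` is assumed as the activation function $f$.

```python
import torch

# Hypothetical inputs, weights, and bias chosen only for illustration
x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.5, -1.0, 2.0])
b = torch.tensor(0.1)

# Explicit weighted sum: sum_i w_i * x_i
h_sum = (w * x).sum()

# The same quantity written as a dot product of the two vectors
h_dot = torch.dot(x, w)

# Unit output y = f(h + b), assuming a sigmoid for the activation f
y = torch.sigmoid(h_dot + b)

print(h_sum.item(), h_dot.item(), y.item())
```

Both forms give 0.5 - 2.0 + 6.0 = 4.5 for the weighted sum; the dot product just expresses it as a single operation.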

Tensors

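It turns out that neural network computations are just a series of linear algebra operations on tensors, a generalization of matrices. A vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and an array with three indices is a 3-dimensional tensor (an RGB color image, for example). The fundamental data structure for neural networks is the tensor, and PyTorch (as well as pretty much every other deep learning framework) is built around tensors.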
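To make the dimensions concrete, here is a minimal sketch that builds a 1-D, a 2-D, and a 3-D tensor and prints their shapes. The values and the 4x4 image size are arbitrary, and the (channels, height, width) layout is one common convention rather than the only one.

```python
import torch

vector = torch.tensor([1.0, 2.0, 3.0])            # 1-D tensor
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2-D tensor
# 3-D tensor: a hypothetical 4x4 RGB image stored as (channels, height, width)
image = torch.zeros(3, 4, 4)

for name, t in [("vector", vector), ("matrix", matrix), ("image", image)]:
    # ndim is the number of indices needed to address one element
    print(name, t.ndim, tuple(t.shape))
```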

With the basics covered, let's look at how PyTorch can be used to build a simple neural network.

# First, import PyTorch
import torch

def activation(x):
    """ Sigmoid activation function 
    
        Arguments
        ---------
        x: torch.Tensor
    """
    return 1/(1+torch.exp(-x))

### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 5 random normal variables
features = torch.randn((1, 5))
# True weights for our data, random normal variables again
weights = torch.randn_like(features)
# and a true bias term
bias = torch.randn((1, 1))

print(features.shape)
print(weights.shape)

torch.Size([1, 5])
torch.Size([1, 5])

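I generated some data above that we can use to get the output of this simple network. It is all just random for now; later on we'll use real data. Going through each relevant line:

features = torch.randn((1, 5)) creates a tensor with shape (1, 5), one row and five columns, containing values drawn from the normal distribution with a mean of zero and a standard deviation of one.

weights = torch.randn_like(features) creates another tensor with the same shape as features, again containing values from a normal distribution.

Finally, bias = torch.randn((1, 1)) creates a single value from a normal distribution.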
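The role of `torch.manual_seed` can be checked with a short sketch: resetting the seed before each call should reproduce the same random tensor, and `torch.randn_like` should copy the shape and dtype of its argument. The variable names here are illustrative only.

```python
import torch

# Same seed, same call: the two tensors should hold identical values
torch.manual_seed(7)
first = torch.randn((1, 5))
torch.manual_seed(7)
second = torch.randn((1, 5))
print(torch.equal(first, second))

# randn_like draws new values but matches the shape and dtype of its input
like = torch.randn_like(first)
print(like.shape, like.dtype)
```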

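Like NumPy arrays, PyTorch tensors can be added, multiplied, subtracted, and so on, and they behave in much the same way. They do come with some nice benefits, though, such as GPU acceleration, which we'll get to later. For now, use the generated data to calculate the output of this simple single-layer network.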

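Exercise: Calculate the output of the network with input features features, weights weights, and bias bias. Similar to NumPy, PyTorch has a torch.sum() function, as well as a .sum() method on tensors, for taking sums. Use the function activation defined above as the activation function.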

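## Calculate the output of this network using the weights and bias tensors
# Option 1: the torch.sum() function
y = activation(torch.sum(features * weights) + bias)
# Option 2: the .sum() method, which gives the same result
y = activation((features * weights).sum() + bias)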
print(y)

tensor([[0.1595]])
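As a sanity check, the hand-written `activation` function can be compared with PyTorch's built-in `torch.sigmoid`. This is only a sketch; the test points in `z` are arbitrary.

```python
import torch

def activation(x):
    """Sigmoid activation function, as defined earlier in this notebook."""
    return 1 / (1 + torch.exp(-x))

# Arbitrary test points spread across negative and positive inputs
z = torch.linspace(-5, 5, steps=11)

# The two implementations should agree up to floating-point error
print(torch.allclose(activation(z), torch.sigmoid(z)))
```

In practice the built-in version is usually preferable, but writing the function out by hand makes the formula explicit.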

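You can do the multiplication and the sum in a single operation using matrix multiplication. In general, matrix multiplication is the better choice, because modern libraries and high-performance hardware such as GPUs make it very efficient.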

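How do we do a matrix multiplication of the features and the weights? We can use torch.mm() or torch.matmul(); the latter is somewhat more complicated and supports broadcasting. If we try it with features and weights as they are, we get an error: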

>> torch.mm(features, weights)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-13-15d592eb5279> in <module>()
----> 1 torch.mm(features, weights)

RuntimeError: size mismatch, m1: [1 x 5], m2: [1 x 5] at /Users/soumith/minicondabuild3/conda-bld/pytorch_1524590658547/work/aten/src/TH/generic/THTensorMath.c:2033

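When building neural networks in any framework, you'll run into this often. The problem is that our tensors don't have the right shapes for a matrix multiplication. Remember that for a matrix multiplication, the number of columns in the first tensor must equal the number of rows in the second tensor. Both features and weights have the same shape, (1, 5). This means we need to change the shape of weights to make the matrix multiplication work.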

Note: To see the shape of a tensor called tensor, use tensor.shape. You'll be using this often.
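The shape rule can be checked before multiplying. The sketch below uses a hypothetical helper, `can_matmul`, that compares the inner dimensions of two 2-D tensors, and it catches the `RuntimeError` instead of letting it stop the notebook. The transpose `weights.t()` is used here only as one quick way to get a (5, 1) tensor.

```python
import torch

features = torch.randn((1, 5))
weights = torch.randn_like(features)

def can_matmul(a, b):
    """Hypothetical helper: True when 2-D tensors a and b have matching inner dimensions."""
    return a.shape[1] == b.shape[0]

print(can_matmul(features, weights))       # (1, 5) with (1, 5)
print(can_matmul(features, weights.t()))   # (1, 5) with (5, 1)

try:
    torch.mm(features, weights)
except RuntimeError as err:
    # The exact message differs between PyTorch versions
    print("mm failed:", err)
```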

There are a few options here: weights.reshape(), weights.resize_(), and weights.view().

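weights.reshape(a, b) returns a tensor of size (a, b) with the same data as weights. When possible the result is a view that shares memory with weights; otherwise it is a copy, with the data placed in another part of memory.
weights.resize_(a, b) returns the same tensor with a different shape. However, if the new shape has fewer elements than the original tensor, some elements are removed from the tensor (but not from memory). If the new shape has more elements than the original tensor, the new elements are uninitialized in memory. Note that the underscore at the end of the method name means the operation is performed in place. To learn more about in-place operations in PyTorch, see this forum thread.
weights.view(a, b) returns a new tensor of size (a, b) that shares its data with weights; it raises an error if the requested shape is not compatible with the tensor's memory layout.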
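The difference between a view and a copy can be observed by comparing `data_ptr()`, which returns the address of a tensor's first element. This is a minimal sketch: the documentation advises against relying on whether `reshape` copies, so treat the printed results as an observation of typical behavior rather than a guarantee.

```python
import torch

weights = torch.randn((1, 5))

viewed = weights.view(5, 1)        # a view: shares storage with weights
reshaped = weights.reshape(5, 1)   # contiguous input, so typically also a view

print(viewed.data_ptr() == weights.data_ptr())
print(reshaped.data_ptr() == weights.data_ptr())

# A transposed 2-D tensor is not contiguous, so flattening it needs a copy
m = torch.arange(6).reshape(2, 3)
mt = m.t()
copied = mt.reshape(6)
print(mt.is_contiguous(), copied.data_ptr() == m.data_ptr())
```

Calling `mt.view(6)` on the non-contiguous tensor would raise an error instead of copying, which is the practical difference between the two methods.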

I usually use .view(), but any of the three methods will work for this example. So now we can reshape weights to have five rows and one column with weights.view(5, 1).

Exercise: Calculate the output of the network using matrix multiplication.

## Calculate the output of this network using matrix multiplication
y = activation(torch.mm(features, weights.view(5, 1)) + bias)
print(y)

tensor([[0.1595]])
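The matrix form also extends naturally to several samples at once. The sketch below assumes a hypothetical batch of four samples and uses `torch.matmul` with `torch.sigmoid`; the bias of shape (1, 1) is broadcast across the rows.

```python
import torch

torch.manual_seed(7)
batch = torch.randn((4, 5))    # 4 hypothetical samples, 5 features each
weights = torch.randn((1, 5))
bias = torch.randn((1, 1))

# (4, 5) @ (5, 1) -> (4, 1); the (1, 1) bias broadcasts over all four rows
out = torch.sigmoid(torch.matmul(batch, weights.view(5, 1)) + bias)
print(out.shape)

# Each row matches the element-wise formulation applied to that sample alone
single = torch.sigmoid((batch[0] * weights).sum() + bias)
print(out[0, 0].item(), single.item())
```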

Stacking

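That's how you calculate the output of a single neuron. The real power of this algorithm shows when you stack individual units into layers, and layers into a neural network. The output of one layer of neurons becomes the input of the next layer. With multiple input units and output units, we now need to express the weights as a matrix.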

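In the network diagram, the first layer, shown at the bottom, is the input, known as the input layer. The middle layer is called the hidden layer, and the final layer (on the right) is the output layer. We can again describe this network mathematically with matrices and use matrix multiplication to get the linear combinations for each unit in one operation. For example, the hidden layer ($h_1$ and $h_2$) can be calculated as: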

$\vec{h} = [h_1 \, h_2] = \begin{bmatrix} x_1 \, x_2 \cdots \, x_n \end{bmatrix} \cdot \begin{bmatrix} w_{11} & w_{12} \\ w_{21} &w_{22} \\ \vdots &\vdots \\ w_{n1} &w_{n2} \end{bmatrix}$

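The output of this small network is found by treating the hidden layer as the input to the output unit. With the bias terms left out for brevity, it can be expressed simply as: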

$y = f_2 \! \left(\, f_1 \! \left(\vec{x} \, \mathbf{W_1}\right) \mathbf{W_2} \right)$

### Generate some data
torch.manual_seed(7) # Set the random seed so things are predictable

# Features are 3 random normal variables
features = torch.randn((1, 3))
print(features.shape)

# Define the size of each layer in our network
n_input = features.shape[1]     # Number of input units, must match number of input features
n_hidden = 2                    # Number of hidden units 
n_output = 1                    # Number of output units

# Weights for inputs to hidden layer
W1 = torch.randn(n_input, n_hidden)
# Weights for hidden layer to output layer
W2 = torch.randn(n_hidden, n_output)
print(W1.shape)

# and bias terms for hidden and output layers
B1 = torch.randn((1, n_hidden))
B2 = torch.randn((1, n_output))

print(B1.shape)

torch.Size([1, 3])
torch.Size([3, 2])
torch.Size([1, 2])

**Exercise:** Calculate the output of this multi-layer network using the weights W1 and W2 and the biases B1 and B2.

## Your solution here
h = activation(torch.mm(features, W1) + B1)
output = activation(torch.mm(h, W2) + B2)
print(output)

tensor([[0.3171]])

If you did this correctly, the output should be tensor([[0.3171]]).

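The number of hidden layers, like the number of hidden units set by n_hidden above, is a parameter of the network, often called a hyperparameter to distinguish it from the weight and bias parameters. As you'll see later when we discuss training, a network with more layers generally has more capacity to learn patterns from the data and make accurate predictions.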
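To see how these hyperparameters shape the computation, here is a sketch that generalizes the two-layer calculation to any list of layer sizes. The helpers `forward` and `make_params` are hypothetical names introduced for illustration, and `torch.sigmoid` stands in for the `activation` function.

```python
import torch

def forward(x, params):
    """Apply each (W, B) pair in turn, with a sigmoid after every layer."""
    for W, B in params:
        x = torch.sigmoid(torch.mm(x, W) + B)
    return x

def make_params(sizes):
    """Hypothetical helper: random weights and biases for consecutive layer sizes."""
    return [(torch.randn(n_in, n_out), torch.randn(1, n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

torch.manual_seed(7)
features = torch.randn((1, 3))

# Vary the hidden layer widths and depth; the output shape stays (1, 1)
for sizes in ([3, 2, 1], [3, 8, 1], [3, 8, 4, 1]):
    out = forward(features, make_params(sizes))
    print(sizes, tuple(out.shape))
```

Only the layer sizes change between runs; the forward pass itself stays the same, which is why these settings are treated as hyperparameters rather than learned parameters.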

Converting between NumPy and Torch

Bonus section! PyTorch can convert between NumPy arrays and Torch tensors. To create a tensor from a NumPy array, use torch.from_numpy(). To convert a tensor to a NumPy array, use the .numpy() method.

import numpy as np
a = np.random.rand(4,3)
a

array([[0.85679551, 0.93626926, 0.284956  ],
       [0.18478529, 0.05775879, 0.44416887],
       [0.68360642, 0.51638607, 0.49711792],
       [0.18593368, 0.53402654, 0.23731168]])

b = torch.from_numpy(a)
b

tensor([[0.8568, 0.9363, 0.2850],
        [0.1848, 0.0578, 0.4442],
        [0.6836, 0.5164, 0.4971],
        [0.1859, 0.5340, 0.2373]], dtype=torch.float64)

b.numpy()

array([[0.85679551, 0.93626926, 0.284956  ],
       [0.18478529, 0.05775879, 0.44416887],
       [0.68360642, 0.51638607, 0.49711792],
       [0.18593368, 0.53402654, 0.23731168]])

The memory is shared between the NumPy array and the Torch tensor, so if you change the values of one object in place, the other will change as well.

# Multiply PyTorch Tensor by 2, in place
b.mul_(2)

tensor([[1.7136, 1.8725, 0.5699],
        [0.3696, 0.1155, 0.8883],
        [1.3672, 1.0328, 0.9942],
        [0.3719, 1.0681, 0.4746]], dtype=torch.float64)

# Numpy array matches new values from Tensor
a

array([[1.71359101, 1.87253851, 0.569912  ],
       [0.36957058, 0.11551758, 0.88833775],
       [1.36721285, 1.03277214, 0.99423583],
       [0.37186737, 1.06805308, 0.47462336]])
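When the shared memory is not wanted, an independent copy can be made instead. The sketch below contrasts `torch.from_numpy()` with `torch.tensor()`, which copies the data, and with a dtype conversion via `.float()`, which also produces a new tensor when the dtype changes. The array of ones is just a stand-in for real data.

```python
import numpy as np
import torch

a = np.ones((2, 3))
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data into a new tensor
as_float32 = shared.float()    # float64 -> float32 conversion creates a new tensor

shared.mul_(2)                 # the in-place change is visible through a

print(a[0, 0], copied[0, 0].item(), as_float32[0, 0].item())
print(np.shares_memory(a, shared.numpy()), np.shares_memory(a, copied.numpy()))
```

Only `a` and `shared` see the doubled values; the two copies keep the original ones.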
