Neural Networks and Deep Learning (Chapter 1) (Part 4)

The architecture of neural networks

In the next section I’ll introduce a neural network that can do a pretty good job classifying handwritten digits. In preparation for that, it helps to explain some terminology that lets us name different parts of a network. Suppose we have the network:

[Figure: a three-layer network with an input layer, a single hidden layer, and one output neuron]

As mentioned earlier, the leftmost layer in this network is called the input layer, and the neurons within the layer are called input neurons. The rightmost or output layer contains the output neurons, or, as in this case, a single output neuron. The middle layer is called a hidden layer, since the neurons in this layer are neither inputs nor outputs. The term “hidden” perhaps sounds a little mysterious - the first time I heard the term I thought it must have some deep philosophical or mathematical significance - but it really means nothing more than “not an input or an output”. The network above has just a single hidden layer, but some networks have multiple hidden layers. For example, the following four-layer network has two hidden layers:

[Figure: a four-layer network with two hidden layers]
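To make the layer terminology concrete, here is a minimal Python/NumPy sketch of one common way to write such an architecture down: a list with one entry per layer, plus a weight matrix and bias vector for every layer after the input. The layer sizes below are made-up illustrations, not values from the text.

```python
import numpy as np

# One entry per layer: input layer, two hidden layers, output layer.
# These sizes are illustrative only.
sizes = [6, 4, 3, 1]

# Every layer after the input gets a weight matrix and a bias vector.
# weights[l] maps the activations of layer l to layer l+1.
weights = [np.random.randn(m, n) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.random.randn(m, 1) for m in sizes[1:]]

print(len(weights))  # 3 weight matrices: a four-layer network
```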

Somewhat confusingly, and for historical reasons, such multiple layer networks are sometimes called multilayer perceptrons or MLPs, despite being made up of sigmoid neurons, not perceptrons. I’m not going to use the MLP terminology in this book, since I think it’s confusing, but wanted to warn you of its existence. 

The design of the input and output layers in a network is often straightforward. For example, suppose we’re trying to determine whether a handwritten image depicts a “9” or not. A natural way to design the network is to encode the intensities of the image pixels into the input neurons. If the image is a 64 by 64 greyscale image, then we’d have 4,096 = 64 × 64 input neurons, with the intensities scaled appropriately between 0 and 1. The output layer will contain just a single neuron, with output values of less than 0.5 indicating “input image is not a 9”, and values greater than 0.5 indicating “input image is a 9”.
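A sketch of this input/output encoding, assuming 8-bit pixel intensities in 0–255 and, purely for brevity, a single sigmoid output neuron wired directly to all 4,096 inputs with random placeholder weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A hypothetical 64x64 greyscale image with 8-bit intensities.
image = np.random.randint(0, 256, size=(64, 64))

# Flatten to a 4096-entry column vector scaled to [0, 1]:
# one value per input neuron.
x = image.reshape(4096, 1) / 255.0

# Placeholder weights and bias for a single output neuron
# (no hidden layer here, just to illustrate the encoding).
w = np.random.randn(1, 4096)
b = np.random.randn(1, 1)

output = sigmoid(w @ x + b)
print("is a 9" if output[0, 0] > 0.5 else "is not a 9")
```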

While the design of the input and output layers of a neural network is often straightforward, there can be quite an art to the design of the hidden layers. In particular, it’s not possible to sum up the design process for the hidden layers with a few simple rules of thumb. Instead, neural networks researchers have developed many design heuristics for the hidden layers, which help people get the behaviour they want out of their nets. For example, such heuristics can be used to help determine how to trade off the number of hidden layers against the time required to train the network. We’ll meet several such design heuristics later in this book. 

Up to now, we’ve been discussing neural networks where the output from one layer is used as input to the next layer. Such networks are called feedforward neural networks. This means there are no loops in the network - information is always fed forward, never fed back. If we did have loops, we’d end up with situations where the input to the σ function depended on the output. That’d be hard to make sense of, and so we don’t allow such loops. 
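A minimal sketch of that layer-by-layer computation, reusing the representation from the earlier sketch (all sizes and weights here are arbitrary assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, weights, biases):
    # Each layer's output becomes the next layer's input;
    # information only ever moves forward, so there are no loops.
    a = x
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a

# An illustrative four-layer network with arbitrary sizes.
sizes = [6, 4, 3, 1]
weights = [np.random.randn(m, n) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.random.randn(m, 1) for m in sizes[1:]]

print(feedforward(np.random.randn(6, 1), weights, biases))
```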

However, there are other models of artificial neural networks in which feedback loops are possible. These models are called recurrent neural networks. The idea in these models is to have neurons which fire for some limited duration of time, before becoming quiescent. That firing can stimulate other neurons, which may fire a little while later, also for a limited duration. That causes still more neurons to fire, and so over time we get a cascade of neurons firing. Loops don’t cause problems in such a model, since a neuron’s output only affects its input at some later time, not instantaneously. 
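A toy discrete-time sketch of this idea (the layer sizes and the names W_in and W_rec are hypothetical): the hidden state computed at step t-1 only feeds back into the computation at step t, so the loop never makes a neuron’s input depend on its own current output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative dimensions: 3 inputs, 5 recurrent neurons.
W_in = np.random.randn(5, 3)   # input -> hidden weights
W_rec = np.random.randn(5, 5)  # hidden -> hidden feedback weights
b = np.random.randn(5, 1)

h = np.zeros((5, 1))  # the hidden state starts quiescent
for t in range(10):   # a sequence of 10 time steps
    x_t = np.random.randn(3, 1)  # input arriving at time t
    # h on the right-hand side is last step's activity, so the
    # feedback is delayed rather than instantaneous.
    h = sigmoid(W_in @ x_t + W_rec @ h + b)
print(h)
```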

Recurrent neural nets have been less influential than feedforward networks, in part because the learning algorithms for recurrent nets are (at least to date) less powerful. But recurrent networks are still extremely interesting. They’re much closer in spirit to how our brains work than feedforward networks. And it’s possible that recurrent networks can solve important problems which can only be solved with great difficulty by feedforward networks. However, to limit our scope, in this book we’re going to concentrate on the more widely-used feedforward networks. 

Source: http://blog.csdn.net/forrestyanyu/article/details/54860561
