Galaxy Zoo: Recognizing and Classifying Galaxy Images


This post is based on the 2014 Galaxy Zoo galaxy image classification challenge, and summarizes the key methods and techniques from the write-up linked below.

http://benanne.github.io/2014/04/05/galaxy-zoo.html


Competition: hosted on Kaggle

Source material: Galaxy Zoo users (zooites) would classify images of galaxies from the Sloan Digital Sky Survey.

Users are asked to describe the morphology of galaxies based on the images. The questions form a decision tree (shown as a figure in the original post).



My solution: convnets

My solution is based around convolutional neural networks (convnets).

An alternative would have been transfer learning: pre-training a deep neural network on another dataset (say, ImageNet), chopping off the top layer, and then training a new classifier on top. There were no requests to use external data in the competition forums (a requirement to be allowed to use it), so I guess nobody tried this approach.
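The recipe itself is standard, so here is a minimal present-day sketch using PyTorch and torchvision (both postdate the competition; the actual solution below is Theano-based, and the ResNet backbone is purely illustrative):

```python
import torch.nn as nn
from torchvision import models

# Transfer-learning sketch: take a network pre-trained on ImageNet,
# freeze its features, and train a fresh top layer for the new task.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                       # keep pre-trained features fixed
model.fc = nn.Linear(model.fc.in_features, 37)    # new head: 37 answer values
```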

Overfitting

As Geoffrey Hinton has been known to say, if you’re not overfitting, your network isn’t big enough. The main challenge, then, was avoiding overfitting.

I tackled this problem with three orthogonal approaches:

  • data augmentation
  • dropout and weight norm constraints
  • modifying the network architecture to increase parameter sharing
The best model I found has about 42 million parameters. Despite overfitting, it still performs very well, and it could be improved step by step.
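The write-up does not give exact constraint values, so the following is only a rough sketch of the dropout-plus-max-norm idea (the `max_norm=2.0` value and the layer sizes are assumptions, and PyTorch stands in for the original Theano code):

```python
import torch
import torch.nn as nn

def renorm_weights(layer, max_norm=2.0):
    """Max-norm weight constraint: rescale any unit whose incoming-weight
    vector has an L2 norm above max_norm (the value here is assumed)."""
    with torch.no_grad():
        w = layer.weight                          # (out_features, in_features)
        norms = w.norm(dim=1, keepdim=True)
        w.mul_(norms.clamp(max=max_norm) / norms.clamp(min=1e-8))

# Dropout is just another layer in the stack; the norm constraint is
# re-applied after each optimiser step, e.g. renorm_weights(dense[1]).
dense = nn.Sequential(nn.Dropout(0.5), nn.Linear(8192, 2048), nn.ReLU())
```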


Software and hardware

I used Python, NumPy and Theano to implement my solution, and scikit-image for preprocessing and augmentation.

Preprocessing and data augmentation

Cropping and downsampling

The training data consisted of 424x424 colour JPEG images, along with 37 weighted answer probabilities per galaxy. I cropped all images to 207x207 and then downsampled them 3x to 69x69. In short: crop, then a 3x downsample to the final 69x69 size.
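A minimal scikit-image sketch of this step (the 108-pixel centre-crop offset follows from the 424-to-207 crop; the file name is hypothetical):

```python
import numpy as np
from skimage import io
from skimage.transform import downscale_local_mean

def crop_and_downsample(path):
    img = io.imread(path)                         # 424x424x3 JPEG
    c = (424 - 207) // 2                          # 108-pixel border on each side
    img = img[c:c + 207, c:c + 207]               # centre crop to 207x207
    img = img.astype(np.float32) / 255.0
    return downscale_local_mean(img, (3, 3, 1))   # 3x downsample -> 69x69x3

x = crop_and_downsample("images_training/100008.jpg")  # hypothetical file name
```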

Exploiting spatial invariances

Images of galaxies are rotation invariant: rotating an image of a galaxy does not change its morphology. They are also scale invariant and translation invariant to a limited extent.

Each training example was perturbed before presenting it to the network by randomly scaling it, rotating it, translating it and optionally flipping it. I used the following parameter ranges:

  • rotation: random with angle between 0° and 360° (uniform)
  • translation: random with shift between -4 and 4 pixels (relative to the original image size of 424x424) in the x and y direction (uniform)
  • zoom: random with scale factor between 1/1.3 and 1.3 (log-uniform)
  • flip: yes or no (Bernoulli)
Because both the initial downsampling to 69x69 and the random perturbation are affine transforms, they could be combined into one affine transformation step (I used scikit-image for this). This sped things up significantly and reduced information loss.
In short: all four random transformations and the downsampling are folded into a single affine transform that produces the final 69x69 image.
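A sketch of this combined warp (composition order follows scikit-image's `+` operator, which applies the left transform first; expressing the flip as a negative scale factor is a choice made for this sketch):

```python
import numpy as np
from skimage.transform import AffineTransform, SimilarityTransform, warp

def random_warp(rng):
    # Maps output (69x69) coordinates back to input (424x424) coordinates:
    # 3x scale plus the 108-pixel offset reproduces the centre crop.
    ds = AffineTransform(scale=(3, 3), translation=(108, 108))
    # Random perturbation, applied about the input image centre.
    zoom = np.exp(rng.uniform(np.log(1 / 1.3), np.log(1.3)))   # log-uniform
    sx = -zoom if rng.random() < 0.5 else zoom                 # optional flip
    perturb = AffineTransform(scale=(sx, zoom),
                              rotation=rng.uniform(0, 2 * np.pi),
                              translation=rng.uniform(-4, 4, size=2))
    center = SimilarityTransform(translation=(-211.5, -211.5))
    uncenter = SimilarityTransform(translation=(211.5, 211.5))
    return ds + center + perturb + uncenter

rng = np.random.default_rng(0)
# img: 424x424x3 float image; a single warp yields the augmented 69x69 input
img69 = warp(img, random_warp(rng), output_shape=(69, 69))
```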

Colour perturbation

The colour of the images was changed as described in Krizhevsky et al. 2012: a PCA is computed over the RGB values of the training-set pixels, and a random multiple of each principal component, scaled by the corresponding eigenvalue, is added to every image. Here the first component had a much larger eigenvalue than the other two, i.e. one colour direction dominates, and the standard deviation of the random scale factor alpha was set to 0.5 (versus 0.1 in the original ImageNet paper).
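A NumPy sketch of this Krizhevsky-style scheme (the PCA is fit once over a sample of training-set pixels; `sigma=0.5` is the value given above):

```python
import numpy as np

def fit_rgb_pca(pixels):
    """pixels: (N, 3) array of RGB values sampled from the training set."""
    evals, evecs = np.linalg.eigh(np.cov(pixels, rowvar=False))
    return evals, evecs

def colour_perturb(img, evals, evecs, rng, sigma=0.5):
    # One random scale factor per component, drawn per presentation;
    # the resulting RGB offset is added to every pixel of the image.
    alpha = rng.normal(0.0, sigma, size=3)
    offset = evecs @ (alpha * evals)
    return img + offset            # img: HxWx3 float; offset broadcasts
```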



Network architecture

The model has 7 layers: 4 convolutional layers and 3 dense layers. All convolutional layers include a ReLU nonlinearity (i.e. f(x) = max(x, 0)). The first, second and fourth convolutional layers are followed by 2x2 max-pooling.
To increase parameter sharing, the convolutional part of the network is applied to 16 different parts of the input image (obtained by rotating, flipping and cropping it), and the resulting feature vectors are concatenated before the dense layers.
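A hedged sketch of this shared convolutional part (the filter counts, filter sizes and 45x45 part size are assumptions for illustration; batching the 16 parts along the batch axis is one simple way to share the same filters):

```python
import torch
import torch.nn as nn

# Illustrative 4-conv-layer stack; filter counts/sizes are assumptions.
conv_part = nn.Sequential(
    nn.Conv2d(3, 32, 6), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3), nn.ReLU(),
    nn.Conv2d(128, 128, 3), nn.ReLU(), nn.MaxPool2d(2),
)

def shared_features(parts):
    """parts: (batch, 16, 3, 45, 45), the 16 parts of each input image.
    The same filters process every part; the outputs are concatenated."""
    b, v = parts.shape[:2]
    feats = conv_part(parts.reshape(b * v, *parts.shape[2:]))
    return feats.reshape(b, -1)    # one flat feature vector per image
```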

The dense part consists of two maxout layers with 2048 units (Goodfellow et al. 2013), both of which take the maximum over pairs of linear filters (so 4096 linear filters in total). 

Using maxout here instead of regular dense layers with ReLUs helped to reduce overfitting a lot, compared to dense layers with 4096 linear filters. Using maxout in the convolutional part of the network as well proved too computationally intensive.
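Most frameworks have no built-in maxout layer, but it is easy to express: compute `pieces` linear filters per output unit and take their elementwise maximum. A sketch (the 8192-dimensional input is taken from the convolutional sketch above and is an assumption):

```python
import torch
import torch.nn as nn

class Maxout(nn.Module):
    """Dense maxout layer (Goodfellow et al. 2013): the maximum over
    `pieces` linear filters per output unit."""
    def __init__(self, in_features, out_features, pieces=2):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features * pieces)
        self.out_features, self.pieces = out_features, pieces

    def forward(self, x):
        y = self.linear(x)                             # (batch, out * pieces)
        y = y.view(-1, self.out_features, self.pieces)
        return y.max(dim=2).values                     # (batch, out)

# Dense part: two 2048-unit maxout layers (2 pieces each, i.e. 4096 linear
# filters per layer) followed by the 37-way output layer.
dense_part = nn.Sequential(
    nn.Dropout(0.5), Maxout(8192, 2048),
    nn.Dropout(0.5), Maxout(2048, 2048),
    nn.Dropout(0.5), nn.Linear(2048, 37),
)
```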

