Caffe example: training and testing on MNIST handwritten digits (Windows + CPU only)

1. Original English tutorial: Training LeNet on MNIST with Caffe

Environment: Windows, running on the CPU only.

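2. Program preparation

2.1 Preparing Caffe

Download both the Prebuilt Release and the source code from https://github.com/BVLC/caffe/tree/windows. After extracting them, copy the files in the bin directory of the Prebuilt Release into the downloaded Caffe source directory. The copy only makes the later commands more convenient, so it is optional.

The executables needed here, caffe.exe and convert_mnist_data.exe, ship in the Prebuilt Release, while the examples directory is in the source code, so both downloads are required.

2.2 Installing Cygwin

Install Cygwin (the Windows version) in order to run the .sh scripts.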

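3. Data preparation

3.1 Training data

Download the data from http://yann.lecun.com/exdb/mnist/ and extract it. Extraction may silently change the file names, so rename the files by hand until they match the following exactly:

train-images-idx3-ubyte: training set images (9912422 bytes compressed)

train-labels-idx1-ubyte: training set labels (28881 bytes compressed)

t10k-images-idx3-ubyte: test set images (1648877 bytes compressed)

t10k-labels-idx1-ubyte: test set labels (4542 bytes compressed)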
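Because a wrong file name or a half-extracted file only shows up later as a conversion error, it can help to decompress the archives to explicit names and check their headers first. The following is a minimal sketch, not part of the original tutorial: the helper names gunzip_to, read_idx_header and header_ok are hypothetical, and the expected values assume the standard MNIST files (IDX magic numbers 2051 for images and 2049 for labels, 60000 training and 10000 test items of 28x28 pixels).

```python
import gzip
import shutil
import struct

# Expected header for each file: (magic number, dimensions).
# 2051 marks an image file and 2049 a label file in the IDX format.
EXPECTED = {
    "train-images-idx3-ubyte": (2051, (60000, 28, 28)),
    "train-labels-idx1-ubyte": (2049, (60000,)),
    "t10k-images-idx3-ubyte": (2051, (10000, 28, 28)),
    "t10k-labels-idx1-ubyte": (2049, (10000,)),
}


def gunzip_to(src_gz, dst):
    """Decompress src_gz into dst, so the output name is chosen explicitly."""
    with gzip.open(src_gz, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def read_idx_header(path):
    """Return (magic, dims) read from the big-endian header of an IDX file."""
    with open(path, "rb") as f:
        magic = struct.unpack(">I", f.read(4))[0]
        ndim = magic & 0xFF  # the low byte stores the number of dimensions
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims


def header_ok(path, name):
    """True if the file at path has the header expected for MNIST file `name`."""
    return read_idx_header(path) == EXPECTED[name]
```

For example, gunzip_to("train-images-idx3-ubyte.gz", "data/mnist/train-images-idx3-ubyte") followed by header_ok on the result would confirm both the name and the content before create_mnist.sh is run; the data/mnist path is the one the script below uses.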

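3.2 create_mnist.sh

create_mnist.sh converts the downloaded data into lmdb format (the reasons for the conversion are easy to find online). Before running it, edit examples\mnist\create_mnist.sh, mainly to set the input and output data directories. The modified script is as follows: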

#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
set -e

EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=bin   # directory that contains convert_mnist_data.exe

BACKEND="lmdb"

echo "Creating ${BACKEND}..."

rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}

# .bin changed to .exe (the prebuilt release ships .exe files, not .bin files)
$BUILD/convert_mnist_data.exe $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.exe $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}

echo "Done."

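3.3 Generating the lmdb databases

In Cygwin, run bash examples/mnist/create_mnist.sh from the Caffe directory (the script uses relative paths). The lmdb databases are created under examples/mnist/.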
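The script boils down to two calls of convert_mnist_data.exe, one per data split. As an illustration only, the same two command lines can be assembled in Python; this is a sketch under the assumption that the directories are the ones set in the script above, and build_convert_cmd and convert_all are hypothetical helper names.

```python
import subprocess


def build_convert_cmd(split, build="bin", data="data/mnist",
                      example="examples/mnist", backend="lmdb"):
    """Return the argument list for one convert_mnist_data.exe call.

    split is "train" or "t10k", matching the prefixes of the MNIST files.
    """
    out_name = "mnist_train_" if split == "train" else "mnist_test_"
    return [
        build + "/convert_mnist_data.exe",
        "%s/%s-images-idx3-ubyte" % (data, split),
        "%s/%s-labels-idx1-ubyte" % (data, split),
        example + "/" + out_name + backend,
        "--backend=" + backend,
    ]


def convert_all(run=subprocess.check_call):
    """Run both conversions; `run` is injectable so the sketch can be dry-run."""
    for split in ("train", "t10k"):
        run(build_convert_cmd(split))
```

Calling convert_all(run=print) only prints the two argument lists, which is a cheap way to check the paths. Unlike the shell script, this sketch does not delete existing output directories first.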

4. Training

Training is done with train_lenet.sh.

4.1 train_lenet.sh

#!/usr/bin/env sh
set -e
./bin/caffe train --solver=examples/mnist/lenet_solver.prototxt $@

First confirm where your caffe.exe is located, and adjust the path in the script if it is not under ./bin.
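Locating the executable can be done by hand, but a small helper makes the check repeatable. This is a minimal sketch; find_caffe is a hypothetical name and the candidate directories in the usage note are assumptions about where the Prebuilt Release was extracted.

```python
import os


def find_caffe(candidate_dirs, exe_name="caffe.exe"):
    """Return the first existing path to exe_name in candidate_dirs, else None."""
    for d in candidate_dirs:
        path = os.path.join(d, exe_name)
        if os.path.isfile(path):
            return path
    return None
```

For instance, find_caffe(["bin", "."]) returns the path to use in train_lenet.sh if caffe.exe sits in one of those two directories, and None otherwise.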

4.2 Running train_lenet.sh

In Cygwin, run

bash examples/mnist/train_lenet.sh

Training produces four files.

lenet_iter_5000.caffemodel

lenet_iter_5000.solverstate

lenet_iter_10000.caffemodel

lenet_iter_10000.solverstate
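The four names follow a fixed pattern: a prefix, the iteration number, and one of two extensions. The sketch below reproduces them from three values. The values 5000 and 10000 are inferred from the file names listed above, and the function assumes that the total iteration count is a multiple of the snapshot interval; expected_snapshots is a hypothetical helper, not a Caffe function.

```python
def expected_snapshots(prefix, snapshot, max_iter):
    """List the snapshot files written every `snapshot` iterations."""
    names = []
    for it in range(snapshot, max_iter + 1, snapshot):
        # Each snapshot consists of the weights and the solver state.
        names.append("%s_iter_%d.caffemodel" % (prefix, it))
        names.append("%s_iter_%d.solverstate" % (prefix, it))
    return names
```

The .caffemodel files hold the trained weights used for testing in the next section, and the .solverstate files hold what is needed to resume training.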

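5. Testing an image

The trained model can be used to determine which class an existing image belongs to. See the blog post on testing a single mnist image with the lenet_iter_10000.caffemodel trained by Caffe.

5.1 Preparing the test parameters and output files

5.1.1 The deploy.prototxt file

Testing a single image with a trained caffemodel requires a deploy.prototxt file that specifies the network structure. deploy.prototxt is in fact similar to lenet_train_test.prototxt; only the beginning and the end differ. Following the tutorial at http://www.cnblogs.com/denny402/p/5685818.html, a deploy.py script can be used to generate deploy.prototxt.

A file generated by someone else works as well. Make sure deploy.prototxt specifies the correct number of channels for the image.

The resulting deploy.prototxt can be used directly: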

name: "LeNet"
# The original two data layers (training and testing):
# layer {
#   name: "mnist"
#   type: "Data"
#   top: "data"
#   top: "label"
#   include {
#     phase: TRAIN
#   }
#   transform_param {
#     scale: 0.00390625
#   }
#   data_param {
#     source: "examples/mnist/mnist_train_lmdb"
#     batch_size: 64
#     backend: LMDB
#   }
# }
# layer {
#   name: "mnist"
#   type: "Data"
#   top: "data"
#   top: "label"
#   include {
#     phase: TEST
#   }
#   transform_param {
#     scale: 0.00390625
#   }
#   data_param {
#     source: "examples/mnist/mnist_test_lmdb"
#     batch_size: 100
#     backend: LMDB
#   }
# }

# are replaced by the following:
 
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}

# The weight learning rate, bias learning rate and bias initialization of the
# convolution and fully connected layers are removed, because the trained
# values are already provided by the caffemodel file.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
}

# The accuracy layer of the original test phase has been removed.

# The output layer type changes from SoftmaxWithLoss to Softmax: training
# outputs the loss, deployment outputs prob.
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

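Make sure deploy.prototxt specifies the correct number of channels for the image: it is the second dim of the Input layer's shape, which is 1 here because the MNIST images are grayscale.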
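To see how the 1x1x28x28 input turns into ten class probabilities, the blob shapes can be worked out layer by layer. The sketch below does that arithmetic for the layers listed in deploy.prototxt. It assumes zero padding, which is the case when no pad is given, and pooling output sizes that are rounded up. The function names are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer (rounded up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


def lenet_shapes(side=28, channels=1):
    """Return (blob name, shape) pairs for a single input image."""
    shapes = [("data", (1, channels, side, side))]
    side = conv_out(side, kernel=5)           # conv1: 20 filters of 5x5
    shapes.append(("conv1", (1, 20, side, side)))
    side = pool_out(side, kernel=2, stride=2)  # pool1: 2x2 max pooling
    shapes.append(("pool1", (1, 20, side, side)))
    side = conv_out(side, kernel=5)           # conv2: 50 filters of 5x5
    shapes.append(("conv2", (1, 50, side, side)))
    side = pool_out(side, kernel=2, stride=2)  # pool2: 2x2 max pooling
    shapes.append(("pool2", (1, 50, side, side)))
    shapes.append(("ip1", (1, 500)))           # fully connected, 500 outputs
    shapes.append(("ip2", (1, 10)))            # fully connected, 10 classes
    shapes.append(("prob", (1, 10)))           # softmax keeps the shape
    return shapes
```

The spatial size goes 28, 24, 12, 8, 4, so ip1 receives 50 x 4 x 4 = 800 values per image. Changing the channel count only changes the data blob, which is why a mismatch with the trained conv1 weights shows up as soon as the model is loaded.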

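5.1.2 Preparing a mean file

The caffe.Classifier interface used by classify.py takes the mean file of the training images as an input parameter, but no mean file was computed when LeNet-5 was trained, so an all-zero mean file is created here instead. Write a file zeronp.py as follows: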

import numpy as np

# All-zero mean image: 28 x 28 pixels, one channel.
zeros = np.zeros((28, 28, 1), dtype=np.float32)
np.save('meanfile.npy', zeros)

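Run python zeronp.py

This generates the mean file meanfile.npy. Its width and height must match those of the test image. Reference: https://github.com/BVLC/caffe/issues/320

5.2 Preparing the Python classification script classify.py

Modify classify.py (edit the original file and save it as classifymnist.py). Make sure the paths in the modified file are correct relative to the directory you run it from.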

#!/usr/bin/env python
"""
classify.py is an out-of-the-box image classifier callable from the command line.

By default it configures and runs the Caffe reference ImageNet model.
"""
import numpy as np
import os
import sys
import argparse
import glob
import time
import pandas as pd  # data-analysis package, used to read the labels file

import caffe
 
def main(argv):
    pycaffe_dir = os.path.dirname(__file__)

    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                "deploy.prototxt"),  # location of the LeNet-5 deploy.prototxt
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                "lenet_iter_10000.caffemodel"),  # location of the LeNet-5 caffemodel
        help="Trained model weights file."
    )
    # ---- added arguments: begin ----
    parser.add_argument(
        "--labels_file",
        default=os.path.join(pycaffe_dir,
                "synset_words.txt"),  # file mapping output indices to class names
        help="mnist result words file"
    )
    parser.add_argument(
        "--force_grayscale",
        action='store_true',  # force the input image to grayscale, since LeNet-5 was trained on grayscale images
        help="Converts RGB images down to single-channel grayscale versions, " +
             "useful for single-channel networks like MNIST."
    )
    parser.add_argument(
        "--print_results",
        action='store_true',  # print the results when this flag is given
        help="Write output text to stdout rather than serializing to a file."
    )
    # ---- added arguments: end ----
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='28,28',  # image height and width
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'meanfile.npy'),  # the mean file
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    args = parser.parse_args()
 
    image_dims = [int(s) for s in args.images_dim.split(',')]

    # With --force_grayscale, neither the mean file nor the channel swap is used.
    mean, channel_swap = None, None
    if not args.force_grayscale:
        if args.mean_file:
            mean = np.load(args.mean_file).mean(1).mean(1)
        if args.channel_swap:
            channel_swap = [int(s) for s in args.channel_swap.split(',')]

    if args.gpu:
        caffe.set_mode_gpu()
        print("GPU mode")
    else:
        caffe.set_mode_cpu()
        print("CPU mode")

    # Make classifier.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
            image_dims=image_dims, mean=mean,
            input_scale=args.input_scale, raw_scale=args.raw_scale,
            channel_swap=channel_swap)
 
    # Load numpy array (.npy), directory glob (*.jpg), or image file.
    args.input_file = os.path.expanduser(args.input_file)
    if args.input_file.endswith('npy'):
        print("Loading file: %s" % args.input_file)
        inputs = np.load(args.input_file)
    elif os.path.isdir(args.input_file):
        print("Loading folder: %s" % args.input_file)
        inputs = [caffe.io.load_image(im_f)
                  for im_f in glob.glob(args.input_file + '/*.' + args.ext)]
    else:
        print("Loading image file: %s" % args.input_file)
        # Load the image as grayscale when --force_grayscale is given.
        inputs = [caffe.io.load_image(args.input_file, not args.force_grayscale)]

    print("Classifying %d inputs." % len(inputs))

    # Classify.
    start = time.time()
    scores = classifier.predict(inputs, not args.center_only).flatten()
    print("Done in %.2f s." % (time.time() - start))
 
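    # ---- added: print the results to the terminal (begin) ----
    # The labels are mapped unconditionally so that `predictions` exists for
    # np.save below even when --print_results is not given.
    with open(args.labels_file) as f:
        labels_df = pd.DataFrame([
            {'synset_id': l.strip().split(' ')[0],
             'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]}
            for l in f.readlines()])
    # DataFrame.sort() was removed from pandas; sort_values() replaces it.
    labels = labels_df.sort_values(by='synset_id')['name'].values

    indices = (-scores).argsort()[:5]  # indices of the five highest scores
    predictions = labels[indices]

    if args.print_results:
        print(predictions)
        print(scores)

        meta = [(p, '%.5f' % scores[i]) for i, p in zip(indices, predictions)]
        print(meta)
    # ---- added: print the results to the terminal (end) ----

    # Save
    print("Saving results into %s" % args.output_file)
    np.save(args.output_file, predictions)


if __name__ == '__main__':
    main(sys.argv)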

Run python classifymnist.py

(numpy and pandas must be installed separately.)

5.3 Preparing the label file synset_words.txt

synset_words.txt is the label file that maps the output to class names:

0 Zero
1 One
2 Two
3 Three
4 Four
5 Five
6 Six
7 Seven
8 Eight
9 Nine
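Each line of this file holds an index and a name, and the classification script sorts the lines by index and picks the names of the highest scores. The following sketch mirrors that mapping in plain Python, without pandas or Caffe, so it can be tried on its own. load_labels and top_k are hypothetical names, and the scores in the usage note are made up for illustration.

```python
def load_labels(lines):
    """Parse 'index name' lines and return the names ordered by index."""
    pairs = []
    for line in lines:
        parts = line.strip().split(" ")
        if not parts[0]:
            continue  # skip blank lines
        # Keep only the text before the first comma, as the script does.
        name = " ".join(parts[1:]).split(",")[0]
        pairs.append((parts[0], name))
    pairs.sort(key=lambda p: p[0])
    return [name for _, name in pairs]


def top_k(scores, labels, k=5):
    """Return (label, formatted score) for the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return [(labels[i], "%.5f" % scores[i]) for i in order]
```

With the ten lines above as input and a score list whose largest entry is at position 7, top_k(scores, labels, 1) yields the name Seven with its score. The position in the network output is what selects the label, so the order of the file must match the class order used in training.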

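5.4 Preparing the test image

1.jpg is the 28x28 image to be recognized. It must show a white digit on a black background, like the MNIST training images; otherwise it will not be recognized.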
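If a test image was drawn as a dark digit on a light background, its pixel values need to be inverted before classification. The sketch below shows the idea on a flat list of 0 to 255 grayscale values, with no image library involved. The mean-brightness heuristic and both function names are assumptions made for illustration, not part of the original workflow.

```python
def needs_inversion(pixels, threshold=127):
    """Guess whether a grayscale image has a light background.

    pixels is a flat list of 0..255 values. The heuristic assumes that the
    background covers most of the image, so a high mean means a light one.
    """
    return sum(pixels) / float(len(pixels)) > threshold


def to_white_on_black(pixels):
    """Return the pixels inverted if the image looks dark-on-light."""
    if needs_inversion(pixels):
        return [255 - p for p in pixels]
    return list(pixels)
```

An image that already has a black background passes through unchanged, so the helper can be applied to every input. Very thick digits that fill most of the frame could fool the heuristic.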

5.5 Preparing a batch file

For convenience, write a batch file runtest.bat:

python classifymnist.py --print_results --force_grayscale --center_only --labels_file synset_words.txt 1.jpg resultsfile
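The options on this command line must be separated by spaces, since each one is matched against the argument parser individually. The sketch below rebuilds only the options used in runtest.bat with argparse, so the parsing can be tried without Caffe installed. build_parser and parse_command are hypothetical names, and the cut-down parser is an assumption that leaves out the script's other options.

```python
import argparse


def build_parser():
    """A cut-down parser with only the options used in runtest.bat."""
    parser = argparse.ArgumentParser()
    parser.add_argument("input_file")
    parser.add_argument("output_file")
    parser.add_argument("--labels_file", default="synset_words.txt")
    parser.add_argument("--force_grayscale", action="store_true")
    parser.add_argument("--print_results", action="store_true")
    parser.add_argument("--center_only", action="store_true")
    return parser


def parse_command(command):
    """Parse everything that follows 'python classifymnist.py'."""
    return build_parser().parse_args(command.split())
```

Parsing the options of the command above gives input_file 1.jpg, output_file resultsfile, and all three switches set. If two switches are fused into one token, argparse stops with an unrecognized-argument error instead of setting either of them.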


6. Output
