Training MobileNetv2-SSDLite on your own dataset (installing caffe-ssd)

First, make sure CUDA and cuDNN are already installed and configured.

1. Create a virtual environment:

conda create -n py2ssd python=2.7 # my own environment is named fdpy2; use whatever name you like

2. Install the base libraries:

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler

sudo apt-get install --no-install-recommends libboost-all-dev

sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev

sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

sudo apt-get install git cmake build-essential

3. Install protoc. Caffe requires a protoc version no higher than 2.6.1.

Reference: https://blog.csdn.net/lwplwf/article/details/76532804

 

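First, run whereis protoc to see which paths have a protoc installed.

which protoc shows the path of the protoc that is picked up by default.

protoc --version shows the version of the current protoc.

The system default protoc lives at /usr/bin/protoc.

If the version does not meet the requirement, install protoc 2.6.1.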
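The version check above can also be scripted. The following is a minimal sketch rather than part of the original procedure: the helper names (parse_protoc_version, protoc_is_acceptable, installed_protoc_version) are made up for illustration, and the 2.6.1 ceiling is simply the limit stated in this article.

```python
import re
import subprocess

# Upper bound taken from this article; adjust if your caffe fork differs.
MAX_PROTOC = (2, 6, 1)


def parse_protoc_version(output):
    """Turn text such as 'libprotoc 2.6.1' into (2, 6, 1); None if unrecognized."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def protoc_is_acceptable(version, maximum=MAX_PROTOC):
    """True when a parsed version tuple is at or below the allowed maximum."""
    return version is not None and version <= maximum


def installed_protoc_version(protoc="protoc"):
    """Run `protoc --version` and parse it; None if protoc cannot be run."""
    try:
        raw = subprocess.check_output([protoc, "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_protoc_version(raw.decode("utf-8", "replace"))
```

Calling protoc_is_acceptable(installed_protoc_version()) checks whichever protoc is first on PATH; pass an explicit path such as "/usr/local/bin/protoc" to check a specific install.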

(1) Download

https://github.com/google/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz

(2) Install

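tar -zxvf protobuf-2.6.1.tar.gz # unpack

cd protobuf-2.6.1/ # enter the source directory

./configure # configure the build (protoc is installed to /usr/local/bin by default; if you need several protoc versions side by side, you can choose a custom install location instead)

make # compile

make check # run the self-tests on the build

sudo make install # install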

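(3) Check the installed version

protoc --version

On success it prints: libprotoc 2.6.1

If you get an error, or the old version number is still shown, the cause is that protobuf installs its libraries to /usr/local/lib by default, and /usr/local/lib is not on Ubuntu's default LD_LIBRARY_PATH, so the library cannot be found.

To fix it, open ~/.bashrc (gedit ~/.bashrc), add the following line, then reload the file: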

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

source ~/.bashrc

protoc is now installed.

 

4. Install caffe-ssd

Download caffe-ssd from this link (it already includes the ReLU6 layer): https://github.com/chuanqi305/ssd

cd caffe-ssd

sudo cp Makefile.config.example Makefile.config

sudo gedit Makefile.config

4-1:

Edit Makefile.config as follows:

(1) USE_CUDNN := 1

(2) USE_OPENCV := 1

(3) OPENCV_VERSION := 3

(4) CUDA_DIR := /usr/local/cuda

(5) For CUDA 9.0 and above, remove the compute capability 20 and 21 entries, leaving:

CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
              -gencode arch=compute_35,code=sm_35 \
              -gencode arch=compute_50,code=sm_50 \
              -gencode arch=compute_52,code=sm_52 \
              -gencode arch=compute_60,code=sm_60 \
              -gencode arch=compute_61,code=sm_61 \
              -gencode arch=compute_61,code=compute_61
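If the list has to be regenerated for a different set of GPUs, a small helper can print the block. This is an illustrative sketch only: cuda_arch_flags is a hypothetical name, and it follows the pattern of the listing above (one sm_XX entry per capability, plus a final compute_XX entry that embeds PTX for the newest one).

```python
def cuda_arch_flags(capabilities):
    """Build a CUDA_ARCH assignment for Makefile.config.

    `capabilities` is a list of compute capabilities written without the dot,
    for example ["30", "35", "61"].
    """
    if not capabilities:
        raise ValueError("need at least one compute capability")
    # One binary (SASS) target per requested capability.
    entries = [
        "-gencode arch=compute_{0},code=sm_{0}".format(cc) for cc in capabilities
    ]
    # PTX for the newest capability, so later GPUs can JIT-compile it.
    newest = max(capabilities, key=int)
    entries.append("-gencode arch=compute_{0},code=compute_{0}".format(newest))
    # Backslash continuations, with no blank lines between the entries.
    return "CUDA_ARCH := " + " \\\n\t\t".join(entries)


if __name__ == "__main__":
    print(cuda_arch_flags(["30", "35", "50", "52", "60", "61"]))
```

The continuation lines must stay adjacent in Makefile.config; a blank line after a trailing backslash ends the assignment early.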

(6) Note: these paths point at the Python of the virtual environment created at the start (py2ssd), not the system-wide Python.

ANACONDA_HOME := $(HOME)/anaconda2/envs/py2ssd

PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                $(ANACONDA_HOME)/include/python2.7 \
                $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include

If the virtual environment uses Python 3.6, item (6) becomes:

ANACONDA_HOME := $(HOME)/anaconda2/envs/py3

PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                $(ANACONDA_HOME)/include/python3.6m \
                $(ANACONDA_HOME)/lib/python3.6/site-packages/numpy/core/include

In that case also uncomment the PYTHON_LIBRARIES line and set it to PYTHON_LIBRARIES := boost_python-py35 python3.6m (my system only has boost_python-py35, not boost_python-py36). Important: find libpython3.6m.so in the virtual environment and copy it to /usr/lib/x86_64-linux-gnu, then confirm that libboost_python-py35.so exists under /usr/lib/x86_64-linux-gnu. If there is no file with the py35 suffix but another py3* file exists, such as libboost_python-py34.so, set PYTHON_LIBRARIES in Makefile.config to boost_python-py34 python3.6m instead.
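The suffix lookup described above can be sketched in a few lines. The function names below (pick_boost_python, python_libraries_line) are hypothetical, and the sketch assumes the libboost_python-py3X.so naming used in this article; other distributions may name the library differently.

```python
import re


def pick_boost_python(filenames, preferred="35"):
    """Choose a boost_python-py3X library name from a directory listing.

    Returns e.g. 'boost_python-py35', or None if no Python 3 variant is present.
    """
    found = set()
    for name in filenames:
        match = re.match(r"libboost_python-py(3\d)\.so$", name)
        if match:
            found.add(match.group(1))
    if not found:
        return None
    # Prefer the suffix used in the article, else fall back to the newest one.
    suffix = preferred if preferred in found else max(found)
    return "boost_python-py" + suffix


def python_libraries_line(filenames, python_lib="python3.6m"):
    """Format the PYTHON_LIBRARIES line for Makefile.config, or None."""
    boost = pick_boost_python(filenames)
    if boost is None:
        return None
    return "PYTHON_LIBRARIES := {} {}".format(boost, python_lib)
```

For a real system, pass os.listdir("/usr/lib/x86_64-linux-gnu") as filenames.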

(7) PYTHON_LIB := $(ANACONDA_HOME)/lib

(8) WITH_PYTHON_LAYER := 1  

(9) Replace the following two lines:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

with:

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

(10) BUILD_DIR := build

(11) TEST_GPUID := 0

(12) Q ?= @

4-2:

Edit the Makefile:

sudo gedit Makefile

(1) Change this line:

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5

to:

LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial

(2) Specify the protoc path explicitly. Change:

$(Q)protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<

to:

$(Q)/usr/local/bin/protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<

(3) Change this line:

#NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)

to:

NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)

Note: if you are configuring caffe in a Python 3 virtual environment, also change PYTHON_LIBRARY := boost_python python2.7 (around line 215 of the Makefile) to PYTHON_LIBRARY := boost_python python35 (this was with Python 3.6).

5. Build caffe

make all -j4

make test -j4

make runtest -j4

make pycaffe

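6. import caffe (add environment variables)

If you get ImportError: No module named caffe, caffe's python directory has to be added to the environment. Open ~/.bashrc (vim ~/.bashrc) and append these two lines at the end, replacing the paths with your own:

export PYTHONPATH="/home/lz/Documents/caffe-ssd/python:$PYTHONPATH"

export LD_LIBRARY_PATH=/home/lz/Documents/caffe-ssd/build/lib:$LD_LIBRARY_PATH

Then run source ~/.bashrc. Without this, you can only import caffe when Python is started from that directory.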
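As a quick sanity check, the two export lines can be generated for your own checkout and the import tested from Python. This is a sketch under assumptions: caffe_env_lines and caffe_is_importable are hypothetical helper names, and the python and build/lib subdirectories are the ones used in the paths above.

```python
import posixpath


def caffe_env_lines(caffe_root):
    """Return the two ~/.bashrc export lines for a caffe checkout at caffe_root."""
    python_dir = posixpath.join(caffe_root, "python")
    lib_dir = posixpath.join(caffe_root, "build", "lib")
    return [
        'export PYTHONPATH="{}:$PYTHONPATH"'.format(python_dir),
        'export LD_LIBRARY_PATH="{}:$LD_LIBRARY_PATH"'.format(lib_dir),
    ]


def caffe_is_importable():
    """True if `import caffe` succeeds in the current interpreter."""
    try:
        import caffe  # noqa: F401  (only testing that the import works)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    for line in caffe_env_lines("/home/lz/Documents/caffe-ssd"):
        print(line)
    print("caffe importable:", caffe_is_importable())
```

Run it with the interpreter of the environment caffe was built against, since pycaffe only loads in a matching Python version.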

 

Common errors:

 

1. Error: nvcc fatal : Unknown option 'fPIC'

Fix:

NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)

The line I had written was missing a space before -Xcompiler.

 

2. Error: undefined reference to `PyErr_Print', with Py_NoneStruct and similar symbols also undefined. The following change fixes it.

Fix: PYTHON_LIBRARIES := boost_python-py35 python3.6m (my Python is 3.6)

 

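3. Error: F0104 17:15:55.187031 27536 math_functions.cu:42] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0)  CUBLAS_STATUS_EXECUTION_FAILED

Fix: this error usually shows up during make runtest. Stop the processes running in other terminals, run make clean, and rebuild; the tests should then pass.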

 

4. Error:

.build_release/lib/libcaffe.so: undefined reference to `boost::cpp_regex_traits<char>::toi(char const*&, char const*, int) const'

.build_release/lib/libcaffe.so: undefined reference to `boost::re_detail::get_default_error_string(boost::regex_constants::error_type)'

.build_release/lib/libcaffe.so: undefined reference to `boost::re_detail::cpp_regex_traits_implementation<char>::transform(char const*, char const*) const'

.build_release/lib/libcaffe.so: undefined reference to `boost::re_detail::put_mem_block(void*)'

.build_release/lib/libcaffe.so: undefined reference to `boost::basic_regex<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::do_assign(char const*, char const*, unsigned int)'

.build_release/lib/libcaffe.so: undefined reference to `boost::re_detail::raise_runtime_error(std::runtime_error const&)'

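Fix: this happens when cross-compiling caffe and the linker reports undefined reference to `boost::xxxxxx' for boost functions.

Edit the Makefile and append the boost libraries that provide the missing boost::xxxxxx symbols to LIBRARIES. Here that means adding boost_regex: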

# We will also explicitly add stdc++ to the link target.

LIBRARIES +=  boost_regex boost_atomic boost_thread stdc++

 

5. Error:

CXX/LD -o .build_release/test/test_all.testbin src/caffe/test/test_caffe_main.cpp

.build_release/cuda/src/caffe/test/test_im2col_kernel.o: In function `caffe::Im2colKernelTest_Test2D_Test::TestBody()':

tmpxft_000065d0_00000000-5_test_im2col_kernel.compute_61.cudafe1.cpp:(.text._ZN5caffe28Im2colKernelTest_Test2D_TestIdE8TestBodyEv[_ZN5caffe28Im2colKernelTest_Test2D_TestIdE8TestBodyEv]+0xd1f): undefined reference to `void caffe::im2col_gpu_kernel(int, double const*, int, int, int, int, int, int, int, int, int, int, int, int, double*)'

.build_release/cuda/src/caffe/test/test_im2col_kernel.o: In function `caffe::Im2colKernelTest_Test2D_Test::TestBody()':

Fix: the answer is in https://github.com/BVLC/caffe/issues/6790

 

6. When running sudo apt-get install libboost-all-dev:

Error:

The following packages have unmet dependencies:

 libboost-all-dev : Depends: libboost-iostreams-dev but it is not going to be installed

                    Depends: libboost-python-dev but it is not going to be installed

                    Depends: libboost-regex-dev but it is not going to be installed

E: Unable to correct problems, you have held broken packages.

Fix:

Check for held packages:

$ dpkg --get-selections | grep hold

If there are none, install with aptitude instead:

$ sudo apt-get install aptitude

$ sudo aptitude install libboost-all-dev

Alternatively, switch to a different package mirror.

 

7. Error:

[libprotobuf FATAL google/protobuf/stubs/common.cc:78] This program was compiled against version 2.6.1 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.0.0).  Contact the program author for an update.  If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library.  (Version verification failed in "/build/mir-O8_xaj/mir-0.26.3+16.04.20170605/obj-x86_64-linux-gnu/src/protobuf/mir_protobuf.pb.cc".)

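Fix:

This is a protoc version mismatch or conflict. Installing protoc is covered at the start of this article; see also: https://blog.csdn.net/zhou4411781/article/details/100676193

8. Error: Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR

Fix: switch to different CUDA and cuDNN versions. Moving from 9.0 to 9.2 solved it for me.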

 

9. Error: src/caffe/data_transformer.cpp:2:33: fatal error: opencv2/core/core.hpp: No such file or directory

compilation terminated.

Makefile:580: recipe for target '.build_release/src/caffe/data_transformer.o' failed

make: *** [.build_release/src/caffe/data_transformer.o] Error 1

make: *** Waiting for unfinished jobs....

Fix: sudo apt-get install libopencv-dev

 

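10. Error: cannot find -lopencv_imgcodecs

Fix: imgcodecs ships with OpenCV 3, and my OpenCV is version 3, so I do not know why this fails. As a workaround, keeping USE_OPENCV := 1 enabled in Makefile.config while commenting out OPENCV_VERSION := 3 made the error go away.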

 

11. Error:

.build_release/tools/caffe: error while loading shared libraries: libcudart.so.9.0: cannot open shared object file: No such file or directory

Fix: sudo cp /usr/local/cuda/lib64/libcudart.so.9.0 /usr/local/lib/libcudart.so.9.0 && sudo ldconfig (the likely cause is that the CUDA environment variables were not added after CUDA was installed)
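To see whether the CUDA runtime can be located after running ldconfig, Python's ctypes.util.find_library gives a rough check. This is only a sketch: missing_libraries is a hypothetical helper, and find_library searches in a way that is similar to, but not identical with, the runtime loader, so treat the result as a hint rather than proof.

```python
import ctypes.util


def missing_libraries(names):
    """Return the library names (without 'lib' prefix or '.so') that cannot be found."""
    return [name for name in names if ctypes.util.find_library(name) is None]


if __name__ == "__main__":
    # "cudart" corresponds to libcudart.so.* from the error message above.
    missing = missing_libraries(["cudart"])
    if missing:
        print("not found:", ", ".join(missing))
    else:
        print("all libraries found")
```

If cudart is still reported as not found, check that the directory holding libcudart.so.9.0 is covered by the linker cache or by LD_LIBRARY_PATH.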

 
