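Ubuntu 16.04: Anaconda3 + OpenCV 3.4.6 + Caffe + multi-GPU + CUDA 10.1
This guide assumes the NVIDIA driver, CUDA, and cuDNN are already installed.
My environment is Ubuntu 16.04 with NVIDIA RTX 2080 Ti cards (the newest at the time of writing), CUDA 10.1, and the cuDNN build for CUDA 10.1.
If any of these are missing, install them first; that is not covered here.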
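Before going further, it can help to confirm that the prerequisites are visible from the shell. The sketch below only checks whether the nvidia-smi and nvcc commands are on PATH. The check_tool helper is a name made up for this example; it does not verify versions or cuDNN.

```shell
# check_tool is a hypothetical helper for this guide, not part of CUDA or Caffe.
# It prints "<name>: found" if the command is on PATH, "<name>: missing" otherwise.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# nvidia-smi ships with the driver, nvcc with the CUDA toolkit.
for tool in nvidia-smi nvcc; do
    check_tool "$tool"
done
```

If nvcc is reported missing even though CUDA is installed, the toolkit's bin directory (commonly /usr/local/cuda/bin) may simply not be on PATH.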
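Installing Anaconda
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-4.2.0-Linux-x86_64.sh # install wget first if it is missing
bash Anaconda3-4.2.0-Linux-x86_64.sh # install Anaconda; answer yes at every prompt
For other problems, consult the relevant documentation. Adding the USTC mirror is recommended, since it is faster.
Downloading Caffe
sudo apt install git
git clone https://github.com/BVLC/caffe.git # fetch the Linux version of Caffe
The clone can be very slow. Alternatively, download the archive directly from the website and extract it.
Installing dependencies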
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y build-essential cmake git pkg-config
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install -y libatlas-base-dev
sudo apt-get install -y --no-install-recommends libboost-all-dev
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev
Run the commands below if you can. They only serve to head off some errors, although plenty more will still show up later.
conda install libgcc
conda install protobuf
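conda install -c menpo opencv3 # install opencv3
Installing OpenCV
The right version depends on your CUDA version; mine is CUDA 10.1.
OpenCV 3.2 never built for me, so I recommend opencv-3.4.6. After a long struggle, I can confirm that 3.4.6 installs without problems.
- Go to the official site, http://opencv.org/releases.html, choose the source package for version 3.4.6, and download opencv-3.4.6.zip
- Extract it wherever you want to build, enter the extracted opencv-3.4.6 directory on the command line, and run:
mkdir build # create the build directory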
cd build
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
# cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D FORCE_VTK=ON -D WITH_TBB=ON -D WITH_V4L=ON -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_CUBLAS=ON -D CUDA_NVCC_FLAGS="-D_FORCE_INLINES --expt-relaxed-constexpr" -D WITH_GDAL=ON -D WITH_XINE=ON -D BUILD_EXAMPLES=ON ..
make -j8 # use a higher number if you have more CPU cores
- Once the build succeeds, install:
sudo make install
- After installation, check the OpenCV version to verify that it succeeded:
pkg-config --modversion opencv
Modifying the Caffe files
Before that, add the following environment variables:
cd /home/user
gedit ~/.bashrc
Append the following two lines to the end of the file (amax is my username; replace it with your own):
export LD_LIBRARY_PATH=/home/amax/anaconda3/lib:$LD_LIBRARY_PATH
export PYTHONPATH=/home/amax/caffe/python:$PYTHONPATH
Then reload it:
source ~/.bashrc
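Editing ~/.bashrc by hand works, but rerunning a setup script can easily add the same export twice. As a minimal sketch, the hypothetical append_once helper below adds a line only if it is not already present. The example paths assume Anaconda lives in $HOME/anaconda3 and Caffe in $HOME/caffe, matching the layout used in this guide.

```shell
# append_once is a hypothetical helper: it appends a line to a file only if
# that exact line is not already there, so reruns do not duplicate entries.
# $1 = line to add, $2 = target file
append_once() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example usage (commented out so pasting the sketch has no side effects):
# append_once 'export LD_LIBRARY_PATH=$HOME/anaconda3/lib:$LD_LIBRARY_PATH' ~/.bashrc
# append_once 'export PYTHONPATH=$HOME/caffe/python:$PYTHONPATH' ~/.bashrc
```

The single quotes keep $HOME and $LD_LIBRARY_PATH unexpanded, so they are evaluated each time the shell starts rather than once at install time.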
Next, go to the caffe directory:
cd /home/user/caffe
cp Makefile.config.example Makefile.config
The official guide recommends kate for editing the make files, but gedit works too.
sudo apt-get install kate
I never got used to vi either, so I just use plain old gedit.
Next:
Editing Makefile.config
sudo gedit Makefile.config
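1. Find the following lines:
USE_CUDNN := 1
USE_OPENCV := 1
OPENCV_VERSION := 3
WITH_PYTHON_LAYER := 1
(This last one is around line 94; search for it with Ctrl+F.)
Uncomment all of them.
Edit lines 39-45: remove the compute_20 and compute_21 entries, leaving: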
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
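The RTX 2080 Ti is a Turing card with compute capability 7.5, which the list above does not name explicitly. The final code=compute_61 entry embeds PTX that the driver can JIT-compile for newer GPUs, so the list as given should still run. If you would rather build native code for the card, CUDA 10.1's nvcc accepts sm_75. The sketch below prints a CUDA_ARCH block from a list of capabilities; gencode_flags is a hypothetical helper, and adding 75 is my assumption rather than something the original setup used.

```shell
# gencode_flags is a hypothetical helper that prints a CUDA_ARCH block for
# Makefile.config from compute capabilities written without the dot
# (e.g. 61 for 6.1). The last capability is also emitted as PTX
# (code=compute_XX) so that newer GPUs can JIT-compile it.
gencode_flags() {
    printf 'CUDA_ARCH :='
    for cc in "$@"; do
        printf ' -gencode arch=compute_%s,code=sm_%s \\\n\t\t' "$cc" "$cc"
    done
    # After the loop, cc still holds the last capability in the list.
    printf ' -gencode arch=compute_%s,code=compute_%s\n' "$cc" "$cc"
}

# Assumption: 75 is added for the RTX 2080 Ti (compute capability 7.5).
gencode_flags 30 35 50 52 60 61 75
```

The printed block can be pasted over the CUDA_ARCH lines in Makefile.config.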
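2. Next, change the Python paths. Because Anaconda is used, comment out the original python2.7 include paths and switch to the Anaconda ones.
This is around lines 70-87. The /usr paths here are left as they are.
# PYTHON_INCLUDE := /usr/include/python2.7 \
# /usr/lib/python2.7/dist-packages/numpy/core/include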
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/anaconda3
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python3.7m \
$(ANACONDA_HOME)/lib/python3.7m/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
PYTHON_LIBRARIES := boost_python-py35 python3.5m
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
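Note on PYTHON_LIBRARIES: before using boost_python-py35, check that libboost_python-py35.so exists in /usr/lib/x86_64-linux-gnu.
The Python version named in PYTHON_INCLUDE (python3.7m above) and in PYTHON_LIBRARIES (python3.5m) should also match your actual Anaconda Python. Anaconda3-4.2.0 ships Python 3.5, so adjust whichever entry does not apply to your install.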
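To see which Boost.Python variants are actually installed, you can list them rather than guess the suffix. A minimal sketch follows; boost_python_names is a hypothetical helper that prints each name in the form PYTHON_LIBRARIES expects.

```shell
# boost_python_names is a hypothetical helper: for every libboost_python*.so
# in the given directory, it prints the library name without the "lib" prefix
# and ".so" suffix, i.e. the form used in PYTHON_LIBRARIES.
boost_python_names() {
    for f in "$1"/libboost_python*.so; do
        [ -e "$f" ] || continue      # glob matched nothing
        name=${f##*/}                # strip the directory
        name=${name#lib}             # strip the "lib" prefix
        printf '%s\n' "${name%.so}"  # strip the ".so" suffix
    done
}

boost_python_names /usr/lib/x86_64-linux-gnu
```

If boost_python-py35 is not in the output, use whichever Python 3 variant is listed and make the python3.Xm entry agree with it.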
3. Around lines 97-100, comment out the original INCLUDE_DIRS and LIBRARY_DIRS, replace them with the versions below, and add LINKFLAGS := -Wl,-rpath,$(ANACONDA_HOME)/lib
# Whatever else you find you need goes here.
# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
# Added line: embed the Anaconda lib directory as a runtime library search path
LINKFLAGS := -Wl,-rpath,$(ANACONDA_HOME)/lib
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
OK, Makefile.config is done. Save and close it.
4. Note for CUDA 10.1 users
From the caffe directory, run the following in a terminal:
find . -type f -exec sed -i -e 's^"hdf5.h"^"hdf5/serial/hdf5.h"^g' -e 's^"hdf5_hl.h"^"hdf5/serial/hdf5_hl.h"^g' '{}' \;
cd /usr/lib/x86_64-linux-gnu
sudo ln -s libhdf5_serial.so.10.1.0 libhdf5.so
sudo ln -s libhdf5_serial_hl.so.10.0.2 libhdf5_hl.so
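The version suffixes in the two ln commands above come from the installed HDF5 package, so they may differ on another machine. As a sketch, the hypothetical find_versioned helper below looks up the fully versioned file name instead of hard-coding it. It only prints the ln command it would run, so nothing is changed.

```shell
# find_versioned is a hypothetical helper: it prints the file name of the first
# fully versioned library matching <dir>/<stem>.so.X.Y.Z, so the symlink target
# does not have to be typed by hand. It returns 1 if nothing matches.
find_versioned() {
    for f in "$1/$2".so.*.*.*; do
        [ -e "$f" ] || continue
        printf '%s\n' "${f##*/}"
        return 0
    done
    return 1
}

libdir=/usr/lib/x86_64-linux-gnu
if target=$(find_versioned "$libdir" libhdf5_serial); then
    echo "would run in $libdir: sudo ln -s $target libhdf5.so"
else
    echo "no libhdf5_serial.so.X.Y.Z found in $libdir"
fi
```

The same call with libhdf5_serial_hl as the stem gives the target for libhdf5_hl.so.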
5. Go to the caffe/python directory and install the Python requirements:
cd /home/user/caffe/python
for req in $(cat requirements.txt); do pip install $req; done
If that fails, try:
for req in $(cat requirements.txt); do sudo -H pip install $req --upgrade; done
Editing the Makefile
Go back to the caffe directory:
cd ..
sudo gedit Makefile
Edit line 429:
1. Comment out NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
and replace it with
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
2. LIBRARIES
Edit around line 181.
Comment out
# LIBRARIES += glog gflags protobuf boost_system boost_filesystem m
and replace it with
LIBRARIES += glog gflags protobuf leveldb snappy \
	lmdb boost_system boost_filesystem hdf5_hl hdf5 m \
	opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs opencv_videoio
OK, save the file.
Editing CMakeLists.txt
At line 69, # ---[ Includes
add the following:
set(CMAKE_CXX_FLAGS "-D_FORCE_INLINES ${CMAKE_CXX_FLAGS}")
That completes the file changes.
Building
make all -j48
I have two CPUs, hence -j48; -j4 is more typical.
make test -j48
make runtest -j48
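Rather than hard-coding the job count, it can be derived from the machine. The sketch below uses nproc from GNU coreutils and falls back to 4 if it is unavailable. It prints the make commands instead of running them, so you can review them first.

```shell
# Pick a job count from the number of available cores; fall back to 4
# (the typical value mentioned above) if nproc is not available.
jobs_count=$(nproc 2>/dev/null || echo 4)

# Print the build commands with that job count instead of running them.
echo "make all -j${jobs_count}"
echo "make test -j${jobs_count}"
echo "make runtest -j${jobs_count}"
```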
This step may well fail. The errors I hit:
1.
[----------] Global test environment tear-down
[==========] 1096 tests from 150 test cases ran. (106016 ms total)
[ PASSED ] 1095 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] BenchmarkTest/1.TestTimerMilliSeconds, where TypeParam = caffe::CPUDevice<double>
1 FAILED TEST
Makefile:523: recipe for target 'runtest' failed
make: *** [runtest] Error 1
Because this is a multi-GPU machine, I set:
export MKL_CBWR=AUTO
2.
Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.8.16, library is 1.8.18
SUMMARY OF THE HDF5 CONFIGURATION
=================================
General Information:
-------------------
HDF5 Version: 1.8.18
Configured on: Thu Nov 16 20:07:06 UTC 2017
Configured by: root@0dbf0ee2-5a1e-455f-760a-510cd936b9c5
Configure mode: production
Host system: x86_64-conda_cos6-linux-gnu
Uname information: Linux 0dbf0ee2-5a1e-455f-760a-510cd936b9c5 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Byte sex: little-endian
Libraries: static, shared
Installation point: /home/amax/anaconda3
Compiling Options:
------------------
Compilation Mode: production
C Compiler: /home/amax/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc
CFLAGS: -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -I/home/amax/anaconda3/include
H5_CFLAGS: -std=c99 -pedantic -Wall -W -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -O -finline-functions
AM_CFLAGS:
CPPFLAGS: -D_FORTIFY_SOURCE=2 -O2
H5_CPPFLAGS: -D_GNU_SOURCE -D_POSIX_C_SOURCE=200112L -DNDEBUG -UH5_DEBUG_API
AM_CPPFLAGS: -I/home/amax/anaconda3/include
Shared C Library: yes
Static C Library: yes
Statically Linked Executables: no
LDFLAGS: -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,-rpath,/home/amax/anaconda3/lib -L/home/amax/anaconda3/lib
H5_LDFLAGS:
AM_LDFLAGS: -L/home/amax/anaconda3/lib
Extra libraries: -lrt -lpthread -lz -ldl -lm
Archiver: /home/amax/anaconda3/bin/x86_64-conda_cos6-linux-gnu-ar
Ranlib: /home/amax/anaconda3/bin/x86_64-conda_cos6-linux-gnu-ranlib
Debugged Packages:
API Tracing: no
Languages:
----------
Fortran: yes
Fortran Compiler: /home/amax/anaconda3/bin/x86_64-conda_cos6-linux-gnu-gfortran
Fortran 2003 Compiler: yes
Fortran Flags:
H5 Fortran Flags:
AM Fortran Flags:
Shared Fortran Library: yes
Static Fortran Library: yes
C++: yes
C++ Compiler: /home/amax/anaconda3/bin/x86_64-conda_cos6-linux-gnu-c++
C++ Flags: -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -pipe -I/home/amax/anaconda3/include
H5 C++ Flags:
AM C++ Flags:
Shared C++ Library: yes
Static C++ Library: yes
Features:
---------
Parallel HDF5: no
High Level library: yes
Threadsafety: yes
Default API Mapping: v18
With Deprecated Public Symbols: yes
I/O filters (external): deflate(zlib)
MPE: no
Direct VFD: no
dmalloc: no
Clear file buffers before write: yes
Using memory checker: no
Function Stack Tracing: no
Strict File Format Checks: no
Optimization Instrumentation: no
Bye...
*** Aborted at 1563351176 (unix time) try "date -d @1563351176" if you are using GNU date ***
PC: @ 0x7f9e5c2c2428 gsignal
*** SIGABRT (@0x3e8000037bd) received by PID 14269 (TID 0x7f9e65dd3440) from PID 14269; stack trace: ***
@ 0x7f9e5c668390 (unknown)
@ 0x7f9e5c2c2428 gsignal
@ 0x7f9e5c2c402a abort
@ 0x7f9e60caaf7c H5check_version
@ 0x4a9dc9 caffe::HDF5OutputLayerTest_TestForward_Test<>::TestBody()
@ 0x95b523 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x954b3a testing::Test::Run()
@ 0x954c88 testing::TestInfo::Run()
@ 0x954d65 testing::TestCase::Run()
@ 0x95603f testing::internal::UnitTestImpl::RunAllTests()
@ 0x956363 testing::UnitTest::Run()
@ 0x47129d main
@ 0x7f9e5c2ad830 __libc_start_main
@ 0x479249 _start
@ 0x0 (unknown)
Makefile:548: recipe for target 'runtest' failed
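make: *** [runtest] Aborted (core dumped)
This one is a real nuisance. The key line is
Headers are 1.8.16, library is 1.8.18
which is an HDF5 library and header mismatch: the header files and the library file come from different versions. To see which hdf5 package conda has installed: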
conda list
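The log suggests that the build picked up one set of HDF5 headers (1.8.16) while the test binary loaded the library installed under the Anaconda directory (1.8.18). To see which version a given header tree declares, you can read the H5_VERS_* macros in H5public.h. The sketch below does that; hdf5_header_version is a hypothetical helper, and the two paths checked are assumptions based on the directories used earlier in this guide.

```shell
# hdf5_header_version is a hypothetical helper: it reads the H5_VERS_MAJOR,
# H5_VERS_MINOR and H5_VERS_RELEASE macros from an H5public.h file and prints
# them as MAJOR.MINOR.RELEASE. It exits non-zero if any macro is missing.
hdf5_header_version() {
    awk '
        $1 == "#define" && $2 == "H5_VERS_MAJOR"   { major = $3 }
        $1 == "#define" && $2 == "H5_VERS_MINOR"   { minor = $3 }
        $1 == "#define" && $2 == "H5_VERS_RELEASE" { rel = $3 }
        END {
            if (major == "" || minor == "" || rel == "") exit 1
            print major "." minor "." rel
        }
    ' "$1"
}

# Assumed locations: the system headers and the Anaconda headers.
for h in /usr/include/hdf5/serial/H5public.h "$HOME/anaconda3/include/H5public.h"; do
    if [ -r "$h" ]; then
        echo "$h: $(hdf5_header_version "$h")"
    fi
done
```

If the two header trees report different versions, that points to the same mismatch the test run complained about.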
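There are two approaches.
1. Remove the conda copy of HDF5:
conda remove hdf5
Then rerun sudo make runtest -j
In my test, however, this produced another error:
error while loading shared libraries: libhdf5_hl.so.100: cannot open shared object file:
Following https://blog.csdn.net/qq_33144323/article/details/81275540, the suggested fix is to add the Anaconda lib path to the LD_LIBRARY_PATH environment variable:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:{anaconda_dir}/lib
source ~/.bashrc
That path had already been added, though, and after recreating the symlinks the problem still came down to the HDF5 library and header mismatch.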
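2. The second approach is to install the matching version.
Again, uninstall hdf5 first:
conda remove hdf5
conda search hdf5
Unfortunately, no 1.8.16 build is listed. Now what? As expected, running
conda install hdf5=1.8.16
fails.
The biobuilds channel, https://anaconda.org/biobuilds/hdf5, does provide it:
conda install -c biobuilds hdf5
This installed 1.8.16 successfully.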
make pycaffe
make distribute
Testing the Python environment
python
>>> import caffe as cf
>>> print(cf.__version__)
1.0.0
Done!